gobbli.augment.base module

class gobbli.augment.base.BaseAugment

Bases: abc.ABC

Base class for data augmentation methods.

abstract augment(X, times=5, p=0.1)

Return additional texts for each text in the passed array.

Parameters
  • X (List[str]) – Input texts.

  • times (int) – How many texts to generate per text in the input.

  • p (float) – Probability of considering each token in the input for replacement. Some tokens cannot be replaced by a given augmentation method and are ignored, so the actual proportion of replaced tokens in your input may be much lower than p.

Return type

List[str]

Returns

Generated texts (length = times * len(X)).
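The contract above can be illustrated with a minimal sketch. It does not import gobbli: BaseAugmentSketch is a hypothetical stand-in that only mirrors the documented augment() signature, and UppercaseAugment is a toy method invented for illustration. The ordering of the output (all variants of the first text, then the second, and so on) is an assumption; the documentation only states the output length.

```python
import random
from abc import ABC, abstractmethod
from typing import List


class BaseAugmentSketch(ABC):
    """Hypothetical stand-in mirroring the documented augment() signature."""

    @abstractmethod
    def augment(self, X: List[str], times: int = 5, p: float = 0.1) -> List[str]:
        ...


class UppercaseAugment(BaseAugmentSketch):
    """Toy augmentation (not a gobbli class): uppercases considered tokens."""

    def __init__(self, seed: int = 0) -> None:
        # A seeded generator keeps the sketch reproducible.
        self._rng = random.Random(seed)

    def augment(self, X: List[str], times: int = 5, p: float = 0.1) -> List[str]:
        generated: List[str] = []
        for text in X:
            tokens = text.split()
            for _ in range(times):
                # Each token is considered for replacement with probability p.
                new_tokens = [
                    tok.upper() if self._rng.random() < p else tok
                    for tok in tokens
                ]
                generated.append(" ".join(new_tokens))
        return generated


texts = ["the quick brown fox", "hello world"]
augmented = UppercaseAugment(seed=42).augment(texts, times=3, p=0.5)
print(len(augmented))  # 6, i.e. times * len(X)
```

Because augment() is abstract, instantiating the base class directly raises a TypeError; only subclasses that implement augment() can be created.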

classmethod data_dir()
Return type

Path

Returns

The data directory used for this class of augmentation model.
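As a rough illustration of a per-class data directory, the sketch below shows one way a classmethod could derive a distinct Path for each subclass. Everything here is an assumption: the root path, the augment_dir_sketch() helper, the WordSwap class, and the layout keyed by class name are hypothetical and are not taken from gobbli's implementation.

```python
from pathlib import Path


def augment_dir_sketch() -> Path:
    # Hypothetical root directory, chosen only for this example.
    return Path("gobbli-sketch") / "augment"


class BaseAugmentSketch:
    @classmethod
    def data_dir(cls) -> Path:
        # Assumed layout: one subdirectory per class, keyed by class name.
        return augment_dir_sketch() / cls.__name__


class WordSwap(BaseAugmentSketch):
    """Hypothetical augmentation class used only to show the lookup."""


print(WordSwap.data_dir())
```

Because data_dir() is a classmethod, it can be called without creating an instance, and each subclass resolves to its own directory under the shared root.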

gobbli.augment.base.augment_dir()
Return type

Path