gobbli.augment.wordnet module
class gobbli.augment.wordnet.WordNet(skip_download_check=False, spacy_model='en_core_web_sm')
Bases: gobbli.augment.base.BaseAugment
Data augmentation method based on WordNet. Replaces words with similar words according to the WordNet ontology. Texts are part-of-speech tagged using spaCy to help ensure that only sensible replacements (i.e., those within the same part of speech) are considered.
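The part-of-speech constraint described above can be illustrated with a minimal sketch. This is not gobbli's implementation: the `TOY_SYNONYMS` table and the `replacement_candidates` helper are hypothetical stand-ins for the WordNet lookup, and the tags are written by hand rather than produced by spaCy.

```python
from typing import Dict, List, Tuple

# Hypothetical, hand-written stand-in for WordNet: (lemma, POS) -> synonyms.
TOY_SYNONYMS: Dict[Tuple[str, str], List[str]] = {
    ("run", "VERB"): ["sprint", "jog"],
    ("run", "NOUN"): ["streak", "sequence"],
    ("quick", "ADJ"): ["fast", "rapid"],
}


def replacement_candidates(word: str, pos: str) -> List[str]:
    """Return synonyms for `word` that share the given part of speech."""
    return TOY_SYNONYMS.get((word.lower(), pos), [])


# Tokens paired with hand-assigned POS tags, in the shape a tagger might emit.
tagged = [("a", "DET"), ("winning", "ADJ"), ("run", "NOUN")]
for word, pos in tagged:
    # "run" tagged as a noun only yields noun synonyms, never "sprint" or "jog".
    print(word, pos, replacement_candidates(word, pos))
```

Keying the lookup on both the word and its tag is what keeps a noun from being swapped for a verb sense of the same spelling.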
- Parameters
augment(X, times=5, p=0.1)
Return additional texts for each text in the passed array.
- Parameters
  - X (List[str]) – Input texts.
  - times (int) – How many texts to generate per text in the input.
  - p (float) – Probability of considering each token in the input for replacement. Note that some tokens aren’t able to be replaced by a given augmentation method and will be ignored, so the actual proportion of replaced tokens in your input may be much lower than this number.
- Return type
  List[str]
- Returns
  Generated texts (length = times * len(X)).
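The contract of `augment` (how `times` and `p` interact, and why the output has `times * len(X)` entries) can be sketched with a toy function. Everything here is an assumption made for illustration: `toy_augment`, its `seed` argument, the `TOY_TABLE` synonym table, whitespace tokenization, and the grouping of outputs by input text are not taken from gobbli.

```python
import random
from typing import Dict, List

# Hypothetical synonym table standing in for the WordNet lookup.
TOY_TABLE: Dict[str, List[str]] = {
    "quick": ["fast", "rapid"],
    "happy": ["glad", "cheerful"],
}


def toy_augment(X: List[str], times: int = 5, p: float = 0.1, seed: int = 0) -> List[str]:
    """Generate `times` variants of each text, so len(result) == times * len(X)."""
    rng = random.Random(seed)
    generated: List[str] = []
    for text in X:
        for _ in range(times):
            tokens = []
            for token in text.split():
                # Each token is considered for replacement with probability p...
                if rng.random() < p:
                    synonyms = TOY_TABLE.get(token.lower())
                    # ...but tokens with no known synonym are left unchanged,
                    # so the realized replacement rate can fall well below p.
                    if synonyms:
                        token = rng.choice(synonyms)
                tokens.append(token)
            generated.append(" ".join(tokens))
    return generated


texts = ["the quick fox", "a happy dog"]
print(len(toy_augment(texts, times=3, p=0.5)))
```

In this sketch only one of the three tokens in each text has an entry in the table, so even with `p=1.0` at most a third of the tokens change.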
classmethod data_dir()
- Return type
  Path
- Returns
  The data directory used for this class of augmentation model.