gobbli package¶
Subpackages¶
- gobbli.augment package
- gobbli.dataset package
- gobbli.experiment package
- gobbli.inspect package
- gobbli.model package
- gobbli.test package
  - Subpackages
    - gobbli.test.augment package
    - gobbli.test.classification package
    - gobbli.test.dataset package
    - gobbli.test.experiment package
    - gobbli.test.interactive package
    - gobbli.test.model package
      - Submodules
        - gobbli.test.model.test_base_model module
        - gobbli.test.model.test_bert module
        - gobbli.test.model.test_fasttext module
        - gobbli.test.model.test_mtdnn module
        - gobbli.test.model.test_sklearn module
        - gobbli.test.model.test_spacy module
        - gobbli.test.model.test_transformer module
        - gobbli.test.model.test_use module
      - Module contents
  - Submodules
  - Module contents
Submodules¶
Module contents¶
- class gobbli.TokenizeMethod[source]¶

  Bases: enum.Enum

  Enum describing the different canned tokenization methods gobbli supports. Processes requiring tokenization should generally allow a user to pass in a custom tokenization function if their needs aren't met by one of these.
  - SPLIT = 'split'¶

    Naive tokenization based on whitespace. Probably only useful for testing. Tokens will be lowercased.

  - SPACY = 'spacy'¶

    Simple tokenization using spaCy's English language model. Tokens will be lowercased, and non-alphabetic tokens will be filtered out.

  - SENTENCEPIECE = 'sentencepiece'¶

    SentencePiece-based tokenization.
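As a minimal sketch of the documented SPLIT semantics, the snippet below mirrors the enum and shows a whitespace tokenizer that lowercases tokens. The `tokenize_split` helper is hypothetical, written here for illustration only; it is not part of the gobbli API, and gobbli's own implementation may differ in detail.

```python
from enum import Enum
from typing import List


class TokenizeMethod(Enum):
    """Illustrative mirror of gobbli.TokenizeMethod's members and values."""

    SPLIT = "split"
    SPACY = "spacy"
    SENTENCEPIECE = "sentencepiece"


def tokenize_split(texts: List[str]) -> List[List[str]]:
    """Hypothetical helper matching the documented SPLIT behavior:
    naive whitespace tokenization with lowercased tokens."""
    return [[tok.lower() for tok in text.split()] for text in texts]


tokens = tokenize_split(["This is a Test", "Hello World"])
# tokens == [["this", "is", "a", "test"], ["hello", "world"]]
```

A custom tokenization function with the same shape (`List[str] -> List[List[str]]`) is what a process would accept when none of the canned methods fit.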