gobbli.dataset.cmu_movie_summary module

class gobbli.dataset.cmu_movie_summary.MovieSummaryDataset(*args, **kwargs)

Bases: gobbli.dataset.base.BaseDataset

gobbli Dataset for the CMU Movie Summary corpus, framed as a multilabel classification problem: predicting a movie's genres from its plot summary.

http://www.cs.cmu.edu/~ark/personas/
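
The multilabel framing can be illustrated with a minimal sketch that pairs each plot summary with a list of genre names. This is not gobbli's implementation. It assumes the layout described in the corpus README as best understood here: the metadata file is tab-separated with the Wikipedia movie ID in the first column and the genres in the ninth column as a JSON mapping of Freebase ID to genre name, and the plot summaries file has one `ID<TAB>summary` pair per line. The helper name `load_examples` and the inline rows are hypothetical, invented for illustration.

```python
import json

# Invented stand-ins for the two dataset files (not real corpus rows).
METADATA_TSV = (
    "1\t/m/aaa\tFirst Film\t2001\t\t90.0\t{}\t{}\t"
    '{"/m/01": "Drama", "/m/02": "Thriller"}\n'
    "2\t/m/bbb\tSecond Film\t1999\t\t85.0\t{}\t{}\t"
    '{"/m/03": "Comedy"}\n'
)
PLOT_SUMMARIES_TXT = (
    "1\tA detective investigates a disappearance.\n"
    "2\tTwo friends open a bakery.\n"
)


def load_examples(metadata_text, summaries_text):
    """Pair each plot summary with the genre names of the same movie."""
    genres_by_id = {}
    for line in metadata_text.splitlines():
        fields = line.split("\t")
        # Assumed layout: column 0 is the movie ID, column 8 is the genre JSON.
        movie_id, genre_json = fields[0], fields[8]
        genres_by_id[movie_id] = sorted(json.loads(genre_json).values())

    texts, labels = [], []
    for line in summaries_text.splitlines():
        movie_id, summary = line.split("\t", 1)
        # Keep only summaries that have matching metadata.
        if movie_id in genres_by_id:
            texts.append(summary)
            labels.append(genres_by_id[movie_id])
    return texts, labels


texts, labels = load_examples(METADATA_TSV, PLOT_SUMMARIES_TXT)
```

Each entry of `labels` is a list rather than a single value, which is what distinguishes a multilabel problem from ordinary multiclass classification.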

Blank constructor needed to satisfy mypy

METADATA_FILE = 'MovieSummaries/movie.metadata.tsv'
PLOT_SUMMARIES_FILE = 'MovieSummaries/plot_summaries.txt'
TRAIN_PCT = 0.8
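
The page does not state how `TRAIN_PCT` is applied. One plausible reading, sketched below as an assumption, is a fixed cut in row order: the first 80% of rows form the training set and the rest form the test set. The function name `split_train_test` is hypothetical.

```python
TRAIN_PCT = 0.8  # mirrors the class attribute documented above


def split_train_test(items, train_pct=TRAIN_PCT):
    """Return (train, test), cutting the sequence at the train_pct mark."""
    cutoff = int(len(items) * train_pct)
    return items[:cutoff], items[cutoff:]


train, test = split_train_test(list(range(10)))
```

A deterministic cut like this keeps the train and test sets stable across runs without needing a random seed.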
X_test()
X_train()
classmethod data_dir()
Return type: Path

embed_input(embed_batch_size=32, pooling=<EmbedPooling.MEAN: 'mean'>, limit=None)
Return type: EmbedInput

classmethod load(*args, **kwargs)
Return type: BaseDataset

predict_input(predict_batch_size=32, limit=None)
Return type: PredictInput

train_input(train_batch_size=32, valid_batch_size=8, num_train_epochs=3, valid_proportion=0.2, split_seed=1234, shuffle_seed=1234, limit=None)
Return type: TrainInput

y_test()
y_train()
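
The `valid_proportion` and `split_seed` parameters of `train_input` suggest a seeded train/validation split of the training data. The sketch below shows that idea using only the standard library. It is an illustration under that assumption, not gobbli's code, which may rely on a different splitting routine. The function name `train_valid_split` is hypothetical.

```python
import random


def train_valid_split(X, y, valid_proportion=0.2, split_seed=1234):
    """Split parallel lists X and y into (train, valid) pairs reproducibly."""
    indices = list(range(len(X)))
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(split_seed).shuffle(indices)
    num_valid = int(len(indices) * valid_proportion)
    valid_idx, train_idx = indices[:num_valid], indices[num_valid:]
    train = ([X[i] for i in train_idx], [y[i] for i in train_idx])
    valid = ([X[i] for i in valid_idx], [y[i] for i in valid_idx])
    return train, valid


X = [f"summary {i}" for i in range(10)]
y = [[f"genre {i % 3}"] for i in range(10)]
(X_tr, y_tr), (X_va, y_va) = train_valid_split(X, y)
```

Reusing the same `split_seed` reproduces the same partition, which makes validation metrics comparable between runs.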