gobbli.inspect.evaluate module
class gobbli.inspect.evaluate.ClassificationError(X, y_true, y_pred_proba)[source]

    Bases: object

    Describes an error in classification. Reports the original text, the true label, and the predicted probability.

    Parameters
        X – The original text for this observation.
        y_true – The true label for this observation.
        y_pred_proba – The predicted probability for this observation.
    property y_pred

        Return type
            str

        Returns
            The class with the highest predicted probability for this observation.
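The highest-probability class can be recovered from a row of predicted probabilities with pandas; a minimal sketch of the idea (illustrative data, not gobbli's implementation):

```python
import pandas as pd

# Hypothetical predicted probabilities: one row per observation,
# one column per label.
y_pred_proba = pd.DataFrame(
    {"positive": [0.9, 0.2, 0.6], "negative": [0.1, 0.8, 0.4]}
)

# For each row, the column (label) with the highest predicted probability --
# the same quantity y_pred exposes for a single observation.
y_pred = y_pred_proba.idxmax(axis=1)
print(list(y_pred))  # ['positive', 'negative', 'positive']
```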
class gobbli.inspect.evaluate.ClassificationEvaluation(labels, X, y_true, y_pred_proba, metric_funcs=None)[source]

    Bases: object

    Provides several methods for evaluating the results from a classification problem.

    Parameters
        labels (List[str]) – The set of unique labels in the dataset.
        X (List[str]) – The list of texts that were classified.
        y_true (Union[List[str], List[List[str]]]) – The true labels for the dataset.
        y_pred_proba (DataFrame) – A dataframe containing a row for each observation in X and a column for each label in the training data. Cells are predicted probabilities.
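A sketch of the input shapes described above, using made-up texts and labels (the data is purely illustrative):

```python
import pandas as pd

labels = ["negative", "positive"]
X = ["great movie", "terrible plot", "pretty good"]
y_true = ["positive", "negative", "positive"]

# One row per observation in X, one column per label; cells are
# predicted probabilities, as the y_pred_proba parameter expects.
y_pred_proba = pd.DataFrame(
    [[0.1, 0.9], [0.7, 0.3], [0.4, 0.6]], columns=labels
)

assert y_pred_proba.shape == (len(X), len(labels))

# With gobbli installed, these would be passed straight through:
# from gobbli.inspect.evaluate import ClassificationEvaluation
# evaluation = ClassificationEvaluation(
#     labels=labels, X=X, y_true=y_true, y_pred_proba=y_pred_proba
# )
```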
    errors(k=10)[source]

        Output the classifier's biggest mistakes for each class.

        Parameters
            k (int) – The number of results to return for each of false positives and false negatives.

        Return type
            Dict[str, Tuple[List[ClassificationError], List[ClassificationError]]]

        Returns
            A dictionary whose keys are label names and whose values are 2-tuples. The first element is a list of the top k false positives, and the second element is a list of the top k false negatives.
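The ranking of "biggest mistakes" can be illustrated with a toy computation (a sketch of the concept, not gobbli's actual implementation): for a given label, false positives are observations predicted as that label but truly something else, worst (most confident) first; false negatives are true members of the label that were predicted as something else, worst (least confident) first.

```python
import pandas as pd

label = "positive"
y_true = ["positive", "negative", "negative", "positive"]
y_pred_proba = pd.DataFrame(
    {"negative": [0.2, 0.4, 0.1, 0.8], "positive": [0.8, 0.6, 0.9, 0.2]}
)
y_pred = y_pred_proba.idxmax(axis=1)

# False positives: predicted as `label` but actually another class,
# sorted by predicted probability descending (most confident mistakes first).
fp = [
    (i, float(y_pred_proba.loc[i, label]))
    for i in y_pred_proba.index
    if y_pred[i] == label and y_true[i] != label
]
fp.sort(key=lambda pair: pair[1], reverse=True)

# False negatives: truly `label` but predicted as another class,
# sorted by predicted probability ascending (most confident mistakes first).
fn = [
    (i, float(y_pred_proba.loc[i, label]))
    for i in y_pred_proba.index
    if y_pred[i] != label and y_true[i] == label
]
fn.sort(key=lambda pair: pair[1])

print(fp)  # [(2, 0.9), (1, 0.6)]
print(fn)  # [(3, 0.2)]
```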
    errors_for_label(label, k=10)[source]

        Output the classifier's biggest mistakes for the given class.
    errors_report(k=10)[source]

        Parameters
            k (int) – The number of results to return for each of false positives and false negatives.

        Return type
            str

        Returns
            A nicely formatted, human-readable report describing the biggest mistakes made by the classifier for each class.
    metric_funcs = None
    metrics()[source]

        Return type
            Dict[str, float]

        Returns
            A dictionary containing various metrics of model performance on the test dataset.
    metrics_report()[source]

        Return type
            str

        Returns
            A nicely formatted, human-readable report describing metrics of model performance on the test dataset.
    plot(sample_size=None)[source]

        Parameters
            sample_size (Optional[int]) – Optional number of points to sample for the plot. Unsampled plots may be difficult to save due to their size.

        Return type
            Chart

        Returns
            An Altair chart visualizing predicted probabilities and true classes, to visually identify where errors are being made.
    property y_pred_multiclass

        Return type
            List[str]

        Returns
            The predicted class for each observation (assuming a multiclass context).
    property y_pred_multilabel

        Return type
            DataFrame

        Returns
            An indicator dataframe containing, for each observation, 1 for each label that was predicted and 0 for each label that wasn't.
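An indicator dataframe of this shape can be derived by thresholding the predicted probabilities; the 0.5 cutoff below is an assumption for illustration, not necessarily the rule gobbli applies:

```python
import pandas as pd

# Hypothetical multilabel probabilities for two observations.
y_pred_proba = pd.DataFrame(
    {"sports": [0.9, 0.3], "politics": [0.6, 0.2]}
)

# 1 where the label's predicted probability clears the (assumed) threshold,
# 0 otherwise -- the same shape as the y_pred_multilabel indicator dataframe.
indicator = (y_pred_proba > 0.5).astype(int)
print(indicator.values.tolist())  # [[1, 1], [0, 0]]
```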
    property y_true_multiclass

        Return type
            List[str]
    property y_true_multilabel

        Return type
            DataFrame
gobbli.inspect.evaluate.DEFAULT_METRICS = {'Accuracy': <function <lambda>>, 'Weighted F1 Score': <function <lambda>>, 'Weighted Precision Score': <function <lambda>>, 'Weighted Recall Score': <function <lambda>>}

    The default set of metrics to evaluate classification models with. Users may want to extend this.
gobbli.inspect.evaluate.MetricFunc = typing.Callable[[typing.Sequence[str], pandas.core.frame.DataFrame], float]

    A function used to calculate some metric. It should accept a sequence of true labels (y_true) and a dataframe of shape (n_samples, n_classes) containing predicted probabilities, and it should output a real number.
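A custom metric matching the MetricFunc signature might look like the following; the function name and sample data are illustrative, not part of gobbli:

```python
from typing import Sequence

import pandas as pd


def top_class_accuracy(y_true: Sequence[str], y_pred_proba: pd.DataFrame) -> float:
    """Fraction of observations whose highest-probability class matches the true label."""
    y_pred = y_pred_proba.idxmax(axis=1)
    return sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)


y_true = ["a", "b", "a"]
y_pred_proba = pd.DataFrame({"a": [0.9, 0.6, 0.7], "b": [0.1, 0.4, 0.3]})
print(top_class_accuracy(y_true, y_pred_proba))  # 2 of 3 correct

# A function like this could be merged into the defaults, e.g.:
# metric_funcs = {**DEFAULT_METRICS, "Top-Class Accuracy": top_class_accuracy}
```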