gobbli.inspect.evaluate module
class gobbli.inspect.evaluate.ClassificationError(X, y_true, y_pred_proba)[source]

    Bases: object

    Describes an error in classification. Reports the original text, the true label, and the predicted probability.

    Parameters
        X – The original text for this observation.
        y_true – The true label for this observation.
        y_pred_proba – The predicted probability for this observation.
    property y_pred

        Return type
            str

        Returns
            The class with the highest predicted probability for this observation.
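The highest-probability class can be recovered from a row of predicted probabilities with pandas; a minimal sketch of the idea (illustrative data, not gobbli's implementation):

```python
import pandas as pd

# Hypothetical predicted probabilities: one row per observation,
# one column per label.
y_pred_proba = pd.DataFrame(
    {"positive": [0.9, 0.2, 0.6], "negative": [0.1, 0.8, 0.4]}
)

# For each row, the column (label) with the highest predicted probability --
# the same quantity y_pred exposes for a single observation.
y_pred = y_pred_proba.idxmax(axis=1)
print(list(y_pred))  # ['positive', 'negative', 'positive']
```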
class gobbli.inspect.evaluate.ClassificationEvaluation(labels, X, y_true, y_pred_proba, metric_funcs=None)[source]

    Bases: object

    Provides several methods for evaluating the results from a classification problem.

    Parameters
        labels (List[str]) – The set of unique labels in the dataset.
        X (List[str]) – The list of texts that were classified.
        y_true (Union[List[str], List[List[str]]]) – The true labels for the dataset.
        y_pred_proba (DataFrame) – A dataframe containing a row for each observation in X and a column for each label in the training data. Cells are predicted probabilities.
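A sketch of the input shapes described above, using made-up texts and labels (the data is purely illustrative):

```python
import pandas as pd

labels = ["negative", "positive"]
X = ["great movie", "terrible plot", "pretty good"]
y_true = ["positive", "negative", "positive"]

# One row per observation in X, one column per label; cells are
# predicted probabilities, as the y_pred_proba parameter expects.
y_pred_proba = pd.DataFrame(
    [[0.1, 0.9], [0.7, 0.3], [0.4, 0.6]], columns=labels
)

assert y_pred_proba.shape == (len(X), len(labels))

# With gobbli installed, these would be passed straight through:
# from gobbli.inspect.evaluate import ClassificationEvaluation
# evaluation = ClassificationEvaluation(
#     labels=labels, X=X, y_true=y_true, y_pred_proba=y_pred_proba
# )
```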
    errors(k=10)[source]

        Output the classifier's biggest mistakes for each class.

        Parameters
            k (int) – The number of results to return for each of false positives and false negatives.

        Return type
            Dict[str, Tuple[List[ClassificationError], List[ClassificationError]]]

        Returns
            A dictionary whose keys are label names and whose values are 2-tuples. The first element is a list of the top k false positives, and the second element is a list of the top k false negatives.
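The ranking of "biggest mistakes" can be illustrated with a toy computation (a sketch of the concept, not gobbli's actual implementation): for a given label, false positives are observations predicted as that label but truly something else, worst (most confident) first; false negatives are true members of the label that were predicted as something else, worst (least confident) first.

```python
import pandas as pd

label = "positive"
y_true = ["positive", "negative", "negative", "positive"]
y_pred_proba = pd.DataFrame(
    {"negative": [0.2, 0.4, 0.1, 0.8], "positive": [0.8, 0.6, 0.9, 0.2]}
)
y_pred = y_pred_proba.idxmax(axis=1)

# False positives: predicted as `label` but actually another class,
# sorted by predicted probability descending (most confident mistakes first).
fp = [
    (i, float(y_pred_proba.loc[i, label]))
    for i in y_pred_proba.index
    if y_pred[i] == label and y_true[i] != label
]
fp.sort(key=lambda pair: pair[1], reverse=True)

# False negatives: truly `label` but predicted as another class,
# sorted by predicted probability ascending (most confident mistakes first).
fn = [
    (i, float(y_pred_proba.loc[i, label]))
    for i in y_pred_proba.index
    if y_pred[i] != label and y_true[i] == label
]
fn.sort(key=lambda pair: pair[1])

print(fp)  # [(2, 0.9), (1, 0.6)]
print(fn)  # [(3, 0.2)]
```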
    errors_for_label(label, k=10)[source]

        Output the classifier's biggest mistakes for the given class.
    errors_report(k=10)[source]

        Parameters
            k (int) – The number of results to return for each of false positives and false negatives.

        Return type
            str

        Returns
            A nicely formatted, human-readable report describing the biggest mistakes made by the classifier for each class.
    metric_funcs = None
    metrics()[source]

        Return type
            Dict[str, float]

        Returns
            A dictionary containing various metrics of model performance on the test dataset.
    metrics_report()[source]

        Return type
            str

        Returns
            A nicely formatted, human-readable report describing metrics of model performance on the test dataset.
    plot(sample_size=None)[source]

        Parameters
            sample_size (Optional[int]) – Optional number of points to sample for the plot. Unsampled plots may be difficult to save due to their size.

        Return type
            Chart

        Returns
            An Altair chart visualizing predicted probabilities and true classes, to visually identify where errors are being made.
    property y_pred_multiclass

        Return type
            List[str]

        Returns
            The predicted class for each observation (assuming a multiclass context).
    property y_pred_multilabel

        Return type
            DataFrame

        Returns
            An indicator dataframe containing, for each observation, 1 for each label that was predicted and 0 for each label that wasn't.
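An indicator dataframe of this shape can be derived by thresholding the predicted probabilities; the 0.5 cutoff below is an assumption for illustration, not necessarily the rule gobbli applies:

```python
import pandas as pd

# Hypothetical multilabel probabilities for two observations.
y_pred_proba = pd.DataFrame(
    {"sports": [0.9, 0.3], "politics": [0.6, 0.2]}
)

# 1 where the label's predicted probability clears the (assumed) threshold,
# 0 otherwise -- the same shape as the y_pred_multilabel indicator dataframe.
indicator = (y_pred_proba > 0.5).astype(int)
print(indicator.values.tolist())  # [[1, 1], [0, 0]]
```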
    property y_true_multiclass

        Return type
            List[str]
    property y_true_multilabel

        Return type
            DataFrame
gobbli.inspect.evaluate.DEFAULT_METRICS = {'Accuracy': <function <lambda>>, 'Weighted F1 Score': <function <lambda>>, 'Weighted Precision Score': <function <lambda>>, 'Weighted Recall Score': <function <lambda>>}

    The default set of metrics to evaluate classification models with. Users may want to extend this.
gobbli.inspect.evaluate.MetricFunc = typing.Callable[[typing.Sequence[str], pandas.core.frame.DataFrame], float]

    A function used to calculate some metric. It should accept a sequence of true labels (y_true) and a dataframe of shape (n_samples, n_classes) containing predicted probabilities, and it should output a real number.
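A custom metric matching the MetricFunc signature might look like the following; the function name and sample data are illustrative, not part of gobbli:

```python
from typing import Sequence

import pandas as pd


def top_class_accuracy(y_true: Sequence[str], y_pred_proba: pd.DataFrame) -> float:
    """Fraction of observations whose highest-probability class matches the true label."""
    y_pred = y_pred_proba.idxmax(axis=1)
    return sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)


y_true = ["a", "b", "a"]
y_pred_proba = pd.DataFrame({"a": [0.9, 0.6, 0.7], "b": [0.1, 0.4, 0.3]})
print(top_class_accuracy(y_true, y_pred_proba))  # 2 of 3 correct

# A function like this could be merged into the defaults, e.g.:
# metric_funcs = {**DEFAULT_METRICS, "Top-Class Accuracy": top_class_accuracy}
```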