TabularPredictor.evaluate_predictions
- TabularPredictor.evaluate_predictions(y_true, y_pred, sample_weight=None, decision_threshold=None, silent=False, auxiliary_metrics=True, detailed_report=False) → dict
Evaluate the provided prediction probabilities against ground truth labels. Evaluation is based on the eval_metric previously specified in init, or default metrics if none was specified.
- Parameters
y_true (np.array or pd.Series) – The ordered collection of ground-truth labels.
y_pred (pd.Series or pd.DataFrame) – The ordered collection of prediction probabilities or predictions. Obtainable via the output of predictor.predict_proba. Caution: For certain types of eval_metric (such as 'roc_auc'), y_pred must be predicted probabilities rather than predicted labels.
sample_weight (pd.Series, default = None) – Sample weight for each row of data. If None, uniform sample weights are used.
decision_threshold (float, default = None) – The decision threshold to use when converting prediction probabilities to predictions. This will impact the scores of metrics such as f1 and accuracy. If None, defaults to predictor.decision_threshold. Ignored unless problem_type='binary'. Refer to the predictor.decision_threshold docstring for more information.
silent (bool, default = False) – If False, performance results are printed.
auxiliary_metrics (bool, default = True) – Should we compute other (problem_type specific) metrics in addition to the default metric?
detailed_report (bool, default = False) – Should we compute more detailed versions of the auxiliary_metrics? (requires auxiliary_metrics=True)
- Returns
Returns dict where keys = metrics, values = performance along each metric.
NOTE: Metric scores are always shown in higher-is-better form.
This means that metrics such as log_loss and root_mean_squared_error will have their signs FLIPPED, and values will be negative.
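A minimal usage sketch follows, showing one way to call evaluate_predictions on a held-out split. The toy dataset, column names, output, and time_limit are illustrative assumptions, not part of this docstring.

```python
# Sketch: evaluate held-out predictions with evaluate_predictions.
# Assumes scikit-learn is available for a toy dataset; names are illustrative.
import pandas as pd
from sklearn.datasets import make_classification
from autogluon.tabular import TabularPredictor

# Build a small binary-classification DataFrame and split it.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
df = pd.DataFrame(X, columns=[f"f{i}" for i in range(5)])
df["label"] = y
train, test = df.iloc[:400], df.iloc[400:]

predictor = TabularPredictor(label="label", eval_metric="roc_auc").fit(train, time_limit=60)

# evaluate_predictions takes ground-truth labels plus predictions or prediction
# probabilities; metrics like 'roc_auc' require probabilities from predict_proba.
scores = predictor.evaluate_predictions(
    y_true=test["label"],
    y_pred=predictor.predict_proba(test),
    silent=True,             # suppress printed results
    auxiliary_metrics=True,  # also compute other problem_type-specific metrics
)
print(scores)  # dict of metric name -> score, in higher-is-better form
```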