TimeSeriesPredictor.evaluate
- TimeSeriesPredictor.evaluate(data: TimeSeriesDataFrame | DataFrame | Path | str, model: str | None = None, metrics: str | TimeSeriesScorer | List[str | TimeSeriesScorer] | None = None, cutoff: int | None = None, display: bool = False, use_cache: bool = True) → Dict[str, float]
Evaluate the forecast accuracy for the given dataset.

This method measures the forecast accuracy using the last self.prediction_length time steps of each time series in data as a hold-out set.

Note

Metrics are always reported in ‘higher is better’ format. This means that error metrics such as MASE or MAPE will be multiplied by -1, so their values will be negative. This convention lets you compare evaluation results without needing to know whether higher or lower is better for a particular metric.
- Parameters:
data (Union[TimeSeriesDataFrame, pd.DataFrame, Path, str]) –
The data to evaluate the best model on. If a cutoff is not provided, the last prediction_length time steps of each time series in data will be held out for prediction and forecast accuracy will be calculated on these time steps. When a cutoff is provided, the -cutoff-th to the -cutoff + prediction_length-th time steps of each time series are used for evaluation.

Must include both historical and future data (i.e., the length of all time series in data must be at least prediction_length + 1 if cutoff is not provided, and -cutoff + 1 otherwise).

The names and dtypes of columns and static features in data must match the train_data used to train the predictor.

If the provided data is a pandas.DataFrame, AutoGluon will attempt to convert it to a TimeSeriesDataFrame. If a str or a Path is provided, AutoGluon will attempt to load this file.
model (str, optional) – Name of the model that you would like to evaluate. By default, the best model during training (with highest validation score) will be used.
metrics (str, TimeSeriesScorer or List[Union[str, TimeSeriesScorer]], optional) – Metric or a list of metrics to compute scores with. Defaults to self.eval_metric. Supports both metric names as strings and custom metrics based on TimeSeriesScorer.
cutoff (int, optional) – A negative integer less than or equal to -1 * prediction_length denoting the time step in data where the forecast evaluation starts, i.e., time series are evaluated from the -cutoff-th to the -cutoff + prediction_length-th time step. Defaults to -1 * prediction_length, using the last prediction_length time steps of each time series for evaluation.
display (bool, default = False) – If True, the scores will be printed.
use_cache (bool, default = True) – If True, will attempt to use the cached predictions. If False, cached predictions will be ignored. This argument is ignored if cache_predictions was set to False when creating the TimeSeriesPredictor.
- Returns:
scores_dict – Dictionary where keys = metrics, values = performance along each metric. For consistency, error metrics will have their signs flipped to obey this convention. For example, negative MAPE values will be reported. To get the eval_metric score, use output[predictor.eval_metric.name].
- Return type:
Dict[str, float]
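
For illustration, a minimal usage sketch is shown below. The synthetic dataset, the prediction_length of 24, the fast_training preset, and the metric choices are placeholder assumptions for this example, not part of the method's specification:

```python
import numpy as np
import pandas as pd

from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

# Synthetic example data: 3 items with 200 hourly observations each (illustrative only).
n_items, n_steps = 3, 200
df = pd.DataFrame({
    "item_id": np.repeat([f"item_{i}" for i in range(n_items)], n_steps),
    "timestamp": np.tile(pd.date_range("2024-01-01", periods=n_steps, freq="h"), n_items),
    "target": np.random.default_rng(0).normal(size=n_items * n_steps).cumsum(),
})
data = TimeSeriesDataFrame.from_data_frame(df, id_column="item_id", timestamp_column="timestamp")

# Hold out the last `prediction_length` steps of each series so that test_data
# contains both historical and future time steps, as evaluate() requires.
prediction_length = 24
train_data, test_data = data.train_test_split(prediction_length)

predictor = TimeSeriesPredictor(prediction_length=prediction_length, eval_metric="MASE")
predictor.fit(train_data, presets="fast_training")

# Score the best model on the last `prediction_length` time steps of test_data.
scores = predictor.evaluate(test_data)
# Error metrics are sign-flipped, so the MASE value printed here is negative.
print(scores[predictor.eval_metric.name])

# Compute several metrics at once; `display=True` also prints the scores.
scores = predictor.evaluate(test_data, metrics=["MASE", "WQL"], display=True)

# Evaluate an earlier forecast window by moving the cutoff back one extra horizon.
scores_earlier = predictor.evaluate(test_data, cutoff=-2 * prediction_length)
```

In the last call, cutoff=-48 with prediction_length=24 scores the forecast on the 24-step window that ends 24 steps before the end of each series, rather than on the final 24 steps.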