autogluon.timeseries.TimeSeriesPredictor

class autogluon.timeseries.TimeSeriesPredictor(target: Optional[str] = None, known_covariates_names: Optional[List[str]] = None, prediction_length: int = 1, eval_metric: Optional[str] = None, path: Optional[str] = None, verbosity: int = 2, quantile_levels: Optional[List[float]] = None, ignore_time_index: bool = False, validation_splitter: Union[str, AbstractTimeSeriesSplitter] = 'last_window', **kwargs)

AutoGluon TimeSeriesPredictor predicts future values of multiple related time series.

TimeSeriesPredictor provides probabilistic (distributional) multi-step-ahead forecasts for univariate time series. The forecast includes both the mean (i.e., the conditional expectation of future values given the past) and the quantiles of the forecast distribution, which indicate the range of possible future outcomes.
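
As a rough illustration of what such a forecast summarizes, the sketch below reduces a set of hypothetical outcomes for one future time step to a mean and nine quantiles. The sample values and variable names are made up for illustration; this is not how AutoGluon's models compute their forecasts.

```python
import statistics

# Hypothetical simulated outcomes for a single future time step (illustrative only).
samples = list(range(1, 100))  # 1, 2, ..., 99

# Point forecast: the mean of the outcome distribution.
mean_forecast = statistics.mean(samples)

# Quantile forecasts: n=10 yields the 9 cut points at levels 0.1, 0.2, ..., 0.9.
levels = [round(0.1 * i, 1) for i in range(1, 10)]
quantile_forecast = dict(zip(levels, statistics.quantiles(samples, n=10)))

print("mean:", mean_forecast)
print("10% / 50% / 90% quantiles:",
      quantile_forecast[0.1], quantile_forecast[0.5], quantile_forecast[0.9])
```

The spread between the low and high quantiles is what conveys the range of plausible outcomes around the mean.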

TimeSeriesPredictor fits both “global” deep learning models that are shared across all time series (e.g., DeepAR, Transformer) and “local” statistical models that are fit to each individual time series (e.g., ARIMA, ETS).

TimeSeriesPredictor expects input data and makes predictions in the TimeSeriesDataFrame format.

Parameters
  • target (str, default = "target") – Name of column that contains the target values to forecast (i.e., numeric observations of the time series).

  • prediction_length (int, default = 1) – The forecast horizon, i.e., how many time steps into the future the models should be trained to predict. For example, if time series contain daily observations, setting prediction_length = 3 will train models that predict up to 3 days into the future from the most recent observation.

  • eval_metric (str, default = "mean_wQuantileLoss") –

    Metric by which predictions will be ultimately evaluated on future test data. AutoGluon tunes hyperparameters in order to improve this metric on validation data, and ranks models (on validation data) according to this metric. Available options:

    • "mean_wQuantileLoss": mean weighted quantile loss, defined as the average of the quantile losses for the specified quantile_levels, scaled by the total value of the time series

    • "MAPE": mean absolute percentage error

    • "sMAPE": “symmetric” mean absolute percentage error

    • "MASE": mean absolute scaled error

    • "MSE": mean squared error

    • "RMSE": root mean squared error

    For more information about these metrics, see https://docs.aws.amazon.com/forecast/latest/dg/metrics.html.

  • known_covariates_names (List[str], optional) –

    Names of the covariates that are known in advance for all time steps in the forecast horizon. These are also known as dynamic features, exogenous variables, additional regressors or related time series. Examples of such covariates include holidays, promotions or weather forecasts.

    Currently, only numeric covariates (float or integer dtype) are supported.

    If known_covariates_names are provided, then:

    • fit(), evaluate(), and leaderboard() will expect a data frame with columns listed in known_covariates_names (in addition to the target column).

    • predict() will expect an additional keyword argument known_covariates containing the future values of the known covariates in TimeSeriesDataFrame format.

  • quantile_levels (List[float], optional) – List of increasing decimals that specifies which quantiles should be estimated when making distributional forecasts. Defaults to [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]. Can alternatively be provided with the keyword argument quantiles.

  • path (str, optional) – Path to the directory where models and intermediate outputs will be saved. Defaults to a timestamped folder AutogluonModels/ag-[TIMESTAMP] that will be created in the working directory.

  • verbosity (int, default = 2) – Verbosity levels range from 0 to 4 and control how much information is printed to stdout. Higher levels correspond to more detailed print statements, and verbosity=0 suppresses output, including warnings. If using logging, you can alternatively control the amount of information printed via logger.setLevel(L), where L ranges from 0 to 50 (note: higher values of L correspond to fewer print statements, the opposite of verbosity levels).

  • ignore_time_index (bool, default = False) – If True, the predictor will ignore the datetime indexes during both training and testing, and will replace the data indexes with dummy timestamps in second frequency. In this case, the forecast output time indexes will be arbitrary values, and seasonality will be turned off for local models.

  • validation_splitter (Union[str, AbstractTimeSeriesSplitter], default = "last_window") –

    Strategy for splitting train_data into training and validation parts during fit(). If tuning_data is passed to fit(), validation_splitter is ignored. Possible choices:

    • "last_window": use last prediction_length time steps of each time series for validation.

    • "multi_window": use last 3 non-overlapping windows of length prediction_length of each time series for validation.

    • object of type AbstractTimeSeriesSplitter implementing a custom splitting strategy (for advanced users only).

  • learner_type (AbstractLearner, default = TimeSeriesLearner) – A class which inherits from AbstractLearner. The learner specifies the inner logic of the TimeSeriesPredictor.

  • label (str) – Alias for target.

  • learner_kwargs (dict, optional) – Keyword arguments to send to the learner (for advanced users only). Options include trainer_type, a class inheriting from AbstractTrainer which controls training of multiple models. If path and eval_metric are re-specified within learner_kwargs, these are ignored.

  • quantiles (List[float]) – Alias for quantile_levels.
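
To make the eval_metric options more concrete, here is a minimal sketch of textbook forms of several of them for a single series. The function names are hypothetical and not part of AutoGluon, and the factor of 2 in the weighted quantile loss is an assumed convention; the library's implementation may differ in scaling, aggregation across series, and handling of zero values.

```python
import math

def pinball_loss(y, f, q):
    """Quantile (pinball) loss of forecast f for observation y at level q."""
    diff = y - f
    return max(q * diff, (q - 1) * diff)

def mean_weighted_quantile_loss(actual, quantile_forecasts):
    """Average over quantile levels of the pinball loss scaled by sum(|actual|).

    The factor 2 is an assumed convention, not confirmed by this page.
    """
    total = sum(abs(y) for y in actual)
    per_level = [
        2 * sum(pinball_loss(y, f, q) for y, f in zip(actual, forecast)) / total
        for q, forecast in quantile_forecasts.items()
    ]
    return sum(per_level) / len(per_level)

def mape(actual, forecast):
    """Mean absolute percentage error (undefined if any actual value is 0)."""
    return sum(abs(y - f) / abs(y) for y, f in zip(actual, forecast)) / len(actual)

def smape(actual, forecast):
    """'Symmetric' MAPE: error relative to the mean magnitude of y and f."""
    return sum(
        2 * abs(y - f) / (abs(y) + abs(f)) for y, f in zip(actual, forecast)
    ) / len(actual)

def mse(actual, forecast):
    """Mean squared error."""
    return sum((y - f) ** 2 for y, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(mse(actual, forecast))

# Illustrative values only.
actual = [10.0, 20.0]
median_forecast = [8.0, 24.0]
wql = mean_weighted_quantile_loss(actual, {0.5: median_forecast})
print(wql, mape(actual, median_forecast), smape(actual, median_forecast))
print(mse(actual, median_forecast), rmse(actual, median_forecast))
```

Note that only the quantile loss uses the quantile forecasts; the other metrics compare a single point forecast with the observed values.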

__init__(target: Optional[str] = None, known_covariates_names: Optional[List[str]] = None, prediction_length: int = 1, eval_metric: Optional[str] = None, path: Optional[str] = None, verbosity: int = 2, quantile_levels: Optional[List[float]] = None, ignore_time_index: bool = False, validation_splitter: Union[str, AbstractTimeSeriesSplitter] = 'last_window', **kwargs)
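
The two built-in validation_splitter choices accepted by the constructor can be pictured on a single series as follows. This is a plain-Python sketch with hypothetical helper names. It only shows which time steps fall into each validation window; pairing each window with all earlier observations is an assumption here, not a description of the library's internals.

```python
def last_window_split(series, prediction_length):
    """Hold out the last `prediction_length` steps for validation."""
    if prediction_length < 1 or len(series) <= prediction_length:
        raise ValueError("series is too short for the requested prediction_length")
    return series[:-prediction_length], series[-prediction_length:]

def multi_window_split(series, prediction_length, num_windows=3):
    """Hold out the last `num_windows` non-overlapping windows, newest first.

    Each window is paired with all observations that precede it (assumption).
    """
    if prediction_length < 1 or len(series) <= num_windows * prediction_length:
        raise ValueError("series is too short for the requested windows")
    splits = []
    for k in range(num_windows):
        end = len(series) - k * prediction_length
        start = end - prediction_length
        splits.append((series[:start], series[start:end]))
    return splits

# Illustrative series of 10 observations with prediction_length = 2.
series = list(range(10))
train, val = last_window_split(series, prediction_length=2)
print(train, val)
for history, window in multi_window_split(series, prediction_length=2):
    print(history, window)
```

With several windows, validation scores are averaged over more than one forecast origin, at the cost of needing longer series.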

Methods

evaluate

Evaluate performance on the given dataset by computing the score determined by self.eval_metric, using the same prediction_length that was used when training the models.

fit

Fit probabilistic forecasting models to the given time series dataset.

fit_summary

Output summary of information about models produced during fit().

get_model_best

Returns the name of the best model from trainer.

get_model_names

Returns the list of model names trained by this predictor object.

info

Returns a dictionary of objects each describing an attribute of the training process and trained models.

leaderboard

Return a leaderboard showing the performance of every trained model. The output is a pandas data frame.

load

Load an existing TimeSeriesPredictor from the given path.

predict

Return quantile and mean forecasts for the given dataset, starting from the end of each time series.

refit_full

save

Save this predictor to a file in the directory specified by this predictor's path.

score

See evaluate().

Attributes

predictor_file_name

validation_splitter