autogluon.timeseries.TimeSeriesPredictor

class autogluon.timeseries.TimeSeriesPredictor(target: str | None = None, known_covariates_names: List[str] | None = None, prediction_length: int = 1, freq: str | None = None, eval_metric: str | TimeSeriesScorer | None = None, eval_metric_seasonal_period: int | None = None, path: str | Path | None = None, verbosity: int = 2, log_to_file: bool = True, log_file_path: str | Path = 'auto', quantile_levels: List[float] | None = None, cache_predictions: bool = True, learner_type: Type[AbstractLearner] | None = None, learner_kwargs: dict | None = None, label: str | None = None, **kwargs)

AutoGluon TimeSeriesPredictor predicts future values of multiple related time series.

TimeSeriesPredictor provides probabilistic (quantile) multi-step-ahead forecasts for univariate time series. The forecast includes both the mean (i.e., the conditional expectation of future values given the past) and the quantiles of the forecast distribution, which indicate the range of possible future outcomes.

TimeSeriesPredictor fits both “global” deep learning models that are shared across all time series (e.g., DeepAR, Transformer) and “local” statistical models that are fit to each individual time series (e.g., ARIMA, ETS).

TimeSeriesPredictor expects input data and makes predictions in the TimeSeriesDataFrame format.
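A minimal usage sketch, assuming `autogluon.timeseries` and pandas are installed. The synthetic data, the helper names `make_rows` and `fit_and_forecast`, and the restriction to the `Naive` and `SeasonalNaive` models are illustrative assumptions, not recommended settings.

```python
from datetime import date, timedelta


def make_rows(n_items=2, n_days=30):
    """Build synthetic long-format rows: one row per (item_id, timestamp)."""
    start = date(2024, 1, 1)
    rows = []
    for item in range(n_items):
        for day in range(n_days):
            rows.append(
                {
                    "item_id": f"item_{item}",
                    "timestamp": (start + timedelta(days=day)).isoformat(),
                    "target": float(10 * item + day % 7),
                }
            )
    return rows


def fit_and_forecast(rows, prediction_length=3):
    """Fit two simple baseline models and return their forecasts."""
    # Imported lazily so the data helper above works without AutoGluon installed.
    import pandas as pd
    from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

    df = pd.DataFrame(rows)
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    train_data = TimeSeriesDataFrame.from_data_frame(
        df, id_column="item_id", timestamp_column="timestamp"
    )
    predictor = TimeSeriesPredictor(
        target="target",
        prediction_length=prediction_length,
        eval_metric="WQL",
        quantile_levels=[0.1, 0.5, 0.9],
    )
    # Restrict training to fast statistical baselines to keep the sketch quick.
    predictor.fit(train_data, hyperparameters={"Naive": {}, "SeasonalNaive": {}})
    # Forecasts start after the last observed timestamp of each series.
    return predictor.predict(train_data)
```

Calling `fit_and_forecast(make_rows())` should return a TimeSeriesDataFrame with a "mean" column and one column per requested quantile level, covering 3 future days for each item.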

Parameters:
  • target (str, default = "target") – Name of column that contains the target values to forecast (i.e., numeric observations of the time series).

  • prediction_length (int, default = 1) – The forecast horizon, i.e., how many time steps into the future the models should be trained to predict. For example, if the time series contain daily observations, setting prediction_length = 3 will train models that predict up to 3 days into the future from the most recent observation.

  • freq (str, optional) –

    Frequency of the time series data (see pandas documentation for available frequencies). For example, "D" for daily data or "h" for hourly data.

    By default, the predictor will attempt to automatically infer the frequency from the data. This argument should only be set in two cases:

    1. The time series data has irregular timestamps, so frequency cannot be inferred automatically.

    2. You would like to resample the original data at a different frequency (for example, convert hourly measurements into daily measurements).

    If freq is provided when creating the predictor, all data passed to the predictor will be automatically resampled at this frequency.
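To illustrate what resampling to a coarser frequency means, here is a standard-library sketch that converts hourly measurements into daily ones. Aggregating by the mean and the helper name `resample_to_daily` are assumptions made for illustration; the aggregation AutoGluon applies internally may differ.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def resample_to_daily(observations):
    """Average (timestamp, value) pairs within each calendar day."""
    buckets = defaultdict(list)
    for timestamp, value in observations:
        buckets[timestamp.date()].append(value)
    return {day: sum(values) / len(values) for day, values in sorted(buckets.items())}


# 48 hourly measurements covering two days, with values 0.0 through 47.0.
start = datetime(2024, 1, 1)
hourly = [(start + timedelta(hours=h), float(h)) for h in range(48)]
daily = resample_to_daily(hourly)
```

Here `daily` holds two entries, one per calendar day, each being the mean of that day's 24 hourly values.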

  • eval_metric (Union[str, TimeSeriesScorer], default = "WQL") –

    Metric by which predictions will be ultimately evaluated on future test data. AutoGluon tunes hyperparameters in order to improve this metric on validation data, and ranks models (on validation data) according to this metric.

    Probabilistic forecast metrics (evaluated on quantile forecasts for the specified quantile_levels):

    • "SQL": scaled quantile loss

    • "WQL": weighted quantile loss

    Point forecast metrics (these are always evaluated on the "mean" column of the predictions):

    • "MAE": mean absolute error

    • "MAPE": mean absolute percentage error

    • "MASE": mean absolute scaled error

    • "MSE": mean squared error

    • "RMSE": root mean squared error

    • "RMSLE": root mean squared logarithmic error

    • "RMSSE": root mean squared scaled error

    • "SMAPE": “symmetric” mean absolute percentage error

    • "WAPE": weighted absolute percentage error

    For more information about these metrics, see Forecasting Time Series - Evaluation Metrics.
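As a rough illustration of how a probabilistic metric scores quantile forecasts, the sketch below computes the pinball (quantile) loss and one common formulation of the weighted quantile loss. The function names and the exact normalization are assumptions; the linked evaluation metrics page gives the definitions AutoGluon actually uses.

```python
def quantile_loss(y_true, y_pred, q):
    """Pinball loss of a single quantile forecast at level q."""
    diff = y_true - y_pred
    return q * diff if diff >= 0 else (q - 1) * diff


def weighted_quantile_loss(y_true, quantile_forecasts):
    """Pinball loss averaged over quantile levels and normalized by sum |y|.

    quantile_forecasts maps each quantile level to a list of forecasts
    aligned with y_true. The factor of 2 makes the 0.5-quantile term equal
    to the absolute error.
    """
    total = 0.0
    for q, forecasts in quantile_forecasts.items():
        total += sum(2 * quantile_loss(y, f, q) for y, f in zip(y_true, forecasts))
    scale = sum(abs(y) for y in y_true)
    return total / (len(quantile_forecasts) * scale)


# Median forecasts of 8 and 24 against actual values of 10 and 20.
example = weighted_quantile_loss([10.0, 20.0], {0.5: [8.0, 24.0]})
```

With only the 0.5 quantile, the result reduces to the sum of absolute errors divided by the sum of absolute target values, which resembles how "WAPE" is computed from a point forecast.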

  • eval_metric_seasonal_period (int, optional) – Seasonal period used to compute some evaluation metrics such as mean absolute scaled error (MASE). Defaults to None, in which case the seasonal period is computed based on the data frequency.
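The role of the seasonal period can be seen in a single-series sketch of MASE. The function name `mase` and the per-series formulation are assumptions for illustration; AutoGluon aggregates the metric across all time series in the dataset.

```python
def mase(history, y_true, y_pred, seasonal_period=1):
    """MASE for a single series.

    The forecast's mean absolute error is divided by the in-sample mean
    absolute error of a seasonal naive forecast, which repeats the value
    observed seasonal_period steps earlier.
    """
    if len(history) <= seasonal_period:
        raise ValueError("history must be longer than seasonal_period")
    naive_errors = [
        abs(history[t] - history[t - seasonal_period])
        for t in range(seasonal_period, len(history))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    forecast_mae = sum(abs(y - f) for y, f in zip(y_true, y_pred)) / len(y_true)
    return forecast_mae / scale


history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
score_period_1 = mase(history, [7.0, 8.0], [7.5, 8.5], seasonal_period=1)
score_period_2 = mase(history, [7.0, 8.0], [7.5, 8.5], seasonal_period=2)
```

The same forecast receives a different score under each seasonal period because the naive baseline it is compared against changes.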

  • known_covariates_names (List[str], optional) –

    Names of the covariates that are known in advance for all time steps in the forecast horizon. These are also known as dynamic features, exogenous variables, additional regressors or related time series. Examples of such covariates include holidays, promotions or weather forecasts.

    Currently, only numeric (float or integer dtype) covariates are supported.

    If known_covariates_names are provided, then:

    • fit(), evaluate(), and leaderboard() will expect a data frame with columns listed in known_covariates_names (in addition to the target column).

    • predict() will expect an additional keyword argument known_covariates containing the future values of the known covariates in TimeSeriesDataFrame format.
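The future covariate values must cover every time step in the forecast horizon for every item. A standard-library sketch of building such rows for daily data follows; the covariate name `is_weekend`, the item name `store_1`, and the helper `future_covariate_rows` are hypothetical.

```python
from datetime import date, timedelta


def future_covariate_rows(last_timestamps, prediction_length):
    """Build one row per item and future day, with a weekend indicator covariate.

    last_timestamps maps each item_id to the date of its last observation.
    """
    rows = []
    for item_id, last_day in last_timestamps.items():
        for step in range(1, prediction_length + 1):
            day = last_day + timedelta(days=step)
            rows.append(
                {
                    "item_id": item_id,
                    "timestamp": day,
                    # Numeric covariate known in advance: 1.0 on Saturday/Sunday.
                    "is_weekend": float(day.weekday() >= 5),
                }
            )
    return rows


# The last observation is on Thursday 2024-01-04, so the horizon spans Fri-Sun.
rows = future_covariate_rows({"store_1": date(2024, 1, 4)}, prediction_length=3)
```

Rows like these could be converted to a TimeSeriesDataFrame and passed to predict() through the known_covariates argument, provided the predictor was created with known_covariates_names=["is_weekend"].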

  • quantile_levels (List[float], optional) – List of increasing decimals that specifies which quantiles should be estimated when making distributional forecasts. Defaults to [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9].

  • path (str or pathlib.Path, optional) – Path to the directory where models and intermediate outputs will be saved. Defaults to a timestamped folder AutogluonModels/ag-[TIMESTAMP] that will be created in the working directory.

  • verbosity (int, default = 2) – Verbosity levels range from 0 to 4 and control how much information is printed to stdout. Higher levels correspond to more detailed print statements, and verbosity=0 suppresses output including warnings. Verbosity 0 corresponds to Python’s ERROR log level, where only error outputs will be logged. Verbosity 1 and 2 will additionally log warnings and info outputs, respectively. Verbosity 4 enables all logging output, including debug messages from AutoGluon and all logging in dependencies (GluonTS, PyTorch Lightning, AutoGluon-Tabular, etc.).

  • log_to_file (bool, default = True) – Whether to save the logs to a file for later reference.

  • log_file_path (Union[str, Path], default = "auto") – File path where the logs are saved. If "auto", logs will be saved under predictor_path/logs/predictor_log.txt. Ignored if log_to_file is set to False.

  • cache_predictions (bool, default = True) – If True, the predictor will cache and reuse the predictions made by individual models whenever the predict(), leaderboard(), or evaluate() methods are called. This can significantly speed up these methods. If False, caching will be disabled. You can set this argument to False to reduce disk usage at the cost of longer prediction times.

  • label (str, optional) – Alias for target.

__init__(target: str | None = None, known_covariates_names: List[str] | None = None, prediction_length: int = 1, freq: str | None = None, eval_metric: str | TimeSeriesScorer | None = None, eval_metric_seasonal_period: int | None = None, path: str | Path | None = None, verbosity: int = 2, log_to_file: bool = True, log_file_path: str | Path = 'auto', quantile_levels: List[float] | None = None, cache_predictions: bool = True, learner_type: Type[AbstractLearner] | None = None, learner_kwargs: dict | None = None, label: str | None = None, **kwargs)

Methods

evaluate

Evaluate the forecast accuracy for the given dataset.

feature_importance

Calculates feature importance scores for the given model by replacing each feature either with a shuffled version of the same feature (also known as permutation feature importance) or with a constant value representing the median or mode of the feature, and then computing the relative decrease in the model's predictive performance.

fit

Fit probabilistic forecasting models to the given time series dataset.

fit_summary

Output summary of information about models produced during fit().

info

Returns a dictionary of objects each describing an attribute of the training process and trained models.

leaderboard

Return a leaderboard showing the performance of every trained model. The output is a pandas data frame.

load

Load an existing TimeSeriesPredictor from the given path.

model_names

Returns the list of model names trained by this predictor object.

persist

Persist models in memory for reduced inference latency.

plot

Plot historic time series values and the forecasts.

predict

Return quantile and mean forecasts for the given dataset, starting from the end of each time series.

refit_full

Retrain model on all of the data (training + validation).

save

Save this predictor to a file in the directory specified by this predictor's path.

unpersist

Unpersist models in memory for reduced memory usage.

Attributes

model_best

Returns the name of the best model from trainer.

predictor_file_name