autogluon.timeseries.TimeSeriesPredictor
- class autogluon.timeseries.TimeSeriesPredictor(target: str | None = None, known_covariates_names: List[str] | None = None, prediction_length: int = 1, freq: str = None, eval_metric: str | TimeSeriesScorer | None = None, eval_metric_seasonal_period: int | None = None, path: str | Path | None = None, verbosity: int = 2, log_to_file: bool = True, log_file_path: str | Path = 'auto', quantile_levels: List[float] | None = None, cache_predictions: bool = True, learner_type: Type[AbstractLearner] | None = None, learner_kwargs: dict | None = None, label: str | None = None, **kwargs)
AutoGluon TimeSeriesPredictor predicts future values of multiple related time series.
TimeSeriesPredictor provides probabilistic (quantile) multi-step-ahead forecasts for univariate time series. The forecast includes both the mean (i.e., the conditional expectation of future values given the past) and the quantiles of the forecast distribution, which indicate the range of possible future outcomes.
TimeSeriesPredictor fits both “global” deep learning models that are shared across all time series (e.g., DeepAR, Transformer) and “local” statistical models that are fit to each individual time series (e.g., ARIMA, ETS).
TimeSeriesPredictor expects input data and makes predictions in the TimeSeriesDataFrame format.
- Parameters:
target (str, default = "target") – Name of column that contains the target values to forecast (i.e., numeric observations of the time series).
prediction_length (int, default = 1) – The forecast horizon, i.e., how many time steps into the future the models should be trained to predict. For example, if the time series contain daily observations, setting prediction_length = 3 will train models that predict up to 3 days into the future from the most recent observation.
freq (str, optional) –
Frequency of the time series data (see pandas documentation for available frequencies). For example, "D" for daily data or "h" for hourly data.
By default, the predictor will attempt to automatically infer the frequency from the data. This argument should only be set in two cases:
- The time series data has irregular timestamps, so frequency cannot be inferred automatically.
- You would like to resample the original data at a different frequency (for example, convert hourly measurements into daily measurements).
If freq is provided when creating the predictor, all data passed to the predictor will be automatically resampled at this frequency.
eval_metric (Union[str, TimeSeriesScorer], default = "WQL") –
Metric by which predictions will be ultimately evaluated on future test data. AutoGluon tunes hyperparameters in order to improve this metric on validation data, and ranks models (on validation data) according to this metric.
Probabilistic forecast metrics (evaluated on quantile forecasts for the specified quantile_levels):
- "SQL": scaled quantile loss
- "WQL": weighted quantile loss
Point forecast metrics (these are always evaluated on the "mean" column of the predictions):
- "MAE": mean absolute error
- "MAPE": mean absolute percentage error
- "MASE": mean absolute scaled error
- "MSE": mean squared error
- "RMSE": root mean squared error
- "RMSLE": root mean squared logarithmic error
- "RMSSE": root mean squared scaled error
- "SMAPE": “symmetric” mean absolute percentage error
- "WAPE": weighted absolute percentage error
For more information about these metrics, see Forecasting Time Series - Evaluation Metrics.
eval_metric_seasonal_period (int, optional) – Seasonal period used to compute some evaluation metrics such as mean absolute scaled error (MASE). Defaults to None, in which case the seasonal period is computed based on the data frequency.
known_covariates_names (List[str], optional) –
Names of the covariates that are known in advance for all time steps in the forecast horizon. These are also known as dynamic features, exogenous variables, additional regressors or related time series. Examples of such covariates include holidays, promotions or weather forecasts.
If known_covariates_names are provided, then:
- fit(), evaluate(), and leaderboard() will expect a data frame with columns listed in known_covariates_names (in addition to the target column).
- predict() will expect an additional keyword argument known_covariates containing the future values of the known covariates in TimeSeriesDataFrame format.
quantile_levels (List[float], optional) – List of increasing decimals that specifies which quantiles should be estimated when making distributional forecasts. Defaults to [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9].
path (str or pathlib.Path, optional) – Path to the directory where models and intermediate outputs will be saved. Defaults to a timestamped folder AutogluonModels/ag-[TIMESTAMP] that will be created in the working directory.
verbosity (int, default = 2) – Verbosity levels range from 0 to 4 and control how much information is printed to stdout. Higher levels correspond to more detailed print statements, and verbosity=0 suppresses output including warnings. Verbosity 0 corresponds to Python’s ERROR log level, where only error outputs will be logged. Verbosity 1 and 2 will additionally log warnings and info outputs, respectively. Verbosity 4 enables all logging output including debug messages from AutoGluon and all logging in dependencies (GluonTS, PyTorch Lightning, AutoGluon-Tabular, etc.).
log_to_file (bool, default = True) – Whether to save the logs into a file for later reference.
log_file_path (Union[str, Path], default = "auto") – File path to save the logs. If "auto", logs will be saved under predictor_path/logs/predictor_log.txt. Ignored if log_to_file is set to False.
cache_predictions (bool, default = True) – If True, the predictor will cache and reuse the predictions made by individual models whenever the predict(), leaderboard(), or evaluate() methods are called. This can significantly speed up these methods. If False, caching will be disabled. You can set this argument to False to reduce disk usage at the cost of longer prediction times.
label (str, optional) – Alias for target.
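To make the quantile_levels parameter and the quantile-loss metrics listed under eval_metric more concrete, here is a minimal plain-Python sketch of the pinball (quantile) loss, the quantity that quantile-loss metrics are generally built on. It is an illustration only, not AutoGluon's implementation: the function name pinball_loss and the sample numbers are hypothetical, and the scaling and weighting that AutoGluon applies for "SQL" and "WQL" are not reproduced here.

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball (quantile) loss of forecasts y_pred at quantile level q."""
    if not 0.0 < q < 1.0:
        raise ValueError("quantile level must lie strictly between 0 and 1")
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-forecasts (y >= f) are weighted by q, over-forecasts by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Hypothetical observations and a flat forecast for three time steps.
observed = [10.0, 12.0, 9.0]
forecast = [13.0, 13.0, 13.0]

# Scored as a 0.9-quantile forecast, sitting above the observations is cheap.
loss_p90 = pinball_loss(observed, forecast, q=0.9)
# Scored as a 0.1-quantile forecast, the same values are penalized much more.
loss_p10 = pinball_loss(observed, forecast, q=0.1)
```

The asymmetry is the point: a forecast that lies above every observation is a plausible upper quantile but a poor lower quantile, so the loss depends on the quantile level it is scored against.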
- __init__(target: str | None = None, known_covariates_names: List[str] | None = None, prediction_length: int = 1, freq: str = None, eval_metric: str | TimeSeriesScorer | None = None, eval_metric_seasonal_period: int | None = None, path: str | Path | None = None, verbosity: int = 2, log_to_file: bool = True, log_file_path: str | Path = 'auto', quantile_levels: List[float] | None = None, cache_predictions: bool = True, learner_type: Type[AbstractLearner] | None = None, learner_kwargs: dict | None = None, label: str | None = None, **kwargs)
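As a rough illustration of how the constructor arguments fit together, the sketch below collects a set of settings for a daily forecasting task and passes them to TimeSeriesPredictor. The column names "sales" and "promotion", the output directory, and the helper build_predictor are hypothetical and chosen only for this example; the keyword names themselves come from the signature above.

```python
# Hypothetical settings for a daily sales forecasting task.
predictor_kwargs = {
    "target": "sales",                        # hypothetical target column name
    "prediction_length": 7,                   # forecast 7 time steps (days) ahead
    "freq": "D",                              # input data will be resampled to daily frequency
    "eval_metric": "WQL",                     # weighted quantile loss
    "known_covariates_names": ["promotion"],  # hypothetical covariate known in advance
    "quantile_levels": [0.1, 0.5, 0.9],       # must be listed in increasing order
    "path": "AutogluonModels/sales-demo",     # hypothetical output directory
    "verbosity": 2,
}


def build_predictor(**overrides):
    """Create a TimeSeriesPredictor from the settings above (needs autogluon.timeseries)."""
    # Imported lazily so the settings can be inspected without AutoGluon installed.
    from autogluon.timeseries import TimeSeriesPredictor

    return TimeSeriesPredictor(**{**predictor_kwargs, **overrides})
```

Calling build_predictor(prediction_length=14), for instance, would reuse the same settings with a longer forecast horizon.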
Methods
evaluate – Evaluate the forecast accuracy for given dataset.
feature_importance – Calculates feature importance scores for the given model via replacing each feature by a shuffled version of the same feature (also known as permutation feature importance) or by assigning a constant value representing the median or mode of the feature, and computing the relative decrease in the model's predictive performance.
fit – Fit probabilistic forecasting models to the given time series dataset.
fit_summary – Output summary of information about models produced during fit().
info – Returns a dictionary of objects each describing an attribute of the training process and trained models.
leaderboard – Return a leaderboard showing the performance of every trained model. The output is a pandas data frame.
load – Load an existing TimeSeriesPredictor from given path.
model_names – Returns the list of model names trained by this predictor object.
persist – Persist models in memory for reduced inference latency.
plot – Plot historic time series values and the forecasts.
predict – Return quantile and mean forecasts for the given dataset, starting from the end of each time series.
refit_full – Retrain model on all of the data (training + validation).
save – Save this predictor to file in directory specified by this Predictor's path.
unpersist – Unpersist models in memory for reduced memory usage.
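The methods summarized above are typically used together. The following is a hedged sketch of one possible workflow, assuming train_data and test_data are TimeSeriesDataFrame objects prepared by the caller and that no known covariates are used. The wrapper function run_forecasting_workflow and its return structure are hypothetical; only the predictor methods it calls are taken from this page.

```python
def run_forecasting_workflow(train_data, test_data, prediction_length=7):
    """Fit models, compare them, forecast, then save and reload the predictor.

    Both data arguments are assumed to be TimeSeriesDataFrame objects.
    """
    # Imported lazily so this module can be loaded without AutoGluon installed.
    from autogluon.timeseries import TimeSeriesPredictor

    predictor = TimeSeriesPredictor(prediction_length=prediction_length)
    predictor.fit(train_data)                       # fit forecasting models
    leaderboard = predictor.leaderboard(test_data)  # pandas data frame of model performance
    scores = predictor.evaluate(test_data)          # forecast accuracy on test_data
    predictions = predictor.predict(train_data)     # mean and quantile forecasts

    predictor.save()                                # written under predictor.path
    reloaded = TimeSeriesPredictor.load(predictor.path)

    return {
        "leaderboard": leaderboard,
        "scores": scores,
        "predictions": predictions,
        "reloaded_predictor": reloaded,
    }
```

Forecasts start from the end of each time series in the data passed to predict(), so passing train_data yields predictions for the prediction_length steps that follow the training window.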
Attributes
model_best – Returns the name of the best model from trainer.
predictor_file_name