Forecasting Time Series - Model Zoo#

Note

This documentation is intended for advanced users and may not be comprehensive.

For a stable public API, refer to the documentation for TimeSeriesPredictor.

This page contains the list of time series forecasting models available in AutoGluon. The available hyperparameters for each model are listed under Other Parameters.

This list is useful if you want to override the default hyperparameters (Manually configuring models) or define custom hyperparameter search spaces (Hyperparameter tuning), as described in the In-depth Tutorial. For example, the following code will train a TimeSeriesPredictor with DeepAR and ETS models with default hyperparameters (and a weighted ensemble on top of them):

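from autogluon.timeseries import TimeSeriesPredictor

# train_data is assumed to be a TimeSeriesDataFrame prepared beforehand.
predictor = TimeSeriesPredictor().fit(
   train_data,
   hyperparameters={
      "DeepAR": {},
      "ETS": {},
   },
)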

The model names in the hyperparameters dictionary don’t have to include the "Model" suffix (e.g., both "DeepAR" and "DeepARModel" correspond to DeepARModel).
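The suffix rule can be illustrated with a small sketch. The helper `canonical_model_name` is hypothetical and only mirrors the naming convention described here; it is not part of AutoGluon's API.

```python
def canonical_model_name(name: str) -> str:
    """Return the class-style name, appending "Model" if it is missing."""
    return name if name.endswith("Model") else name + "Model"


# Both spellings resolve to the same class name.
for key in ("DeepAR", "DeepARModel", "ETS"):
    print(key, "->", canonical_model_name(key))
```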

Note that some of the models’ hyperparameters have names and default values that are different from the original libraries.

Default models#

NaiveModel

Baseline model that sets the forecast equal to the last observed value.

SeasonalNaiveModel

Baseline model that sets the forecast equal to the last observed value from the same season.

ARIMAModel

Autoregressive Integrated Moving Average (ARIMA) model.

ETSModel

Exponential smoothing with trend and seasonality.

AutoARIMAModel

Automatically tuned ARIMA model.

AutoETSModel

Automatically tuned exponential smoothing with trend and seasonality.

ThetaModel

Theta forecasting model [Assimakopoulos2000].

DynamicOptimizedThetaModel

Optimized Theta forecasting model [Fiorucci2016].

DirectTabularModel

Predict all future time series values simultaneously using TabularPredictor from AutoGluon-Tabular.

RecursiveTabularModel

Predict future time series values one by one using TabularPredictor from AutoGluon-Tabular.

DeepARModel

Autoregressive forecasting model based on a recurrent neural network [Salinas2020].

DLinearModel

Simple feedforward neural network that subtracts trend before forecasting [Zeng2023].

PatchTSTModel

Transformer-based forecaster that segments each time series into patches [Nie2023].

SimpleFeedForwardModel

Simple feedforward neural network that simultaneously predicts all future values.

TemporalFusionTransformerModel

Combines LSTM with a transformer layer to predict the quantiles of all future target values [Lim2021].

NaiveModel#

class autogluon.timeseries.models.NaiveModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Baseline model that sets the forecast equal to the last observed value.

Quantiles are obtained by assuming that the residuals follow a zero-mean normal distribution whose scale is estimated from the empirical distribution of the residuals, as described in https://otexts.com/fpp3/prediction-intervals.html
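A minimal sketch of this idea in plain Python, not AutoGluon's actual implementation. It assumes the residual scale is the root mean squared one-step residual and that the h-step standard deviation grows as sqrt(h), which is the formula the linked FPP3 chapter gives for the naive method. The function name and the default quantile levels are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def naive_forecast(history, prediction_length, quantile_levels=(0.1, 0.5, 0.9)):
    """Naive point forecast with normal-approximation quantiles."""
    last_value = history[-1]
    # One-step residuals of the naive method: y[t] - y[t-1].
    residuals = [b - a for a, b in zip(history[:-1], history[1:])]
    # Assumption: scale is the root mean squared residual (must be > 0).
    sigma = sqrt(sum(r * r for r in residuals) / len(residuals))
    forecasts = []
    for h in range(1, prediction_length + 1):
        # For the naive method the h-step standard deviation grows as sqrt(h).
        dist = NormalDist(mu=last_value, sigma=sigma * sqrt(h))
        forecasts.append({q: dist.inv_cdf(q) for q in quantile_levels})
    return forecasts


for step, quantiles in enumerate(naive_forecast([1.0, 2.0, 4.0, 7.0], 2), start=1):
    print(step, quantiles)
```

The median stays at the last observed value, while the outer quantiles widen with the forecast horizon.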

SeasonalNaiveModel#

class autogluon.timeseries.models.SeasonalNaiveModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Baseline model that sets the forecast equal to the last observed value from the same season.

Quantiles are obtained by assuming that the residuals follow a zero-mean normal distribution whose scale is estimated from the empirical distribution of the residuals, as described in https://otexts.com/fpp3/prediction-intervals.html

Parameters

seasonal_period (int or None, default = None) – Number of time steps in a complete seasonal cycle for seasonal models. For example, 7 for daily data with a weekly cycle or 12 for monthly data with an annual cycle. When set to None, seasonal_period will be inferred from the frequency of the training data. Can also be specified manually by providing an integer > 1. If seasonal_period (inferred or provided) is equal to 1, the model falls back to the Naive forecast. Seasonality will also be disabled if the length of the time series is < seasonal_period.
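The point forecast and the fallback rule can be sketched as follows. This is an illustration in plain Python under the rules stated above, not the library's code; the function name is hypothetical.

```python
def seasonal_naive_forecast(history, prediction_length, seasonal_period):
    """Repeat the last observed seasonal cycle; fall back to naive if needed."""
    if seasonal_period <= 1 or len(history) < seasonal_period:
        # No usable seasonality: repeat the last observed value.
        return [history[-1]] * prediction_length
    last_cycle = history[-seasonal_period:]
    # Step h of the forecast copies the value observed one season earlier.
    return [last_cycle[h % seasonal_period] for h in range(prediction_length)]


print(seasonal_naive_forecast([1, 2, 3, 4, 5, 6], 4, seasonal_period=3))
print(seasonal_naive_forecast([1, 2], 3, seasonal_period=7))
```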

ARIMAModel#

class autogluon.timeseries.models.ARIMAModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Autoregressive Integrated Moving Average (ARIMA) model.

Based on statsmodels.tsa.statespace.sarimax.SARIMAX.

Our implementation contains several improvements over the Statsmodels version, such as multi-CPU training and reduced disk usage when saving models.

Parameters
  • order (Tuple[int, int, int], default = (1, 1, 1)) – The (p, d, q) order of the model for the number of AR parameters, differences, and MA parameters to use.

  • seasonal_order (Tuple[int, int, int], default = (0, 0, 0)) – The (P, D, Q) parameters of the seasonal ARIMA model. Setting to (0, 0, 0) disables seasonality.

  • seasonal_period (int or None, default = None) – Number of time steps in a complete seasonal cycle for seasonal models. For example, 7 for daily data with a weekly cycle or 12 for monthly data with an annual cycle. When set to None, seasonal_period will be inferred from the frequency of the training data. Can also be specified manually by providing an integer > 1. If seasonal_period (inferred or provided) is equal to 1, seasonality will be disabled.

  • trend ({"n", "c", "t", "ct"}, default = "c") – Parameter controlling the trend polynomial. Allowed values are “n” (no trend), “c” (constant), “t” (linear) and “ct” (constant plus linear).

  • enforce_stationarity (bool, default = True) – Whether to transform the AR parameters to enforce stationarity in the autoregressive component of the model. If ARIMA crashes during fitting with an LU decomposition error, you can either set enforce_stationarity to False or increase the differencing parameter d in order.

  • enforce_invertibility (bool, default = True) – Whether to transform the MA parameters to enforce invertibility in the moving average component of the model.

  • maxiter (int, default = 50) – Number of iterations during optimization.

  • n_jobs (int or float, default = 0.5) – Number of CPU cores used to fit the models in parallel. When set to a float between 0.0 and 1.0, that fraction of available CPU cores is used. When set to a positive integer, that many cores are used. When set to -1, all CPU cores are used.

  • max_ts_length (int, default = 2500) – If not None, only the last max_ts_length time steps of each time series will be used to train the model. This significantly speeds up fitting and usually leads to no change in accuracy.
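The n_jobs and max_ts_length settings appear on several models in this page. The sketch below shows one way to read them. Both helpers are hypothetical, and the rounding of fractional n_jobs values (round down, at least one core) is an assumption rather than documented behavior.

```python
import os


def resolve_n_jobs(n_jobs, cpu_count=None):
    """Translate the n_jobs setting into a concrete number of CPU cores."""
    cpu_count = cpu_count or os.cpu_count() or 1
    if n_jobs == -1:
        return cpu_count
    if isinstance(n_jobs, float) and 0.0 < n_jobs <= 1.0:
        # Assumption: round down, but always keep at least one core.
        return max(1, int(cpu_count * n_jobs))
    if isinstance(n_jobs, int) and n_jobs > 0:
        return n_jobs
    raise ValueError(f"Unsupported n_jobs value: {n_jobs!r}")


def truncate_series(values, max_ts_length=2500):
    """Keep only the last max_ts_length observations (None keeps everything)."""
    if max_ts_length is None:
        return values
    return values[-max_ts_length:]


print(resolve_n_jobs(0.5, cpu_count=8))
print(truncate_series(list(range(10)), max_ts_length=3))
```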

ETSModel#

class autogluon.timeseries.models.ETSModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Exponential smoothing with trend and seasonality.

Based on statsmodels.tsa.exponential_smoothing.ets.ETSModel.

Our implementation contains several improvements over the Statsmodels version, such as multi-CPU training and reduced disk usage when saving models.

Parameters
  • error ({"add", "mul"}, default = "add") – Error model. Allowed values are “add” (additive) and “mul” (multiplicative). Note that “mul” is only applicable to time series with positive values.

  • trend ({"add", "mul", None}, default = "add") – Trend component model. Allowed values are “add” (additive), “mul” (multiplicative) and None (disabled). Note that “mul” is only applicable to time series with positive values.

  • damped_trend (bool, default = False) – Whether or not the included trend component is damped.

  • seasonal ({"add", "mul", None}, default = "add") – Seasonal component model. Allowed values are “add” (additive), “mul” (multiplicative) and None (disabled). Note that “mul” is only applicable to time series with positive values.

  • seasonal_period (int or None, default = None) – Number of time steps in a complete seasonal cycle for seasonal models. For example, 7 for daily data with a weekly cycle or 12 for monthly data with an annual cycle. When set to None, seasonal_period will be inferred from the frequency of the training data. Can also be specified manually by providing an integer > 1. If seasonal_period (inferred or provided) is equal to 1, seasonality will be disabled. Seasonality will also be disabled if the length of the time series is < 2 * seasonal_period.

  • maxiter (int, default = 1000) – Number of iterations during optimization.

  • n_jobs (int or float, default = 0.5) – Number of CPU cores used to fit the models in parallel. When set to a float between 0.0 and 1.0, that fraction of available CPU cores is used. When set to a positive integer, that many cores are used. When set to -1, all CPU cores are used.

  • max_ts_length (int, default = 2500) – If not None, only the last max_ts_length time steps of each time series will be used to train the model. This significantly speeds up fitting and usually leads to no change in accuracy.
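The conditions under which the seasonal component is dropped can be summarized in a short sketch. The helper is hypothetical, and it assumes the length check is applied to each time series individually.

```python
def ets_seasonality_enabled(seasonal, seasonal_period, series_length):
    """Return True if the seasonal component would be kept for this series."""
    if seasonal is None:
        return False  # seasonality explicitly disabled
    if seasonal_period <= 1:
        return False  # no seasonal cycle at this frequency
    if series_length < 2 * seasonal_period:
        return False  # fewer than two full cycles observed
    return True


print(ets_seasonality_enabled("add", seasonal_period=12, series_length=30))
print(ets_seasonality_enabled("add", seasonal_period=12, series_length=23))
```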

AutoARIMAModel#

class autogluon.timeseries.models.AutoARIMAModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Automatically tuned ARIMA model.

Automatically selects the best (p,d,q,P,D,Q) model parameters using an information criterion.

Based on statsforecast.models.AutoARIMA.

Parameters
  • d (int, optional) – Order of first differencing. If None, will be determined automatically using a statistical test.

  • D (int, optional) – Order of seasonal differencing. If None, will be determined automatically using a statistical test.

  • max_p (int, default = 5) – Maximum number of autoregressive terms.

  • max_q (int, default = 5) – Maximum order of moving average.

  • max_P (int, default = 2) – Maximum number of seasonal autoregressive terms.

  • max_Q (int, default = 2) – Maximum order of seasonal moving average.

  • max_d (int, default = 2) – Maximum order of first differencing.

  • max_D (int, default = 1) – Maximum order of seasonal differencing.

  • start_p (int, default = 2) – Starting value of p in stepwise procedure.

  • start_q (int, default = 2) – Starting value of q in stepwise procedure.

  • start_P (int, default = 1) – Starting value of P in stepwise procedure.

  • start_Q (int, default = 1) – Starting value of Q in stepwise procedure.

  • stationary (bool, default = False) – Restrict search to stationary models.

  • seasonal (bool, default = True) – Whether to consider seasonal models.

  • approximation (bool, default = True) – Approximate optimization for faster convergence.

  • allowdrift (bool, default = False) – If True, drift term is allowed.

  • allowmean (bool, default = True) – If True, non-zero mean is allowed.

  • seasonal_period (int or None, default = None) – Number of time steps in a complete seasonal cycle for seasonal models. For example, 7 for daily data with a weekly cycle or 12 for monthly data with an annual cycle. When set to None, seasonal_period will be inferred from the frequency of the training data. Can also be specified manually by providing an integer > 1. If seasonal_period (inferred or provided) is equal to 1, seasonality will be disabled.

  • n_jobs (int or float, default = 0.5) – Number of CPU cores used to fit the models in parallel. When set to a float between 0.0 and 1.0, that fraction of available CPU cores is used. When set to a positive integer, that many cores are used. When set to -1, all CPU cores are used.

  • max_ts_length (int, default = 2500) – If not None, only the last max_ts_length time steps of each time series will be used to train the model. This significantly speeds up fitting and usually leads to no change in accuracy.
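As an illustration, the parameters above are passed as a nested dictionary under the model name. The values below are arbitrary examples chosen to shrink the search space, not recommendations, and the commented-out call assumes that train_data is an existing TimeSeriesDataFrame.

```python
# Restrict AutoARIMA to a smaller, non-seasonal search on a single core.
auto_arima_overrides = {
    "max_p": 3,
    "max_q": 3,
    "seasonal": False,
    "n_jobs": 1,
}
hyperparameters = {"AutoARIMA": auto_arima_overrides}

# Hypothetical usage (requires autogluon.timeseries and prepared data):
# predictor = TimeSeriesPredictor(prediction_length=24).fit(
#     train_data, hyperparameters=hyperparameters
# )
print(hyperparameters)
```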

AutoETSModel#

class autogluon.timeseries.models.AutoETSModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Automatically tuned exponential smoothing with trend and seasonality.

Automatically selects the best ETS (Error, Trend, Seasonality) model using an information criterion.

Based on statsforecast.models.AutoETS.

Parameters
  • model (str, default = "ZZZ") – Model string describing the configuration of the E (error), T (trend) and S (seasonal) model components. Each component can be one of “M” (multiplicative), “A” (additive), “N” (omitted). For example, when model="ANN" (additive error, no trend, and no seasonality), ETS will explore only simple exponential smoothing.

  • seasonal_period (int or None, default = None) – Number of time steps in a complete seasonal cycle for seasonal models. For example, 7 for daily data with a weekly cycle or 12 for monthly data with an annual cycle. When set to None, seasonal_period will be inferred from the frequency of the training data. Can also be specified manually by providing an integer > 1. If seasonal_period (inferred or provided) is equal to 1, seasonality will be disabled.

  • n_jobs (int or float, default = 0.5) – Number of CPU cores used to fit the models in parallel. When set to a float between 0.0 and 1.0, that fraction of available CPU cores is used. When set to a positive integer, that many cores are used. When set to -1, all CPU cores are used.

  • max_ts_length (int, default = 2500) – If not None, only the last max_ts_length time steps of each time series will be used to train the model. This significantly speeds up fitting and usually leads to no change in accuracy.
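The three-letter model string can be decoded as in the sketch below. The parser is hypothetical. Reading "Z" as "selected automatically" follows the convention of statsforecast and R's ets function and is stated here as an assumption, since the parameter description above only lists "M", "A" and "N".

```python
COMPONENT_NAMES = {
    "A": "additive",
    "M": "multiplicative",
    "N": "none",
    "Z": "auto",  # assumption: component chosen automatically
}


def parse_ets_model(model="ZZZ"):
    """Decode an ETS model string into its error, trend and seasonal parts."""
    if len(model) != 3 or any(letter not in COMPONENT_NAMES for letter in model):
        raise ValueError(f"Invalid ETS model string: {model!r}")
    error, trend, seasonal = (COMPONENT_NAMES[letter] for letter in model)
    return {"error": error, "trend": trend, "seasonal": seasonal}


print(parse_ets_model("ANN"))
print(parse_ets_model())
```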

ThetaModel#

class autogluon.timeseries.models.ThetaModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Theta forecasting model [Assimakopoulos2000].

Based on statsforecast.models.Theta.

References

Assimakopoulos2000

Assimakopoulos, Vassilis, and Konstantinos Nikolopoulos. “The theta model: a decomposition approach to forecasting.” International journal of forecasting 16.4 (2000): 521-530.

Parameters
  • decomposition_type ({"multiplicative", "additive"}, default = "multiplicative") – Seasonal decomposition type.

  • seasonal_period (int or None, default = None) – Number of time steps in a complete seasonal cycle for seasonal models. For example, 7 for daily data with a weekly cycle or 12 for monthly data with an annual cycle. When set to None, seasonal_period will be inferred from the frequency of the training data. Can also be specified manually by providing an integer > 1. If seasonal_period (inferred or provided) is equal to 1, seasonality will be disabled.

  • n_jobs (int or float, default = 0.5) – Number of CPU cores used to fit the models in parallel. When set to a float between 0.0 and 1.0, that fraction of available CPU cores is used. When set to a positive integer, that many cores are used. When set to -1, all CPU cores are used.

  • max_ts_length (int, default = 2500) – If not None, only the last max_ts_length time steps of each time series will be used to train the model. This significantly speeds up fitting and usually leads to no change in accuracy.

DynamicOptimizedThetaModel#

class autogluon.timeseries.models.DynamicOptimizedThetaModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Optimized Theta forecasting model [Fiorucci2016].

Based on statsforecast.models.DynamicOptimizedTheta.

References

Fiorucci2016

Fiorucci, Jose et al. “Models for optimising the theta method and their relationship to state space models.” International journal of forecasting 32.4 (2016): 1151-1161.

Parameters
  • decomposition_type ({"multiplicative", "additive"}, default = "multiplicative") – Seasonal decomposition type.

  • seasonal_period (int or None, default = None) – Number of time steps in a complete seasonal cycle for seasonal models. For example, 7 for daily data with a weekly cycle or 12 for monthly data with an annual cycle. When set to None, seasonal_period will be inferred from the frequency of the training data. Can also be specified manually by providing an integer > 1. If seasonal_period (inferred or provided) is equal to 1, seasonality will be disabled.

  • n_jobs (int or float, default = 0.5) – Number of CPU cores used to fit the models in parallel. When set to a float between 0.0 and 1.0, that fraction of available CPU cores is used. When set to a positive integer, that many cores are used. When set to -1, all CPU cores are used.

  • max_ts_length (int, default = 2500) – If not None, only the last max_ts_length time steps of each time series will be used to train the model. This significantly speeds up fitting and usually leads to no change in accuracy.

DirectTabularModel#

class autogluon.timeseries.models.DirectTabularModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Predict all future time series values simultaneously using TabularPredictor from AutoGluon-Tabular.

A single TabularPredictor is used to forecast all future time series values using the following features:

  • lag features (observed time series values) based on freq of the data

  • time features (e.g., day of the week) based on the timestamp of the measurement

  • lagged known and past covariates (if available)

  • static features of each item (if available)

Features not known during the forecast horizon (e.g., future target values) are replaced by NaNs.

If eval_metric=="mean_wQuantileLoss", the TabularPredictor will be trained with "quantile" problem type. Otherwise, TabularPredictor will be trained with "regression" problem type, and dummy quantiles will be obtained by assuming that the residuals follow a zero-mean normal distribution.

Parameters
  • max_num_samples (int, default = 1_000_000) – Maximum number of rows in the training and validation sets. If the number of rows in train or validation data exceeds max_num_samples, then max_num_samples many rows are subsampled from the dataframe.

  • tabular_hyperparameters (Dict[str, Dict[str, Any]], optional) – Hyperparameters dictionary passed to TabularPredictor.fit. Contains the names of models that should be fit. Defaults to {"GBM": {}}.
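The replacement of unknown features by NaNs can be illustrated for lag features. In the sketch below, a lag that would point at a value inside the forecast horizon is set to NaN. The helper and the feature names are hypothetical and do not reflect how AutoGluon lays out its feature table.

```python
import math


def direct_lag_features(history, lags, step):
    """Lag features for the target `step` steps ahead (step >= 1)."""
    target_index = len(history) + step - 1  # zero-based index of the target
    features = {}
    for lag in lags:
        source_index = target_index - lag
        if 0 <= source_index < len(history):
            features[f"lag_{lag}"] = history[source_index]
        else:
            # The value lies in the future (or before the series start).
            features[f"lag_{lag}"] = math.nan
    return features


# Two steps ahead, lag 1 points at a future value and becomes NaN.
print(direct_lag_features([1, 2, 3, 4, 5], lags=[1, 2, 3], step=2))
```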

RecursiveTabularModel#

class autogluon.timeseries.models.RecursiveTabularModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Predict future time series values one by one using TabularPredictor from AutoGluon-Tabular.

Based on the mlforecast library.

Parameters
  • lags (List[int], default = None) – Lags of the target that will be used as features for predictions. If None, will be determined automatically based on the frequency of the data.

  • date_features (List[Union[str, Callable]], default = None) – Features computed from the dates. Can be pandas date attributes or functions that will take the dates as input. If None, will be determined automatically based on the frequency of the data.

  • differences (List[int], default = None) – Differences to take of the target before computing the features. These are restored at the forecasting step. If None, will be set to [seasonal_period], where seasonal_period is determined based on the data frequency.

  • standardize (bool, default = True) – If True, time series values will be standardized by subtracting the mean and dividing by the standard deviation.

  • tabular_hyperparameters (Dict[str, Dict[str, Any]], optional) – Hyperparameters dictionary passed to TabularPredictor.fit. Contains the names of models that should be fit. Defaults to {"GBM": {}}.

  • tabular_fit_kwargs (Dict[str, Any], optional) – Additional keyword arguments passed to TabularPredictor.fit. Defaults to an empty dict.

  • max_num_samples (int, default = 1_000_000) – If given, training and validation datasets will contain at most this many rows (starting from the end of each series).

  • subsampling_strategy ({"items", "timesteps", None}, default = "items") – Strategy used to limit memory consumption of the model if the dataset is too large. Use “items” if the dataset contains many time series, “timesteps” if the dataset contains a few very long time series, or None to disable subsampling. Only applies to datasets with > 20_000_000 rows.
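The "one by one" strategy can be sketched as a loop that feeds each prediction back in as a lag feature for the next step. The one-step predictor below is a stand-in (the mean of the lag features), not a TabularPredictor, and the function names are hypothetical.

```python
def recursive_forecast(history, lags, predict_one_step, prediction_length):
    """Forecast one step at a time, reusing predictions as lag features."""
    values = list(history)
    forecasts = []
    for _ in range(prediction_length):
        # Lag k refers to the value k steps before the one being predicted.
        features = [values[-lag] for lag in lags]
        next_value = predict_one_step(features)
        forecasts.append(next_value)
        values.append(next_value)  # the prediction becomes part of the history
    return forecasts


def mean_of_lags(features):
    """Toy one-step model used only for illustration."""
    return sum(features) / len(features)


print(recursive_forecast([1.0, 2.0, 3.0], [1, 2], mean_of_lags, 3))
# prints [2.5, 2.75, 2.625]
```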

DeepARModel#

class autogluon.timeseries.models.DeepARModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Autoregressive forecasting model based on a recurrent neural network [Salinas2020].

Based on gluonts.torch.model.deepar.DeepAREstimator. See GluonTS documentation for additional hyperparameters.

References

Salinas2020

Salinas, David, et al. “DeepAR: Probabilistic forecasting with autoregressive recurrent networks.” International Journal of Forecasting. 2020.

Parameters
  • context_length (int, default = max(10, 2 * prediction_length)) – Number of steps to unroll the RNN for before computing predictions

  • disable_static_features (bool, default = False) – If True, static features won’t be used by the model even if they are present in the dataset. If False, static features will be used by the model if they are present in the dataset.

  • disable_known_covariates (bool, default = False) – If True, known covariates won’t be used by the model even if they are present in the dataset. If False, known covariates will be used by the model if they are present in the dataset.

  • num_layers (int, default = 2) – Number of RNN layers

  • hidden_size (int, default = 40) – Number of RNN cells for each layer

  • dropout_rate (float, default = 0.1) – Dropout regularization parameter

  • embedding_dimension (int, optional) – Dimension of the embeddings for categorical features (if None, defaults to [min(50, (cat+1)//2) for cat in cardinality])

  • distr_output (gluonts.torch.distributions.DistributionOutput, default = StudentTOutput()) – Distribution to use to evaluate observations and sample predictions

  • scaling (bool, default = True) – Whether to automatically scale the target values

  • epochs (int, default = 100) – Number of epochs the model will be trained for

  • batch_size (int, default = 64) – Size of batches used during training

  • num_batches_per_epoch (int, default = 50) – Number of batches processed every epoch

  • learning_rate (float, default = 1e-3) – Learning rate used during training
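Two of the defaults above are expressed as formulas. The sketch below evaluates them for a few inputs. The function names are illustrative, and cardinality is assumed to be a list with the number of categories of each categorical feature.

```python
def default_context_length(prediction_length):
    """Default DeepAR context length as documented above."""
    return max(10, 2 * prediction_length)


def default_embedding_dimension(cardinality):
    """Default embedding size for each categorical feature."""
    return [min(50, (cat + 1) // 2) for cat in cardinality]


print(default_context_length(3), default_context_length(24))
print(default_embedding_dimension([3, 10, 200]))
```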

DLinearModel#

class autogluon.timeseries.models.DLinearModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Simple feedforward neural network that subtracts trend before forecasting [Zeng2023].

Based on gluonts.torch.model.d_linear.DLinearEstimator. See GluonTS documentation for additional hyperparameters.

References

Zeng2023

Zeng, Ailing, et al. “Are transformers effective for time series forecasting?” AAAI Conference on Artificial Intelligence. 2023.

Parameters
  • context_length (int, default = 96) – Number of time units that condition the predictions

  • hidden_dimension (int, default = 20) – Size of hidden layers in the feedforward network

  • distr_output (gluonts.torch.distributions.DistributionOutput, default = StudentTOutput()) – Distribution to fit.

  • scaling ({"mean", "std", None}, default = "mean") – Scaling applied to the inputs. One of "mean" (mean absolute scaling), "std" (standardization), None (no scaling).

  • epochs (int, default = 100) – Number of epochs the model will be trained for

  • batch_size (int, default = 64) – Size of batches used during training

  • num_batches_per_epoch (int, default = 50) – Number of batches processed every epoch

  • learning_rate (float, default = 1e-3) – Learning rate used during training

  • weight_decay (float, default = 1e-8) – Weight decay regularization parameter.
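The trend subtraction can be sketched as a moving-average decomposition, which is the approach described by Zeng et al. The code below is an illustration only: kernel_size and the edge padding are assumptions, and this is not the GluonTS implementation.

```python
def decompose(values, kernel_size=5):
    """Split a series into a moving-average trend and the remainder."""
    half = kernel_size // 2
    # Repeat edge values so the trend has the same length as the input.
    padded = (
        [values[0]] * half
        + list(values)
        + [values[-1]] * (kernel_size - 1 - half)
    )
    trend = [
        sum(padded[i:i + kernel_size]) / kernel_size
        for i in range(len(values))
    ]
    remainder = [v - t for v, t in zip(values, trend)]
    return trend, remainder


trend, remainder = decompose([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0], kernel_size=3)
print(trend)
print(remainder)
```

In the paper, each of the two components is then passed through its own linear layer and the outputs are summed to form the forecast.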

PatchTSTModel#

class autogluon.timeseries.models.PatchTSTModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Transformer-based forecaster that segments each time series into patches [Nie2023].

Based on gluonts.torch.model.patch_tst.PatchTSTEstimator. See GluonTS documentation for additional hyperparameters.

References

Nie2023

Nie, Yuqi, et al. “A Time Series is Worth 64 Words: Long-term Forecasting with Transformers.” International Conference on Learning Representations. 2023.

Parameters
  • context_length (int, default = 96) – Number of time units that condition the predictions

  • patch_len (int, default = 16) – Length of the patch.

  • stride (int, default = 8) – Stride of the patch.

  • d_model (int, default = 32) – Size of hidden layers in the Transformer encoder.

  • nhead (int, default = 4) – Number of attention heads in the Transformer encoder. Must divide d_model.

  • num_encoder_layers (int, default = 2) – Number of layers in the Transformer encoder.

  • distr_output (gluonts.torch.distributions.DistributionOutput, default = StudentTOutput()) – Distribution to fit.

  • scaling ({"mean", "std", None}, default = "mean") – Scaling applied to the inputs. One of "mean" (mean absolute scaling), "std" (standardization), None (no scaling).

  • epochs (int, default = 100) – Number of epochs the model will be trained for

  • batch_size (int, default = 64) – Size of batches used during training

  • num_batches_per_epoch (int, default = 50) – Number of batches processed every epoch

  • learning_rate (float, default = 1e-3) – Learning rate used during training

  • weight_decay (float, default = 1e-8) – Weight decay regularization parameter.
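The interaction between context_length, patch_len and stride, and the divisibility constraint on nhead, can be checked with a short sketch. The patch count follows the formula given by Nie et al., where padding the end of the series by one stride adds one extra patch. Whether padding is applied in a given configuration is an assumption here, and both helpers are hypothetical.

```python
def num_patches(context_length=96, patch_len=16, stride=8, pad_end=True):
    """Number of patches produced by a sliding window over the context."""
    count = (context_length - patch_len) // stride + 1
    # Padding the end by one stride yields one additional patch.
    return count + 1 if pad_end else count


def head_dimension(d_model=32, nhead=4):
    """Size of each attention head; nhead must divide d_model."""
    if d_model % nhead != 0:
        raise ValueError("nhead must divide d_model")
    return d_model // nhead


print(num_patches(), num_patches(pad_end=False))
print(head_dimension())
```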

SimpleFeedForwardModel#

class autogluon.timeseries.models.SimpleFeedForwardModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Simple feedforward neural network that simultaneously predicts all future values.

Based on gluonts.torch.model.simple_feedforward.SimpleFeedForwardEstimator. See GluonTS documentation for additional hyperparameters.

Parameters
  • context_length (int, default = max(10, 2 * prediction_length)) – Number of time units that condition the predictions

  • hidden_dimensions (List[int], default = [20, 20]) – Size of hidden layers in the feedforward network

  • distr_output (gluonts.torch.distributions.DistributionOutput, default = StudentTOutput()) – Distribution to fit.

  • batch_normalization (bool, default = False) – Whether to use batch normalization

  • mean_scaling (bool, default = True) – Scale the network input by the data mean and the network output by its inverse

  • epochs (int, default = 100) – Number of epochs the model will be trained for

  • batch_size (int, default = 64) – Size of batches used during training

  • num_batches_per_epoch (int, default = 50) – Number of batches processed every epoch

  • learning_rate (float, default = 1e-3) – Learning rate used during training
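The effect of mean_scaling can be sketched as follows: the context window is divided by its scale before it reaches the network, and the network output is multiplied by the same scale. The `network` argument is a stand-in callable, the use of the mean absolute value as the scale is an assumption, and this is not the GluonTS scaler.

```python
def mean_abs_scale(context, min_scale=1e-10):
    """Mean absolute value of the context, bounded away from zero."""
    scale = sum(abs(v) for v in context) / len(context)
    return max(scale, min_scale)


def forecast_with_scaling(context, network):
    """Scale the input, run the network, and undo the scaling on the output."""
    scale = mean_abs_scale(context)
    scaled_input = [v / scale for v in context]
    scaled_output = network(scaled_input)
    return [v * scale for v in scaled_output]


def repeat_last(scaled_input):
    """Toy network that repeats the last scaled input value twice."""
    return [scaled_input[-1]] * 2


print(forecast_with_scaling([10.0, 20.0, 30.0], repeat_last))
```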

TemporalFusionTransformerModel#

class autogluon.timeseries.models.TemporalFusionTransformerModel(freq: Optional[str] = None, prediction_length: int = 1, path: Optional[str] = None, name: Optional[str] = None, eval_metric: Optional[str] = None, hyperparameters: Optional[Dict[str, Any]] = None, **kwargs)[source]#

Combines LSTM with a transformer layer to predict the quantiles of all future target values [Lim2021].

Based on gluonts.torch.model.tft.TemporalFusionTransformerEstimator. See GluonTS documentation for additional hyperparameters.

References

Lim2021

Lim, Bryan, et al. “Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting.” International Journal of Forecasting. 2021.

Parameters
  • context_length (int, default = max(64, 2 * prediction_length)) – Number of past values used for prediction.

  • disable_static_features (bool, default = False) – If True, static features won’t be used by the model even if they are present in the dataset. If False, static features will be used by the model if they are present in the dataset.

  • disable_known_covariates (bool, default = False) – If True, known covariates won’t be used by the model even if they are present in the dataset. If False, known covariates will be used by the model if they are present in the dataset.

  • disable_past_covariates (bool, default = False) – If True, past covariates won’t be used by the model even if they are present in the dataset. If False, past covariates will be used by the model if they are present in the dataset.

  • hidden_dim (int, default = 32) – Size of the LSTM & transformer hidden states.

  • variable_dim (int, default = 32) – Size of the feature embeddings.

  • num_heads (int, default = 4) – Number of attention heads in self-attention layer in the decoder.

  • dropout_rate (float, default = 0.1) – Dropout regularization parameter

  • epochs (int, default = 100) – Number of epochs the model will be trained for

  • batch_size (int, default = 64) – Size of batches used during training

  • num_batches_per_epoch (int, default = 50) – Number of batches processed every epoch

  • learning_rate (float, default = 1e-3) – Learning rate used during training
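Because an epoch is defined here as a fixed number of batches, the training budget of the deep learning models on this page follows directly from epochs, num_batches_per_epoch and batch_size. The sketch below computes it for the defaults. It gives an upper bound and assumes training is not stopped early, for example by a time limit.

```python
def training_budget(epochs=100, num_batches_per_epoch=50, batch_size=64):
    """Return (gradient steps, training windows processed)."""
    gradient_steps = epochs * num_batches_per_epoch
    training_windows = gradient_steps * batch_size
    return gradient_steps, training_windows


steps, windows = training_budget()
print(steps, windows)  # prints 5000 320000
```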

MXNet Models#

The following MXNet-based models from GluonTS are available in AutoGluon.

  • DeepARMXNetModel

  • MQCNNMXNetModel

  • MQRNNMXNetModel

  • SimpleFeedForwardMXNetModel

  • TemporalFusionTransformerMXNetModel

  • TransformerMXNetModel

Documentation and hyperparameter settings for these models can be found here.

Using the above models requires installing Apache MXNet v1.9. This can be done as follows:

python -m pip install mxnet~=1.9

If you want to use a GPU, install the version of MXNet that matches your CUDA version. See the MXNet documentation for more info.

If a GPU is available and a CUDA-enabled version of MXNet is installed, all the MXNet models will be trained using the GPU. Otherwise, the models will be trained on the CPU.

Additional features#

Overview of the additional features and covariates supported by different models. Models not included in this table currently do not support any additional features.

Model

Static features (continuous)

Static features (categorical)

Known covariates (continuous)

Past covariates (continuous)

DirectTabularModel

RecursiveTabularModel

DeepARModel

TemporalFusionTransformerModel

DeepARMXNetModel

MQCNNMXNetModel

TemporalFusionTransformerMXNetModel