TabularPredictor.fit_extra¶
- TabularPredictor.fit_extra(hyperparameters: str | Dict[str, Any], time_limit: float = None, base_model_names: List[str] = None, fit_weighted_ensemble: bool = True, fit_full_last_level_weighted_ensemble: bool = True, full_weighted_ensemble_additionally: bool = False, num_cpus: str | int = 'auto', num_gpus: str | int = 'auto', fit_strategy: Literal['auto', 'sequential', 'parallel'] = 'auto', memory_limit: float | str = 'auto', **kwargs) TabularPredictor [source]¶
Fits additional models after the original TabularPredictor.fit() call. The original train_data and tuning_data will be used to train the models.
- Parameters:
  - hyperparameters (str or dict) – Refer to argument documentation in TabularPredictor.fit(). If base_model_names is specified and hyperparameters uses the level-based key notation, the key of the level that directly uses the base models should be 1. The level in the hyperparameters dictionary is relative, not absolute.
  - time_limit (int, default = None) – Refer to argument documentation in TabularPredictor.fit().
  - base_model_names (List[str], default = None) – The names of the models to use as base models for this fit call. Base models will provide their out-of-fold predictions as additional features to the models in hyperparameters. If specified, all models trained will be stack ensembles. If None, models will be trained as if they were specified in TabularPredictor.fit(), without depending on existing models. Only valid if bagging is enabled.
  - fit_weighted_ensemble (bool, default = True) – If True, a WeightedEnsembleModel will be fit in each stack layer. A weighted ensemble will often be stronger than an individual model while being very fast to train. It is recommended to keep this value set to True to maximize predictive quality.
  - fit_full_last_level_weighted_ensemble (bool, default = True) – If True, the WeightedEnsembleModel of the last stacking level will be fit with all (successful) models from all previous layers as base models. If stacking is disabled, setting this to True or False makes no difference because the WeightedEnsembleModel at L2 always uses all models from L1. It is recommended to keep this value set to True to maximize predictive quality.
  - full_weighted_ensemble_additionally (bool, default = False) – If True, AutoGluon will fit two WeightedEnsembleModels after training all stacking levels. Setting this to True simulates calling fit_weighted_ensemble() after calling fit(). Has no effect if fit_full_last_level_weighted_ensemble is False, and does not fit an additional WeightedEnsembleModel if stacking is disabled.
  - num_cpus (int, default = "auto") – The total number of CPUs you want the AutoGluon predictor to use. "auto" means AutoGluon will decide based on the total number of CPUs available and the models' requirements for best performance. Users generally don't need to set this value.
  - num_gpus (int, default = "auto") – The total number of GPUs you want the AutoGluon predictor to use. "auto" means AutoGluon will decide based on the total number of GPUs available and the models' requirements for best performance. Users generally don't need to set this value.
  - fit_strategy (Literal["auto", "sequential", "parallel"], default = "auto") – The strategy used to fit models. If "auto", uses the same fit_strategy as the original TabularPredictor.fit() call. If "sequential", models will be fit sequentially; this is the most stable option with the most readable logging. If "parallel", models will be fit in parallel with ray, splitting the available compute between them. Note: "parallel" is experimental and may run into issues. For machines with 16 or more CPU cores, "parallel" will likely be faster than "sequential". Added in version 1.2.0.
  - memory_limit (float | str, default = "auto") – The total amount of memory in GB you want AutoGluon predictor to use. "auto" means AutoGluon will use all available memory on the system (that is detectable by psutil). Note that this is only a soft limit! AutoGluon uses this limit to skip training models that are expected to require too much memory or stop training a model that would exceed the memory limit. AutoGluon does not guarantee the enforcement of this limit (yet). Nevertheless, we expect AutoGluon to abide by the limit in most cases or, at most, go over the limit by a small margin. For most virtualized systems (e.g., in the cloud) and local usage on a server or laptop, "auto" is ideal for this parameter. We recommend manually setting the memory limit (and any other resources) on systems with shared resources that are controlled by the operating system (e.g., SLURM and cgroups). Otherwise, AutoGluon might wrongly assume more resources are available for fitting a model than the operating system allows, which can result in model training failing or being very inefficient.
  - **kwargs – Refer to kwargs documentation in TabularPredictor.fit(). Note that the following kwargs are not available in fit_extra, as they cannot be changed from the values set in fit(): [holdout_frac, num_bag_folds, auto_stack, feature_generator, unlabeled_data]. Moreover, dynamic_stacking is also not available in fit_extra, as the detection of stacked overfitting is only supported at the first fit time.
  - pseudo_data (pd.DataFrame, default = None) – Data that has been self-labeled by an AutoGluon model and will be incorporated into training during fit_extra.
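A minimal sketch of the relative, level-based hyperparameters notation described above. The model keys and the base model name are illustrative placeholders, not taken from the documentation; the point is that when base_model_names is given, the level that consumes the base models' out-of-fold predictions is keyed 1, regardless of the absolute stack level it ends up at.

```python
# Hypothetical example: "GBM", "CAT", "RF" and "LightGBM_BAG_L1" are
# placeholder identifiers. Level keys are relative to the base models.
hyperparameters = {
    1: {"GBM": {}, "CAT": {}},  # stacks directly on the named base models
    2: {"RF": {}},              # stacks on the level-1 models above
}

# With a previously fit bagged predictor, additional stack layers could
# then be trained roughly like (sketch, not runnable here):
#
#   predictor.fit_extra(
#       hyperparameters=hyperparameters,
#       base_model_names=["LightGBM_BAG_L1"],  # hypothetical model name
#       time_limit=600,
#   )
```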