TabularPredictor.save_space

TabularPredictor.save_space(remove_data=True, remove_fit_stack=True, requires_save=True, reduce_children=False)

Reduces the memory and disk size of the predictor by deleting auxiliary model files that aren’t needed for prediction on new data. This function has NO impact on inference accuracy. Calling it is recommended when the only goal is to use the trained model for prediction. However, some advanced functionality is no longer available after save_space() has been called; the parameter descriptions below note what each option disables.
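As a rough way to see how much disk space a call actually frees, the sketch below measures the predictor directory before and after save_space(). The helper names dir_size_bytes and save_space_report are hypothetical and not part of AutoGluon; the code only assumes that the predictor object exposes a path attribute pointing at its directory on disk and a save_space() method.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total


def save_space_report(predictor, **save_space_kwargs):
    """Call predictor.save_space() and return (bytes_before, bytes_after).

    Assumption: `predictor.path` is the predictor's directory on disk.
    Any keyword arguments are passed through to save_space() unchanged.
    """
    before = dir_size_bytes(predictor.path)
    predictor.save_space(**save_space_kwargs)
    after = dir_size_bytes(predictor.path)
    return before, after


# Usage with a real predictor (requires autogluon to be installed):
#   from autogluon.tabular import TabularPredictor
#   predictor = TabularPredictor.load("AutogluonModels/my_run")  # hypothetical path
#   before, after = save_space_report(predictor)
#   print(f"freed {(before - after) / 1e6:.1f} MB")
```

Because the removal cannot be undone, it may be worth copying the predictor directory first if the advanced functionality could be needed later.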

Parameters:

remove_data (bool, default = True) –
Whether to remove cached files of the original training and validation data. This only reduces disk usage; it has no impact on memory usage. It is especially useful when the original data was large. This is equivalent to setting cache_data=False during the original fit().

Will disable all advanced functionality that requires cache_data=True.
remove_fit_stack (bool, default = True) –
Whether to remove information required to fit new stacking models and continue fitting bagged models with new folds. This only reduces disk usage; it has no impact on memory usage. This includes:

out-of-fold (OOF) predictions

This is useful for multiclass problems with many classes, as OOF predictions can become very large on disk (1 GB per model in extreme cases). This disables predictor.refit_full() for stacker models.
requires_save (bool, default = True) –
Whether to remove information that requires the model to be saved again to disk. Typically this only includes flag variables that don’t have a significant impact on memory or disk usage, but that should technically be updated after more important information has been removed.

An example is the is_data_saved boolean variable in trainer, which should be updated to False if remove_data=True was set.
reduce_children (bool, default = False) – Whether to apply the reduction rules to bagged ensemble children models. These are the models trained for each fold of the bagged ensemble. This should generally be kept as False since the most important memory and disk reduction techniques are automatically applied to these models during the original fit() call.
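
To put the "1 GB per model" figure for remove_fit_stack in perspective, the sketch below estimates the size of one model's out-of-fold prediction matrix. It is a back-of-the-envelope calculation, not AutoGluon's actual storage format: it assumes one dense probability per (row, class) pair, and the default of 4 bytes per value (float32) is an assumption. The function name estimate_oof_bytes and the dataset dimensions are hypothetical.

```python
def estimate_oof_bytes(num_rows, num_classes, bytes_per_value=4):
    """Rough size of one model's out-of-fold prediction matrix.

    Assumes a dense array with one probability per (row, class).
    bytes_per_value=4 corresponds to float32 and is an assumption.
    """
    return num_rows * num_classes * bytes_per_value


# Hypothetical dataset: 1,000,000 training rows and 250 classes.
size = estimate_oof_bytes(1_000_000, 250)
print(f"{size / 1e9:.1f} GB")  # 1.0 GB
```

Under these assumptions the cost grows linearly with both the number of rows and the number of classes, and it is paid once per model, which is why removing OOF predictions matters most for large multiclass ensembles.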