Predicting Columns in a Table - Deployment Optimization

This tutorial will cover how to perform the end-to-end AutoML process to create an optimized and deployable AutoGluon artifact for production usage.

This tutorial assumes you have already read Predicting Columns in a Table - Quick Start and Predicting Columns in a Table - In Depth.

Fitting a TabularPredictor

We will again use the AdultIncome dataset as in the previous tutorials and train a predictor to predict whether the person’s income exceeds $50,000 or not, which is recorded in the class column of this table.

from autogluon.tabular import TabularDataset, TabularPredictor
train_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
label = 'class'
subsample_size = 500  # subsample subset of data for faster demo, try setting this to much larger values
train_data = train_data.sample(n=subsample_size, random_state=0)
train_data.head()
age workclass fnlwgt education education-num marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country class
6118 51 Private 39264 Some-college 10 Married-civ-spouse Exec-managerial Wife White Female 0 0 40 United-States >50K
23204 58 Private 51662 10th 6 Married-civ-spouse Other-service Wife White Female 0 0 8 United-States <=50K
29590 40 Private 326310 Some-college 10 Married-civ-spouse Craft-repair Husband White Male 0 0 44 United-States <=50K
18116 37 Private 222450 HS-grad 9 Never-married Sales Not-in-family White Male 0 2339 40 El-Salvador <=50K
33964 62 Private 109190 Bachelors 13 Married-civ-spouse Exec-managerial Husband White Male 15024 0 40 United-States >50K
save_path = 'agModels-predictClass-deployment'  # specifies folder to store trained models
predictor = TabularPredictor(label=label, path=save_path).fit(train_data)
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version:  1.4.1b20250805
Python Version:     3.12.10
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Mar 12 14:53:59 UTC 2025
CPU Count:          8
Memory Avail:       28.90 GB / 30.95 GB (93.4%)
Disk Space Avail:   206.84 GB / 255.99 GB (80.8%)
===================================================
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='extreme' : New in v1.4: Massively better than 'best' on datasets <30000 samples by using new models meta-learned on https://tabarena.ai: TabPFNv2, TabICL, Mitra, and TabM. Absolute best accuracy. Requires a GPU. Recommended 64 GB CPU memory and 32+ GB GPU memory.
	presets='best'    : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'    : Strong accuracy with fast inference speed.
	presets='good'    : Good accuracy with very fast inference speed.
	presets='medium'  : Fast training time, ideal for initial prototyping.
Using hyperparameters preset: hyperparameters='default'
Beginning AutoGluon training ...
AutoGluon will save models to "/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment"
Train Data Rows:    500
Train Data Columns: 14
Label Column:       class
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
	2 unique label values:  [' >50K', ' <=50K']
	If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type:       binary
Preprocessing data ...
Selected class <--> label mapping:  class 1 =  >50K, class 0 =  <=50K
	Note: For your binary classification, AutoGluon arbitrarily selected which label-value represents positive ( >50K) vs negative ( <=50K) class.
	To explicitly set the positive_class, either rename classes to 1 and 0, or specify positive_class in Predictor init.
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    29596.02 MB
	Train Data (Original)  Memory Usage: 0.25 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
			Note: Converting 1 features to boolean dtype as they only contain 2 unique values.
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
		Fitting CategoryFeatureGenerator...
			Fitting CategoryMemoryMinimizeFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('int', [])    : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
		('object', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
	Types of features in processed data (raw dtype, special dtypes):
		('category', [])  : 7 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
		('int', [])       : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
		('int', ['bool']) : 1 | ['sex']
	0.1s = Fit runtime
	14 features in original data used to generate 14 features in processed data.
	Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.08s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
	To change this, specify the eval_metric parameter of Predictor()
Automatically generating train/validation split with holdout_frac=0.2, Train Rows: 400, Val Rows: 100
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': [{}],
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
	'CAT': [{}],
	'XGB': [{}],
	'FASTAI': [{}],
	'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
}
Fitting 11 L1 models, fit_strategy="sequential" ...
Fitting model: LightGBMXT ...
	Fitting with cpus=4, gpus=0, mem=0.0/28.9 GB
	0.83	 = Validation score   (accuracy)
	0.39s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: LightGBM ...
	Fitting with cpus=4, gpus=0, mem=0.0/28.9 GB
	0.85	 = Validation score   (accuracy)
	0.22s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: RandomForestGini ...
	Fitting with cpus=8, gpus=0
	0.84	 = Validation score   (accuracy)
	0.56s	 = Training   runtime
	0.05s	 = Validation runtime
Fitting model: RandomForestEntr ...
	Fitting with cpus=8, gpus=0
	0.83	 = Validation score   (accuracy)
	0.52s	 = Training   runtime
	0.05s	 = Validation runtime
Fitting model: CatBoost ...
	Fitting with cpus=4, gpus=0
	0.85	 = Validation score   (accuracy)
	0.8s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: ExtraTreesGini ...
	Fitting with cpus=8, gpus=0
	0.82	 = Validation score   (accuracy)
	0.53s	 = Training   runtime
	0.05s	 = Validation runtime
Fitting model: ExtraTreesEntr ...
	Fitting with cpus=8, gpus=0
	0.81	 = Validation score   (accuracy)
	0.53s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: NeuralNetFastAI ...
	Fitting with cpus=4, gpus=0, mem=0.0/28.8 GB
	0.83	 = Validation score   (accuracy)
	2.69s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: XGBoost ...
	Fitting with cpus=4, gpus=0
	0.85	 = Validation score   (accuracy)
	0.38s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: NeuralNetTorch ...
	Fitting with cpus=4, gpus=0, mem=0.0/28.5 GB
	0.83	 = Validation score   (accuracy)
	2.21s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: LightGBMLarge ...
	Fitting with cpus=4, gpus=0, mem=0.0/28.4 GB
	0.83	 = Validation score   (accuracy)
	0.66s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	Ensemble Weights: {'LightGBM': 1.0}
	0.85	 = Validation score   (accuracy)
	0.07s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 10.01s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 19879.2 rows/s (100 batch size)
Disabling decision threshold calibration for metric `accuracy` due to having fewer than 10000 rows of validation data for calibration, to avoid overfitting (100 rows).
	`accuracy` is generally not improved through threshold calibration. Force calibration via specifying `calibrate_decision_threshold=True`.
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment")

Next, load separate test data to demonstrate how to make predictions on new examples at inference time:

test_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')
y_test = test_data[label]  # values to predict
test_data.head()
Loaded data from: https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv | Columns = 15 / 15 | Rows = 9769 -> 9769
age workclass fnlwgt education education-num marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country class
0 31 Private 169085 11th 7 Married-civ-spouse Sales Wife White Female 0 0 20 United-States <=50K
1 17 Self-emp-not-inc 226203 12th 8 Never-married Sales Own-child White Male 0 0 45 United-States <=50K
2 47 Private 54260 Assoc-voc 11 Married-civ-spouse Exec-managerial Husband White Male 0 1887 60 United-States >50K
3 21 Private 176262 Some-college 10 Never-married Exec-managerial Own-child White Female 0 0 30 United-States <=50K
4 17 Private 241185 12th 8 Never-married Prof-specialty Own-child White Male 0 0 20 United-States <=50K

We use our trained models to make predictions on the new data:

predictor = TabularPredictor.load(save_path)  # unnecessary, just demonstrates how to load previously-trained predictor from file

y_pred = predictor.predict(test_data)
y_pred
0        <=50K
1        <=50K
2         >50K
3        <=50K
4        <=50K
         ...  
9764     <=50K
9765     <=50K
9766     <=50K
9767     <=50K
9768     <=50K
Name: class, Length: 9769, dtype: object

We can use leaderboard to evaluate the performance of each individual trained model on our labeled test data:

predictor.leaderboard(test_data)
model score_test score_val eval_metric pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 RandomForestGini 0.842870 0.84 accuracy 0.098668 0.046491 0.557529 0.098668 0.046491 0.557529 1 True 3
1 CatBoost 0.842461 0.85 accuracy 0.008353 0.003541 0.799393 0.008353 0.003541 0.799393 1 True 5
2 RandomForestEntr 0.841130 0.83 accuracy 0.098077 0.046339 0.523186 0.098077 0.046339 0.523186 1 True 4
3 XGBoost 0.839902 0.85 accuracy 0.035705 0.005714 0.376254 0.035705 0.005714 0.376254 1 True 9
4 LightGBM 0.839799 0.85 accuracy 0.015203 0.004305 0.218295 0.015203 0.004305 0.218295 1 True 2
5 WeightedEnsemble_L2 0.839799 0.85 accuracy 0.016673 0.005030 0.288410 0.001470 0.000726 0.070114 2 True 12
6 NeuralNetTorch 0.837138 0.83 accuracy 0.045333 0.010311 2.213367 0.045333 0.010311 2.213367 1 True 10
7 LightGBMXT 0.836421 0.83 accuracy 0.007769 0.004407 0.391984 0.007769 0.004407 0.391984 1 True 1
8 ExtraTreesGini 0.833862 0.82 accuracy 0.097154 0.050375 0.525635 0.097154 0.050375 0.525635 1 True 6
9 ExtraTreesEntr 0.833862 0.81 accuracy 0.099956 0.057210 0.530676 0.099956 0.057210 0.530676 1 True 7
10 NeuralNetFastAI 0.830791 0.83 accuracy 0.122786 0.008419 2.692888 0.122786 0.008419 2.692888 1 True 8
11 LightGBMLarge 0.817074 0.83 accuracy 0.011738 0.003245 0.661469 0.011738 0.003245 0.661469 1 True 11
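
For deployment, accuracy is only part of the picture; inference time matters too. As a minimal sketch (not part of AutoGluon), the snippet below copies a few score_test and pred_time_test values from the leaderboard above into plain Python and picks the fastest model whose score is within a tolerance of the best one. The helper name fastest_within_tolerance and the 0.005 tolerance are illustrative assumptions. In a real project this choice would normally be made on validation scores rather than test scores, to avoid biasing the test estimate.

```python
# (model, score_test, pred_time_test) copied from the leaderboard above.
rows = [
    ("RandomForestGini", 0.842870, 0.098668),
    ("CatBoost", 0.842461, 0.008353),
    ("XGBoost", 0.839902, 0.035705),
    ("LightGBM", 0.839799, 0.015203),
    ("WeightedEnsemble_L2", 0.839799, 0.016673),
    ("LightGBMXT", 0.836421, 0.007769),
]


def fastest_within_tolerance(rows, tolerance=0.005):
    """Hypothetical helper: fastest model scoring within `tolerance` of the best."""
    best_score = max(score for _, score, _ in rows)
    # Keep only models that are "accurate enough" relative to the best score.
    candidates = [row for row in rows if row[1] >= best_score - tolerance]
    # Among those, return the name of the one with the lowest prediction time.
    return min(candidates, key=lambda row: row[2])[0]


print(fastest_within_tolerance(rows))
```

The returned name could then be passed to predict through its model argument, for example predictor.predict(test_data, model=...), to predict with that specific model instead of the default best model.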

Snapshot a Predictor with .clone()

Now that we have a working predictor artifact, we may want to alter it in a variety of ways to better suit our needs. For example, we may want to delete certain models to reduce disk usage via .delete_models(), or train additional models on top of the ones we already have via .fit_extra().

While you can do all of these operations on your predictor, you may want to be able to revert to a prior state of the predictor in case something goes wrong. This is where predictor.clone() comes in.

predictor.clone() allows you to create a snapshot of the given predictor, cloning the artifacts of the predictor to a new location. You can then freely play around with the predictor and always load the earlier snapshot in case you want to undo your actions.

All you need to do to clone a predictor is specify a new directory path to clone to:

save_path_clone = save_path + '-clone'
# will return the path to the cloned predictor, identical to save_path_clone
path_clone = predictor.clone(path=save_path_clone)
Cloned TabularPredictor located in '/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment' to 'agModels-predictClass-deployment-clone'.
	To load the cloned predictor: predictor_clone = TabularPredictor.load(path="agModels-predictClass-deployment-clone")

Note that this logic doubles disk usage, as it completely clones every predictor artifact on disk to make an exact replica.
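
Because a clone is a full copy, it can be worth checking free disk space first when the predictor is large. The following is a minimal sketch using only the Python standard library; the helper names directory_size_bytes and has_room_for_clone and the 1.1 safety factor are assumptions for illustration, not AutoGluon functionality.

```python
import os
import shutil


def directory_size_bytes(path):
    """Total size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):  # skip symlinks to avoid double counting
                total += os.path.getsize(file_path)
    return total


def has_room_for_clone(src_path, dest_parent=".", safety_factor=1.1):
    """Hypothetical check: is there enough free space under `dest_parent` to copy `src_path`?"""
    needed = directory_size_bytes(src_path) * safety_factor
    free = shutil.disk_usage(dest_parent).free
    return free >= needed
```

With the variables from this tutorial, one might call has_room_for_clone(save_path) before predictor.clone(path=save_path_clone).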

Now we can load the cloned predictor:

predictor_clone = TabularPredictor.load(path=path_clone)
# You can alternatively load the cloned TabularPredictor at the time of cloning:
# predictor_clone = predictor.clone(path=save_path_clone, return_clone=True)

We can see that the cloned predictor has the same leaderboard and functionality as the original:

y_pred_clone = predictor_clone.predict(test_data)
y_pred_clone
0        <=50K
1        <=50K
2         >50K
3        <=50K
4        <=50K
         ...  
9764     <=50K
9765     <=50K
9766     <=50K
9767     <=50K
9768     <=50K
Name: class, Length: 9769, dtype: object
y_pred.equals(y_pred_clone)
True
predictor_clone.leaderboard(test_data)
model score_test score_val eval_metric pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 RandomForestGini 0.842870 0.84 accuracy 0.095832 0.046491 0.557529 0.095832 0.046491 0.557529 1 True 3
1 CatBoost 0.842461 0.85 accuracy 0.007397 0.003541 0.799393 0.007397 0.003541 0.799393 1 True 5
2 RandomForestEntr 0.841130 0.83 accuracy 0.096292 0.046339 0.523186 0.096292 0.046339 0.523186 1 True 4
3 XGBoost 0.839902 0.85 accuracy 0.034519 0.005714 0.376254 0.034519 0.005714 0.376254 1 True 9
4 LightGBM 0.839799 0.85 accuracy 0.014982 0.004305 0.218295 0.014982 0.004305 0.218295 1 True 2
5 WeightedEnsemble_L2 0.839799 0.85 accuracy 0.016423 0.005030 0.288410 0.001441 0.000726 0.070114 2 True 12
6 NeuralNetTorch 0.837138 0.83 accuracy 0.044763 0.010311 2.213367 0.044763 0.010311 2.213367 1 True 10
7 LightGBMXT 0.836421 0.83 accuracy 0.007699 0.004407 0.391984 0.007699 0.004407 0.391984 1 True 1
8 ExtraTreesGini 0.833862 0.82 accuracy 0.097291 0.050375 0.525635 0.097291 0.050375 0.525635 1 True 6
9 ExtraTreesEntr 0.833862 0.81 accuracy 0.098950 0.057210 0.530676 0.098950 0.057210 0.530676 1 True 7
10 NeuralNetFastAI 0.830791 0.83 accuracy 0.117042 0.008419 2.692888 0.117042 0.008419 2.692888 1 True 8
11 LightGBMLarge 0.817074 0.83 accuracy 0.011572 0.003245 0.661469 0.011572 0.003245 0.661469 1 True 11

Now let’s modify the clone, for example by calling refit_full:

predictor_clone.refit_full()

predictor_clone.leaderboard(test_data)
Refitting models via `predictor.refit_full` using all of the data (combined train and validation)...
	Models trained in this way will have the suffix "_FULL" and have NaN validation score.
	This process is not bound by time_limit, but should take less time than the original `predictor.fit` call.
	To learn more, refer to the `.refit_full` method docstring which explains how "_FULL" models differ from normal models.
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: LightGBMXT_FULL ...
	Fitting with cpus=4, gpus=0, mem=0.0/28.4 GB
	0.21s	 = Training   runtime
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: LightGBM_FULL ...
	Fitting with cpus=4, gpus=0, mem=0.0/28.4 GB
	0.23s	 = Training   runtime
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: RandomForestGini_FULL ...
	Fitting with cpus=8, gpus=0
	0.54s	 = Training   runtime
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: RandomForestEntr_FULL ...
	Fitting with cpus=8, gpus=0
	0.53s	 = Training   runtime
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: CatBoost_FULL ...
	Fitting with cpus=4, gpus=0
	0.02s	 = Training   runtime
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: ExtraTreesGini_FULL ...
	Fitting with cpus=8, gpus=0
	0.52s	 = Training   runtime
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: ExtraTreesEntr_FULL ...
	Fitting with cpus=8, gpus=0
	0.53s	 = Training   runtime
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: NeuralNetFastAI_FULL ...
	Fitting with cpus=4, gpus=0, mem=0.0/28.3 GB
No improvement since epoch 0: early stopping
	0.32s	 = Training   runtime
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: XGBoost_FULL ...
	Fitting with cpus=4, gpus=0
	0.05s	 = Training   runtime
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: NeuralNetTorch_FULL ...
	Fitting with cpus=4, gpus=0, mem=0.0/28.3 GB
	0.43s	 = Training   runtime
Fitting 1 L1 models, fit_strategy="sequential" ...
Fitting model: LightGBMLarge_FULL ...
	Fitting with cpus=4, gpus=0, mem=0.0/28.3 GB
	0.24s	 = Training   runtime
Fitting model: WeightedEnsemble_L2_FULL | Skipping fit via cloning parent ...
	Ensemble Weights: {'LightGBM': 1.0}
	0.07s	 = Training   runtime
Updated best model to "WeightedEnsemble_L2_FULL" (Previously "WeightedEnsemble_L2"). AutoGluon will default to using "WeightedEnsemble_L2_FULL" for predict() and predict_proba().
Refit complete, total runtime = 3.95s ... Best model: "WeightedEnsemble_L2_FULL"
model score_test score_val eval_metric pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 CatBoost_FULL 0.842870 NaN accuracy 0.006056 NaN 0.022880 0.006056 NaN 0.022880 1 True 17
1 RandomForestGini 0.842870 0.84 accuracy 0.096052 0.046491 0.557529 0.096052 0.046491 0.557529 1 True 3
2 CatBoost 0.842461 0.85 accuracy 0.007634 0.003541 0.799393 0.007634 0.003541 0.799393 1 True 5
3 RandomForestEntr 0.841130 0.83 accuracy 0.096811 0.046339 0.523186 0.096811 0.046339 0.523186 1 True 4
4 LightGBM_FULL 0.840823 NaN accuracy 0.017952 NaN 0.227156 0.017952 NaN 0.227156 1 True 14
5 WeightedEnsemble_L2_FULL 0.840823 NaN accuracy 0.019378 NaN 0.297271 0.001426 NaN 0.070114 2 True 24
6 XGBoost 0.839902 0.85 accuracy 0.034147 0.005714 0.376254 0.034147 0.005714 0.376254 1 True 9
7 LightGBM 0.839799 0.85 accuracy 0.015268 0.004305 0.218295 0.015268 0.004305 0.218295 1 True 2
8 WeightedEnsemble_L2 0.839799 0.85 accuracy 0.016708 0.005030 0.288410 0.001440 0.000726 0.070114 2 True 12
9 RandomForestGini_FULL 0.839390 NaN accuracy 0.097366 NaN 0.538620 0.097366 NaN 0.538620 1 True 15
10 RandomForestEntr_FULL 0.839185 NaN accuracy 0.097762 NaN 0.534288 0.097762 NaN 0.534288 1 True 16
11 LightGBMXT_FULL 0.837957 NaN accuracy 0.008525 NaN 0.205105 0.008525 NaN 0.205105 1 True 13
12 NeuralNetTorch 0.837138 0.83 accuracy 0.044931 0.010311 2.213367 0.044931 0.010311 2.213367 1 True 10
13 LightGBMXT 0.836421 0.83 accuracy 0.008042 0.004407 0.391984 0.008042 0.004407 0.391984 1 True 1
14 XGBoost_FULL 0.836319 NaN accuracy 0.031770 NaN 0.045425 0.031770 NaN 0.045425 1 True 21
15 ExtraTreesEntr_FULL 0.835705 NaN accuracy 0.100702 NaN 0.531995 0.100702 NaN 0.531995 1 True 19
16 ExtraTreesGini 0.833862 0.82 accuracy 0.097739 0.050375 0.525635 0.097739 0.050375 0.525635 1 True 6
17 ExtraTreesEntr 0.833862 0.81 accuracy 0.098340 0.057210 0.530676 0.098340 0.057210 0.530676 1 True 7
18 ExtraTreesGini_FULL 0.833453 NaN accuracy 0.098816 NaN 0.521801 0.098816 NaN 0.521801 1 True 18
19 NeuralNetFastAI 0.830791 0.83 accuracy 0.125398 0.008419 2.692888 0.125398 0.008419 2.692888 1 True 8
20 LightGBMLarge 0.817074 0.83 accuracy 0.011631 0.003245 0.661469 0.011631 0.003245 0.661469 1 True 11
21 NeuralNetTorch_FULL 0.815641 NaN accuracy 0.046524 NaN 0.431732 0.046524 NaN 0.431732 1 True 22
22 LightGBMLarge_FULL 0.809704 NaN accuracy 0.011571 NaN 0.241738 0.011571 NaN 0.241738 1 True 23
23 NeuralNetFastAI_FULL 0.768246 NaN accuracy 0.116563 NaN 0.323847 0.116563 NaN 0.323847 1 True 20

We can see that we were able to fit additional models, but for whatever reason we may want to undo this operation.

Luckily, our original predictor is untouched!

predictor.leaderboard(test_data)
model score_test score_val eval_metric pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 RandomForestGini 0.842870 0.84 accuracy 0.097736 0.046491 0.557529 0.097736 0.046491 0.557529 1 True 3
1 CatBoost 0.842461 0.85 accuracy 0.007975 0.003541 0.799393 0.007975 0.003541 0.799393 1 True 5
2 RandomForestEntr 0.841130 0.83 accuracy 0.096985 0.046339 0.523186 0.096985 0.046339 0.523186 1 True 4
3 XGBoost 0.839902 0.85 accuracy 0.037916 0.005714 0.376254 0.037916 0.005714 0.376254 1 True 9
4 LightGBM 0.839799 0.85 accuracy 0.015207 0.004305 0.218295 0.015207 0.004305 0.218295 1 True 2
5 WeightedEnsemble_L2 0.839799 0.85 accuracy 0.016694 0.005030 0.288410 0.001487 0.000726 0.070114 2 True 12
6 NeuralNetTorch 0.837138 0.83 accuracy 0.045044 0.010311 2.213367 0.045044 0.010311 2.213367 1 True 10
7 LightGBMXT 0.836421 0.83 accuracy 0.008049 0.004407 0.391984 0.008049 0.004407 0.391984 1 True 1
8 ExtraTreesEntr 0.833862 0.81 accuracy 0.098754 0.057210 0.530676 0.098754 0.057210 0.530676 1 True 7
9 ExtraTreesGini 0.833862 0.82 accuracy 0.099693 0.050375 0.525635 0.099693 0.050375 0.525635 1 True 6
10 NeuralNetFastAI 0.830791 0.83 accuracy 0.122739 0.008419 2.692888 0.122739 0.008419 2.692888 1 True 8
11 LightGBMLarge 0.817074 0.83 accuracy 0.012002 0.003245 0.661469 0.012002 0.003245 0.661469 1 True 11

We can simply clone a new predictor from our original, and we will no longer be impacted by the call to refit_full on the prior clone.
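
The snapshot-then-modify pattern can be wrapped in a small helper. This is a sketch, not an AutoGluon API: try_on_snapshot is a hypothetical name, and it relies only on clone(path=..., return_clone=True) as shown earlier in this tutorial.

```python
def try_on_snapshot(predictor, snapshot_path, change):
    """Hypothetical helper: apply `change` to a clone and leave `predictor` untouched.

    `change` is any callable that takes a predictor and modifies it in place,
    for example `lambda p: p.refit_full()`.
    """
    # return_clone=True makes clone() return the loaded clone instead of its path.
    clone = predictor.clone(path=snapshot_path, return_clone=True)
    change(clone)
    return clone


# Example usage with the objects from this tutorial (path name is illustrative):
# refit_clone = try_on_snapshot(predictor, save_path + '-clone-refit', lambda p: p.refit_full())
```

If the change turns out to be unwanted, the returned clone can simply be discarded and its directory deleted, since the original predictor was never modified.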

Snapshot a deployment-optimized Predictor via .clone_for_deployment()

Instead of cloning an exact copy, we can clone a copy that contains only the minimal set of artifacts needed for prediction.

Note that this optimized clone will have very limited functionality outside of calling predict and predict_proba. For example, it will be unable to train more models.

save_path_clone_opt = save_path + '-clone-opt'
# will return the path to the cloned predictor, identical to save_path_clone_opt
path_clone_opt = predictor.clone_for_deployment(path=save_path_clone_opt)
Cloned TabularPredictor located in '/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment' to 'agModels-predictClass-deployment-clone-opt'.
	To load the cloned predictor: predictor_clone = TabularPredictor.load(path="agModels-predictClass-deployment-clone-opt")
Clone: Keeping minimum set of models required to predict with best model 'WeightedEnsemble_L2'...
Deleting model LightGBMXT. All files under /home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment-clone-opt/models/LightGBMXT will be removed.
Deleting model RandomForestGini. All files under /home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment-clone-opt/models/RandomForestGini will be removed.
Deleting model RandomForestEntr. All files under /home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment-clone-opt/models/RandomForestEntr will be removed.
Deleting model CatBoost. All files under /home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment-clone-opt/models/CatBoost will be removed.
Deleting model ExtraTreesGini. All files under /home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment-clone-opt/models/ExtraTreesGini will be removed.
Deleting model ExtraTreesEntr. All files under /home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment-clone-opt/models/ExtraTreesEntr will be removed.
Deleting model NeuralNetFastAI. All files under /home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment-clone-opt/models/NeuralNetFastAI will be removed.
Deleting model XGBoost. All files under /home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment-clone-opt/models/XGBoost will be removed.
Deleting model NeuralNetTorch. All files under /home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment-clone-opt/models/NeuralNetTorch will be removed.
Deleting model LightGBMLarge. All files under /home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictClass-deployment-clone-opt/models/LightGBMLarge will be removed.
Clone: Removing artifacts unnecessary for prediction. NOTE: Clone can no longer fit new models, and most functionality except for predict and predict_proba will no longer work
predictor_clone_opt = TabularPredictor.load(path=path_clone_opt)

To avoid loading the model from disk on every prediction call, we can persist the model in memory by calling .persist():

predictor_clone_opt.persist()
Persisting 2 models in memory. Models will require 0.0% of memory.
['WeightedEnsemble_L2', 'LightGBM']
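
In a serving process, loading and persisting are typically done once and the predictor is then reused for every request. The class below is a minimal sketch of that idea; PredictorService and its loader parameter are hypothetical names introduced here for illustration, and only TabularPredictor.load, .persist(), and .predict() are taken from this tutorial.

```python
class PredictorService:
    """Hypothetical wrapper that loads and persists a predictor once, then reuses it."""

    def __init__(self, path, loader=None):
        self._path = path
        self._loader = loader  # when None, TabularPredictor.load is used
        self._predictor = None

    def _get(self):
        if self._predictor is None:
            loader = self._loader
            if loader is None:
                # Imported lazily so the class can be defined without AutoGluon installed.
                from autogluon.tabular import TabularPredictor
                loader = TabularPredictor.load
            self._predictor = loader(self._path)
            self._predictor.persist()  # keep models in memory across calls
        return self._predictor

    def predict(self, data):
        return self._get().predict(data)


# Example usage with the optimized clone from this tutorial:
# service = PredictorService(path_clone_opt)
# y_pred_service = service.predict(test_data)
```

The loader parameter exists only so the wrapper can be exercised with a stand-in object in tests; in normal use it can be left at its default.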

We can see that the optimized clone still makes the same predictions:

y_pred_clone_opt = predictor_clone_opt.predict(test_data)
y_pred_clone_opt
0        <=50K
1        <=50K
2         >50K
3        <=50K
4        <=50K
         ...  
9764     <=50K
9765     <=50K
9766     <=50K
9767     <=50K
9768     <=50K
Name: class, Length: 9769, dtype: object
y_pred.equals(y_pred_clone_opt)
True
predictor_clone_opt.leaderboard(test_data)
model score_test score_val eval_metric pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 LightGBM 0.839799 0.85 accuracy 0.014848 0.004305 0.218295 0.014848 0.004305 0.218295 1 True 1
1 WeightedEnsemble_L2 0.839799 0.85 accuracy 0.015635 0.005030 0.288410 0.000787 0.000726 0.070114 2 True 2

We can check the disk usage of the optimized clone compared to the original:

size_original = predictor.disk_usage()
size_opt = predictor_clone_opt.disk_usage()
print(f'Size Original:  {size_original} bytes')
print(f'Size Optimized: {size_opt} bytes')
print(f'Optimized predictor achieved a {round((1 - (size_opt/size_original)) * 100, 1)}% reduction in disk usage.')
Size Original:  18351007 bytes
Size Optimized: 182055 bytes
Optimized predictor achieved a 99.0% reduction in disk usage.

We can also investigate the difference in the files that exist in the original and optimized predictor.

Original:

predictor.disk_usage_per_file()
/models/ExtraTreesGini/model.pkl                        5065911
/models/ExtraTreesEntr/model.pkl                        5024141
/models/RandomForestGini/model.pkl                      3408886
/models/RandomForestEntr/model.pkl                      3267285
/models/XGBoost/xgb.ubj                                  506961
/models/LightGBMLarge/model.pkl                          310869
/models/NeuralNetTorch/model.pkl                         253993
/models/NeuralNetFastAI/model-internals.pkl              170508
/models/LightGBM/model.pkl                               147792
/models/CatBoost/model.pkl                                52238
/models/LightGBMXT/model.pkl                              43138
/utils/data/X.pkl                                         27584
/learner.pkl                                              10352
/models/WeightedEnsemble_L2/model.pkl                     10340
/metadata.json                                             9707
/utils/data/X_val.pkl                                      8350
/utils/data/y.pkl                                          7462
/models/XGBoost/model.pkl                                  6180
/models/trainer.pkl                                        5229
/models/NeuralNetFastAI/model.pkl                          2714
/utils/data/y_val.pkl                                      2355
/models/WeightedEnsemble_L2/utils/model_template.pkl       1269
/predictor.pkl                                              903
/models/WeightedEnsemble_L2/utils/oof.pkl                   765
/utils/attr/LightGBM/y_pred_proba_val.pkl                   551
/utils/attr/LightGBMLarge/y_pred_proba_val.pkl              551
/utils/attr/XGBoost/y_pred_proba_val.pkl                    551
/utils/attr/NeuralNetTorch/y_pred_proba_val.pkl             551
/utils/attr/ExtraTreesGini/y_pred_proba_val.pkl             551
/utils/attr/RandomForestGini/y_pred_proba_val.pkl           551
/utils/attr/LightGBMXT/y_pred_proba_val.pkl                 551
/utils/attr/NeuralNetFastAI/y_pred_proba_val.pkl            551
/utils/attr/CatBoost/y_pred_proba_val.pkl                   551
/utils/attr/ExtraTreesEntr/y_pred_proba_val.pkl             551
/utils/attr/RandomForestEntr/y_pred_proba_val.pkl           551
/version.txt                                                 14
Name: size, dtype: int64

Optimized:

predictor_clone_opt.disk_usage_per_file()
/models/LightGBM/model.pkl               147820
/models/WeightedEnsemble_L2/model.pkl     10390
/learner.pkl                              10352
/metadata.json                             9707
/models/trainer.pkl                        2869
/predictor.pkl                              903
/version.txt                                 14
Name: size, dtype: int64
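
As a quick sanity check, the per-file sizes of the optimized clone listed above add up to the total reported earlier by disk_usage(). The sketch below hard-codes the numbers from this run; with a live predictor, the same sum could be taken over the object returned by disk_usage_per_file().

```python
# Per-file sizes (bytes) of the optimized clone, copied from the listing above.
optimized_files = {
    "/models/LightGBM/model.pkl": 147820,
    "/models/WeightedEnsemble_L2/model.pkl": 10390,
    "/learner.pkl": 10352,
    "/metadata.json": 9707,
    "/models/trainer.pkl": 2869,
    "/predictor.pkl": 903,
    "/version.txt": 14,
}

total_bytes = sum(optimized_files.values())
largest_file = max(optimized_files, key=optimized_files.get)
largest_share = optimized_files[largest_file] / total_bytes

print(f"Total: {total_bytes} bytes")  # 182055, the 'Size Optimized' value reported earlier
print(f"Largest file: {largest_file} ({largest_share:.1%} of the total)")
```

Nearly all of the remaining footprint is the single LightGBM model file, so any further size reduction would have to come from the model itself rather than from the surrounding artifacts.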

Compile models for maximized inference speed

To further improve inference efficiency, we can call .compile() to automatically convert sklearn function calls into their ONNX equivalents. This is currently an experimental feature that only accelerates RandomForest and TabularNeuralNetwork models. Compilation and accelerated inference require the skl2onnx and onnxruntime packages. To install supported versions of these packages, run pip install autogluon.tabular[skl2onnx] on top of an existing AutoGluon installation, or pip install autogluon.tabular[all,skl2onnx] for a new AutoGluon installation.

Make sure to compile a clone rather than the original predictor, because a predictor whose models have been compiled no longer supports fitting.
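
Before compiling, it can help to confirm that the optional packages are importable in the current environment. The following is a small sketch using the standard library; missing_packages is a hypothetical helper name, and the check only tests whether the modules can be found, not whether their versions are the ones AutoGluon supports.

```python
import importlib.util


def missing_packages(names=("skl2onnx", "onnxruntime")):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print(f"Install before compiling: {', '.join(missing)}")
else:
    print("Compilation dependencies found.")
```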

predictor_clone_opt.compile()
Compiling 2 Models ...
Skipping compilation for WeightedEnsemble_L2 ... (No config specified)
Skipping compilation for LightGBM ... (No config specified)
Finished compiling models, total runtime = 0s.

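With a compiled predictor, the prediction results might not be exactly the same but should be very close. In this run, the log above shows that compilation was skipped for both remaining models (WeightedEnsemble_L2 and LightGBM), so the predictions here should match the earlier ones.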

y_pred_compile_opt = predictor_clone_opt.predict(test_data)
y_pred_compile_opt
0        <=50K
1        <=50K
2         >50K
3        <=50K
4        <=50K
         ...  
9764     <=50K
9765     <=50K
9766     <=50K
9767     <=50K
9768     <=50K
Name: class, Length: 9769, dtype: object

Now all that is left is to upload the optimized predictor to a centralized storage location such as S3. To use this predictor on a new machine or system, simply download the artifact to local disk and load the predictor. When loading a predictor, make sure you use the same Python version and AutoGluon version that were used during training to avoid instability.
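
The version requirement can be checked before loading. The sketch below assumes that the version.txt file seen in the disk usage listing contains the AutoGluon version string used for training; that content is an assumption, not something shown in this tutorial. The function names are hypothetical, and the expected Python version is supplied by the caller because this sketch does not try to read it from the artifact.

```python
import os
import platform


def read_artifact_version(predictor_path):
    """Read the version string stored in the artifact's version.txt (content assumed)."""
    with open(os.path.join(predictor_path, "version.txt"), encoding="utf-8") as f:
        return f.read().strip()


def check_environment(predictor_path, installed_version, expected_python=None):
    """Hypothetical pre-load check; returns a list of human-readable mismatches."""
    problems = []
    trained_version = read_artifact_version(predictor_path)
    if trained_version != installed_version:
        problems.append(
            f"AutoGluon {installed_version} is installed, "
            f"but the artifact records {trained_version}"
        )
    if expected_python is not None:
        # Compare major.minor only, e.g. "3.12".
        current_python = ".".join(platform.python_version_tuple()[:2])
        if current_python != expected_python:
            problems.append(
                f"Python {current_python} is running, but {expected_python} was expected"
            )
    return problems
```

The installed version could be obtained with importlib.metadata.version("autogluon.tabular"), for example, and an empty list returned by check_environment would indicate that no mismatch was detected.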