Predicting Columns in a Table - Quick Start

Via a simple fit() call, AutoGluon can produce highly accurate models to predict the values in one column of a data table based on the rest of the columns’ values. Use AutoGluon with tabular data for both classification and regression problems. This tutorial demonstrates how to use AutoGluon to produce a classification model that predicts whether or not a person’s income exceeds $50,000.

To start, import autogluon and its TabularPrediction module, aliased as task:

import autogluon as ag
from autogluon import TabularPrediction as task

Load training data from a CSV file into an AutoGluon Dataset object. This object is essentially equivalent to a Pandas DataFrame and the same methods can be applied to both.

train_data = task.Dataset(file_path='https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
subsample_size = 500  # subsample subset of data for faster demo, try setting this to much larger values
train_data = train_data.sample(n=subsample_size, random_state=0)
print(train_data.head())
Loaded data from: https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv | Columns = 15 / 15 | Rows = 39073 -> 39073
       age workclass  fnlwgt      education  education-num
6118    51   Private   39264   Some-college             10
23204   58   Private   51662           10th              6
29590   40   Private  326310   Some-college             10
18116   37   Private  222450        HS-grad              9
33964   62   Private  109190      Bachelors             13

            marital-status        occupation    relationship    race      sex
6118    Married-civ-spouse   Exec-managerial            Wife   White   Female
23204   Married-civ-spouse     Other-service            Wife   White   Female
29590   Married-civ-spouse      Craft-repair         Husband   White     Male
18116        Never-married             Sales   Not-in-family   White     Male
33964   Married-civ-spouse   Exec-managerial         Husband   White     Male

       capital-gain  capital-loss  hours-per-week  native-country   class
6118              0             0              40   United-States    >50K
23204             0             0               8   United-States   <=50K
29590             0             0              44   United-States   <=50K
18116             0          2339              40     El-Salvador   <=50K
33964         15024             0              40   United-States    >50K

Note that we loaded data from a CSV file stored in the cloud (an AWS S3 bucket), but you can specify a local file path instead if you have already downloaded the CSV file to your own machine (e.g., using wget). Each row in the table train_data corresponds to a single training example. In this particular dataset, each row corresponds to an individual person, and the columns contain various characteristics reported during a census.

Let’s first use these features to predict whether the person’s income exceeds $50,000 or not, which is recorded in the class column of this table.

label_column = 'class'
print("Summary of class variable: \n", train_data[label_column].describe())
Summary of class variable:
 count        500
unique         2
top        <=50K
freq         365
Name: class, dtype: object

Now use AutoGluon to train multiple models:

dir = 'agModels-predictClass'  # specifies folder where to store trained models
predictor = task.fit(train_data=train_data, label=label_column, output_directory=dir)
Beginning AutoGluon training ...
AutoGluon will save models to agModels-predictClass/
AutoGluon Version:  0.0.14b20201027
Train Data Rows:    500
Train Data Columns: 14
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
    2 unique label values:  [' >50K', ' <=50K']
    If 'binary' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Selected class <--> label mapping:  class 1 =  <=50K, class 0 =  >50K
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
    Available Memory:                    22028.71 MB
    Train Data (Original)  Memory Usage: 0.3 MB (0.0% of available memory)
    Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
    Stage 1 Generators:
            Fitting AsTypeFeatureGenerator...
    Stage 2 Generators:
            Fitting FillNaFeatureGenerator...
    Stage 3 Generators:
            Fitting IdentityFeatureGenerator...
            Fitting CategoryFeatureGenerator...
                    Fitting CategoryMemoryMinimizeFeatureGenerator...
    Stage 4 Generators:
            Fitting DropUniqueFeatureGenerator...
    Types of features in original data (raw dtype, special dtypes):
            ('int', [])    : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
            ('object', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
    Types of features in processed data (raw dtype, special dtypes):
            ('category', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
            ('int', [])      : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
    0.1s = Fit runtime
    14 features in original data used to generate 14 features in processed data.
    Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.06s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
    To change this, specify the eval_metric argument of fit()
AutoGluon will early stop models using evaluation metric: 'accuracy'
Fitting model: RandomForestClassifierGini ...
    0.88     = Validation accuracy score
    0.52s    = Training runtime
    0.11s    = Validation runtime
Fitting model: RandomForestClassifierEntr ...
    0.88     = Validation accuracy score
    0.52s    = Training runtime
    0.11s    = Validation runtime
Fitting model: ExtraTreesClassifierGini ...
    0.87     = Validation accuracy score
    0.41s    = Training runtime
    0.11s    = Validation runtime
Fitting model: ExtraTreesClassifierEntr ...
    0.87     = Validation accuracy score
    0.41s    = Training runtime
    0.11s    = Validation runtime
Fitting model: KNeighborsClassifierUnif ...
    0.76     = Validation accuracy score
    0.0s     = Training runtime
    0.1s     = Validation runtime
Fitting model: KNeighborsClassifierDist ...
    0.75     = Validation accuracy score
    0.0s     = Training runtime
    0.1s     = Validation runtime
Fitting model: LightGBMClassifier ...
    0.87     = Validation accuracy score
    0.18s    = Training runtime
    0.01s    = Validation runtime
Fitting model: LightGBMClassifierXT ...
    0.91     = Validation accuracy score
    0.17s    = Training runtime
    0.01s    = Validation runtime
Fitting model: CatboostClassifier ...
    0.91     = Validation accuracy score
    0.78s    = Training runtime
    0.01s    = Validation runtime
Fitting model: NeuralNetClassifier ...
    0.87     = Validation accuracy score
    6.42s    = Training runtime
    0.02s    = Validation runtime
Fitting model: LightGBMClassifierCustom ...
    0.82     = Validation accuracy score
    0.46s    = Training runtime
    0.01s    = Validation runtime
Fitting model: weighted_ensemble_k0_l1 ...
    0.93     = Validation accuracy score
    0.25s    = Training runtime
    0.0s     = Validation runtime
AutoGluon training complete, total runtime = 11.84s ...

Next, load separate test data to demonstrate how to make predictions on new examples at inference time:

test_data = task.Dataset(file_path='https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')
y_test = test_data[label_column]  # values to predict
test_data_nolab = test_data.drop(labels=[label_column],axis=1)  # delete label column to prove we're not cheating
print(test_data_nolab.head())
Loaded data from: https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv | Columns = 15 / 15 | Rows = 9769 -> 9769
   age          workclass  fnlwgt      education  education-num
0   31            Private  169085           11th              7
1   17   Self-emp-not-inc  226203           12th              8
2   47            Private   54260      Assoc-voc             11
3   21            Private  176262   Some-college             10
4   17            Private  241185           12th              8

        marital-status        occupation relationship    race      sex
0   Married-civ-spouse             Sales         Wife   White   Female
1        Never-married             Sales    Own-child   White     Male
2   Married-civ-spouse   Exec-managerial      Husband   White     Male
3        Never-married   Exec-managerial    Own-child   White   Female
4        Never-married    Prof-specialty    Own-child   White     Male

   capital-gain  capital-loss  hours-per-week  native-country
0             0             0              20   United-States
1             0             0              45   United-States
2             0          1887              60   United-States
3             0             0              30   United-States
4             0             0              20   United-States

We use our trained models to make predictions on the new data and then evaluate performance:

predictor = task.load(dir)  # unnecessary, just demonstrates how to load previously-trained predictor from file

y_pred = predictor.predict(test_data_nolab)
print("Predictions:  ", y_pred)
perf = predictor.evaluate_predictions(y_true=y_test, y_pred=y_pred, auxiliary_metrics=True)
Evaluation: accuracy on test data: 0.8367284266557478
Evaluations on test data:
{
    "accuracy": 0.8367284266557478,
    "accuracy_score": 0.8367284266557478,
    "balanced_accuracy_score": 0.7332244231481168,
    "matthews_corrcoef": 0.5159814299085932,
    "f1_score": 0.8367284266557478
}
Predictions:   [' <=50K' ' <=50K' ' <=50K' ... ' <=50K' ' <=50K' ' <=50K']
Detailed (per-class) classification report:
{
    " <=50K": {
        "precision": 0.8657257057207095,
        "recall": 0.9302107099718159,
        "f1-score": 0.8968105065666041,
        "support": 7451
    },
    " >50K": {
        "precision": 0.7050482132728304,
        "recall": 0.5362381363244176,
        "f1-score": 0.6091644204851752,
        "support": 2318
    },
    "accuracy": 0.8367284266557478,
    "macro avg": {
        "precision": 0.78538695949677,
        "recall": 0.7332244231481168,
        "f1-score": 0.7529874635258896,
        "support": 9769
    },
    "weighted avg": {
        "precision": 0.8275999582036471,
        "recall": 0.8367284266557478,
        "f1-score": 0.8285574993461361,
        "support": 9769
    }
}
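
The aggregate scores in this report follow from the per-class numbers. As a quick sanity check in plain Python (values copied from the output above; the variable names are ours), balanced accuracy is the unweighted mean of the per-class recalls, and overall accuracy is the support-weighted mean of those recalls:

```python
# Per-class recall and support, copied from the classification report above.
recall = {" <=50K": 0.9302107099718159, " >50K": 0.5362381363244176}
support = {" <=50K": 7451, " >50K": 2318}

# Balanced accuracy: unweighted mean of per-class recall.
balanced_accuracy = sum(recall.values()) / len(recall)

# Accuracy: recall weighted by how many test rows belong to each class.
total = sum(support.values())
accuracy = sum(recall[c] * support[c] for c in recall) / total

print(round(balanced_accuracy, 4))  # 0.7332
print(round(accuracy, 4))           # 0.8367
```

The gap between the two scores reflects the class imbalance in the test data: the majority class ' <=50K' is recalled far better than ' >50K'.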

Now you’re ready to try AutoGluon on your own tabular datasets! As long as they’re stored in a popular format like CSV, you should be able to achieve strong predictive performance with just 2 lines of code:

from autogluon import TabularPrediction as task
predictor = task.fit(train_data=task.Dataset(file_path=<file-name>), label=<variable-name>)

Note: This simple call to fit() is intended for your first prototype model. In a subsequent section, we’ll demonstrate how to maximize predictive performance by additionally specifying two fit() arguments: presets and eval_metric.

Description of fit():

Here we discuss what happened during fit().

Since there are only two possible values of the class variable, this was a binary classification problem, for which an appropriate performance metric is accuracy. AutoGluon automatically infers this as well as the type of each feature (i.e., which columns contain continuous numbers vs. discrete categories). AutoGluon can also automatically handle common issues like missing data and rescaling feature values.
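
To make the inference step concrete, here is a simplified heuristic in plain Python. It is only an illustration modeled on the log messages in this tutorial ("only two unique label-values observed"; "dtype of label-column == int and many unique label-values observed"), not AutoGluon's actual implementation. The function name guess_problem_type and the many_unique cutoff are hypothetical.

```python
def guess_problem_type(labels, many_unique=20):
    """Guess 'binary', 'multiclass', or 'regression' from label values.

    Hypothetical heuristic for illustration; `many_unique` is an assumed cutoff.
    """
    unique = set(labels)
    if len(unique) == 2:
        return "binary"
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in unique
    )
    if all_numeric and len(unique) >= many_unique:
        return "regression"
    return "multiclass"


print(guess_problem_type([" >50K", " <=50K", " <=50K"]))  # binary
print(guess_problem_type(list(range(17, 86))))            # regression
print(guess_problem_type(["red", "green", "blue"]))       # multiclass
```

If the automatic guess is wrong for your data, the training log notes that you can pass the problem_type argument to fit().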

We did not specify separate validation data, so AutoGluon automatically chooses a random training/validation split of the data. The data used for validation is separated from the training data and is used to determine the models and hyperparameter values that produce the best results. Rather than just a single model, AutoGluon trains multiple models and ensembles them together to improve predictive performance.
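
A random holdout split of this kind can be sketched in a few lines of plain Python. This is an illustration only: the helper holdout_split, the 20% validation fraction, and the fixed seed are assumptions, and AutoGluon decides its own split internally.

```python
import random


def holdout_split(rows, val_fraction=0.2, seed=0):
    """Shuffle row indices and hold out a fraction of rows for validation."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_val = int(len(rows) * val_fraction)
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    return [rows[i] for i in train_idx], [rows[i] for i in val_idx]


rows = list(range(500))  # stand-in for the 500 sampled training rows
train_rows, val_rows = holdout_split(rows)
print(len(train_rows), len(val_rows))  # 400 100
```

Models are then fit on train_rows only, and val_rows is used to compare them.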

By default, AutoGluon tries to fit various types of models, including neural networks and tree ensembles. Each type of model has various hyperparameters, which the user would traditionally have to specify. AutoGluon automates this process.

AutoGluon automatically and iteratively tests values for hyperparameters to produce the best performance on the validation data. This involves repeatedly training models under different hyperparameter settings and evaluating their performance. This process can be computationally intensive, so fit() can parallelize it across multiple threads (and machines if distributed resources are available). To control runtimes, you can specify various arguments in fit() as demonstrated in the subsequent In-Depth tutorial.

For tabular problems, fit() returns a Predictor object. For classification, you can easily output predicted class probabilities instead of predicted classes:

pred_probs = predictor.predict_proba(test_data_nolab)
positive_class = [label for label in predictor.class_labels if predictor.class_labels_internal_map[label]==1][0]  # which label is considered 'positive' class
print(f"Predicted probabilities of class '{positive_class}':", pred_probs)
Predicted probabilities of class ' <=50K': [0.84353125 0.9664155  0.5624747  ... 0.71672904 0.99887526 0.6153773 ]
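
For a binary problem, the printed values are probabilities of the positive class, so class labels can be recovered by thresholding. The sketch below uses the first three probabilities printed above and the class mapping from the training log (class 1 = ' <=50K', class 0 = ' >50K'). The 0.5 threshold is an assumption made for illustration.

```python
# First three positive-class probabilities from the output above.
sample_probs = [0.84353125, 0.9664155, 0.5624747]

# Class mapping reported in the training log.
positive_class, negative_class = " <=50K", " >50K"

threshold = 0.5  # assumed decision threshold
labels = [positive_class if p >= threshold else negative_class for p in sample_probs]
print(labels)  # [' <=50K', ' <=50K', ' <=50K']
```

These agree with the first three entries of the predictions printed earlier.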

Besides inference, this object can also summarize what happened during fit.

results = predictor.fit_summary()
* Summary of fit() *
Estimated performance of each model:
                         model  score_val  pred_time_val  fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0      weighted_ensemble_k0_l1       0.93       0.120496  0.833716                0.000601           0.253308            1       True         12
1         LightGBMClassifierXT       0.91       0.011834  0.167455                0.011834           0.167455            0       True          8
2           CatboostClassifier       0.91       0.011948  0.779188                0.011948           0.779188            0       True          9
3   RandomForestClassifierGini       0.88       0.108146  0.520327                0.108146           0.520327            0       True          1
4   RandomForestClassifierEntr       0.88       0.108371  0.517802                0.108371           0.517802            0       True          2
5           LightGBMClassifier       0.87       0.011594  0.178709                0.011594           0.178709            0       True          7
6          NeuralNetClassifier       0.87       0.024590  6.418294                0.024590           6.418294            0       True         10
7     ExtraTreesClassifierGini       0.87       0.108060  0.412953                0.108060           0.412953            0       True          3
8     ExtraTreesClassifierEntr       0.87       0.108263  0.411150                0.108263           0.411150            0       True          4
9     LightGBMClassifierCustom       0.82       0.012151  0.461372                0.012151           0.461372            0       True         11
10    KNeighborsClassifierUnif       0.76       0.103539  0.002171                0.103539           0.002171            0       True          5
11    KNeighborsClassifierDist       0.75       0.103456  0.002164                0.103456           0.002164            0       True          6
Number of models trained: 12
Types of models trained:
{'LGBModel', 'WeightedEnsembleModel', 'TabularNeuralNetModel', 'KNNModel', 'CatboostModel', 'RFModel', 'XTModel'}
Bagging used: False
Stack-ensembling used: False
Hyperparameter-tuning used: False
User-specified hyperparameters:
{'default': {'NN': [{}], 'GBM': [{}, {'extra_trees': True, 'AG_args': {'name_suffix': 'XT'}}], 'CAT': [{}], 'RF': [{'criterion': 'gini', 'AG_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'AG_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}], 'XT': [{'criterion': 'gini', 'AG_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'AG_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}], 'KNN': [{'weights': 'uniform', 'AG_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'AG_args': {'name_suffix': 'Dist'}}], 'custom': [{'num_boost_round': 10000, 'num_threads': -1, 'objective': 'binary', 'verbose': -1, 'boosting_type': 'gbdt', 'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 5, 'two_round': True, 'seed_value': 0, 'AG_args': {'model_type': 'GBM', 'name_suffix': 'Custom', 'disable_in_hpo': True}}]}}
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
('int', [])      : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
Plot summary of models saved to file: agModels-predictClass/SummaryOfModels.html
* End of fit() summary *
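
The weighted_ensemble_k0_l1 entry in this summary combines predictions from other models. One common way to build such an ensemble is a weighted average of the base models' predicted probabilities, and the sketch below shows that idea in plain Python. The model names come from the summary, but the probabilities and weights are made-up numbers, and this is not a description of how AutoGluon computes its ensemble weights.

```python
def weighted_average(probabilities, weights):
    """Combine per-model probability lists into one list using normalized weights."""
    total = sum(weights.values())
    n_rows = len(next(iter(probabilities.values())))
    return [
        sum(weights[m] * probabilities[m][i] for m in weights) / total
        for i in range(n_rows)
    ]


# Hypothetical positive-class probabilities for three rows.
probabilities = {
    "LightGBMClassifierXT": [0.90, 0.40, 0.70],
    "CatboostClassifier": [0.80, 0.60, 0.50],
}
weights = {"LightGBMClassifierXT": 0.75, "CatboostClassifier": 0.25}  # hypothetical

ensemble = weighted_average(probabilities, weights)
print([round(p, 3) for p in ensemble])  # [0.875, 0.45, 0.65]
```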

From this summary, we can see that AutoGluon trained many different types of models as well as an ensemble of the best-performing models. The summary also describes the actual models that were trained during fit and how well each model performed on the held-out validation data. We can view what properties AutoGluon automatically inferred about our prediction task:

print("AutoGluon infers problem type is: ", predictor.problem_type)
print("AutoGluon identified the following types of features:")
print(predictor.feature_metadata)
AutoGluon infers problem type is:  binary
AutoGluon identified the following types of features:
('category', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
('int', [])      : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]

AutoGluon correctly recognized our prediction problem to be a binary classification task and decided that variables such as age should be represented as integers, whereas variables such as workclass should be represented as categorical objects. The feature_metadata attribute allows you to see the inferred data type of each predictive variable after preprocessing (this is its raw dtype; some features may also be associated with additional special dtypes if produced via feature engineering, e.g. numerical representations of a datetime/text column).
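
As a rough illustration of this kind of type inference (not AutoGluon's feature generators; the function guess_raw_dtype is hypothetical), a column whose values all parse as integers can be treated as 'int', and any other column as 'category':

```python
def guess_raw_dtype(values):
    """Return 'int' if every value parses as an integer, else 'category'."""
    try:
        for v in values:
            int(str(v).strip())
    except ValueError:
        return "category"
    return "int"


# A few values from each column of the training table shown earlier.
columns = {
    "age": ["51", "58", "40"],
    "workclass": [" Private", " Private", " Private"],
    "capital-loss": ["0", "0", "2339"],
}
for name, values in columns.items():
    print(name, "->", guess_raw_dtype(values))
# age -> int
# workclass -> category
# capital-loss -> int
```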

We can evaluate the performance of each individual trained model on our (labeled) test data:

predictor.leaderboard(test_data, silent=True)
                         model  score_test  score_val  pred_time_test  pred_time_val  fit_time  pred_time_test_marginal  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0           CatboostClassifier    0.844815       0.91        0.024144       0.011948  0.779188                 0.024144                0.011948           0.779188            0       True          9
1         LightGBMClassifierXT    0.841540       0.91        0.033296       0.011834  0.167455                 0.033296                0.011834           0.167455            0       True          8
2      weighted_ensemble_k0_l1    0.836728       0.93        0.258299       0.120496  0.833716                 0.002807                0.000601           0.253308            1       True         12
3           LightGBMClassifier    0.833657       0.87        0.020627       0.011594  0.178709                 0.020627                0.011594           0.178709            0       True          7
4   RandomForestClassifierGini    0.832531       0.88        0.116970       0.108146  0.520327                 0.116970                0.108146           0.520327            0       True          1
5   RandomForestClassifierEntr    0.829051       0.88        0.116018       0.108371  0.517802                 0.116018                0.108371           0.517802            0       True          2
6     ExtraTreesClassifierEntr    0.820145       0.87        0.219508       0.108263  0.411150                 0.219508                0.108263           0.411150            0       True          4
7     LightGBMClassifierCustom    0.819224       0.82        0.063698       0.012151  0.461372                 0.063698                0.012151           0.461372            0       True         11
8     ExtraTreesClassifierGini    0.819224       0.87        0.222196       0.108060  0.412953                 0.222196                0.108060           0.412953            0       True          3
9          NeuralNetClassifier    0.788822       0.87        1.083219       0.024590  6.418294                 1.083219                0.024590           6.418294            0       True         10
10    KNeighborsClassifierUnif    0.735285       0.76        0.104963       0.103539  0.002171                 0.104963                0.103539           0.002171            0       True          5
11    KNeighborsClassifierDist    0.694953       0.75        0.106245       0.103456  0.002164                 0.106245                0.103456           0.002164            0       True          6

When we call predict(), AutoGluon automatically predicts with the model that displayed the best performance on validation data (i.e. the weighted-ensemble). We can instead specify which model to use for predictions like this:

predictor.predict(test_data, model='NeuralNetClassifier')
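
The default choice described above amounts to picking the model with the highest validation score. Using a few score_val and score_test values copied from the tables in this tutorial, that selection rule can be sketched as follows (the dictionary and variable names are ours):

```python
# (score_val, score_test) pairs copied from the fit summary and leaderboard above.
scores = {
    "weighted_ensemble_k0_l1": (0.93, 0.836728),
    "CatboostClassifier": (0.91, 0.844815),
    "LightGBMClassifierXT": (0.91, 0.841540),
    "NeuralNetClassifier": (0.87, 0.788822),
}

best_by_val = max(scores, key=lambda m: scores[m][0])
best_by_test = max(scores, key=lambda m: scores[m][1])
print(best_by_val)   # weighted_ensemble_k0_l1
print(best_by_test)  # CatboostClassifier
```

The two can differ, as they do in this run: validation scores are estimates, and at inference time only the validation ranking is available.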

The scores of predictive performance above were based on a default evaluation metric (accuracy for binary classification). Performance in certain applications may be measured by different metrics than the ones AutoGluon optimizes for by default. If you know the metric that counts in your application, you should specify it as demonstrated in the next section.

Maximizing predictive performance

To get the best predictive accuracy with AutoGluon, you should generally use it like this:

time_limits = 60 # for quick demonstration only, you should set this to longest time you are willing to wait (in seconds)
metric = 'roc_auc' # specify your evaluation metric here
predictor = task.fit(train_data=train_data, label=label_column, time_limits=time_limits,
                     eval_metric=metric, presets='best_quality')
No output_directory specified. Models will be saved in: AutogluonModels/ag-20201027_210722/
Beginning AutoGluon training ... Time limit = 60s
AutoGluon will save models to AutogluonModels/ag-20201027_210722/
AutoGluon Version:  0.0.14b20201027
Train Data Rows:    500
Train Data Columns: 14
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
    2 unique label values:  [' >50K', ' <=50K']
    If 'binary' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Selected class <--> label mapping:  class 1 =  <=50K, class 0 =  >50K
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
    Available Memory:                    21910.26 MB
    Train Data (Original)  Memory Usage: 0.3 MB (0.0% of available memory)
    Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
    Stage 1 Generators:
            Fitting AsTypeFeatureGenerator...
    Stage 2 Generators:
            Fitting FillNaFeatureGenerator...
    Stage 3 Generators:
            Fitting IdentityFeatureGenerator...
            Fitting CategoryFeatureGenerator...
                    Fitting CategoryMemoryMinimizeFeatureGenerator...
    Stage 4 Generators:
            Fitting DropUniqueFeatureGenerator...
    Types of features in original data (raw dtype, special dtypes):
            ('int', [])    : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
            ('object', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
    Types of features in processed data (raw dtype, special dtypes):
            ('category', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
            ('int', [])      : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
    0.0s = Fit runtime
    14 features in original data used to generate 14 features in processed data.
    Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.06s ...
AutoGluon will gauge predictive performance using evaluation metric: 'roc_auc'
    This metric expects predicted probabilities rather than predicted class labels, so you'll need to use predict_proba() instead of predict()
    To change this, specify the eval_metric argument of fit()
AutoGluon will early stop models using evaluation metric: 'log_loss'
Fitting model: RandomForestClassifierGini_STACKER_l0 ... Training model for up to 59.94s of the 59.94s of remaining time.
    0.8906   = Validation roc_auc score
    2.59s    = Training runtime
    0.54s    = Validation runtime
Fitting model: RandomForestClassifierEntr_STACKER_l0 ... Training model for up to 56.74s of the 56.74s of remaining time.
    0.8892   = Validation roc_auc score
    2.59s    = Training runtime
    0.54s    = Validation runtime
Fitting model: ExtraTreesClassifierGini_STACKER_l0 ... Training model for up to 53.54s of the 53.54s of remaining time.
    0.8892   = Validation roc_auc score
    2.07s    = Training runtime
    0.54s    = Validation runtime
Fitting model: ExtraTreesClassifierEntr_STACKER_l0 ... Training model for up to 50.85s of the 50.85s of remaining time.
    0.8907   = Validation roc_auc score
    2.06s    = Training runtime
    0.54s    = Validation runtime
Fitting model: KNeighborsClassifierUnif_STACKER_l0 ... Training model for up to 48.16s of the 48.16s of remaining time.
    0.5214   = Validation roc_auc score
    0.02s    = Training runtime
    0.51s    = Validation runtime
Fitting model: KNeighborsClassifierDist_STACKER_l0 ... Training model for up to 47.61s of the 47.61s of remaining time.
    0.5415   = Validation roc_auc score
    0.02s    = Training runtime
    0.51s    = Validation runtime
Fitting model: LightGBMClassifier_STACKER_l0 ... Training model for up to 47.06s of the 47.06s of remaining time.
    0.892    = Validation roc_auc score
    0.98s    = Training runtime
    0.06s    = Validation runtime
Fitting model: LightGBMClassifierXT_STACKER_l0 ... Training model for up to 46.01s of the 46.0s of remaining time.
    0.8994   = Validation roc_auc score
    0.95s    = Training runtime
    0.06s    = Validation runtime
Fitting model: CatboostClassifier_STACKER_l0 ... Training model for up to 44.97s of the 44.97s of remaining time.
    0.8961   = Validation roc_auc score
    2.6s     = Training runtime
    0.06s    = Validation runtime
Fitting model: NeuralNetClassifier_STACKER_l0 ... Training model for up to 42.29s of the 42.29s of remaining time.
    0.8337   = Validation roc_auc score
    27.33s   = Training runtime
    0.13s    = Validation runtime
Fitting model: LightGBMClassifierCustom_STACKER_l0 ... Training model for up to 14.79s of the 14.79s of remaining time.
    0.8673   = Validation roc_auc score
    1.71s    = Training runtime
    0.06s    = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: weighted_ensemble_k0_l1 ... Training model for up to 59.94s of the 12.98s of remaining time.
    0.9064   = Validation roc_auc score
    1.02s    = Training runtime
    0.0s     = Validation runtime
AutoGluon training complete, total runtime = 48.06s ...

This command implements the following strategy to maximize accuracy:

  • Specify the argument presets='best_quality', which allows AutoGluon to automatically construct powerful model ensembles based on stacking/bagging, and will greatly improve the resulting predictions if granted sufficient training time. The default value of presets is 'medium_quality_faster_train', which produces less accurate models but facilitates faster prototyping. With presets, you can flexibly prioritize predictive accuracy vs. training/inference speed. For example, if you care less about predictive performance and want to quickly deploy a basic model, consider using: presets=['good_quality_faster_inference_only_refit', 'optimize_for_deployment'].

  • Provide the eval_metric if you know what metric will be used to evaluate predictions in your application. Some other non-default metrics you might use include: 'f1' (for binary classification), 'roc_auc' (for binary classification), 'log_loss' (for classification), 'mean_absolute_error' (for regression), 'median_absolute_error' (for regression). You can also define your own custom metric function; see examples in the folder: autogluon/utils/tabular/metrics/

  • Include all your data in train_data and do not provide tuning_data (AutoGluon will split the data more intelligently to fit its needs).

  • Do not specify the hyperparameter_tune argument (counterintuitively, hyperparameter tuning is not the best way to spend a limited training-time budget, as model ensembling is often superior). We recommend you only use hyperparameter_tune if your goal is to deploy a single model rather than an ensemble.

  • Do not specify the hyperparameters argument (allow AutoGluon to adaptively select which models/hyperparameters to use).

  • Set time_limits to the longest amount of time (in seconds) that you are willing to wait. AutoGluon’s predictive performance improves the longer fit() is allowed to run.
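To see why the choice of eval_metric matters, the following minimal sketch compares two of the regression metrics named above on the same predictions. It uses only the Python standard library rather than AutoGluon, and the ages and predictions are hypothetical values chosen for illustration:

```python
import statistics

# Hypothetical true ages and predictions; the last prediction is badly off.
y_true = [30, 40, 50, 60, 70]
y_pred = [32, 38, 51, 59, 20]

# Absolute error of each prediction: [2, 2, 1, 1, 50]
abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]

# The mean is pulled up by the single large error; the median is not.
mean_abs_error = statistics.mean(abs_errors)
median_abs_error = statistics.median(abs_errors)

print("mean absolute error:  ", mean_abs_error)
print("median absolute error:", median_abs_error)
```

A model tuned for one of these metrics can look poor under the other, so it helps to pass the metric your application actually uses.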

Regression (predicting numeric table columns):

To demonstrate that fit() can also automatically handle regression tasks, we now try to predict the numeric age variable in the same table based on the other features:

age_column = 'age'
print("Summary of age variable: \n", train_data[age_column].describe())
Summary of age variable:
 count    500.00000
mean      39.65200
std       13.52393
min       17.00000
25%       29.00000
50%       38.00000
75%       49.00000
max       85.00000
Name: age, dtype: float64

We again call fit(), this time imposing a time limit (in seconds), and also demonstrate a shorthand method to evaluate the resulting model on the test data (which contain labels):

predictor_age = task.fit(train_data=train_data, output_directory="agModels-predictAge", label=age_column, time_limits=60)
performance = predictor_age.evaluate(test_data)
Beginning AutoGluon training ... Time limit = 60s
AutoGluon will save models to agModels-predictAge/
AutoGluon Version:  0.0.14b20201027
Train Data Rows:    500
Train Data Columns: 14
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
    Label info (max, min, mean, stddev): (85, 17, 39.652, 13.52393)
    If 'regression' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
    Available Memory:                    21857.84 MB
    Train Data (Original)  Memory Usage: 0.32 MB (0.0% of available memory)
    Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
    Stage 1 Generators:
            Fitting AsTypeFeatureGenerator...
    Stage 2 Generators:
            Fitting FillNaFeatureGenerator...
    Stage 3 Generators:
            Fitting IdentityFeatureGenerator...
            Fitting CategoryFeatureGenerator...
                    Fitting CategoryMemoryMinimizeFeatureGenerator...
    Stage 4 Generators:
            Fitting DropUniqueFeatureGenerator...
    Types of features in original data (raw dtype, special dtypes):
            ('int', [])    : 5 | ['fnlwgt', 'education-num', 'capital-gain', 'capital-loss', 'hours-per-week']
            ('object', []) : 9 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
    Types of features in processed data (raw dtype, special dtypes):
            ('category', []) : 9 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
            ('int', [])      : 5 | ['fnlwgt', 'education-num', 'capital-gain', 'capital-loss', 'hours-per-week']
    0.1s = Fit runtime
    14 features in original data used to generate 14 features in processed data.
    Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.06s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
    To change this, specify the eval_metric argument of fit()
AutoGluon will early stop models using evaluation metric: 'root_mean_squared_error'
Fitting model: RandomForestRegressorMSE ... Training model for up to 59.94s of the 59.94s of remaining time.
    -11.6028         = Validation root_mean_squared_error score
    0.51s    = Training runtime
    0.11s    = Validation runtime
Fitting model: ExtraTreesRegressorMSE ... Training model for up to 59.3s of the 59.3s of remaining time.
    -11.7519         = Validation root_mean_squared_error score
    0.41s    = Training runtime
    0.11s    = Validation runtime
Fitting model: KNeighborsRegressorUnif ... Training model for up to 58.76s of the 58.76s of remaining time.
    -15.6869         = Validation root_mean_squared_error score
    0.0s     = Training runtime
    0.1s     = Validation runtime
Fitting model: KNeighborsRegressorDist ... Training model for up to 58.65s of the 58.65s of remaining time.
    -15.1801         = Validation root_mean_squared_error score
    0.0s     = Training runtime
    0.1s     = Validation runtime
Fitting model: LightGBMRegressor ... Training model for up to 58.54s of the 58.54s of remaining time.
    -11.9474         = Validation root_mean_squared_error score
    0.18s    = Training runtime
    0.01s    = Validation runtime
Fitting model: LightGBMRegressorXT ... Training model for up to 58.35s of the 58.34s of remaining time.
    -11.7971         = Validation root_mean_squared_error score
    0.17s    = Training runtime
    0.01s    = Validation runtime
Fitting model: CatboostRegressor ... Training model for up to 58.16s of the 58.16s of remaining time.
    -11.9308         = Validation root_mean_squared_error score
    0.36s    = Training runtime
    0.01s    = Validation runtime
Fitting model: NeuralNetRegressor ... Training model for up to 57.79s of the 57.79s of remaining time.
    -13.1903         = Validation root_mean_squared_error score
    6.62s    = Training runtime
    0.03s    = Validation runtime
Fitting model: LightGBMRegressorCustom ... Training model for up to 51.13s of the 51.13s of remaining time.
    -12.1676         = Validation root_mean_squared_error score
    0.43s    = Training runtime
    0.01s    = Validation runtime
Fitting model: weighted_ensemble_k0_l1 ... Training model for up to 59.94s of the 50.09s of remaining time.
    -11.2598         = Validation root_mean_squared_error score
    0.39s    = Training runtime
    0.0s     = Validation runtime
AutoGluon training complete, total runtime = 10.31s ...
Predictive performance on given dataset: root_mean_squared_error = 10.63547608742431

Note that we didn’t need to tell AutoGluon this is a regression problem; it automatically inferred this from the data and reported the appropriate performance metric (RMSE by default). To specify an evaluation metric other than the default, set the eval_metric argument of fit() and AutoGluon will tailor its models to optimize your metric (e.g. eval_metric = 'mean_absolute_error'). For evaluation metrics where higher values are worse (like RMSE), AutoGluon may flip their sign and print them as negative values during training (as it internally assumes higher values are better). This is why the validation scores in the log above are negative, while evaluate() reports the RMSE as a positive value.
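The sign convention can be illustrated with a short sketch. This is not AutoGluon's internal code; the helper names rmse and higher_is_better_score are hypothetical, and the small age lists are made-up values. The scores dictionary at the end copies three validation scores from the training log above:

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: lower values mean better predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def higher_is_better_score(y_true, y_pred):
    """Negated RMSE, so that a larger (less negative) score is better."""
    return -rmse(y_true, y_pred)

# Hypothetical ages and two sets of predictions.
ages = [25, 40, 60]
close_preds = [28, 36, 60]
far_preds = [35, 30, 50]

print(higher_is_better_score(ages, close_preds))
print(higher_is_better_score(ages, far_preds))

# Validation scores copied from the log above (negated RMSE).
scores = {
    "RandomForestRegressorMSE": -11.6028,
    "KNeighborsRegressorUnif": -15.6869,
    "weighted_ensemble_k0_l1": -11.2598,
}
# With negated scores, the best model is simply the one with the maximum score.
best_model = max(scores, key=scores.get)
print(best_model)
```

Negating error metrics lets a single "pick the maximum" rule work for both accuracy-style and error-style metrics.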

Data Formats: AutoGluon can currently operate on data tables already loaded into Python as pandas DataFrames, or stored in CSV or Parquet files. If your data live in multiple tables, you will first need to join them into a single table whose rows correspond to statistically independent observations (datapoints) and columns correspond to different features (a.k.a. variables/covariates).
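As a minimal sketch of such a join, assuming the data are already in pandas DataFrames, the example below merges a hypothetical per-person table with a hypothetical employer lookup table. The table names, column names, and values are invented for illustration and are not part of the dataset used in this tutorial:

```python
import pandas as pd

# Hypothetical table with one row per person (one observation per row).
people = pd.DataFrame({
    "person_id": [1, 2, 3],
    "age": [51, 58, 40],
    "employer_id": [10, 10, 20],
})

# Hypothetical lookup table with one row per employer.
employers = pd.DataFrame({
    "employer_id": [10, 20],
    "workclass": ["Private", "State-gov"],
})

# A left join keeps exactly one row per person.
# validate="many_to_one" raises an error if employer_id is duplicated in
# `employers`, which would otherwise silently duplicate person rows.
single_table = people.merge(
    employers, on="employer_id", how="left", validate="many_to_one"
)
print(single_table)
```

Checking that the joined table has the same number of rows as the per-observation table is a quick way to confirm that each row still corresponds to one datapoint.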