Text Prediction - Customization and Hyperparameter Search

This advanced tutorial teaches you how to control the hyperparameter tuning process in TextPredictor by specifying:

  • A custom search space of candidate hyperparameter values to consider.

  • Which hyperparameter optimization (HPO) method should be used to actually search through this space.

import numpy as np
import warnings
import autogluon as ag
warnings.filterwarnings('ignore')
np.random.seed(123)

Stanford Sentiment Treebank Data

For demonstration, we use the Stanford Sentiment Treebank (SST) dataset.

from autogluon.core.utils.loaders.load_pd import load
subsample_size = 1000  # subsample for a faster demo; you may try a larger value
train_data = load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/train.parquet')
test_data = load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/dev.parquet')
train_data = train_data.sample(n=subsample_size, random_state=0)
train_data.head(10)
                                                sentence  label
43787                 very pleasing at its best moments       1
16159  , american chai is enough to make you put away...      0
59015  too much like an infomercial for ram dass 's l...      0
5108                          a stirring visual sequence      1
67052                            cool visual backmasking      1
35938                                        hard ground      0
49879  the striking , quietly vulnerable personality ...      1
51591  pan nalin 's exposition is beautiful and myste...      1
56780                                  wonderfully loopy      1
28518                         most beautiful , evocative      1
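
The task is binary sentiment classification (label 1 for positive, 0 for negative). As a quick sanity check on the subsample, you can inspect the label balance with plain pandas:

train_data['label'].value_counts()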

Configuring the TextPredictor

Pre-configured Hyperparameters

We provide a series of pre-configured hyperparameter settings. You can list the available preset names in ag_text_presets via list_presets:

from autogluon.text import ag_text_presets, list_presets
list_presets()
{'simple_presets': ['default',
  'lower_quality_fast_train',
  'medium_quality_faster_train',
  'best_quality'],
 'advanced_presets': ['electra_small_fuse_late',
  'electra_base_fuse_late',
  'electra_large_fuse_late',
  'roberta_base_fuse_late',
  'multi_cased_bert_base_fuse_late',
  'electra_base_fuse_early',
  'electra_base_all_text']}

There are two kinds of presets. The simple_presets are pre-defined configurations recommended for most users; they let you specify whether you care more about predictive accuracy ('best_quality') or about training/inference speed ('lower_quality_fast_train').
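
For example, a minimal sketch of fitting with one of the simple presets (quick_predictor is a name introduced here for illustration; the preset is one of the simple_presets listed above):

from autogluon.text import TextPredictor
quick_predictor = TextPredictor(label='label', eval_metric='acc')
quick_predictor.fit(train_data, presets='medium_quality_faster_train', time_limit=60)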

The advanced_presets are pre-configured networks using different Transformer backbones such as ELECTRA, RoBERTa, or Multilingual BERT, and different feature-fusion strategies. For example, electra_small_fuse_late means we use the ELECTRA-small model as the network backbone for text fields and aggregate the features from all fields with a late-fusion strategy. The default preset is the same as electra_base_fuse_late. Now let's train a model on our data with a specified preset.

from autogluon.text import TextPredictor
predictor = TextPredictor(path='ag_text_sst_electra_small', eval_metric='acc', label='label')
predictor.set_verbosity(0)
predictor.fit(train_data, presets='electra_small_fuse_late', time_limit=60, seed=123)
All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_electra_small/task0/training.log
<autogluon.text.text_prediction.predictor.predictor.TextPredictor at 0x7f456b8a0bd0>

Below we report both f1 and acc metrics for our predictions. Note that if you really want to obtain the best F1 score, you should set eval_metric='f1' when constructing the TextPredictor.

predictor.evaluate(test_data, metrics=['f1', 'acc'])
{'f1': 0.7720504009163803, 'acc': 0.7717889908256881}
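
If F1 is the metric you actually care about, construct the predictor accordingly (a one-line sketch; the path name is arbitrary):

predictor_f1 = TextPredictor(path='ag_text_sst_f1', eval_metric='f1', label='label')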

To view the pre-registered hyperparameters, you can call ag_text_presets.create(presets_name), e.g.,

import pprint
pprint.pprint(ag_text_presets.create('electra_small_fuse_late'))
{'models': {'MultimodalTextModel': {'backend': 'gluonnlp_v0',
                                    'search_space': {'model.backbone.name': 'google_electra_small',
                                                     'model.network.agg_net.agg_type': 'concat',
                                                     'model.network.aggregate_categorical': True,
                                                     'model.use_avg_nbest': True,
                                                     'optimization.batch_size': 128,
                                                     'optimization.layerwise_lr_decay': 0.8,
                                                     'optimization.lr': Categorical[0.0001],
                                                     'optimization.nbest': 3,
                                                     'optimization.num_train_epochs': 10,
                                                     'optimization.per_device_batch_size': 8,
                                                     'optimization.wd': 0.0001,
                                                     'preprocessing.categorical.convert_to_text': False,
                                                     'preprocessing.numerical.convert_to_text': False}}},
 'tune_kwargs': {'num_trials': 1,
                 'scheduler_options': None,
                 'search_options': None,
                 'search_strategy': 'random'}}
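
Note that optimization.lr is wrapped in a Categorical space even though it holds a single candidate: entries of search_space may be either plain fixed values or autogluon search-space objects. A brief sketch of the distinction (the variable names here are illustrative):

fixed_epochs = 10                                       # a plain fixed value
lr_candidates = ag.core.space.Categorical(1E-4, 5E-5)   # a finite set of candidates
lr_range = ag.core.space.Real(1E-5, 2E-4)               # a continuous range to sample from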

Another way to specify a custom TextPredictor configuration is via the hyperparameters argument.

predictor.fit(train_data, hyperparameters=ag_text_presets.create('electra_small_fuse_late'),
              time_limit=30, seed=123)
All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_electra_small/task0/training.log
<autogluon.text.text_prediction.predictor.predictor.TextPredictor at 0x7f456b8a0bd0>

Custom Hyperparameter Values

The pre-registered configurations provide reasonable default hyperparameters. A common workflow is to first train a model with one of the presets and then tune some hyperparameters to see if performance can be further improved. In the example below, we set the number of training epochs to 5 and the learning rate to 5E-5.

hyperparameters = ag_text_presets.create('electra_small_fuse_late')
hyperparameters['models']['MultimodalTextModel']['search_space']['optimization.num_train_epochs'] = 5
hyperparameters['models']['MultimodalTextModel']['search_space']['optimization.lr'] = ag.core.space.Categorical(5E-5)
predictor.fit(train_data, hyperparameters=hyperparameters, time_limit=30, seed=123)
All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_electra_small/task0/training.log
<autogluon.text.text_prediction.predictor.predictor.TextPredictor at 0x7f456b8a0bd0>

Register Your Own Configuration

You can also register your custom hyperparameter settings as new presets in ag_text_presets. Below, the electra_small_fuse_late_train5 preset uses ELECTRA-small as its backbone and trains for 5 epochs with a weight decay of 1E-2.

@ag_text_presets.register()
def electra_small_fuse_late_train5():
    hyperparameters = ag_text_presets.create('electra_small_fuse_late')
    hyperparameters['models']['MultimodalTextModel']['search_space']['optimization.num_train_epochs'] = 5
    hyperparameters['models']['MultimodalTextModel']['search_space']['optimization.wd'] = 1E-2
    return hyperparameters

predictor.fit(train_data, presets='electra_small_fuse_late_train5', time_limit=60, seed=123)
All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_electra_small/task0/training.log
<autogluon.text.text_prediction.predictor.predictor.TextPredictor at 0x7f456b8a0bd0>

HPO over a Customized Search Space via Random Search

To control which hyperparameter values are considered during fit(), we specify the hyperparameters argument. Rather than specifying a particular fixed value for a hyperparameter, we can specify a space of values to search over via ag.core.space. We can also specify which HPO method to use for the search via search_strategy (a simple random search is specified below). In this example, we search for good values of the following hyperparameters:

  • warmup proportion (the fraction of training steps used to warm up the learning rate)

  • number of hidden units in the final MLP layer that maps aggregated features to the output prediction

  • learning rate

  • weight decay

def electra_small_basic_demo_hpo():
    hparams = ag_text_presets.create('electra_small_fuse_late')
    search_space = hparams['models']['MultimodalTextModel']['search_space']
    search_space['optimization.per_device_batch_size'] = 8
    search_space['model.network.agg_net.mid_units'] = ag.core.space.Int(32, 128)
    search_space['optimization.warmup_portion'] = ag.core.space.Categorical(0.1, 0.2)
    search_space['optimization.lr'] = ag.core.space.Real(1E-5, 2E-4)
    search_space['optimization.wd'] = ag.core.space.Categorical(1E-4, 1E-3, 1E-2)
    search_space['optimization.num_train_epochs'] = 5
    hparams['tune_kwargs']['search_strategy'] = 'random'
    return hparams

We can now call fit() with hyperparameter tuning over our custom search space. Below, num_trials controls the maximum number of hyperparameter configurations for which AutoGluon will train models (4 in this case). In your applications, you should use larger values of num_trials, which may identify superior hyperparameter values at the cost of longer runtimes.

predictor_sst_rs = TextPredictor(path='ag_text_sst_random_search', label='label', eval_metric='acc')
predictor_sst_rs.set_verbosity(0)
predictor_sst_rs.fit(train_data,
                      hyperparameters=electra_small_basic_demo_hpo(),
                      time_limit=60 * 2,
                      num_trials=4,
                      seed=123)
  0%|          | 0/4 [00:00<?, ?it/s]
(task:0)    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_random_search/task0/training.log
(task:1)    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_random_search/task1/training.log
(task:2)    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_random_search/task2/training.log
(task:3)    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_random_search/task3/training.log
<autogluon.text.text_prediction.predictor.predictor.TextPredictor at 0x7f44717ca9d0>

We can again evaluate our model’s performance on separate test data.

test_score = predictor_sst_rs.evaluate(test_data, metrics=['acc', 'f1'])
print('Best Config = {}'.format(predictor_sst_rs.results['best_config']))
print('Total Time = {}s'.format(predictor_sst_rs.results['total_time']))
print('Accuracy = {:.2f}%'.format(test_score['acc'] * 100))
print('F1 = {:.2f}%'.format(test_score['f1'] * 100))
Best Config = {'search_space▁model.network.agg_net.mid_units': 64, 'search_space▁optimization.lr': 0.00019631030903737018, 'search_space▁optimization.warmup_portion▁choice': 1, 'search_space▁optimization.wd▁choice': 0}
Total Time = 99.55360150337219s
Accuracy = 79.70%
F1 = 80.27%
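
The tuned predictor behaves like any other TextPredictor; for instance, you can predict the sentiment of new sentences (a sketch with a made-up example sentence, using the same 'sentence' column name as the training data):

sentence = "it 's a charming and often affecting journey ."
predictor_sst_rs.predict({'sentence': [sentence]})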

HPO via Bayesian Optimization + Hyperband

Alternatively, we can use more advanced searchers for HPO, such as a combination of Hyperband and Bayesian Optimization. Hyperband tries multiple hyperparameter configurations simultaneously and early-stops training runs under poor configurations, freeing compute resources to explore new configurations. Compared to random search, Bayesian Optimization selects the next hyperparameter values to try more cleverly.

hyperparameters = electra_small_basic_demo_hpo()
hyperparameters['tune_kwargs']['search_strategy'] = 'bayesopt_hyperband'
hyperparameters['tune_kwargs']['scheduler_options'] = {'max_t': 15}  # Maximum number of epochs for training the neural network
predictor_sst_hb = TextPredictor(path='ag_text_sst_hb', label='label', eval_metric='acc')
predictor_sst_hb.set_verbosity(0)
predictor_sst_hb.fit(train_data,
                     hyperparameters=hyperparameters,
                     time_limit=60 * 2,
                     num_trials=8,
                     seed=123)
  0%|          | 0/8 [00:00<?, ?it/s]
(task:4)    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_hb/task4/training.log
(task:5)    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_hb/task5/training.log
(task:6)    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_hb/task6/training.log
(task:7)    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_hb/task7/training.log
(task:8)    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_hb/task8/training.log
(task:9)    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_hb/task9/training.log
(task:10)   All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_text_sst_hb/task10/training.log
<autogluon.text.text_prediction.predictor.predictor.TextPredictor at 0x7f445003da90>
test_score = predictor_sst_hb.evaluate(test_data, metrics=['acc', 'f1'])
print('Best Config = {}'.format(predictor_sst_hb.results['best_config']))
print('Total Time = {}s'.format(predictor_sst_hb.results['total_time']))
print('Accuracy = {:.2f}%'.format(test_score['acc'] * 100))
print('F1 = {:.2f}%'.format(test_score['f1'] * 100))
Best Config = {'search_space▁model.network.agg_net.mid_units': 80, 'search_space▁optimization.lr': 0.000105, 'search_space▁optimization.warmup_portion▁choice': 0, 'search_space▁optimization.wd▁choice': 0}
Total Time = 120.513906955719s
Accuracy = 76.03%
F1 = 76.22%

You can also try setting hyperparameters['tune_kwargs']['search_strategy'] to 'bayesopt' or 'local_sequential_auto' as alternative HPO methods, as in the sketch below.
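
For example, a minimal sketch that reuses the same search space with pure Bayesian Optimization (predictor_sst_bo and its path are names introduced here for illustration):

hyperparameters = electra_small_basic_demo_hpo()
hyperparameters['tune_kwargs']['search_strategy'] = 'bayesopt'
predictor_sst_bo = TextPredictor(path='ag_text_sst_bo', label='label', eval_metric='acc')
predictor_sst_bo.set_verbosity(0)
predictor_sst_bo.fit(train_data,
                     hyperparameters=hyperparameters,
                     time_limit=60 * 2,
                     num_trials=4,
                     seed=123)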
