Text Prediction - Quick Start

Here we briefly demonstrate the TextPredictor, which helps you automatically train and deploy models for various Natural Language Processing (NLP) tasks. This tutorial presents two examples of NLP tasks: sentiment analysis and sentence similarity.

The general usage of the TextPredictor is similar to AutoGluon’s TabularPredictor. We format NLP datasets as tables where certain columns contain text fields and a special column contains the labels to predict, and each row corresponds to one training example. Here, the labels can be discrete categories (classification) or numerical values (regression).
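As a minimal sketch of this table layout, the snippet below builds a tiny pandas DataFrame by hand. The column names `sentence` and `label` mirror the sentiment dataset used later in this tutorial, and the two rows are copied from its sample output; the variable name `toy_train_data` is made up for illustration.

```python
import pandas as pd

# Each row is one training example: a text field plus the label to predict.
toy_train_data = pd.DataFrame({
    'sentence': ['a stirring visual sequence', 'hard ground'],
    'label': [1, 0],  # 1 = positive sentiment, 0 = otherwise
})

print(toy_train_data.columns.tolist())
print(len(toy_train_data))
```

For a regression task the layout would be the same, with numerical values in the label column instead of discrete categories.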

%matplotlib inline

import numpy as np
import warnings
import matplotlib.pyplot as plt
warnings.filterwarnings('ignore')
np.random.seed(123)

Sentiment Analysis Task

First, we consider the Stanford Sentiment Treebank (SST) dataset, which consists of movie reviews and their associated sentiment. Given a new movie review, the goal is to predict the sentiment reflected in the text (in this case a binary classification, where reviews are labeled as 1 if they convey a positive opinion and labeled as 0 otherwise). Let’s first load and look at the data, noting the labels are stored in a column called label.

from autogluon.core.utils.loaders.load_pd import load
train_data = load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/train.parquet')
test_data = load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/dev.parquet')
subsample_size = 1000  # subsample data for faster demo, try setting this to larger values
train_data = train_data.sample(n=subsample_size, random_state=0)
train_data.head(10)
sentence label
43787 very pleasing at its best moments 1
16159 , american chai is enough to make you put away... 0
59015 too much like an infomercial for ram dass 's l... 0
5108 a stirring visual sequence 1
67052 cool visual backmasking 1
35938 hard ground 0
49879 the striking , quietly vulnerable personality ... 1
51591 pan nalin 's exposition is beautiful and myste... 1
56780 wonderfully loopy 1
28518 most beautiful , evocative 1

Above, the data happen to be stored in Parquet format, but you can also load() data directly from a CSV file. While here we load files from AWS S3 cloud storage, these could instead be local files on your machine. After loading, train_data is simply a Pandas DataFrame, where each row represents a different training example (for machine learning to be appropriate, the rows should be independent and identically distributed).
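Because the loaded data is an ordinary DataFrame, a CSV file can be sketched the same way. The example below reads CSV text with plain `pandas.read_csv` rather than AutoGluon's load(); the in-memory string stands in for a local file, and the file name mentioned in the comment is hypothetical.

```python
import io
import pandas as pd

# In-memory stand-in for a local CSV file; a real path such as
# 'train.csv' (hypothetical) would be passed to read_csv the same way.
csv_text = "sentence,label\nwonderfully loopy,1\nhard ground,0\n"

csv_data = pd.read_csv(io.StringIO(csv_text))
print(csv_data.shape)
print(csv_data['label'].tolist())
```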

Training

To ensure this tutorial runs quickly, we simply call fit() with a subset of 1000 training examples and limit its runtime to approximately 1 minute. To achieve reasonable performance in your applications, we recommend setting a much longer time_limit (e.g. 1 hour), or not specifying time_limit at all (time_limit=None).

from autogluon.text import TextPredictor

predictor = TextPredictor(label='label', eval_metric='acc', path='./ag_sst')
predictor.fit(train_data, time_limit=60)
Problem Type="binary"
Column Types:
   - "sentence": text
   - "label": categorical

NumPy-shape semantics has been activated in your code. This is required for creating and manipulating scalar and zero-size tensors, which were not supported in MXNet before, as in the official NumPy library. Please DO NOT manually deactivate this semantics while using mxnet.numpy and mxnet.numpy_extension modules.
The GluonNLP V0 backend is used. We will use 8 cpus and 1 gpus to train each trial.
All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sst/task0/training.log
Fitting and transforming the train data...
Done! Preprocessor saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sst/task0/preprocessor.pkl
Process dev set...
Done!
Max length for chunking text: 64, Stochastic chunk: Train-False/Test-False, Test #repeat: 1.
#Total Params/Fixed Params=108990466/0
Using gradient accumulation. Global batch size = 128
Local training results will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sst/task0/results_local.jsonl.
[Iter 1/70, Epoch 0] train loss=8.76e-01, gnorm=9.82e+00, lr=1.43e-05, #samples processed=128, #sample per second=86.20. ETA=1.71min
[Iter 2/70, Epoch 0] train loss=7.94e-01, gnorm=6.16e+00, lr=2.86e-05, #samples processed=128, #sample per second=152.00. ETA=1.32min
[Iter 2/70, Epoch 0] valid f1=7.2204e-01, mcc=0.0000e+00, roc_auc=4.2305e-01, accuracy=5.6500e-01, log_loss=1.0976e+00, time spent=0.460s, total time spent=0.06min. Find new best=True, Find new top-3=True
[Iter 3/70, Epoch 0] train loss=1.29e+00, gnorm=1.51e+01, lr=4.29e-05, #samples processed=128, #sample per second=52.62. ETA=1.77min
[Iter 4/70, Epoch 0] train loss=1.15e+00, gnorm=1.27e+01, lr=5.71e-05, #samples processed=128, #sample per second=158.25. ETA=1.53min
[Iter 4/70, Epoch 0] valid f1=5.6716e-01, mcc=1.4791e-01, roc_auc=6.2750e-01, accuracy=5.6500e-01, log_loss=6.9451e-01, time spent=0.457s, total time spent=0.12min. Find new best=True, Find new top-3=True
[Iter 5/70, Epoch 0] train loss=6.92e-01, gnorm=6.95e+00, lr=7.14e-05, #samples processed=128, #sample per second=49.45. ETA=1.77min
[Iter 6/70, Epoch 0] train loss=6.23e-01, gnorm=1.95e+01, lr=8.57e-05, #samples processed=128, #sample per second=148.04. ETA=1.60min
[Iter 6/70, Epoch 0] valid f1=7.1947e-01, mcc=7.6355e-02, roc_auc=7.2882e-01, accuracy=5.7500e-01, log_loss=6.7021e-01, time spent=0.460s, total time spent=0.18min. Find new best=True, Find new top-3=True
[Iter 7/70, Epoch 0] train loss=6.59e-01, gnorm=7.78e+00, lr=1.00e-04, #samples processed=128, #sample per second=45.92. ETA=1.77min
[Iter 8/70, Epoch 1] train loss=7.61e-01, gnorm=1.45e+01, lr=9.84e-05, #samples processed=128, #sample per second=162.55. ETA=1.63min
[Iter 8/70, Epoch 1] valid f1=5.5866e-01, mcc=2.7262e-01, roc_auc=6.9301e-01, accuracy=6.0500e-01, log_loss=6.6740e-01, time spent=0.457s, total time spent=0.25min. Find new best=True, Find new top-3=True
[Iter 9/70, Epoch 1] train loss=7.56e-01, gnorm=5.62e+00, lr=9.68e-05, #samples processed=128, #sample per second=39.89. ETA=1.79min
[Iter 10/70, Epoch 1] train loss=7.24e-01, gnorm=5.96e+00, lr=9.52e-05, #samples processed=128, #sample per second=158.69. ETA=1.66min
[Iter 10/70, Epoch 1] valid f1=7.5839e-01, mcc=3.2452e-01, roc_auc=8.7845e-01, accuracy=6.4000e-01, log_loss=5.7693e-01, time spent=0.466s, total time spent=0.32min. Find new best=True, Find new top-3=True
[Iter 11/70, Epoch 1] train loss=6.31e-01, gnorm=4.66e+00, lr=9.37e-05, #samples processed=128, #sample per second=40.13. ETA=1.77min
[Iter 12/70, Epoch 1] train loss=5.70e-01, gnorm=5.85e+00, lr=9.21e-05, #samples processed=128, #sample per second=160.47. ETA=1.66min
[Iter 12/70, Epoch 1] valid f1=7.9397e-01, mcc=6.1951e-01, roc_auc=9.1527e-01, accuracy=7.9500e-01, log_loss=4.7745e-01, time spent=0.469s, total time spent=0.38min. Find new best=True, Find new top-3=True
[Iter 13/70, Epoch 1] train loss=5.57e-01, gnorm=5.38e+00, lr=9.05e-05, #samples processed=128, #sample per second=43.72. ETA=1.72min
[Iter 14/70, Epoch 1] train loss=3.91e-01, gnorm=3.17e+00, lr=8.89e-05, #samples processed=128, #sample per second=166.41. ETA=1.62min
[Iter 14/70, Epoch 1] valid f1=8.6381e-01, mcc=6.6579e-01, roc_auc=9.0927e-01, accuracy=8.2500e-01, log_loss=4.5740e-01, time spent=0.459s, total time spent=0.44min. Find new best=True, Find new top-3=True
[Iter 15/70, Epoch 2] train loss=3.69e-01, gnorm=6.26e+00, lr=8.73e-05, #samples processed=128, #sample per second=46.08. ETA=1.66min
[Iter 16/70, Epoch 2] train loss=2.38e-01, gnorm=2.43e+00, lr=8.57e-05, #samples processed=128, #sample per second=165.74. ETA=1.57min
[Iter 16/70, Epoch 2] valid f1=9.0909e-01, mcc=7.9963e-01, roc_auc=9.5209e-01, accuracy=9.0000e-01, log_loss=3.1594e-01, time spent=0.459s, total time spent=0.50min. Find new best=True, Find new top-3=True
[Iter 17/70, Epoch 2] train loss=3.59e-01, gnorm=5.29e+00, lr=8.41e-05, #samples processed=128, #sample per second=45.87. ETA=1.59min
[Iter 18/70, Epoch 2] train loss=2.87e-01, gnorm=5.75e+00, lr=8.25e-05, #samples processed=128, #sample per second=160.27. ETA=1.51min
[Iter 18/70, Epoch 2] valid f1=8.8000e-01, mcc=7.0770e-01, roc_auc=9.4660e-01, accuracy=8.5000e-01, log_loss=4.6460e-01, time spent=0.462s, total time spent=0.55min. Find new best=False, Find new top-3=True
[Iter 19/70, Epoch 2] train loss=3.11e-01, gnorm=8.02e+00, lr=8.10e-05, #samples processed=128, #sample per second=62.39. ETA=1.50min
[Iter 20/70, Epoch 2] train loss=1.73e-01, gnorm=2.98e+00, lr=7.94e-05, #samples processed=128, #sample per second=169.09. ETA=1.43min
[Iter 20/70, Epoch 2] valid f1=8.8393e-01, mcc=7.3638e-01, roc_auc=9.4395e-01, accuracy=8.7000e-01, log_loss=3.2271e-01, time spent=0.465s, total time spent=0.59min. Find new best=False, Find new top-3=True
[Iter 21/70, Epoch 2] train loss=3.55e-01, gnorm=4.42e+00, lr=7.78e-05, #samples processed=128, #sample per second=63.15. ETA=1.41min
[Iter 22/70, Epoch 3] train loss=2.36e-01, gnorm=8.88e+00, lr=7.62e-05, #samples processed=128, #sample per second=152.06. ETA=1.35min
[Iter 22/70, Epoch 3] valid f1=8.7336e-01, mcc=7.0417e-01, roc_auc=9.4233e-01, accuracy=8.5500e-01, log_loss=3.1476e-01, time spent=0.466s, total time spent=0.64min. Find new best=False, Find new top-3=True
[Iter 23/70, Epoch 3] train loss=1.99e-01, gnorm=6.51e+00, lr=7.46e-05, #samples processed=128, #sample per second=60.93. ETA=1.34min
[Iter 24/70, Epoch 3] train loss=3.36e-01, gnorm=8.62e+00, lr=7.30e-05, #samples processed=128, #sample per second=166.09. ETA=1.28min
[Iter 24/70, Epoch 3] valid f1=8.6957e-01, mcc=6.8006e-01, roc_auc=9.3999e-01, accuracy=8.3500e-01, log_loss=4.5034e-01, time spent=0.466s, total time spent=0.67min. Find new best=False, Find new top-3=False
[Iter 25/70, Epoch 3] train loss=2.44e-01, gnorm=4.81e+00, lr=7.14e-05, #samples processed=128, #sample per second=100.07. ETA=1.24min
[Iter 26/70, Epoch 3] train loss=1.69e-01, gnorm=1.84e+00, lr=6.98e-05, #samples processed=128, #sample per second=172.57. ETA=1.19min
[Iter 26/70, Epoch 3] valid f1=8.9498e-01, mcc=7.7011e-01, roc_auc=9.4426e-01, accuracy=8.8500e-01, log_loss=3.3178e-01, time spent=0.460s, total time spent=0.72min. Find new best=False, Find new top-3=True
[Iter 27/70, Epoch 3] train loss=2.08e-01, gnorm=2.65e+00, lr=6.83e-05, #samples processed=128, #sample per second=59.39. ETA=1.17min
[Iter 28/70, Epoch 3] train loss=1.86e-01, gnorm=4.24e+00, lr=6.67e-05, #samples processed=128, #sample per second=166.82. ETA=1.12min
[Iter 28/70, Epoch 3] valid f1=8.8136e-01, mcc=7.1518e-01, roc_auc=9.3317e-01, accuracy=8.6000e-01, log_loss=3.9546e-01, time spent=0.471s, total time spent=0.76min. Find new best=False, Find new top-3=False
[Iter 29/70, Epoch 4] train loss=9.32e-02, gnorm=2.09e+00, lr=6.51e-05, #samples processed=128, #sample per second=100.36. ETA=1.09min
[Iter 30/70, Epoch 4] train loss=2.13e-01, gnorm=7.57e+00, lr=6.35e-05, #samples processed=128, #sample per second=160.73. ETA=1.04min
[Iter 30/70, Epoch 4] valid f1=8.5490e-01, mcc=6.3946e-01, roc_auc=9.2290e-01, accuracy=8.1500e-01, log_loss=6.5264e-01, time spent=0.473s, total time spent=0.79min. Find new best=False, Find new top-3=False
[Iter 31/70, Epoch 4] train loss=1.75e-01, gnorm=6.40e+00, lr=6.19e-05, #samples processed=128, #sample per second=100.74. ETA=1.01min
[Iter 32/70, Epoch 4] train loss=2.13e-01, gnorm=7.47e+00, lr=6.03e-05, #samples processed=128, #sample per second=156.40. ETA=0.97min
[Iter 32/70, Epoch 4] valid f1=8.8333e-01, mcc=7.1741e-01, roc_auc=9.4039e-01, accuracy=8.6000e-01, log_loss=4.1703e-01, time spent=0.470s, total time spent=0.83min. Find new best=False, Find new top-3=False
[Iter 33/70, Epoch 4] train loss=8.22e-02, gnorm=2.65e+00, lr=5.87e-05, #samples processed=128, #sample per second=103.48. ETA=0.94min
[Iter 34/70, Epoch 4] train loss=5.65e-02, gnorm=2.71e+00, lr=5.71e-05, #samples processed=128, #sample per second=164.56. ETA=0.90min
[Iter 34/70, Epoch 4] valid f1=9.1743e-01, mcc=8.2149e-01, roc_auc=9.5392e-01, accuracy=9.1000e-01, log_loss=3.1405e-01, time spent=0.465s, total time spent=0.89min. Find new best=True, Find new top-3=True
Training completed. Auto-saving to "./ag_sst/". For loading the model, you can use predictor = TextPredictor.load("./ag_sst/")
<autogluon.text.text_prediction.predictor.predictor.TextPredictor at 0x7f5fe1875190>

Above we specify that: the column named label contains the label values to predict, AutoGluon should optimize its predictions for the accuracy evaluation metric, trained models should be saved in the ag_sst folder, and training should run for around 60 seconds.

Evaluation

After training, we can easily evaluate our predictor on separate test data formatted similarly to our training data.

test_score = predictor.evaluate(test_data)
print('Accuracy = {:.2f}%'.format(test_score * 100))
Accuracy = 89.11%

By default, evaluate() will report the evaluation metric previously specified, which is accuracy in our example. You may also specify additional metrics, e.g. F1 score, when calling evaluate.

test_score = predictor.evaluate(test_data, metrics=['acc', 'f1'])
print(test_score)
{'acc': 0.8910550458715596, 'f1': 0.886499402628435}
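To make the two metrics concrete, here is a small sketch that computes accuracy and binary F1 by hand from lists of true and predicted labels. It is not AutoGluon's implementation; the helper name `accuracy_and_f1` and the example labels are made up for illustration.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for two equal-length lists of binary labels."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return correct / len(pairs), f1

# Made-up labels, not taken from the SST data.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, f1)
```

Accuracy counts every correct prediction equally, while F1 is the harmonic mean of precision and recall on the positive class, so the two can diverge when the classes are imbalanced.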

Prediction

You can obtain predictions from the trained model by calling predictor.predict().

sentence1 = "it's a charming and often affecting journey."
sentence2 = "It's slow, very, very, very slow."
predictions = predictor.predict({'sentence': [sentence1, sentence2]})
print('"Sentence":', sentence1, '"Predicted Sentiment":', predictions[0])
print('"Sentence":', sentence2, '"Predicted Sentiment":', predictions[1])
"Sentence": it's a charming and often affecting journey. "Predicted Sentiment": 1
"Sentence": It's slow, very, very, very slow. "Predicted Sentiment": 0

For classification tasks, you can ask for predicted class-probabilities instead of predicted classes.

probs = predictor.predict_proba({'sentence': [sentence1, sentence2]})
# predict_proba returns a DataFrame with one row per sentence and one column per class,
# so select rows with .iloc; probs[0] would select the class-0 column instead.
print('"Sentence":', sentence1, '"Predicted Class-Probabilities":', probs.iloc[0])
print('"Sentence":', sentence2, '"Predicted Class-Probabilities":', probs.iloc[1])
"Sentence": it's a charming and often affecting journey. "Predicted Class-Probabilities": 0    0.002691
1    0.997309
Name: 0, dtype: float32
"Sentence": It's slow, very, very, very slow. "Predicted Class-Probabilities": 0    0.987142
1    0.012858
Name: 1, dtype: float32
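The link between class-probabilities and predicted classes can be sketched as follows, assuming the predicted class is simply the most probable one (a common convention, not confirmed here as TextPredictor's exact rule). The probability values are the ones reported for the two example sentences in this tutorial.

```python
# Rows are sentences, columns are classes 0 and 1.
probs_table = [
    [0.002691, 0.997309],  # "it's a charming and often affecting journey."
    [0.987142, 0.012858],  # "It's slow, very, very, very slow."
]

# For each row, pick the index of the largest probability.
predicted = [max(range(len(row)), key=row.__getitem__) for row in probs_table]
print(predicted)  # [1, 0]
```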

We can just as easily produce predictions over an entire dataset.

test_predictions = predictor.predict(test_data)
test_predictions.head()
0    1
1    0
2    1
3    1
4    0
Name: label, dtype: int64

Intermediate Training Results

After training, you can explore intermediate training results in predictor.results.

predictor.results.tail(3)
iteration report_idx epoch f1 mcc roc_auc accuracy log_loss find_better find_new_topn nbest_stat elapsed_time reward_attr eval_metric exp_dir
15 32 16 4 0.883333 0.717406 0.940393 0.86 0.417029 False False [[0.87, 0.9, 0.885], [20, 16, 26]] 49 0.86 accuracy /var/lib/jenkins/workspace/workspace/autogluon...
16 34 17 4 0.917431 0.821490 0.953921 0.91 0.314053 True True [[0.91, 0.9, 0.885], [34, 16, 26]] 53 0.91 accuracy /var/lib/jenkins/workspace/workspace/autogluon...
17 34 18 4 0.917431 0.821490 0.953921 0.91 0.314053 True True [[0.91, 0.9, 0.885], [34, 16, 26]] 53 0.91 accuracy /var/lib/jenkins/workspace/workspace/autogluon...

Save and Load

The trained predictor is automatically saved at the end of fit(), and you can easily reload it.

loaded_predictor = TextPredictor.load('ag_sst')
loaded_predictor.predict_proba({'sentence': [sentence1, sentence2]})
0 1
0 0.002691 0.997309
1 0.987142 0.012858

You can also save the predictor to any location by calling .save().

loaded_predictor.save('my_saved_dir')
loaded_predictor2 = TextPredictor.load('my_saved_dir')
loaded_predictor2.predict_proba({'sentence': [sentence1, sentence2]})
0 1
0 0.002691 0.997309
1 0.987142 0.012858

Extract Embeddings

You can also use a trained predictor to extract embeddings that map each row of the data table to an embedding vector taken from the neural network's intermediate representations of that row.

embeddings = predictor.extract_embedding(test_data)
print(embeddings)
[[-1.0752565  -0.45979443 -1.0423399  ... -0.90400136  0.5868748
   0.4255725 ]
 [-0.55730903 -0.13102823 -0.53008664 ... -0.10362382  0.48921028
  -0.09010505]
 [-0.7394443  -0.12840375 -0.79802924 ... -0.64508057  0.55865663
   0.32286584]
 ...
 [-0.27061418  0.22129104 -0.5108863  ... -0.17564254  0.26975504
   0.33493665]
 [-0.1949847   0.29287753 -0.80287546 ...  0.19060707  0.1706643
   0.02425722]
 [-0.88445365  0.10447957 -0.49338517 ... -0.6624912   0.04803396
   0.0722226 ]]

Here, we use TSNE to visualize these extracted embeddings. We can see that there are two clusters corresponding to our two labels, since this network has been trained to discriminate between these labels.

from sklearn.manifold import TSNE
# Project the embeddings to 2-D for plotting.
X_embedded = TSNE(n_components=2, random_state=123).fit_transform(embeddings)
labels = test_data['label'].to_numpy()
for val, color in [(0, 'red'), (1, 'blue')]:
    mask = labels == val  # boolean mask selecting the rows with this label
    plt.scatter(X_embedded[mask, 0], X_embedded[mask, 1], c=color, label=f'label={val}')
plt.legend(loc='best')
<matplotlib.legend.Legend at 0x7f5f2551b450>
[Figure: 2-D TSNE projection of the test-set embeddings; red points are label=0 and blue points are label=1.]

Continuous Training

You can also load a predictor and call .fit() again to continue training the same predictor with new data.

new_predictor = TextPredictor.load('ag_sst')
new_predictor.fit(train_data, time_limit=30, save_path='ag_sst_continue_train')
test_score = new_predictor.evaluate(test_data, metrics=['acc', 'f1'])
print(test_score)
Continue training the existing model...
The GluonNLP V0 backend is used. We will use 8 cpus and 1 gpus to train each trial.
All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/autonlp/task0/training.log
Done! Preprocessor saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/autonlp/task0/preprocessor.pkl
Process dev set...
Done!
Max length for chunking text: 64, Stochastic chunk: Train-False/Test-False, Test #repeat: 1.
#Total Params/Fixed Params=108990466/0
Using gradient accumulation. Global batch size = 128
Local training results will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/autonlp/task0/results_local.jsonl.
[Iter 1/70, Epoch 0] train loss=1.85e-01, gnorm=6.40e+00, lr=1.43e-05, #samples processed=128, #sample per second=129.75. ETA=1.13min
[Iter 2/70, Epoch 0] train loss=6.92e-02, gnorm=1.67e+00, lr=2.86e-05, #samples processed=128, #sample per second=157.93. ETA=1.02min
[Iter 2/70, Epoch 0] valid f1=8.9627e-01, mcc=7.4970e-01, roc_auc=9.4965e-01, accuracy=8.7500e-01, log_loss=3.9193e-01, time spent=0.449s, total time spent=0.05min. Find new best=True, Find new top-3=True
[Iter 3/70, Epoch 0] train loss=1.28e-01, gnorm=3.63e+00, lr=4.29e-05, #samples processed=128, #sample per second=58.90. ETA=1.48min
[Iter 4/70, Epoch 0] train loss=1.98e-01, gnorm=9.95e+00, lr=5.71e-05, #samples processed=128, #sample per second=174.38. ETA=1.29min
[Iter 4/70, Epoch 0] valid f1=9.0833e-01, mcc=7.8025e-01, roc_auc=9.5301e-01, accuracy=8.9000e-01, log_loss=3.9402e-01, time spent=0.446s, total time spent=0.11min. Find new best=True, Find new top-3=True
[Iter 5/70, Epoch 0] train loss=8.58e-02, gnorm=4.06e+00, lr=7.14e-05, #samples processed=128, #sample per second=50.32. ETA=1.57min
[Iter 6/70, Epoch 0] train loss=3.23e-01, gnorm=1.29e+01, lr=8.57e-05, #samples processed=128, #sample per second=159.92. ETA=1.43min
[Iter 6/70, Epoch 0] valid f1=8.8235e-01, mcc=7.8148e-01, roc_auc=9.5423e-01, accuracy=8.8000e-01, log_loss=4.8538e-01, time spent=0.453s, total time spent=0.15min. Find new best=False, Find new top-3=True
[Iter 7/70, Epoch 0] train loss=2.81e-01, gnorm=1.51e+01, lr=1.00e-04, #samples processed=128, #sample per second=75.88. ETA=1.46min
[Iter 8/70, Epoch 1] train loss=1.11e-01, gnorm=4.19e+00, lr=9.84e-05, #samples processed=128, #sample per second=156.80. ETA=1.36min
[Iter 8/70, Epoch 1] valid f1=8.9069e-01, mcc=7.3549e-01, roc_auc=9.5280e-01, accuracy=8.6500e-01, log_loss=5.0302e-01, time spent=0.458s, total time spent=0.18min. Find new best=False, Find new top-3=False
[Iter 9/70, Epoch 1] train loss=1.82e-01, gnorm=6.39e+00, lr=9.68e-05, #samples processed=128, #sample per second=103.12. ETA=1.33min
[Iter 10/70, Epoch 1] train loss=1.30e-01, gnorm=2.35e+00, lr=9.52e-05, #samples processed=128, #sample per second=158.89. ETA=1.26min
[Iter 10/70, Epoch 1] valid f1=9.2511e-01, mcc=8.2689e-01, roc_auc=9.5687e-01, accuracy=9.1500e-01, log_loss=2.7101e-01, time spent=0.454s, total time spent=0.25min. Find new best=True, Find new top-3=True
[Iter 11/70, Epoch 1] train loss=1.12e-01, gnorm=2.71e+00, lr=9.37e-05, #samples processed=128, #sample per second=41.91. ETA=1.40min
[Iter 12/70, Epoch 1] train loss=1.04e-01, gnorm=4.07e+00, lr=9.21e-05, #samples processed=128, #sample per second=172.28. ETA=1.32min
[Iter 12/70, Epoch 1] valid f1=8.8618e-01, mcc=7.2342e-01, roc_auc=9.5351e-01, accuracy=8.6000e-01, log_loss=4.3734e-01, time spent=0.447s, total time spent=0.28min. Find new best=False, Find new top-3=False
[Iter 13/70, Epoch 1] train loss=1.22e-01, gnorm=4.86e+00, lr=9.05e-05, #samples processed=128, #sample per second=106.00. ETA=1.29min
[Iter 14/70, Epoch 1] train loss=2.95e-02, gnorm=2.70e+00, lr=8.89e-05, #samples processed=128, #sample per second=169.62. ETA=1.22min
[Iter 14/70, Epoch 1] valid f1=9.0909e-01, mcc=7.9963e-01, roc_auc=9.6104e-01, accuracy=9.0000e-01, log_loss=3.3115e-01, time spent=0.451s, total time spent=0.33min. Find new best=False, Find new top-3=True
[Iter 15/70, Epoch 2] train loss=1.19e-01, gnorm=9.15e+00, lr=8.73e-05, #samples processed=128, #sample per second=60.64. ETA=1.25min
[Iter 16/70, Epoch 2] train loss=1.87e-01, gnorm=8.36e+00, lr=8.57e-05, #samples processed=128, #sample per second=173.28. ETA=1.19min
[Iter 16/70, Epoch 2] valid f1=9.1845e-01, mcc=8.0701e-01, roc_auc=9.5443e-01, accuracy=9.0500e-01, log_loss=4.7094e-01, time spent=0.449s, total time spent=0.38min. Find new best=False, Find new top-3=True
[Iter 17/70, Epoch 2] train loss=7.35e-02, gnorm=5.53e+00, lr=8.41e-05, #samples processed=128, #sample per second=60.30. ETA=1.21min
[Iter 18/70, Epoch 2] train loss=1.97e-01, gnorm=1.15e+01, lr=8.25e-05, #samples processed=128, #sample per second=164.33. ETA=1.16min
[Iter 18/70, Epoch 2] valid f1=8.7302e-01, mcc=6.8927e-01, roc_auc=9.4238e-01, accuracy=8.4000e-01, log_loss=8.1398e-01, time spent=0.453s, total time spent=0.41min. Find new best=False, Find new top-3=False
[Iter 19/70, Epoch 2] train loss=2.06e-01, gnorm=8.24e+00, lr=8.10e-05, #samples processed=128, #sample per second=103.97. ETA=1.13min
[Iter 20/70, Epoch 2] train loss=5.90e-02, gnorm=5.89e+00, lr=7.94e-05, #samples processed=128, #sample per second=174.08. ETA=1.09min
[Iter 20/70, Epoch 2] valid f1=9.0667e-01, mcc=7.8671e-01, roc_auc=9.6094e-01, accuracy=8.9500e-01, log_loss=3.7555e-01, time spent=0.452s, total time spent=0.44min. Find new best=False, Find new top-3=False
[Iter 21/70, Epoch 2] train loss=8.65e-02, gnorm=7.87e+00, lr=7.78e-05, #samples processed=128, #sample per second=104.57. ETA=1.06min
[Iter 22/70, Epoch 3] train loss=1.24e-01, gnorm=3.39e+01, lr=7.62e-05, #samples processed=128, #sample per second=158.77. ETA=1.02min
[Iter 22/70, Epoch 3] valid f1=8.9952e-01, mcc=8.0265e-01, roc_auc=9.5819e-01, accuracy=8.9500e-01, log_loss=4.3100e-01, time spent=0.452s, total time spent=0.48min. Find new best=False, Find new top-3=False
Training completed. Auto-saving to "ag_sst_continue_train/". For loading the model, you can use predictor = TextPredictor.load("ag_sst_continue_train/")
{'acc': 0.9059633027522935, 'f1': 0.9088888888888889}

Sentence Similarity Task

Next, let’s use AutoGluon to train a model for evaluating how semantically similar two sentences are. We use the Semantic Textual Similarity Benchmark dataset for illustration.

sts_train_data = load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/train.parquet')[['sentence1', 'sentence2', 'score']]
sts_test_data = load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/dev.parquet')[['sentence1', 'sentence2', 'score']]
sts_train_data.head(10)
Loaded data from: https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/train.parquet | Columns = 4 / 4 | Rows = 5749 -> 5749
Loaded data from: https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/dev.parquet | Columns = 4 / 4 | Rows = 1500 -> 1500
sentence1 sentence2 score
0 A plane is taking off. An air plane is taking off. 5.00
1 A man is playing a large flute. A man is playing a flute. 3.80
2 A man is spreading shreded cheese on a pizza. A man is spreading shredded cheese on an uncoo... 3.80
3 Three men are playing chess. Two men are playing chess. 2.60
4 A man is playing the cello. A man seated is playing the cello. 4.25
5 Some men are fighting. Two men are fighting. 4.25
6 A man is smoking. A man is skating. 0.50
7 The man is playing the piano. The man is playing the guitar. 1.60
8 A man is playing on a guitar and singing. A woman is playing an acoustic guitar and sing... 2.20
9 A person is throwing a cat on to the ceiling. A person throws a cat on the ceiling. 5.00

In this data, the column named score contains numerical values (which we’d like to predict) that are human-annotated similarity scores for each given pair of sentences.

print('Min score=', min(sts_train_data['score']), ', Max score=', max(sts_train_data['score']))
Min score= 0.0 , Max score= 5.0

Let’s train a regression model to predict these scores. Note that we only need to specify the label column and AutoGluon automatically determines the type of prediction problem and an appropriate loss function. Once again, you should increase the short time_limit below to obtain reasonable performance in your own applications.

predictor_sts = TextPredictor(label='score', path='./ag_sts')
predictor_sts.fit(sts_train_data, time_limit=60)
Problem Type="regression"
Column Types:
   - "sentence1": text
   - "sentence2": text
   - "score": numerical

The GluonNLP V0 backend is used. We will use 8 cpus and 1 gpus to train each trial.
All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sts/task0/training.log
Fitting and transforming the train data...
Done! Preprocessor saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sts/task0/preprocessor.pkl
Process dev set...
Done!
Max length for chunking text: 128, Stochastic chunk: Train-False/Test-False, Test #repeat: 1.
#Total Params/Fixed Params=108990337/0
Using gradient accumulation. Global batch size = 128
Local training results will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sts/task0/results_local.jsonl.
[Iter 3/410, Epoch 0] train loss=1.41e+00, gnorm=1.39e+01, lr=7.32e-06, #samples processed=384, #sample per second=99.03. ETA=8.75min
[Iter 6/410, Epoch 0] train loss=1.06e+00, gnorm=1.16e+01, lr=1.46e-05, #samples processed=384, #sample per second=93.08. ETA=8.97min
[Iter 9/410, Epoch 0] train loss=8.36e-01, gnorm=1.17e+01, lr=2.20e-05, #samples processed=384, #sample per second=118.67. ETA=8.34min
[Iter 9/410, Epoch 0] valid r2=5.1128e-01, root_mean_squared_error=1.0390e+00, mean_absolute_error=8.1696e-01, time spent=2.070s, total time spent=0.24min. Find new best=True, Find new top-3=True
[Iter 12/410, Epoch 0] train loss=6.89e-01, gnorm=7.31e+00, lr=2.93e-05, #samples processed=384, #sample per second=57.46. ETA=9.90min
[Iter 15/410, Epoch 0] train loss=5.97e-01, gnorm=8.97e+00, lr=3.66e-05, #samples processed=384, #sample per second=112.03. ETA=9.37min
[Iter 18/410, Epoch 0] train loss=5.40e-01, gnorm=4.70e+00, lr=4.39e-05, #samples processed=384, #sample per second=115.08. ETA=8.96min
[Iter 18/410, Epoch 0] valid r2=6.1777e-01, root_mean_squared_error=9.1883e-01, mean_absolute_error=7.4361e-01, time spent=2.065s, total time spent=0.47min. Find new best=True, Find new top-3=True
[Iter 21/410, Epoch 0] train loss=4.92e-01, gnorm=6.03e+00, lr=5.12e-05, #samples processed=384, #sample per second=56.42. ETA=9.72min
[Iter 24/410, Epoch 0] train loss=5.69e-01, gnorm=4.96e+00, lr=5.85e-05, #samples processed=384, #sample per second=107.40. ETA=9.40min
[Iter 27/410, Epoch 0] train loss=4.12e-01, gnorm=4.30e+00, lr=6.59e-05, #samples processed=384, #sample per second=87.00. ETA=9.33min
[Iter 27/410, Epoch 0] valid r2=5.8926e-01, root_mean_squared_error=9.5248e-01, mean_absolute_error=7.4617e-01, time spent=2.070s, total time spent=0.70min. Find new best=False, Find new top-3=True
[Iter 30/410, Epoch 0] train loss=4.44e-01, gnorm=1.38e+01, lr=7.32e-05, #samples processed=384, #sample per second=63.84. ETA=9.60min
[Iter 33/410, Epoch 0] train loss=3.92e-01, gnorm=4.51e+00, lr=8.05e-05, #samples processed=384, #sample per second=113.44. ETA=9.31min
[Iter 36/410, Epoch 0] train loss=3.61e-01, gnorm=3.20e+00, lr=8.78e-05, #samples processed=384, #sample per second=107.37. ETA=9.08min
[Iter 36/410, Epoch 0] valid r2=7.6088e-01, root_mean_squared_error=7.2675e-01, mean_absolute_error=5.8205e-01, time spent=2.074s, total time spent=0.94min. Find new best=True, Find new top-3=True
Training completed. Auto-saving to "./ag_sts/". For loading the model, you can use predictor = TextPredictor.load("./ag_sts/")
<autogluon.text.text_prediction.predictor.predictor.TextPredictor at 0x7f5f4149a150>

We again evaluate our trained model’s performance on separate test data. Below we choose to compute the following metrics: RMSE, Pearson Correlation, and Spearman Correlation.

test_score = predictor_sts.evaluate(sts_test_data, metrics=['rmse', 'pearsonr', 'spearmanr'])
print('RMSE = {:.2f}'.format(test_score['rmse']))
print('PEARSONR = {:.4f}'.format(test_score['pearsonr']))
print('SPEARMANR = {:.4f}'.format(test_score['spearmanr']))
RMSE = 0.75
PEARSONR = 0.8760
SPEARMANR = 0.8775
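For readers unfamiliar with these three metrics, the sketch below computes them by hand using only the standard library. It is not AutoGluon's implementation; the helper names and the example scores are made up for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length lists."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up similarity scores on the 0-5 scale, not model output.
true_scores = [5.0, 3.8, 2.6, 0.5]
pred_scores = [4.5, 4.0, 2.0, 1.0]
score_rmse = rmse(true_scores, pred_scores)
score_pearson = pearson(true_scores, pred_scores)
score_spearman = spearman(true_scores, pred_scores)
print(score_rmse, score_pearson, score_spearman)
```

RMSE penalizes the size of each error, Pearson measures linear agreement, and Spearman only checks whether the predictions order the pairs the same way as the annotations, which is why it reaches 1 here even though the predictions are not exact.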

Let’s use our model to predict the similarity score between a few sentences.

sentences = ['The child is riding a horse.',
             'The young boy is riding a horse.',
             'The young man is riding a horse.',
             'The young man is riding a bicycle.']

score1 = predictor_sts.predict({'sentence1': [sentences[0]],
                                'sentence2': [sentences[1]]}, as_pandas=False)

score2 = predictor_sts.predict({'sentence1': [sentences[0]],
                                'sentence2': [sentences[2]]}, as_pandas=False)

score3 = predictor_sts.predict({'sentence1': [sentences[0]],
                                'sentence2': [sentences[3]]}, as_pandas=False)
print(score1, score2, score3)
[4.2377853] [2.738769] [0.39222687]
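The three separate calls above could likely be combined into one batch. The sketch below only builds the input dictionary, with one row per sentence pair; it assumes predict() accepts several rows per dictionary for this predictor, as it did for the two-sentence sentiment call earlier. The name `pair_batch` is made up for illustration.

```python
sentences = ['The child is riding a horse.',
             'The young boy is riding a horse.',
             'The young man is riding a horse.',
             'The young man is riding a bicycle.']

# Pair the first sentence with each of the others: one row per pair.
pair_batch = {
    'sentence1': [sentences[0]] * (len(sentences) - 1),
    'sentence2': sentences[1:],
}

# With a trained predictor, all pairs could then be scored in one call
# (assumption, not run here):
# scores = predictor_sts.predict(pair_batch, as_pandas=False)
print(len(pair_batch['sentence1']), len(pair_batch['sentence2']))
```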

Although the TextPredictor is only designed for classification and regression tasks, it can directly be used for many NLP tasks if you properly format them into a data table. Note that there can be many text columns in this data table. Refer to the TextPredictor documentation to see all of the available methods/options.

Unlike TabularPredictor, which trains and ensembles many different kinds of models, TextPredictor fits only Transformer neural network models. These are fit to your data via transfer learning from pretrained NLP models such as BERT, ALBERT, and ELECTRA. TextPredictor also enables training on multi-modal data tables that contain text, numeric, and categorical columns, and the neural network hyperparameters can be automatically tuned with Hyperparameter Optimization (HPO), which is introduced in other tutorials.

Note: TextPredictor depends on the GluonNLP package. Due to an ongoing upgrade of GluonNLP, we are currently using a custom version of the package: autogluon-contrib-nlp. In a future release, AutoGluon will support the official GluonNLP 1.0, but the APIs demonstrated here will remain the same.