.. _sec_textprediction_beginner:

Text Prediction - Quick Start
=============================

Here we briefly demonstrate the ``TextPredictor``, which helps you automatically train and deploy models for various Natural Language Processing (NLP) tasks. This tutorial presents two examples of NLP tasks:

- `Sentiment Analysis `__
- `Sentence Similarity `__

The general usage of the ``TextPredictor`` is similar to AutoGluon's ``TabularPredictor``. We format NLP datasets as tables where certain columns contain text fields and a special column contains the labels to predict, and each row corresponds to one training example. Here, the labels can be discrete categories (classification) or numerical values (regression).

.. code:: python

    %matplotlib inline

    import numpy as np
    import warnings
    import matplotlib.pyplot as plt
    warnings.filterwarnings('ignore')
    np.random.seed(123)

Sentiment Analysis Task
-----------------------

First, we consider the Stanford Sentiment Treebank (`SST `__) dataset, which consists of movie reviews and their associated sentiment. Given a new movie review, the goal is to predict the sentiment reflected in the text (in this case a **binary classification**, where reviews are labeled as 1 if they convey a positive opinion and labeled as 0 otherwise). Let's first load and look at the data, noting the labels are stored in a column called **label**.

.. code:: python

    from autogluon.core.utils.loaders.load_pd import load
    train_data = load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/train.parquet')
    test_data = load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/dev.parquet')
    subsample_size = 1000  # subsample data for faster demo, try setting this to larger values
    train_data = train_data.sample(n=subsample_size, random_state=0)
    train_data.head(10)

.. raw:: html

	sentence	label
43787	very pleasing at its best moments	1
16159	, american chai is enough to make you put away...	0
59015	too much like an infomercial for ram dass 's l...	0
5108	a stirring visual sequence	1
67052	cool visual backmasking	1
35938	hard ground	0
49879	the striking , quietly vulnerable personality ...	1
51591	pan nalin 's exposition is beautiful and myste...	1
56780	wonderfully loopy	1
28518	most beautiful , evocative	1

The data above happen to be stored in the `Parquet `__ table format, but you can also directly ``load()`` data from a `CSV `__ file instead. While here we load files from `AWS S3 cloud storage `__, these could instead be local files on your machine. After loading, ``train_data`` is simply a `Pandas DataFrame `__, where each row represents a different training example (for machine learning to be appropriate, the rows should be independent and identically distributed).

Training
~~~~~~~~

To ensure this tutorial runs quickly, we simply call ``fit()`` with a subset of 1000 training examples and limit its runtime to approximately 1 minute. To achieve reasonable performance in your applications, we recommend setting a much longer ``time_limit`` (e.g., 1 hour), or not specifying ``time_limit`` at all (``time_limit=None``).

.. code:: python

    from autogluon.text import TextPredictor

    predictor = TextPredictor(label='label', eval_metric='acc', path='./ag_sst')
    predictor.fit(train_data, time_limit=60)

.. parsed-literal::
    :class: output

    INFO:root:NumPy-shape semantics has been activated in your code. This is required for creating and manipulating scalar and zero-size tensors, which were not supported in MXNet before, as in the official NumPy library. Please DO NOT manually deactivate this semantics while using `mxnet.numpy` and `mxnet.numpy_extension` modules.
    INFO:autogluon.text.text_prediction.mx.models:The GluonNLP V0 backend is used. We will use 8 cpus and 1 gpus to train each trial.

.. parsed-literal::
    :class: output

    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sst/task0/training.log

.. parsed-literal::
    :class: output

    INFO:root:Fitting and transforming the train data...
    INFO:root:Done! Preprocessor saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sst/task0/preprocessor.pkl
    INFO:root:Process dev set...
    INFO:root:Done!
INFO:root:Max length for chunking text: 64, Stochastic chunk: Train-False/Test-False, Test #repeat: 1. INFO:root:#Total Params/Fixed Params=108990466/0 Level 15:root:Using gradient accumulation. Global batch size = 128 INFO:root:Local training results will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sst/task0/results_local.jsonl. Level 15:root:[Iter 1/70, Epoch 0] train loss=8.76e-01, gnorm=9.82e+00, lr=1.43e-05, #samples processed=128, #sample per second=85.79. ETA=1.72min Level 15:root:[Iter 2/70, Epoch 0] train loss=7.94e-01, gnorm=6.16e+00, lr=2.86e-05, #samples processed=128, #sample per second=134.54. ETA=1.39min Level 25:root:[Iter 2/70, Epoch 0] valid f1=7.2204e-01, mcc=0.0000e+00, roc_auc=4.2305e-01, accuracy=5.6500e-01, log_loss=1.0976e+00, time spent=0.484s, total time spent=0.07min. Find new best=True, Find new top-3=True Level 15:root:[Iter 3/70, Epoch 0] train loss=1.29e+00, gnorm=1.51e+01, lr=4.29e-05, #samples processed=128, #sample per second=50.87. ETA=1.85min Level 15:root:[Iter 4/70, Epoch 0] train loss=1.15e+00, gnorm=1.27e+01, lr=5.71e-05, #samples processed=128, #sample per second=152.23. ETA=1.60min Level 25:root:[Iter 4/70, Epoch 0] valid f1=5.6716e-01, mcc=1.4791e-01, roc_auc=6.2750e-01, accuracy=5.6500e-01, log_loss=6.9451e-01, time spent=0.483s, total time spent=0.13min. Find new best=True, Find new top-3=True Level 15:root:[Iter 5/70, Epoch 0] train loss=6.92e-01, gnorm=6.95e+00, lr=7.14e-05, #samples processed=128, #sample per second=44.98. ETA=1.87min Level 15:root:[Iter 6/70, Epoch 0] train loss=6.23e-01, gnorm=1.95e+01, lr=8.57e-05, #samples processed=128, #sample per second=142.10. ETA=1.70min Level 25:root:[Iter 6/70, Epoch 0] valid f1=7.1947e-01, mcc=7.6355e-02, roc_auc=7.2882e-01, accuracy=5.7500e-01, log_loss=6.7021e-01, time spent=0.487s, total time spent=0.19min. 
Find new best=True, Find new top-3=True Level 15:root:[Iter 7/70, Epoch 0] train loss=6.59e-01, gnorm=7.78e+00, lr=1.00e-04, #samples processed=128, #sample per second=43.82. ETA=1.87min Level 15:root:[Iter 8/70, Epoch 1] train loss=7.61e-01, gnorm=1.45e+01, lr=9.84e-05, #samples processed=128, #sample per second=154.82. ETA=1.72min Level 25:root:[Iter 8/70, Epoch 1] valid f1=5.5866e-01, mcc=2.7262e-01, roc_auc=6.9301e-01, accuracy=6.0500e-01, log_loss=6.6740e-01, time spent=0.492s, total time spent=0.27min. Find new best=True, Find new top-3=True Level 15:root:[Iter 9/70, Epoch 1] train loss=7.56e-01, gnorm=5.62e+00, lr=9.68e-05, #samples processed=128, #sample per second=35.67. ETA=1.91min Level 15:root:[Iter 10/70, Epoch 1] train loss=7.24e-01, gnorm=5.96e+00, lr=9.52e-05, #samples processed=128, #sample per second=151.57. ETA=1.77min Level 25:root:[Iter 10/70, Epoch 1] valid f1=7.5839e-01, mcc=3.2452e-01, roc_auc=8.7845e-01, accuracy=6.4000e-01, log_loss=5.7693e-01, time spent=0.488s, total time spent=0.34min. Find new best=True, Find new top-3=True Level 15:root:[Iter 11/70, Epoch 1] train loss=6.31e-01, gnorm=4.66e+00, lr=9.37e-05, #samples processed=128, #sample per second=37.72. ETA=1.89min Level 15:root:[Iter 12/70, Epoch 1] train loss=5.70e-01, gnorm=5.85e+00, lr=9.21e-05, #samples processed=128, #sample per second=153.32. ETA=1.77min Level 25:root:[Iter 12/70, Epoch 1] valid f1=7.9397e-01, mcc=6.1951e-01, roc_auc=9.1527e-01, accuracy=7.9500e-01, log_loss=4.7745e-01, time spent=0.494s, total time spent=0.41min. Find new best=True, Find new top-3=True Level 15:root:[Iter 13/70, Epoch 1] train loss=5.57e-01, gnorm=5.38e+00, lr=9.05e-05, #samples processed=128, #sample per second=37.69. ETA=1.85min Level 15:root:[Iter 14/70, Epoch 1] train loss=3.91e-01, gnorm=3.17e+00, lr=8.89e-05, #samples processed=128, #sample per second=158.46. 
ETA=1.74min Level 25:root:[Iter 14/70, Epoch 1] valid f1=8.6381e-01, mcc=6.6579e-01, roc_auc=9.0927e-01, accuracy=8.2500e-01, log_loss=4.5740e-01, time spent=0.502s, total time spent=0.48min. Find new best=True, Find new top-3=True Level 15:root:[Iter 15/70, Epoch 2] train loss=3.69e-01, gnorm=6.26e+00, lr=8.73e-05, #samples processed=128, #sample per second=37.06. ETA=1.81min Level 15:root:[Iter 16/70, Epoch 2] train loss=2.38e-01, gnorm=2.43e+00, lr=8.57e-05, #samples processed=128, #sample per second=156.22. ETA=1.71min Level 25:root:[Iter 16/70, Epoch 2] valid f1=9.0909e-01, mcc=7.9963e-01, roc_auc=9.5209e-01, accuracy=9.0000e-01, log_loss=3.1594e-01, time spent=0.494s, total time spent=0.55min. Find new best=True, Find new top-3=True Level 15:root:[Iter 17/70, Epoch 2] train loss=3.59e-01, gnorm=5.29e+00, lr=8.41e-05, #samples processed=128, #sample per second=37.61. ETA=1.76min Level 15:root:[Iter 18/70, Epoch 2] train loss=2.87e-01, gnorm=5.75e+00, lr=8.25e-05, #samples processed=128, #sample per second=150.47. ETA=1.67min Level 25:root:[Iter 18/70, Epoch 2] valid f1=8.8000e-01, mcc=7.0770e-01, roc_auc=9.4660e-01, accuracy=8.5000e-01, log_loss=4.6460e-01, time spent=0.492s, total time spent=0.60min. Find new best=False, Find new top-3=True Level 15:root:[Iter 19/70, Epoch 2] train loss=3.11e-01, gnorm=8.02e+00, lr=8.10e-05, #samples processed=128, #sample per second=56.18. ETA=1.65min Level 15:root:[Iter 20/70, Epoch 2] train loss=1.73e-01, gnorm=2.98e+00, lr=7.94e-05, #samples processed=128, #sample per second=157.56. ETA=1.57min Level 25:root:[Iter 20/70, Epoch 2] valid f1=8.8393e-01, mcc=7.3638e-01, roc_auc=9.4395e-01, accuracy=8.7000e-01, log_loss=3.2271e-01, time spent=0.497s, total time spent=0.66min. Find new best=False, Find new top-3=True Level 15:root:[Iter 21/70, Epoch 2] train loss=3.55e-01, gnorm=4.42e+00, lr=7.78e-05, #samples processed=128, #sample per second=54.59. 
ETA=1.56min Level 15:root:[Iter 22/70, Epoch 3] train loss=2.36e-01, gnorm=8.88e+00, lr=7.62e-05, #samples processed=128, #sample per second=140.54. ETA=1.49min Level 25:root:[Iter 22/70, Epoch 3] valid f1=8.7336e-01, mcc=7.0417e-01, roc_auc=9.4233e-01, accuracy=8.5500e-01, log_loss=3.1476e-01, time spent=0.499s, total time spent=0.71min. Find new best=False, Find new top-3=True Level 15:root:[Iter 23/70, Epoch 3] train loss=1.99e-01, gnorm=6.51e+00, lr=7.46e-05, #samples processed=128, #sample per second=53.61. ETA=1.48min Level 15:root:[Iter 24/70, Epoch 3] train loss=3.36e-01, gnorm=8.62e+00, lr=7.30e-05, #samples processed=128, #sample per second=155.76. ETA=1.41min Level 25:root:[Iter 24/70, Epoch 3] valid f1=8.6957e-01, mcc=6.8006e-01, roc_auc=9.3999e-01, accuracy=8.3500e-01, log_loss=4.5034e-01, time spent=0.502s, total time spent=0.75min. Find new best=False, Find new top-3=False Level 15:root:[Iter 25/70, Epoch 3] train loss=2.44e-01, gnorm=4.81e+00, lr=7.14e-05, #samples processed=128, #sample per second=92.69. ETA=1.37min Level 15:root:[Iter 26/70, Epoch 3] train loss=1.69e-01, gnorm=1.84e+00, lr=6.98e-05, #samples processed=128, #sample per second=160.76. ETA=1.31min Level 25:root:[Iter 26/70, Epoch 3] valid f1=8.9498e-01, mcc=7.7011e-01, roc_auc=9.4426e-01, accuracy=8.8500e-01, log_loss=3.3178e-01, time spent=0.510s, total time spent=0.80min. Find new best=False, Find new top-3=True Level 15:root:[Iter 27/70, Epoch 3] train loss=2.08e-01, gnorm=2.65e+00, lr=6.83e-05, #samples processed=128, #sample per second=51.44. ETA=1.30min Level 15:root:[Iter 28/70, Epoch 3] train loss=1.86e-01, gnorm=4.24e+00, lr=6.67e-05, #samples processed=128, #sample per second=157.87. ETA=1.24min Level 25:root:[Iter 28/70, Epoch 3] valid f1=8.8136e-01, mcc=7.1518e-01, roc_auc=9.3317e-01, accuracy=8.6000e-01, log_loss=3.9546e-01, time spent=0.510s, total time spent=0.84min. 
    Find new best=False, Find new top-3=False
    Level 15:root:[Iter 29/70, Epoch 4] train loss=9.32e-02, gnorm=2.09e+00, lr=6.51e-05, #samples processed=128, #sample per second=93.50. ETA=1.20min
    Level 15:root:[Iter 30/70, Epoch 4] train loss=2.13e-01, gnorm=7.57e+00, lr=6.35e-05, #samples processed=128, #sample per second=149.73. ETA=1.15min
    Level 25:root:[Iter 30/70, Epoch 4] valid f1=8.5490e-01, mcc=6.3946e-01, roc_auc=9.2290e-01, accuracy=8.1500e-01, log_loss=6.5264e-01, time spent=0.504s, total time spent=0.87min. Find new best=False, Find new top-3=False
    INFO:numexpr.utils:NumExpr defaulting to 8 threads.
    INFO:root:Training completed. Auto-saving to "./ag_sst/". For loading the model, you can use `predictor = TextPredictor.load("./ag_sst/")`

Above we specify that: the column named **label** contains the label values to predict, AutoGluon should optimize its predictions for the accuracy evaluation metric, trained models should be saved in the **ag\_sst** folder, and training should run for around 60 seconds.

Evaluation
~~~~~~~~~~

After training, we can easily evaluate our predictor on separate test data formatted similarly to our training data.

.. code:: python

    test_score = predictor.evaluate(test_data)
    print('Accuracy = {:.2f}%'.format(test_score * 100))

.. parsed-literal::
    :class: output

    Accuracy = 89.33%

By default, ``evaluate()`` reports the evaluation metric previously specified, which is ``accuracy`` in our example. You may also specify additional metrics, e.g. F1 score, when calling ``evaluate()``.

.. code:: python

    test_score = predictor.evaluate(test_data, metrics=['acc', 'f1'])
    print(test_score)

.. parsed-literal::
    :class: output

    {'acc': 0.893348623853211, 'f1': 0.890716803760282}

Prediction
~~~~~~~~~~

You can easily obtain predictions from these models by calling ``predictor.predict()``.

.. code:: python

    sentence1 = "it's a charming and often affecting journey."
    sentence2 = "It's slow, very, very, very slow."
    predictions = predictor.predict({'sentence': [sentence1, sentence2]})
    print('"Sentence":', sentence1, '"Predicted Sentiment":', predictions[0])
    print('"Sentence":', sentence2, '"Predicted Sentiment":', predictions[1])

.. parsed-literal::
    :class: output

    "Sentence": it's a charming and often affecting journey. "Predicted Sentiment": 1
    "Sentence": It's slow, very, very, very slow. "Predicted Sentiment": 0

For classification tasks, you can ask for predicted class-probabilities instead of predicted classes.

.. code:: python

    probs = predictor.predict_proba({'sentence': [sentence1, sentence2]})
    # Use .iloc to select a row (one sentence); probs[0] would select the column for class 0.
    print('"Sentence":', sentence1, '"Predicted Class-Probabilities":', probs.iloc[0])
    print('"Sentence":', sentence2, '"Predicted Class-Probabilities":', probs.iloc[1])

.. parsed-literal::
    :class: output

    "Sentence": it's a charming and often affecting journey. "Predicted Class-Probabilities": 0    0.002646
    1    0.997355
    Name: 0, dtype: float32
    "Sentence": It's slow, very, very, very slow. "Predicted Class-Probabilities": 0    0.974943
    1    0.025057
    Name: 1, dtype: float32

We can just as easily produce predictions over an entire dataset.

.. code:: python

    test_predictions = predictor.predict(test_data)
    test_predictions.head()

.. parsed-literal::
    :class: output

    0    1
    1    0
    2    1
    3    1
    4    0
    Name: label, dtype: int64

Intermediate Training Results
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After training, you can explore intermediate training results in ``predictor.results``.

.. code:: python

    predictor.results.tail(3)

.. raw:: html

	iteration	report_idx	epoch	f1	mcc	roc_auc	accuracy	log_loss	find_better	find_new_topn	nbest_stat	elapsed_time	reward_attr	eval_metric	exp_dir
13	28	14	3	0.881356	0.715180	0.933171	0.860	0.395456	False	False	[[0.87, 0.9, 0.885], [20, 16, 26]]	50	0.860	accuracy	/var/lib/jenkins/workspace/workspace/autogluon...
14	30	15	4	0.854902	0.639459	0.922897	0.815	0.652641	False	False	[[0.87, 0.9, 0.885], [20, 16, 26]]	52	0.815	accuracy	/var/lib/jenkins/workspace/workspace/autogluon...
15	16	16	2	0.909091	0.799631	0.952090	0.900	0.315937	True	True	[[0.825, 0.9, 0.795], [14, 16, 12]]	32	0.900	accuracy	/var/lib/jenkins/workspace/workspace/autogluon...

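Since ``predictor.results`` is displayed as a table, it can presumably be explored with ordinary pandas operations. The sketch below is only an illustration under that assumption: it rebuilds a few rows by hand from the validation log above (``results`` is a stand-in variable, not the real attribute) and looks up the checkpoint with the highest validation accuracy.

.. code:: python

    import pandas as pd

    # Hand-copied subset of the validation log above; stands in for predictor.results.
    results = pd.DataFrame({
        'iteration': [14, 16, 20, 26, 28, 30],
        'accuracy': [0.825, 0.900, 0.870, 0.885, 0.860, 0.815],
        'log_loss': [0.45740, 0.31594, 0.32271, 0.33178, 0.39546, 0.65264],
    })

    # Select the row with the highest validation accuracy.
    best = results.loc[results['accuracy'].idxmax()]
    print(int(best['iteration']), best['accuracy'])

In this subset the best accuracy is reached at iteration 16, which matches the last row of the table above.
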
Save and Load
~~~~~~~~~~~~~

The trained predictor is automatically saved at the end of ``fit()``, and you can easily reload it.

.. code:: python

    loaded_predictor = TextPredictor.load('ag_sst')
    loaded_predictor.predict_proba({'sentence': [sentence1, sentence2]})

.. raw:: html

	0	1
0	0.002646	0.997355
1	0.974943	0.025057

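The table above has one row per input sentence and one column per class label. For a pandas DataFrame of this shape, ``probs[0]`` selects the column for class 0, whereas ``probs.iloc[0]`` selects the probabilities for the first sentence. The sketch below rebuilds the table by hand (values copied from above, so ``probs`` here is a stand-in rather than real predictor output) to show the difference and to recover the most probable class per row.

.. code:: python

    import pandas as pd

    # Values copied from the table above: rows are sentences, columns are class labels.
    probs = pd.DataFrame({0: [0.002646, 0.974943], 1: [0.997355, 0.025057]})

    first_sentence = probs.iloc[0]    # class probabilities for the first row
    class0_column = probs[0]          # probability of class 0 for every row
    predicted = probs.idxmax(axis=1)  # most probable class label per row

    print(first_sentence.tolist())
    print(class0_column.tolist())
    print(predicted.tolist())
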
You can also save the predictor to any location by calling ``.save()``.

.. code:: python

    loaded_predictor.save('my_saved_dir')
    loaded_predictor2 = TextPredictor.load('my_saved_dir')
    loaded_predictor2.predict_proba({'sentence': [sentence1, sentence2]})

.. raw:: html

	0	1
0	0.002646	0.997355
1	0.974943	0.025057

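To confirm programmatically that a saved and reloaded predictor reproduces the original outputs, the two probability tables can be compared directly. This is a minimal sketch: ``probs_a`` and ``probs_b`` are hypothetical names, and the hand-built DataFrames stand in for the two ``predict_proba`` results shown above.

.. code:: python

    import numpy as np
    import pandas as pd

    # Stand-ins for the outputs of loaded_predictor and loaded_predictor2.
    probs_a = pd.DataFrame({0: [0.002646, 0.974943], 1: [0.997355, 0.025057]}, dtype='float32')
    probs_b = probs_a.copy()

    # Compare with a small tolerance, since float32 outputs may differ in the last digits.
    same = bool(np.allclose(probs_a.to_numpy(), probs_b.to_numpy(), atol=1e-6))
    print(same)
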
.. _sec_textprediction_extract_embedding:

Extract Embeddings
~~~~~~~~~~~~~~~~~~

You can also use a trained predictor to extract embeddings that map each row of the data table to an embedding vector taken from the intermediate neural network representations of that row.

.. code:: python

    embeddings = predictor.extract_embedding(test_data)
    print(embeddings)

.. parsed-literal::
    :class: output

    [[-1.0801306 -0.44154665 -1.0147676 ... -0.8042417 0.5623589 0.51314175]
     [-0.5288651 0.13702041 -0.45935678 ... 0.17704543 0.35587373 -0.13050948]
     [-0.7904423 -0.1516396 -0.736847 ... -0.57204205 0.5236889 0.3740344 ]
     ...
     [-0.4152624 0.20381683 -0.39514995 ... -0.22893398 0.23090875 0.36273792]
     [-0.39312404 0.30050468 -0.6993964 ... 0.1369121 0.16843818 0.09883293]
     [-0.89676505 0.12524071 -0.3635128 ... -0.51871604 -0.04470562 0.11770649]]

Here, we use TSNE to visualize these extracted embeddings. We can see that there are two clusters corresponding to our two labels, since this network has been trained to discriminate between these labels.

.. code:: python

    from sklearn.manifold import TSNE
    X_embedded = TSNE(n_components=2, random_state=123).fit_transform(embeddings)
    for val, color in [(0, 'red'), (1, 'blue')]:
        idx = (test_data['label'].to_numpy() == val).nonzero()
        plt.scatter(X_embedded[idx, 0], X_embedded[idx, 1], c=color, label=f'label={val}')
    plt.legend(loc='best')

.. figure:: output_beginner_02414c_25_1.png

Sentence Similarity Task
------------------------

Next, let's use AutoGluon to train a model for evaluating how semantically similar two sentences are. We use the `Semantic Textual Similarity Benchmark `__ dataset for illustration.

.. code:: python

    sts_train_data = load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/train.parquet')[['sentence1', 'sentence2', 'score']]
    sts_test_data = load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/dev.parquet')[['sentence1', 'sentence2', 'score']]
    sts_train_data.head(10)

.. parsed-literal::
    :class: output

    INFO:autogluon.core.utils.loaders.load_pd:Loaded data from: https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/train.parquet | Columns = 4 / 4 | Rows = 5749 -> 5749
    INFO:autogluon.core.utils.loaders.load_pd:Loaded data from: https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/dev.parquet | Columns = 4 / 4 | Rows = 1500 -> 1500

.. raw:: html

	sentence1	sentence2	score
0	A plane is taking off.	An air plane is taking off.	5.00
1	A man is playing a large flute.	A man is playing a flute.	3.80
2	A man is spreading shreded cheese on a pizza.	A man is spreading shredded cheese on an uncoo...	3.80
3	Three men are playing chess.	Two men are playing chess.	2.60
4	A man is playing the cello.	A man seated is playing the cello.	4.25
5	Some men are fighting.	Two men are fighting.	4.25
6	A man is smoking.	A man is skating.	0.50
7	The man is playing the piano.	The man is playing the guitar.	1.60
8	A man is playing on a guitar and singing.	A woman is playing an acoustic guitar and sing...	2.20
9	A person is throwing a cat on to the ceiling.	A person throws a cat on the ceiling.	5.00

In this data, the column named **score** contains numerical values (which we'd like to predict) that are human-annotated similarity scores for each given pair of sentences.

.. code:: python

    print('Min score=', min(sts_train_data['score']), ', Max score=', max(sts_train_data['score']))

.. parsed-literal::
    :class: output

    Min score= 0.0 , Max score= 5.0

Let's train a regression model to predict these scores. Note that we only need to specify the label column and AutoGluon automatically determines the type of prediction problem and an appropriate loss function. Once again, you should increase the short ``time_limit`` below to obtain reasonable performance in your own applications.

.. code:: python

    predictor_sts = TextPredictor(label='score', path='./ag_sts')
    predictor_sts.fit(sts_train_data, time_limit=60)

.. parsed-literal::
    :class: output

    INFO:root:Problem Type="regression"
    INFO:root:Column Types:
       - "sentence1": text
       - "sentence2": text
       - "score": numerical
    INFO:autogluon.text.text_prediction.mx.models:The GluonNLP V0 backend is used. We will use 8 cpus and 1 gpus to train each trial.

.. parsed-literal::
    :class: output

    All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sts/task0/training.log

.. parsed-literal::
    :class: output

    INFO:root:Fitting and transforming the train data...
    INFO:root:Done! Preprocessor saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sts/task0/preprocessor.pkl
    INFO:root:Process dev set...
    INFO:root:Done!
    INFO:root:Max length for chunking text: 128, Stochastic chunk: Train-False/Test-False, Test #repeat: 1.
    INFO:root:#Total Params/Fixed Params=108990337/0
    Level 15:root:Using gradient accumulation.
Global batch size = 128 INFO:root:Local training results will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-text-v3/docs/_build/eval/tutorials/text_prediction/ag_sts/task0/results_local.jsonl. Level 15:root:[Iter 3/410, Epoch 0] train loss=1.39e+00, gnorm=1.51e+01, lr=7.32e-06, #samples processed=384, #sample per second=79.31. ETA=10.92min Level 15:root:[Iter 6/410, Epoch 0] train loss=1.11e+00, gnorm=1.03e+01, lr=1.46e-05, #samples processed=384, #sample per second=94.11. ETA=10.00min Level 15:root:[Iter 9/410, Epoch 0] train loss=9.64e-01, gnorm=1.11e+01, lr=2.20e-05, #samples processed=384, #sample per second=88.47. ETA=9.84min Level 25:root:[Iter 9/410, Epoch 0] valid r2=-2.4154e-02, root_mean_squared_error=1.5040e+00, mean_absolute_error=1.2487e+00, time spent=2.391s, total time spent=0.28min. Find new best=True, Find new top-3=True Level 15:root:[Iter 12/410, Epoch 0] train loss=9.11e-01, gnorm=1.32e+01, lr=2.93e-05, #samples processed=384, #sample per second=46.64. ETA=11.88min Level 15:root:[Iter 15/410, Epoch 0] train loss=8.99e-01, gnorm=1.14e+01, lr=3.66e-05, #samples processed=384, #sample per second=83.51. ETA=11.45min Level 15:root:[Iter 18/410, Epoch 0] train loss=7.37e-01, gnorm=8.52e+00, lr=4.39e-05, #samples processed=384, #sample per second=87.07. ETA=11.07min Level 25:root:[Iter 18/410, Epoch 0] valid r2=5.2090e-01, root_mean_squared_error=1.0287e+00, mean_absolute_error=8.1591e-01, time spent=2.429s, total time spent=0.57min. Find new best=True, Find new top-3=True Level 15:root:[Iter 21/410, Epoch 0] train loss=5.92e-01, gnorm=1.14e+01, lr=5.12e-05, #samples processed=384, #sample per second=43.81. ETA=12.12min Level 15:root:[Iter 24/410, Epoch 0] train loss=8.96e-01, gnorm=1.89e+01, lr=5.85e-05, #samples processed=384, #sample per second=90.06. ETA=11.67min Level 15:root:[Iter 27/410, Epoch 0] train loss=5.44e-01, gnorm=9.57e+00, lr=6.59e-05, #samples processed=384, #sample per second=80.98. 
    ETA=11.41min
    Level 25:root:[Iter 27/410, Epoch 0] valid r2=5.1818e-01, root_mean_squared_error=1.0316e+00, mean_absolute_error=8.4692e-01, time spent=2.474s, total time spent=0.85min. Find new best=False, Find new top-3=True
    Level 15:root:[Iter 30/410, Epoch 0] train loss=5.25e-01, gnorm=3.97e+00, lr=7.32e-05, #samples processed=384, #sample per second=50.82. ETA=11.79min
    Level 15:root:[Iter 33/410, Epoch 0] train loss=4.78e-01, gnorm=6.69e+00, lr=8.05e-05, #samples processed=384, #sample per second=80.05. ETA=11.54min
    Level 15:root:[Iter 36/410, Epoch 0] train loss=3.89e-01, gnorm=4.69e+00, lr=8.78e-05, #samples processed=384, #sample per second=81.37. ETA=11.31min
    Level 25:root:[Iter 36/410, Epoch 0] valid r2=5.6263e-01, root_mean_squared_error=9.8287e-01, mean_absolute_error=7.8097e-01, time spent=2.456s, total time spent=1.16min. Find new best=True, Find new top-3=True
    INFO:root:Training completed. Auto-saving to "./ag_sts/". For loading the model, you can use `predictor = TextPredictor.load("./ag_sts/")`

We again evaluate our trained model's performance on separate test data. Below we choose to compute the following metrics: RMSE, Pearson Correlation, and Spearman Correlation.

.. code:: python

    test_score = predictor_sts.evaluate(sts_test_data, metrics=['rmse', 'pearsonr', 'spearmanr'])
    print('RMSE = {:.2f}'.format(test_score['rmse']))
    print('PEARSONR = {:.4f}'.format(test_score['pearsonr']))
    print('SPEARMANR = {:.4f}'.format(test_score['spearmanr']))

.. parsed-literal::
    :class: output

    RMSE = 0.76
    PEARSONR = 0.8702
    SPEARMANR = 0.8696

Let's use our model to predict the similarity score between a few sentences.

.. code:: python

    sentences = ['The child is riding a horse.',
                 'The young boy is riding a horse.',
                 'The young man is riding a horse.',
                 'The young man is riding a bicycle.']

    score1 = predictor_sts.predict({'sentence1': [sentences[0]],
                                    'sentence2': [sentences[1]]}, as_pandas=False)

    score2 = predictor_sts.predict({'sentence1': [sentences[0]],
                                    'sentence2': [sentences[2]]}, as_pandas=False)

    score3 = predictor_sts.predict({'sentence1': [sentences[0]],
                                    'sentence2': [sentences[3]]}, as_pandas=False)
    print(score1, score2, score3)

.. parsed-literal::
    :class: output

    [4.0398216] [3.2965467] [1.2349687]

Although the ``TextPredictor`` is only designed for classification and regression tasks, it can directly be used for many NLP tasks if you properly format them into a data table. Note that there can be many text columns in this data table. Refer to the `TextPredictor documentation <../../api/autogluon.predictor.html#autogluon.text.TextPredictor.fit>`__ to see all of the available methods/options.

Unlike ``TabularPredictor``, which trains and ensembles many different kinds of models, ``TextPredictor`` fits only Transformer neural network models. These are fit to your data via transfer learning from pretrained NLP models like `BERT `__, `ALBERT `__, and `ELECTRA `__. ``TextPredictor`` also enables training on multi-modal data tables that contain text, numeric, and categorical columns, and the neural network hyperparameters can be automatically tuned with Hyperparameter Optimization (HPO), which is introduced in other tutorials.

**Note:** ``TextPredictor`` depends on the `GluonNLP `__ package. Due to an ongoing upgrade of GluonNLP, we are currently using a custom version of the package: `autogluon-contrib-nlp `__. In a future release, AutoGluon will support the official GluonNLP 1.0, but the APIs demonstrated here will remain the same.
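As a rough illustration of formatting another NLP task as a data table with several text columns, the sketch below builds a small paraphrase-detection table by hand. The dataset, the column names (``question1``, ``question2``, ``is_duplicate``), and the variable names are all hypothetical and are not part of this tutorial's data.

.. code:: python

    import pandas as pd

    # Hypothetical paraphrase-detection data: two text columns and one label column.
    paraphrase_data = pd.DataFrame({
        'question1': ['How do I learn Python?', 'What is the capital of France?'],
        'question2': ['What is the best way to learn Python?', 'How tall is the Eiffel Tower?'],
        'is_duplicate': [1, 0],
    })

    # Every column other than the label is treated here as a text field.
    text_columns = [c for c in paraphrase_data.columns if c != 'is_duplicate']
    print(text_columns)
    print(paraphrase_data.shape)

    # A table like this could then be used in the same way as the examples above, e.g.
    # predictor = TextPredictor(label='is_duplicate')
    # predictor.fit(paraphrase_data)

Each row is one training example, as in the sentiment and similarity tables, and only the name of the label column needs to be given to the predictor.
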