Text Prediction - Heterogeneous Data Types

In many applications, text data is mixed with other common data types, such as the numerical and categorical columns typically found in tabular data. The TextPrediction task in AutoGluon can train a single neural network that jointly operates on multiple feature types, including text, categorical, and numerical columns. Here we’ll again use the Semantic Textual Similarity (STS) dataset to illustrate this functionality.

import numpy as np
import warnings
warnings.filterwarnings('ignore')
np.random.seed(123)

Load Data

from autogluon.utils.tabular.utils.loaders import load_pd

train_data = load_pd.load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/train.parquet')
dev_data = load_pd.load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/dev.parquet')
train_data.head(10)
Loaded data from: https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/train.parquet | Columns = 4 / 4 | Rows = 5749 -> 5749
Loaded data from: https://autogluon-text.s3-accelerate.amazonaws.com/glue/sts/dev.parquet | Columns = 4 / 4 | Rows = 1500 -> 1500
  | sentence1 | sentence2 | genre | score
0 | A plane is taking off. | An air plane is taking off. | main-captions | 5.00
1 | A man is playing a large flute. | A man is playing a flute. | main-captions | 3.80
2 | A man is spreading shreded cheese on a pizza. | A man is spreading shredded cheese on an uncoo... | main-captions | 3.80
3 | Three men are playing chess. | Two men are playing chess. | main-captions | 2.60
4 | A man is playing the cello. | A man seated is playing the cello. | main-captions | 4.25
5 | Some men are fighting. | Two men are fighting. | main-captions | 4.25
6 | A man is smoking. | A man is skating. | main-captions | 0.50
7 | The man is playing the piano. | The man is playing the guitar. | main-captions | 1.60
8 | A man is playing on a guitar and singing. | A woman is playing an acoustic guitar and sing... | main-captions | 2.20
9 | A person is throwing a cat on to the ceiling. | A person throws a cat on the ceiling. | main-captions | 5.00

Note that the STS dataset contains two text fields (sentence1 and sentence2), one categorical field (genre), and one numerical field (score). Let’s try to predict the score based on the other features: sentence1, sentence2, and genre.
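To make the distinction between text, categorical, and numerical columns concrete, here is a minimal sketch of how one might guess a column's type from a few sample values. The function `guess_column_type` and its `max_unique_ratio` threshold are hypothetical and are not AutoGluon's actual inference logic; the sample values are five untruncated rows from the table above.

```python
def guess_column_type(values, max_unique_ratio=0.5):
    """Guess a coarse column type from sample values.

    Hypothetical heuristic for illustration; not AutoGluon's inference logic.
    """
    if not values:
        raise ValueError("need at least one value to guess a column type")
    # bool is a subclass of int, so exclude it explicitly.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "numerical"
    # Strings that repeat often behave like category labels;
    # mostly unique strings are treated as free text.
    unique_ratio = len(set(values)) / len(values)
    return "categorical" if unique_ratio <= max_unique_ratio else "text"


# Rows 0, 1, 3, 4 and 5 of the STS training data.
sample = {
    "sentence1": [
        "A plane is taking off.",
        "A man is playing a large flute.",
        "Three men are playing chess.",
        "A man is playing the cello.",
        "Some men are fighting.",
    ],
    "sentence2": [
        "An air plane is taking off.",
        "A man is playing a flute.",
        "Two men are playing chess.",
        "A man seated is playing the cello.",
        "Two men are fighting.",
    ],
    "genre": ["main-captions"] * 5,
    "score": [5.00, 3.80, 2.60, 4.25, 4.25],
}

column_types = {name: guess_column_type(vals) for name, vals in sample.items()}
print(column_types)
```

With so few rows the heuristic is fragile (a short text column with repeated sentences would be misread as categorical), so treat it only as an illustration of the three feature types.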

import autogluon as ag
from autogluon import TextPrediction as task

predictor_score = task.fit(train_data, label='score',
                           time_limits=60, ngpus_per_trial=1, seed=123,
                           output_directory='./ag_sts_mixed_score')
NumPy-shape semantics has been activated in your code. This is required for creating and manipulating scalar and zero-size tensors, which were not supported in MXNet before, as in the official NumPy library. Please DO NOT manually deactivate this semantics while using mxnet.numpy and mxnet.numpy_extension modules.
2020-12-08 20:42:27,127 - root - INFO - All Logs will be saved to ./ag_sts_mixed_score/ag_text_prediction.log
2020-12-08 20:42:27,151 - root - INFO - Train Dataset:
2020-12-08 20:42:27,152 - root - INFO - Columns:

- Text(
   name="sentence1"
   #total/missing=4599/0
   length, min/avg/max=16/57.93/367
)
- Text(
   name="sentence2"
   #total/missing=4599/0
   length, min/avg/max=15/57.63/311
)
- Categorical(
   name="genre"
   #total/missing=4599/0
   num_class (total/non_special)=4/3
   categories=['main-captions', 'main-forums', 'main-news']
   freq=[1612, 358, 2629]
)
- Numerical(
   name="score"
   #total/missing=4599/0
   shape=()
)


2020-12-08 20:42:27,152 - root - INFO - Tuning Dataset:
2020-12-08 20:42:27,153 - root - INFO - Columns:

- Text(
   name="sentence1"
   #total/missing=1150/0
   length, min/avg/max=16/56.84/272
)
- Text(
   name="sentence2"
   #total/missing=1150/0
   length, min/avg/max=16/57.15/249
)
- Categorical(
   name="genre"
   #total/missing=1150/0
   num_class (total/non_special)=4/3
   categories=['main-captions', 'main-forums', 'main-news']
   freq=[388, 92, 670]
)
- Numerical(
   name="score"
   #total/missing=1150/0
   shape=()
)


2020-12-08 20:42:27,154 - root - INFO - Label columns=['score'], Feature columns=['sentence1', 'sentence2', 'genre'], Problem types=['regression'], Label shapes=[()]
2020-12-08 20:42:27,154 - root - INFO - Eval Metric=mse, Stop Metric=mse, Log Metrics=['mse', 'rmse', 'mae']
 55%|█████▍    | 314/576 [01:01<00:51,  5.08it/s]
score = predictor_score.evaluate(dev_data, metrics='spearmanr')
print('Spearman Correlation=', score['spearmanr'])
Spearman Correlation= 0.855421553987981
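For readers unfamiliar with the metric, Spearman correlation is the Pearson correlation computed on the ranks of the values rather than on the values themselves, so it measures how well the predicted ordering matches the true ordering. Below is a minimal pure-Python sketch of that definition. The helper names and the example numbers are hypothetical and are not taken from the run above; in practice a library routine such as `scipy.stats.spearmanr` would normally be used.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of equal values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical labels and predictions, for illustration only.
true_scores = [5.0, 3.8, 2.6, 0.5, 1.6]
predicted = [4.1, 4.3, 2.0, 0.9, 1.2]
rho = spearman(true_scores, predicted)
print(round(rho, 3))
```

Because only the ordering matters, a model can score well on this metric even if its predicted values are systematically shifted or rescaled relative to the true scores.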

We can also train a model that predicts the genre, using the other columns (sentence1, sentence2, and score) as features.

predictor_genre = task.fit(train_data, label='genre',
                           time_limits=60, ngpus_per_trial=1, seed=123,
                           output_directory='./ag_sts_mixed_genre')
2020-12-08 20:45:07,390 - root - INFO - All Logs will be saved to ./ag_sts_mixed_genre/ag_text_prediction.log
2020-12-08 20:45:07,416 - root - INFO - Train Dataset:
2020-12-08 20:45:07,417 - root - INFO - Columns:

- Text(
   name="sentence1"
   #total/missing=4599/0
   length, min/avg/max=16/57.82/367
)
- Text(
   name="sentence2"
   #total/missing=4599/0
   length, min/avg/max=15/57.57/311
)
- Categorical(
   name="genre"
   #total/missing=4599/0
   num_class (total/non_special)=3/3
   categories=['main-captions', 'main-forums', 'main-news']
   freq=[1598, 362, 2639]
)
- Numerical(
   name="score"
   #total/missing=4599/0
   shape=()
)


2020-12-08 20:45:07,417 - root - INFO - Tuning Dataset:
2020-12-08 20:45:07,418 - root - INFO - Columns:

- Text(
   name="sentence1"
   #total/missing=1150/0
   length, min/avg/max=16/57.27/260
)
- Text(
   name="sentence2"
   #total/missing=1150/0
   length, min/avg/max=16/57.39/256
)
- Categorical(
   name="genre"
   #total/missing=1150/0
   num_class (total/non_special)=3/3
   categories=['main-captions', 'main-forums', 'main-news']
   freq=[402, 88, 660]
)
- Numerical(
   name="score"
   #total/missing=1150/0
   shape=()
)


2020-12-08 20:45:07,418 - root - INFO - Label columns=['genre'], Feature columns=['sentence1', 'sentence2', 'score'], Problem types=['classification'], Label shapes=[3]
2020-12-08 20:45:07,419 - root - INFO - Eval Metric=acc, Stop Metric=acc, Log Metrics=['acc', 'nll']
 42%|████▏     | 243/576 [00:46<01:13,  4.51it/s]
 42%|████▏     | 244/576 [00:46<01:02,  5.32it/s]
 43%|████▎     | 245/576 [00:47<00:56,  5.90it/s]
 43%|████▎     | 246/576 [00:47<00:55,  5.96it/s]
 43%|████▎     | 247/576 [00:47<00:50,  6.47it/s]
 43%|████▎     | 248/576 [00:47<00:45,  7.23it/s]
 43%|████▎     | 249/576 [00:47<00:43,  7.60it/s]
 43%|████▎     | 250/576 [00:47<00:41,  7.86it/s]
 44%|████▎     | 251/576 [00:47<00:42,  7.71it/s]
 44%|████▍     | 252/576 [00:47<00:39,  8.25it/s]
 44%|████▍     | 253/576 [00:48<00:38,  8.47it/s]
 44%|████▍     | 254/576 [00:48<00:41,  7.82it/s]
 44%|████▍     | 255/576 [00:49<02:21,  2.26it/s]
 44%|████▍     | 256/576 [00:49<01:50,  2.91it/s]
 45%|████▍     | 257/576 [00:49<01:27,  3.66it/s]
 45%|████▍     | 258/576 [00:49<01:11,  4.46it/s]
 45%|████▍     | 259/576 [00:49<01:00,  5.26it/s]
 45%|████▌     | 260/576 [00:49<00:54,  5.82it/s]
 45%|████▌     | 261/576 [00:50<00:49,  6.42it/s]
 45%|████▌     | 262/576 [00:50<00:45,  6.91it/s]
 46%|████▌     | 263/576 [00:50<00:42,  7.32it/s]
 46%|████▌     | 264/576 [00:50<00:40,  7.75it/s]
 46%|████▌     | 265/576 [00:50<00:38,  8.08it/s]
 46%|████▌     | 266/576 [00:50<00:37,  8.21it/s]
 46%|████▋     | 267/576 [00:50<00:36,  8.37it/s]
 47%|████▋     | 268/576 [00:50<00:36,  8.49it/s]
 47%|████▋     | 269/576 [00:51<00:36,  8.49it/s]
 47%|████▋     | 270/576 [00:52<02:13,  2.30it/s]
 47%|████▋     | 271/576 [00:52<01:43,  2.94it/s]
 47%|████▋     | 272/576 [00:52<01:22,  3.67it/s]
 47%|████▋     | 273/576 [00:52<01:10,  4.30it/s]
 48%|████▊     | 274/576 [00:52<01:00,  5.01it/s]
 48%|████▊     | 275/576 [00:52<00:52,  5.78it/s]
 48%|████▊     | 276/576 [00:52<00:46,  6.45it/s]
 48%|████▊     | 277/576 [00:53<00:42,  7.04it/s]
 48%|████▊     | 278/576 [00:53<00:40,  7.34it/s]
 48%|████▊     | 279/576 [00:53<00:39,  7.53it/s]
 49%|████▊     | 280/576 [00:53<00:39,  7.44it/s]
 49%|████▉     | 281/576 [00:53<00:38,  7.65it/s]
 49%|████▉     | 282/576 [00:53<00:37,  7.89it/s]
 49%|████▉     | 283/576 [00:53<00:35,  8.24it/s]
 49%|████▉     | 284/576 [00:53<00:35,  8.20it/s]
 49%|████▉     | 285/576 [00:55<02:06,  2.31it/s]
 50%|████▉     | 286/576 [00:55<01:37,  2.97it/s]
 50%|████▉     | 287/576 [00:55<01:18,  3.70it/s]
 50%|█████     | 288/576 [00:55<01:05,  4.41it/s]
 50%|█████     | 289/576 [00:55<00:54,  5.23it/s]
 50%|█████     | 290/576 [00:55<00:46,  6.10it/s]
 51%|█████     | 291/576 [00:55<00:42,  6.73it/s]
 51%|█████     | 292/576 [00:55<00:40,  7.08it/s]
 51%|█████     | 293/576 [00:55<00:37,  7.56it/s]
 51%|█████     | 294/576 [00:56<00:36,  7.78it/s]
 51%|█████     | 295/576 [00:56<00:35,  8.00it/s]
 51%|█████▏    | 296/576 [00:56<00:35,  7.94it/s]
 52%|█████▏    | 297/576 [00:56<00:35,  7.80it/s]
 52%|█████▏    | 299/576 [00:56<00:33,  8.39it/s]
 52%|█████▏    | 300/576 [00:57<01:58,  2.33it/s]
 52%|█████▏    | 301/576 [00:57<01:31,  3.00it/s]
 52%|█████▏    | 302/576 [00:58<01:15,  3.61it/s]
 53%|█████▎    | 303/576 [00:58<01:02,  4.37it/s]
 53%|█████▎    | 304/576 [00:58<00:53,  5.11it/s]
 53%|█████▎    | 305/576 [00:58<00:47,  5.75it/s]
 53%|█████▎    | 306/576 [00:58<00:41,  6.51it/s]
 53%|█████▎    | 308/576 [00:58<00:37,  7.20it/s]
 54%|█████▎    | 309/576 [00:58<00:34,  7.70it/s]
 54%|█████▍    | 310/576 [00:58<00:32,  8.25it/s]
 54%|█████▍    | 311/576 [00:59<00:31,  8.33it/s]
 54%|█████▍    | 312/576 [00:59<00:30,  8.67it/s]
 55%|█████▍    | 314/576 [01:00<00:50,  5.18it/s]
 55%|█████▍    | 314/576 [01:01<00:51,  5.09it/s]
score = predictor_genre.evaluate(dev_data, metrics='acc')
print('Genre-prediction Accuracy = {}%'.format(score['acc'] * 100))
Genre-prediction Accuracy = 88.4%
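
To make the reported number concrete, here is a minimal sketch of what an accuracy score measures: the fraction of rows whose predicted genre matches the true genre. The `accuracy` helper and the toy label lists are hypothetical and written only for illustration (`other-genre` is a placeholder label); this is not AutoGluon's internal implementation of the `'acc'` metric.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of positions where the prediction equals the label."""
    if len(y_true) != len(y_pred):
        raise ValueError('y_true and y_pred must have the same length')
    if len(y_true) == 0:
        raise ValueError('cannot compute accuracy on empty inputs')
    # Count exact matches between each true label and its prediction.
    n_correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return n_correct / len(y_true)


# Toy labels, made up for illustration; they are not taken from dev_data.
toy_true = ['main-captions', 'other-genre', 'other-genre', 'main-captions', 'main-captions']
toy_pred = ['main-captions', 'other-genre', 'main-captions', 'main-captions', 'main-captions']

toy_acc = accuracy(toy_true, toy_pred)
print('Toy accuracy = {:.1f}%'.format(toy_acc * 100))
```

Under this definition, a score of 88.4% on the 1500-row dev set would correspond to 1326 correctly labelled rows, assuming every row carries equal weight.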