Image Prediction - Quick Start

In this quick start, we use image classification to illustrate AutoGluon’s APIs. The tutorial shows how to load images and their labels into AutoGluon and use this data to train a neural network that can classify new images. In a traditional machine learning workflow, we would define the neural network manually and then specify the hyperparameters for training. With AutoGluon, a single call to the fit function can train many models with different hyperparameter configurations and return the one that achieved the highest accuracy.

import autogluon.core as ag
from autogluon.vision import ImagePredictor

Create Image Dataset

For demonstration purposes, we use a subset of the Shopee-IET dataset from Kaggle. Each image in this data depicts a clothing item and the corresponding label specifies its clothing category. Our subset of the data contains the following possible labels: BabyPants, BabyShirt, womencasualshoes, womenchiffontop.

We can load the dataset directly from a URL; the data is downloaded automatically:

train_dataset, _, test_dataset = ImagePredictor.Dataset.from_folders('https://autogluon.s3.amazonaws.com/datasets/shopee-iet.zip')
print(train_dataset)
data/
├── test/
└── train/
                                                 image  label
0    /var/lib/jenkins/.gluoncv/datasets/shopee-iet/...      0
1    /var/lib/jenkins/.gluoncv/datasets/shopee-iet/...      0
2    /var/lib/jenkins/.gluoncv/datasets/shopee-iet/...      0
3    /var/lib/jenkins/.gluoncv/datasets/shopee-iet/...      0
4    /var/lib/jenkins/.gluoncv/datasets/shopee-iet/...      0
..                                                 ...    ...
795  /var/lib/jenkins/.gluoncv/datasets/shopee-iet/...      3
796  /var/lib/jenkins/.gluoncv/datasets/shopee-iet/...      3
797  /var/lib/jenkins/.gluoncv/datasets/shopee-iet/...      3
798  /var/lib/jenkins/.gluoncv/datasets/shopee-iet/...      3
799  /var/lib/jenkins/.gluoncv/datasets/shopee-iet/...      3

[800 rows x 2 columns]
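The printed table pairs each image path with an integer label. As a rough illustration of how such a table can be derived from images on disk, here is a minimal sketch that assumes one sub-folder per class and assigns label ids in sorted folder order. The helper `index_image_folder` and the file names are hypothetical; this is not AutoGluon's loader, only a standard-library approximation of the idea.

```python
import os
import tempfile


def index_image_folder(root):
    """Return (rows, classes); rows are (image_path, label_id) pairs."""
    # Each sub-folder name is treated as a class; ids follow sorted order.
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    rows = []
    for label_id, name in enumerate(classes):
        class_dir = os.path.join(root, name)
        for fname in sorted(os.listdir(class_dir)):
            rows.append((os.path.join(class_dir, fname), label_id))
    return rows, classes


# Build a tiny throwaway tree with empty placeholder files (hypothetical names).
with tempfile.TemporaryDirectory() as root:
    for name in ["BabyPants", "BabyShirt"]:
        os.makedirs(os.path.join(root, name))
        for i in range(2):
            open(os.path.join(root, name, f"img_{i}.jpg"), "w").close()
    rows, classes = index_image_folder(root)

print(classes)
for path, label in rows:
    print(os.path.basename(path), label)
```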

Use AutoGluon to Fit Models

Now, we fit a classifier using AutoGluon as follows:

predictor = ImagePredictor()
# The original dataset has no validation split, so `fit` splits it randomly with a 90/10 ratio
predictor.fit(train_dataset, hyperparameters={'epochs': 2})  # the default config is a reasonable choice; we only reduce the number of epochs to save build time
INFO:root:time_limit=auto set to time_limit=7200.
INFO:root:Reset labels to [0, 1, 2, 3]
WARNING:gluoncv.auto.tasks.image_classification:The number of requested GPUs is greater than the number of available GPUs.Reduce the number to 1
INFO:gluoncv.auto.tasks.image_classification:Randomly split train_data into train[732]/validation[68] splits.
INFO:gluoncv.auto.tasks.image_classification:Starting fit without HPO
INFO:ImageClassificationEstimator:modified configs(<old> != <new>): {
INFO:ImageClassificationEstimator:root.train.rec_val_idx ~/.mxnet/datasets/imagenet/rec/val.idx != auto
INFO:ImageClassificationEstimator:root.train.early_stop_baseline 0.0 != -inf
INFO:ImageClassificationEstimator:root.train.lr        0.1 != 0.01
INFO:ImageClassificationEstimator:root.train.num_workers 4 != 8
INFO:ImageClassificationEstimator:root.train.batch_size 128 != 16
INFO:ImageClassificationEstimator:root.train.early_stop_max_value 1.0 != inf
INFO:ImageClassificationEstimator:root.train.num_training_samples 1281167 != -1
INFO:ImageClassificationEstimator:root.train.rec_val   ~/.mxnet/datasets/imagenet/rec/val.rec != auto
INFO:ImageClassificationEstimator:root.train.early_stop_patience -1 != 10
INFO:ImageClassificationEstimator:root.train.epochs    10 != 2
INFO:ImageClassificationEstimator:root.train.data_dir  ~/.mxnet/datasets/imagenet != auto
INFO:ImageClassificationEstimator:root.train.rec_train ~/.mxnet/datasets/imagenet/rec/train.rec != auto
INFO:ImageClassificationEstimator:root.train.rec_train_idx ~/.mxnet/datasets/imagenet/rec/train.idx != auto
INFO:ImageClassificationEstimator:root.img_cls.model   resnet50_v1 != resnet50_v1b
INFO:ImageClassificationEstimator:root.valid.batch_size 128 != 16
INFO:ImageClassificationEstimator:root.valid.num_workers 4 != 8
INFO:ImageClassificationEstimator:}
INFO:ImageClassificationEstimator:Saved config to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-image-classification-v3/docs/_build/eval/tutorials/image_prediction/c0962c8d/.trial_0/config.yaml
INFO:ImageClassificationEstimator:Start training from [Epoch 0]
INFO:ImageClassificationEstimator:Epoch[0] Batch [49]       Speed: 100.098477 samples/sec   accuracy=0.505000       lr=0.010000
INFO:ImageClassificationEstimator:[Epoch 0] training: accuracy=0.505000
INFO:ImageClassificationEstimator:[Epoch 0] speed: 98 samples/sec   time cost: 10.878299
INFO:ImageClassificationEstimator:[Epoch 0] validation: top1=0.760000 top5=1.000000
INFO:ImageClassificationEstimator:[Epoch 0] Current best top-1: 0.760000 vs previous 0.000000, saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-image-classification-v3/docs/_build/eval/tutorials/image_prediction/c0962c8d/.trial_0/best_checkpoint.pkl
INFO:ImageClassificationEstimator:Epoch[1] Batch [49]       Speed: 102.878078 samples/sec   accuracy=0.705000       lr=0.010000
INFO:ImageClassificationEstimator:[Epoch 1] training: accuracy=0.705000
INFO:ImageClassificationEstimator:[Epoch 1] speed: 100 samples/sec  time cost: 10.651336
INFO:ImageClassificationEstimator:[Epoch 1] validation: top1=0.846250 top5=1.000000
INFO:ImageClassificationEstimator:[Epoch 1] Current best top-1: 0.846250 vs previous 0.760000, saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-image-classification-v3/docs/_build/eval/tutorials/image_prediction/c0962c8d/.trial_0/best_checkpoint.pkl
INFO:ImageClassificationEstimator:Applying the state from the best checkpoint...
INFO:gluoncv.auto.tasks.image_classification:Finished, total runtime is 32.16 s
INFO:gluoncv.auto.tasks.image_classification:{ 'best_config': { 'batch_size': 16,
                   'dist_ip_addrs': None,
                   'early_stop_baseline': -inf,
                   'early_stop_max_value': inf,
                   'early_stop_patience': 10,
                   'epochs': 2,
                   'estimator': <class 'gluoncv.auto.estimators.image_classification.image_classification.ImageClassificationEstimator'>,
                   'final_fit': False,
                   'gpus': [0],
                   'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-image-classification-v3/docs/_build/eval/tutorials/image_prediction/c0962c8d',
                   'lr': 0.01,
                   'model': 'resnet50_v1b',
                   'ngpus_per_trial': 8,
                   'nthreads_per_trial': 128,
                   'num_trials': 1,
                   'num_workers': 8,
                   'scheduler': 'local',
                   'search_strategy': 'random',
                   'searcher': 'random',
                   'seed': 751,
                   'time_limits': 7200,
                   'wall_clock_tick': 1619664987.3925514},
  'total_time': 23.368402242660522,
  'train_acc': 0.705,
  'valid_acc': 0.84625}
<autogluon.vision.predictor.predictor.ImagePredictor at 0x7fdd07486b10>

Within fit, the dataset is automatically split into training and validation sets. The model with the best hyperparameter configuration is selected based on its performance on the validation set. The best model is finally retrained on our entire dataset (i.e., merging training+validation) using the best configuration.
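The log above reports a train[732]/validation[68] split rather than exactly 720/80. One way such sizes could arise is if each row is assigned to a split independently at random, so that 90/10 holds only on average. The sketch below illustrates that idea; whether the library splits this way is an assumption, and `random_split` is a hypothetical helper, not an AutoGluon function.

```python
import random


def random_split(num_rows, train_fraction=0.9, seed=0):
    """Assign each row index to train or validation with an independent draw."""
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for i in range(num_rows):
        # Each row lands in the training set with probability train_fraction.
        if rng.random() < train_fraction:
            train_idx.append(i)
        else:
            val_idx.append(i)
    return train_idx, val_idx


train_idx, val_idx = random_split(800)
# The exact sizes depend on the seed; together they always cover all 800 rows.
print(len(train_idx), len(val_idx))
```

If an exact 720/80 split were required, shuffling the indices once and slicing at a fixed position would be the usual alternative.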

The best top-1 accuracy achieved on the training and validation sets is as follows:

fit_result = predictor.fit_summary()
print('Top-1 train acc: %.3f, val acc: %.3f' %(fit_result['train_acc'], fit_result['valid_acc']))
Top-1 train acc: 0.705, val acc: 0.846
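The training log also reports a top5 value of 1.000000 in every epoch. That is expected with only four classes, because the true label is always among the five highest-scoring classes. The following sketch computes top-k accuracy by hand; the scores and labels are made up for illustration, and `top_k_accuracy` is a hypothetical helper rather than part of AutoGluon.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of rows whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class ids ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Made-up scores for three images over four classes.
scores = [
    [0.10, 0.78, 0.05, 0.07],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.15, 0.20, 0.60],
]
labels = [1, 1, 3]

top1 = top_k_accuracy(scores, labels, k=1)
top5 = top_k_accuracy(scores, labels, k=5)
print(top1, top5)
```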

Predict on a New Image

Given an example image, we can use the final model to predict its label:

image_path = test_dataset.iloc[0]['image']
result = predictor.predict(image_path)
print(result)
0    1
Name: label, dtype: int64

If probabilities of all categories are needed, you can call predict_proba:

proba = predictor.predict_proba(image_path)
print(proba)
          0         1         2         3
0  0.100923  0.779423  0.052378  0.067276
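The two outputs above are consistent with each other: the label returned by predict (1) is the class with the highest probability in the predict_proba row. A minimal check using the numbers printed above, assuming the predicted label is simply the arg-max of the class probabilities:

```python
# Class probabilities copied from the predict_proba output above.
proba = {0: 0.100923, 1: 0.779423, 2: 0.052378, 3: 0.067276}

# The predicted label is taken to be the class with the largest probability.
predicted_label = max(proba, key=proba.get)

# The probabilities over all classes should sum to (approximately) one.
total = sum(proba.values())

print(predicted_label)
print(round(total, 6))
```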

You can also pass in multiple images at once. Here we use the images in the test dataset as an example:

bulk_result = predictor.predict(test_dataset)
print(bulk_result)
0     1
1     1
2     1
3     2
4     1
     ..
75    3
76    1
77    3
78    3
79    3
Name: label, Length: 80, dtype: int64

The bulk result contains one row per input image. In the output above, it is a pandas Series holding the predicted class id for each of the 80 test images. In some AutoGluon versions the result may instead be a table with class, score, id and image columns, giving the predicted class, the prediction confidence, the class id, and the image path respectively.
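Integer class ids are easier to read once they are mapped back to category names. The sketch below does this for the ten predictions visible in the printed bulk result. It assumes that class ids follow the order in which the labels are listed earlier in this tutorial; that mapping is an assumption and should be checked against the dataset before relying on it.

```python
from collections import Counter

# Assumed mapping: class id i corresponds to the i-th label listed in the tutorial.
class_names = ["BabyPants", "BabyShirt", "womencasualshoes", "womenchiffontop"]

# The ten predictions visible in the printed bulk result (rows 0-4 and 75-79).
visible_predictions = [1, 1, 1, 2, 1, 3, 1, 3, 3, 3]

# Translate ids to names and count how often each category was predicted.
named = [class_names[i] for i in visible_predictions]
counts = Counter(named)
print(counts.most_common())
```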

Generate Image Features with a Classifier

Extracting the representation of a whole image learned by a model is also very useful. The predict_feature function lets the predictor return an N-dimensional image feature, where N depends on the model (usually a vector of length 512 to 2048):

image_path = test_dataset.iloc[0]['image']
feature = predictor.predict_feature(image_path)
print(feature)
                                       image_feature
0  [0.28035307, 0.018974816, 0.46575105, 1.486518...
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-image-classification-v3/venv/lib/python3.7/site-packages/mxnet/gluon/block.py:682: UserWarning: Parameter dense2_weight, dense2_bias is not used by any computation. Is this intended?
  out = self.forward(*args)
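One common use of such feature vectors is comparing images: similar images tend to have feature vectors that point in similar directions. The sketch below computes cosine similarity with the standard library only. The short vectors are made up and stand in for real features, and `cosine_similarity` is a hypothetical helper, not an AutoGluon function.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Short made-up vectors stand in for real 512- to 2048-dimensional features.
feature_a = [0.28, 0.02, 0.47, 1.49]
feature_b = [0.30, 0.01, 0.45, 1.50]
feature_c = [1.20, 0.90, 0.05, 0.02]

sim_ab = cosine_similarity(feature_a, feature_b)
sim_ac = cosine_similarity(feature_a, feature_c)
print(sim_ab, sim_ac)
```

Here feature_a and feature_b are nearly parallel, so their similarity is close to 1, while feature_c points in a different direction and scores much lower.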

Evaluate on Test Dataset

Instead of retrieving individual predictions, you can evaluate the classifier directly on a test dataset.

The test top-1 accuracy is:

test_acc, _ = predictor.evaluate(test_dataset)
print('Top-1 test acc: %.3f' % test_acc)
Top-1 test acc: 0.838

Save and Load Classifiers

You can save a predictor instance directly to a file and load it back later:

filename = 'predictor.ag'
predictor.save(filename)
predictor_loaded = ImagePredictor.load(filename)
# use predictor_loaded as usual
result = predictor_loaded.predict(image_path)
print(result)
0    1
Name: label, dtype: int64
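The property worth checking after a save/load round trip is that the loaded object predicts exactly what the original did, as the matching output above suggests. The sketch below shows that contract with a toy class. `ToyPredictor` and its JSON file format are hypothetical stand-ins and say nothing about how AutoGluon stores a predictor on disk.

```python
import json
import os
import tempfile


class ToyPredictor:
    """Hypothetical stand-in for a trained model: always predicts one label."""

    def __init__(self, label):
        self.label = label

    def predict(self, image_path):
        return self.label

    def save(self, filename):
        # Persist only the state needed to reproduce predictions.
        with open(filename, "w") as f:
            json.dump({"label": self.label}, f)

    @classmethod
    def load(cls, filename):
        with open(filename) as f:
            state = json.load(f)
        return cls(label=state["label"])


original = ToyPredictor(label=1)

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "toy_predictor.json")
    original.save(filename)
    restored = ToyPredictor.load(filename)

# The restored object should behave exactly like the original.
same = restored.predict("any.jpg") == original.predict("any.jpg")
print(same)
```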