AutoMM Detection - Quick Start on a Tiny COCO Format Dataset¶
In this section, our goal is to quickly finetune a pretrained model on a small dataset in COCO format and evaluate it on the test set. Both the training and test sets are in COCO format. See Convert Data to COCO Format for how to convert other datasets to COCO format.
Setting up the imports¶
To start, let’s import MultiModalPredictor:
from autogluon.multimodal import MultiModalPredictor
Make sure mmcv-full and mmdet are installed:
!mim install mmcv-full
!pip install mmdet
Looking in links: https://download.openmmlab.com/mmcv/dist/cu102/torch1.12.0/index.html
Requirement already satisfied: mmcv-full in /home/ci/opt/venv/lib/python3.8/site-packages (1.7.0)
Requirement already satisfied: packaging in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (22.0)
Requirement already satisfied: pyyaml in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (5.4.1)
Requirement already satisfied: opencv-python>=3 in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (4.6.0.66)
Requirement already satisfied: numpy in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (1.22.4)
Requirement already satisfied: yapf in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (0.32.0)
Requirement already satisfied: Pillow in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (9.3.0)
Requirement already satisfied: addict in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (2.4.0)
Requirement already satisfied: mmdet in /home/ci/opt/venv/lib/python3.8/site-packages (2.26.0)
Requirement already satisfied: six in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (1.16.0)
Requirement already satisfied: terminaltables in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (3.1.10)
Requirement already satisfied: pycocotools in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (2.0.6)
Requirement already satisfied: scipy in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (1.8.1)
Requirement already satisfied: matplotlib in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (3.6.2)
Requirement already satisfied: numpy in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (1.22.4)
Requirement already satisfied: contourpy>=1.0.1 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (1.0.6)
Requirement already satisfied: python-dateutil>=2.7 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (2.8.2)
Requirement already satisfied: cycler>=0.10 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (0.11.0)
Requirement already satisfied: pyparsing>=2.2.1 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (3.0.9)
Requirement already satisfied: pillow>=6.2.0 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (9.3.0)
Requirement already satisfied: packaging>=20.0 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (22.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (1.4.4)
Requirement already satisfied: fonttools>=4.22.0 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (4.38.0)
Also import some other packages that will be used in this tutorial:
import os
import time
from autogluon.core.utils.loaders import load_zip
Downloading Data¶
We have the sample dataset ready in the cloud. Let’s download it:
zip_file = "https://automl-mm-bench.s3.amazonaws.com/object_detection_dataset/tiny_motorbike_coco.zip"
download_dir = "./tiny_motorbike_coco"
load_zip.unzip(zip_file, unzip_dir=download_dir)
data_dir = os.path.join(download_dir, "tiny_motorbike")
train_path = os.path.join(data_dir, "Annotations", "trainval_cocoformat.json")
test_path = os.path.join(data_dir, "Annotations", "test_cocoformat.json")
Downloading ./tiny_motorbike_coco/file.zip from https://automl-mm-bench.s3.amazonaws.com/object_detection_dataset/tiny_motorbike_coco.zip...
100%|██████████| 21.8M/21.8M [00:00<00:00, 59.3MiB/s]
When using a COCO format dataset, the input is the JSON annotation file of the dataset split. In this example, trainval_cocoformat.json is the annotation file of the train-and-validate split, and test_cocoformat.json is the annotation file of the test split.
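For orientation, a COCO-format annotation file is a JSON document with three top-level lists: images, annotations, and categories. The sketch below builds a minimal, hypothetical example; the file name, ids, and box values are made up and are not taken from the tiny_motorbike dataset.

```python
import json

# Minimal COCO-format structure (illustrative values only).
coco = {
    "images": [
        {"id": 0, "file_name": "0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 0,
            "image_id": 0,
            "category_id": 1,
            "bbox": [100, 120, 200, 150],  # COCO boxes are [x, y, width, height]
            "area": 200 * 150,
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 1, "name": "motorbike"}],
}

# Serialize it the way a *_cocoformat.json split file would be stored on disk.
annotation_json = json.dumps(coco)
print(sorted(json.loads(annotation_json).keys()))  # → ['annotations', 'categories', 'images']
```

The sample_data_path used later in this tutorial points at exactly this kind of file, which is how the predictor infers the dataset's categories.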
Creating the MultiModalPredictor¶
We select YOLOv3 with MobileNetV2 as the backbone and an input resolution of 320x320, pretrained on the COCO dataset. With this setting, the model is fast to finetune and run inference with, and easy to deploy. We also use all available GPUs (if any):
checkpoint_name = "yolov3_mobilenetv2_320_300e_coco"
num_gpus = -1 # use all GPUs
We create the MultiModalPredictor with the selected checkpoint name and number of GPUs. We need to specify the problem_type as "object_detection", and also provide a sample_data_path for the predictor to infer the categories of the dataset. Here we provide train_path, but any other split of this dataset would also work. We also provide a path to save the predictor. If path is not specified, the predictor will be saved to an automatically generated directory with a timestamp under AutogluonModels.
# Init predictor
import uuid
model_path = f"./tmp/{uuid.uuid4().hex}-quick_start_tutorial_temp_save"
predictor = MultiModalPredictor(
hyperparameters={
"model.mmdet_image.checkpoint_name": checkpoint_name,
"env.num_gpus": num_gpus,
},
problem_type="object_detection",
sample_data_path=train_path,
path=model_path,
)
/home/ci/autogluon/multimodal/src/autogluon/multimodal/predictor.py:436: UserWarning: Running object detection. Make sure that you have installed mmdet and mmcv-full, by running 'mim install mmcv-full' and 'pip install mmdet'
warnings.warn(
processing yolov3_mobilenetv2_320_300e_coco...
Output()
Successfully downloaded yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
Successfully dumped yolov3_mobilenetv2_320_300e_coco.py to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
load checkpoint from local path: yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth
The model and loaded state dict do not match exactly
size mismatch for bbox_head.convs_pred.0.weight: copying a param with shape torch.Size([255, 96, 1, 1]) from checkpoint, the shape in current model is torch.Size([45, 96, 1, 1]).
size mismatch for bbox_head.convs_pred.0.bias: copying a param with shape torch.Size([255]) from checkpoint, the shape in current model is torch.Size([45]).
size mismatch for bbox_head.convs_pred.1.weight: copying a param with shape torch.Size([255, 96, 1, 1]) from checkpoint, the shape in current model is torch.Size([45, 96, 1, 1]).
size mismatch for bbox_head.convs_pred.1.bias: copying a param with shape torch.Size([255]) from checkpoint, the shape in current model is torch.Size([45]).
size mismatch for bbox_head.convs_pred.2.weight: copying a param with shape torch.Size([255, 96, 1, 1]) from checkpoint, the shape in current model is torch.Size([45, 96, 1, 1]).
size mismatch for bbox_head.convs_pred.2.bias: copying a param with shape torch.Size([255]) from checkpoint, the shape in current model is torch.Size([45]).
Finetuning the Model¶
We set the learning rate to 2e-4. Note that we use a two-stage learning rate option during finetuning by default, in which the model head gets a 100x learning rate. Using a two-stage learning rate with a high learning rate only on the head layers makes the model converge faster during finetuning. It usually gives better performance as well, especially on small datasets with hundreds or thousands of images. We also set max_epochs to 30 and the per-GPU batch size to 32, and we time the fit process to get a sense of the speed. We run it on a g4.2xlarge EC2 machine on AWS, and part of the command output is shown below:
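The two-stage learning rate just described can be sketched numerically. The 100x head multiplier follows the text above; the parameter-group layout here is a simplified illustration, not AutoGluon's internal implementation.

```python
# Two-stage learning rate: backbone at the base rate, detection head at 100x.
base_lr = 2e-4
head_lr_multiplier = 100  # per the tutorial text; the head converges faster this way

# Hypothetical parameter groups, as an optimizer might see them.
param_groups = [
    {"name": "backbone", "lr": base_lr},
    {"name": "detection_head", "lr": base_lr * head_lr_multiplier},
]

for group in param_groups:
    print(f"{group['name']}: lr={group['lr']:.0e}")
# backbone: lr=2e-04
# detection_head: lr=2e-02
```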
start = time.time()
# Fit
predictor.fit(
train_path,
hyperparameters={
"optimization.learning_rate": 2e-4, # we use two stage and detection head has 100x lr
"optimization.max_epochs": 30,
"env.per_gpu_batch_size": 32, # decrease it when model is large
},
)
train_end = time.time()
Global seed set to 123
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
GPU available: True (cuda), used: True TPU available: False, using: 0 TPU cores IPU available: False, using: 0 IPUs HPU available: False, using: 0 HPUs LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0] | Name | Type | Params ----------------------------------------------------------------------- 0 | model | MMDetAutoModelForObjectDetection | 3.7 M 1 | validation_metric | MeanMetric | 0 ----------------------------------------------------------------------- 3.7 M Trainable params 0 Non-trainable params 3.7 M Total params 14.706 Total estimated model params size (MB) /home/ci/opt/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1892: PossibleUserWarning: The number of training batches (5) is smaller than the logging interval Trainer(log_every_n_steps=10). Set a lower value for log_every_n_steps if you want to see logs for the training epoch. rank_zero_warn( Epoch 0, global step 1: 'val_direct_loss' reached 28263.55859 (best 28263.55859), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=0-step=1.ckpt' as top 1 Epoch 1, global step 2: 'val_direct_loss' reached 10605.58398 (best 10605.58398), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=1-step=2.ckpt' as top 1 Epoch 1, global step 3: 'val_direct_loss' reached 4444.50098 (best 4444.50098), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=1-step=3.ckpt' as top 1 Epoch 2, global step 4: 'val_direct_loss' reached 2138.37476 (best 2138.37476), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=2-step=4.ckpt' 
as top 1 Epoch 2, global step 5: 'val_direct_loss' reached 1337.25488 (best 1337.25488), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=2-step=5.ckpt' as top 1 Epoch 3, global step 6: 'val_direct_loss' reached 1239.52478 (best 1239.52478), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=3-step=6.ckpt' as top 1 Epoch 3, global step 7: 'val_direct_loss' reached 971.26068 (best 971.26068), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=3-step=7.ckpt' as top 1 Epoch 4, global step 8: 'val_direct_loss' was not in top 1 Epoch 4, global step 9: 'val_direct_loss' reached 929.80939 (best 929.80939), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=4-step=9.ckpt' as top 1 Epoch 5, global step 10: 'val_direct_loss' was not in top 1 Epoch 5, global step 11: 'val_direct_loss' was not in top 1 Epoch 6, global step 12: 'val_direct_loss' reached 919.20398 (best 919.20398), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=6-step=12.ckpt' as top 1 Epoch 6, global step 13: 'val_direct_loss' was not in top 1 Epoch 7, global step 14: 'val_direct_loss' reached 907.16553 (best 907.16553), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=7-step=14.ckpt' as top 1 Epoch 7, global step 15: 'val_direct_loss' was not in top 1 Epoch 8, 
global step 16: 'val_direct_loss' was not in top 1 Epoch 8, global step 17: 'val_direct_loss' reached 873.87000 (best 873.87000), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=8-step=17.ckpt' as top 1 Epoch 9, global step 18: 'val_direct_loss' reached 809.06348 (best 809.06348), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=9-step=18.ckpt' as top 1 Epoch 9, global step 19: 'val_direct_loss' was not in top 1 Epoch 10, global step 20: 'val_direct_loss' was not in top 1 Epoch 10, global step 21: 'val_direct_loss' was not in top 1 Epoch 11, global step 22: 'val_direct_loss' reached 790.93079 (best 790.93079), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=11-step=22.ckpt' as top 1 Epoch 11, global step 23: 'val_direct_loss' reached 757.35474 (best 757.35474), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=11-step=23.ckpt' as top 1 Epoch 12, global step 24: 'val_direct_loss' was not in top 1 Epoch 12, global step 25: 'val_direct_loss' reached 732.31500 (best 732.31500), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=12-step=25.ckpt' as top 1 Epoch 13, global step 26: 'val_direct_loss' was not in top 1 Epoch 13, global step 27: 'val_direct_loss' reached 626.65387 (best 626.65387), saving model to 
'/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=13-step=27.ckpt' as top 1 Epoch 14, global step 28: 'val_direct_loss' was not in top 1 Epoch 14, global step 29: 'val_direct_loss' reached 626.18237 (best 626.18237), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=14-step=29.ckpt' as top 1 Epoch 15, global step 30: 'val_direct_loss' was not in top 1 Epoch 15, global step 31: 'val_direct_loss' was not in top 1 Epoch 16, global step 32: 'val_direct_loss' was not in top 1 Epoch 16, global step 33: 'val_direct_loss' was not in top 1 Epoch 17, global step 34: 'val_direct_loss' was not in top 1 Epoch 17, global step 35: 'val_direct_loss' was not in top 1 Epoch 18, global step 36: 'val_direct_loss' was not in top 1 Epoch 18, global step 37: 'val_direct_loss' was not in top 1 Epoch 19, global step 38: 'val_direct_loss' was not in top 1 Epoch 19, global step 39: 'val_direct_loss' was not in top 1 Epoch 20, global step 40: 'val_direct_loss' was not in top 1 Epoch 20, global step 41: 'val_direct_loss' was not in top 1 Epoch 21, global step 42: 'val_direct_loss' was not in top 1 Epoch 21, global step 43: 'val_direct_loss' was not in top 1 Epoch 22, global step 44: 'val_direct_loss' was not in top 1 Epoch 22, global step 45: 'val_direct_loss' was not in top 1 Epoch 23, global step 46: 'val_direct_loss' was not in top 1 Epoch 23, global step 47: 'val_direct_loss' reached 600.72766 (best 600.72766), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=23-step=47.ckpt' as top 1 Epoch 24, global step 48: 'val_direct_loss' reached 568.21265 (best 568.21265), saving model to 
'/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/epoch=24-step=48.ckpt' as top 1 Epoch 24, global step 49: 'val_direct_loss' was not in top 1 Epoch 25, global step 50: 'val_direct_loss' was not in top 1 Epoch 25, global step 51: 'val_direct_loss' was not in top 1 Epoch 26, global step 52: 'val_direct_loss' was not in top 1 Epoch 26, global step 53: 'val_direct_loss' was not in top 1 Epoch 27, global step 54: 'val_direct_loss' was not in top 1 Epoch 27, global step 55: 'val_direct_loss' was not in top 1 Epoch 28, global step 56: 'val_direct_loss' was not in top 1 Epoch 28, global step 57: 'val_direct_loss' was not in top 1 Epoch 29, global step 58: 'val_direct_loss' was not in top 1 Epoch 29, global step 59: 'val_direct_loss' was not in top 1 Trainer.fit stopped: max_epochs=30 reached.
Notice that at the end of each progress bar, if the checkpoint at the current stage was saved, it prints the model's save path. In this example, that is the model_path we set when creating the predictor.
Print out the time and we can see that it’s fast!
print("This finetuning takes %.2f seconds." % (train_end - start))
This finetuning takes 141.92 seconds.
Evaluation¶
To evaluate the model we just trained, run the following code.
The evaluation results are shown in the command-line output. The first line is mAP in the COCO standard, and the second line is mAP in the VOC standard (also called mAP50). For more details about these metrics, see COCO's evaluation guideline. Note that to keep this finetune fast we train for only 30 epochs; you can get a better result on this dataset by simply increasing the number of epochs.
predictor.evaluate(test_path)
eval_end = time.time()
loading annotations into memory... Done (t=0.00s) creating index... index created! saving file at /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/a2cfefff684247a5b1890518015e5635-quick_start_tutorial_temp_save/object_detection_result_cache.json loading annotations into memory... Done (t=0.00s) creating index... index created! Loading and preparing results... DONE (t=0.01s) creating index... index created! Running per image evaluation... Evaluate annotation type bbox DONE (t=0.20s). Accumulating evaluation results... DONE (t=0.06s). Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.109 Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.287 Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.050 Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.028 Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.033 Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.300 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.095 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.169 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.180 Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.079 Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.121 Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.413
Print out the evaluation time:
print("The evaluation takes %.2f seconds." % (eval_end - train_end))
The evaluation takes 1.36 seconds.
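The IoU thresholds in the evaluation table above (0.50, 0.75, and the 0.50:0.95 average) measure how much a predicted box must overlap a ground-truth box to count as correct. A minimal IoU helper, written here for illustration rather than taken from the pycocotools implementation, makes the idea concrete:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A prediction shifted 10 px from a 100x100 ground-truth box:
print(round(iou([0, 0, 100, 100], [10, 10, 110, 110]), 3))  # → 0.681
```

An IoU of 0.681 counts as a hit at the 0.50 threshold (mAP50) but as a miss at 0.75, which is why mAP50 is always at least as high as the stricter COCO-standard mAP.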
We can load a new predictor from the previous save path (model_path), and we can also reset the number of GPUs to use if not all devices are available:
# Load and reset num_gpus
new_predictor = MultiModalPredictor.load(model_path)
new_predictor.set_num_gpus(1)
processing yolov3_mobilenetv2_320_300e_coco...
yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth exists in /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
Successfully dumped yolov3_mobilenetv2_320_300e_coco.py to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
Evaluating the new predictor gives us exactly the same result:
# Evaluate new predictor
new_predictor.evaluate(test_path)
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
WARNING:automm:A new predictor save path is created.This is to prevent you to overwrite previous predictor saved here.You could check current save path at predictor._save_path.If you still want to use this path, set resume=True
saving file at /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20221213_015239/object_detection_result_cache.json loading annotations into memory... Done (t=0.00s) creating index... index created! Loading and preparing results... DONE (t=0.01s) creating index... index created! Running per image evaluation... Evaluate annotation type bbox DONE (t=0.19s). Accumulating evaluation results... DONE (t=0.06s). Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.109 Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.287 Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.050 Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.028 Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.033 Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.300 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.095 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.169 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.180 Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.079 Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.121 Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.413
{'map': 0.10891740880495356}
If we set the validation metric to "map" (mean average precision) and max epochs to 50, the predictor achieves better performance with the same pretrained model (YOLOv3). We trained it offline and uploaded it to S3. To load it and check the result:
# Load Trained Predictor from S3
zip_file = "https://automl-mm-bench.s3.amazonaws.com/object_detection/quick_start/AP50_433.zip"
download_dir = "./AP50_433"
load_zip.unzip(zip_file, unzip_dir=download_dir)
better_predictor = MultiModalPredictor.load("./AP50_433/quick_start_tutorial_temp_save")
better_predictor.set_num_gpus(1)
# Evaluate new predictor
better_predictor.evaluate(test_path)
Downloading ./AP50_433/file.zip from https://automl-mm-bench.s3.amazonaws.com/object_detection/quick_start/AP50_433.zip...
100%|██████████| 27.8M/27.8M [00:00<00:00, 42.3MiB/s]
/home/ci/opt/venv/lib/python3.8/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator LabelEncoder from version 1.0.2 when using version 1.1.3. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
/home/ci/opt/venv/lib/python3.8/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator StandardScaler from version 1.0.2 when using version 1.1.3. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
processing yolov3_mobilenetv2_320_300e_coco...
yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth exists in /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
Successfully dumped yolov3_mobilenetv2_320_300e_coco.py to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
WARNING:automm:A new predictor save path is created.This is to prevent you to overwrite previous predictor saved here.You could check current save path at predictor._save_path.If you still want to use this path, set resume=True
saving file at /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20221213_015245/object_detection_result_cache.json loading annotations into memory... Done (t=0.00s) creating index... index created! Loading and preparing results... DONE (t=0.23s) creating index... index created! Running per image evaluation... Evaluate annotation type bbox DONE (t=0.17s). Accumulating evaluation results... DONE (t=0.05s). Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.195 Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.433 Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.135 Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.036 Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.206 Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.450 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.158 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.231 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.244 Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.138 Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.295 Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.508
{'map': 0.19495386487978572}
For how to set those hyperparameters and finetune the model with higher performance, see AutoMM Detection - High Performance Finetune on COCO Format Dataset.
Inference¶
Now that we have gone through the model setup, finetuning, and evaluation, this section details inference. Specifically, we lay out the steps for using the model to make predictions and visualize the results.
To run inference on the entire test set, perform:
pred = predictor.predict(test_path)
print(pred)
loading annotations into memory... Done (t=0.00s) creating index... index created! image 0 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 1 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 2 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 3 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 4 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 5 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 6 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 7 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 8 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 9 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 10 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 11 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 12 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 13 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 14 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 15 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 16 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 17 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 18 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 19 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 20 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 21 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 22 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 23 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 24 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 25 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 26 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 27 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 28 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 29 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 30 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 31 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 32 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 33 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 34 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 35 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 
36 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 37 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 38 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 39 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 40 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 41 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 42 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 43 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 44 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 45 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 46 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 47 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 48 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... 49 ./tiny_motorbike_coco/tiny_motorbike/Annotatio... bboxes 0 [{'class': 'bicycle', 'bbox': [351.3019, 149.5... 1 [{'class': 'bicycle', 'bbox': [393.4131, 273.4... 2 [{'class': 'bicycle', 'bbox': [429.75284, 115.... 3 [{'class': 'bicycle', 'bbox': [55.545815, 35.8... 4 [{'class': 'bicycle', 'bbox': [274.817, 175.09... 5 [{'class': 'bicycle', 'bbox': [177.26184, 83.1... 6 [{'class': 'car', 'bbox': [-13.707119, 86.6409... 7 [{'class': 'bicycle', 'bbox': [38.406834, 180.... 8 [{'class': 'bicycle', 'bbox': [23.443739, 38.2... 9 [{'class': 'bicycle', 'bbox': [144.87224, 1.01... 10 [{'class': 'bicycle', 'bbox': [434.66788, 109.... 11 [{'class': 'bicycle', 'bbox': [417.5407, 245.4... 12 [{'class': 'car', 'bbox': [179.89146, 76.21586... 13 [{'class': 'bicycle', 'bbox': [27.891464, 88.7... 14 [{'class': 'car', 'bbox': [211.49016, 32.79186... 15 [{'class': 'motorbike', 'bbox': [182.99612, 18... 16 [{'class': 'bicycle', 'bbox': [434.95847, 254.... 17 [{'class': 'bicycle', 'bbox': [52.11297, 265.7... 18 [{'class': 'car', 'bbox': [98.17141, 1.5616298... 19 [{'class': 'bicycle', 'bbox': [145.75581, 86.4... 20 [{'class': 'bicycle', 'bbox': [7.7810974, 131.... 21 [{'class': 'motorbike', 'bbox': [115.9607, 206... 22 [{'class': 'bicycle', 'bbox': [177.0567, 173.9... 
23 [{'class': 'bicycle', 'bbox': [348.91235, 0.72... 24 [{'class': 'bicycle', 'bbox': [-1.0005265, 4.4... 25 [{'class': 'bicycle', 'bbox': [410.1275, 162.8... 26 [{'class': 'bicycle', 'bbox': [450.18408, 6.35... 27 [{'class': 'car', 'bbox': [38.084625, -7.45019... 28 [{'class': 'bicycle', 'bbox': [290.44272, 17.9... 29 [{'class': 'bicycle', 'bbox': [156.75359, 275.... 30 [{'class': 'bicycle', 'bbox': [479.9492, 59.03... 31 [{'class': 'bicycle', 'bbox': [377.64926, 111.... 32 [{'class': 'motorbike', 'bbox': [138.73445, 16... 33 [{'class': 'bicycle', 'bbox': [13.154282, 4.69... 34 [{'class': 'bicycle', 'bbox': [23.713373, 26.1... 35 [{'class': 'bicycle', 'bbox': [320.19284, 44.2... 36 [{'class': 'motorbike', 'bbox': [-34.24105, 23... 37 [{'class': 'bicycle', 'bbox': [407.79416, 240.... 38 [{'class': 'motorbike', 'bbox': [28.41878, -18... 39 [{'class': 'bicycle', 'bbox': [96.129456, 7.44... 40 [{'class': 'bicycle', 'bbox': [211.05609, 194.... 41 [{'class': 'bicycle', 'bbox': [-0.9396702, 61.... 42 [{'class': 'car', 'bbox': [41.38574, 124.12684... 43 [{'class': 'car', 'bbox': [154.44525, 38.71461... 44 [{'class': 'motorbike', 'bbox': [69.97977, 128... 45 [{'class': 'motorbike', 'bbox': [198.04515, 12... 46 [{'class': 'bicycle', 'bbox': [352.33554, 86.5... 47 [{'class': 'car', 'bbox': [175.70752, 25.96183... 48 [{'class': 'bicycle', 'bbox': [201.79596, -0.5... 49 [{'class': 'car', 'bbox': [25.747229, 81.39116...
The output pred is a pandas DataFrame with two columns, image and bboxes. In image, each row contains the image path. In bboxes, each row is a list of dictionaries, each representing one bounding box:
{"class": <predicted_class_name>, "bbox": [x1, y1, x2, y2], "score": <confidence_score>}
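Given that layout, a common post-processing step is to drop low-confidence boxes before using the predictions. The sketch below filters a hand-made stand-in for the prediction DataFrame; the rows and scores are made-up examples, not real predictor output.

```python
import pandas as pd

# Stand-in for predictor.predict output: one image row with two candidate boxes.
pred_sketch = pd.DataFrame(
    {
        "image": ["img0.jpg"],
        "bboxes": [
            [
                {"class": "motorbike", "bbox": [10, 10, 50, 50], "score": 0.9},
                {"class": "car", "bbox": [5, 5, 20, 20], "score": 0.1},
            ]
        ],
    }
)

# Keep only boxes at or above the confidence threshold.
conf_threshold = 0.4
pred_sketch["bboxes"] = pred_sketch["bboxes"].apply(
    lambda boxes: [b for b in boxes if b["score"] >= conf_threshold]
)
print(pred_sketch.loc[0, "bboxes"])
# → [{'class': 'motorbike', 'bbox': [10, 10, 50, 50], 'score': 0.9}]
```

The Visualizer shown later applies the same kind of threshold internally via its conf_threshold argument.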
Note that, by default, predictor.predict does not save the detection results to a file.
To run inference and save results, run the following:
pred = better_predictor.predict(test_path, save_results=True)
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
WARNING:automm:A new predictor save path is created.This is to prevent you to overwrite previous predictor saved here.You could check current save path at predictor._save_path.If you still want to use this path, set resume=True
Saved detection results to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20221213_015248/result.txt
Here, we save pred into a .txt file, which follows exactly the same layout as pred. You can use a predictor initialized in any way (i.e., a finetuned predictor, a predictor with a pretrained model, etc.). Here, we demonstrate using the better_predictor loaded previously.
Visualizing Results¶
To run visualizations, ensure that you have opencv installed. If you haven't already, install opencv by running:
!pip install opencv-python
Requirement already satisfied: opencv-python in /home/ci/opt/venv/lib/python3.8/site-packages (4.6.0.66)
Requirement already satisfied: numpy>=1.14.5 in /home/ci/opt/venv/lib/python3.8/site-packages (from opencv-python) (1.22.4)
To visualize the detection bounding boxes, run the following:
from autogluon.multimodal.utils import Visualizer
conf_threshold = 0.4 # Specify a confidence threshold to filter out unwanted boxes
image_result = pred.iloc[30]
img_path = image_result.image # Select an image to visualize
visualizer = Visualizer(img_path) # Initialize the Visualizer
out = visualizer.draw_instance_predictions(image_result, conf_threshold=conf_threshold) # Draw detections
visualized = out.get_image() # Get the visualized image
from PIL import Image
from IPython.display import display
img = Image.fromarray(visualized, 'RGB')
display(img)
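If you prefer not to depend on the Visualizer utility, a box in the format described earlier can also be drawn directly with PIL, which this tutorial already imports. The box coordinates and label below are made-up example values.

```python
from PIL import Image, ImageDraw

# Draw one predicted box on a blank canvas (illustrative values only).
img = Image.new("RGB", (200, 150), "white")
draw = ImageDraw.Draw(img)

box = {"class": "motorbike", "bbox": [20, 30, 120, 100], "score": 0.87}
x1, y1, x2, y2 = box["bbox"]
draw.rectangle([x1, y1, x2, y2], outline="red", width=2)
draw.text((x1, max(0, y1 - 12)), f"{box['class']} {box['score']:.2f}", fill="red")
```

For a real image, replace the blank canvas with Image.open(image_result.image) and loop over image_result.bboxes.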

Testing on Your Own Image¶
You can also download an image and run inference on that single image. The following is an example:
Download the example image:
from autogluon.multimodal import download
image_url = "https://raw.githubusercontent.com/dmlc/web-data/master/gluoncv/detection/street_small.jpg"
test_image = download(image_url)
Downloading street_small.jpg from https://raw.githubusercontent.com/dmlc/web-data/master/gluoncv/detection/street_small.jpg...
Run inference:
pred_test_image = better_predictor.predict({"image": [test_image]})
print(pred_test_image)
image bboxes
0 street_small.jpg [{'class': 'bicycle', 'bbox': [235.36739, 216....
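To pull the single most confident detection out of a result row like the one above, you can take the max over the bboxes list by score. The row below is a hand-made stand-in mirroring the earlier pred layout; its values are made up.

```python
# Stand-in for one row of the single-image prediction DataFrame.
row = {
    "image": "street_small.jpg",
    "bboxes": [
        {"class": "bicycle", "bbox": [235.4, 216.0, 300.0, 280.0], "score": 0.95},
        {"class": "car", "bbox": [10.0, 20.0, 80.0, 60.0], "score": 0.30},
    ],
}

# Highest-confidence detection for this image.
best = max(row["bboxes"], key=lambda b: b["score"])
print(best["class"], best["score"])  # → bicycle 0.95
```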
Other Examples¶
You may go to AutoMM Examples to explore other examples about AutoMM.
Customization¶
To learn how to customize AutoMM, please refer to Customize AutoMM.
Citation¶
@misc{redmon2018yolov3,
title={YOLOv3: An Incremental Improvement},
author={Joseph Redmon and Ali Farhadi},
year={2018},
eprint={1804.02767},
archivePrefix={arXiv},
primaryClass={cs.CV}
}