AutoMM Detection - Evaluate Pretrained Deformable DETR on COCO Format Dataset#
In this section, our goal is to evaluate a pretrained Deformable DETR model on the COCO17 dataset in COCO format. Previously we introduced two classic models: AutoMM Detection - Evaluate Pretrained YOLOv3 on COCO Format Dataset and AutoMM Detection - Evaluate Pretrained Faster R-CNN on COCO Format Dataset. In recent years, Transformer-based models have become increasingly popular in computer vision, and the Deformable DEtection TRansformer (Deformable DETR) has reached SOTA performance on the detection task. It is slower than YOLOv3 and Faster R-CNN, but it also delivers higher accuracy.
To start, let’s import MultiModalPredictor:
from autogluon.multimodal import MultiModalPredictor
We select the two-stage Deformable DETR with a ResNet50 backbone and bounding box refinement. This model runs at 15.7 frames per second (FPS) on a single A10e GPU with batch_size=1.
We also use all available GPUs (if any):
checkpoint_name = "deformable_detr_twostage_refine_r50_16x2_50e_coco"
num_gpus = -1 # use all GPUs
We create the MultiModalPredictor with the selected checkpoint name and number of GPUs.
We also need to specify the problem_type as "object_detection".
predictor = MultiModalPredictor(
    hyperparameters={
        "model.mmdet_image.checkpoint_name": checkpoint_name,
        "env.num_gpus": num_gpus,
    },
    problem_type="object_detection",
)
Here we use COCO17 for testing.
See AutoMM Detection - Prepare COCO2017 Dataset for how to prepare the dataset.
When using a COCO-format dataset, the input is the JSON annotation file of the dataset split.
In this example, instances_val2017.json is the annotation file of the validation split of COCO17.
test_path = "coco17/annotations/instances_val2017.json"
To evaluate the pretrained Deformable DETR model we loaded, run:
predictor.evaluate(test_path)
The evaluation results are shown in the command line output. The first value, 0.463, is the mAP under the COCO standard, and the second one, 0.659, is the mAP under the VOC standard (also called mAP50). For more details about these metrics, see COCO's evaluation guideline.
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.463
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.659
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.500
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.298
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.493
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.607
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.358
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.603
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.652
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.451
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.692
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.830
time usage: 389.92
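If you want to work with the metrics programmatically rather than reading the printed table, evaluate() also returns them as a dictionary. Below is a minimal sketch; the exact metric keys (for example "map" or "map_50") may vary across AutoGluon versions, so print the dictionary to see what is available.
# Capture the evaluation metrics in a dict instead of only reading the printed output.
# NOTE: the exact keys returned may differ between AutoGluon versions.
result = predictor.evaluate(test_path)
for metric_name, value in result.items():
    print(f"{metric_name}: {value}")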
Deformable DETR has the best performance of the three models, but it takes more time and GPU memory. If you have constraints on time or memory, see AutoMM Detection - Evaluate Pretrained Faster R-CNN on COCO Format Dataset or AutoMM Detection - Evaluate Pretrained YOLOv3 on COCO Format Dataset. You can also see other tutorials such as AutoMM Detection - High Performance Finetune on COCO Format Dataset.
Other Examples#
You may go to AutoMM Examples to explore other examples about AutoMM.
Customization#
To learn how to customize AutoMM, please refer to Customize AutoMM.
Citation#
@inproceedings{
    zhu2021deformable,
    title={Deformable DETR: Deformable Transformers for End-to-End Object Detection},
    author={Xizhou Zhu and Weijie Su and Lewei Lu and Bin Li and Xiaogang Wang and Jifeng Dai},
    booktitle={International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=gZ9hCDWe6ke}
}