AutoMM Detection - Quick Start on a Tiny COCO Format Dataset


In this section, our goal is to fast finetune a pretrained model on a small dataset in COCO format and evaluate it on the corresponding test set. Both the training and test sets are in COCO format. See Convert Data to COCO Format for how to convert other datasets to COCO format.

Setting up the imports

To start, make sure mmcv and mmdet are installed. Note: MMDet is no longer actively maintained and is only compatible with MMCV version 2.1.0. Installation can be problematic due to CUDA version compatibility issues. For best results:

  1. Use CUDA 12.4 with PyTorch 2.5 (a quick way to check your installed versions is sketched after this list)

  2. Before installation, run:

    pip install -U pip setuptools wheel
    sudo apt-get install -y ninja-build gcc g++
    

    This will help prevent MMCV installation from hanging during wheel building.

  3. After installation in Jupyter notebook, restart the kernel for changes to take effect.
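Before running the installation commands below, you can confirm which PyTorch and CUDA build you already have (a small check using PyTorch's own version attributes):

import torch

print(torch.__version__)          # Installed PyTorch version
print(torch.version.cuda)         # CUDA version PyTorch was built against
print(torch.cuda.is_available())  # Whether a GPU is visible to PyTorch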

# Update package tools and install build dependencies
!pip install -U pip setuptools wheel
!sudo apt-get install -y ninja-build gcc g++

# Install MMCV
!python3 -m mim install "mmcv==2.1.0"

# For Google Colab users: If the above fails, use this alternative MMCV installation
# pip install "mmcv==2.1.0" -f https://download.openmmlab.com/mmcv/dist/cu121/torch2.1.0/index.html

# Install MMDet
!python3 -m pip install "mmdet==3.2.0"

# Install MMEngine (version >=0.10.6 for PyTorch 2.5 compatibility)
!python3 -m pip install "mmengine>=0.10.6"


To start, let’s import MultiModalPredictor:

from autogluon.multimodal import MultiModalPredictor

And also import some other packages that will be used in this tutorial:

import os
import time

from autogluon.core.utils.loaders import load_zip

Downloading Data

We have the sample dataset ready in the cloud. Let’s download it:

zip_file = "https://automl-mm-bench.s3.amazonaws.com/object_detection_dataset/tiny_motorbike_coco.zip"
download_dir = "./tiny_motorbike_coco"

load_zip.unzip(zip_file, unzip_dir=download_dir)
data_dir = os.path.join(download_dir, "tiny_motorbike")
train_path = os.path.join(data_dir, "Annotations", "trainval_cocoformat.json")
test_path = os.path.join(data_dir, "Annotations", "test_cocoformat.json")
Downloading ./tiny_motorbike_coco/file.zip from https://automl-mm-bench.s3.amazonaws.com/object_detection_dataset/tiny_motorbike_coco.zip...
100%|██████████| 21.8M/21.8M [00:00<00:00, 109MiB/s]

Dataset Format

For COCO format datasets, provide JSON annotation files for each split; a minimal example of their structure is sketched after this list:

  • trainval_cocoformat.json: train and validation data

  • test_cocoformat.json: test data
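Concretely, a COCO annotation file is a single JSON object with images, annotations, and categories lists. The hand-written sketch below is for illustration only (the values are made up, not taken from the tiny_motorbike dataset); see Convert Data to COCO Format for the full specification:

# Illustrative COCO-style annotation structure (values are made up)
coco_example = {
    "images": [{"id": 0, "file_name": "images/000001.jpg", "width": 640, "height": 480}],
    "annotations": [
        # bbox follows the COCO convention: [x, y, width, height]
        {"id": 0, "image_id": 0, "category_id": 1, "bbox": [10, 20, 100, 80], "area": 8000, "iscrowd": 0},
    ],
    "categories": [{"id": 1, "name": "motorbike"}],
}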

Model Selection

We use the medium_quality preset, which features:

  • Base model: YOLOX-large (pretrained on COCO)

  • Benefits: Fast finetuning, quick inference, easy deployment

Alternative presets available:

  • high_quality: DINO-Resnet50 model

  • best_quality: DINO-SwinL model

Both alternatives offer improved performance at the cost of slower processing and higher GPU memory requirements.

presets = "medium_quality"

When creating the MultiModalPredictor, specify these essential parameters:

  • problem_type="object_detection" to define the task

  • presets="medium_quality" for presets selection

  • sample_data_path pointing to any dataset split (typically train_path) to infer object categories

  • path (optional) to set a custom save location

If no path is specified, the model will be automatically saved to a timestamped directory under AutogluonModels/.

# Init predictor
import uuid

model_path = f"./tmp/{uuid.uuid4().hex}-quick_start_tutorial_temp_save"

predictor = MultiModalPredictor(
    problem_type="object_detection",
    sample_data_path=train_path,
    presets=presets,
    path=model_path,
)

Finetuning the Model

The model uses optimized preset configurations for learning rate, epochs, and batch size. By default, it employs a two-stage learning rate strategy: the model head layers use a 100x higher learning rate than the rest of the network. This approach accelerates convergence and typically improves performance, especially for small datasets (hundreds to thousands of images).
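If you want to override these preset values, fit also accepts a time_limit (in seconds) and a hyperparameters dictionary. The commented sketch below is illustrative only and is not run in this tutorial; the exact hyperparameter key names depend on your AutoGluon version, so see Customize AutoMM for the current ones.

# Illustrative only: override preset training settings in the fit call.
# The hyperparameter key names shown here are placeholders and may differ across versions.
# predictor.fit(
#     train_path,
#     time_limit=600,  # optional wall-clock budget in seconds
#     hyperparameters={
#         "optimization.learning_rate": 2e-4,  # placeholder key for the base learning rate
#         "optimization.max_epochs": 30,       # placeholder key for the number of epochs
#     },
# )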

Timing results below are from a test run on an AWS g4.2xlarge EC2 instance:

start = time.time()
predictor.fit(train_path)  # Fit
train_end = time.time()
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Downloading yolox_l_8x8_300e_coco_20211126_140236-d3bd2b23.pth from https://download.openmmlab.com/mmdetection/v2.0/yolox/yolox_l_8x8_300e_coco/yolox_l_8x8_300e_coco_20211126_140236-d3bd2b23.pth...
=================== System Info ===================
AutoGluon Version:  1.4.1b20250821
Python Version:     3.12.10
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Mar 12 14:53:59 UTC 2025
CPU Count:          8
Pytorch Version:    2.7.1+cu126
CUDA Version:       12.6
GPU Count:          1
Memory Avail:       28.39 GB / 30.95 GB (91.7%)
Disk Space Avail:   WARNING, an exception (FileNotFoundError) occurred while attempting to get available disk space. Consider opening a GitHub Issue.
===================================================
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `model.mmdet_image.coco_root=...` in hyperparameters if you think it is wrong.

AutoMM starts to create your model. ✨✨✨

To track the learning progress, you can open a terminal and launch Tensorboard:
    ```shell
    # Assume you have installed tensorboard
    tensorboard --logdir /home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/tmp/6a7c9899972d400dae2fe9d764649e0a-quick_start_tutorial_temp_save
    ```
Seed set to 0

Notice that at the end of each progress bar, if the checkpoint at the current stage is saved, the model's save path is printed. In this example, it is the directory we specified in model_path.
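You can also check the save location programmatically; assuming the predictor exposes it via the path attribute, a quick check looks like this:

print(predictor.path)  # directory holding the saved checkpoints and configs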

Print out the time, and we can see that it's fast!

print("This finetuning takes %.2f seconds." % (train_end - start))

Evaluation

To evaluate the model we just trained, run the following code.

The evaluation results are shown in the command line output. The first line is the mAP in the COCO standard, and the second line is the mAP in the VOC standard (i.e., mAP50). For more details about these metrics, see COCO's evaluation guideline. Note that to demonstrate fast finetuning we use the "medium_quality" presets; you could get better results on this dataset by simply using the "high_quality" or "best_quality" presets, or by customizing your own model and hyperparameter settings (see Customization), and check out other examples at Fast Fine-tune Coco or High Performance Fine-tune Coco.

predictor.evaluate(test_path)
eval_end = time.time()

Print out the evaluation time:

print("The evaluation takes %.2f seconds." % (eval_end - train_end))

We can load a new predictor from the previous save path, and we can also reset the number of GPUs to use if not all devices are available:

# Load and reset num_gpus
new_predictor = MultiModalPredictor.load(model_path)
new_predictor.set_num_gpus(1)

Evaluating the new predictor gives us exactly the same result:

# Evaluate new predictor
new_predictor.evaluate(test_path)

For how to set the hyperparameters and finetune the model with higher performance, see AutoMM Detection - High Performance Finetune on COCO Format Dataset.

Inference

Let’s perform predictions using our finetuned model. The predictor can process the entire test set with a single command:

pred = predictor.predict(test_path)
print(len(pred))  # Number of predictions
print(pred[:3])   # Sample of first 3 predictions

The predictor returns predictions as a pandas DataFrame with two columns:

  • image: Contains path to each input image

  • bboxes: Contains list of detected objects, where each object is a dictionary:

    {
        "class": "predicted_class_name",
        "bbox": [x1, y1, x2, y2],  # Coordinates of Upper Left and Bottom Right corners
        "score": confidence_score
    }
    

By default, predictions are returned but not saved. To save detection results, set the save_results parameter in your predict call:

# To save as csv format
pred = predictor.predict(test_path, save_results=True, as_coco=False)
# Or to save as COCO format. Note that the `pred` returned is always a pandas dataframe.
pred = predictor.predict(test_path, save_results=True, as_coco=True, result_save_path="./results.json")

The predictions can be saved in two formats:

  • CSV file: Matches the DataFrame structure with image and bboxes columns

  • COCO JSON: Standard COCO format annotation file

This works with any predictor configuration (pretrained or finetuned models).
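Since pred is a regular pandas DataFrame, you can also post-process it directly; for example, tallying high-confidence detections per class (a minimal sketch based on the column layout described above):

from collections import Counter

# Count predicted classes with confidence above 0.5 across all test images
class_counts = Counter()
for bboxes in pred["bboxes"]:
    for det in bboxes:
        if det["score"] > 0.5:
            class_counts[det["class"]] += 1
print(class_counts)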

Visualizing Results

To run visualizations, ensure that you have opencv installed. If you haven’t already, install opencv by running

!pip install opencv-python

To visualize the detection bounding boxes, run the following:

from autogluon.multimodal.utils import ObjectDetectionVisualizer

conf_threshold = 0.4  # Specify a confidence threshold to filter out unwanted boxes
image_result = pred.iloc[30]

img_path = image_result.image  # Select an image to visualize

visualizer = ObjectDetectionVisualizer(img_path)  # Initialize the Visualizer
out = visualizer.draw_instance_predictions(image_result, conf_threshold=conf_threshold)  # Draw detections
visualized = out.get_image()  # Get the visualized image

from PIL import Image
from IPython.display import display
img = Image.fromarray(visualized, 'RGB')
display(img)
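If you want to keep the visualization, the PIL image from the cell above can be written straight to disk:

# Save the rendered detections next to the notebook
img.save("visualized_detections.jpg")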

Testing on Your Own Data

You can also predict on your own images with various input formats. The following is an example:

Download the example image:

from autogluon.multimodal.utils import download
image_url = "https://raw.githubusercontent.com/dmlc/web-data/master/gluoncv/detection/street_small.jpg"
test_image = download(image_url)

Run inference on data in a JSON file in COCO format (see Convert Data to COCO Format for more details about the COCO format). Note that since the image root is by default the parent folder of the annotation file, here we put the annotation file in its own folder:

import json

# Create an input file for the demo
data = {"images": [{"id": 0, "width": -1, "height": -1, "file_name": test_image}], "categories": []}
os.makedirs("input_data_for_demo", exist_ok=True)  # exist_ok so the cell can be re-run
input_file = "input_data_for_demo/demo_annotation.json"
with open(input_file, "w") as f:
    json.dump(data, f)

pred_test_image = predictor.predict(input_file)
print(pred_test_image)

Run inference on data in a list of image file names:

pred_test_image = predictor.predict([test_image])
print(pred_test_image)
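The list of image paths can be built however you like; for example, by globbing a local folder (a sketch using a hypothetical ./my_images directory):

import glob

# Predict on every JPEG in a (hypothetical) local folder
image_paths = glob.glob("./my_images/*.jpg")
pred_folder = predictor.predict(image_paths)
print(pred_folder)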

Other Examples

You may go to AutoMM Examples to explore other examples about AutoMM.

Customization

To learn how to customize AutoMM, please refer to Customize AutoMM.

Citation

@article{DBLP:journals/corr/abs-2107-08430,
  author    = {Zheng Ge and
               Songtao Liu and
               Feng Wang and
               Zeming Li and
               Jian Sun},
  title     = {{YOLOX:} Exceeding {YOLO} Series in 2021},
  journal   = {CoRR},
  volume    = {abs/2107.08430},
  year      = {2021},
  url       = {https://arxiv.org/abs/2107.08430},
  eprinttype = {arXiv},
  eprint    = {2107.08430},
  timestamp = {Tue, 05 Apr 2022 14:09:44 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2107-08430.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org},
}