.. _sec_automm_detection_voc_to_coco:
AutoMM Detection - Convert VOC Format Dataset to COCO Format
============================================================
`Pascal VOC `__ is a collection
of datasets for object detection. VOC format refers to the annotation
format (``.xml`` files) that the Pascal VOC datasets use.
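As a rough illustration of what a VOC annotation contains, the sketch below parses a small hand-written ``.xml`` snippet with Python's standard ``xml.etree.ElementTree``. The sample annotation and the helper name ``parse_voc_annotation`` are made up for this example; real files in ``Annotations`` carry additional fields.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the VOC layout (illustrative, not taken from the dataset).
SAMPLE_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Hypothetical helper: extract file name, image size, and boxes from VOC XML."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append(
            {
                "name": obj.findtext("name"),
                # VOC stores each box as corner coordinates.
                "bbox": [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")],
            }
        )
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }


parsed = parse_voc_annotation(SAMPLE_XML)
print(parsed["filename"], [o["name"] for o in parsed["objects"]])
```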
In this tutorial, we will convert the VOC2007 dataset from VOC format to
COCO format. See :ref:`sec_automm_detection_prepare_voc` for how to
download it. We will use our tool ``voc2coco``. This Python script is in
our codebase:
`voc2coco.py `__,
and you can also run it from the command line:
``python3 -m autogluon.multimodal.cli.voc2coco``.
**Note: In AutoGluon MultiModalPredictor, we strongly recommend using
COCO as your data format.** However, for quick proof-of-concept testing we
also have limited support for VOC format.
Convert Existing Splits
~~~~~~~~~~~~~~~~~~~~~~~
Under the root path of a VOC format dataset, we have the following folders:
::
Annotations ImageSets JPEGImages
Normally there are some predefined split files under
``ImageSets/Main/``:
::
train.txt
val.txt
test.txt
...
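Each of these split files lists image IDs, one per line. A minimal sketch of reading such a file follows, assuming that plain layout; the helper ``read_split_ids`` is hypothetical and not part of ``voc2coco``, and the demo writes a temporary file so it runs without the dataset.

```python
import os
import tempfile


def read_split_ids(split_path):
    """Hypothetical helper: return the image IDs listed in a VOC split file."""
    with open(split_path) as f:
        # Keep only the first token of each non-blank line, so files that
        # append an extra label column are handled as well.
        return [line.split()[0] for line in f if line.strip()]


# Demo on a temporary file with made-up IDs.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "train.txt")
    with open(demo_path, "w") as f:
        f.write("000012\n000017\n\n000023\n")
    demo_ids = read_split_ids(demo_path)

print(demo_ids)
```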
We can convert those splits into COCO format by running the tool with the
root directory, e.g. ``./VOCdevkit/VOC2007``:
::
python3 -m autogluon.multimodal.cli.voc2coco --root_dir ./VOCdevkit/VOC2007
The command line output will show the progress:
::
Start converting !
17%|█████████████████▍ | 841/4952 [00:00<00:00, 15571.88it/s
The converted splits are now saved in COCO format in the ``Annotations``
folder under the root directory:
::
train_cocoformat.json
val_cocoformat.json
test_cocoformat.json
...
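To give a sense of what a COCO format file contains, the sketch below builds a tiny COCO-style dictionary by hand and converts one VOC corner-style box to the COCO ``[x, y, width, height]`` layout. It is an illustration under simplifying assumptions, not the ``voc2coco`` implementation: the helper ``voc_box_to_coco`` is hypothetical, the values are made up, and the exact pixel-offset convention and file path layout used by the tool may differ.

```python
import json


def voc_box_to_coco(xmin, ymin, xmax, ymax):
    """Hypothetical helper: corner coordinates -> [x, y, width, height]."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


# COCO detection files are organized around three top-level lists.
coco = {
    "images": [{"id": 1, "file_name": "000001.jpg", "width": 353, "height": 500}],
    "categories": [{"id": 1, "name": "dog"}],
    "annotations": [],
}

bbox = voc_box_to_coco(48, 240, 195, 371)
coco["annotations"].append(
    {
        "id": 1,
        "image_id": 1,      # refers to an entry in "images"
        "category_id": 1,   # refers to an entry in "categories"
        "bbox": bbox,
        "area": bbox[2] * bbox[3],
        "iscrowd": 0,
    }
)

print(json.dumps(coco["annotations"][0]))
```

A converted file such as ``train_cocoformat.json`` can be loaded with ``json.load`` and inspected through the same ``images``, ``annotations``, and ``categories`` keys.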
Convert Without Existing Splits
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Instead of using predefined splits, you can also split the data with the
train/validation/test ratio you want. Note that this does not require
any pre-existing split files. To split train/validation/test by
0.6/0.2/0.2, run:
::
python3 -m autogluon.multimodal.cli.voc2coco --root_dir ./VOCdevkit/VOC2007 --train_ratio 0.6 --val_ratio 0.2
The command line output will show the progress:
::
Start converting !
17%|█████████████████▍ | 841/4952 [00:00<00:00, 15571.88it/s
This generates the user-split COCO format files in the ``Annotations``
folder under the root directory:
::
usersplit_train_cocoformat.json
usersplit_val_cocoformat.json
usersplit_test_cocoformat.json
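The ratio-based split can be pictured with the following sketch. It assumes, as the example command suggests, that the test set receives whatever remains after the train and validation shares. The function ``split_by_ratio``, the fixed seed, and the shuffling strategy are assumptions for illustration and may not match how ``voc2coco`` splits the data internally.

```python
import random


def split_by_ratio(ids, train_ratio, val_ratio, seed=0):
    """Hypothetical helper: shuffle IDs and cut them into train/val/test lists."""
    if train_ratio < 0 or val_ratio < 0 or train_ratio + val_ratio > 1:
        raise ValueError("ratios must be non-negative and sum to at most 1")
    shuffled = list(ids)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    n_val = int(len(shuffled) * val_ratio)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


train, val, test = split_by_ratio(range(10), train_ratio=0.6, val_ratio=0.2)
print(len(train), len(val), len(test))
```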
Other Examples
~~~~~~~~~~~~~~
You may go to `AutoMM
Examples `__
to explore other examples about AutoMM.
Customization
~~~~~~~~~~~~~
To learn how to customize AutoMM, please refer to
:ref:`sec_automm_customization`.