Hyperparameter Optimization in AutoMM¶

Hyperparameter optimization (HPO) is a method that helps solve the challenge of tuning hyperparameters of machine learning models. ML algorithms have multiple complex hyperparameters that generate an enormous search space, and the search space in deep learning methods is even larger than traditional ML algorithms. Tuning on a massive search space is a tough challenge, but AutoMM provides various options for you to guide the fitting process based on your domain knowledge and the constraint on computing resources.

Create Image Dataset¶

In this tutorial, we are going to again use the subset of the Shopee-IET dataset from Kaggle for demonstration purpose. Each image contains a clothing item and the corresponding label specifies its clothing category. Our subset of the data contains the following possible labels: BabyPants, BabyShirt, womencasualshoes, womenchiffontop.

We can load a dataset by downloading a url data automatically:

import warnings
warnings.filterwarnings('ignore')
from datetime import datetime

from autogluon.multimodal.utils.misc import shopee_dataset
download_dir = './ag_automm_tutorial_hpo'
train_data, test_data = shopee_dataset(download_dir)
train_data = train_data.sample(frac=0.5)
print(train_data)

Downloading ./ag_automm_tutorial_hpo/file.zip from https://automl-mm-bench.s3.amazonaws.com/vision_datasets/shopee.zip...
                                                 image  label
 /home/ci/autogluon/docs/tutorials/multimodal/a...      0
/home/ci/autogluon/docs/tutorials/multimodal/a...      0
/home/ci/autogluon/docs/tutorials/multimodal/a...      1
/home/ci/autogluon/docs/tutorials/multimodal/a...      0
/home/ci/autogluon/docs/tutorials/multimodal/a...      0
..                                                 ...    ...
/home/ci/autogluon/docs/tutorials/multimodal/a...      2
/home/ci/autogluon/docs/tutorials/multimodal/a...      1
 /home/ci/autogluon/docs/tutorials/multimodal/a...      0
/home/ci/autogluon/docs/tutorials/multimodal/a...      1
/home/ci/autogluon/docs/tutorials/multimodal/a...      2

[400 rows x 2 columns]

  0%|          | 0.00/84.0M [00:00<?, ?iB/s]
 10%|▉         | 8.38M/84.0M [00:00<00:01, 64.4MiB/s]
 22%|██▏       | 18.9M/84.0M [00:00<00:00, 85.5MiB/s]
 36%|███▌      | 30.3M/84.0M [00:00<00:00, 97.9MiB/s]
 50%|█████     | 42.1M/84.0M [00:00<00:00, 106MiB/s]
 64%|██████▍   | 54.1M/84.0M [00:00<00:00, 110MiB/s]
 78%|███████▊  | 65.2M/84.0M [00:00<00:00, 100MiB/s]
 90%|████████▉ | 75.4M/84.0M [00:00<00:00, 99.5MiB/s]
100%|██████████| 84.0M/84.0M [00:00<00:00, 99.5MiB/s]

There are in total 400 data points in this dataset. The image column stores the path to the actual image, and the label column stands for the label class.

The Regular Model Fitting¶

Recall that if we are to use the default settings predefined by Autogluon, we can simply fit the model using MultiModalPredictor with three lines of code:

from autogluon.multimodal import MultiModalPredictor
predictor_regular = MultiModalPredictor(label="label")
start_time = datetime.now()
predictor_regular.fit(
    train_data=train_data,
    hyperparameters = {"model.timm_image.checkpoint_name": "ghostnet_100"}
)
end_time = datetime.now()
elapsed_seconds = (end_time - start_time).total_seconds()
elapsed_min = divmod(elapsed_seconds, 60)
print("Total fitting time: ", f"{int(elapsed_min[0])}m{int(elapsed_min[1])}s")

Total fitting time:  1m7s

No path specified. Models will be saved in: "AutogluonModels/ag-20251024_143656"
=================== System Info ===================
AutoGluon Version:  1.4.1b20251024
Python Version:     3.12.10
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Mar 12 14:53:59 UTC 2025
CPU Count:          8
Pytorch Version:    2.7.1+cu126
CUDA Version:       12.6
GPU Memory:         GPU 0: 14.57/14.57 GB
Total GPU Memory:   Free: 14.57 GB, Allocated: 0.00 GB, Total: 14.57 GB
GPU Count:          1
Memory Avail:       28.46 GB / 30.95 GB (92.0%)
Disk Space Avail:   159.80 GB / 255.99 GB (62.4%)
===================================================
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
	4 unique label values:  [np.int64(0), np.int64(1), np.int64(3), np.int64(2)]
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])

AutoMM starts to create your model. ✨✨✨

To track the learning progress, you can open a terminal and launch Tensorboard:
    ```shell
    # Assume you have installed tensorboard
    tensorboard --logdir /home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656
    ```
Seed set to 0
GPU Count: 1
GPU Count to be Used: 1
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params | Mode 
------------------------------------------------------------------------------
0 | model             | TimmAutoModelForImagePrediction | 3.9 M  | train
1 | validation_metric | MulticlassAccuracy              | 0      | train
2 | loss_func         | CrossEntropyLoss                | 0      | train
------------------------------------------------------------------------------
3.9 M     Trainable params
0         Non-trainable params
3.9 M     Total params
15.627    Total estimated model params size (MB)
418       Modules in train mode
0         Modules in eval mode
Epoch 0, global step 1: 'val_accuracy' reached 0.25000 (best 0.25000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=0-step=1.ckpt' as top 3
Epoch 0, global step 3: 'val_accuracy' reached 0.23750 (best 0.25000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=0-step=3.ckpt' as top 3
Epoch 1, global step 4: 'val_accuracy' reached 0.31250 (best 0.31250), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=1-step=4.ckpt' as top 3
Epoch 1, global step 6: 'val_accuracy' reached 0.43750 (best 0.43750), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=1-step=6.ckpt' as top 3
Epoch 2, global step 7: 'val_accuracy' reached 0.48750 (best 0.48750), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=2-step=7.ckpt' as top 3
Epoch 2, global step 9: 'val_accuracy' reached 0.58750 (best 0.58750), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=2-step=9.ckpt' as top 3
Epoch 3, global step 10: 'val_accuracy' reached 0.60000 (best 0.60000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=3-step=10.ckpt' as top 3
Epoch 3, global step 12: 'val_accuracy' reached 0.66250 (best 0.66250), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=3-step=12.ckpt' as top 3
Epoch 4, global step 13: 'val_accuracy' reached 0.66250 (best 0.66250), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=4-step=13.ckpt' as top 3
Epoch 4, global step 15: 'val_accuracy' reached 0.71250 (best 0.71250), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=4-step=15.ckpt' as top 3
Epoch 5, global step 16: 'val_accuracy' reached 0.76250 (best 0.76250), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=5-step=16.ckpt' as top 3
Epoch 5, global step 18: 'val_accuracy' reached 0.76250 (best 0.76250), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=5-step=18.ckpt' as top 3
Epoch 6, global step 19: 'val_accuracy' reached 0.77500 (best 0.77500), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=6-step=19.ckpt' as top 3
Epoch 6, global step 21: 'val_accuracy' reached 0.80000 (best 0.80000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=6-step=21.ckpt' as top 3
Epoch 7, global step 22: 'val_accuracy' reached 0.83750 (best 0.83750), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=7-step=22.ckpt' as top 3
Epoch 7, global step 24: 'val_accuracy' reached 0.80000 (best 0.83750), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=7-step=24.ckpt' as top 3
Epoch 8, global step 25: 'val_accuracy' reached 0.81250 (best 0.83750), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=8-step=25.ckpt' as top 3
Epoch 8, global step 27: 'val_accuracy' reached 0.81250 (best 0.83750), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=8-step=27.ckpt' as top 3
Epoch 9, global step 28: 'val_accuracy' was not in top 3
Epoch 9, global step 30: 'val_accuracy' was not in top 3
Epoch 10, global step 31: 'val_accuracy' reached 0.83750 (best 0.83750), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=10-step=31.ckpt' as top 3
Epoch 10, global step 33: 'val_accuracy' reached 0.85000 (best 0.85000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=10-step=33.ckpt' as top 3
Epoch 11, global step 34: 'val_accuracy' was not in top 3
Epoch 11, global step 36: 'val_accuracy' reached 0.86250 (best 0.86250), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=11-step=36.ckpt' as top 3
Epoch 12, global step 37: 'val_accuracy' reached 0.85000 (best 0.86250), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=12-step=37.ckpt' as top 3
Epoch 12, global step 39: 'val_accuracy' was not in top 3
Epoch 13, global step 40: 'val_accuracy' was not in top 3
Epoch 13, global step 42: 'val_accuracy' was not in top 3
Epoch 14, global step 43: 'val_accuracy' was not in top 3
Epoch 14, global step 45: 'val_accuracy' reached 0.87500 (best 0.87500), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=14-step=45.ckpt' as top 3
Epoch 15, global step 46: 'val_accuracy' was not in top 3
Epoch 15, global step 48: 'val_accuracy' was not in top 3
Epoch 16, global step 49: 'val_accuracy' reached 0.87500 (best 0.87500), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656/epoch=16-step=49.ckpt' as top 3
Epoch 16, global step 51: 'val_accuracy' was not in top 3
Epoch 17, global step 52: 'val_accuracy' was not in top 3
Epoch 17, global step 54: 'val_accuracy' was not in top 3
Epoch 18, global step 55: 'val_accuracy' was not in top 3
Epoch 18, global step 57: 'val_accuracy' was not in top 3
Epoch 19, global step 58: 'val_accuracy' was not in top 3
Epoch 19, global step 60: 'val_accuracy' was not in top 3
`Trainer.fit` stopped: `max_epochs=20` reached.
Start to fuse 3 checkpoints via the greedy soup algorithm.
💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
AutoMM has created your model. 🎉🎉🎉

To load the model, use the code below:
    ```python
    from autogluon.multimodal import MultiModalPredictor
    predictor = MultiModalPredictor.load("/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143656")
    ```

If you are not satisfied with the model, try to increase the training time, 
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).

Let’s check out the test accuracy of the fitted model:

scores = predictor_regular.evaluate(test_data, metrics=["accuracy"])
print('Top-1 test acc: %.3f' % scores["accuracy"])

Top-1 test acc: 0.713

💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.

Use HPO During Model Fitting¶

If you would like more control over the fitting process, you can specify various options for hyperparameter optimizations(HPO) in MultiModalPredictor by simply adding more options in hyperparameter and hyperparameter_tune_kwargs.

There are a few options we can have in MultiModalPredictor. We use Ray Tune tune library in the backend, so we need to pass in a Tune search space or an AutoGluon search space which will be converted to Tune search space.

Defining the search space of various hyperparameter values for the training of neural networks:

hyperparameters = {
        "optim.lr": tune.uniform(0.00005, 0.005),
        "optim.optim_type": tune.choice(["adamw", "sgd"]),
        "optim.max_epochs": tune.choice(["10", "20"]), 
        "model.timm_image.checkpoint_name": tune.choice(["swin_base_patch4_window7_224", "convnext_base_in22ft1k"])
        }

This is an example but not an exhaustive list. You can find the full supported list in Customize AutoMM

Defining the search strategy for HPO with hyperparameter_tune_kwargs. You can pass in a string or initialize a ray.tune.schedulers.TrialScheduler object.

"searcher": "bayes"

"scheduler": "ASHA"

"num_trials": 20

Ray documentation

"num_to_keep": 3

Let’s work on HPO with combinations of different learning rates and backbone models:

from ray import tune

predictor_hpo = MultiModalPredictor(label="label")

hyperparameters = {
            "optim.lr": tune.uniform(0.00005, 0.001),
            "model.timm_image.checkpoint_name": tune.choice(["ghostnet_100",
                                                             "mobilenetv3_large_100"])
}
hyperparameter_tune_kwargs = {
    "searcher": "bayes", # random
    "scheduler": "ASHA",
    "num_trials": 2,
    "num_to_keep": 3,
}
start_time_hpo = datetime.now()
predictor_hpo.fit(
        train_data=train_data,
        hyperparameters=hyperparameters,
        hyperparameter_tune_kwargs=hyperparameter_tune_kwargs,
    )
end_time_hpo = datetime.now()
elapsed_seconds_hpo = (end_time_hpo - start_time_hpo).total_seconds()
elapsed_min_hpo = divmod(elapsed_seconds_hpo, 60)
print("Total fitting time: ", f"{int(elapsed_min_hpo[0])}m{int(elapsed_min_hpo[1])}s")

No path specified. Models will be saved in: "AutogluonModels/ag-20251024_143805"
=================== System Info ===================
AutoGluon Version:  1.4.1b20251024
Python Version:     3.12.10
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Mar 12 14:53:59 UTC 2025
CPU Count:          8
Pytorch Version:    2.7.1+cu126
CUDA Version:       12.6
GPU Memory:         GPU 0: 14.57/14.57 GB
Total GPU Memory:   Free: 14.57 GB, Allocated: 0.00 GB, Total: 14.57 GB
GPU Count:          1
Memory Avail:       27.51 GB / 30.95 GB (88.9%)
Disk Space Avail:   159.77 GB / 255.99 GB (62.4%)
===================================================
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
	4 unique label values:  [np.int64(0), np.int64(1), np.int64(3), np.int64(2)]
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
/home/ci/opt/venv/lib/python3.12/site-packages/ray/tune/impl/tuner_internal.py:144: RayDeprecationWarning: The `RunConfig` class should be imported from `ray.tune` when passing it to the Tuner. Please update your imports. See this issue for more context and migration options: https://github.com/ray-project/ray/issues/49454. Disable these warnings by setting the environment variable: RAY_TRAIN_ENABLE_V2_MIGRATION_WARNINGS=0
  _log_deprecation_warning(
Removing non-optimal trials and only keep the best one.
Start to fuse 3 checkpoints via the greedy soup algorithm.
💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
AutoMM has created your model. 🎉🎉🎉

To load the model, use the code below:
    ```python
    from autogluon.multimodal import MultiModalPredictor
    predictor = MultiModalPredictor.load("/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20251024_143805")
    ```

If you are not satisfied with the model, try to increase the training time, 
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).

Tune Status

Current time:	2025-10-24 14:39:21
Running for:	00:01:09.95
Memory:	6.0/30.9 GiB

System Info

Trial Status

Trial name	status	loc	model.names	model.timm_image.che ckpoint_name	optim.lr	iter	total time (s)	val_accuracy
356e10a6	TERMINATED	10.0.1.11:3972	('timm_image', _74c0	ghostnet_100	0.000275191	26	42.4435	0.95
a8951bfe	TERMINATED	10.0.1.11:4195	('timm_image', _0a00	mobilenetv3_lar_ed70	0.000378927	4	7.01832	0.425

Trial Progress

Trial name	should_checkpoint	val_accuracy
356e10a6	True	0.95
a8951bfe	True	0.425

Total fitting time:  1m20s

Let’s check out the test accuracy of the fitted model after HPO:

scores_hpo = predictor_hpo.evaluate(test_data, metrics=["accuracy"])
print('Top-1 test acc: %.3f' % scores_hpo["accuracy"])

Top-1 test acc: 0.838

💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.

From the training log, you should be able to see the current best trial as below:

Current best trial: 47aef96a with val_accuracy=0.862500011920929 and parameters={'optim.lr': 0.0007195214018085505, 'model.timm_image.checkpoint_name': 'ghostnet_100'}

After our simple 2-trial HPO run, we got a better test accuracy, by searching different learning rates and models, compared to the out-of-box solution provided in the previous section. HPO helps select the combination of hyperparameters with highest validation accuracy.

Other Examples¶

You may go to AutoMM Examples to explore other examples about AutoMM.

Customization¶

To learn how to customize AutoMM, please refer to Customize AutoMM.