Searchable Objects¶

When defining custom Python objects such as network architectures, or specialized optimizers, it may be hard to decide what values to set for all of their attributes. AutoGluon provides an API that allows you to instead specify a search space of possible values to consider for such attributes, within which the optimal value will be automatically searched for at runtime. This tutorial demonstrates how easy this is to do, without having to modify your existing code at all!

Example for Constructing a Network¶

This tutorial covers an example of selecting a neural network’s architecture as a hyperparameter optimization (HPO) task. If you are interested in efficient neural architecture search (NAS), please refer to this other tutorial instead: sec_proxyless_ .

CIFAR ResNet in GluonCV¶

GluonCV provides CIFARResNet, which allow user to specify how many layers at each stage. For example, we can construct a CIFAR ResNet with only 1 layer per stage:

import pickle
from gluoncv.model_zoo.cifarresnet import CIFARResNetV1, CIFARBasicBlockV1

layers = [1, 1, 1]
channels = [16, 16, 32, 64]
net = CIFARResNetV1(CIFARBasicBlockV1, layers, channels)

/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/venv/lib/python3.7/site-packages/gluoncv/__init__.py:40: UserWarning: Both mxnet==1.7.0 and torch==1.7.1+cu101 are installed. You might encounter increased GPU memory footprint if both framework are used at the same time.
  warnings.warn(f'Both mxnet=={mx.__version__} and torch=={torch.__version__} are installed. '

We can visualize the network:

import autogluon.core as ag
from autogluon.vision.utils import plot_network

plot_network(net, (1, 3, 32, 32))

../../_images/output_object_d3e86d_3_0.svg

Searchable Network Architecture Using AutoGluon Object¶

autogluon.obj() enables customized search space to any user defined class. It can also be used within autogluon.Categorical() if you have multiple networks to choose from.

@ag.obj(
    nstage1=ag.space.Int(2, 4),
    nstage2=ag.space.Int(2, 4),
)
class MyCifarResNet(CIFARResNetV1):
    def __init__(self, nstage1, nstage2):
        nstage3 = 9 - nstage1 - nstage2
        layers = [nstage1, nstage2, nstage3]
        channels = [16, 16, 32, 64]
        super().__init__(CIFARBasicBlockV1, layers=layers, channels=channels)

Create one network instance and print the configuration space:

mynet=MyCifarResNet()
print(mynet.cs)

Configuration space object:
  Hyperparameters:
    nstage1, Type: UniformInteger, Range: [2, 4], Default: 3
    nstage2, Type: UniformInteger, Range: [2, 4], Default: 3

We can also overwrite existing search spaces:

mynet1 = MyCifarResNet(nstage1=1,
                       nstage2=ag.space.Int(5, 10))
print(mynet1.cs)

Configuration space object:
  Hyperparameters:
    nstage2, Type: UniformInteger, Range: [5, 10], Default: 8

Decorate Existing Class¶

We can also use autogluon.obj() to easily decorate any existing classes. For example, if we want to search learning rate and weight decay for Adam optimizer, we only need to add a decorator:

from mxnet import optimizer as optim
@ag.obj()
class Adam(optim.Adam):
    pass

Then we can create an instance:

myoptim = Adam(learning_rate=ag.Real(1e-2, 1e-1, log=True), wd=ag.Real(1e-5, 1e-3, log=True))
print(myoptim.cs)

Configuration space object:
  Hyperparameters:
    learning_rate, Type: UniformFloat, Range: [0.01, 0.1], Default: 0.0316227766, on log-scale
    wd, Type: UniformFloat, Range: [1e-05, 0.001], Default: 0.0001, on log-scale

Launch Experiments Using AutoGluon Object¶

AutoGluon Object is compatible with Fit API in AutoGluon tasks, and also works with user-defined training scripts using autogluon.autogluon_register_args(). We can start fitting:

from autogluon.vision import ImagePredictor
classifier = ImagePredictor().fit('cifar10', hyperparameters={'net': mynet, 'optimizer': myoptim, 'epochs': 1}, ngpus_per_trial=1)

time_limit=auto set to time_limit=7200.
Starting fit without HPO
modified configs(<old> != <new>): {
root.valid.num_workers 4 != 8
root.valid.batch_size 128 != 16
root.train.rec_val_idx ~/.mxnet/datasets/imagenet/rec/val.idx != auto
root.train.num_workers 4 != 8
root.train.early_stop_max_value 1.0 != inf
root.train.rec_train_idx ~/.mxnet/datasets/imagenet/rec/train.idx != auto
root.train.data_dir  ~/.mxnet/datasets/imagenet != auto
root.train.batch_size 128 != 16
root.train.rec_train ~/.mxnet/datasets/imagenet/rec/train.rec != auto
root.train.num_training_samples 1281167 != -1
root.train.rec_val   ~/.mxnet/datasets/imagenet/rec/val.rec != auto
root.train.lr        0.1 != 0.01
root.train.early_stop_patience -1 != 10
root.train.epochs    10 != 1
root.train.early_stop_baseline 0.0 != -inf
root.img_cls.model   resnet50_v1 != resnet50
}
Saved config to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de/.trial_0/config.yaml
Start training from [Epoch 0]
Epoch[0] Batch [49] Speed: 72.770480 samples/sec    accuracy=0.132500       lr=0.010000
Epoch[0] Batch [99] Speed: 73.166324 samples/sec    accuracy=0.146250       lr=0.010000
Epoch[0] Batch [149]        Speed: 72.969063 samples/sec    accuracy=0.155417       lr=0.010000
Epoch[0] Batch [199]        Speed: 72.633540 samples/sec    accuracy=0.165937       lr=0.010000
Epoch[0] Batch [249]        Speed: 72.316026 samples/sec    accuracy=0.167000       lr=0.010000
Epoch[0] Batch [299]        Speed: 72.032110 samples/sec    accuracy=0.171875       lr=0.010000
Epoch[0] Batch [349]        Speed: 71.892782 samples/sec    accuracy=0.176786       lr=0.010000
Epoch[0] Batch [399]        Speed: 71.934727 samples/sec    accuracy=0.177969       lr=0.010000
Epoch[0] Batch [449]        Speed: 71.951777 samples/sec    accuracy=0.178333       lr=0.010000
Epoch[0] Batch [499]        Speed: 71.876971 samples/sec    accuracy=0.181500       lr=0.010000
Epoch[0] Batch [549]        Speed: 71.474737 samples/sec    accuracy=0.181591       lr=0.010000
Epoch[0] Batch [599]        Speed: 71.278845 samples/sec    accuracy=0.182917       lr=0.010000
Epoch[0] Batch [649]        Speed: 70.976441 samples/sec    accuracy=0.183462       lr=0.010000
Epoch[0] Batch [699]        Speed: 70.835507 samples/sec    accuracy=0.185179       lr=0.010000
Epoch[0] Batch [749]        Speed: 70.588953 samples/sec    accuracy=0.186500       lr=0.010000
Epoch[0] Batch [799]        Speed: 70.293463 samples/sec    accuracy=0.187891       lr=0.010000
Epoch[0] Batch [849]        Speed: 69.829359 samples/sec    accuracy=0.188971       lr=0.010000
Epoch[0] Batch [899]        Speed: 69.570659 samples/sec    accuracy=0.191042       lr=0.010000
Epoch[0] Batch [949]        Speed: 69.121725 samples/sec    accuracy=0.191645       lr=0.010000
Epoch[0] Batch [999]        Speed: 68.595424 samples/sec    accuracy=0.191875       lr=0.010000
Epoch[0] Batch [1049]       Speed: 68.060402 samples/sec    accuracy=0.192440       lr=0.010000
Epoch[0] Batch [1099]       Speed: 68.928145 samples/sec    accuracy=0.190909       lr=0.010000
Epoch[0] Batch [1149]       Speed: 70.299055 samples/sec    accuracy=0.191576       lr=0.010000
Epoch[0] Batch [1199]       Speed: 70.904903 samples/sec    accuracy=0.193177       lr=0.010000
Epoch[0] Batch [1249]       Speed: 71.366764 samples/sec    accuracy=0.193400       lr=0.010000
Epoch[0] Batch [1299]       Speed: 71.629010 samples/sec    accuracy=0.194183       lr=0.010000
Epoch[0] Batch [1349]       Speed: 71.902969 samples/sec    accuracy=0.194583       lr=0.010000
Epoch[0] Batch [1399]       Speed: 71.912137 samples/sec    accuracy=0.194688       lr=0.010000
Epoch[0] Batch [1449]       Speed: 71.922055 samples/sec    accuracy=0.195560       lr=0.010000
Epoch[0] Batch [1499]       Speed: 71.972581 samples/sec    accuracy=0.196250       lr=0.010000
Epoch[0] Batch [1549]       Speed: 71.918170 samples/sec    accuracy=0.197419       lr=0.010000
Epoch[0] Batch [1599]       Speed: 71.920872 samples/sec    accuracy=0.197656       lr=0.010000
Epoch[0] Batch [1649]       Speed: 71.914389 samples/sec    accuracy=0.198523       lr=0.010000
Epoch[0] Batch [1699]       Speed: 71.716690 samples/sec    accuracy=0.199449       lr=0.010000
Epoch[0] Batch [1749]       Speed: 71.452302 samples/sec    accuracy=0.200250       lr=0.010000
Epoch[0] Batch [1799]       Speed: 71.375081 samples/sec    accuracy=0.200764       lr=0.010000
Epoch[0] Batch [1849]       Speed: 71.168992 samples/sec    accuracy=0.202027       lr=0.010000
Epoch[0] Batch [1899]       Speed: 71.067646 samples/sec    accuracy=0.202928       lr=0.010000
Epoch[0] Batch [1949]       Speed: 70.862554 samples/sec    accuracy=0.203045       lr=0.010000
Epoch[0] Batch [1999]       Speed: 70.642792 samples/sec    accuracy=0.203625       lr=0.010000
Epoch[0] Batch [2049]       Speed: 70.378299 samples/sec    accuracy=0.204451       lr=0.010000
Epoch[0] Batch [2099]       Speed: 70.077728 samples/sec    accuracy=0.204881       lr=0.010000
Epoch[0] Batch [2149]       Speed: 69.779028 samples/sec    accuracy=0.205058       lr=0.010000
Epoch[0] Batch [2199]       Speed: 69.407270 samples/sec    accuracy=0.205199       lr=0.010000
Epoch[0] Batch [2249]       Speed: 68.982103 samples/sec    accuracy=0.205417       lr=0.010000
Epoch[0] Batch [2299]       Speed: 68.543977 samples/sec    accuracy=0.205707       lr=0.010000
Epoch[0] Batch [2349]       Speed: 67.978313 samples/sec    accuracy=0.206277       lr=0.010000
Epoch[0] Batch [2399]       Speed: 67.877752 samples/sec    accuracy=0.206745       lr=0.010000
Epoch[0] Batch [2449]       Speed: 69.436989 samples/sec    accuracy=0.207219       lr=0.010000
Epoch[0] Batch [2499]       Speed: 70.444969 samples/sec    accuracy=0.207525       lr=0.010000
Epoch[0] Batch [2549]       Speed: 71.011626 samples/sec    accuracy=0.208039       lr=0.010000
Epoch[0] Batch [2599]       Speed: 71.458473 samples/sec    accuracy=0.208293       lr=0.010000
Epoch[0] Batch [2649]       Speed: 71.858107 samples/sec    accuracy=0.208679       lr=0.010000
Epoch[0] Batch [2699]       Speed: 71.954840 samples/sec    accuracy=0.209167       lr=0.010000
Epoch[0] Batch [2749]       Speed: 71.955043 samples/sec    accuracy=0.209841       lr=0.010000
Epoch[0] Batch [2799]       Speed: 71.917584 samples/sec    accuracy=0.210491       lr=0.010000
Epoch[0] Batch [2849]       Speed: 71.906585 samples/sec    accuracy=0.210921       lr=0.010000
Epoch[0] Batch [2899]       Speed: 71.898531 samples/sec    accuracy=0.211034       lr=0.010000
Epoch[0] Batch [2949]       Speed: 71.852369 samples/sec    accuracy=0.211822       lr=0.010000
Epoch[0] Batch [2999]       Speed: 71.905512 samples/sec    accuracy=0.212333       lr=0.010000
Epoch[0] Batch [3049]       Speed: 71.922314 samples/sec    accuracy=0.213258       lr=0.010000
Epoch[0] Batch [3099]       Speed: 71.986712 samples/sec    accuracy=0.213347       lr=0.010000
Epoch[0] Batch [3149]       Speed: 71.936237 samples/sec    accuracy=0.213333       lr=0.010000
Epoch[0] Batch [3199]       Speed: 71.932058 samples/sec    accuracy=0.214023       lr=0.010000
Epoch[0] Batch [3249]       Speed: 71.886595 samples/sec    accuracy=0.215038       lr=0.010000
Epoch[0] Batch [3299]       Speed: 72.027008 samples/sec    accuracy=0.215303       lr=0.010000
Epoch[0] Batch [3349]       Speed: 71.912820 samples/sec    accuracy=0.215802       lr=0.010000
[Epoch 0] training: accuracy=0.215833
[Epoch 0] speed: 71 samples/sec     time cost: 759.787446
[Epoch 0] validation: top1=0.299833 top5=0.840500
[Epoch 0] Current best top-1: 0.299833 vs previous -inf, saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de/.trial_0/best_checkpoint.pkl
Unable to pickle object due to the reason: Can't pickle <class '__main__.MyCifarResNet'>: it's not the same object as __main__.MyCifarResNet. This object is not saved.
Applying the state from the best checkpoint...
Unable to resume the state from the best checkpoint, using the latest state.
Finished, total runtime is 784.42 s
{ 'best_config': { 'batch_size': 16,
                   'custom_net': MyCifarResNet(
  (features): HybridSequential(
    (0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
    (2): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (3): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (3): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (4): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (5): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (6): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (7): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (4): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
  )
  (output): Dense(64 -> 10, linear)
),
                   'custom_optimizer': <__main__.Adam object at 0x7f6464113f90>,
                   'dist_ip_addrs': None,
                   'early_stop_baseline': -inf,
                   'early_stop_max_value': inf,
                   'early_stop_patience': 10,
                   'epochs': 1,
                   'final_fit': False,
                   'gpus': [0],
                   'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de',
                   'lr': 0.01,
                   'model': 'resnet50',
                   'ngpus_per_trial': 1,
                   'nthreads_per_trial': 128,
                   'num_trials': 1,
                   'num_workers': 8,
                   'problem_type': 'multiclass',
                   'scheduler': 'local',
                   'search_strategy': 'random',
                   'searcher': 'random',
                   'seed': 530,
                   'time_limits': 7200,
                   'wall_clock_tick': 1630454739.5001423},
  'total_time': 767.6996819972992,
  'train_acc': 0.21583333333333332,
  'valid_acc': 0.29983333333333334}

print(classifier.fit_summary())

{'train_acc': 0.21583333333333332, 'valid_acc': 0.29983333333333334, 'total_time': 767.6996819972992, 'best_config': {'model': 'resnet50', 'lr': 0.01, 'num_trials': 1, 'epochs': 1, 'batch_size': 16, 'nthreads_per_trial': 128, 'ngpus_per_trial': 1, 'time_limits': 7200, 'search_strategy': 'random', 'dist_ip_addrs': None, 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de', 'searcher': 'random', 'scheduler': 'local', 'custom_net': MyCifarResNet(
  (features): HybridSequential(
    (0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
    (2): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (3): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (3): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (4): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (5): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (6): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (7): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (4): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
  )
  (output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7f6464113f90>, 'early_stop_patience': 10, 'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'num_workers': 8, 'gpus': [0], 'seed': 530, 'final_fit': False, 'wall_clock_tick': 1630454739.5001423, 'problem_type': 'multiclass'}, 'fit_history': {'train_acc': 0.21583333333333332, 'valid_acc': 0.29983333333333334, 'total_time': 767.6996819972992, 'best_config': {'model': 'resnet50', 'lr': 0.01, 'num_trials': 1, 'epochs': 1, 'batch_size': 16, 'nthreads_per_trial': 128, 'ngpus_per_trial': 1, 'time_limits': 7200, 'search_strategy': 'random', 'dist_ip_addrs': None, 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de', 'searcher': 'random', 'scheduler': 'local', 'custom_net': MyCifarResNet(
  (features): HybridSequential(
    (0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
    (2): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (3): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (3): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (4): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (5): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (6): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (7): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (4): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
  )
  (output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7f6464113f90>, 'early_stop_patience': 10, 'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'num_workers': 8, 'gpus': [0], 'seed': 530, 'final_fit': False, 'wall_clock_tick': 1630454739.5001423, 'problem_type': 'multiclass'}}}