Searchable Objects
When defining custom Python objects such as network architectures or specialized optimizers, it can be hard to decide what values to set for all of their attributes. AutoGluon provides an API that lets you specify a search space of possible values for such attributes instead; a value is then selected from that space automatically at runtime. This tutorial demonstrates how to do this by wrapping your existing classes rather than rewriting them.
Example for Constructing a Network
This tutorial covers an example of selecting a neural network’s
architecture as a hyperparameter optimization (HPO) task. If you are
interested in efficient neural architecture search (NAS), please refer
to the tutorial labeled sec_proxyless instead.
CIFAR ResNet in GluonCV
GluonCV provides CIFARResNet, which allows the user to specify how many layers to use at each stage. For example, we can construct a CIFAR ResNet with only one layer per stage:
import pickle
from gluoncv.model_zoo.cifarresnet import CIFARResNetV1, CIFARBasicBlockV1
layers = [1, 1, 1]  # number of blocks in each of the three stages
channels = [16, 16, 32, 64]  # stem channels, then the channels of each stage
net = CIFARResNetV1(CIFARBasicBlockV1, layers, channels)
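To relate the layers argument to the familiar "ResNet-N" names, here is a small sketch in plain Python. It assumes the usual CIFAR ResNet counting convention (one stem convolution, two convolutions per basic block, one final dense layer, shortcut convolutions not counted); the helper name cifar_resnet_depth is ours and is not part of GluonCV.

```python
def cifar_resnet_depth(layers, convs_per_block=2):
    """Count weighted layers: stem conv + convs inside blocks + final dense layer."""
    return 1 + convs_per_block * sum(layers) + 1


print(cifar_resnet_depth([1, 1, 1]))  # 8, the small network built above
print(cifar_resnet_depth([3, 3, 3]))  # 20, i.e. the classic 6n + 2 rule with n = 3
```

Under this convention, changing the per-stage counts is what changes the depth of the network, which is why they make a natural search dimension.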
We can visualize the network:
import autogluon.core as ag
from autogluon.vision.utils import plot_network
plot_network(net, (1, 3, 32, 32))
Searchable Network Architecture Using AutoGluon Object
autogluon.obj() lets you attach a customized search space to any
user-defined class. It can also be used within autogluon.Categorical() if
you have multiple networks to choose from.
@ag.obj(
    nstage1=ag.space.Int(2, 4),
    nstage2=ag.space.Int(2, 4),
)
class MyCifarResNet(CIFARResNetV1):
    def __init__(self, nstage1, nstage2):
        nstage3 = 9 - nstage1 - nstage2
        layers = [nstage1, nstage2, nstage3]
        channels = [16, 16, 32, 64]
        super().__init__(CIFARBasicBlockV1, layers=layers, channels=channels)
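Because nstage3 is derived as 9 - nstage1 - nstage2, the search space above is really a set of ways to split nine blocks across three stages. The following sketch enumerates that set in plain Python, assuming both bounds of Int(2, 4) are inclusive; the variable names are illustrative only.

```python
from itertools import product

TOTAL_BLOCKS = 9
stage_range = range(2, 5)  # the inclusive range [2, 4]

# Every (nstage1, nstage2) pair together with the derived nstage3.
configs = [
    (n1, n2, TOTAL_BLOCKS - n1 - n2)
    for n1, n2 in product(stage_range, stage_range)
]
for layers in configs:
    print(layers)

nstage3_values = sorted({n3 for _, _, n3 in configs})
print(len(configs), nstage3_values)  # 9 [1, 2, 3, 4, 5]
```

Every candidate keeps the total at nine blocks, and the derived third stage always has at least one block for this choice of ranges.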
Create one network instance and print the configuration space:
mynet = MyCifarResNet()
print(mynet.cs)
Configuration space object:
Hyperparameters:
nstage1, Type: UniformInteger, Range: [2, 4], Default: 3
nstage2, Type: UniformInteger, Range: [2, 4], Default: 3
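To build intuition for why the decorated class can be instantiated without arguments and still report a configuration space, here is a toy sketch of the general idea: store the arguments as spaces and only construct the real object once concrete values are drawn. This is not AutoGluon's implementation; IntSpace, searchable, and StageConfig are hypothetical names invented for this illustration.

```python
import random


class IntSpace:
    """Hypothetical integer range with inclusive bounds (not ag.space.Int)."""

    def __init__(self, lower, upper):
        self.lower, self.upper = lower, upper


def searchable(**default_spaces):
    """Toy decorator: keep constructor arguments as spaces until sampled."""

    def wrap(cls):
        class Deferred:
            def __init__(self, **overrides):
                # Copy so that one instance's overrides never leak into another.
                self.kwargs = {**default_spaces, **overrides}

            @property
            def cs(self):
                # Only arguments that are still spaces remain searchable.
                return {
                    name: (value.lower, value.upper)
                    for name, value in self.kwargs.items()
                    if isinstance(value, IntSpace)
                }

            def sample(self, seed=None):
                # Draw a concrete value for each space, then build the real object.
                rng = random.Random(seed)
                concrete = {
                    name: rng.randint(value.lower, value.upper)
                    if isinstance(value, IntSpace) else value
                    for name, value in self.kwargs.items()
                }
                return cls(**concrete)

        return Deferred

    return wrap


@searchable(nstage1=IntSpace(2, 4), nstage2=IntSpace(2, 4))
class StageConfig:
    def __init__(self, nstage1, nstage2):
        self.layers = [nstage1, nstage2, 9 - nstage1 - nstage2]


config = StageConfig()
print(config.cs)                     # {'nstage1': (2, 4), 'nstage2': (2, 4)}
print(StageConfig(nstage1=3).cs)     # {'nstage2': (2, 4)}
print(config.sample(seed=0).layers)  # one concrete draw; the three counts sum to 9
```

Passing a fixed value for an argument removes it from the toy configuration space, which mirrors how a fixed argument no longer needs to be searched.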
We can also overwrite existing search spaces:
mynet1 = MyCifarResNet(nstage1=1,
                       nstage2=ag.space.Int(5, 10))
print(mynet1.cs)
Configuration space object:
Hyperparameters:
nstage2, Type: UniformInteger, Range: [5, 10], Default: 8
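One thing to watch when overwriting a search space is any value derived from it. With nstage1 fixed to 1 and nstage2 drawn from [5, 10], the derived nstage3 = 9 - nstage1 - nstage2 is no longer guaranteed to be positive. The sketch below only does this arithmetic in plain Python; how CIFARResNetV1 treats a non-positive count is not covered here, and derived_nstage3 is a hypothetical helper.

```python
def derived_nstage3(nstage1, nstage2, total=9):
    """Blocks left for the third stage once the first two are chosen."""
    return total - nstage1 - nstage2


nstage1 = 1
for nstage2 in range(5, 11):  # the inclusive range [5, 10]
    n3 = derived_nstage3(nstage1, nstage2)
    status = "ok" if n3 >= 1 else "non-positive"
    print(nstage2, n3, status)

valid = [n2 for n2 in range(5, 11) if derived_nstage3(nstage1, n2) >= 1]
print(valid)  # [5, 6, 7]
```

If a positive third stage matters for your model, narrowing the overriding range (here to [5, 7]) keeps every sampled configuration meaningful.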
Decorate Existing Class
We can also use autogluon.obj() to decorate any existing
class. For example, to search over the learning rate and weight
decay of the Adam optimizer, we only need to add a decorator:
from mxnet import optimizer as optim
@ag.obj()
class Adam(optim.Adam):
    pass
Then we can create an instance:
myoptim = Adam(learning_rate=ag.Real(1e-2, 1e-1, log=True), wd=ag.Real(1e-5, 1e-3, log=True))
print(myoptim.cs)
Configuration space object:
Hyperparameters:
learning_rate, Type: UniformFloat, Range: [0.01, 0.1], Default: 0.0316227766, on log-scale
wd, Type: UniformFloat, Range: [1e-05, 0.001], Default: 0.0001, on log-scale
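The defaults printed above coincide with the geometric mean of each range, which is the midpoint on a log scale. The sketch below reproduces those two numbers and shows one common way to draw a log-uniform sample in plain Python. It illustrates the idea of a log-scale range and is not a description of AutoGluon's internal sampler; both function names are ours.

```python
import math
import random


def log_midpoint(lower, upper):
    """Midpoint of [lower, upper] on a log scale, i.e. the geometric mean."""
    return math.exp((math.log(lower) + math.log(upper)) / 2)


def sample_log_uniform(lower, upper, rng):
    """Draw uniformly in log space, then map back to the original scale."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


print(f"{log_midpoint(1e-2, 1e-1):.6g}")  # 0.0316228 (learning_rate default)
print(f"{log_midpoint(1e-5, 1e-3):.6g}")  # 0.0001 (wd default)

rng = random.Random(0)
samples = [sample_log_uniform(1e-5, 1e-3, rng) for _ in range(5)]
print(samples)  # five values between 1e-05 and 0.001
```

Sampling in log space spends as many trials between 1e-5 and 1e-4 as between 1e-4 and 1e-3, which is usually what you want for quantities such as learning rate and weight decay.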
Launch Experiments Using AutoGluon Object
An AutoGluon object is compatible with the fit API in AutoGluon tasks, and
it also works with user-defined training scripts using
autogluon.autogluon_register_args(). We can start fitting:
from autogluon.vision import ImagePredictor
classifier = ImagePredictor().fit(
    'cifar10',
    hyperparameters={'net': mynet, 'optimizer': myoptim, 'epochs': 1},
    ngpus_per_trial=1,
)
time_limit=auto set to time_limit=7200.
Starting fit without HPO
modified configs(<old> != <new>): {
root.valid.num_workers 4 != 8
root.valid.batch_size 128 != 16
root.train.rec_val_idx ~/.mxnet/datasets/imagenet/rec/val.idx != auto
root.train.num_workers 4 != 8
root.train.early_stop_max_value 1.0 != inf
root.train.rec_train_idx ~/.mxnet/datasets/imagenet/rec/train.idx != auto
root.train.data_dir ~/.mxnet/datasets/imagenet != auto
root.train.batch_size 128 != 16
root.train.rec_train ~/.mxnet/datasets/imagenet/rec/train.rec != auto
root.train.num_training_samples 1281167 != -1
root.train.rec_val ~/.mxnet/datasets/imagenet/rec/val.rec != auto
root.train.lr 0.1 != 0.01
root.train.early_stop_patience -1 != 10
root.train.epochs 10 != 1
root.train.early_stop_baseline 0.0 != -inf
root.img_cls.model resnet50_v1 != resnet50
}
Saved config to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de/.trial_0/config.yaml
Start training from [Epoch 0]
Epoch[0] Batch [49] Speed: 72.770480 samples/sec accuracy=0.132500 lr=0.010000
Epoch[0] Batch [99] Speed: 73.166324 samples/sec accuracy=0.146250 lr=0.010000
Epoch[0] Batch [149] Speed: 72.969063 samples/sec accuracy=0.155417 lr=0.010000
Epoch[0] Batch [199] Speed: 72.633540 samples/sec accuracy=0.165937 lr=0.010000
Epoch[0] Batch [249] Speed: 72.316026 samples/sec accuracy=0.167000 lr=0.010000
Epoch[0] Batch [299] Speed: 72.032110 samples/sec accuracy=0.171875 lr=0.010000
Epoch[0] Batch [349] Speed: 71.892782 samples/sec accuracy=0.176786 lr=0.010000
Epoch[0] Batch [399] Speed: 71.934727 samples/sec accuracy=0.177969 lr=0.010000
Epoch[0] Batch [449] Speed: 71.951777 samples/sec accuracy=0.178333 lr=0.010000
Epoch[0] Batch [499] Speed: 71.876971 samples/sec accuracy=0.181500 lr=0.010000
Epoch[0] Batch [549] Speed: 71.474737 samples/sec accuracy=0.181591 lr=0.010000
Epoch[0] Batch [599] Speed: 71.278845 samples/sec accuracy=0.182917 lr=0.010000
Epoch[0] Batch [649] Speed: 70.976441 samples/sec accuracy=0.183462 lr=0.010000
Epoch[0] Batch [699] Speed: 70.835507 samples/sec accuracy=0.185179 lr=0.010000
Epoch[0] Batch [749] Speed: 70.588953 samples/sec accuracy=0.186500 lr=0.010000
Epoch[0] Batch [799] Speed: 70.293463 samples/sec accuracy=0.187891 lr=0.010000
Epoch[0] Batch [849] Speed: 69.829359 samples/sec accuracy=0.188971 lr=0.010000
Epoch[0] Batch [899] Speed: 69.570659 samples/sec accuracy=0.191042 lr=0.010000
Epoch[0] Batch [949] Speed: 69.121725 samples/sec accuracy=0.191645 lr=0.010000
Epoch[0] Batch [999] Speed: 68.595424 samples/sec accuracy=0.191875 lr=0.010000
Epoch[0] Batch [1049] Speed: 68.060402 samples/sec accuracy=0.192440 lr=0.010000
Epoch[0] Batch [1099] Speed: 68.928145 samples/sec accuracy=0.190909 lr=0.010000
Epoch[0] Batch [1149] Speed: 70.299055 samples/sec accuracy=0.191576 lr=0.010000
Epoch[0] Batch [1199] Speed: 70.904903 samples/sec accuracy=0.193177 lr=0.010000
Epoch[0] Batch [1249] Speed: 71.366764 samples/sec accuracy=0.193400 lr=0.010000
Epoch[0] Batch [1299] Speed: 71.629010 samples/sec accuracy=0.194183 lr=0.010000
Epoch[0] Batch [1349] Speed: 71.902969 samples/sec accuracy=0.194583 lr=0.010000
Epoch[0] Batch [1399] Speed: 71.912137 samples/sec accuracy=0.194688 lr=0.010000
Epoch[0] Batch [1449] Speed: 71.922055 samples/sec accuracy=0.195560 lr=0.010000
Epoch[0] Batch [1499] Speed: 71.972581 samples/sec accuracy=0.196250 lr=0.010000
Epoch[0] Batch [1549] Speed: 71.918170 samples/sec accuracy=0.197419 lr=0.010000
Epoch[0] Batch [1599] Speed: 71.920872 samples/sec accuracy=0.197656 lr=0.010000
Epoch[0] Batch [1649] Speed: 71.914389 samples/sec accuracy=0.198523 lr=0.010000
Epoch[0] Batch [1699] Speed: 71.716690 samples/sec accuracy=0.199449 lr=0.010000
Epoch[0] Batch [1749] Speed: 71.452302 samples/sec accuracy=0.200250 lr=0.010000
Epoch[0] Batch [1799] Speed: 71.375081 samples/sec accuracy=0.200764 lr=0.010000
Epoch[0] Batch [1849] Speed: 71.168992 samples/sec accuracy=0.202027 lr=0.010000
Epoch[0] Batch [1899] Speed: 71.067646 samples/sec accuracy=0.202928 lr=0.010000
Epoch[0] Batch [1949] Speed: 70.862554 samples/sec accuracy=0.203045 lr=0.010000
Epoch[0] Batch [1999] Speed: 70.642792 samples/sec accuracy=0.203625 lr=0.010000
Epoch[0] Batch [2049] Speed: 70.378299 samples/sec accuracy=0.204451 lr=0.010000
Epoch[0] Batch [2099] Speed: 70.077728 samples/sec accuracy=0.204881 lr=0.010000
Epoch[0] Batch [2149] Speed: 69.779028 samples/sec accuracy=0.205058 lr=0.010000
Epoch[0] Batch [2199] Speed: 69.407270 samples/sec accuracy=0.205199 lr=0.010000
Epoch[0] Batch [2249] Speed: 68.982103 samples/sec accuracy=0.205417 lr=0.010000
Epoch[0] Batch [2299] Speed: 68.543977 samples/sec accuracy=0.205707 lr=0.010000
Epoch[0] Batch [2349] Speed: 67.978313 samples/sec accuracy=0.206277 lr=0.010000
Epoch[0] Batch [2399] Speed: 67.877752 samples/sec accuracy=0.206745 lr=0.010000
Epoch[0] Batch [2449] Speed: 69.436989 samples/sec accuracy=0.207219 lr=0.010000
Epoch[0] Batch [2499] Speed: 70.444969 samples/sec accuracy=0.207525 lr=0.010000
Epoch[0] Batch [2549] Speed: 71.011626 samples/sec accuracy=0.208039 lr=0.010000
Epoch[0] Batch [2599] Speed: 71.458473 samples/sec accuracy=0.208293 lr=0.010000
Epoch[0] Batch [2649] Speed: 71.858107 samples/sec accuracy=0.208679 lr=0.010000
Epoch[0] Batch [2699] Speed: 71.954840 samples/sec accuracy=0.209167 lr=0.010000
Epoch[0] Batch [2749] Speed: 71.955043 samples/sec accuracy=0.209841 lr=0.010000
Epoch[0] Batch [2799] Speed: 71.917584 samples/sec accuracy=0.210491 lr=0.010000
Epoch[0] Batch [2849] Speed: 71.906585 samples/sec accuracy=0.210921 lr=0.010000
Epoch[0] Batch [2899] Speed: 71.898531 samples/sec accuracy=0.211034 lr=0.010000
Epoch[0] Batch [2949] Speed: 71.852369 samples/sec accuracy=0.211822 lr=0.010000
Epoch[0] Batch [2999] Speed: 71.905512 samples/sec accuracy=0.212333 lr=0.010000
Epoch[0] Batch [3049] Speed: 71.922314 samples/sec accuracy=0.213258 lr=0.010000
Epoch[0] Batch [3099] Speed: 71.986712 samples/sec accuracy=0.213347 lr=0.010000
Epoch[0] Batch [3149] Speed: 71.936237 samples/sec accuracy=0.213333 lr=0.010000
Epoch[0] Batch [3199] Speed: 71.932058 samples/sec accuracy=0.214023 lr=0.010000
Epoch[0] Batch [3249] Speed: 71.886595 samples/sec accuracy=0.215038 lr=0.010000
Epoch[0] Batch [3299] Speed: 72.027008 samples/sec accuracy=0.215303 lr=0.010000
Epoch[0] Batch [3349] Speed: 71.912820 samples/sec accuracy=0.215802 lr=0.010000
[Epoch 0] training: accuracy=0.215833
[Epoch 0] speed: 71 samples/sec time cost: 759.787446
[Epoch 0] validation: top1=0.299833 top5=0.840500
[Epoch 0] Current best top-1: 0.299833 vs previous -inf, saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de/.trial_0/best_checkpoint.pkl
Unable to pickle object due to the reason: Can't pickle <class '__main__.MyCifarResNet'>: it's not the same object as __main__.MyCifarResNet. This object is not saved.
Applying the state from the best checkpoint...
Unable to resume the state from the best checkpoint, using the latest state.
Finished, total runtime is 784.42 s
{ 'best_config': { 'batch_size': 16,
'custom_net': MyCifarResNet(
(features): HybridSequential(
(0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(3): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(3): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(4): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(5): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(6): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(7): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(4): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
)
(output): Dense(64 -> 10, linear)
),
'custom_optimizer': <__main__.Adam object at 0x7f6464113f90>,
'dist_ip_addrs': None,
'early_stop_baseline': -inf,
'early_stop_max_value': inf,
'early_stop_patience': 10,
'epochs': 1,
'final_fit': False,
'gpus': [0],
'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de',
'lr': 0.01,
'model': 'resnet50',
'ngpus_per_trial': 1,
'nthreads_per_trial': 128,
'num_trials': 1,
'num_workers': 8,
'problem_type': 'multiclass',
'scheduler': 'local',
'search_strategy': 'random',
'searcher': 'random',
'seed': 530,
'time_limits': 7200,
'wall_clock_tick': 1630454739.5001423},
'total_time': 767.6996819972992,
'train_acc': 0.21583333333333332,
'valid_acc': 0.29983333333333334}
print(classifier.fit_summary())
{'train_acc': 0.21583333333333332, 'valid_acc': 0.29983333333333334, 'total_time': 767.6996819972992, 'best_config': {'model': 'resnet50', 'lr': 0.01, 'num_trials': 1, 'epochs': 1, 'batch_size': 16, 'nthreads_per_trial': 128, 'ngpus_per_trial': 1, 'time_limits': 7200, 'search_strategy': 'random', 'dist_ip_addrs': None, 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de', 'searcher': 'random', 'scheduler': 'local', 'custom_net': MyCifarResNet(
(features): HybridSequential(
(0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(3): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(3): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(4): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(5): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(6): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(7): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(4): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
)
(output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7f6464113f90>, 'early_stop_patience': 10, 'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'num_workers': 8, 'gpus': [0], 'seed': 530, 'final_fit': False, 'wall_clock_tick': 1630454739.5001423, 'problem_type': 'multiclass'}, 'fit_history': {'train_acc': 0.21583333333333332, 'valid_acc': 0.29983333333333334, 'total_time': 767.6996819972992, 'best_config': {'model': 'resnet50', 'lr': 0.01, 'num_trials': 1, 'epochs': 1, 'batch_size': 16, 'nthreads_per_trial': 128, 'ngpus_per_trial': 1, 'time_limits': 7200, 'search_strategy': 'random', 'dist_ip_addrs': None, 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de', 'searcher': 'random', 'scheduler': 'local', 'custom_net': MyCifarResNet(
(features): HybridSequential(
(0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(3): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(3): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(4): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(5): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(6): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(7): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(4): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
)
(output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7f6464113f90>, 'early_stop_patience': 10, 'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'num_workers': 8, 'gpus': [0], 'seed': 530, 'final_fit': False, 'wall_clock_tick': 1630454739.5001423, 'problem_type': 'multiclass'}}}
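The printed summary is long because it embeds the full network description. If you only need the headline numbers, you can pick individual keys instead of printing everything. The sketch below works on a hand-copied subset of the output above and assumes the summary is a plain dictionary with these keys, as the printed result suggests; headline is a hypothetical helper.

```python
# A subset of the summary printed above, copied by hand for illustration.
summary = {
    "train_acc": 0.21583333333333332,
    "valid_acc": 0.29983333333333334,
    "total_time": 767.6996819972992,
    "best_config": {"model": "resnet50", "lr": 0.01, "epochs": 1, "batch_size": 16},
}


def headline(summary, config_keys=("lr", "epochs", "batch_size")):
    """Return a one-line digest of the metrics plus a few chosen settings."""
    metrics = (
        f"train_acc={summary['train_acc']:.3f} "
        f"valid_acc={summary['valid_acc']:.3f} "
        f"time={summary['total_time']:.0f}s"
    )
    settings = " ".join(f"{key}={summary['best_config'][key]}" for key in config_keys)
    return f"{metrics} | {settings}"


print(headline(summary))
# train_acc=0.216 valid_acc=0.300 time=768s | lr=0.01 epochs=1 batch_size=16
```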