Searchable Objects
When defining custom Python objects such as network architectures or specialized optimizers, it can be hard to decide what values to set for all of their attributes. AutoGluon provides an API that lets you instead specify a search space of possible values for such attributes; the best value within that space is then searched for automatically at runtime. This tutorial demonstrates how to do this by adding a decorator, without otherwise modifying your existing code.
Example for Constructing a Network
This tutorial covers an example of selecting a neural network’s
architecture as a hyperparameter optimization (HPO) task. If you are
interested in efficient neural architecture search (NAS), please refer
to the sec_proxyless tutorial instead.
CIFAR ResNet in GluonCV
GluonCV provides CIFARResNet, which lets users specify how many layers to use at each stage. For example, we can construct a CIFAR ResNet with only 1 layer per stage:
import pickle
from gluoncv.model_zoo.cifarresnet import CIFARResNetV1, CIFARBasicBlockV1
layers = [1, 1, 1]
channels = [16, 16, 32, 64]
net = CIFARResNetV1(CIFARBasicBlockV1, layers, channels)
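As a rough guide to what the layers list controls, the sketch below counts the weighted layers of a basic-block CIFAR ResNet. It assumes the usual convention of one stem convolution, two 3x3 convolutions per basic block, and one final dense layer; shortcut (downsample) convolutions are not counted. The helper name cifar_resnet_depth is hypothetical and not part of GluonCV.

```python
def cifar_resnet_depth(layers):
    """Count weighted layers: stem conv + 2 convs per basic block + dense output."""
    convs_per_block = 2  # assumption: basic blocks, as in CIFARBasicBlockV1
    return 1 + convs_per_block * sum(layers) + 1


print(cifar_resnet_depth([1, 1, 1]))  # 8
print(cifar_resnet_depth([3, 3, 3]))  # 20
```

Under this convention, the one-layer-per-stage network built above has 8 weighted layers, while three blocks per stage gives the familiar 20-layer configuration.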
We can visualize the network:
import autogluon.core as ag
from autogluon.vision.utils import plot_network
plot_network(net, (1, 3, 32, 32))
Searchable Network Architecture Using AutoGluon Object
autogluon.obj() attaches a customized search space to any
user-defined class. It can also be used within autogluon.Categorical()
if you have multiple networks to choose from.
@ag.obj(
    nstage1=ag.space.Int(2, 4),
    nstage2=ag.space.Int(2, 4),
)
class MyCifarResNet(CIFARResNetV1):
    def __init__(self, nstage1, nstage2):
        # The third stage receives whatever remains of a fixed budget of 9 blocks.
        nstage3 = 9 - nstage1 - nstage2
        layers = [nstage1, nstage2, nstage3]
        channels = [16, 16, 32, 64]
        super().__init__(CIFARBasicBlockV1, layers=layers, channels=channels)
Create one network instance and print the configuration space:
mynet = MyCifarResNet()
print(mynet.cs)
Configuration space object:
Hyperparameters:
nstage1, Type: UniformInteger, Range: [2, 4], Default: 3
nstage2, Type: UniformInteger, Range: [2, 4], Default: 3
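To make the mechanism more concrete, here is a toy sketch of how a decorator can capture search spaces and expose them on the objects it builds. It is not AutoGluon's implementation: IntSpace, searchable, and ToyNet are hypothetical names, and the default value is simply assumed to be the midpoint of the range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntSpace:
    """Hypothetical integer search space with inclusive bounds."""
    lower: int
    upper: int

    @property
    def default(self):
        # Assumption: use the (floored) midpoint of the range as the default.
        return (self.lower + self.upper) // 2


def searchable(**spaces):
    """Toy decorator: record search spaces and build objects from their defaults."""
    def wrap(cls):
        def factory(**overrides):
            merged = {**spaces, **overrides}
            # Entries that are still spaces remain searchable; the rest are fixed.
            cs = {k: v for k, v in merged.items() if isinstance(v, IntSpace)}
            kwargs = {k: v.default if isinstance(v, IntSpace) else v
                      for k, v in merged.items()}
            obj = cls(**kwargs)
            obj.cs = cs
            return obj
        return factory
    return wrap


@searchable(nstage1=IntSpace(2, 4), nstage2=IntSpace(2, 4))
class ToyNet:
    def __init__(self, nstage1, nstage2):
        self.layers = [nstage1, nstage2, 9 - nstage1 - nstage2]


net = ToyNet()
print(net.cs)      # both arguments are still searchable
print(net.layers)  # [3, 3, 3], built from the midpoint defaults
```

In this sketch, passing a fixed value such as ToyNet(nstage1=1) removes that entry from cs, since only arguments that are still spaces are kept there.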
We can also overwrite existing search spaces:
mynet1 = MyCifarResNet(nstage1=1,
                       nstage2=ag.space.Int(5, 10))
print(mynet1.cs)
Configuration space object:
Hyperparameters:
nstage2, Type: UniformInteger, Range: [5, 10], Default: 8
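Because the constructor above derives nstage3 = 9 - nstage1 - nstage2, changing a range also changes which third-stage sizes are reachable. The plain-Python check below (no AutoGluon required; stage_layouts is a hypothetical helper) enumerates the layouts for the default space and for the overridden one, and counts combinations where nstage3 would drop below 1.

```python
from itertools import product


def stage_layouts(nstage1_values, nstage2_values, total_blocks=9):
    """Return every [nstage1, nstage2, nstage3] implied by the two ranges."""
    return [[n1, n2, total_blocks - n1 - n2]
            for n1, n2 in product(nstage1_values, nstage2_values)]


# Default space from the decorator: both stages in [2, 4] (inclusive).
default_layouts = stage_layouts(range(2, 5), range(2, 5))
# Overridden space: nstage1 fixed to 1, nstage2 in [5, 10] (inclusive).
override_layouts = stage_layouts([1], range(5, 11))

for name, layouts in [("default", default_layouts), ("override", override_layouts)]:
    invalid = [layout for layout in layouts if layout[2] < 1]
    print(f"{name}: {len(layouts)} layouts, {len(invalid)} with nstage3 < 1")
# default: 9 layouts, 0 with nstage3 < 1
# override: 6 layouts, 3 with nstage3 < 1
```

In the overridden space, values of nstage2 above 7 leave no blocks for the third stage, so a narrower range or a larger total may be preferable when a range is widened this way.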
Decorate Existing Class
We can also use autogluon.obj() to easily decorate any existing
class. For example, to search over the learning rate and weight
decay of the Adam optimizer, we only need to add a decorator:
from mxnet import optimizer as optim
@ag.obj()
class Adam(optim.Adam):
    pass
Then we can create an instance:
myoptim = Adam(learning_rate=ag.Real(1e-2, 1e-1, log=True), wd=ag.Real(1e-5, 1e-3, log=True))
print(myoptim.cs)
Configuration space object:
Hyperparameters:
learning_rate, Type: UniformFloat, Range: [0.01, 0.1], Default: 0.0316227766, on log-scale
wd, Type: UniformFloat, Range: [1e-05, 0.001], Default: 0.0001, on log-scale
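The defaults printed above coincide with the geometric mean of each range, which is the midpoint of the interval on a log scale. The following sketch, using only the standard library, reproduces those numbers and draws log-uniform samples; log_midpoint and sample_log_uniform are hypothetical helpers, not AutoGluon functions.

```python
import math
import random


def log_midpoint(lower, upper):
    """Midpoint of [lower, upper] on a log scale, i.e. the geometric mean."""
    return math.exp((math.log(lower) + math.log(upper)) / 2)


def sample_log_uniform(lower, upper, rng):
    """Draw a value whose logarithm is uniform between log(lower) and log(upper)."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


print(round(log_midpoint(1e-2, 1e-1), 10))  # 0.0316227766
print(round(log_midpoint(1e-5, 1e-3), 10))  # 0.0001

rng = random.Random(0)
samples = [sample_log_uniform(1e-2, 1e-1, rng) for _ in range(5)]
print(samples)  # five learning rates between 0.01 and 0.1
```

Sampling on a log scale spreads trials evenly across orders of magnitude, which is a common choice for learning rates and weight decay.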
Launch Experiments Using AutoGluon Object
AutoGluon objects are compatible with the Fit API in AutoGluon tasks, and
also work with user-defined training scripts using
autogluon.autogluon_register_args(). We can start fitting:
from autogluon.vision import ImagePredictor
classifier = ImagePredictor().fit('cifar10', hyperparameters={'net': mynet, 'optimizer': myoptim, 'epochs': 1}, ngpus_per_trial=1)
time_limit=auto set to time_limit=7200.
Starting fit without HPO
modified configs(<old> != <new>): {
root.valid.num_workers 4 != 8
root.valid.batch_size 128 != 16
root.train.rec_val_idx ~/.mxnet/datasets/imagenet/rec/val.idx != auto
root.train.num_workers 4 != 8
root.train.early_stop_max_value 1.0 != inf
root.train.rec_train_idx ~/.mxnet/datasets/imagenet/rec/train.idx != auto
root.train.data_dir ~/.mxnet/datasets/imagenet != auto
root.train.batch_size 128 != 16
root.train.rec_train ~/.mxnet/datasets/imagenet/rec/train.rec != auto
root.train.num_training_samples 1281167 != -1
root.train.rec_val ~/.mxnet/datasets/imagenet/rec/val.rec != auto
root.train.lr 0.1 != 0.01
root.train.early_stop_patience -1 != 10
root.train.epochs 10 != 1
root.train.early_stop_baseline 0.0 != -inf
root.img_cls.model resnet50_v1 != resnet50
}
Saved config to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de/.trial_0/config.yaml
Start training from [Epoch 0]
Epoch[0] Batch [49] Speed: 72.770480 samples/sec accuracy=0.132500 lr=0.010000
Epoch[0] Batch [99] Speed: 73.166324 samples/sec accuracy=0.146250 lr=0.010000
... (per-batch log lines for batches 149 through 3299 omitted) ...
Epoch[0] Batch [3349] Speed: 71.912820 samples/sec accuracy=0.215802 lr=0.010000
[Epoch 0] training: accuracy=0.215833
[Epoch 0] speed: 71 samples/sec time cost: 759.787446
[Epoch 0] validation: top1=0.299833 top5=0.840500
[Epoch 0] Current best top-1: 0.299833 vs previous -inf, saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de/.trial_0/best_checkpoint.pkl
Unable to pickle object due to the reason: Can't pickle <class '__main__.MyCifarResNet'>: it's not the same object as __main__.MyCifarResNet. This object is not saved.
Applying the state from the best checkpoint...
Unable to resume the state from the best checkpoint, using the latest state.
Finished, total runtime is 784.42 s
{ 'best_config': { 'batch_size': 16,
                   'custom_net': MyCifarResNet( ... layer listing identical to the fit_summary() output below ... ),
                   'custom_optimizer': <__main__.Adam object at 0x7f6464113f90>,
                   'dist_ip_addrs': None,
                   'early_stop_baseline': -inf,
                   'early_stop_max_value': inf,
                   'early_stop_patience': 10,
                   'epochs': 1,
                   'final_fit': False,
                   'gpus': [0],
                   'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de',
                   'lr': 0.01,
                   'model': 'resnet50',
                   'ngpus_per_trial': 1,
                   'nthreads_per_trial': 128,
                   'num_trials': 1,
                   'num_workers': 8,
                   'problem_type': 'multiclass',
                   'scheduler': 'local',
                   'search_strategy': 'random',
                   'searcher': 'random',
                   'seed': 530,
                   'time_limits': 7200,
                   'wall_clock_tick': 1630454739.5001423},
  'total_time': 767.6996819972992,
  'train_acc': 0.21583333333333332,
  'valid_acc': 0.29983333333333334}
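The "Can't pickle ... it's not the same object as ..." message in the training log above is a standard pickle error: it is raised when the class of an object can no longer be found under its own name in its module. One way this can happen, shown here with the standard library only, is when a decorator rebinds the class name to a different object. This is an illustrative assumption about the cause, not a statement about AutoGluon's internals; factory_decorator and Net are hypothetical names.

```python
import pickle


def factory_decorator(cls):
    """Replace the class name with a factory function that builds instances."""
    def factory(*args, **kwargs):
        return cls(*args, **kwargs)
    return factory


@factory_decorator
class Net:
    def __init__(self):
        self.layers = [3, 3, 3]


def pickling_error(obj):
    """Return the error message if obj cannot be pickled, else None."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError) as err:
        return str(err)
    return None


net = Net()                 # built through the factory; type(net) is the original class
print(type(net).__name__)   # Net
print(pickling_error(net))  # reports that pickling failed
```

Here the module-level name Net refers to the factory function, so pickle cannot resolve the instance's class by name. Defining the class in an importable module and keeping its name bound to the class itself generally avoids this kind of failure.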
print(classifier.fit_summary())
{'train_acc': 0.21583333333333332, 'valid_acc': 0.29983333333333334, 'total_time': 767.6996819972992, 'best_config': {'model': 'resnet50', 'lr': 0.01, 'num_trials': 1, 'epochs': 1, 'batch_size': 16, 'nthreads_per_trial': 128, 'ngpus_per_trial': 1, 'time_limits': 7200, 'search_strategy': 'random', 'dist_ip_addrs': None, 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/5377d4de', 'searcher': 'random', 'scheduler': 'local', 'custom_net': MyCifarResNet(
(features): HybridSequential(
(0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(3): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(3): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(4): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(5): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(6): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(7): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(4): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
)
(output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7f6464113f90>, 'early_stop_patience': 10, 'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'num_workers': 8, 'gpus': [0], 'seed': 530, 'final_fit': False, 'wall_clock_tick': 1630454739.5001423, 'problem_type': 'multiclass'}, 'fit_history': {'train_acc': 0.21583333333333332, 'valid_acc': 0.29983333333333334, 'total_time': 767.6996819972992, 'best_config': { ... identical to the 'best_config' entry above ... }}}