Version 0.4.1¶
We’re happy to announce the AutoGluon 0.4.1 release. 0.4.1 contains minor enhancements to Tabular, Text, Image, and Multimodal modules, along with many quality of life improvements and fixes.
This release is non-breaking when upgrading from v0.4.0. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
This release contains 55 commits from 10 contributors!
See the full commit changelog here: https://github.com/autogluon/autogluon/compare/v0.4.0...v0.4.1
Special thanks to @yiqings, @leandroimail, and @huibinshen, who were first-time contributors to AutoGluon in this release!
Full Contributor List (ordered by # of commits):
@Innixma, @zhiqiangdon, @yinweisu, @sxjscience, @yiqings, @gradientsky, @willsmithorg, @canerturkmen, @leandroimail, @huibinshen.
This version supports Python versions 3.7 to 3.9.
Changes¶
AutoMM¶
New features¶
- Added `optimization.efficient_finetune` flag to support multiple efficient finetuning algorithms. (#1666) @sxjscience
  Supported options:
  - `bit_fit`: "BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models"
  - `norm_fit`: An extension of the algorithm in "Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs" and BitFit. We finetune both the parameters in the norm layers as well as the biases.
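A minimal sketch of enabling one of these options through `fit`'s `hyperparameters` argument (the dataset and label column are assumptions for illustration):

```python
from autogluon.text.automm import AutoMMPredictor

predictor = AutoMMPredictor(label="label_column")
predictor.fit(
    train_data,  # assumed pandas DataFrame containing "label_column"
    hyperparameters={"optimization.efficient_finetune": "bit_fit"},
)
```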
- Enabled knowledge distillation for AutoMM (#1670) @zhiqiangdon
  - The distillation API for `AutoMMPredictor` reuses the `.fit()` function:

```python
from autogluon.text.automm import AutoMMPredictor

teacher_predictor = AutoMMPredictor(label="label_column").fit(train_data)
student_predictor = AutoMMPredictor(label="label_column").fit(
    train_data,
    hyperparameters=student_and_distiller_hparams,
    teacher_predictor=teacher_predictor,
)
```
- Option to turn on returning feature column information (#1711) @zhiqiangdon
  - The feature column information is turned on for feature column distillation; in other cases it is turned off by default to reduce the dataloader's latency.
  - Added a `requires_column_info` flag in data processors and a utility function to turn this flag on or off.
- FT-Transformer implementation for tabular data in AutoMM (#1646) @yiqings
  - Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, Artem Babenko, "Revisiting Deep Learning Models for Tabular Data", 2022. (arxiv, official implementation)
- Make CLIP support multiple images per sample (#1606) @zhiqiangdon
  - Added multiple-image support for CLIP. Improved data loader robustness: added handling of missing images to prevent training crashes.
  - Added the choice of using a zero image if an image is missing.
- Avoid using `eos` as the sep token for CLIP. (#1710) @zhiqiangdon
- Updated the fusion transformer in AutoMM (#1712) @yiqings
  - Support constant learning rate in the `polynomial_decay` scheduler.
  - Updated the `[CLS]` token in the numerical/categorical transformer.
- Added more image augmentations: `verticalflip`, `colorjitter`, `randomaffine` (#1719) @Linuxdex, @sxjscience
- Added prompts for the percentage of missing images during image column detection. (#1623) @zhiqiangdon
- Support `average_precision` in AutoMM (#1697) @sxjscience
- Convert `roc_auc` / `average_precision` to `log_loss` for torchmetrics (#1715) @zhiqiangdon
  - `torchmetrics.AUROC` requires that both positive and negative examples be available in a mini-batch. When training a large model, the per-GPU batch size is likely small, leading to an incorrect `roc_auc` score. Converting from `roc_auc` to `log_loss` improves training stability.
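As a sketch, such a metric can be requested when constructing the predictor (assuming `eval_metric` accepts the metric name, as in other AutoGluon predictors); the conversion above then applies to the internal torchmetrics tracking:

```python
from autogluon.text.automm import AutoMMPredictor

# Validation is reported as average precision; internally, the torchmetrics
# objective may be swapped to log_loss for stability, as described above.
predictor = AutoMMPredictor(label="label_column", eval_metric="average_precision")
predictor.fit(train_data)  # train_data: assumed pandas DataFrame
```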
- Added `pytorch-lightning` 1.6 support (#1716) @sxjscience
Checkpointing and Model Outputs Changes¶
- Updated the names of the top-k checkpoint averaging methods and support customizing model names for terminal input (#1668) @zhiqiangdon
  - Following the paper https://arxiv.org/pdf/2203.05482.pdf, updated the top-k checkpoint averaging names: `union_soup` -> `uniform_soup` and `best_soup` -> `best`.
  - Updated function names (`customize_config_names` -> `customize_model_names` and `verify_config_names` -> `verify_model_names`) to make them easier to understand.
  - Support customizing model names for the terminal input.
- Implemented the GreedySoup algorithm proposed in the paper above. Added `union_soup`, `greedy_soup`, `best_soup` flags and changed the default value accordingly. (#1613) @sxjscience
- Updated the `standalone` flag in `automm.predictor.save()` to save the pretrained model for offline deployment (#1575) @yiqings
  - An efficient implementation to save the models downloaded from transformers for offline deployment. The revised logic is in #1572, and discussed in #1572 (comment).
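A minimal sketch of an offline-deployment save using the `standalone` flag (the path is an arbitrary example):

```python
from autogluon.text.automm import AutoMMPredictor

predictor = AutoMMPredictor(label="label_column")
predictor.fit(train_data)  # train_data: assumed pandas DataFrame

# Persist the pretrained backbone weights alongside the predictor so the saved
# directory can later be loaded on a machine without internet access.
predictor.save(path="automm_standalone", standalone=True)
loaded = AutoMMPredictor.load("automm_standalone")
```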
- Simplified the checkpoint template (#1636) @zhiqiangdon
  - Stopped using PyTorch Lightning's model checkpoint template in saving `AutoMMPredictor`'s final model checkpoint.
  - Improved the logic of continuous training. We pass the `ckpt_path` argument to PyTorch Lightning's trainer only when `resume=True`.
- Unified AutoMM's model output format and support customizing model names (#1643) @zhiqiangdon
  - Now each model's output is a dictionary with the model prefix as the first-level key. The format is uniform between single models and fusion models.
  - Now users can customize model names by using the internal registered names (`timm_image`, `hf_text`, `clip`, `numerical_mlp`, `categorical_mlp`, and `fusion_mlp`) as prefixes. This is helpful when users want to simultaneously use two models of the same type, e.g., `hf_text`. They can just use the names `hf_text_0` and `hf_text_1`.
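A sketch of what this enables; the config keys (`model.names`, `model.<name>.checkpoint_name`) and the checkpoint choices here are assumptions for illustration:

```python
from autogluon.text.automm import AutoMMPredictor

# Hypothetical: run two huggingface text backbones side by side by suffixing
# the registered `hf_text` prefix, then fuse their outputs with `fusion_mlp`.
predictor = AutoMMPredictor(label="label_column")
predictor.fit(
    train_data,  # assumed pandas DataFrame
    hyperparameters={
        "model.names": ["hf_text_0", "hf_text_1", "fusion_mlp"],
        "model.hf_text_0.checkpoint_name": "google/electra-small-discriminator",
        "model.hf_text_1.checkpoint_name": "roberta-base",
    },
)
```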
- Support the `standalone` feature in `TextPredictor` (#1651) @yiqings
- Fixed saving and loading tokenizers and text processors (#1656) @zhiqiangdon
  - Saved pre-trained huggingface tokenizers separately from the data processors.
  - This change is backwards-compatible with checkpoints saved by version `0.4.0`.
- Changed `load` from a classmethod to a staticmethod to avoid incorrect usage. (#1697) @sxjscience
- Added `AutoMMModelCheckpoint` to avoid evaluating the models to obtain the scores (#1716) @sxjscience
  - The checkpoint saves `best_k_models` into a YAML file so that it can be loaded later to determine the paths to the model checkpoints.
- Extract column features from AutoMM's model outputs (#1718) @zhiqiangdon
  - Added a utility function to extract column features for both image and text.
  - Support extracting column features for the models `timm_image`, `hf_text`, and `clip`.
- Make the AutoMM dataloader return feature column information (#1710) @zhiqiangdon
Bug fixes¶
- Fixed calling `save_pretrained_configs` in `AutoMMPredictor.save(standalone=True)` when no fusion model exists (here) (#1651) @yiqings
- Fixed error raising for setting a key that does not exist in the configuration (#1613) @sxjscience
- Fixed warning message about bf16. (#1625) @sxjscience
- Fixed the corner case of calculating the gradient accumulation step (#1633) @sxjscience
- Fixes for top-k averaging in the multi-GPU setting (#1707) @zhiqiangdon
Tabular¶
- Limited RF `max_leaf_nodes` to 15000 (previously uncapped) (#1717) @Innixma
  - Previously, for very large datasets, RF/XT memory and disk usage would quickly become unreasonable. This ensures that at a certain point, RF and XT will no longer become larger given more rows of training data. Benchmark results showed that the change is an improvement, particularly for the `high_quality` preset.
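The cap is a default rather than a hard limit; below is a sketch of overriding it for a single run through model hyperparameters, on the assumption that user-specified values take precedence over the new default (label and dataset are placeholders):

```python
from autogluon.tabular import TabularPredictor

# The "RF" key passes extra kwargs through to the sklearn RandomForest models.
predictor = TabularPredictor(label="class").fit(
    train_data,  # assumed pandas DataFrame
    hyperparameters={"RF": {"max_leaf_nodes": 30000}},
)
```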
- Limited KNN to 32 CPUs to avoid an OpenBLAS error (#1722) @Innixma
  - Issue #1020. When training K-nearest-neighbors (KNN) models, a rare error can sometimes occur that crashes the entire process:

```
BLAS : Program is Terminated. Because you tried to allocate too many memory regions. Segmentation fault: 11
```

  This error occurred when the machine had many CPU cores (>64 vCPUs) due to too many threads being created at once. By limiting the number of cores used to 32, the error is avoided.
- Improved memory warning thresholds (#1626) @Innixma
- Added `get_results` and `model_base_kwargs` (#1618) @Innixma
  - Added `get_results` to searchers, useful for debugging and for future extensions to HPO functionality.
  - Added a new way to initialize a `BaggedEnsembleModel` that avoids having to initialize the base model prior to initializing the bagged ensemble model.
- Updated resource logic in models (#1689) @Innixma
  - The previous implementation would crash if the user specified `auto` for resources; fixed in this PR.
  - Added `get_minimum_resources` to explicitly define minimum resource requirements within a method.
- Updated feature importance defaults: `subsample_size` 1000 -> 5000, `num_shuffle_sets` 3 -> 5 (#1708) @Innixma
  - This improves the quality of the feature importance values by default, especially the 99% confidence bounds. The change increases the time taken by ~8x, but this is acceptable because of the numerous inference speed optimizations made since these defaults were first introduced.
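The new defaults correspond roughly to the explicit call sketched below; both arguments can still be lowered to trade estimate quality for speed (the model path and test data are placeholders):

```python
from autogluon.tabular import TabularPredictor

predictor = TabularPredictor.load("ag_models/")  # assumed existing predictor

fi = predictor.feature_importance(
    test_data,  # assumed held-out pandas DataFrame
    subsample_size=5000,
    num_shuffle_sets=5,
)
print(fi)  # includes importance estimates and 99% confidence bounds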
- Added notice to ensure serializable custom metrics (#1705) @Innixma
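A sketch of a serializable custom metric: define the scoring function at module top level (not as a lambda or inside a notebook cell) so it can be pickled along with the predictor:

```python
from autogluon.core.metrics import make_scorer
from sklearn.metrics import mean_absolute_error

# Top-level named function: picklable, unlike a lambda or a closure.
def mae(y_true, y_pred):
    return mean_absolute_error(y_true, y_pred)

my_mae = make_scorer(name="my_mae", score_func=mae,
                     optimum=0, greater_is_better=False)
```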
Bug fixes¶
- Fixed `evaluate` when `weight_evaluation=True` (#1612) @Innixma
  - Previously, AutoGluon would crash if the user specified `predictor.evaluate(...)` or `predictor.evaluate_predictions(...)` when `self.weight_evaluation==True`.
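A sketch of the now-working path (column names are assumptions for illustration):

```python
from autogluon.tabular import TabularPredictor

predictor = TabularPredictor(
    label="class",
    sample_weight="weight",   # assumed name of the weight column in the data
    weight_evaluation=True,   # weights are applied to evaluation metrics too
).fit(train_data)

predictor.evaluate(test_data)  # test_data must also contain the "weight" column
```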
- Fixed `RuntimeError: dictionary changed size during iteration` (#1684, #1685) @leandroimail
- Fixed CatBoost custom metric & F1 support (#1690) @Innixma
- Fixed HPO not working for bagged models if the bagged model is loaded from disk (#1702) @Innixma
- Fixed feature importance erroring if `self.model_best` is `None` (can happen if no weighted ensemble is fit) (#1702) @Innixma
Documentation¶
- Updated the text tutorial on customizing hyperparameters (#1620) @zhiqiangdon
  - Added customizable backbones from the Huggingface model zoo and how to use local backbones.
- Improved implementations and docstrings of `save_pretrained_models` and `convert_checkpoint_name`. (#1656) @zhiqiangdon
- Added cheat sheet to website (#1605) @yinweisu
- Doc fix to use the correct predictor when calling leaderboard (#1652) @Innixma
Miscellaneous changes¶
- [security] Updated `pillow` to `9.0.1`+ (#1615) @gradientsky
- [security] Updated `ray` to `1.10.0`+ (#1616) @yinweisu
- Tabular regression test improvements (#1555) @willsmithorg
  - Regression testing of the model list and scores in tabular on small synthetic datasets (chosen for speed).
  - Tests about 20 different calls to `TabularPredictor` on both regression and classification tasks, multiple presets, etc.
  - When a test fails, it dumps out the config change required to make it pass, for ease of updating.
- Disabled the image/text predictor when no GPU is available in `TabularPredictor` (#1676) @yinweisu
  - Resources are validated before bagging is started. The image/text predictor models require a minimum of 1 GPU.
- Use class properties to set keys in model classes. This way, if we customize the prefix key, the other keys are automatically updated. (#1669) @zhiqiangdon
Various bugfixes, documentation and CI improvements¶
- @yinweisu (#1605, #1611, #1631, #1638, #1691)
- @zhiqiangdon (#1721)
- @Innixma (#1608, #1701)
- @sxjscience (#1714)