AutoGluon¶

Fast and Accurate ML in 3 Lines of Code

Quick Prototyping: Build machine learning solutions on raw data in a few lines of code.
State-of-the-art Techniques: Automatically utilize SOTA models without expert knowledge.
Easy to Deploy: Move from experimentation to production with cloud predictors and pre-built containers.
Customizable: Extensible with custom feature processing, models, and metrics.

Quick Examples¶

Multimodal

Text Classification

from autogluon.multimodal import MultiModalPredictor
from autogluon.core.utils.loaders import load_pd

data_root = 'https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/'
train_data = load_pd.load(data_root + 'train.parquet')
test_data = load_pd.load(data_root + 'dev.parquet')

predictor = MultiModalPredictor(label='label').fit(train_data=train_data)
predictions = predictor.predict(test_data)

Image Classification

from autogluon.multimodal import MultiModalPredictor
from autogluon.multimodal.utils.misc import shopee_dataset

train_data, test_data = shopee_dataset('./automm_shopee_data')

predictor = MultiModalPredictor(label='label').fit(train_data=train_data)
predictions = predictor.predict(test_data)

NER

from autogluon.multimodal import MultiModalPredictor
from autogluon.core.utils.loaders import load_pd

data_root = 'https://automl-mm-bench.s3.amazonaws.com/ner/mit-movies/'
train_data = load_pd.load(data_root + 'train.csv')
test_data = load_pd.load(data_root + 'test.csv')

predictor = MultiModalPredictor(problem_type="ner", label="entity_annotations")

predictor.fit(train_data)
predictor.evaluate(test_data)

sentence = "Game of Thrones is an American fantasy drama television series created" +
           "by David Benioff"
prediction = predictor.predict({ 'text_snippet': [sentence]})

Matching

from autogluon.multimodal import MultiModalPredictor, utils
import ir_datasets
import pandas as pd

dataset = ir_datasets.load("beir/fiqa/dev")
docs_df = pd.DataFrame(dataset.docs_iter()).set_index("doc_id")

predictor = MultiModalPredictor(problem_type="text_similarity")

doc_embedding = predictor.extract_embedding(docs_df)
q_embedding = predictor.extract_embedding([
  "what happened when the dot com bubble burst?"
])

similarity = utils.compute_semantic_similarity(q_embedding, doc_embedding)

Object Detection

# Install mmcv-related dependencies
!mim install "mmcv==2.1.0"
!pip install "mmdet==3.2.0"

from autogluon.multimodal import MultiModalPredictor
from autogluon.core.utils.loaders import load_zip

data_zip = "https://automl-mm-bench.s3.amazonaws.com/object_detection_dataset/" + \
           "tiny_motorbike_coco.zip"
load_zip.unzip(data_zip, unzip_dir=".")

train_path = "./tiny_motorbike/Annotations/trainval_cocoformat.json"
test_path = "./tiny_motorbike/Annotations/test_cocoformat.json"

predictor = MultiModalPredictor(
  problem_type="object_detection",
  sample_data_path=train_path
)

predictor.fit(train_path)
score = predictor.evaluate(test_path)

pred = predictor.predict({"image": ["./tiny_motorbike/JPEGImages/000038.jpg"]})

Installation¶

Install AutoGluon using pip:

pip install autogluon

AutoGluon supports Linux, MacOS, and Windows. See Installing AutoGluon for detailed instructions.

Managed Service¶

Looking for a managed AutoML service? We highly recommend checking out Amazon SageMaker Canvas! Powered by AutoGluon, it allows you to create highly accurate machine learning models without any machine learning experience or writing a single line of code.

Community¶

Get involved in the AutoGluon community by joining our Discord!

Citing AutoGluon¶

AutoGluon was originally developed by researchers and engineers at AWS AI. If you use AutoGluon in your research, please refer to our citing guide.