AutoGluon#

_images/autogluon-s.png

AutoML for Image, Text, Time Series, and Tabular Data

Get Started

Quick Prototyping

Build machine learning solutions on raw data in a few lines of code.

State-of-the-art Techniques

Automatically utilize SOTA models without expert knowledge.

Easy to Deploy

Move from experimentation to production with cloud predictors and pre-built containers.

Customizable

Extensible with custom feature processing, models, and metrics.

Quick Examples#

Tabular

Predict the class column in a data table:

from autogluon.tabular import TabularDataset, TabularPredictor

data_root = 'https://autogluon.s3.amazonaws.com/datasets/Inc/'
train_data = TabularDataset(data_root + 'train.csv')
test_data = TabularDataset(data_root + 'test.csv')

predictor = TabularPredictor(label='class').fit(train_data=train_data)
predictions = predictor.predict(test_data)
Multimodal
from autogluon.multimodal import MultiModalPredictor
from autogluon.core.utils.loaders import load_pd

data_root = 'https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/'
train_data = load_pd.load(data_root + 'train.parquet')
test_data = load_pd.load(data_root + 'dev.parquet')

predictor = MultiModalPredictor(label='label').fit(train_data=train_data)
predictions = predictor.predict(test_data)
from autogluon.multimodal import MultiModalPredictor
from autogluon.multimodal.utils.misc import shopee_dataset

train_data, test_data = shopee_dataset('./automm_shopee_data')

predictor = MultiModalPredictor(label='label').fit(train_data=train_data)
predictions = predictor.predict(test_data)
from autogluon.multimodal import MultiModalPredictor
from autogluon.core.utils.loaders import load_pd

data_root = 'https://automl-mm-bench.s3.amazonaws.com/ner/mit-movies/'
train_data = load_pd.load(data_root + 'train.csv')
test_data = load_pd.load(data_root + 'test.csv')

predictor = MultiModalPredictor(problem_type="ner", label="entity_annotations")

predictor.fit(train_data)
predictor.evaluate(test_data)

sentence = "Game of Thrones is an American fantasy drama television series created" +
           "by David Benioff"
prediction = predictor.predict({ 'text_snippet': [sentence]})
from autogluon.multimodal import MultiModalPredictor, utils
import ir_datasets
import pandas as pd

dataset = ir_datasets.load("beir/fiqa/dev")
docs_df = pd.DataFrame(dataset.docs_iter()).set_index("doc_id")

predictor = MultiModalPredictor(problem_type="text_similarity")

doc_embedding = predictor.extract_embedding(docs_df)
q_embedding = predictor.extract_embedding([
  "what happened when the dot com bubble burst?"
])

similarity = utils.compute_semantic_similarity(q_embedding, doc_embedding)
# Install mmcv-related dependencies
!mim install "mmcv==2.1.0"
!pip install "mmdet==3.2.0"

from autogluon.multimodal import MultiModalPredictor
from autogluon.core.utils.loaders import load_zip

data_zip = "https://automl-mm-bench.s3.amazonaws.com/object_detection_dataset/" + \
           "tiny_motorbike_coco.zip"
load_zip.unzip(data_zip, unzip_dir=".")

train_path = "./tiny_motorbike/Annotations/trainval_cocoformat.json"
test_path = "./tiny_motorbike/Annotations/test_cocoformat.json"

predictor = MultiModalPredictor(
  problem_type="object_detection",
  sample_data_path=train_path
)

predictor.fit(train_path)
score = predictor.evaluate(test_path)

pred = predictor.predict({"image": ["./tiny_motorbike/JPEGImages/000038.jpg"]})
Time Series

Forecast future values of time series:

from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

data = TimeSeriesDataFrame('https://autogluon.s3.amazonaws.com/datasets/timeseries/m4_hourly/train.csv')

predictor = TimeSeriesPredictor(target='target', prediction_length=48).fit(data)
predictions = predictor.predict(data)

Installation#

Install AutoGluon using pip:

pip install autogluon

AutoGluon supports Linux, MacOS, and Windows. See Installing AutoGluon for detailed instructions.

Community#

Twitter

Get involved in the AutoGluon community by joining our Discord!