{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# AutoGluon Tabular - Foundational Models\n",
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/autogluon/autogluon/blob/master/docs/tutorials/tabular/tabular-foundational-models.ipynb)\n",
    "[![Open In SageMaker Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/autogluon/autogluon/blob/master/docs/tutorials/tabular/tabular-foundational-models.ipynb)\n",
    "\n",
    "In this tutorial, we introduce support for cutting-edge foundational tabular models that leverage pre-training and in-context learning to achieve state-of-the-art performance on tabular datasets. These models represent a significant advancement in automated machine learning for structured data.\n",
    "\n",
    "We'll explore three foundational tabular models:\n",
    "\n",
    "1. **Mitra** - AutoGluon's new state-of-the-art tabular foundation model\n",
    "2. **TabICL** - In-context learning for large tabular datasets\n",
    "3. **TabPFNv2** - Prior-fitted networks for accurate predictions on small data\n",
    "\n",
    "These models excel particularly on small to medium-sized datasets and can run in both zero-shot and fine-tuning modes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "First, let's install AutoGluon with support for foundational models:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [],
   "source": [
    "# Individual model installations:\n",
    "!pip install uv\n",
    "!uv pip install autogluon.tabular[mitra]   # For Mitra\n",
    "!uv pip install autogluon.tabular[tabicl]   # For TabICL\n",
    "!uv pip install autogluon.tabular[tabpfn]   # For TabPFNv2\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from autogluon.tabular import TabularDataset, TabularPredictor\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.datasets import load_wine, fetch_california_housing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example Data\n",
    "\n",
    "For this tutorial, we'll demonstrate the foundational models on two datasets to showcase their versatility:\n",
    "\n",
    "1. **Wine Dataset** (Multi-class Classification) - Small dataset (178 rows, 13 features) for comparing model performance\n",
    "2. **California Housing** (Regression) - Larger dataset (20,640 rows, 8 features) for the regression example\n",
    "\n",
    "Let's load and prepare these datasets:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load datasets\n",
    "\n",
    "# 1. Wine (Multi-class Classification)\n",
    "wine_data = load_wine()\n",
    "wine_df = pd.DataFrame(wine_data.data, columns=wine_data.feature_names)\n",
    "wine_df['target'] = wine_data.target\n",
    "\n",
    "# 2. California Housing (Regression)\n",
    "housing_data = fetch_california_housing()\n",
    "housing_df = pd.DataFrame(housing_data.data, columns=housing_data.feature_names)\n",
    "housing_df['target'] = housing_data.target\n",
    "\n",
    "print(\"Dataset shapes:\")\n",
    "print(f\"Wine: {wine_df.shape}\")\n",
    "print(f\"California Housing: {housing_df.shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Train/Test Splits\n",
    "\n",
    "Let's create train/test splits for our datasets:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create train/test splits (80/20)\n",
    "wine_train, wine_test = train_test_split(wine_df, test_size=0.2, random_state=42, stratify=wine_df['target'])\n",
    "housing_train, housing_test = train_test_split(housing_df, test_size=0.2, random_state=42)\n",
    "\n",
    "print(\"Training set sizes:\")\n",
    "print(f\"Wine: {len(wine_train)} samples\")\n",
    "print(f\"Housing: {len(housing_train)} samples\")\n",
    "\n",
    "# Convert to TabularDataset\n",
    "wine_train_data = TabularDataset(wine_train)\n",
    "wine_test_data = TabularDataset(wine_test)\n",
    "housing_train_data = TabularDataset(housing_train)\n",
    "housing_test_data = TabularDataset(housing_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Mitra: AutoGluon's Tabular Foundation Model\n",
    "\n",
    "[Mitra](https://huggingface.co/autogluon/mitra-classifier) is a new state-of-the-art tabular foundation model developed by the AutoGluon team, natively supported in AutoGluon with just three lines of code via `predictor.fit()`. Built on the in-context learning paradigm and pretrained exclusively on synthetic data, Mitra introduces a principled pretraining approach by carefully selecting and mixing diverse synthetic priors to promote robust generalization across a wide range of real-world tabular datasets.\n",
    "\n",
    "📊 **Mitra achieves state-of-the-art performance** on major benchmarks including TabRepo, TabZilla, AMLB, and TabArena, especially excelling on small tabular datasets with fewer than 5,000 samples and 100 features, for both classification and regression tasks.\n",
    "\n",
    "🧠 **Mitra supports both zero-shot and fine-tuning modes** and runs seamlessly on both GPU and CPU. Its weights are fully open-sourced under the Apache-2.0 license, making it a privacy-conscious and production-ready solution for enterprises concerned about data sharing and hosting.\n",
    "\n",
    "🔗 **Learn more on Hugging Face:**\n",
    "- Classification model: [autogluon/mitra-classifier](https://huggingface.co/autogluon/mitra-classifier)\n",
    "- Regression model: [autogluon/mitra-regressor](https://huggingface.co/autogluon/mitra-regressor)\n",
    "\n",
    "### Using Mitra for Classification"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create predictor with Mitra\n",
    "print(\"Training Mitra classifier on classification dataset...\")\n",
    "mitra_predictor = TabularPredictor(label='target')\n",
    "mitra_predictor.fit(\n",
    "    wine_train_data,\n",
    "    hyperparameters={\n",
    "        'MITRA': {'fine_tune': False}\n",
    "    },\n",
    ")\n",
    "\n",
    "print(\"\\nMitra training completed!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Evaluating Mitra Performance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Make predictions\n",
    "mitra_predictions = mitra_predictor.predict(wine_test_data)\n",
    "print(\"Sample Mitra predictions:\")\n",
    "print(mitra_predictions.head(10))\n",
    "\n",
    "# Show prediction probabilities for the first few samples\n",
    "mitra_proba = mitra_predictor.predict_proba(wine_test_data)\n",
    "print(mitra_proba.head())\n",
    "\n",
    "# Show model leaderboard\n",
    "print(\"\\nMitra Model Leaderboard:\")\n",
    "mitra_predictor.leaderboard(wine_test_data, silent=True)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Fine-tuning with Mitra"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mitra_predictor_ft = TabularPredictor(label='target')\n",
    "mitra_predictor_ft.fit(\n",
    "    wine_train_data,\n",
    "    hyperparameters={\n",
    "        'MITRA': {'fine_tune': True, 'fine_tune_steps': 10}\n",
    "    },\n",
    "    time_limit=120,  # 2 minutes\n",
    ")\n",
    "\n",
    "print(\"\\nMitra fine-tuning completed!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Evaluating Fine-tuned Mitra Performance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Show model leaderboard for the fine-tuned predictor\n",
    "print(\"\\nFine-tuned Mitra Model Leaderboard:\")\n",
    "mitra_predictor_ft.leaderboard(wine_test_data, silent=True)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using Mitra for Regression"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create predictor with Mitra for regression\n",
    "print(\"Training Mitra regressor on California Housing dataset...\")\n",
    "mitra_reg_predictor = TabularPredictor(\n",
    "    label='target',\n",
    "    path='./mitra_regressor_model',\n",
    "    problem_type='regression'\n",
    ")\n",
    "mitra_reg_predictor.fit(\n",
    "    housing_train_data.sample(1000, random_state=42),  # sample 1000 rows, seeded for reproducibility\n",
    "    hyperparameters={\n",
    "        'MITRA': {'fine_tune': False}\n",
    "    },\n",
    ")\n",
    "\n",
    "# Evaluate regression performance\n",
    "mitra_reg_predictor.leaderboard(housing_test_data)\n"
   ]
  },
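  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Mitra overview above cites fewer than 5,000 samples and 100 features as the range where the model does best, and the regression example subsamples California Housing to 1,000 rows. As a minimal sketch, the helper below checks a dataset's shape against those figures. The function name `within_mitra_size_range` and its default thresholds are illustrative assumptions taken from the text above; they are not part of the AutoGluon API and are not limits that AutoGluon enforces.\n",
    "\n",
    "```python\n",
    "# Hypothetical helper: the thresholds mirror the figures quoted in this\n",
    "# tutorial; they are not limits enforced by AutoGluon.\n",
    "def within_mitra_size_range(n_rows, n_features, max_rows=5000, max_features=100):\n",
    "    # True when both dimensions are strictly below the quoted limits\n",
    "    return n_rows < max_rows and n_features < max_features\n",
    "\n",
    "\n",
    "# Shapes of the full scikit-learn datasets, excluding the target column\n",
    "wine_shape = (178, 13)\n",
    "housing_shape = (20640, 8)\n",
    "\n",
    "print(within_mitra_size_range(*wine_shape))     # True\n",
    "print(within_mitra_size_range(*housing_shape))  # False\n",
    "```\n",
    "\n",
    "In this notebook it could be called as `within_mitra_size_range(*housing_train.drop(columns='target').shape)` to decide whether to subsample before fitting."
   ]
  },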
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. TabICL: In-Context Learning for Tabular Data\n",
    "\n",
    "**TabICL** (\"**Tab**ular **I**n-**C**ontext **L**earning\") is a foundational model designed specifically for in-context learning on large tabular datasets.\n",
    "\n",
    "**Paper**: [\"TabICL: A Tabular Foundation Model for In-Context Learning on Large Data\"](https://arxiv.org/abs/2502.05564)  \n",
    "**Authors**: Jingang Qu, David Holzmüller, Gaël Varoquaux, Marine Le Morvan  \n",
    "**GitHub**: https://github.com/soda-inria/tabicl\n",
    "\n",
    "TabICL uses a transformer architecture with in-context learning: the labeled training rows are supplied as context at prediction time, so it can make predictions without dataset-specific weight updates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Train TabICL on dataset\n",
    "print(\"Training TabICL on wine dataset...\")\n",
    "tabicl_predictor = TabularPredictor(\n",
    "    label='target',\n",
    "    path='./tabicl_model'\n",
    ")\n",
    "tabicl_predictor.fit(\n",
    "    wine_train_data,\n",
    "    hyperparameters={\n",
    "        'TABICL': {},\n",
    "    },\n",
    ")\n",
    "\n",
    "# Show prediction probabilities for first few samples\n",
    "tabicl_predictions = tabicl_predictor.predict_proba(wine_test_data)\n",
    "print(tabicl_predictions.head())\n",
    "\n",
    "# Show TabICL leaderboard\n",
    "print(\"\\nTabICL Model Details:\")\n",
    "tabicl_predictor.leaderboard(wine_test_data, silent=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. TabPFNv2: Prior-Fitted Networks\n",
    "\n",
    "**TabPFNv2** (\"**Tab**ular **P**rior-**F**itted **N**etworks **v2**\") is designed for accurate predictions on small tabular datasets by using prior-fitted network architectures.\n",
    "\n",
    "**Paper**: [\"Accurate predictions on small data with a tabular foundation model\"](https://www.nature.com/articles/s41586-024-08328-6)  \n",
    "**Authors**: Noah Hollmann, Samuel Müller, Lennart Purucker, Arjun Krishnakumar, Max Körfer, Shi Bin Hoo, Robin Tibor Schirrmeister & Frank Hutter  \n",
    "**GitHub**: https://github.com/PriorLabs/TabPFN\n",
    "\n",
    "TabPFNv2 excels on small datasets (< 10,000 samples) by leveraging prior knowledge learned during pre-training on synthetic datasets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Train TabPFNv2 on Wine dataset (perfect size for TabPFNv2)\n",
    "print(\"Training TabPFNv2 on Wine dataset...\")\n",
    "tabpfnv2_predictor = TabularPredictor(\n",
    "    label='target',\n",
    "    path='./tabpfnv2_model'\n",
    ")\n",
    "tabpfnv2_predictor.fit(\n",
    "    wine_train_data,\n",
    "    hyperparameters={\n",
    "        'TABPFNV2': {\n",
    "            # TabPFNv2 works best with default parameters on small datasets\n",
    "        },\n",
    "    },\n",
    ")\n",
    "\n",
    "# Show prediction probabilities for first few samples\n",
    "tabpfnv2_predictions = tabpfnv2_predictor.predict_proba(wine_test_data)\n",
    "print(tabpfnv2_predictions.head())\n",
    "\n",
    "\n",
    "tabpfnv2_predictor.leaderboard(wine_test_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Advanced Usage: Combining Multiple Foundational Models\n",
    "\n",
    "AutoGluon allows you to combine multiple foundational models in a single predictor for enhanced performance through model stacking and ensembling:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Configure multiple foundational models together\n",
    "multi_foundation_config = {\n",
    "    'MITRA': {\n",
    "        'fine_tune': True,\n",
    "        'fine_tune_steps': 10\n",
    "    },\n",
    "    'TABPFNV2': {},\n",
    "    'TABICL': {},\n",
    "}\n",
    "\n",
    "print(\"Training ensemble of foundational models...\")\n",
    "ensemble_predictor = TabularPredictor(\n",
    "    label='target',\n",
    "    path='./ensemble_foundation_model'\n",
    ").fit(\n",
    "    wine_train_data,\n",
    "    hyperparameters=multi_foundation_config,\n",
    "    time_limit=300,  # More time for multiple models\n",
    ")\n",
    "\n",
    "# Evaluate ensemble performance\n",
    "ensemble_predictor.leaderboard(wine_test_data)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "tutorial",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}