Components: model#

autogluon.eda.visualization.model#

ConfusionMatrix

Render a confusion matrix for a binary or multiclass classifier.

FeatureImportance

Render feature importance for the model.

RegressionEvaluation

This plot shows residuals on the vertical axis against predictions on the horizontal axis.

ModelLeaderboard

Render the model leaderboard for a trained model ensemble.

ConfusionMatrix#

class autogluon.eda.visualization.model.ConfusionMatrix(fig_args: Optional[Dict[str, Any]] = None, headers: bool = False, namespace: Optional[str] = None, **kwargs)

Render a confusion matrix for a binary or multiclass classifier.

This visualization depends on AutoGluonModelEvaluator analysis.

Parameters
  • headers (bool, default = False) – if True then render headers

  • namespace (str, default = None) – namespace to use; can be nested like ns_a.ns_b.ns_c

  • fig_args (Optional[Dict[str, Any]], default = None) – kwargs to pass into the chart figure
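
The dotted form of namespace can be read as a path into nested state. The following is a minimal sketch of that idea, assuming the state behaves like a plain nested dict. The helper resolve_namespace and the sample state are hypothetical and are not part of autogluon.eda.

```python
from typing import Any, Dict, Optional


def resolve_namespace(state: Dict[str, Any], namespace: Optional[str]) -> Dict[str, Any]:
    """Walk a dotted namespace such as 'ns_a.ns_b.ns_c' into a nested dict.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    if not namespace:
        return state  # no namespace: use the top level
    node = state
    for key in namespace.split("."):
        node = node[key]  # raises KeyError if a level is missing
    return node


# Sample state (made up for the demo)
state = {"ns_a": {"ns_b": {"ns_c": {"confusion_matrix": [[5, 1], [2, 7]]}}}}
print(resolve_namespace(state, "ns_a.ns_b.ns_c")["confusion_matrix"])
```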

Examples

>>> import autogluon.eda.analysis as eda
>>> import autogluon.eda.visualization as viz
>>> import autogluon.eda.auto as auto
>>>
>>> df_train = ...
>>> df_test = ...
>>> predictor = ...
>>>
>>> auto.analyze(model=predictor, val_data=df_test, anlz_facets=[
>>>     eda.model.AutoGluonModelEvaluator(),
>>> ], viz_facets=[
>>>     viz.model.ConfusionMatrix(fig_args=dict(figsize=(3,3)), annot_kws={"size": 12}),
>>> ])

FeatureImportance#

class autogluon.eda.visualization.model.FeatureImportance(show_barplots: bool = False, fig_args: Optional[Dict[str, Any]] = None, headers: bool = False, namespace: Optional[str] = None, **kwargs)

Render feature importance for the model.

This visualization depends on AutoGluonModelEvaluator analysis.

Parameters
  • show_barplots (bool, default = False) – render features barplots if True

  • headers (bool, default = False) – if True then render headers

  • namespace (str, default = None) – namespace to use; can be nested like ns_a.ns_b.ns_c

  • fig_args (Optional[Dict[str, Any]], default = None) – kwargs to pass into the chart figure
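
For intuition about the values being rendered: AutoGluon's TabularPredictor computes feature importance with a permutation-based approach. The sketch below illustrates that general idea in pure Python. The toy model, the data and the function names are assumptions made for the demo, and the code is not the library's implementation.

```python
import random
from typing import Callable, List, Sequence


def mean_abs_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(
    predict: Callable[[List[List[float]]], List[float]],
    X: List[List[float]],
    y: List[float],
    seed: int = 0,
) -> List[float]:
    """Return the increase in error after shuffling each column in turn."""
    rng = random.Random(seed)
    baseline = mean_abs_error(y, predict(X))
    importances = []
    for col in range(len(X[0])):
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)
        # Rebuild the rows with only this column permuted
        X_perm = [row[:col] + [value] + row[col + 1:] for row, value in zip(X, shuffled)]
        importances.append(mean_abs_error(y, predict(X_perm)) - baseline)
    return importances


# Toy model (an assumption for the demo): uses feature 0 only, ignores feature 1
def toy_predict(rows: List[List[float]]) -> List[float]:
    return [2.0 * row[0] for row in rows]


X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [2.0 * row[0] for row in X]
scores = permutation_importance(toy_predict, X, y)
print(scores)  # feature 0 gets a positive score, feature 1 gets zero
```

A feature the model ignores keeps the error unchanged when shuffled, so its importance is zero; shuffling a feature the model relies on increases the error.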

Examples

>>> import autogluon.eda.analysis as eda
>>> import autogluon.eda.visualization as viz
>>> import autogluon.eda.auto as auto
>>>
>>> df_train = ...
>>> df_test = ...
>>> predictor = ...
>>>
>>> auto.analyze(model=predictor, val_data=df_test, anlz_facets=[
>>>     eda.model.AutoGluonModelEvaluator(),
>>> ], viz_facets=[
>>>     viz.model.FeatureImportance(show_barplots=True)
>>> ])

RegressionEvaluation#

class autogluon.eda.visualization.model.RegressionEvaluation(residuals_plot_mode: Optional[str] = 'qoq', fig_args: Optional[Dict[str, Any]] = None, headers: bool = False, namespace: Optional[str] = None, **kwargs)

This plot shows residuals on the vertical axis against predictions on the horizontal axis.

This visualization depends on AutoGluonModelEvaluator analysis.

Parameters
  • residuals_plot_mode (Optional[str], default = 'qoq') – additional plot to render to the right of the main plot. Supported values:
      - qoq (default): Q-Q plot, a common way to check that residuals are normally distributed. If they are, their quantiles plotted against the quantiles of a normal distribution should form a straight line.
      - hist: histogram of the residuals; errors distributed normally around zero generally indicate a well-fitted model.
      - any other value: do not render the additional plot.

  • headers (bool, default = False) – if True then render headers

  • namespace (str, default = None) – namespace to use; can be nested like ns_a.ns_b.ns_c

  • fig_args (Optional[Dict[str, Any]], default = None) – kwargs to pass into the chart figure
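
To make the Q-Q check concrete, the sketch below pairs sorted residuals with standard-normal quantiles using only the standard library. It assumes residuals are computed as actual minus predicted and uses (i + 0.5) / n as the plotting position; both are illustrative choices, and the sample values are made up. It is not the component's rendering code.

```python
from statistics import NormalDist
from typing import List, Tuple


def qq_points(residuals: List[float]) -> List[Tuple[float, float]]:
    """Pair each sorted residual with the matching standard-normal quantile."""
    n = len(residuals)
    std_normal = NormalDist()
    ordered = sorted(residuals)
    # (i + 0.5) / n is one common choice of plotting position
    return [(std_normal.inv_cdf((i + 0.5) / n), r) for i, r in enumerate(ordered)]


# Made-up targets and predictions for the demo
y_true = [3.1, 4.9, 7.2, 8.8, 11.1, 12.7]
y_pred = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
residuals = [t - p for t, p in zip(y_true, y_pred)]  # assumed sign: actual - predicted

points = qq_points(residuals)
for theoretical, observed in points:
    print(f"{theoretical:+.3f}  {observed:+.3f}")
```

If the residuals are roughly normal, plotting the observed values against the theoretical quantiles gives points close to a straight line.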

Examples

>>> import autogluon.eda.analysis as eda
>>> import autogluon.eda.visualization as viz
>>> import autogluon.eda.auto as auto
>>>
>>> df_train = ...
>>> df_test = ...
>>> predictor = ...
>>>
>>> auto.analyze(model=predictor, val_data=df_test, anlz_facets=[
>>>     eda.model.AutoGluonModelEvaluator(),
>>> ], viz_facets=[
>>>     viz.model.RegressionEvaluation(fig_args=dict(figsize=(6,6)), marker='o', scatter_kws={'s':5}),
>>> ])

ModelLeaderboard#

class autogluon.eda.visualization.model.ModelLeaderboard(namespace: Optional[str] = None, headers: bool = False, **kwargs)

Render the model leaderboard for a trained model ensemble.

Parameters
  • headers (bool, default = False) – if True then render headers

  • namespace (str, default = None) – namespace to use; can be nested like ns_a.ns_b.ns_c

Examples

>>> import autogluon.eda.analysis as eda
>>> import autogluon.eda.visualization as viz
>>> import autogluon.eda.auto as auto
>>>
>>> df_train = ...
>>> df_test = ...
>>> predictor = ...
>>>
>>> auto.analyze(model=predictor, val_data=df_test, anlz_facets=[
>>>     eda.model.AutoGluonModelEvaluator(),
>>> ], viz_facets=[
>>>     viz.model.ModelLeaderboard(),
>>> ])

autogluon.eda.analysis.model#

AutoGluonModelEvaluator

Evaluate AutoGluon model performance.

AutoGluonModelQuickFit

Fit a quick model using AutoGluon.

AutoGluonModelEvaluator#

class autogluon.eda.analysis.model.AutoGluonModelEvaluator(normalize: Optional[str] = None, parent: Optional[AbstractAnalysis] = None, children: Optional[List[AbstractAnalysis]] = None, **kwargs)

Evaluate AutoGluon model performance.

This analysis requires a trained model passed in the model arg and uses the val_data dataset to assess model performance.

The validation dataset is assumed to have the same columns the model saw during training and must not have been used in the training process.

Parameters
  • model (TabularPredictor, required) – fitted AutoGluon model to analyze

  • val_data (pd.DataFrame, required) – validation dataset to use. Warning: do not use training data as validation data. Predictions on data the model saw during training tend to be optimistic and might not generalize to unseen data.

  • normalize ({'true', 'pred', 'all'}, default = None) – normalizes the confusion matrix over the true conditions (rows), the predicted conditions (columns), or the whole population. If None, the confusion matrix is not normalized. Note: applicable only to binary and multiclass classification; ignored for regression models.

  • parent (Optional[AbstractAnalysis], default = None) – parent Analysis

  • children (Optional[List[AbstractAnalysis]], default = None) – wrapped analyses; these will receive sampled args during fit call
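
The three normalize modes can be illustrated on a small matrix. This is a pure-Python sketch of the arithmetic only, assuming rows are true classes and columns are predicted classes and that every row and column has at least one sample. The function name and the counts are made up for the demo.

```python
from typing import List, Optional


def normalize_confusion_matrix(
    cm: List[List[int]], normalize: Optional[str] = None
) -> List[List[float]]:
    """Normalize counts over rows ('true'), columns ('pred') or the total ('all')."""
    if normalize is None:
        return [[float(v) for v in row] for row in cm]
    if normalize == "true":
        return [[v / sum(row) for v in row] for row in cm]
    if normalize == "pred":
        col_sums = [sum(col) for col in zip(*cm)]
        return [[v / s for v, s in zip(row, col_sums)] for row in cm]
    if normalize == "all":
        total = sum(sum(row) for row in cm)
        return [[v / total for v in row] for row in cm]
    raise ValueError(f"unsupported normalize value: {normalize!r}")


# Rows are true classes, columns are predicted classes (made-up counts)
cm = [[8, 2], [1, 9]]
by_true = normalize_confusion_matrix(cm, "true")  # each row sums to 1
by_pred = normalize_confusion_matrix(cm, "pred")  # each column sums to 1
by_all = normalize_confusion_matrix(cm, "all")    # all cells sum to 1
print(by_true)
print(by_pred)
print(by_all)
```

Normalizing over 'true' puts per-class recall on the diagonal, while normalizing over 'pred' puts per-class precision there.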

Examples

>>> import autogluon.eda.analysis as eda
>>> import autogluon.eda.visualization as viz
>>> import autogluon.eda.auto as auto
>>>
>>> df_train = ...
>>> df_test = ...
>>> predictor = ...
>>>
>>> auto.analyze(model=predictor, val_data=df_test, anlz_facets=[
>>>     eda.model.AutoGluonModelEvaluator(),
>>> ], viz_facets=[
>>>     viz.layouts.MarkdownSectionComponent(markdown=f'### Model Prediction for {predictor.label}'),
>>>     viz.model.ConfusionMatrix(fig_args=dict(figsize=(3,3)), annot_kws={"size": 12}),
>>>     viz.model.RegressionEvaluation(fig_args=dict(figsize=(6,6)), marker='o', scatter_kws={'s':5}),
>>>     viz.layouts.MarkdownSectionComponent(markdown='### Feature Importance for Trained Model'),
>>>     viz.model.FeatureImportance(show_barplots=True)
>>> ])

AutoGluonModelQuickFit#

class autogluon.eda.analysis.model.AutoGluonModelQuickFit(problem_type: str = 'auto', estimator_args: Optional[Dict[str, Any]] = None, parent: Optional[AbstractAnalysis] = None, children: Optional[List[AbstractAnalysis]] = None, save_model_to_state: bool = True, **kwargs)

Fit a quick model using AutoGluon.

train_data, val_data and label must be present in args.

Note: this component can be wrapped into TrainValidationSplit and Sampler to perform automated sampling and a train-test split. This logic is implemented in the quick_fit() shortcut.

Examples

>>> import autogluon.eda.analysis as eda
>>> import autogluon.eda.auto as auto
>>>
>>> test_data = ...
>>>
>>> # Quick fit
>>> state = auto.quick_fit(
>>>     train_data=..., label=...,
>>>     return_state=True,  # return state object from call
>>>     hyperparameters={'GBM': {}}  # train specific model
>>> )
>>>
>>> # Using quick fit model
>>> model = state.model
>>> y_pred = model.predict(test_data)

Parameters
  • problem_type (str, default = 'auto') – problem type to use. Valid values are 'auto', 'binary', 'multiclass', 'regression', 'quantile' and 'softclass'. With 'auto', the problem type is detected automatically using AutoGluon methods.

  • estimator_args (Optional[Dict[str, Any]], default = None) – kwargs to pass into the estimator constructor (TabularPredictor)

  • save_model_to_state (bool, default = True) – save the fitted model into state under the model key. This is helpful when the fitted model can be reused for other purposes (e.g. imputers)

  • parent (Optional[AbstractAnalysis], default = None) – parent Analysis

  • children (Optional[List[AbstractAnalysis]], default = None) – wrapped analyses; these will receive sampled args during fit call

  • kwargs