Reference: Base APIs
This section highlights the base APIs used by the EDA framework. The processing consists of the following parts:
1. Analysis graph construction: in this part, a nested graph of analyses is constructed.
analysis = BaseAnalysis(
    # State
    state=state,
    # Arguments
    train_data=train_data, test_data=test_data, val_data=val_data, model=model, label=label,
    # Nested analyses
    children=[
        Sampler(sample=sample, children=[
            DatasetSummary(),
            MissingValuesAnalysis(),
            RawTypesAnalysis(),
            SpecialTypesAnalysis(),
            ApplyFeatureGenerator(category_to_numbers=True, children=[
                FeatureDistanceAnalysis()
            ]),
        ]),
    ],
)
2. .fit() call. This call executes the operations in the graph and produces a state. The state is a nested dictionary without any prescribed structure. All components share the same namespace. If multiple components are fitted with different parameters, they can be put into separate sub-spaces via the Namespace component; these sub-spaces can then either be passed to the next analysis for further processing or be rendered.
state = analysis.fit()
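The fit step can be pictured with a small toy model. This is only a sketch of the idea, not the AutoGluon implementation: ToyAnalysis, RowCount and ColumnCount are hypothetical names, and the traversal order (parent first, then children) is an assumption. It shows a nested tree whose nodes all write into one shared state dictionary.

```python
from typing import Any, Dict, List, Optional


class ToyAnalysis:
    """Hypothetical stand-in for an analysis node (not the AutoGluon class)."""

    def __init__(
        self,
        children: Optional[List["ToyAnalysis"]] = None,
        state: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> None:
        self.children = children or []
        self.state = state
        self.args = kwargs

    def _fit(self, state: Dict[str, Any], args: Dict[str, Any]) -> None:
        """Subclasses write their results into the shared state here."""

    def fit(self) -> Dict[str, Any]:
        # Reuse a state passed in from a previous analysis, or start fresh.
        state = self.state if self.state is not None else {}
        self._fit_tree(state, {})
        return state

    def _fit_tree(self, state: Dict[str, Any], inherited: Dict[str, Any]) -> None:
        # Arguments declared on this node shadow the inherited ones.
        args = {**inherited, **self.args}
        self._fit(state, args)
        for child in self.children:
            child._fit_tree(state, args)


class RowCount(ToyAnalysis):
    def _fit(self, state: Dict[str, Any], args: Dict[str, Any]) -> None:
        state["row_count"] = len(args["train_data"])


class ColumnCount(ToyAnalysis):
    def _fit(self, state: Dict[str, Any], args: Dict[str, Any]) -> None:
        state["column_count"] = len(args["train_data"][0])


analysis = ToyAnalysis(
    train_data=[[1, 2, 3], [4, 5, 6]],
    children=[RowCount(), ColumnCount()],
)
state = analysis.fit()
print(state)
```

Both children read train_data from the root arguments and write into the same dictionary, which is why key names have to be coordinated between components.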
3. Rendering: in this stage we construct a graph of components (a combination of layout components and visual components) and then pass the state generated previously as an input argument to the render() call.
viz = SimpleVerticalLinearLayout(
    facets=[
        DatasetStatistics(headers=True),
        DatasetTypeMismatch(headers=True),
        MarkdownSectionComponent("### Feature Distance"),
        FeatureDistanceAnalysisVisualization(),
    ],
)
viz.render(state)
Please note: components may depend on each other’s output; all the prerequisites for calling fit() on a component must be checked in can_handle(). There are two ways components can share information: 1) using state; 2) sharing values via shadowed arguments (e.g., the Sampler component modifies the train_data, test_data and val_data arguments within the scope of its children’s fit() calls).
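Argument shadowing can be illustrated with the standard library's ChainMap. This is a minimal sketch, not how Sampler is implemented: sample_args is a hypothetical helper, and taking the first rows instead of a random sample is a simplification made to keep the example deterministic.

```python
from collections import ChainMap
from typing import Any, Mapping

# Arguments declared at the root of the analysis tree.
root_args = {"train_data": list(range(100)), "label": "target"}


def sample_args(args: Mapping[str, Any], sample: int) -> ChainMap:
    """Return a child scope whose train_data shadows the parent's value.

    Hypothetical helper: it keeps the first `sample` rows rather than
    drawing a random sample.
    """
    shadow = {"train_data": args["train_data"][:sample]}
    return ChainMap(shadow, args)


child_args = sample_args(root_args, sample=10)

print(len(child_args["train_data"]))  # children see the sampled data
print(child_args["label"])            # untouched keys fall through to the parent
print(len(root_args["train_data"]))   # the parent's arguments are unchanged
```

The children only ever see the shadowed value, while the parent scope and any sibling branches keep the original dataset.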
autogluon.eda.analysis.base
Methods |
AbstractAnalysis |
Namespace | Creates a nested namespace in state. |
AbstractAnalysis
- class autogluon.eda.analysis.base.AbstractAnalysis(parent: Optional[autogluon.eda.analysis.base.AbstractAnalysis] = None, children: Optional[List[autogluon.eda.analysis.base.AbstractAnalysis]] = None, state: Optional[autogluon.eda.state.AnalysisState] = None, **kwargs)
Methods
all_keys_must_be_present(state, *keys) | Checks if all the keys are present in the state
at_least_one_key_must_be_present(state, *keys) | Checks if at least one key is present in the state
available_datasets(args) | Generator which iterates only through the datasets provided in arguments
can_handle(state, args) | Checks if state and args have all the required parameters for fitting.
fit(**kwargs) | Fit the analysis tree.
- all_keys_must_be_present(state: autogluon.eda.state.AnalysisState, *keys) → bool
Checks if all the keys are present in the state
- Parameters
- state: AnalysisState
state object to perform check on
- keys:
list of the keys to check
- Returns
- True if all the keys from the keys list are present in the state
- at_least_one_key_must_be_present(state: autogluon.eda.state.AnalysisState, *keys) → bool
Checks if at least one key is present in the state
- Parameters
- state: AnalysisState
state object to perform check on
- keys:
list of the keys to check
- Returns
- True if at least one key from the keys list is present in the state
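The two key-check helpers behave like membership tests over the state. The following is a minimal sketch under the assumption that the state can be treated as a plain mapping; the library versions are methods and may do more (for example, report which keys are missing).

```python
from typing import Any, Mapping


def all_keys_must_be_present(state: Mapping[str, Any], *keys: str) -> bool:
    """Sketch: True only if every requested key exists in the state."""
    return all(key in state for key in keys)


def at_least_one_key_must_be_present(state: Mapping[str, Any], *keys: str) -> bool:
    """Sketch: True if any of the requested keys exists in the state."""
    return any(key in state for key in keys)


state = {"dataset_stats": {}, "missing_statistics": {}}

print(all_keys_must_be_present(state, "dataset_stats", "missing_statistics"))
print(all_keys_must_be_present(state, "dataset_stats", "raw_type"))
print(at_least_one_key_must_be_present(state, "raw_type", "dataset_stats"))
```

The key names used here are placeholders chosen for illustration only.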
- static available_datasets(args: autogluon.eda.state.AnalysisState) → Generator[Tuple[str, pandas.core.frame.DataFrame], None, None]
Generator which iterates only through the datasets provided in arguments
- Parameters
- args: AnalysisState
arguments passed into the call. These differ from self.args: they are assembled from the parents and shadowed by the children (this allows reused parameters to be isolated in upper-level argument declarations).
- Returns
- tuple of dataset name (train_data, test_data or tuning_data) and dataset itself
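A generator of this kind can be sketched as follows. The key names in DATASET_KEYS are an assumption taken from the arguments used in the introduction (train_data, test_data, val_data); plain lists stand in for DataFrames so the example has no dependencies.

```python
from typing import Any, Iterator, Mapping, Tuple

# Assumed dataset argument names; adjust to whatever the caller provides.
DATASET_KEYS = ("train_data", "test_data", "val_data")


def available_datasets(args: Mapping[str, Any]) -> Iterator[Tuple[str, Any]]:
    """Sketch: yield (name, dataset) only for datasets that were supplied."""
    for name in DATASET_KEYS:
        dataset = args.get(name)
        if dataset is not None:
            yield name, dataset


args = {"train_data": [1, 2, 3], "test_data": None, "label": "target"}

for name, dataset in available_datasets(args):
    print(name, len(dataset))
```

A component can loop over the result without checking each dataset argument for None itself.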
- abstract can_handle(state: autogluon.eda.state.AnalysisState, args: autogluon.eda.state.AnalysisState) → bool
Checks if state and args have all the required parameters for fitting. See also the at_least_one_key_must_be_present() and all_keys_must_be_present() helpers to construct more complex logic.
- Parameters
- state: AnalysisState
state to be updated by this fit function
- args: AnalysisState
analysis properties assembled from root of analysis hierarchy to this component (with lower levels shadowing upper level args).
- Returns
- True if all the pre-requisites for fitting are present
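To make the role of can_handle() concrete, here is a toy pair of components where the second depends on state written by the first. ToyTypeSummary, ToyNumericColumns, the state keys and the boolean return value of fit() are all hypothetical and exist only for this sketch.

```python
from typing import Any, Dict


class ToyTypeSummary:
    """Hypothetical component: needs train_data in args, writes column_types."""

    def can_handle(self, state: Dict[str, Any], args: Dict[str, Any]) -> bool:
        return "train_data" in args

    def fit(self, state: Dict[str, Any], args: Dict[str, Any]) -> bool:
        if not self.can_handle(state, args):
            return False
        state["column_types"] = {
            column: type(values[0]).__name__
            for column, values in args["train_data"].items()
        }
        return True


class ToyNumericColumns:
    """Hypothetical component: depends on column_types already being in state."""

    def can_handle(self, state: Dict[str, Any], args: Dict[str, Any]) -> bool:
        return "column_types" in state

    def fit(self, state: Dict[str, Any], args: Dict[str, Any]) -> bool:
        if not self.can_handle(state, args):
            return False
        state["numeric_columns"] = [
            column
            for column, type_name in state["column_types"].items()
            if type_name in ("int", "float")
        ]
        return True


args = {"train_data": {"age": [22, 38], "name": ["Ann", "Bob"]}}
state: Dict[str, Any] = {}

skipped = ToyNumericColumns().fit(state, args)  # prerequisite missing, nothing is written
ToyTypeSummary().fit(state, args)               # produces column_types
fitted = ToyNumericColumns().fit(state, args)   # prerequisite now present
print(skipped, fitted, state["numeric_columns"])
```

Checking prerequisites up front lets a dependent component be skipped cleanly instead of failing with a KeyError halfway through fitting.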
-
Namespace
- class autogluon.eda.analysis.base.Namespace(namespace: Optional[str] = None, parent: Optional[autogluon.eda.analysis.base.AbstractAnalysis] = None, children: Optional[List[autogluon.eda.analysis.base.AbstractAnalysis]] = None, **kwargs)
Creates a nested namespace in state. All the components within children will have the relative root of the state moved into this subspace. To instruct visualization facets to use a specific subspace, please use the namespace argument (see the example).
- Parameters
- namespace: Optional[str], default = None
namespace to use; use root if not specified
- parent: Optional[AbstractAnalysis], default = None
parent Analysis
- children: Optional[List[AbstractAnalysis]], default = None
wrapped analyses; these will receive sampled args during fit call
- kwargs
Examples
>>> import autogluon.eda.analysis as eda
>>> import autogluon.eda.visualization as viz
>>> import autogluon.eda.auto as auto
>>>
>>> auto.analyze(
>>>     train_data=..., label=...,
>>>     anlz_facets=[
>>>         # Puts output into the root namespace
>>>         eda.interaction.Correlation(),
>>>         # Puts output into the focus namespace
>>>         eda.Namespace(namespace='focus', children=[
>>>             eda.interaction.Correlation(focus_field='Fare', focus_field_threshold=0.3),
>>>         ])
>>>     ],
>>>     viz_facets=[
>>>         # Renders correlations from the root namespace
>>>         viz.interaction.CorrelationVisualization(),
>>>         # Renders correlations from the focus namespace
>>>         viz.interaction.CorrelationVisualization(namespace='focus'),
>>>     ]
>>> )
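Since the state is described as a nested dictionary, the effect of a namespace can be sketched with plain dicts. enter_namespace is a hypothetical helper, not part of the library; it only illustrates how the relative root of the state moves into a sub-dictionary.

```python
from typing import Any, Dict, Optional


def enter_namespace(state: Dict[str, Any], namespace: Optional[str]) -> Dict[str, Any]:
    """Sketch: return the sub-dictionary that acts as the root for children.

    With namespace=None the root state itself is used.
    """
    if namespace is None:
        return state
    return state.setdefault(namespace, {})


state: Dict[str, Any] = {}

# A component fitted at the root writes directly into the state.
enter_namespace(state, None)["correlations"] = "all fields"

# The same component fitted inside the 'focus' namespace writes into a sub-space,
# so the two results do not overwrite each other.
enter_namespace(state, "focus")["correlations"] = "fields correlated with Fare"

print(state)
```

This is why the same analysis can be fitted twice with different parameters: each result lives under its own relative root.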
autogluon.eda.visualization.base
Methods |
AbstractVisualization |
AbstractVisualization
- class autogluon.eda.visualization.base.AbstractVisualization(namespace: Optional[str] = None, **kwargs)
Methods
all_keys_must_be_present(state, *keys) | Checks if all the keys are present in the state
at_least_one_key_must_be_present(state, *keys) | Checks if at least one key is present in the state
can_handle(state) | Checks if state has all the required parameters for visualization.
render(state) | Render component.
- all_keys_must_be_present(state: autogluon.eda.state.AnalysisState, *keys) → bool
Checks if all the keys are present in the state
- Parameters
- state: AnalysisState
state object to perform check on
- keys:
list of the keys to check
- Returns
- True if all the keys from the keys list are present in the state
- at_least_one_key_must_be_present(state: autogluon.eda.state.AnalysisState, *keys) → bool
Checks if at least one key is present in the state
- Parameters
- state: AnalysisState
state object to perform check on
- keys:
list of the keys to check
- Returns
- True if at least one key from the keys list is present in the state
- abstract can_handle(state: autogluon.eda.state.AnalysisState) → bool
Checks if state has all the required parameters for visualization. See also the at_least_one_key_must_be_present() and all_keys_must_be_present() helpers to construct more complex logic.
- Parameters
- state: AnalysisState
fitted state
- Returns
- True if all the pre-requisites for rendering are present
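On the rendering side, the interplay between namespace, can_handle() and render() might look like the toy class below. ToyVisualization is hypothetical, and returning a string from render() is a simplification made so the result can be inspected; an actual visual component would display its output instead.

```python
from typing import Any, Dict, Optional


class ToyVisualization:
    """Hypothetical visual component (not the AutoGluon class)."""

    def __init__(self, namespace: Optional[str] = None) -> None:
        self.namespace = namespace

    def can_handle(self, state: Dict[str, Any]) -> bool:
        # Render only if the analysis step produced the expected key.
        return "correlations" in state

    def _render(self, state: Dict[str, Any]) -> str:
        return f"Correlations: {state['correlations']}"

    def render(self, state: Dict[str, Any]) -> Optional[str]:
        # Move to the requested sub-space before checking prerequisites.
        scoped = state if self.namespace is None else state.get(self.namespace, {})
        if not self.can_handle(scoped):
            return None
        return self._render(scoped)


fitted_state = {"correlations": "all", "focus": {"correlations": "Fare only"}}

print(ToyVisualization().render(fitted_state))
print(ToyVisualization(namespace="focus").render(fitted_state))
print(ToyVisualization(namespace="missing").render(fitted_state))
```

A facet pointed at a sub-space that was never fitted simply renders nothing, which keeps a layout usable even when some analyses were skipped.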
-