AutoGluon-Cloud Predictors¶
Predictors built into AutoGluon-Cloud such that a single call to fit() can produce high-quality trained AutoGluon models for tabular, image, or text data on the cloud.
TabularCloudPredictor¶
-
class
autogluon.cloud.
TabularCloudPredictor
(cloud_output_path: str, local_output_path: Optional[str] = None, verbosity: int = 2)[source]¶ - Attributes
endpoint_name
Return the CloudPredictor deployed endpoint name
is_fit
Whether this CloudPredictor is fitted already
predictor_type
Type of the underlying AutoGluon Predictor
Methods
attach_endpoint
(endpoint)Attach the current CloudPredictor to an existing SageMaker endpoint.
attach_job
(job_name)Attach to a sagemaker training job.
cleanup_deployment
()Delete endpoint, endpoint configuration and deployed model
deploy
([predictor_path, endpoint_name, …])Deploy a predictor as a SageMaker endpoint, which can be used to do real-time inference later.
detach_endpoint
()Detach the current endpoint and return it.
download_predict_results
([job_name, save_path])Download batch transform result
download_trained_predictor
([save_path])Download the trained predictor from the cloud.
fit
(*, predictor_init_args, predictor_fit_args)Fit the predictor with SageMaker.
generate_trust_relationship_and_iam_policy_file
(account_id, …)Generate required trust relationship and IAM policy file in JSON format for CloudPredictor with SageMaker backend.
get_batch_transform_job_status
([job_name])Get the status of the batch transform job.
get_fit_job_status
()Get the status of the training job.
info
()Return general info about CloudPredictor
load
(path[, verbosity])Load the CloudPredictor
predict
(test_data[, test_data_image_column, …])Predict using SageMaker batch transform.
predict_proba
(test_data[, …])Predict using SageMaker batch transform.
predict_proba_real_time
(test_data[, …])Predict with the deployed SageMaker endpoint.
predict_real_time
(test_data[, …])Predict with the deployed SageMaker endpoint.
save
([silent])Save the CloudPredictor so that the user can later reload the predictor to regain access to the deployed endpoint.
to_local_predictor
([save_path])Convert the SageMaker trained predictor to a local AutoGluon Predictor.
-
attach_endpoint
(endpoint: Union[str, autogluon.cloud.utils.ag_sagemaker.AutoGluonRealtimePredictor]) → None¶ Attach the current CloudPredictor to an existing SageMaker endpoint.
- Parameters
- endpoint: str or AutoGluonRealtimePredictor
If str is passed, it should be the name of the endpoint being attached to.
-
attach_job
(job_name: str) → None¶ Attach to a SageMaker training job. This is useful when the local process crashed and you want to reattach to the previous job.
- Parameters
- job_name: str
The name of the job being attached
-
cleanup_deployment
() → None¶ Delete endpoint, endpoint configuration and deployed model
-
deploy
(predictor_path: Optional[str] = None, endpoint_name: Optional[str] = None, framework_version: str = 'latest', instance_type: str = 'ml.m5.2xlarge', initial_instance_count: int = 1, custom_image_uri: Optional[str] = None, wait: bool = True, model_kwargs: Optional[Dict] = None, **kwargs) → None¶ Deploy a predictor as a SageMaker endpoint, which can be used to do real-time inference later. This method first creates an AutoGluonSagemakerInferenceModel with the trained predictor and then deploys it to the endpoint.
- Parameters
- predictor_path: str
Path to the predictor tarball you want to deploy. The path can be either a local path or an S3 location. If None, will deploy the most recent predictor trained with fit().
- endpoint_name: str
The endpoint name to use for the deployment. If None, CloudPredictor will create one with prefix ag-cloudpredictor
- framework_version: str, default = `latest`
Inference container version of autogluon. If latest, will use the latest available container version. If provided a specific version, will use this version. If custom_image_uri is set, this argument will be ignored.
- instance_type: str, default = ‘ml.m5.2xlarge’
Instance to be deployed for the endpoint
- initial_instance_count: int, default = 1,
Initial number of instances to be deployed for the endpoint
- wait: bool, default = True
Whether to wait for the endpoint to be deployed. Note that the function will not return immediately, because some preparation is needed prior to deployment.
- model_kwargs: dict, default = dict()
Any extra arguments needed to initialize the SageMaker Model. Please refer to https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#model for all options
- **kwargs:
Any extra arguments needed to pass to deploy. Please refer to https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#sagemaker.model.Model.deploy for all options
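As a sketch of how these deployment options fit together (the endpoint name, bucket, and predictor variable below are illustrative assumptions, not library defaults):

```python
# Hypothetical deployment configuration; the endpoint name is illustrative,
# while the instance settings match the documented defaults.
deploy_args = {
    "endpoint_name": "my-ag-endpoint",   # omit to get an auto-generated ag-cloudpredictor-* name
    "instance_type": "ml.m5.2xlarge",    # documented default
    "initial_instance_count": 1,
    "wait": True,                        # block until the endpoint is up
}

# The call itself needs AWS credentials and a predictor trained with fit(),
# so it is shown here but not executed:
# from autogluon.cloud import TabularCloudPredictor
# predictor = TabularCloudPredictor(cloud_output_path="s3://my-bucket/ag-output")
# predictor.deploy(**deploy_args)
# print(predictor.endpoint_name)
```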
-
detach_endpoint
() → autogluon.cloud.utils.ag_sagemaker.AutoGluonRealtimePredictor¶ Detach the current endpoint and return it.
- Returns
- AutoGluonRealtimePredictor object.
-
download_predict_results
(job_name: Optional[str] = None, save_path: Optional[str] = None) → str¶ Download batch transform result
- Parameters
- job_name: str
The specific batch transform job results to download. If None, will download the most recent job results.
- save_path: str
Path to save the downloaded results. If None, CloudPredictor will create one.
- Returns
- str,
Path to downloaded results.
-
download_trained_predictor
(save_path: Optional[str] = None) → str¶ Download the trained predictor from the cloud.
- Parameters
- save_path: str
Path to save the model. If None, CloudPredictor will create a folder ‘AutogluonModels’ for the model under local_output_path.
- Returns
- save_path: str
Path to the saved model.
-
property
endpoint_name
¶ Return the CloudPredictor deployed endpoint name
-
fit
(*, predictor_init_args: Dict[str, Any], predictor_fit_args: Dict[str, Any], image_column: Optional[str] = None, leaderboard: bool = True, framework_version: str = 'latest', job_name: Optional[str] = None, instance_type: str = 'ml.m5.2xlarge', instance_count: int = 1, volume_size: int = 100, custom_image_uri: Optional[str] = None, wait: bool = True, autogluon_sagemaker_estimator_kwargs: Optional[Dict] = None, **kwargs) → autogluon.cloud.predictor.cloud_predictor.CloudPredictor¶ Fit the predictor with SageMaker. This function first uploads the necessary config and training data to an S3 bucket, then launches a SageMaker training job with the AutoGluon training container.
- Parameters
- predictor_init_args: dict
Init args for the predictor
- predictor_fit_args: dict
Fit args for the predictor
- image_column: str, default = None
The column name in the training/tuning data that contains the image paths. The image paths MUST be absolute paths on your local system.
- leaderboard: bool, default = True
Whether to include the leaderboard in the output artifact
- framework_version: str, default = `latest`
Training container version of autogluon. If latest, will use the latest available container version. If provided a specific version, will use this version. If custom_image_uri is set, this argument will be ignored.
- job_name: str, default = None
Name of the launched training job. If None, CloudPredictor will create one with prefix ag-cloudpredictor
- instance_type: str, default = ‘ml.m5.2xlarge’
Instance type the predictor will be trained on with SageMaker.
- instance_count: int, default = 1
Number of instances used to fit the predictor.
- volume_size: int, default = 100
Size in GB of the EBS volume used to store input data during training. Must be large enough to store the training data if File mode is used (which is the default).
- wait: bool, default = True
Whether the call should wait until the job completes. Note that the function will not return immediately, because some preparation is needed prior to fitting. Use get_fit_job_status to get the job status.
- autogluon_sagemaker_estimator_kwargs: dict, default = dict()
Any extra arguments needed to initialize AutoGluonSagemakerEstimator. Please refer to https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html#sagemaker.estimator.Framework for all options
- **kwargs:
Any extra arguments needed to pass to fit. Please refer to https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html#sagemaker.estimator.Framework.fit for all options
- Returns
- CloudPredictor object. Returns self.
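A minimal sketch of a fit() call. The label column name, file name, and bucket below are illustrative assumptions; predictor_init_args mirror TabularPredictor(...) and predictor_fit_args mirror TabularPredictor.fit(...):

```python
# Hypothetical tabular task: label column "class", local training csv.
predictor_init_args = {"label": "class"}
predictor_fit_args = {
    "train_data": "train.csv",  # local file; CloudPredictor uploads it to S3 for you
    "time_limit": 600,          # seconds
}

# The cloud calls need AWS credentials and a pre-created bucket,
# so they are shown but not executed:
# from autogluon.cloud import TabularCloudPredictor
# predictor = TabularCloudPredictor(cloud_output_path="s3://my-bucket/ag-output")
# predictor.fit(
#     predictor_init_args=predictor_init_args,
#     predictor_fit_args=predictor_fit_args,
#     instance_type="ml.m5.2xlarge",
#     wait=False,  # return immediately; poll with get_fit_job_status()
# )
```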
-
static
generate_trust_relationship_and_iam_policy_file
(account_id: str, cloud_output_bucket: str, output_path: Optional[str] = None) → Dict[str, str]¶ Generate the required trust relationship and IAM policy files in JSON format for CloudPredictor with the SageMaker backend. Users can use the generated files to create an IAM role for themselves. IMPORTANT: Make sure you review both files before creating the role!
- Parameters
- account_id: str
The AWS account ID you plan to use for CloudPredictor.
- cloud_output_bucket: str
S3 bucket name where intermediate artifacts will be uploaded and trained models will be saved. You need to create this bucket beforehand; it will be included in the policy being created.
- output_path: str
Where you would like the generated files to be written. If not specified, they will be written to the current folder.
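For orientation, the trust relationship file allows SageMaker to assume the role. The sketch below shows the general shape of such a document; the account ID is a placeholder, and the files actually generated by this method may contain more detail, so always review them:

```python
import json

account_id = "123456789012"  # placeholder AWS account ID

# General shape of a SageMaker trust relationship document; the real
# generated files may differ in detail.
trust_relationship = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

document = json.dumps(trust_relationship, indent=2)
```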
-
get_batch_transform_job_status
(job_name: Optional[str] = None) → str¶ Get the status of the batch transform job. This is useful when the user made an asynchronous call to the predict() function
- Parameters
- job_name: str
The name of the job being checked. If None, will check the most recent job status.
- Returns
- str,
- Valid Values: InProgress | Completed | Failed | Stopping | Stopped | NotCreated
-
get_fit_job_status
() → str¶ Get the status of the training job. This is useful when the user made an asynchronous call to the fit() function
- Returns
- str,
- Valid Values: InProgress | Completed | Failed | Stopping | Stopped | NotCreated
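When fit() is launched with wait=False, the documented status values above can drive a simple polling loop. wait_for_job below is a helper sketched for this page, not part of the library:

```python
import time

# Terminal statuses, taken from the documented valid values above.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped", "NotCreated"}

def wait_for_job(get_status, poll_interval=30.0, timeout=7200.0, sleep=time.sleep):
    """Poll a status callable (e.g. predictor.get_fit_job_status) until it
    reaches a terminal status, or raise once the timeout elapses."""
    elapsed = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if elapsed >= timeout:
            raise TimeoutError(f"job still {status} after {timeout}s")
        sleep(poll_interval)
        elapsed += poll_interval

# Usage (requires a CloudPredictor with a launched job):
# final_status = wait_for_job(predictor.get_fit_job_status)
```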
-
info
() → Dict[str, Any]¶ Return general info about CloudPredictor
-
property
is_fit
¶ Whether this CloudPredictor is fitted already
-
classmethod
load
(path: str, verbosity: Optional[int] = None) → autogluon.cloud.predictor.cloud_predictor.CloudPredictor¶ Load the CloudPredictor
- Parameters
- path: str
The path to directory in which this Predictor was previously saved
- Returns
- CloudPredictor object.
-
predict
(test_data: Union[str, pandas.core.frame.DataFrame], test_data_image_column: Optional[str] = None, predictor_path: Optional[str] = None, framework_version: str = 'latest', job_name: Optional[str] = None, instance_type: str = 'ml.m5.2xlarge', instance_count: int = 1, custom_image_uri: Optional[str] = None, wait: bool = True, download: bool = True, persist: bool = True, save_path: Optional[str] = None, model_kwargs: Optional[Dict] = None, transformer_kwargs: Optional[Dict] = None, **kwargs) → Optional[pandas.core.series.Series]¶ Predict using SageMaker batch transform. When minimizing latency isn’t a concern, batch transform may be easier, more scalable, and more appropriate. If you want to minimize latency, use predict_real_time() instead. This method first creates an AutoGluonSagemakerInferenceModel with the trained predictor, then creates a transformer from it, and finally calls transform.
- Parameters
- test_data: Union(str, pandas.DataFrame)
The test data to run inference on. Can be a pandas.DataFrame, or a local path to a csv.
- test_data_image_column: str, default = None
If test_data involves image modality, you must specify the column name corresponding to image paths. The paths MUST be absolute paths.
- predictor_path: str
Path to the predictor tarball you want to use for prediction. The path can be either a local path or an S3 location. If None, will use the most recent predictor trained with fit().
- framework_version: str, default = `latest`
Inference container version of autogluon. If latest, will use the latest available container version. If provided a specific version, will use this version. If custom_image_uri is set, this argument will be ignored.
- job_name: str, default = None
Name of the launched batch transform job. If None, CloudPredictor will create one with prefix ag-cloudpredictor.
- instance_count: int, default = 1,
Number of instances used to do batch transform.
- instance_type: str, default = ‘ml.m5.2xlarge’
Instance to be used for batch transform.
- wait: bool, default = True
Whether to wait for the batch transform to complete. Note that the function will not return immediately, because some preparation is needed prior to the transform.
- download: bool, default = True
Whether to download the batch transform results to disk and load them after the batch transform finishes. Will be ignored if wait is False.
- persist: bool, default = True
Whether to persist the downloaded batch transform results on disk. Will be ignored if download is False.
- save_path: str, default = None
Path to save the downloaded results. Will be ignored if download is False. If None, CloudPredictor will create one. If persist is False, the file will first be downloaded to this path and then removed.
- model_kwargs: dict, default = dict()
Any extra arguments needed to initialize the SageMaker Model. Please refer to https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#model for all options
- transformer_kwargs: dict
Any extra arguments needed to pass to transformer. Please refer to https://sagemaker.readthedocs.io/en/stable/api/inference/transformer.html#sagemaker.transformer.Transformer for all options.
- **kwargs:
Any extra arguments needed to pass to transform. Please refer to https://sagemaker.readthedocs.io/en/stable/api/inference/transformer.html#sagemaker.transformer.Transformer.transform for all options.
- Returns
- Optional[pandas.Series]
Prediction results as a pandas.Series if download is True; None if download is False.
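A sketch contrasting the synchronous and asynchronous ways to call predict(); the file name and the predictor variable are illustrative:

```python
# Asynchronous batch transform: return immediately, poll, then download.
predict_kwargs = {
    "test_data": "test.csv",  # illustrative local file
    "wait": False,            # don't block on the transform job
    "download": False,        # fetch results later via download_predict_results()
}

# Synchronous (blocking) use returns the predictions directly:
# predictions = predictor.predict("test.csv")  # pandas.Series
#
# Asynchronous use:
# predictor.predict(**predict_kwargs)
# status = predictor.get_batch_transform_job_status()  # e.g. "InProgress"
# path = predictor.download_predict_results()          # once status == "Completed"
```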
-
predict_proba
(test_data: Union[str, pandas.core.frame.DataFrame], test_data_image_column: Optional[str] = None, include_predict: bool = True, predictor_path: Optional[str] = None, framework_version: str = 'latest', job_name: Optional[str] = None, instance_type: str = 'ml.m5.2xlarge', instance_count: int = 1, custom_image_uri: Optional[str] = None, wait: bool = True, download: bool = True, persist: bool = True, save_path: Optional[str] = None, model_kwargs: Optional[Dict] = None, transformer_kwargs: Optional[Dict] = None, **kwargs) → Optional[Union[Tuple[pandas.core.series.Series, Union[pandas.core.frame.DataFrame, pandas.core.series.Series]], pandas.core.frame.DataFrame, pandas.core.series.Series]]¶ Predict using SageMaker batch transform. When minimizing latency isn’t a concern, batch transform may be easier, more scalable, and more appropriate. If you want to minimize latency, use predict_real_time() instead. This method first creates an AutoGluonSagemakerInferenceModel with the trained predictor, then creates a transformer from it, and finally calls transform.
- Parameters
- test_data: Union(str, pandas.DataFrame)
The test data to run inference on. Can be a pandas.DataFrame, or a local path to a csv.
- test_data_image_column: str, default = None
If test_data involves image modality, you must specify the column name corresponding to image paths. The paths MUST be absolute paths.
- include_predict: bool, default = True
Whether to include the predict result along with the predict_proba results. This flag can save you the time of making two calls to get both the prediction and the probability, since batch inference involves noticeable overhead.
- predictor_path: str
Path to the predictor tarball you want to use for prediction. The path can be either a local path or an S3 location. If None, will use the most recent predictor trained with fit().
- framework_version: str, default = `latest`
Inference container version of autogluon. If latest, will use the latest available container version. If provided a specific version, will use this version. If custom_image_uri is set, this argument will be ignored.
- job_name: str, default = None
Name of the launched batch transform job. If None, CloudPredictor will create one with prefix ag-cloudpredictor.
- instance_count: int, default = 1,
Number of instances used to do batch transform.
- instance_type: str, default = ‘ml.m5.2xlarge’
Instance to be used for batch transform.
- wait: bool, default = True
Whether to wait for the batch transform to complete. Note that the function will not return immediately, because some preparation is needed prior to the transform.
- download: bool, default = True
Whether to download the batch transform results to disk and load them after the batch transform finishes. Will be ignored if wait is False.
- persist: bool, default = True
Whether to persist the downloaded batch transform results on disk. Will be ignored if download is False.
- save_path: str, default = None
Path to save the downloaded results. Will be ignored if download is False. If None, CloudPredictor will create one. If persist is False, the file will first be downloaded to this path and then removed.
- model_kwargs: dict, default = dict()
Any extra arguments needed to initialize the SageMaker Model. Please refer to https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#model for all options
- transformer_kwargs: dict
Any extra arguments needed to pass to transformer. Please refer to https://sagemaker.readthedocs.io/en/stable/api/inference/transformer.html#sagemaker.transformer.Transformer for all options.
- **kwargs:
Any extra arguments needed to pass to transform. Please refer to https://sagemaker.readthedocs.io/en/stable/api/inference/transformer.html#sagemaker.transformer.Transformer.transform for all options.
- Returns
- Optional[Union[Tuple[pd.Series, Union[pd.DataFrame, pd.Series]], Union[pd.DataFrame, pd.Series]]]
If download is False, returns None (or (None, None) if include_predict is True). If download is True and include_predict is True, returns (prediction, predict_probability), where prediction is a pandas.Series and predict_probability is a pandas.DataFrame, or a pandas.Series identical to prediction when it’s a regression problem.
-
predict_proba_real_time
(test_data: Union[str, pandas.core.frame.DataFrame], test_data_image_column: Optional[str] = None, accept: str = 'application/x-parquet')[source]¶ Predict with the deployed SageMaker endpoint. A deployed SageMaker endpoint is required. This is intended to provide low-latency inference. If you want to run inference on a large dataset, use predict() instead.
- Parameters
- test_data: Union(str, pandas.DataFrame)
The test data to run inference on. Can be a pandas.DataFrame or a local path to a csv file. When predicting multimodal data with an image modality:
You need to specify test_data_image_column, and make sure the image column contains relative paths to the images.
- test_data_image_column: str, default = None
If test_data involves image modality, you must specify the column name corresponding to image paths. The paths MUST be absolute paths.
- accept: str, default = application/x-parquet
Accept type of the output content. Valid options are application/x-parquet, text/csv, and application/json
- Returns
- Pandas.DataFrame or Pandas.Series
Returns a pandas.Series for regression problems and a pandas.DataFrame otherwise.
-
predict_real_time
(test_data: Union[str, pandas.core.frame.DataFrame], test_data_image_column: Optional[str] = None, accept: str = 'application/x-parquet')[source]¶ Predict with the deployed SageMaker endpoint. A deployed SageMaker endpoint is required. This is intended to provide low-latency inference. If you want to run inference on a large dataset, use predict() instead.
- Parameters
- test_data: Union(str, pandas.DataFrame)
The test data to run inference on. Can be a pandas.DataFrame or a local path to a csv file. When predicting multimodal data with an image modality:
You need to specify test_data_image_column, and make sure the image column contains relative paths to the images.
- test_data_image_column: str, default = None
If test_data involves image modality, you must specify the column name corresponding to image paths. The paths MUST be absolute paths.
- accept: str, default = application/x-parquet
Accept type of the output content. Valid options are application/x-parquet, text/csv, and application/json
- Returns
- Pandas.Series
- Prediction results as a pandas.Series
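A sketch of real-time inference after deploy() has finished; the data file and the predictor variable are illustrative:

```python
# The valid accept types documented above; application/x-parquet is the default.
VALID_ACCEPT_TYPES = ("application/x-parquet", "text/csv", "application/json")

# Requires a deployed endpoint, so the calls are shown but not executed:
# import pandas as pd
# test_data = pd.read_csv("test.csv")
# preds = predictor.predict_real_time(test_data)            # pandas.Series
# proba = predictor.predict_proba_real_time(test_data)      # DataFrame for classification
# preds_csv = predictor.predict_real_time(test_data, accept="text/csv")
```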
-
property
predictor_type
¶ Type of the underlying AutoGluon Predictor
-
save
(silent: bool = False) → None¶ Save the CloudPredictor so that the user can later reload the predictor to regain access to the deployed endpoint.
-
to_local_predictor
(save_path: Optional[str] = None, **kwargs)¶ Convert the SageMaker trained predictor to a local AutoGluon Predictor.
- Parameters
- save_path: str
Path to save the model. If None, CloudPredictor will create a folder for the model.
- kwargs:
Additional args to be passed to the load call of the underlying predictor
- Returns
- AutoGluon Predictor,
TabularPredictor or MultiModalPredictor based on predictor_type
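A sketch of moving from cloud to local inference; the save path and predictor variables are illustrative:

```python
# Hypothetical target folder for the downloaded model artifacts.
save_path = "AutogluonModels"

# to_local_predictor() downloads the trained artifact and loads it as a
# regular local predictor, so no further AWS calls are needed afterwards:
# local_predictor = cloud_predictor.to_local_predictor(save_path=save_path)
# local_predictor.leaderboard()       # standard local TabularPredictor API
# local_predictor.predict(test_df)
```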
MultiModalCloudPredictor¶
-
class
autogluon.cloud.
MultiModalCloudPredictor
(cloud_output_path: str, local_output_path: Optional[str] = None, verbosity: int = 2)[source]¶ - Attributes
endpoint_name
Return the CloudPredictor deployed endpoint name
is_fit
Whether this CloudPredictor is fitted already
predictor_type
Type of the underlying AutoGluon Predictor
Methods
attach_endpoint
(endpoint)Attach the current CloudPredictor to an existing SageMaker endpoint.
attach_job
(job_name)Attach to a sagemaker training job.
cleanup_deployment
()Delete endpoint, endpoint configuration and deployed model
deploy
([predictor_path, endpoint_name, …])Deploy a predictor as a SageMaker endpoint, which can be used to do real-time inference later.
detach_endpoint
()Detach the current endpoint and return it.
download_predict_results
([job_name, save_path])Download batch transform result
download_trained_predictor
([save_path])Download the trained predictor from the cloud.
fit
(*, predictor_init_args, predictor_fit_args)Fit the predictor with SageMaker.
generate_trust_relationship_and_iam_policy_file
(account_id, …)Generate required trust relationship and IAM policy file in JSON format for CloudPredictor with SageMaker backend.
get_batch_transform_job_status
([job_name])Get the status of the batch transform job.
get_fit_job_status
()Get the status of the training job.
info
()Return general info about CloudPredictor
load
(path[, verbosity])Load the CloudPredictor
predict
(test_data[, test_data_image_column])Predict using SageMaker batch transform.
predict_proba
(test_data[, …])Predict using SageMaker batch transform.
predict_proba_real_time
(test_data[, …])Predict with the deployed SageMaker endpoint.
predict_real_time
(test_data[, …])Predict with the deployed SageMaker endpoint.
save
([silent])Save the CloudPredictor so that the user can later reload the predictor to regain access to the deployed endpoint.
to_local_predictor
([save_path])Convert the SageMaker trained predictor to a local AutoGluon Predictor.
-
attach_endpoint
(endpoint: Union[str, autogluon.cloud.utils.ag_sagemaker.AutoGluonRealtimePredictor]) → None¶ Attach the current CloudPredictor to an existing SageMaker endpoint.
- Parameters
- endpoint: str or AutoGluonRealtimePredictor
If str is passed, it should be the name of the endpoint being attached to.
-
attach_job
(job_name: str) → None¶ Attach to a SageMaker training job. This is useful when the local process crashed and you want to reattach to the previous job.
- Parameters
- job_name: str
The name of the job being attached
-
cleanup_deployment
() → None¶ Delete endpoint, endpoint configuration and deployed model
-
deploy
(predictor_path: Optional[str] = None, endpoint_name: Optional[str] = None, framework_version: str = 'latest', instance_type: str = 'ml.m5.2xlarge', initial_instance_count: int = 1, custom_image_uri: Optional[str] = None, wait: bool = True, model_kwargs: Optional[Dict] = None, **kwargs) → None¶ Deploy a predictor as a SageMaker endpoint, which can be used to do real-time inference later. This method first creates an AutoGluonSagemakerInferenceModel with the trained predictor and then deploys it to the endpoint.
- Parameters
- predictor_path: str
Path to the predictor tarball you want to deploy. The path can be either a local path or an S3 location. If None, will deploy the most recent predictor trained with fit().
- endpoint_name: str
The endpoint name to use for the deployment. If None, CloudPredictor will create one with prefix ag-cloudpredictor
- framework_version: str, default = `latest`
Inference container version of autogluon. If latest, will use the latest available container version. If provided a specific version, will use this version. If custom_image_uri is set, this argument will be ignored.
- instance_type: str, default = ‘ml.m5.2xlarge’
Instance to be deployed for the endpoint
- initial_instance_count: int, default = 1,
Initial number of instances to be deployed for the endpoint
- wait: bool, default = True
Whether to wait for the endpoint to be deployed. Note that the function will not return immediately, because some preparation is needed prior to deployment.
- model_kwargs: dict, default = dict()
Any extra arguments needed to initialize the SageMaker Model. Please refer to https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#model for all options
- **kwargs:
Any extra arguments needed to pass to deploy. Please refer to https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#sagemaker.model.Model.deploy for all options
-
detach_endpoint
() → autogluon.cloud.utils.ag_sagemaker.AutoGluonRealtimePredictor¶ Detach the current endpoint and return it.
- Returns
- AutoGluonRealtimePredictor object.
-
download_predict_results
(job_name: Optional[str] = None, save_path: Optional[str] = None) → str¶ Download batch transform result
- Parameters
- job_name: str
The specific batch transform job results to download. If None, will download the most recent job results.
- save_path: str
Path to save the downloaded results. If None, CloudPredictor will create one.
- Returns
- str,
Path to downloaded results.
-
download_trained_predictor
(save_path: Optional[str] = None) → str¶ Download the trained predictor from the cloud.
- Parameters
- save_path: str
Path to save the model. If None, CloudPredictor will create a folder ‘AutogluonModels’ for the model under local_output_path.
- Returns
- save_path: str
Path to the saved model.
-
property
endpoint_name
¶ Return the CloudPredictor deployed endpoint name
-
fit
(*, predictor_init_args: Dict[str, Any], predictor_fit_args: Dict[str, Any], image_column: Optional[str] = None, leaderboard: bool = True, framework_version: str = 'latest', job_name: Optional[str] = None, instance_type: str = 'ml.m5.2xlarge', instance_count: int = 1, volume_size: int = 100, custom_image_uri: Optional[str] = None, wait: bool = True, autogluon_sagemaker_estimator_kwargs: Optional[Dict] = None, **kwargs) → autogluon.cloud.predictor.cloud_predictor.CloudPredictor¶ Fit the predictor with SageMaker. This function first uploads the necessary config and training data to an S3 bucket, then launches a SageMaker training job with the AutoGluon training container.
- Parameters
- predictor_init_args: dict
Init args for the predictor
- predictor_fit_args: dict
Fit args for the predictor
- image_column: str, default = None
The column name in the training/tuning data that contains the image paths. The image paths MUST be absolute paths on your local system.
- leaderboard: bool, default = True
Whether to include the leaderboard in the output artifact
- framework_version: str, default = `latest`
Training container version of autogluon. If latest, will use the latest available container version. If provided a specific version, will use this version. If custom_image_uri is set, this argument will be ignored.
- job_name: str, default = None
Name of the launched training job. If None, CloudPredictor will create one with prefix ag-cloudpredictor
- instance_type: str, default = ‘ml.m5.2xlarge’
Instance type the predictor will be trained on with SageMaker.
- instance_count: int, default = 1
Number of instances used to fit the predictor.
- volume_size: int, default = 100
Size in GB of the EBS volume used to store input data during training. Must be large enough to store the training data if File mode is used (which is the default).
- wait: bool, default = True
Whether the call should wait until the job completes. Note that the function will not return immediately, because some preparation is needed prior to fitting. Use get_fit_job_status to get the job status.
- autogluon_sagemaker_estimator_kwargs: dict, default = dict()
Any extra arguments needed to initialize AutoGluonSagemakerEstimator. Please refer to https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html#sagemaker.estimator.Framework for all options
- **kwargs:
Any extra arguments needed to pass to fit. Please refer to https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html#sagemaker.estimator.Framework.fit for all options
- Returns
- CloudPredictor object. Returns self.
-
static
generate_trust_relationship_and_iam_policy_file
(account_id: str, cloud_output_bucket: str, output_path: Optional[str] = None) → Dict[str, str]¶ Generate the required trust relationship and IAM policy files in JSON format for CloudPredictor with the SageMaker backend. Users can use the generated files to create an IAM role for themselves. IMPORTANT: Make sure you review both files before creating the role!
- Parameters
- account_id: str
The AWS account ID you plan to use for CloudPredictor.
- cloud_output_bucket: str
S3 bucket name where intermediate artifacts will be uploaded and trained models will be saved. You need to create this bucket beforehand; it will be included in the policy being created.
- output_path: str
Where you would like the generated files to be written. If not specified, they will be written to the current folder.
-
get_batch_transform_job_status
(job_name: Optional[str] = None) → str¶ Get the status of the batch transform job. This is useful when the user made an asynchronous call to the predict() function
- Parameters
- job_name: str
The name of the job being checked. If None, will check the most recent job status.
- Returns
- str,
- Valid Values: InProgress | Completed | Failed | Stopping | Stopped | NotCreated
-
get_fit_job_status
() → str¶ Get the status of the training job. This is useful when the user made an asynchronous call to the fit() function
- Returns
- str,
- Valid Values: InProgress | Completed | Failed | Stopping | Stopped | NotCreated
-
info
() → Dict[str, Any]¶ Return general info about CloudPredictor
-
property
is_fit
¶ Whether this CloudPredictor is fitted already
-
classmethod
load
(path: str, verbosity: Optional[int] = None) → autogluon.cloud.predictor.cloud_predictor.CloudPredictor¶ Load the CloudPredictor
- Parameters
- path: str
The path to directory in which this Predictor was previously saved
- Returns
- CloudPredictor object.
-
predict
(test_data: Union[str, pandas.core.frame.DataFrame], test_data_image_column: Optional[str] = None, **kwargs) → Optional[pandas.core.series.Series][source]¶ Predict using SageMaker batch transform.
- Parameters
- test_data: Union(str, pandas.DataFrame)
The test data to run inference on. Can be a pandas.DataFrame or a local path to a csv file. When predicting multimodal data with an image modality:
You need to specify test_data_image_column, and make sure the image column contains relative paths to the images.
- When predicting with only images:
Can be a local path to a directory containing the images, or a local path to a single image.
- test_data_image_column: Optional[str]
If test_data involves image modality, you must specify the column name corresponding to image paths. The paths MUST be absolute paths.
- kwargs:
Refer to CloudPredictor.predict()
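A sketch of the two multimodal input styles described above; the paths and column name are illustrative:

```python
# Image-only input: a directory of images, or a single image path.
image_only_input = "images/"
# Tabular + image input: a csv whose image column holds the image paths.
multimodal_input = "test.csv"

# Requires a fitted MultiModalCloudPredictor, so the calls are not executed:
# predictions = mm_predictor.predict(image_only_input)
# predictions = mm_predictor.predict(
#     multimodal_input,
#     test_data_image_column="image_path",  # hypothetical column name
# )
```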
-
predict_proba
(test_data: Union[str, pandas.core.frame.DataFrame], test_data_image_column: Optional[str] = None, **kwargs) → Optional[Union[Tuple[pandas.core.series.Series, Union[pandas.core.frame.DataFrame, pandas.core.series.Series]], pandas.core.frame.DataFrame, pandas.core.series.Series]][source]¶ Predict using SageMaker batch transform.
- Parameters
- test_data: Union(str, pandas.DataFrame)
The test data to run inference on. Can be a pandas.DataFrame or a local path to a csv file. When predicting multimodal data with an image modality:
You need to specify test_data_image_column, and make sure the image column contains relative paths to the images.
- When predicting with only images:
Can be a local path to a directory containing the images, or a local path to a single image.
- test_data_image_column: Optional[str]
If test_data involves image modality, you must specify the column name corresponding to image paths. The paths MUST be absolute paths.
- kwargs:
Refer to CloudPredictor.predict()
-
predict_proba_real_time
(test_data: Union[str, pandas.core.frame.DataFrame], test_data_image_column: Optional[str] = None, accept: str = 'application/x-parquet') → Union[pandas.core.frame.DataFrame, pandas.core.series.Series][source]¶ Predict with the deployed SageMaker endpoint. A deployed SageMaker endpoint is required. This is intended to provide low-latency inference. If you want to run inference on a large dataset, use predict() instead.
- Parameters
- test_data: Union(str, pandas.DataFrame)
The test data to run inference on. Can be a pandas.DataFrame or a local path to a csv file. When predicting multimodal data that includes an image modality:
You need to specify test_data_image_column, and make sure the image column contains relative paths to the images.
- When predicting with only images:
- Can be a pandas.DataFrame or a local path to a csv file.
Similarly, you need to specify test_data_image_column, and make sure the image column contains relative paths to the images.
Or a local path to a single image file, or a list of local paths to image files.
- test_data_image_column: str, default = None
If test_data is a csv file or pandas.DataFrame and involves an image modality, you must specify the column name corresponding to image paths. The path MUST be an absolute path.
- accept: str, default = application/x-parquet
Content type of the prediction output. Valid options are application/x-parquet, text/csv, application/json
- Returns
- Pandas.DataFrame or Pandas.Series
Will return a Pandas.Series when it’s a regression problem. Will return a Pandas.DataFrame otherwise
-
predict_real_time
(test_data: Union[str, pandas.core.frame.DataFrame], test_data_image_column: Optional[str] = None, accept: str = 'application/x-parquet') → pandas.core.series.Series[source]¶ Predict with the deployed SageMaker endpoint. A deployed SageMaker endpoint is required. This is intended to provide low-latency inference. If you want to run inference on a large dataset, use predict() instead.
- Parameters
- test_data: Union(str, pandas.DataFrame)
The test data to run inference on. Can be a pandas.DataFrame or a local path to a csv file. When predicting multimodal data that includes an image modality:
You need to specify test_data_image_column, and make sure the image column contains relative paths to the images.
- When predicting with only images:
- Can be a pandas.DataFrame or a local path to a csv file.
Similarly, you need to specify test_data_image_column, and make sure the image column contains relative paths to the images.
Or a local path to a single image file, or a list of local paths to image files.
- test_data_image_column: str, default = None
If test_data is a csv file or pandas.DataFrame and involves an image modality, you must specify the column name corresponding to image paths. The path MUST be an absolute path.
- accept: str, default = application/x-parquet
Content type of the prediction output. Valid options are application/x-parquet, text/csv, application/json
- Returns
- Pandas.Series
- Predict results in Series
-
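A hypothetical usage sketch for real-time inference: it assumes cloud_predictor has already been deployed via deploy(), and realtime_predict_rows is an illustrative helper, not part of the AutoGluon-Cloud API. predict_real_time() is meant for small, latency-sensitive payloads; use predict() for large datasets.

```python
import pandas as pd


def realtime_predict_rows(cloud_predictor, rows):
    """Score a small batch of feature dicts against the SageMaker endpoint.

    `rows` is a list of {column: value} dicts; they are packed into a
    pandas.DataFrame, which predict_real_time() accepts directly.
    """
    test_data = pd.DataFrame(rows)
    return cloud_predictor.predict_real_time(
        test_data, accept="application/x-parquet"
    )
```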
property
predictor_type
¶ Type of the underlying AutoGluon Predictor
-
save
(silent: bool = False) → None¶ Save the CloudPredictor so that the user can later reload it to regain access to the deployed endpoint.
-
to_local_predictor
(save_path: Optional[str] = None, **kwargs)¶ Convert the SageMaker trained predictor to a local AutoGluon Predictor.
- Parameters
- save_path: str
Path to save the model. If None, CloudPredictor will create a folder for the model.
- kwargs:
Additional arguments to be passed to the load() call of the underlying predictor
- Returns
- AutoGluon Predictor,
TabularPredictor or MultiModalPredictor based on predictor_type
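To continue experimenting offline after cloud training completes, the hypothetical helper below (bring_model_local is not part of the AutoGluon-Cloud API) wraps to_local_predictor(); the returned object is a TabularPredictor or MultiModalPredictor depending on predictor_type.

```python
def bring_model_local(cloud_predictor, save_path=None):
    """Download the cloud-trained artifacts and return a local AutoGluon predictor.

    If `save_path` is None, to_local_predictor() creates a folder for the
    model, as documented above. `bring_model_local` is an illustrative
    helper, not part of the API.
    """
    local_predictor = cloud_predictor.to_local_predictor(save_path=save_path)
    return local_predictor
```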