

# Forecast a deployed Autopilot model
<a name="timeseries-forecasting-deploy-models"></a>

After training your models using the AutoML API, you can deploy them for real-time or batch-based forecasting. 

The AutoML API trains several model candidates for your time-series data and selects an optimal forecasting model based on your target objective metric. Once your model candidates have been trained, you can find the best candidate in the response [DescribeAutoMLJobV2](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeAutoMLJobV2.html) at [BestCandidate](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLCandidate.html#sagemaker-Type-AutoMLCandidate-CandidateName).

To get predictions using this best performing model, you can either set up an endpoint to obtain forecasts interactively or use batch forecasting to make predictions on a batch of observations.

**Considerations**
+ When providing input data for forecasting, the schema of your data should remain the same as the one used to train your model, including the number of columns, column headers, and data types. You can forecast for existing or new item IDs within the same or different timestamp range to predict for a different time period.
+ Forecasting models predict for the forecast horizon points in the future specified in the input request at training, which is from the *target end date* to the *target end date \$1 forecast horizon*. To use the model for predicting specific dates, you should provide the data in the same format as the original input data, extending up to a specified *target end date*. In this scenario, the model will start predicting from the new target end date.

  For example, if your dataset had monthly data from January to June with a Forecast horizon of 2, then the model would predict the target value for the next 2 months, which would be July and August. If in August, you want to predict for the next 2 months, this time your input data should be from January to August and the model will predict for the next 2 months (September, October).
+ When forecasting future data points, there is no set minimum for the amount of historical data to provide. Include enough data to capture seasonal and recurrent patterns in your time-series.

**Topics**
+ [Real-time forecasting](timeseries-forecasting-realtime.md)
+ [Batch forecasting](timeseries-forecasting-batch.md)

# Real-time forecasting
<a name="timeseries-forecasting-realtime"></a>

Real-time forecasting is useful when you need to generate predictions on-the-fly, such as for applications that require immediate responses or when forecasting for individual data points.

By deploying your AutoML model as a real-time endpoint, you can generate forecasts on-demand and minimize the latency between receiving new data and obtaining predictions. This makes real-time forecasting well-suited for applications that require immediate, personalized, or event-driven forecasting capabilities.

For real time forecasting, the dataset should be a subset of the input dataset. The real time endpoint has an input data size of approximately 6MB and a response timeout limitation of 60 seconds. We recommend bringing in one or few items at a time.

You can use SageMaker APIs to retrieve the best candidate of an AutoML job and then create a SageMaker AI endpoint using that candidate.

Alternatively, you can chose the automatic deployment option when creating your Autopilot experiment. For information on setting up the automatic deployment of models, see [How to enable automatic deployment](autopilot-create-experiment-timeseries-forecasting.md#timeseries-forecasting-auto-model-deployment).

**To create a SageMaker AI endpoint using your best model candidate:**

1. 

**Retrieve the details of the AutoML job.**

   The following AWS CLI command example uses the [DescribeAutoMLJobV2](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeAutoMLJobV2.html) API to obtain details of the AutoML job, including the information about the best model candidate.

   ```
   aws sagemaker describe-auto-ml-job-v2 --auto-ml-job-name job-name --region region
   ```

1. 

**Extract the container definition from [InferenceContainers](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLCandidate.html#sagemaker-Type-AutoMLCandidate-InferenceContainers) for the best model candidate.**

   A container definition is the containerized environment used to host the trained SageMaker AI model for making predictions.

   ```
   BEST_CANDIDATE=$(aws sagemaker describe-auto-ml-job-v2 \
     --auto-ml-job-name job-name 
     --region region \
     --query 'BestCandidate.InferenceContainers[0]' \
     --output json
   ```

   This command extracts the container definition for the best model candidate and stores it in the `BEST_CANDIDATE` variable.

1. 

**Create a SageMaker AI model using the best candidate container definition.**

   Use the container definitions from the previous steps to create a SageMaker AI model by using the [CreateModel](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html) API.

   ```
   aws sagemaker create-model \
               --model-name 'your-candidate-name>' \
               --primary-container "$BEST_CANDIDATE"
               --execution-role-arn 'execution-role-arn>' \
               --region 'region>
   ```

   The `--execution-role-arn` parameter specifies the IAM role that SageMaker AI assumes when using the model for inference. For details on the permissions required for this role, see [CreateModel API: Execution Role Permissions](https://docs.aws.amazon.com/).

1. 

**Create a SageMaker AI endpoint configuration using the model.**

   The following AWS CLI command uses the [CreateEndpointConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpointConfig.html) API to create an endpoint configuration.

   ```
   aws sagemaker create-endpoint-config \
     --production-variants file://production-variants.json \
     --region 'region'
   ```

   Where the `production-variants.json` file contains the model configuration, including the model name and instance type.
**Note**  
We recommend using [m5.12xlarge](https://aws.amazon.com/ec2/instance-types/m5/) instances for real-time forecasting.

   ```
   [
       {
         "VariantName": "variant-name",
         "ModelName": "model-name",
         "InitialInstanceCount": 1,
         "InstanceType": "m5.12xlarge"
       }
     ]
   }
   ```

1. 

**Create the SageMaker AI endpoint using the endpoint configuration.**

   The following AWS CLI example uses the [CreateEndpoint](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpoint.html) API to create the endpoint.

   ```
   aws sagemaker create-endpoint \
               --endpoint-name 'endpoint-name>' \
               --endpoint-config-name 'endpoint-config-name' \
               --region 'region'
   ```

   Check the progress of your real-time inference endpoint deployment by using the [DescribeEndpoint](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeEndpoint.html) API. See the following AWS CLI command as an example.

   ```
   aws sagemaker describe-endpoint \
               --endpoint-name 'endpoint-name' \
               --region 'region'
   ```

   After the `EndpointStatus` changes to `InService`, the endpoint is ready to use for real-time inference.

1. 

**Invoke the SageMaker AI endpoint to make predictions.**

   ```
   aws sagemaker invoke-endpoint \
               --endpoint-name 'endpoint-name' \ 
               --region 'region' \
               --body file://input-data-in-bytes.json \
               --content-type 'application/json' outfile
   ```

   Where the `input-data-in-bytes.json` file contains the input data for the prediction.

# Batch forecasting
<a name="timeseries-forecasting-batch"></a>

Batch forecasting, also known as offline inferencing, generates model predictions on a batch of observations. Batch inference is a good option for large datasets or if you don't need an immediate response to a model prediction request.

By contrast, online inference (real-time inferencing) generates predictions in real time. 

You can use SageMaker APIs to retrieve the best candidate of an AutoML job and then submit a batch of input data for inference using that candidate.

1. 

**Retrieve the details of the AutoML job.**

   The following AWS CLI command example uses the [DescribeAutoMLJobV2](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeAutoMLJobV2.html) API to obtain details of the AutoML job, including the information about the best model candidate.

   ```
   aws sagemaker describe-auto-ml-job-v2 --auto-ml-job-name job-name --region region
   ```

1. 

**Extract the container definition from [InferenceContainers](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLCandidate.html#sagemaker-Type-AutoMLCandidate-InferenceContainers) for the best model candidate.**

   A container definition is the containerized environment used to host the trained SageMaker AI model for making predictions.

   ```
   BEST_CANDIDATE=$(aws sagemaker describe-auto-ml-job-v2 \
         --auto-ml-job-name job-name 
         --region region \
         --query 'BestCandidate.InferenceContainers[0]' \
         --output json
   ```

   This command extracts the container definition for the best model candidate and stores it in the `BEST_CANDIDATE` variable.

1. 

**Create a SageMaker AI model using the best candidate container definition.**

   Use the container definitions from the previous steps to create a SageMaker AI model by using the [CreateModel](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html) API.

   ```
   aws sagemaker create-model \
         --model-name 'model-name' \
         --primary-container "$BEST_CANDIDATE"
         --execution-role-arn 'execution-role-arn>' \
         --region 'region>
   ```

   The `--execution-role-arn` parameter specifies the IAM role that SageMaker AI assumes when using the model for inference. For details on the permissions required for this role, see [CreateModel API: Execution Role Permissions](https://docs.aws.amazon.com/).

1. 

**Create a batch transform job.**

   The following example creates a transform job using the [CreateTransformJob](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-transform-job.html) API. 

   ```
   aws sagemaker create-transform-job \ 
          --transform-job-name 'transform-job-name' \
          --model-name 'model-name'\
          --transform-input file://transform-input.json \
          --transform-output file://transform-output.json \
          --transform-resources file://transform-resources.json \
          --region 'region'
   ```

   The input, output, and resource details are defined in separate JSON files:
   + `transform-input.json`:

     ```
     {
       "DataSource": {
         "S3DataSource": {
           "S3DataType": "S3Prefix",
           "S3Uri": "s3://my-input-data-bucket/path/to/input/data"
         }
       },
       "ContentType": "text/csv",
       "SplitType": "None"
     }
     ```
   + `transform-output.json`:

     ```
     {
       "S3OutputPath": "s3://my-output-bucket/path/to/output",
       "AssembleWith": "Line"
     }
     ```
   + `transform-resources.json`:
**Note**  
We recommend using [m5.12xlarge](https://aws.amazon.com/ec2/instance-types/m5/) instances for general-purpose workloads and `m5.24xlarge` instances for big data forecasting tasks.

     ```
     {
       "InstanceType": "instance-type",
       "InstanceCount": 1
     }
     ```

1. 

**Monitor the progress of your transform job using the [DescribeTransformJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeTransformJob.html) API.**

   See the following AWS CLI command as an example.

   ```
   aws sagemaker describe-transform-job \
         --transform-job-name 'transform-job-name' \
         --region region
   ```

1. 

**Retrieve the batch transform output.**

   After the job is finished, the predicted result is available in the `S3OutputPath`. 

   The output file name has the following format: `input_data_file_name.out`. As an example, if your input file is `text_x.csv`, the output name will be `text_x.csv.out`.

   ```
   aws s3 ls s3://my-output-bucket/path/to/output/
   ```

The following code examples illustrate the use of the AWS SDK for Python (boto3) and the AWS CLI for batch forecasting.

------
#### [ AWS SDK for Python (boto3) ]

 The following example uses **AWS SDK for Python (boto3)** to make predictions in batches.

```
import sagemaker 
import boto3

session = sagemaker.session.Session()

sm_client = boto3.client('sagemaker', region_name='us-west-2')
role = 'arn:aws:iam::1234567890:role/sagemaker-execution-role'
output_path = 's3://test-auto-ml-job/output'
input_data = 's3://test-auto-ml-job/test_X.csv'

best_candidate = sm_client.describe_auto_ml_job_v2(AutoMLJobName=job_name)['BestCandidate']
best_candidate_containers = best_candidate['InferenceContainers']
best_candidate_name = best_candidate['CandidateName']

# create model
reponse = sm_client.create_model(
    ModelName = best_candidate_name,
    ExecutionRoleArn = role,
    Containers = best_candidate_containers 
)

# Lauch Transform Job
response = sm_client.create_transform_job(
    TransformJobName=f'{best_candidate_name}-transform-job',
    ModelName=model_name,
    TransformInput={
        'DataSource': {
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': input_data
            }
        },
        'ContentType': "text/csv",
        'SplitType': 'None'
    },
    TransformOutput={
        'S3OutputPath': output_path,
        'AssembleWith': 'Line',
    },
    TransformResources={
        'InstanceType': 'ml.m5.2xlarge',
        'InstanceCount': 1,
    },
)
```

The batch inference job returns a response in the following format.

```
{'TransformJobArn': 'arn:aws:sagemaker:us-west-2:1234567890:transform-job/test-transform-job',
 'ResponseMetadata': {'RequestId': '659f97fc-28c4-440b-b957-a49733f7c2f2',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '659f97fc-28c4-440b-b957-a49733f7c2f2',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '96',
   'date': 'Thu, 11 Aug 2022 22:23:49 GMT'},
  'RetryAttempts': 0}}
```

------
#### [ AWS Command Line Interface (AWS CLI) ]

1. **Obtain the best candidate container definitions**.

   ```
   aws sagemaker describe-auto-ml-job-v2 --auto-ml-job-name 'test-automl-job' --region us-west-2
   ```

1. **Create the model**.

   ```
   aws sagemaker create-model --model-name 'test-sagemaker-model'
   --containers '[{
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-sklearn-automl:2.5-1-cpu-py3",
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/out/test-job1/data-processor-models/test-job1-dpp0-1-e569ff7ad77f4e55a7e549a/output/model.tar.gz",
       "Environment": {
           "AUTOML_SPARSE_ENCODE_RECORDIO_PROTOBUF": "1",
           "AUTOML_TRANSFORM_MODE": "feature-transform",
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "application/x-recordio-protobuf",
           "SAGEMAKER_PROGRAM": "sagemaker_serve",
           "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code"
       }
   }, {
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-xgboost:1.3-1-cpu-py3",
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/out/test-job1/tuning/flicdf10v2-dpp0-xgb/test-job1E9-244-7490a1c0/output/model.tar.gz",
       "Environment": {
           "MAX_CONTENT_LENGTH": "20971520",
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "text/csv",
           "SAGEMAKER_INFERENCE_OUTPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_SUPPORTED": "predicted_label,probability,probabilities" 
       }
   }, {
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-sklearn-automl:2.5-1-cpu-py3", 
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/out/test-job1/data-processor-models/test-job1-dpp0-1-e569ff7ad77f4e55a7e549a/output/model.tar.gz", 
       "Environment": { 
           "AUTOML_TRANSFORM_MODE": "inverse-label-transform", 
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "text/csv", 
           "SAGEMAKER_INFERENCE_INPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_OUTPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_SUPPORTED": "predicted_label,probability,labels,probabilities", 
           "SAGEMAKER_PROGRAM": "sagemaker_serve", 
           "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code" 
       } 
   }]' \
   --execution-role-arn 'arn:aws:iam::1234567890:role/sagemaker-execution-role' \
   --region 'us-west-2'
   ```

1. **Create a transform job**.

   ```
   aws sagemaker create-transform-job --transform-job-name 'test-tranform-job'\
    --model-name 'test-sagemaker-model'\
    --transform-input '{
           "DataSource": {
               "S3DataSource": {
                   "S3DataType": "S3Prefix",
                   "S3Uri": "s3://amzn-s3-demo-bucket/data.csv"
               }
           },
           "ContentType": "text/csv",
           "SplitType": "None"
       }'\
   --transform-output '{
           "S3OutputPath": "s3://amzn-s3-demo-bucket/output/",
           "AssembleWith": "Line"
       }'\
   --transform-resources '{
           "InstanceType": "ml.m5.2xlarge",
           "InstanceCount": 1
       }'\
   --region 'us-west-2'
   ```

1. **Check the progress of the transform job**. 

   ```
   aws sagemaker describe-transform-job --transform-job-name  'test-tranform-job' --region us-west-2
   ```

   The following is the response from the transform job.

   ```
   {
       "TransformJobName": "test-tranform-job",
       "TransformJobArn": "arn:aws:sagemaker:us-west-2:1234567890:transform-job/test-tranform-job",
       "TransformJobStatus": "InProgress",
       "ModelName": "test-model",
       "TransformInput": {
           "DataSource": {
               "S3DataSource": {
                   "S3DataType": "S3Prefix",
                   "S3Uri": "s3://amzn-s3-demo-bucket/data.csv"
               }
           },
           "ContentType": "text/csv",
           "CompressionType": "None",
           "SplitType": "None"
       },
       "TransformOutput": {
           "S3OutputPath": "s3://amzn-s3-demo-bucket/output/",
           "AssembleWith": "Line",
           "KmsKeyId": ""
       },
       "TransformResources": {
           "InstanceType": "ml.m5.2xlarge",
           "InstanceCount": 1
       },
       "CreationTime": 1662495635.679,
       "TransformStartTime": 1662495847.496,
       "DataProcessing": {
           "InputFilter": "$",
           "OutputFilter": "$",
           "JoinSource": "None"
       }
   }
   ```

   After the `TransformJobStatus` changes to `Completed`, you can check the inference result in the `S3OutputPath`.

------