

本文為英文版的機器翻譯版本，如內容有任何歧義或不一致之處，概以英文版為準。

# 批次預測
<a name="timeseries-forecasting-batch"></a>

批次預測 (也稱為離線推論) 會根據一批觀察產生模型預測。對於大型資料集來說，或者如果您不需要立即回應模型預測請求，批次推論是不錯的選擇。

相反地，線上推論 (即時推論) 可即時產生預測。

您可以使用 SageMaker APIs 擷取 AutoML 任務的最佳候選項目，然後使用該候選項目提交一批輸入資料以進行推論。

1. 

**擷取 AutoML 任務的詳細資訊。**

   下列 AWS CLI 命令範例使用 [DescribeAutoMLJobV2](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeAutoMLJobV2.html) API 來取得 AutoML 任務的詳細資訊，包括最佳模型候選項目的相關資訊。

   ```
   aws sagemaker describe-auto-ml-job-v2 --auto-ml-job-name job-name --region region
   ```

1. 

**從 [InferenceContainers](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLCandidate.html#sagemaker-Type-AutoMLCandidate-InferenceContainers) 擷取容器定義，以獲得最佳模型候選項目。**

   容器定義是用於託管訓練 SageMaker AI 模型以進行預測的容器化環境。

   ```
   BEST_CANDIDATE=$(aws sagemaker describe-auto-ml-job-v2 \
         --auto-ml-job-name job-name 
         --region region \
         --query 'BestCandidate.InferenceContainers[0]' \
         --output json
   ```

   此命令會擷取最佳模型候選項目的容器定義，並將其存放在 `BEST_CANDIDATE` 變數中。

1. 

**使用最佳候選容器定義建立 SageMaker AI 模型。**

   使用上一個步驟中的容器定義，利用 [CreateModel](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html) API 建立一個 SageMaker AI 模型。

   ```
   aws sagemaker create-model \
         --model-name 'model-name' \
         --primary-container "$BEST_CANDIDATE"
         --execution-role-arn 'execution-role-arn>' \
         --region 'region>
   ```

   `--execution-role-arn` 參數指定使用模型進行推論時，SageMaker AI 擔任的 IAM 角色。如需此角色所需許可的詳細資訊，請參閱 [CreateModel API：執行角色許可](https://docs.aws.amazon.com/)。

1. 

**建立批次轉換任務。**

   下列範例使用 [CreateTransformJob](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-transform-job.html) API 建立轉換任務。

   ```
   aws sagemaker create-transform-job \ 
          --transform-job-name 'transform-job-name' \
          --model-name 'model-name'\
          --transform-input file://transform-input.json \
          --transform-output file://transform-output.json \
          --transform-resources file://transform-resources.json \
          --region 'region'
   ```

   輸入、輸出和資源詳細資訊在不同的 JSON 檔案中定義：
   + `transform-input.json`:

     ```
     {
       "DataSource": {
         "S3DataSource": {
           "S3DataType": "S3Prefix",
           "S3Uri": "s3://my-input-data-bucket/path/to/input/data"
         }
       },
       "ContentType": "text/csv",
       "SplitType": "None"
     }
     ```
   + `transform-output.json`:

     ```
     {
       "S3OutputPath": "s3://my-output-bucket/path/to/output",
       "AssembleWith": "Line"
     }
     ```
   + `transform-resources.json`:
**注意**  
我們建議將 [m5.12xlarge](https://aws.amazon.com/ec2/instance-types/m5/) 執行個體用於一般用途工作負載，將 `m5.24xlarge` 執行個體用於巨量資料預測任務。

     ```
     {
       "InstanceType": "instance-type",
       "InstanceCount": 1
     }
     ```

1. 

**使用 [DescribeTransformJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeTransformJob.html) API 監控轉換任務的進度。**

   請參閱下列 AWS CLI 命令做為範例。

   ```
   aws sagemaker describe-transform-job \
         --transform-job-name 'transform-job-name' \
         --region region
   ```

1. 

**擷取批次轉換輸出。**

   任務完成後，即可在 `S3OutputPath` 中使用預測結果。

   輸出檔案名稱的格式如下：`input_data_file_name.out`。例如，如果輸入檔案是 `text_x.csv`，則輸出名稱將是 `text_x.csv.out`。

   ```
   aws s3 ls s3://my-output-bucket/path/to/output/
   ```

下列程式碼範例說明使用適用於 Python (boto3) 的 AWS SDK 和 AWS CLI 進行批次預測。

------
#### [ AWS SDK for Python (boto3) ]

 下列範例使用**適用於 Python 的AWS SDK (boto3)** 進行批次預測。

```
import sagemaker 
import boto3

session = sagemaker.session.Session()

sm_client = boto3.client('sagemaker', region_name='us-west-2')
role = 'arn:aws:iam::1234567890:role/sagemaker-execution-role'
output_path = 's3://test-auto-ml-job/output'
input_data = 's3://test-auto-ml-job/test_X.csv'

best_candidate = sm_client.describe_auto_ml_job_v2(AutoMLJobName=job_name)['BestCandidate']
best_candidate_containers = best_candidate['InferenceContainers']
best_candidate_name = best_candidate['CandidateName']

# create model
reponse = sm_client.create_model(
    ModelName = best_candidate_name,
    ExecutionRoleArn = role,
    Containers = best_candidate_containers 
)

# Lauch Transform Job
response = sm_client.create_transform_job(
    TransformJobName=f'{best_candidate_name}-transform-job',
    ModelName=model_name,
    TransformInput={
        'DataSource': {
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': input_data
            }
        },
        'ContentType': "text/csv",
        'SplitType': 'None'
    },
    TransformOutput={
        'S3OutputPath': output_path,
        'AssembleWith': 'Line',
    },
    TransformResources={
        'InstanceType': 'ml.m5.2xlarge',
        'InstanceCount': 1,
    },
)
```

批次推論任務以下列格式傳回回應。

```
{'TransformJobArn': 'arn:aws:sagemaker:us-west-2:1234567890:transform-job/test-transform-job',
 'ResponseMetadata': {'RequestId': '659f97fc-28c4-440b-b957-a49733f7c2f2',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '659f97fc-28c4-440b-b957-a49733f7c2f2',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '96',
   'date': 'Thu, 11 Aug 2022 22:23:49 GMT'},
  'RetryAttempts': 0}}
```

------
#### [ AWS Command Line Interface (AWS CLI) ]

1. **取得最佳候選項目容器定義**。

   ```
   aws sagemaker describe-auto-ml-job-v2 --auto-ml-job-name 'test-automl-job' --region us-west-2
   ```

1. **建立模型**

   ```
   aws sagemaker create-model --model-name 'test-sagemaker-model'
   --containers '[{
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-sklearn-automl:2.5-1-cpu-py3",
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/out/test-job1/data-processor-models/test-job1-dpp0-1-e569ff7ad77f4e55a7e549a/output/model.tar.gz",
       "Environment": {
           "AUTOML_SPARSE_ENCODE_RECORDIO_PROTOBUF": "1",
           "AUTOML_TRANSFORM_MODE": "feature-transform",
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "application/x-recordio-protobuf",
           "SAGEMAKER_PROGRAM": "sagemaker_serve",
           "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code"
       }
   }, {
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-xgboost:1.3-1-cpu-py3",
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/out/test-job1/tuning/flicdf10v2-dpp0-xgb/test-job1E9-244-7490a1c0/output/model.tar.gz",
       "Environment": {
           "MAX_CONTENT_LENGTH": "20971520",
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "text/csv",
           "SAGEMAKER_INFERENCE_OUTPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_SUPPORTED": "predicted_label,probability,probabilities" 
       }
   }, {
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-sklearn-automl:2.5-1-cpu-py3", 
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/out/test-job1/data-processor-models/test-job1-dpp0-1-e569ff7ad77f4e55a7e549a/output/model.tar.gz", 
       "Environment": { 
           "AUTOML_TRANSFORM_MODE": "inverse-label-transform", 
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "text/csv", 
           "SAGEMAKER_INFERENCE_INPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_OUTPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_SUPPORTED": "predicted_label,probability,labels,probabilities", 
           "SAGEMAKER_PROGRAM": "sagemaker_serve", 
           "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code" 
       } 
   }]' \
   --execution-role-arn 'arn:aws:iam::1234567890:role/sagemaker-execution-role' \
   --region 'us-west-2'
   ```

1. **建立轉換任務**。

   ```
   aws sagemaker create-transform-job --transform-job-name 'test-tranform-job'\
    --model-name 'test-sagemaker-model'\
    --transform-input '{
           "DataSource": {
               "S3DataSource": {
                   "S3DataType": "S3Prefix",
                   "S3Uri": "s3://amzn-s3-demo-bucket/data.csv"
               }
           },
           "ContentType": "text/csv",
           "SplitType": "None"
       }'\
   --transform-output '{
           "S3OutputPath": "s3://amzn-s3-demo-bucket/output/",
           "AssembleWith": "Line"
       }'\
   --transform-resources '{
           "InstanceType": "ml.m5.2xlarge",
           "InstanceCount": 1
       }'\
   --region 'us-west-2'
   ```

1. **檢查轉換任務的進度**。

   ```
   aws sagemaker describe-transform-job --transform-job-name  'test-tranform-job' --region us-west-2
   ```

   以下是轉換任務的回應。

   ```
   {
       "TransformJobName": "test-tranform-job",
       "TransformJobArn": "arn:aws:sagemaker:us-west-2:1234567890:transform-job/test-tranform-job",
       "TransformJobStatus": "InProgress",
       "ModelName": "test-model",
       "TransformInput": {
           "DataSource": {
               "S3DataSource": {
                   "S3DataType": "S3Prefix",
                   "S3Uri": "s3://amzn-s3-demo-bucket/data.csv"
               }
           },
           "ContentType": "text/csv",
           "CompressionType": "None",
           "SplitType": "None"
       },
       "TransformOutput": {
           "S3OutputPath": "s3://amzn-s3-demo-bucket/output/",
           "AssembleWith": "Line",
           "KmsKeyId": ""
       },
       "TransformResources": {
           "InstanceType": "ml.m5.2xlarge",
           "InstanceCount": 1
       },
       "CreationTime": 1662495635.679,
       "TransformStartTime": 1662495847.496,
       "DataProcessing": {
           "InputFilter": "$",
           "OutputFilter": "$",
           "JoinSource": "None"
       }
   }
   ```

   `TransformJobStatus` 變更為之後 `Completed`，您可以在 `S3OutputPath` 檢查推論結果。

------