

# Deploy the model to Amazon EC2
<a name="ex1-model-deployment"></a>

To get predictions, deploy your model to Amazon EC2 using Amazon SageMaker AI.

**Topics**
+ [Deploy the Model to SageMaker AI Hosting Services](#ex1-deploy-model)
+ [(Optional) Use SageMaker AI Predictor to Reuse the Hosted Endpoint](#ex1-deploy-model-sdk-use-endpoint)
+ [(Optional) Make Prediction with Batch Transform](#ex1-batch-transform)

## Deploy the Model to SageMaker AI Hosting Services
<a name="ex1-deploy-model"></a>

To host a model through Amazon EC2 using Amazon SageMaker AI, deploy the model that you trained in [Create and Run a Training Job](ex1-train-model.md#ex1-train-model-sdk) by calling the `deploy` method of the `xgb_model` estimator. When you call the `deploy` method, you must specify the number and type of EC2 ML instances that you want to use for hosting an endpoint.

```
import sagemaker
from sagemaker.serializers import CSVSerializer

xgb_predictor = xgb_model.deploy(
    initial_instance_count=1,
    instance_type='ml.t2.medium',
    serializer=CSVSerializer()
)
```
+ `initial_instance_count` (int) – The number of instances to deploy the model to.
+ `instance_type` (str) – The type of instance that hosts your deployed model.
+ `serializer` (`BaseSerializer`) – The serializer object that converts input data of various formats (a NumPy array, list, file, or buffer) to a CSV-formatted string. This tutorial uses `CSVSerializer` because the XGBoost algorithm accepts input in CSV format.
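
To make the role of the serializer concrete, the following is a minimal sketch of the kind of CSV payload an endpoint receives. It assumes rows are sent as comma-separated values with one row per line. The `to_csv_payload` helper and the sample feature values are hypothetical and only illustrate the payload shape; they are not the SDK's `CSVSerializer` implementation.

```python
import csv
import io


def to_csv_payload(rows):
    """Hypothetical helper: turn rows of feature values into a CSV string.

    This only illustrates the payload shape; it is not the SDK's CSVSerializer.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
    # Drop the trailing newline so the payload ends with the last value.
    return buffer.getvalue().rstrip("\n")


# Made-up feature rows, not taken from the tutorial dataset.
payload = to_csv_payload([[39, 13, 2174, 40], [50, 13, 0, 13]])
print(payload)
```

Each inner list becomes one line of the payload, which matches the headerless CSV layout that the tutorial uses for its training and test files.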

The `deploy` method creates a deployable model, configures the SageMaker AI hosting services endpoint, and launches the endpoint to host the model. For more information, see the [SageMaker AI generic Estimator's deploy method](https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.Estimator.deploy) in the [Amazon SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable). To retrieve the name of the endpoint that the `deploy` method generates, run the following code:

```
xgb_predictor.endpoint_name
```

This should return the endpoint name of the `xgb_predictor`. The format of the endpoint name is `"sagemaker-xgboost-YYYY-MM-DD-HH-MM-SS-SSS"`. The endpoint stays active on the ML instance until you shut it down, so you can request real-time predictions from it at any time. Copy this endpoint name and save it so that you can reuse the endpoint to make real-time predictions elsewhere in SageMaker Studio or SageMaker AI notebook instances.
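
One way to save the endpoint name for later reuse is to write it to a small file that another notebook can read. The following is a minimal sketch under that assumption. The file name `xgb_endpoint.json` and the helpers `save_endpoint_name` and `load_endpoint_name` are hypothetical and not part of the SageMaker Python SDK.

```python
import json
from pathlib import Path

# Hypothetical file name; any location that both notebooks can read works.
ENDPOINT_FILE = Path("xgb_endpoint.json")


def save_endpoint_name(endpoint_name, path=ENDPOINT_FILE):
    """Store the endpoint name as JSON so another notebook can load it."""
    path.write_text(json.dumps({"endpoint_name": endpoint_name}))


def load_endpoint_name(path=ENDPOINT_FILE):
    """Read back the endpoint name written by save_endpoint_name."""
    return json.loads(path.read_text())["endpoint_name"]


# In the notebook that deployed the model, pass xgb_predictor.endpoint_name.
# The placeholder below stands in for the real generated name.
save_endpoint_name("sagemaker-xgboost-YYYY-MM-DD-HH-MM-SS-SSS")
print(load_endpoint_name())
```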

**Tip**  
To learn more about compiling and optimizing your model for deployment to Amazon EC2 instances or edge devices, see [Compile and Deploy Models with Neo](https://docs.aws.amazon.com/sagemaker/latest/dg/neo.html).

## (Optional) Use SageMaker AI Predictor to Reuse the Hosted Endpoint
<a name="ex1-deploy-model-sdk-use-endpoint"></a>

After you deploy the model to an endpoint, you can create a new SageMaker AI predictor that attaches to the same endpoint and use it to make real-time predictions from any other notebook. The following example code shows how to use the SageMaker AI Predictor class to set up a new predictor object for the existing endpoint. Reuse the endpoint name that was generated for the `xgb_predictor`.

```
import sagemaker

xgb_predictor_reuse = sagemaker.predictor.Predictor(
    endpoint_name="sagemaker-xgboost-YYYY-MM-DD-HH-MM-SS-SSS",
    sagemaker_session=sagemaker.Session(),
    serializer=sagemaker.serializers.CSVSerializer()
)
```

The `xgb_predictor_reuse` Predictor behaves exactly the same as the original `xgb_predictor`. For more information, see the [SageMaker AI Predictor](https://sagemaker.readthedocs.io/en/stable/predictors.html#sagemaker.predictor.RealTimePredictor) class in the [Amazon SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable).

## (Optional) Make Prediction with Batch Transform
<a name="ex1-batch-transform"></a>

Instead of hosting an endpoint in production, you can run a one-time batch inference job to make predictions on a test dataset using SageMaker AI batch transform. After model training has completed, you can extend the estimator to a `transformer` object, which is based on the [SageMaker AI Transformer](https://sagemaker.readthedocs.io/en/stable/api/inference/transformer.html) class. The batch transformer reads input data from a specified S3 bucket and makes predictions.

**To run a batch transform job**

1. Run the following code to convert the feature columns of the test dataset to a CSV file and upload it to the S3 bucket:

   ```
   import os
   import boto3

   # Write the features only, without the index or a header row
   X_test.to_csv('test.csv', index=False, header=False)
   
   boto3.Session().resource('s3').Bucket(bucket).Object(
       os.path.join(prefix, 'test/test.csv')).upload_file('test.csv')
   ```

1. Specify the S3 URIs of the input and output locations for the batch transform job, as shown in the following code:

   ```
   # The location of the test dataset
   batch_input = 's3://{}/{}/test'.format(bucket, prefix)
   
   # The location to store the results of the batch transform job
   batch_output = 's3://{}/{}/batch-prediction'.format(bucket, prefix)
   ```

1. Create a transformer object, specifying the minimum set of parameters: `instance_count` and `instance_type` to run the batch transform job, and `output_path` to save the prediction data, as shown in the following code:

   ```
   transformer = xgb_model.transformer(
       instance_count=1, 
       instance_type='ml.m4.xlarge', 
       output_path=batch_output
   )
   ```

1. Start the batch transform job by calling the `transform()` method of the `transformer` object, as shown in the following code:

   ```
   transformer.transform(
       data=batch_input, 
       data_type='S3Prefix',
       content_type='text/csv', 
       split_type='Line'
   )
   transformer.wait()
   ```

1. When the batch transform job is complete, SageMaker AI saves the prediction data as `test.csv.out` in the `batch_output` path, which should be in the following format: `s3://sagemaker-<region>-111122223333/demo-sagemaker-xgboost-adult-income-prediction/batch-prediction`. Run the following AWS CLI command to download the output data of the batch transform job:

   ```
   ! aws s3 cp {batch_output} ./ --recursive
   ```

   This should create the `test.csv.out` file in the current working directory. The file contains the float values that the model predicts, based on the logistic regression objective of the XGBoost training job.
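
After you download the output, you might want to turn the predicted float values into class labels. The following is a minimal sketch, assuming that the output file holds one score per line or comma-separated scores (both are handled) and that a cutoff of 0.5 suits your use case. The helpers `read_predictions` and `to_labels`, the file name `example.csv.out`, and the sample scores are hypothetical.

```python
from pathlib import Path


def read_predictions(path):
    """Read scores from a batch transform output file.

    Assumes one score per line or comma-separated scores; both are handled.
    """
    text = Path(path).read_text()
    tokens = text.replace(",", "\n").split()
    return [float(token) for token in tokens]


def to_labels(scores, cutoff=0.5):
    """Map each score to 1 if it is above the cutoff, else 0."""
    return [1 if score > cutoff else 0 for score in scores]


# Made-up scores written to a local file so that the sketch runs on its own.
# In the tutorial, point read_predictions at the downloaded test.csv.out.
Path("example.csv.out").write_text("0.12\n0.87\n0.50\n")
scores = read_predictions("example.csv.out")
print(to_labels(scores))
```

The cutoff is a design choice: raising it trades recall for precision on the positive class, so choose it based on the evaluation of your own model.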