

# Give SageMaker AI Access to Resources in your Amazon VPC
<a name="infrastructure-give-access"></a>

SageMaker AI runs the following job types in an Amazon Virtual Private Cloud by default. 
+ Processing
+ Training
+ Model hosting
+ Batch transform
+ Amazon SageMaker Clarify
+ SageMaker AI Compilation

However, containers for these jobs access AWS resources, such as the Amazon Simple Storage Service (Amazon S3) buckets where you store training data and model artifacts, over the internet.

To control access to your data and job containers, we recommend that you create a private VPC and configure it so that they aren't accessible over the internet. For information about creating and configuring a VPC, see [Getting Started With Amazon VPC](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/getting-started-ipv4.html) in the *Amazon VPC User Guide*. Using a VPC helps to protect your job containers and data because you can configure your VPC so that it is not connected to the internet. Using a VPC also allows you to monitor all network traffic in and out of your job containers by using VPC flow logs. For more information, see [VPC Flow Logs](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/flow-logs.html) in the *Amazon VPC User Guide*.

You specify your private VPC configuration when you create jobs by specifying subnets and security groups. When you specify the subnets and security groups, SageMaker AI creates *elastic network interfaces* that are associated with your security groups in one of the subnets. Network interfaces allow your job containers to connect to resources in your VPC. For information about network interfaces, see [Elastic Network Interfaces](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_ElasticNetworkInterfaces.html) in the *Amazon VPC User Guide*.

You specify a VPC configuration in the `NetworkConfig.VpcConfig` parameter of the [CreateProcessingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateProcessingJob.html) operation or in the `VpcConfig` parameter of the [CreateTrainingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html) operation. Specifying a VPC configuration when you create a job gives the job's containers access to resources within your VPC.

Specifying a VPC configuration alone doesn't change the invocation path. To connect to Amazon SageMaker AI within a VPC, create a VPC endpoint and invoke it. For more information, see [Connect to SageMaker AI Within your VPC](interface-vpc-endpoint.md).

**Topics**
+ [Give SageMaker AI Processing Jobs Access to Resources in Your Amazon VPC](process-vpc.md)
+ [Give SageMaker AI Training Jobs Access to Resources in Your Amazon VPC](train-vpc.md)
+ [Give SageMaker AI Hosted Endpoints Access to Resources in Your Amazon VPC](host-vpc.md)
+ [Give Batch Transform Jobs Access to Resources in Your Amazon VPC](batch-vpc.md)
+ [Give Amazon SageMaker Clarify Jobs Access to Resources in Your Amazon VPC](clarify-vpc.md)
+ [Give SageMaker AI Compilation Jobs Access to Resources in Your Amazon VPC](neo-vpc.md)
+ [Give Inference Recommender Jobs Access to Resources in Your Amazon VPC](inference-recommender-vpc-access.md)

# Give SageMaker AI Processing Jobs Access to Resources in Your Amazon VPC
<a name="process-vpc"></a>

To control access to your data and processing jobs, create an Amazon VPC with private subnets. For information about creating and configuring a VPC, see [Get Started With Amazon VPC](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-getting-started.html) in the *Amazon VPC User Guide*.

You can monitor all network traffic in and out of your processing containers by using VPC flow logs. For more information, see [VPC Flow Logs](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/flow-logs.html) in the *Amazon VPC User Guide*.

This document explains how to add Amazon VPC configurations for processing jobs.

## Configure a Processing Job for Amazon VPC Access
<a name="process-vpc-configure"></a>

You configure the processing job by specifying the subnets and security group IDs within the VPC. You don’t need to specify the subnet for the processing container. Amazon SageMaker AI automatically pulls the processing container from Amazon ECR. For more information about processing containers, see [Data transformation workloads with SageMaker Processing](processing-job.md).

When creating a processing job, you can specify subnets and security groups in your VPC using either the SageMaker AI console or the API.

To use the API, you specify the subnets and security group IDs in the `NetworkConfig.VpcConfig` parameter of the [CreateProcessingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateProcessingJob.html) operation. SageMaker AI uses the subnet and security group details to create the network interfaces and attach them to the processing containers. The network interfaces provide the processing containers with a network connection within your VPC. This allows the processing job to connect to resources that exist in your VPC.

The following is an example of the `VpcConfig` parameter that you include in your call to the `CreateProcessingJob` operation:

```
"VpcConfig": {
    "Subnets": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ],
    "SecurityGroupIds": [
        "sg-0123456789abcdef0"
    ]
}
```
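As a minimal sketch, the `VpcConfig` fragment above can be assembled in Python and nested under `NetworkConfig`, which is where `CreateProcessingJob` expects it. The helper name `build_network_config` is hypothetical, and the IDs are the placeholders from the example above. The sketch only builds the dictionary and doesn't call AWS.

```python
import json


def build_network_config(subnet_ids, security_group_ids):
    """Return a NetworkConfig dict that wraps the VpcConfig for a processing job."""
    if not subnet_ids or not security_group_ids:
        raise ValueError("at least one subnet and one security group are required")
    return {
        "VpcConfig": {
            "Subnets": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        }
    }


network_config = build_network_config(
    [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2",
    ],
    ["sg-0123456789abcdef0"],
)
print(json.dumps(network_config, indent=4))
```

If you use the AWS SDK for Python (Boto3), a dictionary like this could be passed as the `NetworkConfig` argument of the SageMaker client's `create_processing_job` call, alongside the other required job parameters.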

## Configure Your Private VPC for SageMaker AI Processing
<a name="process-vpc-vpc"></a>

When configuring the private VPC for your SageMaker AI processing jobs, use the following guidelines. For information about setting up a VPC, see [Working with VPCs and Subnets](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/working-with-vpcs.html) in the *Amazon VPC User Guide*.

**Topics**
+ [Ensure That Subnets Have Enough IP Addresses](#process-vpc-ip)
+ [Create an Amazon S3 VPC Endpoint](#process-vpc-s3)
+ [Use a Custom Endpoint Policy to Restrict Access to S3](#process-vpc-policy)
+ [Configure Route Tables](#process-vpc-route-table)
+ [Configure the VPC Security Group](#process-vpc-groups)
+ [Connect to Resources Outside Your VPC](#process-vpc-nat)
+ [Monitor Amazon SageMaker Processing Jobs with CloudWatch Logs and Metrics](#process-vpc-cloudwatch)

### Ensure That Subnets Have Enough IP Addresses
<a name="process-vpc-ip"></a>

Your VPC subnets should have at least two private IP addresses for each instance in a processing job. For more information, see [VPC and Subnet Sizing for IPv4](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Subnets.html#vpc-sizing-ipv4) in the *Amazon VPC User Guide*.
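The sizing guideline above can be checked with a short calculation. The following is a rough sketch, not an official sizing tool: the function names and the example CIDR block are hypothetical, and it assumes that AWS reserves five addresses in every subnet and that no other resources are using addresses in the subnet.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_ips(cidr):
    """Upper bound on the private IP addresses available in a subnet."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def subnet_fits_job(cidr, instance_count, ips_per_instance=2):
    """Check whether a subnet could hold a job at two addresses per instance."""
    return usable_ips(cidr) >= instance_count * ips_per_instance


print(usable_ips("10.0.1.0/28"))
print(subnet_fits_job("10.0.1.0/28", instance_count=4))
print(subnet_fits_job("10.0.1.0/28", instance_count=6))
```

Because other resources in the subnet also consume addresses, treat the result as an upper bound and leave headroom.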

### Create an Amazon S3 VPC Endpoint
<a name="process-vpc-s3"></a>

If you configure your VPC so that processing containers don't have access to the internet, they can't connect to the Amazon S3 buckets that contain your data unless you create a VPC endpoint that allows access. By creating a VPC endpoint, you allow your processing containers to access the buckets where you store your data. We recommend that you also create a custom policy that allows only requests from your private VPC to access your S3 buckets. For more information, see [Endpoints for Amazon S3](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints-s3.html).

**To create an S3 VPC endpoint:**

1. Open the Amazon VPC console at [https://console.aws.amazon.com/vpc/](https://console.aws.amazon.com/vpc/).

1. In the navigation pane, choose **Endpoints**, then choose **Create Endpoint**.

1. For **Service Name**, choose **com.amazonaws.*region*.s3**, where *region* is the name of the region where your VPC resides.

1. For **VPC**, choose the VPC you want to use for this endpoint.

1. For **Configure route tables**, select the route tables to be used by the endpoint. The VPC service automatically adds a route to each route table you select that points any S3 traffic to the new endpoint.

1. For **Policy**, choose **Full Access** to allow full access to the S3 service by any user or service within the VPC. Choose **Custom** to restrict access further. For information, see [Use a Custom Endpoint Policy to Restrict Access to S3](#process-vpc-policy).
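The console steps above can also be expressed as request parameters. The following sketch assumes a gateway-type endpoint, which is the type that uses route tables. The helper name, the Region, and the resource IDs are hypothetical placeholders. It only builds the parameter dictionary and doesn't call AWS.

```python
def s3_gateway_endpoint_params(region, vpc_id, route_table_ids, policy_document=None):
    """Build keyword arguments for creating an S3 gateway endpoint."""
    params = {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }
    if policy_document is not None:
        # A custom endpoint policy, supplied as a JSON string.
        # If omitted, the default full-access policy applies.
        params["PolicyDocument"] = policy_document
    return params


params = s3_gateway_endpoint_params(
    region="us-west-2",                        # hypothetical Region
    vpc_id="vpc-0123456789abcdef0",            # hypothetical VPC ID
    route_table_ids=["rtb-0123456789abcdef0"], # hypothetical route table ID
)
print(params)
```

With Boto3, these keyword arguments correspond to the Amazon EC2 client's `create_vpc_endpoint` call.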

### Use a Custom Endpoint Policy to Restrict Access to S3
<a name="process-vpc-policy"></a>

The default endpoint policy allows full access to S3 for any user or service in your VPC. To further restrict access to S3, create a custom endpoint policy. For more information, see [Using Endpoint Policies for Amazon S3](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html#vpc-endpoints-policies-s3). You can also use a bucket policy to restrict access to your S3 buckets to only traffic that comes from your Amazon VPC. For information, see [Using Amazon S3 Bucket Policies](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html#vpc-endpoints-s3-bucket-policies).

#### Restrict Package Installation on the Processing Container
<a name="process-vpc-policy-repos"></a>

The default endpoint policy allows users to install packages from the Amazon Linux and Amazon Linux 2 repositories on the processing container. If you don't want users to install packages from those repositories, create a custom endpoint policy that explicitly denies access to the Amazon Linux and Amazon Linux 2 repositories. The following is an example of a policy that denies access to these repositories:

```
{ 
    "Statement": [ 
      { 
        "Sid": "AmazonLinuxAMIRepositoryAccess",
        "Principal": "*",
        "Action": [ 
            "s3:GetObject" 
        ],
        "Effect": "Deny",
        "Resource": [
            "arn:aws:s3:::packages.*.amazonaws.com/*",
            "arn:aws:s3:::repo.*.amazonaws.com/*"
        ] 
      } 
    ] 
} 

{ 
    "Statement": [ 
        { "Sid": "AmazonLinux2AMIRepositoryAccess",
          "Principal": "*",
          "Action": [ 
              "s3:GetObject" 
              ],
          "Effect": "Deny",
          "Resource": [
              "arn:aws:s3:::amazonlinux.*.amazonaws.com/*" 
              ] 
         } 
    ] 
}
```
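The example above shows two separate policy documents, but an endpoint has a single policy. One way to apply both restrictions is to merge their statements into one document, sketched below. The helper name `merge_statements` and the added `Version` element are assumptions that aren't part of the example above. The sketch only merges the deny statements and doesn't add any `Allow` statements that your jobs might also need.

```python
import json

AMAZON_LINUX_DENY = {
    "Sid": "AmazonLinuxAMIRepositoryAccess",
    "Principal": "*",
    "Action": ["s3:GetObject"],
    "Effect": "Deny",
    "Resource": [
        "arn:aws:s3:::packages.*.amazonaws.com/*",
        "arn:aws:s3:::repo.*.amazonaws.com/*",
    ],
}

AMAZON_LINUX_2_DENY = {
    "Sid": "AmazonLinux2AMIRepositoryAccess",
    "Principal": "*",
    "Action": ["s3:GetObject"],
    "Effect": "Deny",
    "Resource": ["arn:aws:s3:::amazonlinux.*.amazonaws.com/*"],
}


def merge_statements(*statements):
    """Combine policy statements into one policy document."""
    # The Version value is an assumption; the example above omits it.
    return {"Version": "2012-10-17", "Statement": [dict(s) for s in statements]}


policy = merge_statements(AMAZON_LINUX_DENY, AMAZON_LINUX_2_DENY)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```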

### Configure Route Tables
<a name="process-vpc-route-table"></a>

Use default DNS settings for your endpoint route table, so that standard Amazon S3 URLs (for example, `http://s3-aws-region.amazonaws.com/amzn-s3-demo-bucket`) resolve. If you don't use default DNS settings, ensure that the URLs that you use to specify the locations of the data in your processing jobs resolve by configuring the endpoint route tables. For information about VPC endpoint route tables, see [Routing for Gateway Endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpce-gateway.html#vpc-endpoints-routing) in the *Amazon VPC User Guide*.

### Configure the VPC Security Group
<a name="process-vpc-groups"></a>

In distributed processing, you must allow communication between the different containers in the same processing job. To do that, configure a rule for your security group that allows inbound connections between members of the same security group. For more information, see [Security Group Rules](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_SecurityGroups.html#SecurityGroupRules).
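A rule that allows inbound connections between members of the same security group names the group itself as the traffic source. The following sketch builds such a rule. The helper name and the group ID are hypothetical, and allowing all protocols (`"-1"`) is an assumption that you might want to narrow to the ports your containers use.

```python
def self_referencing_ingress(group_id):
    """Build an inbound rule whose source is the security group itself."""
    return {
        "GroupId": group_id,
        "IpPermissions": [
            {
                "IpProtocol": "-1",  # all protocols; an assumption, narrow if possible
                "UserIdGroupPairs": [{"GroupId": group_id}],
            }
        ],
    }


rule = self_referencing_ingress("sg-0123456789abcdef0")
print(rule)
```

With Boto3, this dictionary matches the keyword arguments of the Amazon EC2 client's `authorize_security_group_ingress` call.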

### Connect to Resources Outside Your VPC
<a name="process-vpc-nat"></a>

If you're connecting your processing jobs to resources outside the VPC that they're running in, do one of the following:
+ **Connect to other AWS services** – If your processing job needs access to an AWS service that supports interface Amazon VPC endpoints, create an endpoint to connect to that service. For a list of services that support interface endpoints, see [AWS services that integrate with AWS PrivateLink](https://docs.aws.amazon.com/vpc/latest/privatelink/aws-services-privatelink-support.html) in the *AWS PrivateLink User Guide*. For information about creating an interface VPC endpoint, see [Access an AWS service using an interface VPC endpoint](https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html) in the *AWS PrivateLink User Guide*.
+ **Connect to resources over the internet** – If your processing jobs are running on instances in an Amazon VPC that does not have a subnet with access to the internet, the jobs won't have access to resources on the internet. If your processing job needs access to an AWS service that doesn't support interface VPC endpoints, or to a resource outside of AWS, ensure that you are running your jobs in a private subnet that has access to the internet using a public NAT gateway in a public subnet. After you have your jobs running in the private subnet, configure your security groups and network access control lists (NACLs) to allow outbound connections from the private subnet to the public NAT gateway in the public subnet. For information, see [NAT gateways](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html) in the *Amazon VPC User Guide*.

### Monitor Amazon SageMaker Processing Jobs with CloudWatch Logs and Metrics
<a name="process-vpc-cloudwatch"></a>

Amazon SageMaker AI provides Amazon CloudWatch logs and metrics to monitor processing jobs. CloudWatch provides CPU, GPU, memory, GPU memory, and disk metrics, and event logging. For more information about monitoring Amazon SageMaker processing jobs, see [Amazon SageMaker AI metrics in Amazon CloudWatch](monitoring-cloudwatch.md) and [SageMaker AI job metrics](monitoring-cloudwatch.md#cloudwatch-metrics-jobs).

# Give SageMaker AI Training Jobs Access to Resources in Your Amazon VPC
<a name="train-vpc"></a>

**Note**  
For training jobs, you can configure only subnets with a default tenancy VPC in which your instance runs on shared hardware. For more information on the tenancy attribute for VPCs, see [Dedicated Instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/dedicated-instance.html).

## Configure a Training Job for Amazon VPC Access
<a name="train-vpc-configure"></a>

To control access to your training jobs, run them in an Amazon VPC with private subnets that don’t have internet access.

You configure the training job to run in the VPC by specifying its subnets and security group IDs. You don’t need to specify the subnet for the container of the training job. Amazon SageMaker AI automatically pulls the training container image from Amazon ECR.

When you create a training job, you can specify the subnets and security groups in your VPC using the Amazon SageMaker AI console or the API.

To use the API, you specify the subnets and security group IDs in the `VpcConfig` parameter of the [CreateTrainingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html) operation. SageMaker AI uses the subnet and security group details to create the network interfaces and attach them to the training containers. The network interfaces provide the training containers with a network connection within your VPC. This allows the training job to connect to resources that exist in your VPC.

The following is an example of the `VpcConfig` parameter that you include in your call to the `CreateTrainingJob` operation:

```
"VpcConfig": {
    "Subnets": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ],
    "SecurityGroupIds": [
        "sg-0123456789abcdef0"
    ]
}
```

## Configure Your Private VPC for SageMaker AI Training
<a name="train-vpc-vpc"></a>

When configuring the private VPC for your SageMaker AI training jobs, use the following guidelines. For information about setting up a VPC, see [Working with VPCs and Subnets](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/working-with-vpcs.html) in the *Amazon VPC User Guide*.

**Topics**
+ [Ensure That Subnets Have Enough IP Addresses](#train-vpc-ip)
+ [Create an Amazon S3 VPC Endpoint](#train-vpc-s3)
+ [Use a Custom Endpoint Policy to Restrict Access to S3](#train-vpc-policy)
+ [Configure Route Tables](#train-vpc-route-table)
+ [Configure the VPC Security Group](#train-vpc-groups)
+ [Connect to Resources Outside Your VPC](#train-vpc-nat)
+ [Monitor Amazon SageMaker Training Jobs with CloudWatch Logs and Metrics](#train-vpc-cloudwatch)

### Ensure That Subnets Have Enough IP Addresses
<a name="train-vpc-ip"></a>

Training instances that *don't use* an Elastic Fabric Adapter (EFA) should have at least 2 private IP addresses. Training instances that use an EFA should have at least 5 private IP addresses. For more information, see [Multiple IP addresses](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/MultipleIP.html) in the Amazon EC2 User Guide.

Your VPC subnets should have at least two private IP addresses for each instance in a training job. For more information, see [VPC and Subnet Sizing for IPv4](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Subnets.html#vpc-sizing-ipv4) in the *Amazon VPC User Guide*.
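The per-instance figures above translate into a simple estimate of how many private IP addresses a training job needs. This is a minimal sketch with a hypothetical function name. It applies the figures stated above (two addresses per instance without EFA, five with EFA) and doesn't account for other resources in the subnet.

```python
def required_private_ips(instance_count, uses_efa=False):
    """Estimate the private IP addresses a training job needs."""
    per_instance = 5 if uses_efa else 2
    return instance_count * per_instance


print(required_private_ips(8))
print(required_private_ips(8, uses_efa=True))
```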

### Create an Amazon S3 VPC Endpoint
<a name="train-vpc-s3"></a>

If you configure your VPC so that training containers don't have access to the internet, they can't connect to the Amazon S3 buckets that contain your training data unless you create a VPC endpoint that allows access. By creating a VPC endpoint, you allow your training containers to access the buckets where you store your data and model artifacts. We recommend that you also create a custom policy that allows only requests from your private VPC to access your S3 buckets. For more information, see [Endpoints for Amazon S3](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints-s3.html).

**To create an S3 VPC endpoint:**

1. Open the Amazon VPC console at [https://console.aws.amazon.com/vpc/](https://console.aws.amazon.com/vpc/).

1. In the navigation pane, choose **Endpoints**, then choose **Create Endpoint**.

1. For **Service Name**, search for **com.amazonaws.*region*.s3**, where *region* is the name of the region where your VPC resides.

1. Choose the **Gateway** type.

1. For **VPC**, choose the VPC you want to use for this endpoint.

1. For **Configure route tables**, select the route tables to be used by the endpoint. The VPC service automatically adds a route to each route table you select that points any S3 traffic to the new endpoint.

1. For **Policy**, choose **Full Access** to allow full access to the S3 service by any user or service within the VPC. Choose **Custom** to restrict access further. For information, see [Use a Custom Endpoint Policy to Restrict Access to S3](#train-vpc-policy).

### Use a Custom Endpoint Policy to Restrict Access to S3
<a name="train-vpc-policy"></a>

The default endpoint policy allows full access to S3 for any user or service in your VPC. To further restrict access to S3, create a custom endpoint policy. For more information, see [Using Endpoint Policies for Amazon S3](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html#vpc-endpoints-policies-s3). You can also use a bucket policy to restrict access to your S3 buckets to only traffic that comes from your Amazon VPC. For information, see [Using Amazon S3 Bucket Policies](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html#vpc-endpoints-s3-bucket-policies).
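As an illustration of the bucket policy approach, the following sketch builds a policy that denies requests that don't originate from a given VPC, using the `aws:SourceVpc` condition key. The helper name and the VPC ID are hypothetical, and `amzn-s3-demo-bucket` is a placeholder bucket name. A deny statement like this also blocks access from outside the VPC, including from the console, so review it carefully before you apply it.

```python
import json


def vpc_only_bucket_policy(bucket_name, vpc_id):
    """Build a bucket policy that denies S3 requests from outside one VPC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyRequestsFromOutsideVpc",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:SourceVpc": vpc_id}},
            }
        ],
    }


bucket_policy = vpc_only_bucket_policy("amzn-s3-demo-bucket", "vpc-0123456789abcdef0")
print(json.dumps(bucket_policy, indent=2))
```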

#### Restrict Package Installation on the Training Container
<a name="train-vpc-policy-repos"></a>

The default endpoint policy allows users to install packages from the Amazon Linux and Amazon Linux 2 repositories on the training container. If you don't want users to install packages from those repositories, create a custom endpoint policy that explicitly denies access to the Amazon Linux and Amazon Linux 2 repositories. The following is an example of a policy that denies access to these repositories:

```
{ 
    "Statement": [ 
      { 
        "Sid": "AmazonLinuxAMIRepositoryAccess",
        "Principal": "*",
        "Action": [ 
            "s3:GetObject" 
        ],
        "Effect": "Deny",
        "Resource": [
            "arn:aws:s3:::packages.*.amazonaws.com/*",
            "arn:aws:s3:::repo.*.amazonaws.com/*"
        ] 
      } 
    ] 
} 

{ 
    "Statement": [ 
        { "Sid": "AmazonLinux2AMIRepositoryAccess",
          "Principal": "*",
          "Action": [ 
              "s3:GetObject" 
              ],
          "Effect": "Deny",
          "Resource": [
              "arn:aws:s3:::amazonlinux.*.amazonaws.com/*" 
              ] 
         } 
    ] 
}
```

### Configure Route Tables
<a name="train-vpc-route-table"></a>

Use default DNS settings for your endpoint route table, so that standard Amazon S3 URLs (for example, `http://s3-aws-region.amazonaws.com/amzn-s3-demo-bucket`) resolve. If you don't use default DNS settings, ensure that the URLs that you use to specify the locations of the data in your training jobs resolve by configuring the endpoint route tables. For information about VPC endpoint route tables, see [Routing for Gateway Endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpce-gateway.html#vpc-endpoints-routing) in the *Amazon VPC User Guide*.

### Configure the VPC Security Group
<a name="train-vpc-groups"></a>

In distributed training, you must allow communication between the different containers in the same training job. To do that, configure a rule for your security group that allows inbound connections between members of the same security group. For EFA-enabled instances, ensure that both inbound and outbound connections allow all traffic from the same security group. For information, see [Security Group Rules](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_SecurityGroups.html#SecurityGroupRules) in the *Amazon Virtual Private Cloud User Guide*.
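For EFA-enabled instances, the guidance above calls for both an inbound and an outbound rule that reference the same security group. The following sketch builds both rules. The helper name and the group ID are hypothetical, and `"-1"` is the protocol value that stands for all traffic.

```python
def efa_security_group_rules(group_id):
    """Build inbound and outbound all-traffic rules scoped to the group itself."""
    permissions = [
        {"IpProtocol": "-1", "UserIdGroupPairs": [{"GroupId": group_id}]}
    ]
    return {
        "ingress": {"GroupId": group_id, "IpPermissions": permissions},
        "egress": {"GroupId": group_id, "IpPermissions": permissions},
    }


rules = efa_security_group_rules("sg-0123456789abcdef0")
print(rules["ingress"])
print(rules["egress"])
```

With Boto3, the two dictionaries correspond to the keyword arguments of the Amazon EC2 client's `authorize_security_group_ingress` and `authorize_security_group_egress` calls.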

### Connect to Resources Outside Your VPC
<a name="train-vpc-nat"></a>

If you configure your VPC so that it doesn't have internet access, training jobs that use that VPC do not have access to resources outside your VPC. If your training job needs access to resources outside your VPC, provide access with one of the following options:
+ If your training job needs access to an AWS service that supports interface VPC endpoints, create an endpoint to connect to that service. For a list of services that support interface endpoints, see [VPC Endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints.html) in the *Amazon Virtual Private Cloud User Guide*. For information about creating an interface VPC endpoint, see [Interface VPC Endpoints (AWS PrivateLink)](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpce-interface.html) in the *Amazon Virtual Private Cloud User Guide*.
+ If your training job needs access to an AWS service that doesn't support interface VPC endpoints or to a resource outside of AWS, create a NAT gateway and configure your security groups to allow outbound connections. For information about setting up a NAT gateway for your VPC, see [Scenario 2: VPC with Public and Private Subnets (NAT)](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html) in the *Amazon Virtual Private Cloud User Guide*.

### Monitor Amazon SageMaker Training Jobs with CloudWatch Logs and Metrics
<a name="train-vpc-cloudwatch"></a>

Amazon SageMaker AI provides Amazon CloudWatch logs and metrics to monitor training jobs. CloudWatch provides CPU, GPU, memory, GPU memory, and disk metrics, and event logging. For more information about monitoring Amazon SageMaker training jobs, see [Amazon SageMaker AI metrics in Amazon CloudWatch](monitoring-cloudwatch.md) and [SageMaker AI job metrics](monitoring-cloudwatch.md#cloudwatch-metrics-jobs).

# Give SageMaker AI Hosted Endpoints Access to Resources in Your Amazon VPC
<a name="host-vpc"></a>

## Configure a Model for Amazon VPC Access
<a name="host-vpc-configure"></a>

To specify subnets and security groups in your private VPC, use the `VpcConfig` request parameter of the [CreateModel](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html) API, or provide this information when you create a model in the SageMaker AI console. SageMaker AI uses this information to create network interfaces and attach them to your model containers. The network interfaces provide your model containers with a network connection within your VPC that is not connected to the internet. They also enable your model to connect to resources in your private VPC.

**Note**  
You must create at least two subnets in different Availability Zones in your private VPC, even if you have only one hosting instance.

The following is an example of the `VpcConfig` parameter that you include in your call to `CreateModel`:

```
"VpcConfig": {
    "Subnets": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ],
    "SecurityGroupIds": [
        "sg-0123456789abcdef0"
    ]
}
```
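Before you call `CreateModel`, you can check that the subnets you plan to pass meet the requirement of at least two subnets in different Availability Zones. The following is a minimal sketch. The function name and the subnet-to-zone mappings are hypothetical. In practice, you could look up each subnet's Availability Zone with the Amazon EC2 `DescribeSubnets` operation.

```python
def spans_multiple_azs(subnet_to_az):
    """Check for at least two subnets that cover at least two Availability Zones."""
    return len(subnet_to_az) >= 2 and len(set(subnet_to_az.values())) >= 2


# Hypothetical mappings from subnet ID to Availability Zone.
good = {
    "subnet-0123456789abcdef0": "us-west-2a",
    "subnet-0123456789abcdef1": "us-west-2b",
}
bad = {
    "subnet-0123456789abcdef0": "us-west-2a",
    "subnet-0123456789abcdef1": "us-west-2a",
}

print(spans_multiple_azs(good))
print(spans_multiple_azs(bad))
```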

## Configure Your Private VPC for SageMaker AI Hosting
<a name="host-vpc-vpc"></a>

When configuring the private VPC for your SageMaker AI models, use the following guidelines. For information about setting up a VPC, see [Working with VPCs and Subnets](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/working-with-vpcs.html) in the *Amazon VPC User Guide*.

**Topics**
+ [Ensure That Subnets Have Enough IP Addresses](#host-vpc-ip)
+ [Create an Amazon S3 VPC Endpoint](#host-vpc-s3)
+ [Use a Custom Endpoint Policy to Restrict Access to Amazon S3](#host-vpc-policy)
+ [Add Permissions for Endpoint Access for Containers Running in a VPC to Custom IAM Policies](#host-vpc-endpoints)
+ [Configure Route Tables](#host-vpc-route-table)
+ [Connect to Resources Outside Your VPC](#model-vpc-nat)

### Ensure That Subnets Have Enough IP Addresses
<a name="host-vpc-ip"></a>

Training instances that don't use an Elastic Fabric Adapter (EFA) should have at least 2 private IP addresses. Training instances that use an EFA should have at least 5 private IP addresses. For more information, see [Multiple IP addresses](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/MultipleIP.html) in the Amazon EC2 User Guide.

### Create an Amazon S3 VPC Endpoint
<a name="host-vpc-s3"></a>

If you configure your VPC so that model containers don't have access to the internet, they can't connect to the Amazon S3 buckets that contain your data unless you create a VPC endpoint that allows access. By creating a VPC endpoint, you allow your model containers to access the buckets where you store your data and model artifacts. We recommend that you also create a custom policy that allows only requests from your private VPC to access your S3 buckets. For more information, see [Endpoints for Amazon S3](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints-s3.html).

**To create an Amazon S3 VPC endpoint:**

1. Open the Amazon VPC console at [https://console.aws.amazon.com/vpc/](https://console.aws.amazon.com/vpc/).

1. In the navigation pane, choose **Endpoints**, then choose **Create Endpoint**.

1. For **Service Name**, choose **com.amazonaws.*region*.s3**, where *region* is the name of the AWS Region where your VPC resides.

1. For **VPC**, choose the VPC that you want to use for this endpoint.

1. For **Configure route tables**, choose the route tables for the endpoint to use. The VPC service automatically adds a route to each route table that you choose that points Amazon S3 traffic to the new endpoint.

1. For **Policy**, choose **Full Access** to allow full access to the Amazon S3 service by any user or service within the VPC. To restrict access further, choose **Custom**. For more information, see [Use a Custom Endpoint Policy to Restrict Access to Amazon S3](#host-vpc-policy).

### Use a Custom Endpoint Policy to Restrict Access to Amazon S3
<a name="host-vpc-policy"></a>

The default endpoint policy allows full access to Amazon Simple Storage Service (Amazon S3) for any user or service in your VPC. To further restrict access to Amazon S3, create a custom endpoint policy. For more information, see [Using Endpoint Policies for Amazon S3](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html#vpc-endpoints-policies-s3). 

You can also use a bucket policy to restrict access to your S3 buckets to only traffic that comes from your Amazon VPC. For information, see [Using Amazon S3 Bucket Policies](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html#vpc-endpoints-s3-bucket-policies).

#### Restrict Package Installation on the Model Container with a Custom Endpoint Policy
<a name="host-vpc-policy-repos"></a>

The default endpoint policy allows users to install packages from the Amazon Linux and Amazon Linux 2 repositories on the model container. If you don't want users to install packages from those repositories, create a custom endpoint policy that explicitly denies access to the Amazon Linux and Amazon Linux 2 repositories. The following is an example of a policy that denies access to these repositories:

```
{ 
    "Statement": [ 
      { 
        "Sid": "AmazonLinuxAMIRepositoryAccess",
        "Principal": "*",
        "Action": [ 
            "s3:GetObject" 
        ],
        "Effect": "Deny",
        "Resource": [
            "arn:aws:s3:::packages.*.amazonaws.com/*",
            "arn:aws:s3:::repo.*.amazonaws.com/*"
        ] 
      } 
    ] 
} 

{ 
    "Statement": [ 
        { "Sid": "AmazonLinux2AMIRepositoryAccess",
          "Principal": "*",
          "Action": [ 
              "s3:GetObject" 
              ],
          "Effect": "Deny",
          "Resource": [
              "arn:aws:s3:::amazonlinux.*.amazonaws.com/*" 
              ] 
         } 
    ] 
}
```

### Add Permissions for Endpoint Access for Containers Running in a VPC to Custom IAM Policies
<a name="host-vpc-endpoints"></a>

The `AmazonSageMakerFullAccess` managed policy includes the permissions that you need to use models configured for Amazon VPC access with an endpoint. These permissions allow SageMaker AI to create an elastic network interface and attach it to model containers running in a VPC. If you use your own IAM policy, you must add the following permissions to that policy to use models configured for VPC access.

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeVpcs",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DeleteNetworkInterfacePermission",
                "ec2:DeleteNetworkInterface",
                "ec2:CreateNetworkInterfacePermission",
                "ec2:CreateNetworkInterface"
            ],
            "Resource": "*"
        }
    ]
}
```

For more information about the `AmazonSageMakerFullAccess` managed policy, see [AWS managed policy: AmazonSageMakerFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonSageMakerFullAccess).
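If you maintain your own IAM policy, a quick check can reveal which of the permissions listed above it doesn't yet grant. The following is a rough sketch, not an IAM policy evaluator. The function name and the sample policy are hypothetical. The check looks only at `Allow` statements, matches wildcards in action names, and ignores `Resource`, `Condition`, and `Deny` elements.

```python
from fnmatch import fnmatchcase

REQUIRED_ACTIONS = [
    "ec2:DescribeVpcEndpoints",
    "ec2:DescribeDhcpOptions",
    "ec2:DescribeVpcs",
    "ec2:DescribeSubnets",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeNetworkInterfaces",
    "ec2:DeleteNetworkInterfacePermission",
    "ec2:DeleteNetworkInterface",
    "ec2:CreateNetworkInterfacePermission",
    "ec2:CreateNetworkInterface",
]


def missing_actions(policy):
    """Return the required EC2 actions that no Allow statement grants."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # Action names are compared case-insensitively.
        patterns.extend(action.lower() for action in actions)
    return sorted(
        action
        for action in REQUIRED_ACTIONS
        if not any(fnmatchcase(action.lower(), pattern) for pattern in patterns)
    )


# Hypothetical custom policy that grants only the Describe* actions.
custom_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["ec2:Describe*"], "Resource": "*"}
    ],
}
print(missing_actions(custom_policy))
```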

### Configure Route Tables
<a name="host-vpc-route-table"></a>

Use default DNS settings for your endpoint route table, so that standard Amazon S3 URLs (for example, `http://s3-aws-region.amazonaws.com/amzn-s3-demo-bucket`) resolve. If you don't use default DNS settings, ensure that the URLs that you use to specify the locations of the data in your models resolve by configuring the endpoint route tables. For information about VPC endpoint route tables, see [Routing for Gateway Endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpce-gateway.html#vpc-endpoints-routing) in the *Amazon VPC User Guide*.

### Connect to Resources Outside Your VPC
<a name="model-vpc-nat"></a>

If you configure your VPC so that it doesn't have internet access, models that use that VPC do not have access to resources outside your VPC. If your model needs access to resources outside your VPC, provide access with one of the following options:
+ If your model needs access to an AWS service that supports interface VPC endpoints, create an endpoint to connect to that service. For a list of services that support interface endpoints, see [VPC Endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints.html) in the *Amazon VPC User Guide*. For information about creating an interface VPC endpoint, see [Interface VPC Endpoints (AWS PrivateLink)](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpce-interface.html) in the *Amazon VPC User Guide*.
+ If your model needs access to an AWS service that doesn't support interface VPC endpoints or to a resource outside of AWS, create a NAT gateway and configure your security groups to allow outbound connections. For information about setting up a NAT gateway for your VPC, see [Scenario 2: VPC with Public and Private Subnets (NAT)](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html) in the *Amazon Virtual Private Cloud User Guide*.

# Give Batch Transform Jobs Access to Resources in Your Amazon VPC
<a name="batch-vpc"></a>

To control access to your data and batch transform jobs, we recommend that you create a private Amazon VPC and configure it so that your jobs aren't accessible over the public internet. You specify your private VPC configuration when you create a model by specifying subnets and security groups. You then specify the same model when you create a batch transform job. When you specify the subnets and security groups, SageMaker AI creates *elastic network interfaces* that are associated with your security groups in one of the subnets. Network interfaces allow your model containers to connect to resources in your VPC. For information about network interfaces, see [Elastic Network Interfaces](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_ElasticNetworkInterfaces.html) in the *Amazon VPC User Guide*.

This document explains how to add Amazon VPC configurations for batch transform jobs.

## Configure a Batch Transform Job for Amazon VPC Access
<a name="batch-vpc-configure"></a>

To specify subnets and security groups in your private VPC, use the `VpcConfig` request parameter of the [CreateModel](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html) API, or provide this information when you create a model in the SageMaker AI console. Then specify the same model in the `ModelName` request parameter of the [CreateTransformJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTransformJob.html) API, or in the **Model name** field when you create a transform job in the SageMaker AI console. SageMaker AI uses this information to create network interfaces and attach them to your model containers. The network interfaces provide your model containers with a network connection within your VPC that is not connected to the internet. They also enable your transform job to connect to resources in your private VPC.

The following is an example of the `VpcConfig` parameter that you include in your call to `CreateModel`:

```
"VpcConfig": {
    "Subnets": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ],
    "SecurityGroupIds": [
        "sg-0123456789abcdef0"
    ]
}
```
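
The two-step flow described above (create a model with `VpcConfig`, then reference that model by name in the transform job) might look like the following sketch with boto3. All names, ARNs, image URIs, S3 locations, and the instance type are hypothetical placeholders. The request builders run on their own, and only `submit` calls AWS.

```python
def build_model_request(model_name, image_uri, model_data_url, role_arn,
                        subnets, security_group_ids):
    """Build CreateModel arguments that carry the VPC configuration."""
    return {
        "ModelName": model_name,
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
        "ExecutionRoleArn": role_arn,
        "VpcConfig": {
            "Subnets": list(subnets),
            "SecurityGroupIds": list(security_group_ids),
        },
    }


def build_transform_request(job_name, model_name, input_s3_uri, output_s3_uri):
    """Build CreateTransformJob arguments that reuse the model by name."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,  # the VPC settings come from this model
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            }
        },
        "TransformOutput": {"S3OutputPath": output_s3_uri},
        # Placeholder instance settings; choose values that suit your workload.
        "TransformResources": {"InstanceType": "ml.m5.xlarge", "InstanceCount": 1},
    }


def submit(model_request, transform_request):
    """Send both requests (requires AWS credentials and boto3)."""
    import boto3

    sagemaker = boto3.client("sagemaker")
    sagemaker.create_model(**model_request)
    sagemaker.create_transform_job(**transform_request)
```

Note that the transform job request has no VPC parameter of its own in this sketch; it relies on the configuration attached to the model.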

If you are creating a model using the `CreateModel` API operation, the IAM execution role that you use to create your model must include the permissions described in [CreateModel API: Execution Role Permissions](sagemaker-roles.md#sagemaker-roles-createmodel-perms), including the following permissions required for a private VPC. 

When creating a model in the console, if you select **Create a new role** in the **Model Settings** section, the [AmazonSageMakerFullAccess](https://console.aws.amazon.com/iam/home#/policies/arn:aws:iam::aws:policy/AmazonSageMakerFullAccess$jsonEditor) policy used to create the role already contains these permissions. If you select **Enter a custom IAM role ARN** or **Use existing role**, the role ARN that you specify must have an execution policy attached with the following permissions.

```
{
    "Effect": "Allow",
    "Action": [
        "ec2:CreateNetworkInterface",
        "ec2:CreateNetworkInterfacePermission",
        "ec2:DeleteNetworkInterface",
        "ec2:DeleteNetworkInterfacePermission",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DescribeVpcs",
        "ec2:DescribeDhcpOptions",
        "ec2:DescribeSubnets",
        "ec2:DescribeSecurityGroups"
    ],
    "Resource": "*"
}
```
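
If you maintain a custom execution policy, a quick local check can show which of these actions it does not yet grant. The following is a rough sketch only: the helper `missing_vpc_actions` is hypothetical, it looks only at `Allow` statements, and it ignores `Deny` statements, conditions, and resource scoping, so it is not a substitute for IAM policy evaluation.

```python
from fnmatch import fnmatchcase

# The EC2 actions listed in the permissions above.
REQUIRED_ACTIONS = [
    "ec2:CreateNetworkInterface",
    "ec2:CreateNetworkInterfacePermission",
    "ec2:DeleteNetworkInterface",
    "ec2:DeleteNetworkInterfacePermission",
    "ec2:DescribeNetworkInterfaces",
    "ec2:DescribeVpcs",
    "ec2:DescribeDhcpOptions",
    "ec2:DescribeSubnets",
    "ec2:DescribeSecurityGroups",
]


def missing_vpc_actions(policy_document):
    """Return the required actions that no Allow statement appears to grant."""
    statements = policy_document.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    granted = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        granted.extend(actions)

    # Action names are matched case-insensitively; "*" acts as a wildcard.
    return [
        required
        for required in REQUIRED_ACTIONS
        if not any(
            fnmatchcase(required.lower(), pattern.lower()) for pattern in granted
        )
    ]
```

For example, a policy that allows only `ec2:Describe*` and `ec2:CreateNetworkInterface` would be reported as missing the two delete actions and `ec2:CreateNetworkInterfacePermission`.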

## Configure Your Private VPC for SageMaker AI Batch Transform
<a name="batch-vpc-vpc"></a>

When configuring the private VPC for your SageMaker AI batch transform jobs, use the following guidelines. For information about setting up a VPC, see [Working with VPCs and Subnets](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/working-with-vpcs.html) in the *Amazon VPC User Guide*.

**Topics**
+ [Ensure That Subnets Have Enough IP Addresses](#batch-vpc-ip)
+ [Create an Amazon S3 VPC Endpoint](#batch-vpc-s3)
+ [Use a Custom Endpoint Policy to Restrict Access to S3](#batch-vpc-policy)
+ [Configure Route Tables](#batch-vpc-route-table)
+ [Configure the VPC Security Group](#batch-vpc-groups)
+ [Connect to Resources Outside Your VPC](#batch-vpc-nat)

### Ensure That Subnets Have Enough IP Addresses
<a name="batch-vpc-ip"></a>

Your VPC subnets should have at least two private IP addresses for each instance in a transform job. For more information, see [VPC and Subnet Sizing for IPv4](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Subnets.html#vpc-sizing-ipv4) in the *Amazon VPC User Guide*.
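
To get a feel for the sizing guideline, the following sketch estimates an upper bound on the number of instances that a subnet can serve at two private IP addresses per instance. It assumes that nothing else in the subnet is using addresses, and it subtracts the five addresses that AWS reserves in every subnet. The function name is hypothetical.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of each subnet.
AWS_RESERVED_PER_SUBNET = 5


def max_instances_for_subnet(cidr, ips_per_instance=2):
    """Upper bound on instances that an otherwise empty subnet can hold."""
    total = ipaddress.ip_network(cidr).num_addresses
    usable = max(total - AWS_RESERVED_PER_SUBNET, 0)
    return usable // ips_per_instance


if __name__ == "__main__":
    for cidr in ("10.0.1.0/28", "10.0.2.0/24"):
        print(cidr, max_instances_for_subnet(cidr))
```

A `/28` subnet has 16 addresses, of which 11 are usable, so it can hold at most 5 instances under this guideline.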

### Create an Amazon S3 VPC Endpoint
<a name="batch-vpc-s3"></a>

If you configure your VPC so that model containers don't have access to the internet, they can't connect to the Amazon S3 buckets that contain your data unless you create a VPC endpoint that allows access. By creating a VPC endpoint, you allow your model containers to access the buckets where you store your data and model artifacts. We recommend that you also create a custom policy that allows only requests from your private VPC to access your S3 buckets. For more information, see [Endpoints for Amazon S3](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints-s3.html).

**To create an S3 VPC endpoint:**

1. Open the Amazon VPC console at [https://console.aws.amazon.com/vpc/](https://console.aws.amazon.com/vpc/).

1. In the navigation pane, choose **Endpoints**, then choose **Create Endpoint**.

1. For **Service Name**, choose **com.amazonaws.*region*.s3**, where *region* is the name of the region where your VPC resides.

1. For **VPC**, choose the VPC you want to use for this endpoint.

1. For **Configure route tables**, select the route tables to be used by the endpoint. The VPC service automatically adds a route to each route table you select that points any S3 traffic to the new endpoint.

1. For **Policy**, choose **Full Access** to allow full access to the S3 service by any user or service within the VPC. Choose **Custom** to restrict access further. For information, see [Use a Custom Endpoint Policy to Restrict Access to S3](#batch-vpc-policy).
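
The console steps above can also be approximated programmatically. The following sketch builds the arguments for the boto3 EC2 `create_vpc_endpoint` call for a gateway endpoint. The Region, VPC ID, and route table IDs are placeholders that you supply, and the helper names are hypothetical. When no policy is passed, the endpoint gets the default full-access policy.

```python
import json


def build_s3_gateway_endpoint_request(region, vpc_id, route_table_ids, policy=None):
    """Build create_vpc_endpoint arguments for an S3 gateway endpoint."""
    request = {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }
    if policy is not None:
        # A custom endpoint policy is passed as a JSON string.
        request["PolicyDocument"] = json.dumps(policy)
    return request


def create_s3_gateway_endpoint(region, vpc_id, route_table_ids, policy=None):
    """Create the endpoint (requires AWS credentials and boto3)."""
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.create_vpc_endpoint(
        **build_s3_gateway_endpoint_request(region, vpc_id, route_table_ids, policy)
    )
```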

### Use a Custom Endpoint Policy to Restrict Access to S3
<a name="batch-vpc-policy"></a>

The default endpoint policy allows full access to S3 for any user or service in your VPC. To further restrict access to S3, create a custom endpoint policy. For more information, see [Using Endpoint Policies for Amazon S3](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html#vpc-endpoints-policies-s3). You can also use a bucket policy to restrict access to your S3 buckets to only traffic that comes from your Amazon VPC. For information, see [Using Amazon S3 Bucket Policies](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html#vpc-endpoints-s3-bucket-policies).

#### Restrict Package Installation on the Model Container
<a name="batch-vpc-policy-repos"></a>

The default endpoint policy allows users to install packages from the Amazon Linux and Amazon Linux 2 repositories on the model container. If you don't want users to install packages from those repositories, create a custom endpoint policy that explicitly denies access to the Amazon Linux and Amazon Linux 2 repositories. The following is an example of a policy that denies access to these repositories:

```
{
    "Statement": [
        {
            "Sid": "AmazonLinuxAMIRepositoryAccess",
            "Principal": "*",
            "Action": [
                "s3:GetObject"
            ],
            "Effect": "Deny",
            "Resource": [
                "arn:aws:s3:::packages.*.amazonaws.com/*",
                "arn:aws:s3:::repo.*.amazonaws.com/*"
            ]
        }
    ]
}

{
    "Statement": [
        {
            "Sid": "AmazonLinux2AMIRepositoryAccess",
            "Principal": "*",
            "Action": [
                "s3:GetObject"
            ],
            "Effect": "Deny",
            "Resource": [
                "arn:aws:s3:::amazonlinux.*.amazonaws.com/*"
            ]
        }
    ]
}
```

### Configure Route Tables
<a name="batch-vpc-route-table"></a>

Use default DNS settings for your endpoint route table, so that standard Amazon S3 URLs (for example, `http://s3-aws-region.amazonaws.com/amzn-s3-demo-bucket`) resolve. If you don't use default DNS settings, ensure that the URLs that you use to specify the locations of the data in your batch transform jobs resolve by configuring the endpoint route tables. For information about VPC endpoint route tables, see [Routing for Gateway Endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpce-gateway.html#vpc-endpoints-routing) in the *Amazon VPC User Guide*.

### Configure the VPC Security Group
<a name="batch-vpc-groups"></a>

In distributed batch transform, you must allow communication between the different containers in the same batch transform job. To do that, configure a rule for your security group that allows inbound and outbound connections between members of the same security group. Members of the same security group should be able to communicate with each other across all ports. For more information, see [Security Group Rules](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_SecurityGroups.html#SecurityGroupRules).
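
One way to express the rule described above is a self-referencing security group rule, in which the group names itself as the source and destination for all protocols and ports. The following sketch uses the boto3 EC2 calls `authorize_security_group_ingress` and `authorize_security_group_egress`. The helper names are hypothetical, and the calls may return a duplicate-rule error if an equivalent rule already exists.

```python
def self_referencing_rule(security_group_id):
    """Return IpPermissions that allow all traffic from members of the group."""
    return [
        {
            "IpProtocol": "-1",  # -1 means all protocols and all ports
            "UserIdGroupPairs": [{"GroupId": security_group_id}],
        }
    ]


def allow_intra_group_traffic(security_group_id):
    """Add matching inbound and outbound rules (requires AWS credentials)."""
    import boto3

    ec2 = boto3.client("ec2")
    permissions = self_referencing_rule(security_group_id)
    ec2.authorize_security_group_ingress(
        GroupId=security_group_id, IpPermissions=permissions
    )
    ec2.authorize_security_group_egress(
        GroupId=security_group_id, IpPermissions=permissions
    )
```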

### Connect to Resources Outside Your VPC
<a name="batch-vpc-nat"></a>

If you configure your VPC so that it doesn't have internet access, batch transform jobs that use that VPC do not have access to resources outside your VPC. If your batch transform job needs access to resources outside your VPC, provide access with one of the following options:
+ If your batch transform job needs access to an AWS service that supports interface VPC endpoints, create an endpoint to connect to that service. For a list of services that support interface endpoints, see [VPC Endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints.html) in the *Amazon VPC User Guide*. For information about creating an interface VPC endpoint, see [Interface VPC Endpoints (AWS PrivateLink)](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpce-interface.html) in the *Amazon VPC User Guide*.
+ If your batch transform job needs access to an AWS service that doesn't support interface VPC endpoints or to a resource outside of AWS, create a NAT gateway and configure your security groups to allow outbound connections. For information about setting up a NAT gateway for your VPC, see [Scenario 2: VPC with Public and Private Subnets (NAT)](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html) in the *Amazon Virtual Private Cloud User Guide*.

# Give Amazon SageMaker Clarify Jobs Access to Resources in Your Amazon VPC
<a name="clarify-vpc"></a>

To control access to your data and SageMaker Clarify jobs, we recommend that you create a private Amazon VPC and configure it so that your jobs aren't accessible over the public internet. For information about creating and configuring an Amazon VPC for processing jobs, see [Give SageMaker Processing Jobs Access to Resources in Your Amazon VPC](https://docs.aws.amazon.com/sagemaker/latest/dg/process-vpc). 

This document explains how to add the Amazon VPC configurations that meet the requirements for SageMaker Clarify jobs.

**Topics**
+ [Configure a SageMaker Clarify Job for Amazon VPC Access](#clarify-vpc-config)
+ [Configure Your Private Amazon VPC for SageMaker Clarify jobs](#clarify-vpc-vpc)

## Configure a SageMaker Clarify Job for Amazon VPC Access
<a name="clarify-vpc-config"></a>

When you configure a SageMaker Clarify job for your private Amazon VPC, you need to specify subnets and security groups for the job. You also need to configure the Amazon VPC of the SageMaker AI model so that the job can get inferences from the model when it computes post-training bias metrics and feature contributions that help explain model predictions.

**Topics**
+ [SageMaker Clarify Job Amazon VPC Subnets and Security Groups](#clarify-vpc-job)
+ [Configure a Model Amazon VPC for Inference](#clarify-vpc-model)

### SageMaker Clarify Job Amazon VPC Subnets and Security Groups
<a name="clarify-vpc-job"></a>

Subnets and security groups in your private Amazon VPC can be assigned to a SageMaker Clarify job in various ways, depending on how you create the job.
+ **SageMaker AI console**: Provide this information when you create the job in the **SageMaker AI Dashboard**. From the **Processing** menu, choose **Processing jobs**, then choose **Create processing job**. Select the **VPC** option in the **Network** panel and provide the subnets and security groups using the drop-down lists. Make sure that the network isolation option in this panel is turned off.
+ **SageMaker API**: Use the `NetworkConfig.VpcConfig` request parameter of the [CreateProcessingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateProcessingJob) API, as shown in the following example:

  ```
  "NetworkConfig": {
      "VpcConfig": {
          "Subnets": [
              "subnet-0123456789abcdef0",
              "subnet-0123456789abcdef1",
              "subnet-0123456789abcdef2"
          ],
          "SecurityGroupIds": [
              "sg-0123456789abcdef0"
          ]
      }
  }
  ```
+ **SageMaker Python SDK**: Use the `NetworkConfig` parameter of the [SageMakerClarifyProcessor](https://sagemaker.readthedocs.io/en/stable/api/training/processing.html?highlight=Processor#sagemaker.clarify.SageMakerClarifyProcessor) API or [Processor](https://sagemaker.readthedocs.io/en/stable/api/training/processing.html?highlight=Processor#sagemaker.processing.Processor) API, as shown in the following example:

  ```
  from sagemaker.network import NetworkConfig
  network_config = NetworkConfig(
      subnets=[
          "subnet-0123456789abcdef0",
          "subnet-0123456789abcdef1",
          "subnet-0123456789abcdef2",
      ],
      security_group_ids=[
          "sg-0123456789abcdef0",
      ],
  )
  ```

SageMaker AI uses the information to create network interfaces and attach them to the SageMaker Clarify job. The network interfaces provide a SageMaker Clarify job with a network connection within your Amazon VPC that is not connected to the public internet. They also enable the SageMaker Clarify job to connect to resources in your private Amazon VPC.

**Note**  
The network isolation option of the SageMaker Clarify job must be turned off (by default the option is turned off) so that the SageMaker Clarify job can communicate with the shadow endpoint.
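
As an illustration of the note above, the following sketch builds the `NetworkConfig` structure for a `CreateProcessingJob` request with network isolation explicitly turned off, and rejects a configuration that would prevent the job from reaching the shadow endpoint. The helper names are hypothetical, and the validation rules are a simplified reading of the requirements in this section.

```python
def build_network_config(subnets, security_group_ids):
    """Return a NetworkConfig structure suitable for a SageMaker Clarify job."""
    return {
        # Must stay False so that the job can call the shadow endpoint.
        "EnableNetworkIsolation": False,
        "VpcConfig": {
            "Subnets": list(subnets),
            "SecurityGroupIds": list(security_group_ids),
        },
    }


def check_clarify_network_config(network_config):
    """Raise ValueError if the configuration can't work for a Clarify job."""
    if network_config.get("EnableNetworkIsolation", False):
        raise ValueError("Network isolation must be turned off.")
    vpc_config = network_config.get("VpcConfig", {})
    if not vpc_config.get("Subnets") or not vpc_config.get("SecurityGroupIds"):
        raise ValueError("VpcConfig needs subnets and security groups.")
    return network_config
```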

### Configure a Model Amazon VPC for Inference
<a name="clarify-vpc-model"></a>

In order to compute post-training bias metrics and explainability, the SageMaker Clarify job needs to get inferences from the SageMaker AI model that is specified by the `model_name` parameter of the [analysis configuration](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-configure-processing-jobs.html#clarify-processing-job-configure-analysis) for the SageMaker Clarify processing job. Alternatively, if you use the `SageMakerClarifyProcessor` API in the SageMaker AI Python SDK, the model is the one specified by the `model_name` parameter of the [ModelConfig](https://sagemaker.readthedocs.io/en/stable/api/training/processing.html?highlight=Processor#sagemaker.clarify.ModelConfig) class. To get these inferences, the SageMaker Clarify job creates an ephemeral endpoint with the model, known as a *shadow endpoint*, and then applies the Amazon VPC configuration of the model to the shadow endpoint.

To specify subnets and security groups in your private Amazon VPC to the SageMaker AI model, use the `VpcConfig` request parameter of the [CreateModel](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel) API or provide this information when you create the model using the SageMaker AI dashboard in the console. The following is an example of the `VpcConfig` parameter that you include in your call to `CreateModel`:

```
"VpcConfig": {
    "Subnets": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ],
    "SecurityGroupIds": [
        "sg-0123456789abcdef0"
    ]
}
```

You can specify the number of instances of the shadow endpoint to launch with the `initial_instance_count` parameter of the [analysis configuration](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-configure-processing-jobs.html#clarify-processing-job-configure-analysis) for the SageMaker Clarify processing job. Alternatively, if you use the `SageMakerClarifyProcessor` API in the SageMaker AI Python SDK, the number of instances is specified by the `instance_count` parameter of the [ModelConfig](https://sagemaker.readthedocs.io/en/stable/api/training/processing.html?highlight=Processor#sagemaker.clarify.ModelConfig) class.

**Note**  
Even if you request only one instance when creating the shadow endpoint, you need at least two subnets in the model's [ModelConfig](https://sagemaker.readthedocs.io/en/stable/api/training/processing.html?highlight=Processor#sagemaker.clarify.ModelConfig) in distinct Availability Zones. Otherwise, the shadow endpoint creation fails with the following error:  
`ClientError: Error hosting endpoint sagemaker-clarify-endpoint-XXX: Failed. Reason: Unable to locate at least 2 availability zone(s) with the requested instance type YYY that overlap with SageMaker AI subnets.`
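
The subnet requirement in the preceding note can be checked before you start a job. The following sketch reads the `AvailabilityZone` field from a boto3 EC2 `describe_subnets` response and raises an error if the subnets span fewer than two Availability Zones. The helper names are hypothetical. The check does not verify that the requested instance type is offered in those zones, which the error message also depends on.

```python
def distinct_availability_zones(describe_subnets_response):
    """Return the set of Availability Zones found in a describe_subnets response."""
    return {
        subnet["AvailabilityZone"]
        for subnet in describe_subnets_response["Subnets"]
    }


def check_shadow_endpoint_subnets(subnet_ids):
    """Look up the subnets and require at least two Availability Zones."""
    import boto3  # requires AWS credentials

    ec2 = boto3.client("ec2")
    response = ec2.describe_subnets(SubnetIds=list(subnet_ids))
    zones = distinct_availability_zones(response)
    if len(zones) < 2:
        raise ValueError(
            f"Subnets cover only {sorted(zones)}; at least two "
            "Availability Zones are needed."
        )
    return zones
```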

If your model requires model files in Amazon S3, then the model Amazon VPC needs to have an Amazon S3 VPC endpoint. For more information about creating and configuring an Amazon VPC for SageMaker AI models, see [Give SageMaker AI Hosted Endpoints Access to Resources in Your Amazon VPC](host-vpc.md). 

## Configure Your Private Amazon VPC for SageMaker Clarify jobs
<a name="clarify-vpc-vpc"></a>

In general, you can follow the steps in [Configure Your Private VPC for SageMaker Processing](https://docs.aws.amazon.com/sagemaker/latest/dg/process-vpc.html#process-vpc-vpc) to configure your private Amazon VPC for SageMaker Clarify jobs. Here are some highlights and special requirements for SageMaker Clarify jobs.

**Topics**
+ [Connect to Resources Outside Your Amazon VPC](#clarify-vpc-nat)
+ [Configure the Amazon VPC Security Group](#clarify-vpc-security-group)

### Connect to Resources Outside Your Amazon VPC
<a name="clarify-vpc-nat"></a>

If you configure your Amazon VPC so that it does not have public internet access, then some additional setup is required to grant SageMaker Clarify jobs access to resources and services outside of your Amazon VPC. For example, an Amazon S3 VPC endpoint is required because a SageMaker Clarify job needs to load a dataset from an S3 bucket and save the analysis results to an S3 bucket. For instructions, see [Create an Amazon S3 VPC Endpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/process-vpc.html#process-vpc-s3). In addition, if a SageMaker Clarify job needs to get inferences from the shadow endpoint, then it needs to call several more AWS services.
+ **Create an Amazon SageMaker API service VPC endpoint**: The SageMaker Clarify job needs to call the Amazon SageMaker API service to manipulate the shadow endpoint, or to describe a SageMaker AI model for Amazon VPC validation. You can follow the guidance provided in the [Securing all Amazon SageMaker API calls with AWS PrivateLink](https://aws.amazon.com/blogs/machine-learning/securing-all-amazon-sagemaker-api-calls-with-aws-privatelink/) blog to create an Amazon SageMaker API VPC endpoint that allows the SageMaker Clarify job to make the service calls. Note that the service name of the Amazon SageMaker API service is `com.amazonaws.region.sagemaker.api`, where *region* is the name of the Region where your Amazon VPC resides.
+ **Create an Amazon SageMaker AI Runtime VPC Endpoint**: The SageMaker Clarify job needs to call the Amazon SageMaker AI runtime service, which routes the invocations to the shadow endpoint. The setup steps are similar to those for the Amazon SageMaker API service. Note that the service name of the Amazon SageMaker AI Runtime service is `com.amazonaws.region.sagemaker.runtime`, where *region* is the name of the Region where your Amazon VPC resides.
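
The two interface endpoints described above could be created with the boto3 EC2 `create_vpc_endpoint` call, as in the following sketch. The helper names and all IDs are hypothetical placeholders. Setting `PrivateDnsEnabled` to `True` is an assumption made here so that the default service hostnames resolve to the endpoints; it requires DNS support and DNS hostnames to be enabled on the VPC.

```python
def build_sagemaker_endpoint_requests(region, vpc_id, subnet_ids, security_group_ids):
    """Build create_vpc_endpoint arguments for the API and runtime services."""
    return [
        {
            "VpcEndpointType": "Interface",
            "VpcId": vpc_id,
            "ServiceName": f"com.amazonaws.{region}.sagemaker.{suffix}",
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
            "PrivateDnsEnabled": True,  # assumption; see the lead-in
        }
        for suffix in ("api", "runtime")
    ]


def create_sagemaker_endpoints(region, vpc_id, subnet_ids, security_group_ids):
    """Create both endpoints (requires AWS credentials and boto3)."""
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    requests = build_sagemaker_endpoint_requests(
        region, vpc_id, subnet_ids, security_group_ids
    )
    return [ec2.create_vpc_endpoint(**request) for request in requests]
```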

### Configure the Amazon VPC Security Group
<a name="clarify-vpc-security-group"></a>

SageMaker Clarify jobs support distributed processing when two or more processing instances are specified in one of the following ways:
+ **SageMaker AI console**: The **Instance count** is specified in the **Resource configuration** part of the **Job settings** panel on the **Create processing job** page.
+ **SageMaker API**: The `InstanceCount` is specified when you create the job with the [CreateProcessingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateProcessingJob) API.
+ **SageMaker Python SDK**: The `instance_count` is specified when using the [SageMakerClarifyProcessor](https://sagemaker.readthedocs.io/en/stable/api/training/processing.html?highlight=Processor#sagemaker.clarify.SageMakerClarifyProcessor) API or the [Processor](https://sagemaker.readthedocs.io/en/stable/api/training/processing.html?highlight=Processor#sagemaker.processing.Processor) API.

In distributed processing, you must allow communication between the different instances in the same processing job. To do that, configure a rule for your security group that allows inbound connections between members of the same security group. For information, see [Security group rules](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_SecurityGroups.html#SecurityGroupRules).

# Give SageMaker AI Compilation Jobs Access to Resources in Your Amazon VPC
<a name="neo-vpc"></a>

**Note**  
For compilation jobs, you can configure only subnets with a default tenancy VPC in which your job runs on shared hardware. For more information on the tenancy attribute for VPCs, see [Dedicated Instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/dedicated-instance.html).

## Configure a Compilation Job for Amazon VPC Access
<a name="neo-vpc-configure"></a>

To specify subnets and security groups in your private VPC, use the `VpcConfig` request parameter of the [CreateCompilationJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateCompilationJob.html) API, or provide this information when you create a compilation job in the SageMaker AI console. SageMaker Neo uses this information to create network interfaces and attach them to your compilation jobs. The network interfaces provide compilation jobs with a network connection within your VPC that is not connected to the internet. They also enable your compilation job to connect to resources in your private VPC. The following is an example of the `VpcConfig` parameter that you include in your call to `CreateCompilationJob`:

```
"VpcConfig": {
    "Subnets": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ],
    "SecurityGroupIds": [
        "sg-0123456789abcdef0"
    ]
}
```
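
For context, the following sketch shows where `VpcConfig` sits in a complete `CreateCompilationJob` request made with boto3. The job name, role ARN, S3 locations, framework, input shape, and target device are hypothetical placeholders chosen only to make the request well formed; substitute the values for your own model.

```python
def build_compilation_job_request(job_name, role_arn, model_s3_uri, output_s3_uri,
                                  subnets, security_group_ids):
    """Build create_compilation_job arguments that include the VPC settings."""
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,
            # Placeholder input shape and framework for illustration only.
            "DataInputConfig": '{"input0": [1, 3, 224, 224]}',
            "Framework": "PYTORCH",
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": "ml_c5",  # placeholder target
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},
        "VpcConfig": {
            "Subnets": list(subnets),
            "SecurityGroupIds": list(security_group_ids),
        },
    }


def start_compilation_job(request):
    """Submit the request (requires AWS credentials and boto3)."""
    import boto3

    return boto3.client("sagemaker").create_compilation_job(**request)
```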

## Configure Your Private VPC for SageMaker AI Compilation
<a name="neo-vpc-vpc"></a>

When configuring the private VPC for your SageMaker AI compilation jobs, use the following guidelines. For information about setting up a VPC, see [Working with VPCs and Subnets](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/working-with-vpcs.html) in the *Amazon VPC User Guide*.

**Topics**
+ [Ensure That Subnets Have Enough IP Addresses](#neo-vpc-ip)
+ [Create an Amazon S3 VPC Endpoint](#neo-vpc-s3)
+ [Use a Custom Endpoint Policy to Restrict Access to S3](#neo-vpc-policy)
+ [Configure Route Tables](#neo-vpc-route-table)
+ [Configure the VPC Security Group](#neo-vpc-groups)

### Ensure That Subnets Have Enough IP Addresses
<a name="neo-vpc-ip"></a>

Your VPC subnets should have at least two private IP addresses for each instance in a compilation job. For more information, see [VPC and Subnet Sizing for IPv4](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Subnets.html#vpc-sizing-ipv4) in the *Amazon VPC User Guide*.

### Create an Amazon S3 VPC Endpoint
<a name="neo-vpc-s3"></a>

If you configure your VPC to block access to the internet, SageMaker Neo can't connect to the Amazon S3 buckets that contain your models unless you create a VPC endpoint that allows access. By creating a VPC endpoint, you allow your SageMaker Neo compilation jobs to access the buckets where you store your data and model artifacts. We recommend that you also create a custom policy that allows only requests from your private VPC to access your S3 buckets. For more information, see [Endpoints for Amazon S3](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints-s3.html).

**To create an S3 VPC endpoint:**

1. Open the Amazon VPC console at [https://console.aws.amazon.com/vpc/](https://console.aws.amazon.com/vpc/).

1. In the navigation pane, choose **Endpoints**, then choose **Create Endpoint**.

1. For **Service Name**, search for **com.amazonaws.*region*.s3**, where *region* is the name of the region where your VPC resides.

1. Choose the **Gateway** type.

1. For **VPC**, choose the VPC you want to use for this endpoint.

1. For **Configure route tables**, select the route tables to be used by the endpoint. The VPC service automatically adds a route to each route table you select that points any S3 traffic to the new endpoint.

1. For **Policy**, choose **Full Access** to allow full access to the S3 service by any user or service within the VPC. Choose **Custom** to restrict access further. For information, see [Use a Custom Endpoint Policy to Restrict Access to S3](train-vpc.md#train-vpc-policy).

### Use a Custom Endpoint Policy to Restrict Access to S3
<a name="neo-vpc-policy"></a>

The default endpoint policy allows full access to S3 for any user or service in your VPC. To further restrict access to S3, create a custom endpoint policy. For more information, see [Using Endpoint Policies for Amazon S3](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html#vpc-endpoints-policies-s3). You can also use a bucket policy to restrict access to your S3 buckets to only traffic that comes from your Amazon VPC. For information, see [Using Amazon S3 Bucket Policies](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html#vpc-endpoints-s3-bucket-policies). The following is a sample customized policy:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:GetObject",
            "Resource": [
                "arn:aws:s3:::your-sample-bucket",
                "arn:aws:s3:::your-sample-bucket/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:SourceVpce": [
                        "vpce-1a2b3c4d"
                    ]
                }
            }
        }
    ]
}
```
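
The sample policy above can be generated for your own bucket and endpoint rather than edited by hand. The following sketch builds the same document from a bucket name and a VPC endpoint ID and applies it with the boto3 S3 `put_bucket_policy` call. The helper names are hypothetical. Because the statement is a `Deny`, applying it may also block reads from outside the endpoint, including your own, so test it on a non-critical bucket first.

```python
import json


def build_vpce_only_bucket_policy(bucket, vpce_id):
    """Deny s3:GetObject unless the request arrives through the given endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Principal": {"AWS": "*"},
                "Action": "s3:GetObject",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": [vpce_id]}
                },
            }
        ],
    }


def apply_bucket_policy(bucket, vpce_id):
    """Attach the policy to the bucket (requires AWS credentials and boto3)."""
    import boto3

    policy = build_vpce_only_bucket_policy(bucket, vpce_id)
    boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```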


#### Add Permissions for Compilation Jobs Running in an Amazon VPC to Custom IAM Policies
<a name="neo-vpc-custom-iam"></a>

The `SageMakerFullAccess` managed policy includes the permissions that you need to run compilation jobs configured for Amazon VPC access. These permissions allow SageMaker Neo to create an elastic network interface and attach it to a compilation job running in an Amazon VPC. If you use your own IAM policy, you must add the following permissions to that policy to run compilation jobs configured for Amazon VPC access.

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeVpcs",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DeleteNetworkInterfacePermission",
                "ec2:DeleteNetworkInterface",
                "ec2:CreateNetworkInterfacePermission",
                "ec2:CreateNetworkInterface",
                "ec2:ModifyNetworkInterfaceAttribute"
            ],
            "Resource": "*"
        }
    ]
}
```


For more information about the `SageMakerFullAccess` managed policy, see [AWS managed policy: AmazonSageMakerFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonSageMakerFullAccess).

### Configure Route Tables
<a name="neo-vpc-route-table"></a>

Use default DNS settings for your endpoint route table, so that standard Amazon S3 URLs (for example, `http://s3-aws-region.amazonaws.com/amzn-s3-demo-bucket`) resolve. If you don't use default DNS settings, ensure that the URLs that you use to specify the locations of the data in your compilation jobs resolve by configuring the endpoint route tables. For information about VPC endpoint route tables, see [Routing for Gateway Endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpce-gateway.html#vpc-endpoints-routing) in the *Amazon VPC User Guide*.

### Configure the VPC Security Group
<a name="neo-vpc-groups"></a>

In your security group for the compilation job, you must allow outbound communication to your Amazon S3 VPC endpoints and to the subnet CIDR ranges used for the compilation job. For information, see [Security Group Rules](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_SecurityGroups.html#SecurityGroupRules) and [Control access to services with Amazon VPC endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints-access.html).
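
As one possible way to express these outbound rules, the following sketch builds `IpPermissions` entries for the boto3 EC2 `authorize_security_group_egress` call. It makes two assumptions that are not stated above: that traffic to Amazon S3 is allowed over HTTPS (TCP port 443) by referencing the S3 prefix list for your Region, and that all protocols are allowed to the subnet CIDR ranges. The helper names and the prefix list ID are hypothetical; the real prefix list ID appears as the destination of the route that the gateway endpoint adds to your route tables.

```python
import ipaddress


def build_compilation_egress_rules(s3_prefix_list_id, subnet_cidrs):
    """Return IpPermissions for egress to S3 and to the job's subnets."""
    # Normalize and validate each CIDR; invalid input raises ValueError.
    cidrs = [str(ipaddress.ip_network(cidr)) for cidr in subnet_cidrs]
    return [
        {
            # Assumption: HTTPS to the S3 gateway endpoint prefix list.
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            "PrefixListIds": [{"PrefixListId": s3_prefix_list_id}],
        },
        {
            # Assumption: all protocols within the job's subnet ranges.
            "IpProtocol": "-1",
            "IpRanges": [{"CidrIp": cidr} for cidr in cidrs],
        },
    ]


def allow_compilation_egress(security_group_id, s3_prefix_list_id, subnet_cidrs):
    """Add the rules to the group (requires AWS credentials and boto3)."""
    import boto3

    boto3.client("ec2").authorize_security_group_egress(
        GroupId=security_group_id,
        IpPermissions=build_compilation_egress_rules(s3_prefix_list_id, subnet_cidrs),
    )
```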

# Give Inference Recommender Jobs Access to Resources in Your Amazon VPC
<a name="inference-recommender-vpc-access"></a>

**Note**  
Inference Recommender requires you to register your model with Model Registry. Note that Model Registry doesn't allow your model artifacts or Amazon ECR image to be VPC restricted.  
Inference Recommender also requires that your sample payload Amazon S3 object is not VPC restricted. For inference recommendation jobs, you can't create a custom policy that allows only requests from your private VPC to access your Amazon S3 buckets.

To specify subnets and security groups in your private VPC, use the `RecommendationJobVpcConfig` request parameter of the [CreateInferenceRecommendationsJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateInferenceRecommendationsJob.html) API, or specify your subnets and security groups when you create a recommendation job in the SageMaker AI console.

Inference Recommender uses this information to create endpoints. When provisioning endpoints, SageMaker AI creates network interfaces and attaches them to your endpoints. The network interfaces provide your endpoints with a network connection to your VPC. The following is an example of the `VpcConfig` parameter that you include in a call to `CreateInferenceRecommendationsJob`:

```
"VpcConfig": {
    "Subnets": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ],
    "SecurityGroupIds": [
        "sg-0123456789abcdef0"
    ]
}
```
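As a rough illustration, the same configuration could be passed through the AWS SDK for Python (Boto3). This is a minimal sketch, not a definitive implementation: it assumes that the VPC settings are nested under `InputConfig` as `VpcConfig` in the `create_inference_recommendations_job` request, and the job name, role ARN, and model package ARN are hypothetical placeholders. The helper names `build_recommendation_job_request` and `create_recommendation_job` are introduced here for illustration only.

```python
def build_recommendation_job_request(job_name, role_arn, model_package_arn,
                                     subnets, security_group_ids):
    """Assemble keyword arguments for CreateInferenceRecommendationsJob.

    Assumption: the VPC settings belong under InputConfig["VpcConfig"].
    """
    if not subnets or not security_group_ids:
        raise ValueError("Specify at least one subnet and one security group")
    return {
        "JobName": job_name,
        "JobType": "Default",
        "RoleArn": role_arn,
        "InputConfig": {
            "ModelPackageVersionArn": model_package_arn,
            "VpcConfig": {
                "Subnets": list(subnets),
                "SecurityGroupIds": list(security_group_ids),
            },
        },
    }


def create_recommendation_job(request):
    """Submit the request. Needs boto3, AWS credentials, and a Region."""
    import boto3  # imported lazily so the builder works without boto3 installed

    return boto3.client("sagemaker").create_inference_recommendations_job(**request)


# Hypothetical placeholder values; replace them with your own resources.
request = build_recommendation_job_request(
    job_name="example-recommendation-job",
    role_arn="arn:aws:iam::111122223333:role/ExampleSageMakerRole",
    model_package_arn=(
        "arn:aws:sagemaker:us-west-2:111122223333:model-package/example-group/1"
    ),
    subnets=["subnet-0123456789abcdef0", "subnet-0123456789abcdef1"],
    security_group_ids=["sg-0123456789abcdef0"],
)
```

Calling `create_recommendation_job(request)` would then submit the job. Building the request separately makes it possible to inspect or log the VPC settings before any API call is made.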

Refer to the following topics for more information on configuring your Amazon VPC for use with Inference Recommender jobs.

**Topics**
+ [Ensure that subnets have enough IP addresses](#inference-recommender-vpc-access-subnets)
+ [Create an Amazon S3 VPC endpoint](#inference-recommender-vpc-access-endpoint)
+ [Add permissions for Inference Recommender jobs running in an Amazon VPC to custom IAM policies](#inference-recommender-vpc-access-permissions)
+ [Configure route tables](#inference-recommender-vpc-access-route-tables)
+ [Configure the VPC security group](#inference-recommender-vpc-access-security-group)

## Ensure that subnets have enough IP addresses
<a name="inference-recommender-vpc-access-subnets"></a>

Your VPC subnets should have at least two private IP addresses for each instance in an inference recommendation job. For more information about subnets and private IP addresses, see [How Amazon VPC works](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Subnets.html#vpc-sizing-ipv4) in the *Amazon VPC User Guide*.
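One way to check this before starting a job is to compare each subnet's free address count against the requirement. The following sketch assumes, conservatively, that every subnet should be able to hold all instances on its own, and it reads the `AvailableIpAddressCount` field from an Amazon EC2 `DescribeSubnets` response. The function names are hypothetical.

```python
IPS_PER_INSTANCE = 2  # from the guidance above: two private IPs per instance


def required_private_ips(instance_count):
    """Return the number of private IP addresses needed for the job."""
    return IPS_PER_INSTANCE * instance_count


def subnets_short_on_ips(subnets, instance_count):
    """Return the IDs of subnets with fewer free IPs than the job needs.

    `subnets` is the "Subnets" list from a DescribeSubnets response.
    Assumption: each subnet is checked against the full requirement.
    """
    needed = required_private_ips(instance_count)
    return [
        subnet["SubnetId"]
        for subnet in subnets
        if subnet["AvailableIpAddressCount"] < needed
    ]


def fetch_subnets(subnet_ids):
    """Look up subnet details. Needs boto3, AWS credentials, and a Region."""
    import boto3  # imported lazily so the checks above work without boto3

    response = boto3.client("ec2").describe_subnets(SubnetIds=list(subnet_ids))
    return response["Subnets"]
```

For example, `subnets_short_on_ips(fetch_subnets(["subnet-0123456789abcdef0"]), instance_count=4)` would return the subnet ID if fewer than eight addresses are free.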

## Create an Amazon S3 VPC endpoint
<a name="inference-recommender-vpc-access-endpoint"></a>

If you configure your VPC to block access to the internet, Inference Recommender can't connect to the Amazon S3 buckets that contain your models unless you create a VPC endpoint that allows access. By creating a VPC endpoint, you allow your SageMaker AI inference recommendation jobs to access the buckets where you store your data and model artifacts.

To create an Amazon S3 VPC endpoint, use the following procedure:

1. Open the [Amazon VPC console](https://console.aws.amazon.com/vpc/).

1. In the navigation pane, choose **Endpoints**, and then choose **Create Endpoint**.

1. For **Service Name**, search for `com.amazonaws.region.s3`, where `region` is the name of the Region where your VPC resides.

1. Choose the **Gateway type**.

1. For **VPC**, choose the VPC you want to use for this endpoint.

1. For **Configure route tables**, select the route tables to be used by the endpoint. The VPC service automatically adds a route to each route table you select that points any Amazon S3 traffic to the new endpoint.

1. For **Policy**, choose **Full Access** to allow full access to the Amazon S3 service by any user or service within the VPC.
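The console steps above can also be approximated with the AWS SDK for Python (Boto3). This is a sketch under stated assumptions: the Region, VPC ID, and route table IDs are placeholders, the helper names are hypothetical, and the endpoint policy is left unspecified so that the default full-access policy applies.

```python
def s3_gateway_endpoint_params(region, vpc_id, route_table_ids):
    """Build the arguments for an Amazon S3 gateway endpoint.

    No PolicyDocument is set, so the default (full access) policy is used.
    """
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(region, vpc_id, route_table_ids):
    """Create the endpoint. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    params = s3_gateway_endpoint_params(region, vpc_id, route_table_ids)
    return ec2.create_vpc_endpoint(**params)


# Placeholder identifiers; replace them with your own VPC and route tables.
params = s3_gateway_endpoint_params(
    region="us-west-2",
    vpc_id="vpc-0123456789abcdef0",
    route_table_ids=["rtb-0123456789abcdef0"],
)
```

Passing the route table IDs at creation time mirrors the **Configure route tables** step, where the service adds the Amazon S3 route to each selected table.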

## Add permissions for Inference Recommender jobs running in an Amazon VPC to custom IAM policies
<a name="inference-recommender-vpc-access-permissions"></a>

The [AmazonSageMakerFullAccess](https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol.html#security-iam-awsmanpol-AmazonSageMakerFullAccess) managed policy includes the permissions that you need to use models configured for Amazon VPC access with an endpoint. These permissions allow Inference Recommender to create an elastic network interface and attach it to the inference recommendation job running in an Amazon VPC. If you use your own IAM policy, you must add the following permissions to that policy to use models configured for Amazon VPC access.

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeVpcs",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DeleteNetworkInterfacePermission",
                "ec2:DeleteNetworkInterface",
                "ec2:CreateNetworkInterfacePermission",
                "ec2:CreateNetworkInterface",
                "ec2:ModifyNetworkInterfaceAttribute"
            ],
            "Resource": "*"
        }
    ]
}
```
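If you maintain a custom policy, a quick local check can flag any of these actions that are missing. The following is a simplified sketch, not a full IAM evaluator: it only looks at `Allow` statements and their `Action` entries (including wildcards), and it ignores `Resource`, `Condition`, `NotAction`, and explicit `Deny` statements. The function name is hypothetical.

```python
from fnmatch import fnmatchcase

# The actions listed in the policy above.
REQUIRED_ACTIONS = {
    "ec2:DescribeVpcEndpoints",
    "ec2:DescribeDhcpOptions",
    "ec2:DescribeVpcs",
    "ec2:DescribeSubnets",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeNetworkInterfaces",
    "ec2:DeleteNetworkInterfacePermission",
    "ec2:DeleteNetworkInterface",
    "ec2:CreateNetworkInterfacePermission",
    "ec2:CreateNetworkInterface",
    "ec2:ModifyNetworkInterfaceAttribute",
}


def missing_vpc_actions(policy_document):
    """Return required actions that no Allow statement in the policy covers."""
    statements = policy_document.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    granted = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a bare string
            actions = [actions]
        granted.update(action.lower() for action in actions)

    # IAM action names are case-insensitive and may contain wildcards.
    return sorted(
        required
        for required in REQUIRED_ACTIONS
        if not any(fnmatchcase(required.lower(), pattern) for pattern in granted)
    )
```

An empty list means every listed action is matched by at least one `Allow` statement. It does not prove that the permissions are effective, because other policy elements can still restrict them.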


## Configure route tables
<a name="inference-recommender-vpc-access-route-tables"></a>

Use the default DNS settings for your endpoint route table, so that standard Amazon S3 URLs (for example, `http://s3-aws-region.amazonaws.com/amzn-s3-demo-bucket`) resolve. If you don't use the default DNS settings, ensure that the URLs that you use to specify the locations of the data in your inference recommendation jobs resolve by configuring the endpoint route tables. For information about VPC endpoint route tables, see [Routing for gateway endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpce-gateway.html#vpc-endpoints-routing) in the *Amazon VPC User Guide*.
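To confirm that the route tables used by your job's subnets actually point Amazon S3 traffic at the gateway endpoint, you could inspect them as sketched below. This assumes that a gateway endpoint route appears in a `DescribeRouteTables` response with a `DestinationPrefixListId` and a `GatewayId` equal to the endpoint ID. The function names and the endpoint ID are hypothetical. The sketch checks routing only, not DNS resolution.

```python
def route_tables_missing_endpoint_route(route_tables, vpc_endpoint_id):
    """Return IDs of route tables with no route to the given gateway endpoint.

    `route_tables` is the "RouteTables" list from a DescribeRouteTables response.
    """
    missing = []
    for table in route_tables:
        has_endpoint_route = any(
            route.get("GatewayId") == vpc_endpoint_id
            and "DestinationPrefixListId" in route
            for route in table.get("Routes", [])
        )
        if not has_endpoint_route:
            missing.append(table["RouteTableId"])
    return missing


def fetch_route_tables(route_table_ids):
    """Look up route tables. Needs boto3, AWS credentials, and a Region."""
    import boto3  # imported lazily so the check above works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.describe_route_tables(RouteTableIds=list(route_table_ids))
    return response["RouteTables"]
```

Any route table ID returned by `route_tables_missing_endpoint_route` would need to be associated with the endpoint before jobs in its subnets can reach Amazon S3 through it.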

## Configure the VPC security group
<a name="inference-recommender-vpc-access-security-group"></a>

In your security group for the inference recommendation job, you must allow outbound communication to your Amazon S3 VPC endpoints and the subnet CIDR ranges used for the inference recommendation job. For information, see [Security Group Rules](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_SecurityGroups.html#SecurityGroupRules) and [Control access to services with Amazon VPC endpoints](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints-access.html) in the *Amazon VPC User Guide*.
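The outbound rules described above might be expressed with the AWS SDK for Python (Boto3) roughly as follows. This is a sketch under assumptions: it allows HTTPS (port 443) to the Amazon S3 prefix list and all protocols to the subnet CIDR ranges, which may be broader or narrower than your workload needs. The prefix list ID, CIDR ranges, and function names are hypothetical placeholders.

```python
import ipaddress


def s3_and_subnet_egress_rules(s3_prefix_list_id, subnet_cidrs):
    """Build IpPermissions for outbound access to Amazon S3 and the job subnets."""
    # Validate and normalize each CIDR; ip_network raises ValueError if invalid.
    cidrs = [str(ipaddress.ip_network(cidr)) for cidr in subnet_cidrs]
    return [
        {
            # Assumption: HTTPS to the S3 gateway endpoint's prefix list.
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            "PrefixListIds": [{"PrefixListId": s3_prefix_list_id}],
        },
        {
            # Assumption: all protocols to the subnets used by the job.
            "IpProtocol": "-1",
            "IpRanges": [{"CidrIp": cidr} for cidr in cidrs],
        },
    ]


def allow_egress(security_group_id, rules):
    """Apply the rules. Needs boto3, AWS credentials, and a Region."""
    import boto3  # imported lazily so the builder works without boto3 installed

    return boto3.client("ec2").authorize_security_group_egress(
        GroupId=security_group_id, IpPermissions=rules
    )


# Placeholder values; replace them with your own prefix list and CIDR ranges.
rules = s3_and_subnet_egress_rules("pl-0123456789abcdef0", ["10.0.1.0/24"])
```

Calling `allow_egress("sg-0123456789abcdef0", rules)` would then add the rules to the security group used by the job.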