

# Connect to Amazon SageMaker AI resources from within a VPC
<a name="infrastructure-connect-to-resources"></a>

**Important**  
The following information applies to both Amazon SageMaker Studio and Amazon SageMaker Studio Classic. The same concepts of connecting to resources within a VPC apply to both Studio and Studio Classic.

Amazon SageMaker Studio and SageMaker AI notebook instances allow direct internet access by default. This lets you download popular packages and notebooks, customize your development environment, and work efficiently. However, it could also provide an opening for unauthorized access to your data. For example, if you install malicious code in the form of a publicly available notebook or source code library, that code could access your data. You can restrict which traffic can access the internet by launching your Studio and SageMaker AI notebook instances in an [Amazon Virtual Private Cloud (Amazon VPC)](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html).

An Amazon Virtual Private Cloud is a virtual network dedicated to your AWS account. With an Amazon VPC, you can control the network access and internet connectivity of your Studio and notebook instances. You can remove direct internet access to add another layer of security.

The following topics describe how to connect your Studio instances and notebook instances to resources in a VPC.

**Topics**
+ [Connect Amazon SageMaker Studio in a VPC to External Resources](studio-updated-and-internet-access.md)
+ [Connect Studio notebooks in a VPC to external resources](studio-notebooks-and-internet-access.md)
+ [Connect a Notebook Instance in a VPC to External Resources](appendix-notebook-and-internet-access.md)

# Connect Amazon SageMaker Studio in a VPC to External Resources
<a name="studio-updated-and-internet-access"></a>

**Important**  
As of November 30, 2023, the previous Amazon SageMaker Studio experience is now named Amazon SageMaker Studio Classic. The following section is specific to using the updated Studio experience. For information about using the Studio Classic application, see [Amazon SageMaker Studio Classic](studio.md).

The following topic gives information on how to connect Amazon SageMaker Studio in a VPC to external resources.

**Topics**
+ [Default communication with the internet](#studio-notebooks-and-internet-access-default-setting)
+ [`VPC only` communication with the internet](#studio-notebooks-and-internet-access-vpc-only)

## Default communication with the internet
<a name="studio-notebooks-and-internet-access-default-setting"></a>

By default, Amazon SageMaker Studio provides a network interface that allows communication with the internet through a VPC managed by SageMaker AI. Traffic to AWS services like Amazon S3 and CloudWatch goes through an internet gateway, as does traffic that accesses the SageMaker AI API and SageMaker AI runtime. Traffic between the domain and your Amazon EFS volume goes through the VPC that you specified when you onboarded to the domain or called the [CreateDomain](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateDomain.html) API.

## `VPC only` communication with the internet
<a name="studio-notebooks-and-internet-access-vpc-only"></a>

To prevent SageMaker AI from providing internet access to Studio, you can disable internet access by specifying the `VPC only` network access type when you [onboard to Studio](https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-vpc.html) or call the [CreateDomain](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateDomain.html) API. As a result, you won't be able to run Studio unless your VPC has an interface endpoint to the SageMaker API and runtime, or a NAT gateway with internet access, and your security groups allow outbound connections.

**Note**  
The network access type can be changed after domain creation using the `--app-network-access-type` parameter of the [update-domain](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker/update-domain.html) command.
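
As a rough illustration, the following Python sketch builds the request parameters for `CreateDomain` and for a later network access type change. The helper names `build_vpc_only_domain_request` and `build_network_access_update` are hypothetical, and the choice of `IAM` as the authentication mode is an assumption made for this example. The parameter names follow the `CreateDomain` and `UpdateDomain` APIs.

```python
from __future__ import annotations

from typing import Any

VALID_ACCESS_TYPES = ("PublicInternetOnly", "VpcOnly")


def build_vpc_only_domain_request(
    domain_name: str,
    vpc_id: str,
    subnet_ids: list[str],
    execution_role_arn: str,
) -> dict[str, Any]:
    """Return CreateDomain keyword arguments that select VpcOnly mode (hypothetical helper)."""
    if not subnet_ids:
        raise ValueError("At least one private subnet ID is required")
    return {
        "DomainName": domain_name,
        "AuthMode": "IAM",  # assumption for this sketch; "SSO" is the other valid value
        "DefaultUserSettings": {"ExecutionRole": execution_role_arn},
        "VpcId": vpc_id,
        "SubnetIds": list(subnet_ids),
        # "VpcOnly" stops SageMaker AI from providing internet access to Studio.
        "AppNetworkAccessType": "VpcOnly",
    }


def build_network_access_update(domain_id: str, access_type: str = "VpcOnly") -> dict[str, str]:
    """Return UpdateDomain keyword arguments that change the network access type."""
    if access_type not in VALID_ACCESS_TYPES:
        raise ValueError(f"access_type must be one of {VALID_ACCESS_TYPES}")
    return {"DomainId": domain_id, "AppNetworkAccessType": access_type}
```

Assuming boto3 is installed and credentials are configured, the first result could be passed as `boto3.client("sagemaker").create_domain(**request)` and the second as `update_domain(**update)`.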

### Requirements to use `VPC only` mode
<a name="studio-notebooks-and-internet-access-vpc-only-requirements"></a>

When you choose `VpcOnly`, follow these steps:

1. You must use private subnets only. You cannot use public subnets in `VpcOnly` mode.

1. Ensure that your subnets have enough IP addresses. The expected number of IP addresses needed per user can vary based on use case. We recommend between 2 and 4 IP addresses per user. The total IP address capacity for a domain is the sum of available IP addresses for each subnet provided when the domain is created. Ensure that your estimated IP address usage does not exceed the capacity supported by the number of subnets you provide. Additionally, using subnets distributed across many Availability Zones can aid in IP address availability. For more information, see [VPC and subnet sizing for IPv4](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Subnets.html#vpc-sizing-ipv4).
**Note**  
You can configure only subnets with a default tenancy VPC in which your instance runs on shared hardware. For more information on the tenancy attribute for VPCs, see [Dedicated Instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/dedicated-instance.html).

1. 
**Warning**  
When using `VpcOnly` mode, you partly own the networking configuration for the domain. We recommend the security best practice of applying least-privilege permissions to the inbound and outbound access that security group rules provide. Overly permissive inbound rule configurations could allow users with access to the VPC to interact with the applications of other user profiles without authentication.

   Set up one or more security groups with inbound and outbound rules that allow the following traffic:
   + [NFS traffic over TCP on port 2049](https://docs.aws.amazon.com/efs/latest/ug/network-access.html) between the domain and the Amazon EFS volume.
   + [TCP traffic within the security group](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/security-group-rules-reference.html#sg-rules-other-instances). This is required for connectivity between the Jupyter Server application and the Kernel Gateway applications. You must allow access to ports in at least the range `8192-65535`.

   Create a distinct security group for each user profile and add inbound access from that same security group. We do not recommend reusing a domain-level security group for user profiles. If the domain-level security group allows inbound access to itself, then all applications in the domain would have access to all other applications in the domain.

1. If you want to allow internet access, you must use a [NAT gateway](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html#nat-gateway-working-with) with access to the internet, for example through an [internet gateway](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Internet_Gateway.html).

1. If you don't want to allow internet access, [create interface VPC endpoints](https://docs.aws.amazon.com/vpc/latest/privatelink/vpce-interface.html) (AWS PrivateLink) to allow Studio to access the following services with the corresponding service names. You must also associate the security groups for your VPC with these endpoints.
   + SageMaker API: `com.amazonaws.region.sagemaker.api`.
   + SageMaker AI runtime: `com.amazonaws.region.sagemaker.runtime`. This is required to run endpoint invocations.
   + Amazon S3: `com.amazonaws.region.s3`.
   + SageMaker Projects: `com.amazonaws.region.servicecatalog`.
   + SageMaker Studio: `aws.sagemaker.region.studio`.
   + Any other AWS services you require.

    If you use the [SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/) to run remote training jobs, you must also create the following Amazon VPC endpoints.
   + AWS Security Token Service: `com.amazonaws.region.sts`
   + Amazon CloudWatch: `com.amazonaws.region.logs`. This is required to allow SageMaker Python SDK to get the remote training job status from Amazon CloudWatch.

1. If you use the domain in `VpcOnly` mode from an on-premises network, establish private connectivity between the network of the host running Studio in the browser and the target Amazon VPC. This is required because the Studio UI invokes AWS endpoints using API calls with temporary AWS credentials. These temporary credentials are associated with the execution role of the logged-in user profile. If the domain is configured in `VpcOnly` mode and accessed from an on-premises network, the execution role might define IAM policy conditions that allow AWS service API calls only through the configured Amazon VPC endpoints. This causes API calls made from the Studio UI to fail. We recommend resolving this by using an [AWS Site-to-Site VPN](https://docs.aws.amazon.com/vpn/latest/s2svpn/VPC_VPN.html) or [AWS Direct Connect](https://docs.aws.amazon.com/directconnect/latest/UserGuide/Welcome.html) connection.
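
The security group traffic described in step 3 can be sketched as a pair of self-referencing rules. This is a minimal sketch, not a complete configuration: the helper name `build_studio_ingress_rules` is hypothetical, and it assumes that the Amazon EFS mount targets use the same security group as the Studio applications, so that one self-referencing rule covers NFS traffic.

```python
from __future__ import annotations

from typing import Any

NFS_PORT = 2049                  # NFS over TCP to the Amazon EFS volume
APP_PORT_RANGE = (8192, 65535)   # Jupyter Server to Kernel Gateway traffic


def build_studio_ingress_rules(security_group_id: str) -> list[dict[str, Any]]:
    """Return EC2 IpPermissions entries that allow traffic from the group to itself."""

    def self_referencing_rule(from_port: int, to_port: int) -> dict[str, Any]:
        return {
            "IpProtocol": "tcp",
            "FromPort": from_port,
            "ToPort": to_port,
            # The source is the same security group, not a CIDR range.
            "UserIdGroupPairs": [{"GroupId": security_group_id}],
        }

    return [
        self_referencing_rule(NFS_PORT, NFS_PORT),
        self_referencing_rule(*APP_PORT_RANGE),
    ]
```

Assuming boto3, the rules could be applied with `boto3.client("ec2").authorize_security_group_ingress(GroupId=sg_id, IpPermissions=rules)`, and the same structure can be passed to `authorize_security_group_egress` for the outbound direction. Building one such group per user profile keeps the rules in line with the recommendation to avoid a shared domain-level group.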

**Note**  
For a customer working within VPC mode, company firewalls can cause connection issues with Studio or applications. Make the following checks if you encounter one of these issues when using Studio from behind a firewall.  
Verify that the Studio URL and URLs for all of your applications are in your network's allowlist. For example:  

  ```
  *.studio.region.sagemaker.aws
  *.console.aws.a2z.com
  ```
Verify that the websocket connections are not blocked. Jupyter uses websockets.

**For more information**
+ [Security groups for your VPC](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_SecurityGroups.html)
+ [Connect to SageMaker AI Within your VPC](interface-vpc-endpoint.md)
+ [VPC with public and private subnets (NAT)](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Scenario2.html)

# Connect Studio notebooks in a VPC to external resources
<a name="studio-notebooks-and-internet-access"></a>

The following topic gives information about how to connect Studio notebooks in a VPC to external resources.

## Default communication with the internet
<a name="studio-notebooks-and-internet-access-default"></a>

By default, SageMaker Studio provides a network interface that allows communication with the internet through a VPC managed by SageMaker AI. Traffic to AWS services, like Amazon S3 and CloudWatch, goes through an internet gateway. Traffic that accesses the SageMaker API and SageMaker AI runtime also goes through an internet gateway. Traffic between the domain and Amazon EFS volume goes through the VPC that you identified when you onboarded to Studio or called the [CreateDomain](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateDomain.html) API. The following diagram shows the default configuration.

![\[SageMaker Studio VPC diagram depicting direct internet access usage.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/studio/studio-vpc-internet.png)


## `VPC only` communication with the internet
<a name="studio-notebooks-and-internet-access-vpc"></a>

To stop SageMaker AI from providing internet access to your Studio notebooks, disable internet access by specifying the `VPC only` network access type. Specify this network access type when you [onboard to Studio](https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-vpc.html) or call the [CreateDomain](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateDomain.html) API. As a result, you won't be able to run a Studio notebook unless:
+ your VPC has an interface endpoint to the SageMaker API and runtime, or a NAT gateway with internet access
+ your security groups allow outbound connections

The following diagram shows a configuration for using VPC-only mode.

![\[SageMaker Studio VPC diagram depicting usage of VPC-only mode.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/studio/studio-vpc-private.png)


### Requirements to use `VPC only` mode
<a name="studio-notebooks-and-internet-access-vpc-requirements"></a>

When you choose `VpcOnly`, follow these steps:

1. You must use private subnets only. You cannot use public subnets in `VpcOnly` mode.

1. Ensure that your subnets have enough IP addresses. The expected number of IP addresses needed per user can vary based on use case. We recommend between 2 and 4 IP addresses per user. The total IP address capacity for a Studio domain is the sum of available IP addresses for each subnet provided when the domain is created. Make sure that your IP address usage isn't more than the capacity supported by the number of subnets you provide. Additionally, using subnets distributed across many Availability Zones can help with IP address availability. For more information, see [VPC and subnet sizing for IPv4](https://docs.aws.amazon.com/vpc/latest/userguide/how-it-works.html#vpc-sizing-ipv4).
**Note**  
You can configure only subnets with a default tenancy VPC in which your instance runs on shared hardware. For more information on the tenancy attribute for VPCs, see [Dedicated Instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/dedicated-instance.html).

1. 
**Warning**  
When using `VpcOnly` mode, you partly own the networking configuration for the domain. We recommend the security best practice of applying least-privilege permissions to the inbound and outbound access that security group rules provide. Overly permissive inbound rule configurations could allow users with access to the VPC to interact with the applications of other user profiles without authentication.

   Set up one or more security groups with inbound and outbound rules that allow the following traffic:
   + [NFS traffic over TCP on port 2049](https://docs.aws.amazon.com/efs/latest/ug/network-access.html) between the domain and the Amazon EFS volume.
   + [TCP traffic within the security group](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/security-group-rules-reference.html#sg-rules-other-instances). This is required for connectivity between the Jupyter Server application and the Kernel Gateway applications. You must allow access to ports in at least the range `8192-65535`.

   Create a distinct security group for each user profile and add inbound access from that same security group. We do not recommend reusing a domain-level security group for user profiles. If the domain-level security group allows inbound access to itself, all applications in the domain have access to all other applications in the domain.

1. If you want to allow internet access, you must use a [NAT gateway](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html#nat-gateway-working-with) with access to the internet, for example through an [internet gateway](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Internet_Gateway.html).

1. To remove internet access, [create interface VPC endpoints](https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html) (AWS PrivateLink) to allow Studio to access the following services with the corresponding service names. You must also associate the security groups for your VPC with these endpoints.
   + SageMaker API: `com.amazonaws.region.sagemaker.api`.
   + SageMaker AI runtime: `com.amazonaws.region.sagemaker.runtime`. This is required to run Studio notebooks and to train and host models. 
   + Amazon S3: `com.amazonaws.region.s3`.
   + To use SageMaker Projects: `com.amazonaws.region.servicecatalog`.
   + Any other AWS services you require.

    If you use the [SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/) to run remote training jobs, you must also create the following Amazon VPC endpoints.
   + AWS Security Token Service: `com.amazonaws.region.sts`
   + Amazon CloudWatch: `com.amazonaws.region.logs`. This is required to allow SageMaker Python SDK to get the remote training job status from Amazon CloudWatch.
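
The endpoint service names in the preceding step follow a single pattern, with `region` replaced by your AWS Region code. The sketch below expands that pattern and builds one request per interface endpoint. The helper names are hypothetical, and the output is only a starting point: add any other services you require.

```python
from __future__ import annotations

from typing import Any

# Service suffixes taken from the list above.
STUDIO_SERVICE_SUFFIXES = ("sagemaker.api", "sagemaker.runtime", "s3", "servicecatalog")
# Additional endpoints for remote training jobs run with the SageMaker Python SDK.
SDK_TRAINING_SERVICE_SUFFIXES = ("sts", "logs")


def endpoint_service_names(region: str, include_sdk_training: bool = False) -> list[str]:
    """Expand the com.amazonaws.region.<service> pattern for one Region."""
    suffixes = STUDIO_SERVICE_SUFFIXES
    if include_sdk_training:
        suffixes = suffixes + SDK_TRAINING_SERVICE_SUFFIXES
    return [f"com.amazonaws.{region}.{suffix}" for suffix in suffixes]


def build_interface_endpoint_requests(
    region: str,
    vpc_id: str,
    subnet_ids: list[str],
    security_group_ids: list[str],
    include_sdk_training: bool = False,
) -> list[dict[str, Any]]:
    """Return one set of EC2 CreateVpcEndpoint keyword arguments per service."""
    return [
        {
            "VpcEndpointType": "Interface",
            "VpcId": vpc_id,
            "ServiceName": service_name,
            "SubnetIds": list(subnet_ids),
            # Associate the VPC security groups with every endpoint.
            "SecurityGroupIds": list(security_group_ids),
        }
        for service_name in endpoint_service_names(region, include_sdk_training)
    ]
```

Assuming boto3, each dictionary could be passed as `boto3.client("ec2").create_vpc_endpoint(**request)`.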

**Note**  
For a customer working within VPC mode, company firewalls can cause connection issues with SageMaker Studio or between JupyterServer and the KernelGateway. Make the following checks if you run into one of these issues when using SageMaker Studio from behind a firewall.  
Check that the Studio URL is in your network's allowlist.  
Check that the websocket connections are not blocked. Jupyter uses websockets under the hood. If they are blocked, JupyterServer might not be able to connect to the KernelGateway application even when that application is `InService`. You would see the same problem when opening System Terminal.
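
The subnet sizing guidance in step 2 of the requirements comes down to simple arithmetic. The following sketch estimates how many users a set of subnets could support. It assumes that the subnets are otherwise empty and relies on the fact that AWS reserves five addresses in every subnet. The CIDR blocks and function names are illustrative only.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one in each subnet


def usable_addresses(cidr: str) -> int:
    """Addresses in a subnet after subtracting the ones AWS reserves."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def domain_capacity(subnet_cidrs: list) -> int:
    """Total capacity is the sum over all subnets given to the domain."""
    return sum(usable_addresses(cidr) for cidr in subnet_cidrs)


def supported_users(subnet_cidrs: list, ips_per_user: int = 4) -> int:
    """Users supported at a given per-user budget (2 to 4 is the recommended range)."""
    if ips_per_user < 1:
        raise ValueError("ips_per_user must be at least 1")
    return domain_capacity(subnet_cidrs) // ips_per_user
```

For example, two otherwise empty `/24` subnets provide 502 usable addresses, which covers 125 users at 4 addresses each or 251 users at 2 addresses each. Addresses already used by other resources in the subnets reduce these figures.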

**For more information**
+ [Securing Amazon SageMaker Studio connectivity using a private VPC](https://aws.amazon.com/blogs/machine-learning/securing-amazon-sagemaker-studio-connectivity-using-a-private-vpc)
+ [Security groups for your VPC](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-security-groups.html)
+ [Connect to SageMaker AI Within your VPC](interface-vpc-endpoint.md)
+ [VPC with public and private subnets (NAT)](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Scenario2.html)

# Connect a Notebook Instance in a VPC to External Resources
<a name="appendix-notebook-and-internet-access"></a>

The following topic gives information on how to connect your notebook instance in a VPC to external resources.

## Default communication with the internet
<a name="appendix-notebook-and-internet-access-default"></a>

When your notebook allows *direct internet access*, SageMaker AI provides a network interface that allows the notebook to communicate with the internet through a VPC managed by SageMaker AI. Traffic within your VPC's CIDR goes through the elastic network interface created in your VPC. All other traffic goes through the network interface created by SageMaker AI, which means that it essentially travels over the public internet. Traffic to gateway VPC endpoints like Amazon S3 and DynamoDB goes through the public internet, while traffic to interface VPC endpoints still goes through your VPC. If you want to use gateway VPC endpoints, you might want to disable direct internet access.

## VPC-only communication with the internet
<a name="appendix-notebook-and-internet-access-default-vpc"></a>

To disable direct internet access, you can specify a VPC for your notebook instance. By doing so, you prevent SageMaker AI from providing internet access to your notebook instance. As a result, the notebook instance can't train or host models unless your VPC has an interface endpoint (AWS PrivateLink) or a NAT gateway and your security groups allow outbound connections. 
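
As an illustration, the sketch below builds `CreateNotebookInstance` parameters that attach the notebook instance to your VPC and disable direct internet access. The helper name `build_private_notebook_request` is hypothetical. The parameter names follow the `CreateNotebookInstance` API.

```python
from __future__ import annotations

from typing import Any


def build_private_notebook_request(
    name: str,
    instance_type: str,
    role_arn: str,
    subnet_id: str,
    security_group_ids: list[str],
) -> dict[str, Any]:
    """Return CreateNotebookInstance keyword arguments with direct internet access disabled."""
    if not subnet_id or not security_group_ids:
        # Disabling direct internet access only makes sense when a VPC subnet is specified.
        raise ValueError("A subnet ID and at least one security group ID are required")
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,
        "RoleArn": role_arn,
        "SubnetId": subnet_id,
        "SecurityGroupIds": list(security_group_ids),
        # SageMaker AI no longer provides internet access to the instance.
        "DirectInternetAccess": "Disabled",
    }
```

Assuming boto3, the result could be passed as `boto3.client("sagemaker").create_notebook_instance(**request)`. The instance can then reach other services only through your VPC's interface endpoints or NAT gateway.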

For information about creating a VPC interface endpoint to use AWS PrivateLink for your notebook instance, see [Connect to a Notebook Instance Through a VPC Interface Endpoint](notebook-interface-endpoint.md). For information about setting up a NAT gateway for your VPC, see [VPC with Public and Private Subnets (NAT)](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-example-private-subnets-nat.html) in the *Amazon Virtual Private Cloud User Guide*. For information about security groups, see [Security Groups for Your VPC](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_SecurityGroups.html). For more information about the networking configuration in each networking mode and about configuring on-premises networking, see [Understanding Amazon SageMaker notebook instance networking configurations and advanced routing options](https://aws.amazon.com/blogs/machine-learning/understanding-amazon-sagemaker-notebook-instance-networking-configurations-and-advanced-routing-options/).

**Warning**  
When you use a VPC for your notebook instance, you partly own the networking configuration for the instance. As a security best practice, we recommend that you apply least-privilege permissions to the inbound and outbound access that you permit with your security group rules. If you apply overly permissive inbound rule configurations, then users who have access to your VPC could access your Jupyter Notebooks without authenticating.

## Security and Shared Notebook Instances
<a name="appendix-notebook-and-single-user"></a>

A SageMaker notebook instance is designed to work best for an individual user. It is designed to give data scientists and other users the most power for managing their development environment.

A notebook instance user has root access for installing packages and other pertinent software. We recommend that you exercise judgment when granting individuals access to notebook instances that are attached to a VPC that contains sensitive information. For example, you might grant a user access to a notebook instance with an IAM policy by giving them the ability to create a presigned notebook URL, as shown in the following example:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:CreatePresignedNotebookInstanceUrl",
            "Resource": "arn:aws:sagemaker:us-east-1:111122223333:notebook-instance/myNotebookInstance"
        }
    ]
}
```


 