

# Migration from Amazon SageMaker Studio Classic
<a name="studio-updated-migrate"></a>

**Important**  
Custom IAM policies that allow Amazon SageMaker Studio or Amazon SageMaker Studio Classic to create Amazon SageMaker resources must also grant permissions to add tags to those resources. The permission to add tags to resources is required because Studio and Studio Classic automatically tag any resources they create. If an IAM policy allows Studio and Studio Classic to create resources but does not allow tagging, "AccessDenied" errors can occur when trying to create resources. For more information, see [Provide permissions for tagging SageMaker AI resources](security_iam_id-based-policy-examples.md#grant-tagging-permissions).  
[AWS managed policies for Amazon SageMaker AI](security-iam-awsmanpol.md) that give permissions to create SageMaker resources already include permissions to add tags while creating those resources.
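For illustration, a custom policy that allows resource creation would pair the create permissions with a tagging statement like the following sketch, which mirrors the tagging statements used later in this topic. The account ID and AWS Region are placeholders:

```
{
    "Sid": "AllowTagOnCreate",
    "Effect": "Allow",
    "Action": "sagemaker:AddTags",
    "Resource": "arn:aws:sagemaker:us-east-1:111122223333:*/*",
    "Condition": {
        "Null": {
            "sagemaker:TaggingAction": "false"
        }
    }
}
```

The `Null` condition on `sagemaker:TaggingAction` restricts the statement to tags applied as part of a resource-creating action, rather than allowing arbitrary tagging.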

When you open Amazon SageMaker Studio, the web-based UI is based on the chosen default experience. Amazon SageMaker AI currently supports two different default experiences: the Amazon SageMaker Studio experience and the Amazon SageMaker Studio Classic experience. To access the latest Amazon SageMaker Studio features, you must migrate existing domains from the Amazon SageMaker Studio Classic experience. When you migrate your default experience from Studio Classic to Studio, you don't lose any features, and can still access the Studio Classic IDE within Studio. For information about the added benefits of the Studio experience, see [Amazon SageMaker Studio](studio-updated.md).

**Note**  
For existing customers that created their accounts before November 30, 2023, Studio Classic may be the default experience. You can enable Studio as your default experience using the AWS Command Line Interface (AWS CLI) or the Amazon SageMaker AI console. For more information about Studio Classic, see [Amazon SageMaker Studio Classic](studio.md). 
For customers that created their accounts after November 30, 2023, we recommend using Studio as the default experience because it contains various integrated development environments (IDEs), including the Studio Classic IDE, and other new features.  
JupyterLab 3 reached its end of maintenance date on May 15, 2024. You can continue to create new Studio Classic notebooks on JupyterLab 3 for a limited period, but after December 31, 2024, SageMaker AI will no longer provide fixes for critical issues on Studio Classic notebooks on JupyterLab 3. We recommend that you migrate your workloads to the new Studio experience, which supports JupyterLab 4.
+ If Studio is your default experience, the UI is similar to the images found in [Amazon SageMaker Studio UI overview](studio-updated-ui.md).
+ If Studio Classic is your default experience, the UI is similar to the images found in [Amazon SageMaker Studio Classic UI Overview](studio-ui.md).

To migrate, you must update an existing domain. Migrating an existing domain from Studio Classic to Studio requires three distinct phases:

1. **Migrate the UI from Studio Classic to Studio**: A one-time, low-lift task that requires creating a test domain to ensure that Studio is compliant with your organization's network configurations before migrating the existing domain's UI from Studio Classic to Studio.

1. **(Optional) Migrate custom images and lifecycle configuration scripts**: A medium-lift task for migrating your custom images and lifecycle configuration (LCC) scripts from Studio Classic to Studio.

1. **(Optional) Migrate data from Studio Classic to Studio**: A heavy-lift task that requires using AWS DataSync to migrate data from the Studio Classic Amazon Elastic File System (Amazon EFS) volume to either a target Amazon EFS or Amazon Elastic Block Store (Amazon EBS) volume.

   1. **(Optional) Migrate data flows from Data Wrangler in Studio Classic**: A one-time, low-lift task for migrating your data flows from Data Wrangler in Studio Classic, which you can then access in the latest version of Studio through SageMaker Canvas. For more information, see [Migrate data flows from Data Wrangler](studio-updated-migrate-data.md#studio-updated-migrate-flows).

The following topics show how to complete these phases to migrate an existing domain from Studio Classic to Studio.

## Automatic migration
<a name="studio-updated-migrate-auto"></a>

Between July 2024 and August 2024, we are automatically upgrading the default landing experience for users to the new Studio experience. This only changes the default landing UI to the updated Studio UI. The Studio Classic application is still accessible from the new Studio UI.

To ensure that migration works successfully for your users, see [Migrate the UI from Studio Classic to Studio](studio-updated-migrate-ui.md). In particular, ensure the following:
+ the domain's execution role has the right permissions
+ the default landing experience is set to Studio
+ the domain's Amazon VPC, if applicable, is configured to allow access to Studio through the Studio VPC endpoint

However, if you need to continue having Studio Classic as your default UI for a limited time, set the landing experience to Studio Classic explicitly. For more information, see [Set Studio Classic as the default experience](studio-updated-migrate-ui.md#studio-updated-migrate-revert).
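For example, keeping Studio Classic as the default for a domain is a single `update-domain` call that sets the same `StudioWebPortal` and `DefaultLandingUri` values described later in this topic. The domain ID below is a placeholder:

```
aws sagemaker update-domain \
--domain-id existing-domain-id \
--default-user-settings '
{
    "StudioWebPortal": "DISABLED",
    "DefaultLandingUri": "app:JupyterServer:"
}
'
```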

**Topics**
+ [Automatic migration](#studio-updated-migrate-auto)
+ [Complete prerequisites to migrate the Studio experience](studio-updated-migrate-prereq.md)
+ [Migrate the UI from Studio Classic to Studio](studio-updated-migrate-ui.md)
+ [(Optional) Migrate custom images and lifecycle configurations](studio-updated-migrate-lcc.md)
+ [(Optional) Migrate data from Studio Classic to Studio](studio-updated-migrate-data.md)

# Complete prerequisites to migrate the Studio experience
<a name="studio-updated-migrate-prereq"></a>

Migration of the default experience from Studio Classic to Studio is managed by the administrator of the existing domain. If you do not have permissions to set Studio as the default experience for the existing domain, contact your administrator. To migrate your default experience, you must have administrator permissions, or at least permissions to update the existing domain and to manage AWS Identity and Access Management (IAM) and Amazon Simple Storage Service (Amazon S3) resources. Complete the following prerequisites before migrating an existing domain from Studio Classic to Studio.
+ The AWS Identity and Access Management role used to complete migration must have a policy attached with at least the following permissions. For information about creating an IAM policy, see [Creating IAM policies](https://docs.aws.amazon.com//IAM/latest/UserGuide/access_policies_create.html).
**Note**  
The release of Studio includes updates to the AWS managed policies. For more information, see [SageMaker AI Updates to AWS Managed Policies](security-iam-awsmanpol.md#security-iam-awsmanpol-updates).
  + Phase 1 required permissions:
    + `iam:CreateServiceLinkedRole`
    + `iam:PassRole`
    + `sagemaker:DescribeDomain`
    + `sagemaker:UpdateDomain`
    + `sagemaker:CreateDomain`
    + `sagemaker:CreateUserProfile`
    + `sagemaker:ListApps`
    + `sagemaker:AddTags`
    + `sagemaker:DeleteApp`
    + `sagemaker:DeleteSpace`
    + `sagemaker:UpdateSpace`
    + `sagemaker:DeleteUserProfile`
    + `sagemaker:DeleteDomain`
    + `s3:PutBucketCORS`
  + Phase 2 required permissions (Optional, only if using lifecycle configuration scripts):

    No additional permissions needed. If the existing domain has lifecycle configurations and custom images, the admin will already have the required permissions.
  + Phase 3 using custom Amazon Elastic File System required permissions (Optional, only if transferring data):
    + `efs:CreateFileSystem`
    + `efs:CreateMountTarget`
    + `efs:DescribeFileSystems`
    + `efs:DescribeMountTargets`
    + `efs:DescribeMountTargetSecurityGroups`
    + `efs:ModifyMountTargetSecurityGroups`
    + `ec2:DescribeSubnets`
    + `ec2:DescribeSecurityGroups`
    + `ec2:DescribeNetworkInterfaceAttribute`
    + `ec2:DescribeNetworkInterfaces`
    + `ec2:AuthorizeSecurityGroupEgress`
    + `ec2:AuthorizeSecurityGroupIngress`
    + `ec2:CreateNetworkInterface`
    + `ec2:CreateNetworkInterfacePermission`
    + `ec2:RevokeSecurityGroupIngress`
    + `ec2:RevokeSecurityGroupEgress`
    + `ec2:DeleteSecurityGroup`
    + `datasync:CreateLocationEfs`
    + `datasync:CreateTask`
    + `datasync:StartTaskExecution`
    + `datasync:DeleteTask`
    + `datasync:DeleteLocation`
    + `sagemaker:ListUserProfiles`
    + `sagemaker:DescribeUserProfile`
    + `sagemaker:UpdateDomain`
    + `sagemaker:UpdateUserProfile`
  + Phase 3 using Amazon Simple Storage Service required permissions (Optional, only if transferring data):
    + `iam:CreateRole`
    + `iam:GetRole`
    + `iam:AttachRolePolicy`
    + `iam:DetachRolePolicy`
    + `iam:DeleteRole`
    + `efs:DescribeFileSystems`
    + `efs:DescribeMountTargets`
    + `efs:DescribeMountTargetSecurityGroups`
    + `ec2:DescribeSubnets`
    + `ec2:CreateSecurityGroup`
    + `ec2:DescribeSecurityGroups`
    + `ec2:DescribeNetworkInterfaces`
    + `ec2:CreateNetworkInterface`
    + `ec2:CreateNetworkInterfacePermission`
    + `ec2:DetachNetworkInterfaces`
    + `ec2:DeleteNetworkInterface`
    + `ec2:DeleteNetworkInterfacePermission`
    + `ec2:CreateTags`
    + `ec2:AuthorizeSecurityGroupEgress`
    + `ec2:AuthorizeSecurityGroupIngress`
    + `ec2:RevokeSecurityGroupIngress`
    + `ec2:RevokeSecurityGroupEgress`
    + `ec2:DeleteSecurityGroup`
    + `datasync:CreateLocationEfs`
    + `datasync:CreateLocationS3`
    + `datasync:CreateTask`
    + `datasync:StartTaskExecution`
    + `datasync:DescribeTaskExecution`
    + `datasync:DeleteTask`
    + `datasync:DeleteLocation`
    + `sagemaker:CreateStudioLifecycleConfig`
    + `sagemaker:UpdateDomain`
    + `s3:ListBucket`
    + `s3:GetObject`
+ Access to AWS services from a terminal environment on either:
  + Your local machine using the AWS CLI version `2.13+`. Use the following command to verify the AWS CLI version.

    ```
    aws --version
    ```
  + AWS CloudShell. For more information, see [What is AWS CloudShell?](https://docs.aws.amazon.com/cloudshell/latest/userguide/welcome.html)
+ From your local machine or AWS CloudShell, run the following command and provide your AWS credentials. For information about AWS credentials, see [Understanding and getting your AWS credentials](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html).

  ```
  aws configure
  ```
+ Verify that the lightweight JSON processor, `jq`, is installed in the terminal environment. `jq` is required to parse AWS CLI responses.

  ```
  jq --version
  ```

  If `jq` is not installed, install it using one of the following commands:
  + On Debian-based distributions, such as Ubuntu:

    ```
    sudo apt-get install -y jq
    ```
  + On RPM-based distributions, such as Amazon Linux:

    ```
    sudo yum install -y jq
    ```
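To confirm that `jq` works, you can parse a sample payload shaped like the AWS CLI responses used later in the migration (all values below are hypothetical):

```
# Extract the first application's name from a sample list-apps-style response.
apps_json='{"Apps":[{"DomainId":"d-abc123","AppType":"JupyterLab","AppName":"default","Status":"InService"}]}'
echo "$apps_json" | jq -r '.Apps[0].AppName'
# prints: default
```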

# Migrate the UI from Studio Classic to Studio
<a name="studio-updated-migrate-ui"></a>

The first phase for migrating an existing domain involves migrating the UI from Amazon SageMaker Studio Classic to Amazon SageMaker Studio. This phase does not include the migration of data. Users can continue working with their data the same way as they were before migration. For information about migrating data, see [(Optional) Migrate data from Studio Classic to Studio](studio-updated-migrate-data.md).

Phase 1 consists of the following steps:

1. Update application creation permissions for new applications available in Studio.

1. Update the VPC configuration for the domain.

1. Upgrade the domain to use the Studio UI.

## Prerequisites
<a name="studio-updated-migrate-ui-prereq"></a>

Before running these steps, complete the prerequisites in [Complete prerequisites to migrate the Studio experience](studio-updated-migrate-prereq.md).

## Step 1: Update application creation permissions
<a name="studio-updated-migrate-limit-apps"></a>

Before migrating the domain, update the domain's execution role to grant users permissions to create applications.

1. Create an AWS Identity and Access Management policy with one of the following contents by following the steps in [Creating IAM policies](https://docs.aws.amazon.com//IAM/latest/UserGuide/access_policies_create.html): 
   + Use the following policy to grant permissions for all application types and spaces.
**Note**  
If the domain uses the `SageMakerFullAccess` policy, you do not need to perform this action. `SageMakerFullAccess` grants permissions to create all applications.

     ```
     {
         "Version":"2012-10-17",		 	 	 
         "Statement": [
             {
                 "Sid": "SMStudioUserProfileAppPermissionsCreateAndDelete",
                 "Effect": "Allow",
                 "Action": [
                     "sagemaker:CreateApp",
                     "sagemaker:DeleteApp"
                 ],
                 "Resource": "arn:aws:sagemaker:us-east-1:111122223333:app/*",
                 "Condition": {
                     "Null": {
                         "sagemaker:OwnerUserProfileArn": "true"
                     }
                 }
             },
             {
                 "Sid": "SMStudioCreatePresignedDomainUrlForUserProfile",
                 "Effect": "Allow",
                 "Action": [
                     "sagemaker:CreatePresignedDomainUrl"
                 ],
                 "Resource": "arn:aws:sagemaker:us-east-1:111122223333:user-profile/${sagemaker:DomainId}/${sagemaker:UserProfileName}"
             },
             {
                 "Sid": "SMStudioAppPermissionsListAndDescribe",
                 "Effect": "Allow",
                 "Action": [
                     "sagemaker:ListApps",
                     "sagemaker:ListDomains",
                     "sagemaker:ListUserProfiles",
                     "sagemaker:ListSpaces",
                     "sagemaker:DescribeApp",
                     "sagemaker:DescribeDomain",
                     "sagemaker:DescribeUserProfile",
                     "sagemaker:DescribeSpace"
                 ],
                 "Resource": "*"
             },
             {
                 "Sid": "SMStudioAppPermissionsTagOnCreate",
                 "Effect": "Allow",
                 "Action": [
                     "sagemaker:AddTags"
                 ],
                 "Resource": "arn:aws:sagemaker:us-east-1:111122223333:*/*",
                 "Condition": {
                     "Null": {
                         "sagemaker:TaggingAction": "false"
                     }
                 }
             },
             {
                 "Sid": "SMStudioRestrictSharedSpacesWithoutOwners",
                 "Effect": "Allow",
                 "Action": [
                     "sagemaker:CreateSpace",
                     "sagemaker:UpdateSpace",
                     "sagemaker:DeleteSpace"
                 ],
                 "Resource": "arn:aws:sagemaker:us-east-1:111122223333:space/${sagemaker:DomainId}/*",
                 "Condition": {
                     "Null": {
                         "sagemaker:OwnerUserProfileArn": "true"
                     }
                 }
             },
             {
                 "Sid": "SMStudioRestrictSpacesToOwnerUserProfile",
                 "Effect": "Allow",
                 "Action": [
                     "sagemaker:CreateSpace",
                     "sagemaker:UpdateSpace",
                     "sagemaker:DeleteSpace"
                 ],
                 "Resource": "arn:aws:sagemaker:us-east-1:111122223333:space/${sagemaker:DomainId}/*",
                 "Condition": {
                     "ArnLike": {
                         "sagemaker:OwnerUserProfileArn": "arn:aws:sagemaker:us-east-1:111122223333:user-profile/${sagemaker:DomainId}/${sagemaker:UserProfileName}"
                     },
                     "StringEquals": {
                         "sagemaker:SpaceSharingType": [
                             "Private",
                             "Shared"
                         ]
                     }
                 }
             },
             {
                 "Sid": "SMStudioRestrictCreatePrivateSpaceAppsToOwnerUserProfile",
                 "Effect": "Allow",
                 "Action": [
                     "sagemaker:CreateApp",
                     "sagemaker:DeleteApp"
                 ],
                 "Resource": "arn:aws:sagemaker:us-east-1:111122223333:app/${sagemaker:DomainId}/*",
                 "Condition": {
                     "ArnLike": {
                         "sagemaker:OwnerUserProfileArn": "arn:aws:sagemaker:us-east-1:111122223333:user-profile/${sagemaker:DomainId}/${sagemaker:UserProfileName}"
                     },
                     "StringEquals": {
                         "sagemaker:SpaceSharingType": [
                             "Private"
                         ]
                     }
                 }
             },
             {
                 "Sid": "AllowAppActionsForSharedSpaces",
                 "Effect": "Allow",
                 "Action": [
                     "sagemaker:CreateApp",
                     "sagemaker:DeleteApp"
                 ],
                 "Resource": "arn:aws:sagemaker:*:*:app/${sagemaker:DomainId}/*/*/*",
                 "Condition": {
                     "StringEquals": {
                         "sagemaker:SpaceSharingType": [
                             "Shared"
                         ]
                     }
                 }
             }
         ]
     }
     ```

   + Because Studio shows an expanded set of applications, users may have access to applications that weren't displayed before. Administrators can limit access to these default applications by creating an AWS Identity and Access Management (IAM) policy that denies permissions for some applications to specific users.
**Note**  
Application type can be either `jupyterlab` or `codeeditor`.

     ```
     {
         "Version":"2012-10-17",		 	 	 
         "Statement": [
             {
                 "Sid": "DenySageMakerCreateAppForSpecificAppTypes",
                 "Effect": "Deny",
                 "Action": "sagemaker:CreateApp",
                 "Resource": "arn:aws:sagemaker:us-east-1:111122223333:app/domain-id/*/app-type/*"
             }
         ]
     }
     ```


1. Attach the policy to the execution role of the domain. For instructions, follow the steps in [Adding IAM identity permissions (console)](https://docs.aws.amazon.com//IAM/latest/UserGuide/access_policies_manage-attach-detach.html#add-policies-console).
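As an alternative to the console, the policy can be created and attached with the AWS CLI. The following sketch assumes the policy JSON is saved to a local file; the file name, policy name, and role name are hypothetical:

```
# Create the policy from a local JSON file and capture its ARN.
policy_arn=$(aws iam create-policy \
  --policy-name SMStudioAppPermissions \
  --policy-document file://studio-app-permissions.json \
  | jq -r '.Policy.Arn')

# Attach the policy to the domain's execution role.
aws iam attach-role-policy \
  --role-name MySageMakerExecutionRole \
  --policy-arn "$policy_arn"
```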

## Step 2: Update VPC configuration
<a name="studio-updated-migrate-vpc"></a>

If you use your domain in `VPC-Only` mode, ensure your VPC configuration meets the requirements for using Studio in `VPC-Only` mode. For more information, see [Connect Amazon SageMaker Studio in a VPC to External Resources](studio-updated-and-internet-access.md).

## Step 3: Upgrade to the Studio UI
<a name="studio-updated-migrate-set-studio-updated"></a>

Before you migrate your existing domain from Studio Classic to Studio, we recommend creating a test domain using Studio with the same configurations as your existing domain.

### (Optional) Create a test domain
<a name="studio-updated-migrate-ui-create-test"></a>

Use this test domain to interact with Studio, test networking configurations, and launch applications before migrating the existing domain.

1. Get the domain ID of your existing domain.

   1. Open the Amazon SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/).

   1. From the left navigation pane, expand **Admin configurations** and choose **Domains**. 

   1. Choose the existing domain.

   1. On the **Domain details** page, choose the **Domain settings** tab.

   1. Copy the **Domain ID**.

1. Set the domain ID of your existing domain and its AWS Region.

   ```
   export REF_DOMAIN_ID="domain-id"
   export SM_REGION="region"
   ```

1. Use `describe-domain` to get important information about the existing domain.

   ```
   export REF_EXECROLE=$(aws sagemaker describe-domain --region=$SM_REGION --domain-id=$REF_DOMAIN_ID | jq -r '.DefaultUserSettings.ExecutionRole')
   export REF_VPC=$(aws sagemaker describe-domain --region=$SM_REGION --domain-id=$REF_DOMAIN_ID | jq -r '.VpcId')
   export REF_SIDS=$(aws sagemaker describe-domain --region=$SM_REGION --domain-id=$REF_DOMAIN_ID | jq -r '.SubnetIds | join(",")')
   export REF_SGS=$(aws sagemaker describe-domain --region=$SM_REGION --domain-id=$REF_DOMAIN_ID | jq -r '.DefaultUserSettings.SecurityGroups | join(",")')
   export AUTHMODE=$(aws sagemaker describe-domain --region=$SM_REGION --domain-id=$REF_DOMAIN_ID | jq -r '.AuthMode')
   ```

1. Validate the parameters.

   ```
   echo "Execution Role: $REF_EXECROLE || VPCID: $REF_VPC || SubnetIDs: $REF_SIDS || Security GroupIDs: $REF_SGS || AuthMode: $AUTHMODE"
   ```

1. Create a test domain using the configurations from the existing domain.

   ```
   IFS=',' read -r -a subnet_ids <<< "$REF_SIDS"
   IFS=',' read -r -a security_groups <<< "$REF_SGS"
   security_groups_json=$(printf '%s\n' "${security_groups[@]}" | jq -R . | jq -s .)
   
   aws sagemaker create-domain \
   --domain-name "TestV2Config" \
   --vpc-id $REF_VPC \
   --auth-mode $AUTHMODE \
   --subnet-ids "${subnet_ids[@]}" \
   --app-network-access-type VpcOnly \
   --default-user-settings "
   {
       \"ExecutionRole\": \"$REF_EXECROLE\",
       \"StudioWebPortal\": \"ENABLED\",
       \"DefaultLandingUri\": \"studio::\",
       \"SecurityGroups\": $security_groups_json
   }
   "
   ```

1. After the test domain's status is `InService`, use the test domain's ID to create a user profile. This user profile is used to launch and test applications.

   ```
   aws sagemaker create-user-profile \
   --region="$SM_REGION" --domain-id=test-domain-id \
   --user-profile-name test-network-user
   ```
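The `describe-domain` step above calls the API five times. As a sketch, you can cache the response once and parse each field locally with `jq`. The sample JSON below is hypothetical and trimmed; in practice, assign `domain_json=$(aws sagemaker describe-domain --region=$SM_REGION --domain-id=$REF_DOMAIN_ID)`:

```
# Hypothetical, trimmed describe-domain response used for illustration.
domain_json='{"AuthMode":"IAM","VpcId":"vpc-0abc1234","SubnetIds":["subnet-1","subnet-2"],"DefaultUserSettings":{"ExecutionRole":"arn:aws:iam::111122223333:role/ExecRole","SecurityGroups":["sg-1","sg-2"]}}'

# Parse each field from the cached JSON instead of repeating the API call.
export REF_EXECROLE=$(echo "$domain_json" | jq -r '.DefaultUserSettings.ExecutionRole')
export REF_VPC=$(echo "$domain_json" | jq -r '.VpcId')
export REF_SIDS=$(echo "$domain_json" | jq -r '.SubnetIds | join(",")')
export REF_SGS=$(echo "$domain_json" | jq -r '.DefaultUserSettings.SecurityGroups | join(",")')
export AUTHMODE=$(echo "$domain_json" | jq -r '.AuthMode')
echo "$REF_VPC $REF_SIDS $AUTHMODE"
# prints: vpc-0abc1234 subnet-1,subnet-2 IAM
```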

#### Test Studio functionality
<a name="studio-updated-migrate-ui-testing"></a>

Launch the test domain using the `test-network-user` user profile. We suggest that you thoroughly test the Studio UI and create applications to test Studio functionality in `VPCOnly` mode. Test the following workflows:
+ Create a new JupyterLab space, then test the environment and connectivity.
+ Create a new Code Editor (based on Code-OSS, Visual Studio Code - Open Source) space, then test the environment and connectivity.
+ Launch a new Studio Classic application, then test the environment and connectivity.
+ Test Amazon Simple Storage Service (Amazon S3) connectivity with test read and write actions.
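A minimal Amazon S3 connectivity check from a terminal in the test domain might look like the following sketch; the bucket name is a placeholder:

```
# Write a small object, then read it back. Replace the bucket name with one
# that the domain's execution role can access.
echo "connectivity-test" > /tmp/smoke.txt
aws s3 cp /tmp/smoke.txt s3://amzn-s3-demo-bucket/connectivity/smoke.txt
aws s3 cp s3://amzn-s3-demo-bucket/connectivity/smoke.txt -
```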

If these tests are successful, then upgrade the existing domain. If you encounter any failures, we recommend fixing your environment and connectivity issues before updating the existing domain.

#### Clean up test domain resources
<a name="studio-updated-migrate-ui-clean"></a>

After you have migrated the existing domain, clean up test domain resources.

1. Add the test domain's ID.

   ```
   export TEST_DOMAIN="test-domain-id"
   export SM_REGION="region"
   ```

1. List all applications in the domain that are in a running state.

   ```
   active_apps_json=$(aws sagemaker list-apps --region=$SM_REGION --domain-id=$TEST_DOMAIN)
   echo $active_apps_json
   ```

1. Parse the JSON list of running applications and delete them. If users attempted to create an application that they do not have permissions for, there may be spaces that are not captured in the following script. You must manually delete these spaces.

   ```
   echo "$active_apps_json" | jq -c '.Apps[]' | while read -r app;
   do
       if echo "$app" | jq -e '. | has("SpaceName")' > /dev/null;
       then
           app_type=$(echo "$app" | jq -r '.AppType')
           app_name=$(echo "$app" | jq -r '.AppName')
           domain_id=$(echo "$app" | jq -r '.DomainId')
           space_name=$(echo "$app" | jq -r '.SpaceName')
   
           echo "Deleting App - AppType: $app_type || AppName: $app_name || DomainId: $domain_id || SpaceName: $space_name"
           aws sagemaker delete-app --region=$SM_REGION --domain-id=$domain_id \
           --app-type $app_type --app-name $app_name --space-name $space_name
   
           echo "Deleting Space - AppType: $app_type || AppName: $app_name || DomainId: $domain_id || SpaceName: $space_name"
           aws sagemaker delete-space --region=$SM_REGION --domain-id=$domain_id \
           --space-name $space_name
       else
   
           app_type=$(echo "$app" | jq -r '.AppType')
           app_name=$(echo "$app" | jq -r '.AppName')
           domain_id=$(echo "$app" | jq -r '.DomainId')
           user_profile_name=$(echo "$app" | jq -r '.UserProfileName')
   
           echo "Deleting Studio Classic - AppType: $app_type || AppName: $app_name || DomainId: $domain_id || UserProfileName: $user_profile_name"
           aws sagemaker delete-app --region=$SM_REGION --domain-id=$domain_id \
           --app-type $app_type --app-name $app_name --user-profile-name $user_profile_name
   
       fi
   
   done
   ```

1. Delete the test user profile.

   ```
   aws sagemaker delete-user-profile \
   --region=$SM_REGION --domain-id=$TEST_DOMAIN \
   --user-profile-name "test-network-user"
   ```

1. Delete the test domain.

   ```
   aws sagemaker delete-domain \
   --region=$SM_REGION --domain-id=$TEST_DOMAIN
   ```
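Before the `delete-domain` step, you can optionally confirm that no applications remain. The following sketch uses the same environment variables and counts applications in any status other than `Deleted`:

```
aws sagemaker list-apps --region=$SM_REGION --domain-id=$TEST_DOMAIN \
| jq '.Apps | map(select(.Status != "Deleted")) | length'
```

A result of `0` indicates that all applications have been removed.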

After you have tested Studio functionality with the configurations in your test domain, migrate the existing domain. When Studio is the default experience for a domain, it is the default experience for all users in the domain. However, user settings take precedence over domain settings. Therefore, if a user has their default experience set to Studio Classic in their user settings, that user will have Studio Classic as their default experience. 

You can migrate the existing domain by updating it from the SageMaker AI console, the AWS CLI, or AWS CloudFormation. Choose one of the following tabs to view the relevant instructions.

### Set Studio as the default experience for the existing domain using the SageMaker AI console
<a name="studio-updated-migrate-set-studio-updated-console"></a>

You can set Studio as the default experience for the existing domain by using the SageMaker AI console.

1. Open the Amazon SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/).

1. From the left navigation pane expand **Admin configurations** and choose **Domains**. 

1. Choose the existing domain that you want to enable Studio as the default experience for.

1. On the **Domain details** page expand **Enable the new Studio**.

1. (Optional) To view the details about the steps involved in enabling Studio as your default experience, choose **View details**. The page shows the following.
   + In the **SageMaker Studio Overview** section you can view the applications that are included or available in the Studio web-based interface. 
   + In the **Enablement process** section you can view descriptions of the workflow tasks to enable Studio.
**Note**  
You will need to migrate your data manually. For instructions about migrating your data, see [(Optional) Migrate data from Studio Classic to Studio](studio-updated-migrate-data.md).
   + In the **Revert to Studio Classic experience** section you can view how to revert back to Studio Classic after enabling Studio as your default experience.

1. To begin the process to enable Studio as your default experience, choose **Enable the new Studio**.

1. In the **Specify and configure role** section, you can view the default applications that are automatically included in Studio.

   To prevent users from running these applications, choose the AWS Identity and Access Management (IAM) Role that has an IAM policy that denies access. For information about how to create a policy to limit access, see [Step 1: Update application creation permissions](#studio-updated-migrate-limit-apps).

1. In the **Choose default S3 bucket to attach CORS policy** section, you can give Studio access to Amazon S3 buckets. The default Amazon S3 bucket, in this case, is the default Amazon S3 bucket for your Studio Classic. In this step you can do the following:
   + Verify the domain’s default Amazon S3 bucket to attach the CORS policy to. If your domain does not have a default Amazon S3 bucket, SageMaker AI creates an Amazon S3 bucket with the correct CORS policy attached.
   + You can include up to 10 additional Amazon S3 buckets to attach the CORS policy to.

     If you wish to include more than 10 buckets, you can add them manually. For more information about manually attaching the CORS policy to your Amazon S3 buckets, see [(Optional) Update your CORS policy to access Amazon S3 buckets](#studio-updated-migrate-cors).

   To proceed, select the check box next to **Do you agree to overriding any existing CORS policy on the chosen Amazon S3 buckets?**.

1. The **Migrate data** section contains information about the different data storage volumes for Studio Classic and Studio. Your data will not be migrated automatically through this process. For instructions about migrating your data, lifecycle configurations, and JupyterLab extensions, see [(Optional) Migrate data from Studio Classic to Studio](studio-updated-migrate-data.md).

1. Once you have completed the tasks on the page and verified your configuration, choose **Enable the new Studio**.

### Set Studio as the default experience for the existing domain using the AWS CLI
<a name="studio-updated-migrate-set-studio-updated-cli"></a>

To set Studio as the default experience for the existing domain using the AWS CLI, use the [update-domain](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker/update-domain.html) call. You must set `ENABLED` as the value for `StudioWebPortal`, and set `studio::` as the value for `DefaultLandingUri` as part of the `default-user-settings` parameter. 

`StudioWebPortal` indicates if the Studio experience is the default experience and `DefaultLandingUri` indicates the default experience that the user is directed to when accessing the domain. In this example, setting these values on a domain level (in `default-user-settings`) makes Studio the default experience for users within the domain.

If a user within the domain has their `StudioWebPortal` set to `DISABLED` and `DefaultLandingUri` set to `app:JupyterServer:` on a user level (in `UserSettings`), this takes precedence over the domain settings. In other words, that user will have Studio Classic as their default experience, regardless of the domain settings. 
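As a sketch, that per-user override can be set with `update-user-profile`; the domain ID and user profile name below are placeholders:

```
aws sagemaker update-user-profile \
--domain-id existing-domain-id \
--user-profile-name example-user \
--user-settings '
{
    "StudioWebPortal": "DISABLED",
    "DefaultLandingUri": "app:JupyterServer:"
}
'
```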

The following code example shows how to set Studio as the default experience for users within the domain:

```
aws sagemaker update-domain \
--domain-id existing-domain-id \
--region region \
--default-user-settings '
{
    "StudioWebPortal": "ENABLED",
    "DefaultLandingUri": "studio::"
}
'
```
+ To obtain your `existing-domain-id`, use the following instructions:

**To get `existing-domain-id`**

  1. Open the Amazon SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/).

  1. From the left navigation pane, expand **Admin configurations** and choose **Domains**. 

  1. Choose the existing domain.

  1. On the **Domain details** page, choose the **Domain settings** tab.

  1. Copy the **Domain ID**.
+ To ensure you are using the correct AWS Region for your domain, use the following instructions: 

**To get `AWS Region`**

  1. Open the Amazon SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/).

  1. From the left navigation pane, expand **Admin configurations** and choose **Domains**. 

  1. Choose the existing domain.

  1. On the **Domain details** page, verify that this is the existing domain.

  1. Expand the AWS Region dropdown list at the top right of the SageMaker AI console, and use the Region code shown to the right of your AWS Region name (for example, `us-west-1`).
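
You can also skip the console and read the domain ID from the [list-domains](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker/list-domains.html) output. The following sketch filters a saved response with `jq`; the heredoc stands in for the real API output:

```shell
# Stand-in for `aws sagemaker list-domains` output; the domain ID is a placeholder.
cat > /tmp/list-domains.json <<'EOF'
{"Domains": [{"DomainId": "d-xxxxxxxxxxxx", "DomainName": "my-domain"}]}
EOF

# Extract the first domain's ID.
DOMAIN_ID=$(jq -r '.Domains[0].DomainId' /tmp/list-domains.json)
echo "Domain ID: $DOMAIN_ID"

# In practice, pipe the live call into the same filter:
# aws sagemaker list-domains | jq -r '.Domains[0].DomainId'
```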

After you migrate your default experience to Studio, you can give Studio access to Amazon S3 buckets. For example, you can include access to your Studio Classic default Amazon S3 bucket and additional Amazon S3 buckets. To do so, you must manually attach a [Cross-Origin Resource Sharing](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS) (CORS) configuration to the Amazon S3 buckets. For more information about how to manually attach the CORS policy to your Amazon S3 buckets, see [(Optional) Update your CORS policy to access Amazon S3 buckets](#studio-updated-migrate-cors).

Similarly, you can set Studio as the default experience when you create a domain from the AWS CLI using the [create-domain](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker/create-domain.html) call. 
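
A sketch of such a call follows. The domain name, networking IDs, and execution role ARN are placeholders, and your auth mode and VPC settings will differ:

```shell
# Sketch: create a new domain with Studio as the default experience.
# All IDs and the role ARN below are placeholders.
aws sagemaker create-domain \
    --domain-name my-studio-domain \
    --auth-mode IAM \
    --vpc-id vpc-xxxxxxxx \
    --subnet-ids subnet-xxxxxxxx \
    --default-user-settings '{
        "ExecutionRole": "arn:aws:iam::111122223333:role/my-sagemaker-execution-role",
        "StudioWebPortal": "ENABLED",
        "DefaultLandingUri": "studio::"
    }'
```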

### Set Studio as the default experience for the existing domain using AWS CloudFormation
<a name="studio-updated-migrate-set-studio-updated-cloud-formation"></a>

You can set the default experience when creating a domain using AWS CloudFormation. For a CloudFormation migration template, see [SageMaker Studio Administrator IaC Templates](https://github.com/aws-samples/sagemaker-studio-admin-iac-templates/tree/main?tab=readme-ov-file#phase-1-migration). For more information about creating a domain using CloudFormation, see [Creating Amazon SageMaker AI domain using CloudFormation](https://github.com/aws-samples/cloudformation-studio-domain?tab=readme-ov-file#creating-sagemaker-studio-domains-using-cloudformation).

For information about the domain resource supported by AWS CloudFormation, see [AWS::SageMaker::Domain](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-sagemaker-domain.html#cfn-sagemaker-domain-defaultusersettings).

After you migrate your default experience to Studio, you can give Studio access to Amazon S3 buckets. For example, you can include access to your Studio Classic default Amazon S3 bucket and additional Amazon S3 buckets. To do so, you must manually attach a [Cross-Origin Resource Sharing](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS) (CORS) configuration to the Amazon S3 buckets. For information about how to manually attach the CORS policy to your Amazon S3 buckets, see [(Optional) Update your CORS policy to access Amazon S3 buckets](#studio-updated-migrate-cors).

### (Optional) Update your CORS policy to access Amazon S3 buckets
<a name="studio-updated-migrate-cors"></a>

In Studio Classic, users can create, list, and upload files to Amazon Simple Storage Service (Amazon S3) buckets. To support the same experience in Studio, administrators must attach a [Cross-Origin Resource Sharing](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS) (CORS) configuration to the Amazon S3 buckets. This is required because Studio makes Amazon S3 calls from the internet browser. The browser invokes CORS on behalf of users. As a result, all of the requests to Amazon S3 buckets fail unless the CORS policy is attached to the Amazon S3 buckets.

You may need to manually attach the CORS policy to Amazon S3 buckets in the following cases.
+ If your existing default Amazon S3 bucket doesn’t have the correct CORS policy attached when you migrate the existing domain's default experience to Studio.
+ If you are using the AWS CLI to migrate the existing domain's default experience to Studio. For information about using the AWS CLI to migrate, see [Set Studio as the default experience for the existing domain using the AWS CLI](#studio-updated-migrate-set-studio-updated-cli).
+ If you want to attach the CORS policy to additional Amazon S3 buckets.

**Note**  
If you plan to use the SageMaker AI console to enable Studio as your default experience, the Amazon S3 buckets that you attach the CORS policy to will have their existing CORS policies overridden during the migration. For this reason, you can ignore the following manual instructions.  
However, if you have already used the SageMaker AI console to migrate and want to include more Amazon S3 buckets to attach the CORS policy to, then continue with the following manual instructions.

The following procedure shows how to manually add a CORS configuration to an Amazon S3 bucket.

**To add a CORS configuration to an Amazon S3 bucket**

1. Verify that there is an Amazon S3 bucket in the same AWS Region as the existing domain with the following name. For instructions, see [Viewing the properties for an Amazon S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/view-bucket-properties.html).

   ```
   sagemaker-region-account-id
   ```

1. Add a CORS configuration with the following content to the default Amazon S3 bucket. For instructions, see [Configuring cross-origin resource sharing (CORS)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html).

   ```
   [
       {
           "AllowedHeaders": [
               "*"
           ],
           "AllowedMethods": [
               "POST",
               "PUT",
               "GET",
               "HEAD",
               "DELETE"
           ],
           "AllowedOrigins": [
               "https://*.sagemaker.aws"
           ],
           "ExposeHeaders": [
               "ETag",
               "x-amz-delete-marker",
               "x-amz-id-2",
               "x-amz-request-id",
               "x-amz-server-side-encryption",
               "x-amz-version-id"
           ]
       }
   ]
   ```
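
You can also attach this configuration from the AWS CLI with [put-bucket-cors](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3api/put-bucket-cors.html). Note that the CLI expects the rules wrapped in a top-level `CORSRules` key; the bucket name in this sketch is a placeholder:

```shell
# Sketch: apply the CORS configuration shown above with the AWS CLI.
cat > /tmp/studio-cors.json <<'EOF'
{
    "CORSRules": [
        {
            "AllowedHeaders": ["*"],
            "AllowedMethods": ["POST", "PUT", "GET", "HEAD", "DELETE"],
            "AllowedOrigins": ["https://*.sagemaker.aws"],
            "ExposeHeaders": [
                "ETag",
                "x-amz-delete-marker",
                "x-amz-id-2",
                "x-amz-request-id",
                "x-amz-server-side-encryption",
                "x-amz-version-id"
            ]
        }
    ]
}
EOF

# Requires s3:PutBucketCors permission on the target bucket (placeholder name).
aws s3api put-bucket-cors \
    --bucket sagemaker-region-account-id \
    --cors-configuration file:///tmp/studio-cors.json
```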

### (Optional) Migrate from Data Wrangler in Studio Classic to SageMaker Canvas
<a name="studio-updated-migrate-dw"></a>

Amazon SageMaker Data Wrangler exists as its own feature in the Studio Classic experience. When you enable Studio as your default experience, use the [Amazon SageMaker Canvas](https://docs.aws.amazon.com/sagemaker/latest/dg/canvas.html) application to access Data Wrangler functionality. SageMaker Canvas is an application in which you can train and deploy machine learning models without writing any code, and Canvas provides data preparation features powered by Data Wrangler.

The new Studio experience doesn’t support the classic Data Wrangler UI, and you must create a Canvas application if you want to continue using Data Wrangler. However, you must have the necessary permissions to create and use Canvas applications.

Complete the following steps to attach the necessary permissions policies to your SageMaker AI domain's or user’s AWS IAM role.

**To grant permissions for Data Wrangler functionality inside Canvas**

1. Attach the AWS managed policy [AmazonSageMakerFullAccess](https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol.html#security-iam-awsmanpol-AmazonSageMakerFullAccess) to your user’s IAM role. For a procedure that shows you how to attach IAM policies to a role, see [Adding IAM identity permissions (console)](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#add-policies-console) in the *AWS IAM User Guide.*

   If this permissions policy is too permissive for your use case, you can create scoped-down policies that include at least the following permissions:

   ```
   {
       "Sid": "AllowStudioActions",
       "Effect": "Allow",
       "Action": [
           "sagemaker:CreatePresignedDomainUrl",
           "sagemaker:DescribeDomain",
           "sagemaker:ListDomains",
           "sagemaker:DescribeUserProfile",
           "sagemaker:ListUserProfiles",
           "sagemaker:DescribeSpace",
           "sagemaker:ListSpaces",
           "sagemaker:DescribeApp",
           "sagemaker:ListApps"
       ],
       "Resource": "*"
   },
   {
       "Sid": "AllowAppActionsForUserProfile",
       "Effect": "Allow",
       "Action": [
           "sagemaker:CreateApp",
           "sagemaker:DeleteApp"
       ],
       "Resource": "arn:aws:sagemaker:region:account-id:app/domain-id/user-profile-name/canvas/*",
       "Condition": {
           "Null": {
               "sagemaker:OwnerUserProfileArn": "true"
           }
       }
   }
   ```
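
   These statements are fragments. A complete customer managed policy wraps them in a standard policy document. For example:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Sid": "AllowStudioActions",
               "Effect": "Allow",
               "Action": [
                   "sagemaker:CreatePresignedDomainUrl",
                   "sagemaker:DescribeDomain",
                   "sagemaker:ListDomains",
                   "sagemaker:DescribeUserProfile",
                   "sagemaker:ListUserProfiles",
                   "sagemaker:DescribeSpace",
                   "sagemaker:ListSpaces",
                   "sagemaker:DescribeApp",
                   "sagemaker:ListApps"
               ],
               "Resource": "*"
           },
           {
               "Sid": "AllowAppActionsForUserProfile",
               "Effect": "Allow",
               "Action": [
                   "sagemaker:CreateApp",
                   "sagemaker:DeleteApp"
               ],
               "Resource": "arn:aws:sagemaker:region:account-id:app/domain-id/user-profile-name/canvas/*",
               "Condition": {
                   "Null": {
                       "sagemaker:OwnerUserProfileArn": "true"
                   }
               }
           }
       ]
   }
   ```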

1. Attach the AWS managed policy [AmazonSageMakerCanvasDataPrepFullAccess](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonSageMakerCanvasDataPrepFullAccess.html) to your user’s IAM role.

After attaching the necessary permissions, you can create a Canvas application and log in. For more information, see [Getting started with using Amazon SageMaker Canvas](canvas-getting-started.md).

When you’ve logged into Canvas, you can directly access Data Wrangler and begin creating data flows. For more information, see [Data preparation](canvas-data-prep.md) in the Canvas documentation.

### (Optional) Migrate from Autopilot in Studio Classic to SageMaker Canvas
<a name="studio-updated-migrate-autopilot"></a>

[Amazon SageMaker Autopilot](https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development.html) exists as its own feature in the Studio Classic experience. When you migrate to the updated Studio experience, use the [Amazon SageMaker Canvas](https://docs.aws.amazon.com/sagemaker/latest/dg/canvas.html) application to continue using the same automated machine learning (AutoML) capabilities through a user interface (UI). SageMaker Canvas is an application in which you can train and deploy machine learning models without writing any code, and Canvas provides a UI to run your AutoML tasks.

The new Studio experience doesn’t support the classic Autopilot UI. You must create a Canvas application if you want to continue using Autopilot's AutoML features via a UI. 

However, you must have the necessary permissions to create and use Canvas applications.
+ If you are accessing SageMaker Canvas from Studio, add those permissions to the execution role of your SageMaker AI domain or user profile.
+ If you are accessing SageMaker Canvas from the Console, add those permissions to your user’s AWS IAM role.
+ If you are accessing SageMaker Canvas via a [presigned URL](https://docs.aws.amazon.com/sagemaker/latest/dg/setting-up-canvas-sso.html#canvas-optional-access), add those permissions to the IAM role that you're using for Okta SSO access.

To enable AutoML capabilities in Canvas, add the following policies to your execution role or IAM user role.
+ AWS managed policy: [`AmazonSageMakerCanvasFullAccess`](https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol-canvas.html#security-iam-awsmanpol-AmazonSageMakerCanvasFullAccess)
+ Inline policy:

  ```
  {
      "Sid": "AllowAppActionsForUserProfile",
      "Effect": "Allow",
      "Action": [
          "sagemaker:CreateApp",
          "sagemaker:DeleteApp"
      ],
      "Resource": "arn:aws:sagemaker:region:account-id:app/domain-id/user-profile-name/canvas/*",
      "Condition": {
          "Null": {
              "sagemaker:OwnerUserProfileArn": "true"
          }
      }
  }
  ```

**To attach IAM policies to an execution role**

1. **Find the execution role attached to your SageMaker AI user profile**

   1. In the SageMaker AI console [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/), navigate to **Domains**, then choose your SageMaker AI domain.

   1. Choose your user profile. The execution role ARN is listed under **Execution role** on the **User Details** page. Make note of the execution role name in the ARN.

   1. In the IAM console [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/), choose **Roles**.

   1. Search for your role by name in the search field.

   1. Select the role.

1. **Add policies to the role**

   1. In the IAM console [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/), choose **Roles**.

   1. Search for your role by name in the search field.

   1. Select the role.

   1. In the **Permissions** tab, navigate to the dropdown menu **Add permissions**.

   1. Do one of the following:
      + For managed policies: Select **Attach policies**, then search for the name of the managed policy that you want to attach. Select the policy, then choose **Add permissions**.
      + For inline policies: Select **Create inline policy**, paste your policy in the **JSON** tab, choose **Next**, name your policy, and choose **Create policy**.

For a procedure that shows you how to attach IAM policies to a role, see [Adding IAM identity permissions (console)](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#add-policies-console) in the *AWS IAM User Guide.*
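
Alternatively, a sketch of the same attachments with the AWS CLI; the role name, inline policy name, and file path are placeholders:

```shell
# Placeholder role name.
ROLE_NAME=my-sagemaker-execution-role

# Attach the AWS managed policy.
aws iam attach-role-policy \
    --role-name "$ROLE_NAME" \
    --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerCanvasFullAccess

# Add the inline policy (policy.json contains the statement shown earlier,
# wrapped in a standard policy document).
aws iam put-role-policy \
    --role-name "$ROLE_NAME" \
    --policy-name canvas-app-actions \
    --policy-document file://policy.json
```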

After attaching the necessary permissions, you can create a Canvas application and log in. For more information, see [Getting started with using Amazon SageMaker Canvas](canvas-getting-started.md).

## Set Studio Classic as the default experience
<a name="studio-updated-migrate-revert"></a>

Administrators can revert to Studio Classic as the default experience for an existing domain. This can be done through the AWS CLI.

**Note**  
When Studio Classic is set as the default experience on a domain level, Studio Classic is the default experience for all users in the domain. However, settings on a user level take precedence over the domain-level settings. So if a user has their default experience set to Studio, that user will have Studio as their default experience.

To revert to Studio Classic as the default experience for the existing domain using the AWS CLI, use the [update-domain](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker/update-domain.html) call. As part of the `default-user-settings` parameter, you must set:
+ `StudioWebPortal` to `DISABLED`.
+ `DefaultLandingUri` to `app:JupyterServer:`.

`StudioWebPortal` indicates if the Studio experience is the default experience and `DefaultLandingUri` indicates the default experience that the user is directed to when accessing the domain. In this example, setting these values on a domain level (in `default-user-settings`) makes Studio Classic the default experience for users within the domain.

If a user within the domain has their `StudioWebPortal` set to `ENABLED` and `DefaultLandingUri` set to `studio::` on a user level (in `UserSettings`), this takes precedence over the domain level settings. In other words, that user will have Studio as their default experience, regardless of the domain level settings. 

The following code example shows how to set Studio Classic as the default experience for users within the domain:

```
aws sagemaker update-domain \
--domain-id existing-domain-id \
--region AWS Region \
--default-user-settings '
{
    "StudioWebPortal": "DISABLED",
    "DefaultLandingUri": "app:JupyterServer:"
}
'
```

Use the following instructions to obtain your `existing-domain-id`.

1. Open the Amazon SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/).

1. From the left navigation pane, expand **Admin configurations** and choose **Domains**. 

1. Choose the existing domain.

1. On the **Domain details** page, choose the **Domain settings** tab.

1. Copy the **Domain ID**.

To obtain your `AWS Region`, use the following instructions to ensure you are using the correct AWS Region for your domain.

1. Open the Amazon SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/).

1. From the left navigation pane, expand **Admin configurations** and choose **Domains**. 

1. Choose the existing domain.

1. On the **Domain details** page, verify that this is the existing domain.

  1. Expand the AWS Region dropdown list at the top right of the SageMaker AI console, and use the Region code shown to the right of your AWS Region name (for example, `us-west-1`).

# (Optional) Migrate custom images and lifecycle configurations
<a name="studio-updated-migrate-lcc"></a>

You must update your custom images and lifecycle configuration (LCC) scripts to work with the simplified local runtime model in Amazon SageMaker Studio. If you have not created custom images or lifecycle configurations in your domain, skip this phase.

Amazon SageMaker Studio Classic operates in a split environment with:
+ A `JupyterServer` application running the Jupyter Server. 
+ Studio Classic notebooks running on one or more `KernelGateway` applications. 

Studio has shifted away from a split environment. Studio runs JupyterLab and Code Editor (based on Code-OSS, Visual Studio Code - Open Source) applications in a local runtime model. For more information about the change in architecture, see [Boost productivity on Amazon SageMaker Studio](https://aws.amazon.com/blogs/machine-learning/boost-productivity-on-amazon-sagemaker-studio-introducing-jupyterlab-spaces-and-generative-ai-tools/).

## Migrate custom images
<a name="studio-updated-migrate-lcc-custom"></a>

Your existing Studio Classic custom images may not work in Studio. We recommend creating a new custom image that satisfies the requirements for use in Studio. The release of Studio simplifies the process of building custom images by providing [SageMaker Distribution](sagemaker-distribution.md) base images. SageMaker Distribution images include popular libraries and packages for machine learning, data science, and data analytics visualization. For a list of base SageMaker Distribution images and Amazon Elastic Container Registry account information, see [Amazon SageMaker Images Available for Use With Studio Classic Notebooks](notebooks-available-images.md).

To build a custom image, complete one of the following.
+ Extend a SageMaker Distribution image with custom packages and modules. These images are pre-configured with JupyterLab and Code Editor (based on Code-OSS, Visual Studio Code - Open Source).
+ Build a custom Dockerfile by following the instructions in [Bring your own image (BYOI)](studio-updated-byoi.md). You must install JupyterLab and the open source code-server on the image to make it compatible with Studio.
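
For the first option, a minimal Dockerfile sketch follows. The base image tag and the `micromamba`/`$MAMBA_USER` conventions reflect current SageMaker Distribution packaging and are assumptions to verify against the specific image version you choose:

```dockerfile
# Sketch: extend a SageMaker Distribution base image (tag is illustrative).
FROM public.ecr.aws/sagemaker/sagemaker-distribution:latest-cpu

# System packages require root; SageMaker Distribution images are Debian-based.
USER root
RUN apt-get update && \
    apt-get install -y --no-install-recommends graphviz && \
    rm -rf /var/lib/apt/lists/*

# Return to the non-root default user and add Python packages to the base env.
USER $MAMBA_USER
RUN micromamba install -y -n base -c conda-forge altair && \
    micromamba clean --all --yes
```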

## Migrate lifecycle configurations
<a name="studio-updated-migrate-lcc-lcc"></a>

Because of the simplified local runtime model in Studio, we recommend migrating the structure of your existing Studio Classic LCCs. In Studio Classic, you often had to create separate lifecycle configurations for the JupyterServer and KernelGateway applications. Because the JupyterServer and KernelGateway applications run on separate compute resources within Studio Classic, Studio Classic LCCs are one of two types:
+ JupyterServer LCCs: These LCCs mostly govern a user’s home actions, such as setting a proxy, creating environment variables, and auto-shutdown of resources.
+ KernelGateway LCCs: These LCCs govern Studio Classic notebook environment optimizations, such as updating the numpy package version in the `Data Science 3.0` kernel or installing the snowflake package in the `PyTorch 2.0 GPU` kernel.

In the simplified Studio architecture, you only need one LCC script that runs at application startup. While the migration of your LCC scripts varies based on your development environment, we recommend combining your JupyterServer and KernelGateway LCCs into a single combined LCC.

LCCs in Studio can be associated with one of the following applications: 
+ JupyterLab 
+ Code Editor

Users can select the LCC for the respective application type when creating a space or use the default LCC set by the admin.
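
As a sketch, a combined LCC folds the former JupyterServer and KernelGateway concerns into one startup script and registers it for the JupyterLab application type with [create-studio-lifecycle-config](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker/create-studio-lifecycle-config.html). The config name, environment variable, and package choices are placeholders:

```shell
# Combined startup script: environment setup (formerly a JupyterServer LCC
# concern) plus package setup (formerly a KernelGateway LCC concern).
cat > /tmp/on-start.sh <<'EOF'
#!/bin/bash
set -eux
export MY_TEAM_BUCKET=my-team-bucket
pip install --quiet --upgrade numpy
EOF

# The API expects the script content base64-encoded.
LCC_CONTENT=$(base64 /tmp/on-start.sh | tr -d '\n')

aws sagemaker create-studio-lifecycle-config \
    --studio-lifecycle-config-name my-combined-lcc \
    --studio-lifecycle-config-content "$LCC_CONTENT" \
    --studio-lifecycle-config-app-type JupyterLab
```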

**Note**  
Existing Studio Classic auto-shutdown scripts do not work with Studio. For an example Studio auto-shutdown script, see [SageMaker Studio Lifecycle Configuration examples](https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples).

### Considerations when refactoring LCCs
<a name="studio-updated-migrate-lcc-considerations"></a>

Consider the following differences between Studio Classic and Studio when refactoring your LCCs.
+ When created, JupyterLab and Code Editor applications run as `sagemaker-user` with `UID:1001` and `GID:101`. By default, `sagemaker-user` can assume sudo (root) permissions. By default, KernelGateway applications run as `root`.
+ SageMaker Distribution images that run inside JupyterLab and Code Editor apps use the Debian-based package manager, `apt-get`.
+ Studio JupyterLab and Code Editor applications use the Conda package manager. SageMaker AI creates a single base Python3 Conda environment when a Studio application is launched. For information about updating packages in the base Conda environment and creating new Conda environments, see [JupyterLab user guide](studio-updated-jl-user-guide.md). In contrast, not all KernelGateway applications use Conda as a package manager.
+ The Studio JupyterLab application uses `JupyterLab 4.0`, while Studio Classic uses `JupyterLab 3.0`. Validate that all JupyterLab extensions you use are compatible with `JupyterLab 4.0`. For more information about extensions, see [Extension Compatibility with JupyterLab 4.0](https://github.com/jupyterlab/jupyterlab/issues/14590).

# (Optional) Migrate data from Studio Classic to Studio
<a name="studio-updated-migrate-data"></a>

Studio Classic and Studio use two different types of storage volumes. Studio Classic uses a single Amazon Elastic File System (Amazon EFS) volume to store data across all users and shared spaces in the domain. In Studio, each space gets its own Amazon Elastic Block Store (Amazon EBS) volume. When you update the default experience of an existing domain, SageMaker AI automatically mounts a folder in an Amazon EFS volume for each user in a domain. As a result, users are able to access files from Studio Classic in their Studio applications. For more information, see [Amazon EFS auto-mounting in Studio](studio-updated-automount.md). 

You can also opt out of Amazon EFS auto-mounting and manually migrate the data to give users access to files from Studio Classic in Studio applications. To accomplish this, you must transfer the files from the user home directories to the Amazon EBS volumes associated with those spaces. The following section gives information about this workflow. For more information about opting out of Amazon EFS auto-mounting, see [Opt out of Amazon EFS auto-mounting](studio-updated-automount-optout.md).

## Manually migrate all of your data from Studio Classic
<a name="studio-updated-migrate-data-all"></a>

The following section describes how to migrate all of the data from your Studio Classic storage volume to the new Studio experience.

When manually migrating a user's data, code, and artifacts from Studio Classic to Studio, we recommend one of the following approaches:

1. Using a custom Amazon EFS volume

1. Using Amazon Simple Storage Service (Amazon S3)

If you used Amazon SageMaker Data Wrangler in Studio Classic and want to migrate your data flow files, then choose one of the following options for migration:
+ If you want to migrate all of the data from your Studio Classic storage volume, including your data flow files, go to [Manually migrate all of your data from Studio Classic](#studio-updated-migrate-data-all) and complete the section **Use Amazon S3 to migrate data**. Then, skip to the [Import the flow files into Canvas](#studio-updated-migrate-flows-import) section.
+ If you only want to migrate your data flow files and no other data from your Studio Classic storage volume, skip to the [Migrate data flows from Data Wrangler](#studio-updated-migrate-flows) section.

### Prerequisites
<a name="studio-updated-migrate-data-prereq"></a>

Before running these steps, complete the prerequisites in [Complete prerequisites to migrate the Studio experience](studio-updated-migrate-prereq.md). You must also complete the steps in [Migrate the UI from Studio Classic to Studio](studio-updated-migrate-ui.md).

### Choosing an approach
<a name="studio-updated-migrate-data-choose"></a>

Consider the following when choosing an approach to migrate your Studio Classic data.

**Pros and cons of using a custom Amazon EFS volume**

In this approach, you use an Amazon EFS-to-Amazon EFS AWS DataSync task (run once or on a schedule) to copy data, then mount the target Amazon EFS volume to a user’s spaces. This gives users access to data from Studio Classic in their Studio compute environments.

Pros:
+ Only the user’s home directory data is visible in the user's spaces. There is no data cross-pollination.
+ Syncing from the source Amazon EFS volume to a target Amazon EFS volume is safer than directly mounting the source Amazon EFS volume managed by SageMaker AI into spaces. This avoids the potential to impact home directory user files.
+ Users have the flexibility to continue working in Studio Classic and Studio applications, while having their data available in both applications if AWS DataSync is set up on a regular cadence.
+ No need for repeated push and pull with Amazon S3.

Cons:
+ No write access to the target Amazon EFS volume mounted to users' spaces. To get write access to the target Amazon EFS volume, customers would need to mount the target Amazon EFS volume to an Amazon Elastic Compute Cloud (Amazon EC2) instance and provide appropriate permissions for users to write to the Amazon EFS prefix.
+ Requires modification to the security groups managed by SageMaker AI to allow network file system (NFS) inbound and outbound flow.
+ Costs more than using Amazon S3.
+ If [migrating data flows from Data Wrangler in Studio Classic](#studio-updated-migrate-flows), you must follow the steps for manually exporting flow files.

**Pros and cons of using Amazon S3**

In this approach, you use an Amazon EFS-to-Amazon S3 AWS DataSync task (run once or on a schedule) to copy data, then create a lifecycle configuration to copy the user’s data from Amazon S3 to their private space’s Amazon EBS volume.

Pros:
+ If the LCC is attached to the domain, users can choose to use the LCC to copy data to their space or to run the space with no LCC script. This gives users the choice to copy their files only to the spaces they need.
+ If an AWS DataSync task is set up on a cadence, users can restart their Studio application to get the latest files.
+ Because the data is copied over to Amazon EBS, users have write permissions on the files.
+ Amazon S3 storage is cheaper than Amazon EFS.
+ If [migrating data flows from Data Wrangler in Studio Classic](#studio-updated-migrate-flows), you can skip the manual export steps and directly import the data flows into SageMaker Canvas from Amazon S3.

Cons:
+ If administrators need to prevent cross-pollination, they must create AWS Identity and Access Management policies at the user level to ensure users can only access the Amazon S3 prefix that contains their files.

### Use a custom Amazon EFS volume to migrate data
<a name="studio-updated-migrate-data-approach1"></a>

In this approach, you use an Amazon EFS-to-Amazon EFS AWS DataSync task to copy the contents of the Studio Classic Amazon EFS volume to a target Amazon EFS volume, either once or on a regular cadence, and then mount the target Amazon EFS volume to a user’s spaces. This gives users access to data from Studio Classic in their Studio compute environments.

1. Create a target Amazon EFS volume. You will transfer data into this Amazon EFS volume and mount it to a corresponding user's space using prefix-level mounting.

   ```
   export SOURCE_DOMAIN_ID="domain-id"
   export REGION="region"
   
   export TARGET_EFS=$(aws efs create-file-system --performance-mode generalPurpose --throughput-mode bursting --encrypted --region $REGION | jq -r '.FileSystemId')
   
   echo "Target EFS volume created: $TARGET_EFS"
   ```

1. Add variables for the source Amazon EFS volume currently attached to the domain and used by all users. The domain's Amazon Virtual Private Cloud information is required to ensure the target Amazon EFS is created in the same Amazon VPC and subnet, with the same security group configuration.

   ```
   export SOURCE_EFS=$(aws sagemaker describe-domain --domain-id $SOURCE_DOMAIN_ID | jq -r '.HomeEfsFileSystemId')
   export VPC_ID=$(aws sagemaker describe-domain --domain-id $SOURCE_DOMAIN_ID | jq -r '.VpcId')
   
   echo "EFS managed by SageMaker: $SOURCE_EFS | VPC: $VPC_ID"
   ```

1. Create an Amazon EFS mount target in the same Amazon VPC and subnet as the source Amazon EFS volume, with the same security group configuration. The mount target takes a few minutes to be available.

   ```
   export EFS_VPC_ID=$(aws efs describe-mount-targets --file-system-id $SOURCE_EFS | jq -r ".MountTargets[0].VpcId")
   export EFS_AZ_NAME=$(aws efs describe-mount-targets --file-system-id $SOURCE_EFS | jq -r ".MountTargets[0].AvailabilityZoneName")
   export EFS_AZ_ID=$(aws efs describe-mount-targets --file-system-id $SOURCE_EFS | jq -r ".MountTargets[0].AvailabilityZoneId")
   export EFS_SUBNET_ID=$(aws efs describe-mount-targets --file-system-id $SOURCE_EFS | jq -r ".MountTargets[0].SubnetId")
   export EFS_MOUNT_TARG_ID=$(aws efs describe-mount-targets --file-system-id $SOURCE_EFS | jq -r ".MountTargets[0].MountTargetId")
   export EFS_SG_IDS=$(aws efs describe-mount-target-security-groups --mount-target-id $EFS_MOUNT_TARG_ID | jq -r '.SecurityGroups[]')
   
   aws efs create-mount-target \
   --file-system-id $TARGET_EFS \
   --subnet-id $EFS_SUBNET_ID \
   --security-groups $EFS_SG_IDS
   ```

1. Create Amazon EFS source and destination locations for the AWS DataSync task.

   ```
   export SOURCE_EFS_ARN=$(aws efs describe-file-systems --file-system-id $SOURCE_EFS | jq -r ".FileSystems[0].FileSystemArn")
   export TARGET_EFS_ARN=$(aws efs describe-file-systems --file-system-id $TARGET_EFS | jq -r ".FileSystems[0].FileSystemArn")
   export EFS_SUBNET_ID_ARN=$(aws ec2 describe-subnets --subnet-ids $EFS_SUBNET_ID | jq -r ".Subnets[0].SubnetArn")
   export ACCOUNT_ID=$(aws ec2 describe-security-groups --group-ids $EFS_SG_IDS | jq -r ".SecurityGroups[0].OwnerId")
   export EFS_SG_ID_ARN=arn:aws:ec2:$REGION:$ACCOUNT_ID:security-group/$EFS_SG_IDS
   
   export SOURCE_LOCATION_ARN=$(aws datasync create-location-efs --subdirectory "/" --efs-filesystem-arn $SOURCE_EFS_ARN --ec2-config SubnetArn=$EFS_SUBNET_ID_ARN,SecurityGroupArns=$EFS_SG_ID_ARN --region $REGION | jq -r ".LocationArn")
   export DESTINATION_LOCATION_ARN=$(aws datasync create-location-efs --subdirectory "/" --efs-filesystem-arn $TARGET_EFS_ARN --ec2-config SubnetArn=$EFS_SUBNET_ID_ARN,SecurityGroupArns=$EFS_SG_ID_ARN --region $REGION | jq -r ".LocationArn")
   ```

1. Allow traffic between the source and target network file system (NFS) mounts. When a new domain is created, SageMaker AI creates two security groups:
   + NFS inbound security group with only inbound traffic.
   + NFS outbound security group with only outbound traffic.

   The source and target NFS are placed inside the same security groups. You can allow traffic between these mounts from the AWS Management Console or AWS CLI.
   + Allow traffic from the AWS Management Console

     1. Sign in to the AWS Management Console and open the Amazon VPC console at [https://console.aws.amazon.com/vpc/](https://console.aws.amazon.com/vpc/).

     1. Choose **Security Groups**.

     1. Search for the existing domain's ID on the **Security Groups** page.

        ```
        d-xxxxxxx
        ```

        The results should return two security groups that include the domain ID in the name.
        + `security-group-for-inbound-nfs-domain-id`
        + `security-group-for-outbound-nfs-domain-id`

     1. Select the inbound security group ID. This opens a new page with details about the security group.

     1. Select the **Outbound Rules** tab.

     1. Select **Edit outbound rules**.

     1. Update the existing outbound rules or add a new outbound rule with the following values:
        + **Type**: NFS
        + **Protocol**: TCP
        + **Port range**: 2049
         + **Destination**: the ID of the security-group-for-outbound-nfs-*domain-id* security group

     1. Choose **Save rules**.

     1. Select the **Inbound Rules** tab.

     1. Select **Edit inbound rules**.

      1. Update the existing inbound rules or add a new inbound rule with the following values:
         + **Type**: NFS
         + **Protocol**: TCP
         + **Port range**: 2049
         + **Source**: the ID of the security-group-for-outbound-nfs-*domain-id* security group

     1. Choose **Save rules**.
   + Allow traffic from the AWS CLI

      1. Update the security group inbound and outbound rules with the following values:
        + **Protocol**: TCP
        + **Port range**: 2049
        + **Group ID**: Inbound security group ID or outbound security group ID

        ```
        export INBOUND_SG_ID=$(aws ec2 describe-security-groups --filters "Name=group-name,Values=security-group-for-inbound-nfs-$SOURCE_DOMAIN_ID" | jq -r ".SecurityGroups[0].GroupId")
        export OUTBOUND_SG_ID=$(aws ec2 describe-security-groups --filters "Name=group-name,Values=security-group-for-outbound-nfs-$SOURCE_DOMAIN_ID" | jq -r ".SecurityGroups[0].GroupId")
        
        echo "Outbound SG ID: $OUTBOUND_SG_ID | Inbound SG ID: $INBOUND_SG_ID"
        aws ec2 authorize-security-group-egress \
        --group-id $INBOUND_SG_ID \
        --protocol tcp --port 2049 \
        --source-group $OUTBOUND_SG_ID
        
        aws ec2 authorize-security-group-ingress \
        --group-id $OUTBOUND_SG_ID \
        --protocol tcp --port 2049 \
        --source-group $INBOUND_SG_ID
        ```

      1. Add both the inbound and outbound security groups to the source and target Amazon EFS mount targets. This allows traffic between the two Amazon EFS mounts.

        ```
        export SOURCE_EFS_MOUNT_TARGET=$(aws efs describe-mount-targets --file-system-id $SOURCE_EFS | jq -r ".MountTargets[0].MountTargetId")
        export TARGET_EFS_MOUNT_TARGET=$(aws efs describe-mount-targets --file-system-id $TARGET_EFS | jq -r ".MountTargets[0].MountTargetId")
        
        aws efs modify-mount-target-security-groups \
        --mount-target-id $SOURCE_EFS_MOUNT_TARGET \
        --security-groups $INBOUND_SG_ID $OUTBOUND_SG_ID
        
        aws efs modify-mount-target-security-groups \
        --mount-target-id $TARGET_EFS_MOUNT_TARGET \
        --security-groups $INBOUND_SG_ID $OUTBOUND_SG_ID
        ```

1. Create an AWS DataSync task. This returns a task ARN that can be used to run the task on demand or on a regular cadence.

   ```
   export EXTRA_XFER_OPTIONS='VerifyMode=ONLY_FILES_TRANSFERRED,OverwriteMode=ALWAYS,Atime=NONE,Mtime=NONE,Uid=NONE,Gid=NONE,PreserveDeletedFiles=REMOVE,PreserveDevices=NONE,PosixPermissions=NONE,TaskQueueing=ENABLED,TransferMode=CHANGED,SecurityDescriptorCopyFlags=NONE,ObjectTags=NONE'
   export DATASYNC_TASK_ARN=$(aws datasync create-task --source-location-arn $SOURCE_LOCATION_ARN --destination-location-arn $DESTINATION_LOCATION_ARN --name "SMEFS_to_CustomEFS_Sync" --region $REGION --options $EXTRA_XFER_OPTIONS | jq -r ".TaskArn")
   ```

1. Start the AWS DataSync task to automatically copy data from the source Amazon EFS volume to the target Amazon EFS mount. The task does not retain the files' POSIX permissions, so users can read from the target Amazon EFS mount, but not write to it.

   ```
   aws datasync start-task-execution --task-arn $DATASYNC_TASK_ARN
   ```

1. Mount the target Amazon EFS volume on the domain at the root level.

   ```
   aws sagemaker update-domain --domain-id $SOURCE_DOMAIN_ID \
   --default-user-settings '{"CustomFileSystemConfigs": [{"EFSFileSystemConfig": {"FileSystemId": "'"$TARGET_EFS"'", "FileSystemPath": "/"}}]}'
   ```

1. Overwrite every user profile with a `FileSystemPath` prefix. The prefix includes the user's UID, which is created by SageMaker AI. This ensures that users only have access to their own data and prevents cross-pollination. When a space is created in the domain and the target Amazon EFS volume is mounted to the application, the user's prefix overwrites the domain prefix. As a result, SageMaker AI only mounts the `/user-id` directory on the user's application.

   ```
   aws sagemaker list-user-profiles --domain-id $SOURCE_DOMAIN_ID | jq -r '.UserProfiles[] | "\(.UserProfileName)"' | while read user; do
   export uid=$(aws sagemaker describe-user-profile --domain-id $SOURCE_DOMAIN_ID --user-profile-name $user | jq -r ".HomeEfsFileSystemUid")
   echo "$user $uid"
   aws sagemaker update-user-profile --domain-id $SOURCE_DOMAIN_ID --user-profile-name $user --user-settings '{"CustomFileSystemConfigs": [{"EFSFileSystemConfig":{"FileSystemId": "'"$TARGET_EFS"'", "FileSystemPath": "'"/$uid/"'"}}]}'
   done
   ```

1. Users can then select the custom Amazon EFS filesystem when launching an application. For more information, see [JupyterLab user guide](studio-updated-jl-user-guide.md) or [Launch a Code Editor application in Studio](code-editor-use-studio.md).

### Use Amazon S3 to migrate data
<a name="studio-updated-migrate-data-approach2"></a>

In this approach, you use an Amazon EFS-to-Amazon S3 AWS DataSync task to copy the contents of a Studio Classic Amazon EFS volume to an Amazon S3 bucket, either once or on a regular cadence. You then create a lifecycle configuration to copy the user's data from Amazon S3 to their private space's Amazon EBS volume.

**Note**  
This approach only works for domains that have internet access.

1. Set the source Amazon EFS volume ID from the domain containing the data that you are migrating.

   ```
   timestamp=$(date +%Y%m%d%H%M%S)
   export SOURCE_DOMAIN_ID="domain-id"
   export REGION="region"
   export ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
   export EFS_ID=$(aws sagemaker describe-domain --domain-id $SOURCE_DOMAIN_ID | jq -r '.HomeEfsFileSystemId')
   ```

1. Set the target Amazon S3 bucket name. For information about creating an Amazon S3 bucket, see [Creating a bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html). The bucket used must have a CORS policy as described in [(Optional) Update your CORS policy to access Amazon S3 buckets](studio-updated-migrate-ui.md#studio-updated-migrate-cors). Users in the domain must also have permissions to access the Amazon S3 bucket.

   In this example, we are copying files to a prefix named `studio-new`. If you are using a single Amazon S3 bucket to migrate multiple domains, use the `studio-new/<domain-id>` prefix to restrict permissions to the files using IAM.

   ```
   export BUCKET_NAME=s3-bucket-name
   export S3_DESTINATION_PATH=studio-new
   ```
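   As an illustration of restricting access by prefix, each user profile's IAM execution role could be limited to its domain's prefix with a policy statement along these lines. This is a hypothetical sketch, not a complete policy; substitute your bucket name and domain ID:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Sid": "AllowDomainPrefixOnly",
               "Effect": "Allow",
               "Action": ["s3:GetObject", "s3:ListBucket"],
               "Resource": [
                   "arn:aws:s3:::s3-bucket-name",
                   "arn:aws:s3:::s3-bucket-name/studio-new/domain-id/*"
               ]
           }
       ]
   }
   ```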

1. Create a trust policy that gives AWS DataSync permission to assume an IAM role in your account.

   ```
   export TRUST_POLICY=$(cat <<EOF
   {
        "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Principal": {
                   "Service": "datasync.amazonaws.com"
               },
               "Action": "sts:AssumeRole",
               "Condition": {
                   "StringEquals": {
                       "aws:SourceAccount": "$ACCOUNT_ID"
                   },
                   "ArnLike": {
                       "aws:SourceArn": "arn:aws:datasync:$REGION:$ACCOUNT_ID:*"
                   }
               }
           }
       ]
   }
   EOF
   )
   ```

1. Create an IAM role and attach the trust policy.

   ```
   export timestamp=$(date +%Y%m%d%H%M%S)
   export ROLE_NAME="DataSyncS3Role-$timestamp"
   
   aws iam create-role --role-name $ROLE_NAME --assume-role-policy-document "$TRUST_POLICY"
   aws iam attach-role-policy --role-name $ROLE_NAME --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
   echo "Attached IAM Policy AmazonS3FullAccess"
   aws iam attach-role-policy --role-name $ROLE_NAME --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
   echo "Attached IAM Policy AmazonSageMakerFullAccess"
   export ROLE_ARN=$(aws iam get-role --role-name $ROLE_NAME --query 'Role.Arn' --output text)
   echo "Created IAM Role $ROLE_ARN"
   ```

1. Create a security group to give access to the Amazon EFS location.

   ```
   export EFS_ARN=$(aws efs describe-file-systems --file-system-id $EFS_ID | jq -r '.FileSystems[0].FileSystemArn' )
   export EFS_SUBNET_ID=$(aws efs describe-mount-targets --file-system-id $EFS_ID | jq -r '.MountTargets[0].SubnetId')
   export EFS_VPC_ID=$(aws efs describe-mount-targets --file-system-id $EFS_ID | jq -r '.MountTargets[0].VpcId')
   export MOUNT_TARGET_ID=$(aws efs describe-mount-targets --file-system-id $EFS_ID | jq -r '.MountTargets[0].MountTargetId ')
   export EFS_SECURITY_GROUP_ID=$(aws efs describe-mount-target-security-groups --mount-target-id $MOUNT_TARGET_ID | jq -r '.SecurityGroups[0]')
   export EFS_SUBNET_ARN=$(aws ec2 describe-subnets --subnet-ids $EFS_SUBNET_ID | jq -r '.Subnets[0].SubnetArn')
   echo "Subnet ID: $EFS_SUBNET_ID"
   echo "Security Group ID: $EFS_SECURITY_GROUP_ID"
   echo "Subnet ARN: $EFS_SUBNET_ARN"
   
   timestamp=$(date +%Y%m%d%H%M%S)
   sg_name="datasync-sg-$timestamp"
   export DATASYNC_SG_ID=$(aws ec2 create-security-group --vpc-id $EFS_VPC_ID --group-name $sg_name --description "DataSync SG" --output text --query 'GroupId')
   aws ec2 authorize-security-group-egress --group-id $DATASYNC_SG_ID --protocol tcp --port 2049 --source-group $EFS_SECURITY_GROUP_ID
   aws ec2 authorize-security-group-ingress --group-id $EFS_SECURITY_GROUP_ID --protocol tcp --port 2049 --source-group $DATASYNC_SG_ID
   export DATASYNC_SG_ARN="arn:aws:ec2:$REGION:$ACCOUNT_ID:security-group/$DATASYNC_SG_ID"
   echo "Security Group ARN: $DATASYNC_SG_ARN"
   ```

1. Create a source Amazon EFS location for the AWS DataSync task.

   ```
   export SOURCE_ARN=$(aws datasync create-location-efs --efs-filesystem-arn $EFS_ARN --ec2-config "{\"SubnetArn\": \"$EFS_SUBNET_ARN\", \"SecurityGroupArns\": [\"$DATASYNC_SG_ARN\"]}" | jq -r '.LocationArn')
   echo "Source Location ARN: $SOURCE_ARN"
   ```

1. Create a target Amazon S3 location for the AWS DataSync task.

   ```
   export BUCKET_ARN="arn:aws:s3:::$BUCKET_NAME"
   export DESTINATION_ARN=$(aws datasync create-location-s3 --s3-bucket-arn $BUCKET_ARN --s3-config "{\"BucketAccessRoleArn\": \"$ROLE_ARN\"}" --subdirectory $S3_DESTINATION_PATH | jq -r '.LocationArn')
   echo "Destination Location ARN: $DESTINATION_ARN"
   ```

1. Create an AWS DataSync task.

   ```
   export TASK_ARN=$(aws datasync create-task --source-location-arn $SOURCE_ARN --destination-location-arn $DESTINATION_ARN | jq -r '.TaskArn')
   echo "DataSync Task: $TASK_ARN"
   ```

1. Start the AWS DataSync task. This task automatically copies data from the source Amazon EFS volume to the target Amazon S3 bucket. Wait for the task to complete.

   ```
   aws datasync start-task-execution --task-arn $TASK_ARN
   ```

1. Check the status of the AWS DataSync task to verify that it is complete. Set the task execution ARN that the `start-task-execution` command returned in the previous step.

   ```
   export TASK_EXEC_ARN=datasync-task-execution-arn
   echo "Task execution ARN: $TASK_EXEC_ARN"
   export STATUS=$(aws datasync describe-task-execution --task-execution-arn $TASK_EXEC_ARN | jq -r '.Status')
   echo "Execution status: $STATUS"
   while [ "$STATUS" = "QUEUED" ] || [ "$STATUS" = "LAUNCHING" ] || [ "$STATUS" = "PREPARING" ] || [ "$STATUS" = "TRANSFERRING" ] || [ "$STATUS" = "VERIFYING" ]; do
       STATUS=$(aws datasync describe-task-execution --task-execution-arn $TASK_EXEC_ARN | jq -r '.Status')
       if [ $? -ne 0 ]; then
           echo "Error Running DataSync Task"
           exit 1
       fi
       echo "Execution status: $STATUS"
       sleep 30
   done
   ```

1. After the AWS DataSync task is complete, clean up the previously created resources.

   ```
   aws datasync delete-task --task-arn $TASK_ARN
   echo "Deleted task $TASK_ARN"
   aws datasync delete-location --location-arn $SOURCE_ARN
   echo "Deleted location source $SOURCE_ARN"
   aws datasync delete-location --location-arn $DESTINATION_ARN
   echo "Deleted location source $DESTINATION_ARN"
   aws iam detach-role-policy --role-name $ROLE_NAME --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
   aws iam detach-role-policy --role-name $ROLE_NAME --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
   aws iam delete-role --role-name $ROLE_NAME
   echo "Deleted IAM Role $ROLE_NAME"
   echo "Wait 5 minutes for the elastic network interface to detach..."
   sleep 300
   aws ec2 revoke-security-group-ingress --group-id $EFS_SECURITY_GROUP_ID --protocol tcp --port 2049 --source-group $DATASYNC_SG_ID
   echo "Revoked Ingress from $EFS_SECURITY_GROUP_ID"
   aws ec2 revoke-security-group-egress --group-id $DATASYNC_SG_ID --protocol tcp --port 2049 --source-group $EFS_SECURITY_GROUP_ID
   echo "Revoked Egress from $DATASYNC_SG_ID"
   aws ec2 delete-security-group --group-id $DATASYNC_SG_ID
   echo "Deleted DataSync SG $DATASYNC_SG_ID"
   ```

1. From your local machine, create a file named `on-start.sh` with the following content. The script copies the user's Amazon EFS home directory data from Amazon S3 to the user's Amazon EBS volume in Studio, using the user profile's `HomeEfsFileSystemUid` to locate the correct prefix in Amazon S3.

   ```
   #!/bin/bash
   set -eo pipefail
   
   sudo apt-get install -y jq
   
   # Studio Variables
   DOMAIN_ID=$(cat /opt/ml/metadata/resource-metadata.json | jq -r '.DomainId')
   SPACE_NAME=$(cat /opt/ml/metadata/resource-metadata.json | jq -r '.SpaceName')
   USER_PROFILE_NAME=$(aws sagemaker describe-space --domain-id=$DOMAIN_ID --space-name=$SPACE_NAME | jq -r '.OwnershipSettings.OwnerUserProfileName')
   
   # S3 bucket to copy from
   BUCKET=s3-bucket-name
   # Subfolder in bucket to copy
   PREFIX=studio-new
   
   # Getting HomeEfsFileSystemUid for the current user-profile
   EFS_FOLDER_ID=$(aws sagemaker describe-user-profile --domain-id $DOMAIN_ID --user-profile-name $USER_PROFILE_NAME | jq -r '.HomeEfsFileSystemUid')
   
   # Local destination directory
   DEST=./studio-classic-efs-backup
   mkdir -p $DEST
   
   echo "Bucket: s3://$BUCKET/$PREFIX/$EFS_FOLDER_ID/"
   echo "Destination $DEST/"
   echo "Excluding .*"
   echo "Excluding **/.*"
   
   aws s3 cp s3://$BUCKET/$PREFIX/$EFS_FOLDER_ID/ $DEST/ \
       --exclude ".*" \
       --exclude "**/.*" \
       --recursive
   ```

1. Convert your script to base64 format. This prevents errors that occur from spacing and line-break encoding. The script type can be either `JupyterLab` or `CodeEditor`.

   ```
   export LCC_SCRIPT_NAME='studio-classic-sync'
   export SCRIPT_FILE_NAME='on-start.sh'
   export SCRIPT_TYPE='JupyterLab-or-CodeEditor'
   LCC_CONTENT=`openssl base64 -A -in ${SCRIPT_FILE_NAME}`
   ```
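   Before creating the lifecycle configuration, you can optionally verify that the encoding round-trips cleanly by decoding it and comparing the result against the original script. This is a local sanity check; `cmp` is silent when the files match:

   ```
   # Decode the base64 content back to a file and verify it matches the original.
   printf '%s' "${LCC_CONTENT}" | openssl base64 -d -A > decoded-check.sh
   cmp "${SCRIPT_FILE_NAME}" decoded-check.sh && echo "Round-trip OK"
   ```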

1. Verify the following before you use the script: 
   + The Amazon EBS volume is large enough to store the objects that you're exporting.
   + You aren't migrating hidden files and folders, such as `.bashrc` and `.condarc`, unless you intend to do so.
   + The AWS Identity and Access Management (IAM) execution role that's associated with Studio user profiles has the policies configured to access only the respective home directory in Amazon S3.

1. Create a lifecycle configuration using your script.

   ```
   aws sagemaker create-studio-lifecycle-config \
       --studio-lifecycle-config-name $LCC_SCRIPT_NAME \
       --studio-lifecycle-config-content $LCC_CONTENT \
       --studio-lifecycle-config-app-type $SCRIPT_TYPE
   ```

1. Attach the lifecycle configuration (LCC) to your domain. Replace `lifecycle-config-arn` with the ARN returned by the previous command.

   ```
   aws sagemaker update-domain \
       --domain-id $SOURCE_DOMAIN_ID \
       --default-user-settings '
           {"JupyterLabAppSettings":
               {"LifecycleConfigArns":
                   [
                       "lifecycle-config-arn"
                   ]
               }
           }'
   ```

1. Users can then select the LCC script when launching an application. For more information, see [JupyterLab user guide](studio-updated-jl-user-guide.md) or [Launch a Code Editor application in Studio](code-editor-use-studio.md). This automatically syncs the files from Amazon S3 to the Amazon EBS storage for the user's space.

## Migrate data flows from Data Wrangler
<a name="studio-updated-migrate-flows"></a>

If you have previously used Amazon SageMaker Data Wrangler in Amazon SageMaker Studio Classic for data preparation tasks, you can migrate to the new Amazon SageMaker Studio and access the latest version of Data Wrangler in Amazon SageMaker Canvas. Data Wrangler in SageMaker Canvas provides you with an enhanced user experience and access to the latest features, such as a natural language interface and faster performance.

You can onboard to SageMaker Canvas at any time to begin using the new Data Wrangler experience. For more information, see [Getting started with using Amazon SageMaker Canvas](canvas-getting-started.md).

If you have data flow files saved in Studio Classic that you were previously working on, you can onboard to Studio and then import the flow files into Canvas. You have the following options for migration:
+ One-click migration: When you sign in to Canvas, you can use a one-time import option that migrates all of your flow files on your behalf.
+ Manual migration: You can manually import your flow files into Canvas. From Studio Classic, either export the files to Amazon S3 or download them to your local machine. Then, you sign in to the SageMaker Canvas application, import the flow files, and continue your data preparation tasks.

The following guide describes the prerequisites to migration and how to migrate your data flow files using either the one-click or manual option.

### Prerequisites
<a name="studio-updated-migrate-flows-prereqs"></a>

Review the following prerequisites before you begin migrating your flow files.

**Step 1. Migrate the domain and grant permissions**

Before migrating data flow files, you must complete specific steps of the [Migration from Amazon SageMaker Studio Classic](studio-updated-migrate.md) guide to ensure that your user profile's IAM execution role has the required permissions. Before proceeding, follow [Prerequisites](studio-updated-migrate-prereq.md) and [Migrate the UI from Studio Classic to Studio](studio-updated-migrate-ui.md), which describe how to grant the required permissions, configure Studio as the new experience, and migrate your existing domain.

Specifically, you must have permissions to create a SageMaker Canvas application and use the SageMaker Canvas data preparation features. To obtain these permissions, you can either:
+ Add the [AmazonSageMakerCanvasDataPrepFullAccess](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonSageMakerCanvasDataPrepFullAccess.html) policy to your IAM role, or
+ Attach a least-permissions policy, as shown in the **(Optional) Migrate from Data Wrangler in Studio Classic to SageMaker Canvas** section of the page [Migrate the UI from Studio Classic to Studio](studio-updated-migrate-ui.md).

Make sure to use the same user profile for both Studio and SageMaker Canvas.

After completing the prerequisites outlined in the migration guide, you should have a new domain with the required permissions to access SageMaker Canvas through Studio.

**Step 2. (Optional) Prepare an Amazon S3 location**

If you are doing a manual migration and plan to use Amazon S3 to transfer your flow files instead of using the local download option, you should have an Amazon S3 bucket in your account that you'd like to use for storing the flow files.

### One-click migration method
<a name="studio-updated-migrate-flows-auto"></a>

SageMaker Canvas offers a one-time import option for migrating your data flows from Data Wrangler in Studio Classic to Data Wrangler in SageMaker Canvas. As long as your Studio Classic and Canvas applications share the same Amazon EFS storage volume, you can migrate in one click from Canvas. This streamlined process eliminates the need for manual export and import steps, and you can import all of your flows at once.

Use the following procedure to migrate all of your flow files:

1. Open your latest version of Studio.

1. In Studio, in the left navigation pane, choose the **Data** dropdown menu.

1. From the navigation options, choose **Data Wrangler**.

1. On the **Data Wrangler** page, choose **Run in Canvas**. If you have successfully set up the permissions, this creates a Canvas application for you. The Canvas application may take a few minutes before it's ready. 

1. When Canvas is ready, choose **Open in Canvas.**

1. Canvas opens to the **Data Wrangler** page, and a banner appears at the top of the page that says **Import your data flows from Data Wrangler in Studio Classic to Canvas. This is a one time import. Learn more.** In the banner, choose **Import All**.
**Warning**  
If you close the banner notification, you won't be able to re-open it or use the one-click migration method anymore. 

A pop-up notification appears, indicating that Canvas is importing your flow files from Studio Classic. If the import is fully successful, you receive another notification that `X` number of flow files were imported, and you can see your flow files on the **Data Wrangler** page of the Canvas application. Any imported flow files that have the same name as existing data flows in your Canvas application are renamed with a suffix. You can open a data flow to verify that it looks as expected.

If any of your flow files don't import successfully, you receive a notification that the import was either partially successful or failed. Choose **View errors** on the notification message to check the individual error messages for guidance on how to reformat any incorrectly formatted flow files.

After importing your flow files, you should now be able to continue using Data Wrangler to prepare data in SageMaker Canvas.

### Manual migration method
<a name="studio-updated-migrate-flows-manual"></a>

The following sections describe how to manually import your flow files into Canvas in case the one-click migration method didn't work.

#### Export the flow files from Studio Classic
<a name="studio-updated-migrate-flows-export"></a>

**Note**  
If you've already migrated your Studio Classic data to Amazon S3 by following the instructions in [(Optional) Migrate data from Studio Classic to Studio](#studio-updated-migrate-data), you can skip this step and go straight to the [Import the flow files into Canvas](#studio-updated-migrate-flows-import) section in which you import your flow files from the Amazon S3 location where your Studio Classic data is stored.

You can export your flow files by either saving them to Amazon S3 or downloading them to your local machine. When you import your flow files into SageMaker Canvas in the next step, if you choose the local upload option, then you can only upload 20 flow files at a time. If you have a large number of flow files to import, we recommend that you use Amazon S3 instead.
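To decide between the two methods, you can count your flow files from a Studio Classic terminal. The following sketch counts the non-hidden `.flow` files under the current directory:

```
# Count .flow files, excluding hidden paths; more than 20 suggests
# using the Amazon S3 method instead of local upload.
FLOW_COUNT=$(find . -not -path "*/.*" -name "*.flow" | wc -l)
echo "Flow files found: $FLOW_COUNT"
```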

Follow the instructions in either [Method 1: Use Amazon S3 to transfer flow files](#studio-updated-migrate-flows-export-s3) or [Method 2: Use your local machine to transfer flow files](#studio-updated-migrate-flows-export-local) to proceed.

##### Method 1: Use Amazon S3 to transfer flow files
<a name="studio-updated-migrate-flows-export-s3"></a>

With this method, you use Amazon S3 as the intermediary between Data Wrangler in Studio Classic and Data Wrangler in SageMaker Canvas (accessed through the latest version of Studio). You export the flow files from Studio Classic to Amazon S3, and then in the next step, you access Canvas through Studio and import the flow files from Amazon S3.

Make sure that you have an Amazon S3 bucket prepared as the storage location for the flow files.

Use the following procedure to export your flow files from Studio Classic to Amazon S3:

1. Open Studio Classic.

1. Open a new terminal by doing the following:

   1. On the top navigation bar, choose **File**.

   1. In the context menu, hover over **New**, and then select **Terminal**.

1. By default, the terminal should open in your home directory. Navigate to the folder that contains all of the flow files that you want to migrate.

1. Use the following command to synchronize all of the flow files to the specified Amazon S3 location. Replace `{bucket-name}` and `{folder}` with the path to your desired Amazon S3 location. For more information about the command and parameters, see the [sync](https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html) command in the AWS CLI Command Reference.

   ```
   aws s3 sync . s3://{bucket-name}/{folder}/ --exclude "*.*" --include "*.flow"
   ```

   If you are using your own AWS KMS key, then use the following command instead to synchronize the files, and specify your KMS key ID. Make sure that the user's IAM execution role (which should be the same role used in **Step 1. Migrate the domain and grant permissions** of the preceding [Prerequisites](#studio-updated-migrate-flows-prereqs)) has been granted access to use the KMS key.

   ```
   aws s3 sync . s3://{bucket-name}/{folder}/ --exclude "*.*" --include "*.flow" --sse-kms-key-id {your-key-id}
   ```

Your flow files should now be exported. You can check your Amazon S3 bucket to make sure that the flow files synchronized successfully.

To import these files in the latest version of Data Wrangler, follow the steps in [Import the flow files into Canvas](#studio-updated-migrate-flows-import).

##### Method 2: Use your local machine to transfer flow files
<a name="studio-updated-migrate-flows-export-local"></a>

With this method, you download the flow files from Studio Classic to your local machine. You can download the files directly, or you can compress them as a zip archive. Then, you unpack the zip file locally (if applicable), sign in to Canvas, and import the flow files by uploading them from your local machine.

Use the following procedure to download your flow files from Studio Classic:

1. Open Studio Classic.

1. (Optional) If you want to compress multiple flow files into a zip archive and download them all at once, then do the following:

   1. On the top navigation bar of Studio Classic, choose **File**.

   1. In the context menu, hover over **New**, and then select **Terminal**.

   1. By default, the terminal opens in your home directory. Navigate to the folder that contains all of the flow files that you want to migrate.

   1. Use the following command to pack the flow files in the current directory as a zip. The command excludes any hidden files:

      ```
      find . -not -path "*/.*" -name "*.flow" -print0 | xargs -0 zip my_archive.zip
      ```

1. Download the zip archive or individual flow files to your local machine by doing the following:

   1. In the left navigation pane of Studio Classic, choose **File Browser**.

   1. Find the file you want to download in the file browser.

   1. Right-click the file, and in the context menu, select **Download**.

The files download to your local machine. If you packed them into a zip archive, extract the files locally. After the files are extracted, follow the steps in [Import the flow files into Canvas](#studio-updated-migrate-flows-import) to import them into the latest version of Data Wrangler.

#### Import the flow files into Canvas
<a name="studio-updated-migrate-flows-import"></a>

After exporting your flow files, access Canvas through Studio and import the files.

Use the following procedure to import flow files into Canvas:

1. Open your latest version of Studio.

1. In Studio, in the left navigation pane, choose the **Data** dropdown menu.

1. From the navigation options, choose **Data Wrangler**.

1. On the **Data Wrangler** page, choose **Run in Canvas**. If you have successfully set up the permissions, this creates a Canvas application for you. The Canvas application may take a few minutes before it's ready. 

1. When Canvas is ready, choose **Open in Canvas.**

1. Canvas opens to the **Data Wrangler** page. In the top pane, choose **Import data flows**.

1. For **Data source**, choose either **Amazon S3** or **Local upload**.

1. Select your flow files from your Amazon S3 bucket, or upload the files from your local machine.
**Note**  
For local upload, you can upload a maximum of 20 flow files at a time. For larger imports, use Amazon S3. If you select a folder to import, any flow files in sub-folders are also imported.

1. Choose **Import data**.

If the import was successful, you receive a notification that `X` number of flow files were successfully imported.

If your flow files don't import successfully, you receive a notification in the SageMaker Canvas application. Choose **View errors** on the notification message to check the individual error messages for guidance on how to reformat any incorrectly formatted flow files.

After your flow files are done importing, go to the **Data Wrangler** page of the SageMaker Canvas application to view your data flows. You can try opening a data flow to verify that it looks as expected.