

# Setting up HyperPod in Studio


To access your clusters through Amazon SageMaker Studio, you must set them up according to your cluster orchestrator. In the following sections, choose the setup that matches your orchestrator.

The instructions assume that you already have a cluster set up. For information on the cluster orchestrators and how to set them up, start with the HyperPod orchestrator pages:
+  [Orchestrating SageMaker HyperPod clusters with Slurm](sagemaker-hyperpod-slurm.md) 
+  [Orchestrating SageMaker HyperPod clusters with Amazon EKS](sagemaker-hyperpod-eks.md) 

**Topics**
+ [Setting up a Slurm cluster in Studio](sagemaker-hyperpod-studio-setup-slurm.md)
+ [Setting up an Amazon EKS cluster in Studio](sagemaker-hyperpod-studio-setup-eks.md)

# Setting up a Slurm cluster in Studio

The following instructions describe how to set up a HyperPod Slurm cluster in Studio.

1. Create a domain or have one ready. For information on creating a domain, see [Guide to getting set up with Amazon SageMaker AI](gs.md).

1. (Optional) Create and attach a custom FSx for Lustre volume to your domain. 

   1. Ensure that your FSx for Lustre file system is in the same VPC as your intended domain and in one of the domain's subnets.

   1. You can follow the instructions in [Adding a custom file system to a domain](domain-custom-file-system.md). 

1. (Optional) We recommend that you add tags to your clusters for a smoother workflow. For information on how to add tags using the SageMaker AI console, see [Edit a SageMaker HyperPod cluster](sagemaker-hyperpod-operate-slurm-console-ui.md#sagemaker-hyperpod-operate-slurm-console-ui-edit-clusters).

   1. Tag your FSx for Lustre file system to your Studio domain. This helps you identify the file system when you launch your Studio spaces. To do so, add the following tag to your cluster, replacing `fs-id` with your FSx for Lustre file system ID.

      Tag Key = “`hyperpod-cluster-filesystem`”, Tag Value = “`fs-id`”.

   1. Tag your [Amazon Managed Grafana](https://docs.aws.amazon.com/grafana/latest/userguide/what-is-Amazon-Managed-Service-Grafana.html) workspace to your Studio domain. Studio uses this tag to link to your Grafana workspace directly from your cluster. To do so, add the following tag to your cluster, replacing `ws-id` with your Grafana workspace ID.

      Tag Key = “`grafana-workspace`”, Tag Value = “`ws-id`”.

1. Add the following permission to your execution role. 

   For information on SageMaker AI execution roles and how to edit them, see [Understanding domain space permissions and execution roles](execution-roles-and-spaces.md). 

   To learn how to attach policies to an IAM user or group, see [Adding and removing IAM identity permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html).

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Action": [
                   "ssm:StartSession",
                   "ssm:TerminateSession"
               ],
               "Resource": "*"
           },
           {
               "Effect": "Allow",
               "Action": [
                   "sagemaker:CreateCluster",
                   "sagemaker:ListClusters"
               ],
               "Resource": "*"
           },
           {
               "Effect": "Allow",
               "Action": [
                   "cloudwatch:PutMetricData",
                   "cloudwatch:GetMetricData"
               ],
               "Resource": "*"
           },
           {
               "Effect": "Allow",
               "Action": [
                   "sagemaker:DescribeCluster",
                   "sagemaker:DescribeClusterNode",
                   "sagemaker:ListClusterNodes",
                   "sagemaker:UpdateCluster",
                   "sagemaker:UpdateClusterSoftware"
               ],
               "Resource": "arn:aws:sagemaker:us-east-1:111122223333:cluster/*"
           }
       ]
   }
   ```

1. Add a tag to this IAM role with Tag Key = “`SSMSessionRunAs`” and Tag Value = “`os user`”. The `os user` here is the same user that you set up for the Slurm cluster. This lets you manage access to SageMaker HyperPod clusters at the IAM role or user level by using the Run As feature in [AWS Systems Manager Agent (SSM Agent)](https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent.html). With this feature, each SSM session starts as the operating system (OS) user associated with the IAM role or user.

   For information on how to add tags to your execution role, see [Tag IAM roles](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_tags_roles.html).

1. [Turn on Run As support for Linux and macOS managed nodes](https://docs.aws.amazon.com/systems-manager/latest/userguide/session-preferences-run-as.html). The Run As settings are account-wide and are required for all SSM sessions to start successfully.

1. (Optional) [Restrict task view in Studio for Slurm clusters](#sagemaker-hyperpod-studio-setup-slurm-restrict-tasks-view). For information on viewable tasks in Studio, see [Tasks](sagemaker-hyperpod-studio-tabs.md#sagemaker-hyperpod-studio-tabs-tasks).
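The cluster tags and the `SSMSessionRunAs` role tag described in the steps above can also be applied programmatically. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) is installed and credentials are configured. The helper names are hypothetical; the tag keys come from the steps above, and the tag lists use the `Key`/`Value` shape accepted by the SageMaker `AddTags` and IAM `TagRole` APIs.

```python
def cluster_tags(fs_id=None, workspace_id=None):
    """Build the optional HyperPod cluster tags (file system and Grafana workspace)."""
    tags = []
    if fs_id:
        tags.append({"Key": "hyperpod-cluster-filesystem", "Value": fs_id})
    if workspace_id:
        tags.append({"Key": "grafana-workspace", "Value": workspace_id})
    return tags


def run_as_tag(os_user):
    """Build the execution-role tag that the Session Manager Run As feature reads."""
    if not os_user or " " in os_user:
        raise ValueError("os_user must be a non-empty OS user name without spaces")
    return [{"Key": "SSMSessionRunAs", "Value": os_user}]


def apply_tags(cluster_arn, role_name, os_user, fs_id=None, workspace_id=None):
    """Apply both tag sets. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above run without the SDK

    tags = cluster_tags(fs_id, workspace_id)
    if tags:  # skip the call when neither optional tag was requested
        boto3.client("sagemaker").add_tags(ResourceArn=cluster_arn, Tags=tags)
    boto3.client("iam").tag_role(RoleName=role_name, Tags=run_as_tag(os_user))
```

Keeping the tag builders separate from the API calls lets you review the exact tags before anything is written to your account.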

In Amazon SageMaker Studio, you can view your clusters by navigating to **HyperPod clusters** (under **Compute**).
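The cluster-scoped statement in the execution role policy uses a placeholder Region (`us-east-1`) and account ID (`111122223333`). The following is a minimal sketch of rendering that statement for your own account. The function name is hypothetical, and it assumes the standard `aws` partition.

```python
import json


def cluster_statement(region, account_id, cluster_id="*"):
    """Return the cluster-scoped policy statement for one Region and account."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account_id must be a 12-digit string")
    return {
        "Effect": "Allow",
        "Action": [
            "sagemaker:DescribeCluster",
            "sagemaker:DescribeClusterNode",
            "sagemaker:ListClusterNodes",
            "sagemaker:UpdateCluster",
            "sagemaker:UpdateClusterSoftware",
        ],
        # Assumes the "aws" partition; "*" matches every cluster in the account.
        "Resource": f"arn:aws:sagemaker:{region}:{account_id}:cluster/{cluster_id}",
    }


if __name__ == "__main__":
    # Print the statement so it can be pasted into the policy document.
    print(json.dumps(cluster_statement("us-west-2", "111122223333"), indent=4))
```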

## Restrict task view in Studio for Slurm clusters


You can restrict users to viewing only the Slurm tasks that they are authorized to view, without requiring manual input of namespaces or additional permission checks. The restriction is applied based on the users’ IAM role, providing a streamlined and secure user experience. The following section provides information on how to restrict task view in Studio for Slurm clusters. For information on viewable tasks in Studio, see [Tasks](sagemaker-hyperpod-studio-tabs.md#sagemaker-hyperpod-studio-tabs-tasks). 

All Studio users can view, manage, and interact with all Slurm cluster tasks by default. To restrict this, you can manage access to SageMaker HyperPod clusters at an IAM role or user level by using the **Run As** feature in [AWS Systems Manager Agent (SSM Agent)](https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent.html).

You can do this by tagging IAM roles with specific identifiers, such as their username or group. When a user accesses Studio, Session Manager uses the Run As feature to execute commands as the Slurm user account that matches their IAM role tags. The Slurm configuration can be set up to limit task visibility based on the user account. The Studio UI then automatically filters the tasks visible to that user account when commands are executed through the Run As feature. Once this is set up, each user assuming the role with the specified identifiers sees Slurm tasks filtered according to the Slurm configuration. For information on how to add tags to your execution role, see [Tag IAM roles](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_tags_roles.html).
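The paragraph above does not name the Slurm setting involved. One Slurm option that hides other users' jobs is `PrivateData=jobs` in `slurm.conf`; treating it as the relevant setting for your cluster is an assumption. The following sketch (the function name is hypothetical) reads the text of a `slurm.conf` file and reports which `PrivateData` values are set, so that you can check whether job visibility is restricted.

```python
def private_data_values(slurm_conf_text):
    """Return the set of PrivateData values found in slurm.conf text."""
    values = set()
    for raw_line in slurm_conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip().lower() == "privatedata":
            # If the key appears more than once, this sketch keeps the last
            # occurrence; that precedence is an assumption, not Slurm's rule.
            values = {v.strip().lower() for v in value.split(",") if v.strip()}
    return values


def jobs_are_private(slurm_conf_text):
    """True if the configuration lists 'jobs' under PrivateData."""
    return "jobs" in private_data_values(slurm_conf_text)
```

For example, passing the contents of your cluster's `slurm.conf` to `jobs_are_private` returns `False` when no `PrivateData` line is present.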

# Setting up an Amazon EKS cluster in Studio

The following instructions describe how to set up an Amazon EKS cluster in Studio.

1. Create a domain or have one ready. For information on creating a domain, see [Guide to getting set up with Amazon SageMaker AI](gs.md).

1. Add the following permission to your execution role. 

   For information on SageMaker AI execution roles and how to edit them, see [Understanding domain space permissions and execution roles](execution-roles-and-spaces.md). 

   To learn how to attach policies to an IAM user or group, see [Adding and removing IAM identity permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html).

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Sid": "DescribeHyperpodClusterPermissions",
               "Effect": "Allow",
               "Action": [
                   "sagemaker:DescribeCluster"
               ],
               "Resource": "arn:aws:sagemaker:us-east-1:111122223333:cluster/cluster-name"
           },
           {
               "Effect": "Allow",
               "Action": "ec2:Describe*",
               "Resource": "*"
           },
           {
               "Effect": "Allow",
               "Action": [
                   "ecr:CompleteLayerUpload",
                   "ecr:GetAuthorizationToken",
                   "ecr:UploadLayerPart",
                   "ecr:InitiateLayerUpload",
                   "ecr:BatchCheckLayerAvailability",
                   "ecr:PutImage"
               ],
               "Resource": "*"
           },
           {
               "Effect": "Allow",
               "Action": [
                   "cloudwatch:PutMetricData",
                   "cloudwatch:GetMetricData"
               ],
               "Resource": "*"
           },
           {
               "Sid": "UseEksClusterPermissions",
               "Effect": "Allow",
               "Action": [
                   "eks:DescribeCluster",
                   "eks:AccessKubernetesApi",
                   "eks:DescribeAddon"
               ],
               "Resource": "arn:aws:eks:us-east-1:111122223333:cluster/cluster-name"
           },
           {
               "Sid": "ListClustersPermission",
               "Effect": "Allow",
               "Action": [
                   "sagemaker:ListClusters"
               ],
               "Resource": "*"
           },
           {
               "Effect": "Allow",
               "Action": [
                   "ssm:StartSession",
                   "ssm:TerminateSession"
               ],
               "Resource": "*"
           }
       ]
   }
   ```

1. [Grant IAM users access to Kubernetes with EKS access entries](https://docs.aws.amazon.com/eks/latest/userguide/access-entries.html).

   1. Navigate to the Amazon EKS cluster associated with your HyperPod cluster.

   1. Choose the **Access** tab and [create an access entry](https://docs.aws.amazon.com/eks/latest/userguide/creating-access-entries.html) for the execution role you created. 

      1. In step 1, select the execution role you created above in the **IAM principal** dropdown.

      1. In step 2, select a policy name and the access scope that you want users to have.

1. (Optional) For a smoother experience, we recommend that you add tags to your clusters. For information on how to add tags using the SageMaker AI console, see [Edit a SageMaker HyperPod cluster](sagemaker-hyperpod-operate-slurm-console-ui.md#sagemaker-hyperpod-operate-slurm-console-ui-edit-clusters).

   1. Tag your [Amazon Managed Grafana](https://docs.aws.amazon.com/grafana/latest/userguide/what-is-Amazon-Managed-Service-Grafana.html) workspace to your Studio domain. Studio uses this tag to link to your Grafana workspace directly from your cluster. To do so, add the following tag to your cluster, replacing `ws-id` with your Grafana workspace ID.

      Tag Key = “`grafana-workspace`”, Tag Value = “`ws-id`”.

1. (Optional) [Restrict task view in Studio for EKS clusters](#sagemaker-hyperpod-studio-setup-eks-restrict-tasks-view). For information on viewable tasks in Studio, see [Tasks](sagemaker-hyperpod-studio-tabs.md#sagemaker-hyperpod-studio-tabs-tasks).
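The access entry step above can also be performed through the Amazon EKS API instead of the console. The following is a minimal sketch, assuming boto3 is installed and credentials are configured. The function names are hypothetical, and `AmazonEKSViewPolicy` is only one example of an access policy you might choose.

```python
# Example access policy; substitute the policy that fits your users.
VIEW_POLICY_ARN = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy"


def access_scope(namespaces=None):
    """Build the accessScope argument: namespace-scoped if namespaces are given."""
    if namespaces:
        return {"type": "namespace", "namespaces": list(namespaces)}
    return {"type": "cluster"}


def grant_access(cluster_name, role_arn, namespaces=None, policy_arn=VIEW_POLICY_ARN):
    """Create an access entry for the execution role and attach one access policy."""
    import boto3  # imported lazily so access_scope() runs without the SDK

    eks = boto3.client("eks")
    eks.create_access_entry(clusterName=cluster_name, principalArn=role_arn)
    eks.associate_access_policy(
        clusterName=cluster_name,
        principalArn=role_arn,
        policyArn=policy_arn,
        accessScope=access_scope(namespaces),
    )
```

Passing a list of namespaces limits the policy to those namespaces; omitting it applies the policy to the whole cluster.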

## Restrict task view in Studio for EKS clusters


You can restrict Kubernetes namespace permissions for users so that they can only view tasks belonging to a specified namespace. The following section provides information on how to restrict the task view in Studio for EKS clusters. For information on viewable tasks in Studio, see [Tasks](sagemaker-hyperpod-studio-tabs.md#sagemaker-hyperpod-studio-tabs-tasks). 

By default, users can see all EKS cluster tasks. You can restrict users’ visibility of EKS cluster tasks to specified namespaces, so that users can access the resources they need while you maintain strict access controls.

Once the restriction is applied, you must provide the namespace to the users assuming the role. Studio displays the jobs of a namespace only after the user enters, in the **Tasks** tab, a namespace that they have permission to view. 

The following configuration allows administrators to grant specific, limited access to data scientists for viewing tasks within the cluster. This configuration grants the following permissions:
+ List and get pods
+ List and get events
+ Get Custom Resource Definitions (CRDs)

YAML Configuration

```
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pods-events-crd-cluster-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["events"]
  verbs: ["get", "list"]
- apiGroups: ["apiextensions.k8s.io"]
  resources: ["customresourcedefinitions"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pods-events-crd-cluster-role-binding
subjects:
- kind: Group
  name: pods-events-crd-cluster-level
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: pods-events-crd-cluster-role
  apiGroup: rbac.authorization.k8s.io
```

1. Save the YAML configuration to a file named `cluster-role.yaml`.

1. Apply the configuration using [kubectl](https://kubernetes.io/docs/reference/kubectl/):

   ```
   kubectl apply -f cluster-role.yaml
   ```

1. Verify the configuration:

   ```
   kubectl get clusterrole pods-events-crd-cluster-role
   kubectl get clusterrolebinding pods-events-crd-cluster-role-binding
   ```

1. Assign users to the `pods-events-crd-cluster-level` group through your identity provider or IAM.
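For the last step, one IAM-based option is to list the group in the `kubernetesGroups` of the execution role's EKS access entry. The following is a minimal sketch, assuming boto3 and an access entry that already exists for the role. The function names and the `studio-user` impersonation name are hypothetical. The sketch also builds a `kubectl auth can-i` command that impersonates the group, which you can run to spot-check the binding.

```python
import shlex

GROUP = "pods-events-crd-cluster-level"  # group name from the ClusterRoleBinding


def can_i_command(verb, resource, namespace=None, user="studio-user"):
    """Build a kubectl command that checks a permission while impersonating GROUP."""
    # kubectl requires --as whenever --as-group is used.
    argv = ["kubectl", "auth", "can-i", verb, resource, "--as", user, "--as-group", GROUP]
    if namespace:
        argv += ["--namespace", namespace]
    return shlex.join(argv)


def add_role_to_group(cluster_name, role_arn):
    """Set the access entry's Kubernetes groups to [GROUP]. Requires boto3."""
    import boto3  # imported lazily so can_i_command() runs without the SDK

    boto3.client("eks").update_access_entry(
        clusterName=cluster_name,
        principalArn=role_arn,
        kubernetesGroups=[GROUP],
    )


if __name__ == "__main__":
    print(can_i_command("list", "pods", namespace="team-a"))
```

Running the printed command against the cluster should answer `yes` for the verbs granted by the ClusterRole, such as `list pods`, and `no` for verbs it does not grant, such as `delete pods`.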