

本文為英文版的機器翻譯版本，如內容有任何歧義或不一致之處，概以英文版為準。

# AWS Identity and Access Management for SageMaker HyperPod
<a name="sagemaker-hyperpod-prerequisites-iam"></a>

AWS Identity and Access Management (IAM) 是一種 AWS 服務，可協助管理員安全地控制對 AWS 資源的存取。IAM 管理員可以控制*驗證* (已登入) 和*授權* (具有許可) 來使用 Amazon EKS 資源。IAM 是一項服務 AWS ，您可以免費使用。

**重要**  
允許 Amazon SageMaker Studio 或 Amazon SageMaker Studio Classic 建立 Amazon SageMaker 資源的自訂 IAM 政策也必須授予許可，才能將標籤新增至這些資源。需要將標籤新增至資源的許可，因為 Studio 和 Studio Classic 會自動標記它們建立的任何資源。如果 IAM 政策允許 Studio 和 Studio Classic 建立資源，但不允許標記，則在嘗試建立資源時可能會發生 "AccessDenied" 錯誤。如需詳細資訊，請參閱[提供標記 SageMaker AI 資源的許可](security_iam_id-based-policy-examples.md#grant-tagging-permissions)。  
提供許可來建立 SageMaker 資源的 [AWS Amazon SageMaker AI 的 受管政策](security-iam-awsmanpol.md) 已包含建立這些資源時新增標籤的許可。

假設 SageMaker HyperPod 使用者有兩個主要層級：*叢集管理員使用者*和*資料科學家使用者*。
+ **叢集管理員使用者** - 負責建立和管理 SageMaker HyperPod 叢集。這包括設定 HyperPod 叢集和管理使用者對它們的存取。
  + 使用 Slurm 或 Amazon EKS 建立和設定 SageMaker HyperPod 叢集。
  + 為資料科學家使用者和 HyperPod 叢集資源建立和設定 IAM 角色。
  + 對於 SageMaker HyperPod 與 Amazon EKS 的協同運作，建立和設定 [EKS 存取項目](https://docs.aws.amazon.com/eks/latest/userguide/access-entries.html)、[角色型存取控制 (RBAC)](sagemaker-hyperpod-eks-setup-rbac.md) 和 Pod 身分識別，以滿足資料科學使用案例。
+ **資料科學家使用者** - 專注於 ML 模型訓練。他們使用開放原始碼協調器或 SageMaker HyperPod CLI 來提交和管理訓練任務。
  + 擔任並使用叢集管理員使用者提供的 IAM 角色。
  + 與 SageMaker HyperPod (Slurm 或 Kubernetes) 支援的開放原始碼協調器 CLI 或 SageMaker HyperPod CLI 互動，以檢查叢集容量、連線至叢集，以及提交工作負載。

透過連接正確的許可或政策來操作 SageMaker HyperPod 叢集，為叢集管理員設定 IAM 角色。叢集管理員也應建立 IAM 角色，以提供給 SageMaker HyperPod 資源，以擔任 執行並與必要 AWS 資源通訊的角色，例如 Amazon S3、Amazon CloudWatch 和 AWS Systems Manager (SSM)。最後， AWS 帳戶管理員或叢集管理員應授予科學家存取 SageMaker HyperPod 叢集和執行 ML 工作負載的許可。

根據您選擇的協調器，叢集管理員和科學家所需的許可可能會有所不同。您也可以使用每個服務的條件金鑰，控制角色中各種動作的許可範圍。使用下列服務授權參考，為與 SageMaker HyperPod 相關的服務新增詳細範圍。
+ [Amazon Elastic Compute Cloud](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonec2.html)
+ [Amazon Elastic Container Registry](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonelasticcontainerregistry.html) (適用於 SageMaker HyperPod 與 Amazon EKS 的叢集協同運作)
+ [Amazon Elastic Kubernetes Service](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonelastickubernetesservice.html) (適用於 SageMaker HyperPod 與 Amazon EKS 的叢集協同運作)
+ [Amazon FSx](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonfsx.html)
+ [AWS IAM Identity Center ( AWS 單一登入的後續版本）](https://docs.aws.amazon.com/service-authorization/latest/reference/list_awsiamidentitycentersuccessortoawssinglesign-on.html)
+ [AWS Identity and Access Management (IAM)](https://docs.aws.amazon.com/service-authorization/latest/reference/list_awsidentityandaccessmanagementiam.html)
+ [Amazon Simple Storage Service](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazons3.html)
+ [Amazon SageMaker AI](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonsagemaker.html)
+ [AWS Systems Manager](https://docs.aws.amazon.com/service-authorization/latest/reference/list_awssystemsmanager.html)

**Topics**
+ [

## 用於建立叢集的 IAM 許可
](#sagemaker-hyperpod-prerequisites-iam-cluster-creation)
+ [

## 叢集管理員的 IAM 使用者
](#sagemaker-hyperpod-prerequisites-iam-cluster-admin)
+ [

## 科學家的 IAM 使用者
](#sagemaker-hyperpod-prerequisites-iam-cluster-user)
+ [

## SageMaker HyperPod 的 IAM 角色
](#sagemaker-hyperpod-prerequisites-iam-role-for-hyperpod)

## 用於建立叢集的 IAM 許可
<a name="sagemaker-hyperpod-prerequisites-iam-cluster-creation"></a>

建立 HyperPod 叢集需要下列政策範例中概述的 IAM 許可。如果您的 AWS 帳戶 具有[https://docs.aws.amazon.com//aws-managed-policy/latest/reference/AdministratorAccess.html](https://docs.aws.amazon.com//aws-managed-policy/latest/reference/AdministratorAccess.html)許可，則預設會授予這些許可。

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:CreateCluster",
                "sagemaker:DeleteCluster",
                "sagemaker:UpdateCluster"
            ],
            "Resource": "arn:aws:sagemaker:*:*:cluster/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:AddTags"
            ],
            "Resource": "arn:aws:sagemaker:*:*:cluster/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:ListTags",
                "sagemaker:ListClusters",
                "sagemaker:ListClusterNodes",
                "sagemaker:ListComputeQuotas",
                "sagemaker:ListTrainingPlans",
                "sagemaker:DescribeCluster",
                "sagemaker:DescribeClusterNode"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "cloudformation:CreateStack",
                "cloudformation:UpdateStack",
                "cloudformation:DeleteStack",
                "cloudformation:ContinueUpdateRollback",
                "cloudformation:SetStackPolicy",
                "cloudformation:ValidateTemplate",
                "cloudformation:DescribeStacks",
                "cloudformation:DescribeStackEvents",
                "cloudformation:Get*",
                "cloudformation:List*"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::*:role/sagemaker-*",
            "Condition": {
                "StringEquals": {
                    "iam:PassedToService": [
                        "sagemaker.amazonaws.com",
                        "eks.amazonaws.com",
                        "lambda.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "iam:PassRole",
                "iam:GetRole"
            ],
            "Resource": "arn:aws:iam::*:role/*",
            "Condition": {
                "StringEquals": {
                    "iam:PassedToService": [
                        "sagemaker.amazonaws.com",
                        "eks.amazonaws.com",
                        "lambda.amazonaws.com",
                        "cloudformation.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Sid": "AmazonVPCFullAccess",
            "Effect": "Allow",
            "Action": [
                "ec2:AcceptVpcPeeringConnection",
                "ec2:AcceptVpcEndpointConnections",
                "ec2:AllocateAddress",
                "ec2:AssignIpv6Addresses",
                "ec2:AssignPrivateIpAddresses",
                "ec2:AssociateAddress",
                "ec2:AssociateDhcpOptions",
                "ec2:AssociateRouteTable",
                "ec2:AssociateSecurityGroupVpc",
                "ec2:AssociateSubnetCidrBlock",
                "ec2:AssociateVpcCidrBlock",
                "ec2:AttachClassicLinkVpc",
                "ec2:AttachInternetGateway",
                "ec2:AttachNetworkInterface",
                "ec2:AttachVpnGateway",
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CreateCarrierGateway",
                "ec2:CreateCustomerGateway",
                "ec2:CreateDefaultSubnet",
                "ec2:CreateDefaultVpc",
                "ec2:CreateDhcpOptions",
                "ec2:CreateEgressOnlyInternetGateway",
                "ec2:CreateFlowLogs",
                "ec2:CreateInternetGateway",
                "ec2:CreateLocalGatewayRouteTableVpcAssociation",
                "ec2:CreateNatGateway",
                "ec2:CreateNetworkAcl",
                "ec2:CreateNetworkAclEntry",
                "ec2:CreateNetworkInterface",
                "ec2:CreateNetworkInterfacePermission",
                "ec2:CreateRoute",
                "ec2:CreateRouteTable",
                "ec2:CreateSecurityGroup",
                "ec2:CreateSubnet",
                "ec2:CreateTags",
                "ec2:CreateVpc",
                "ec2:CreateVpcEndpoint",
                "ec2:CreateVpcEndpointConnectionNotification",
                "ec2:CreateVpcEndpointServiceConfiguration",
                "ec2:CreateVpcPeeringConnection",
                "ec2:CreateVpnConnection",
                "ec2:CreateVpnConnectionRoute",
                "ec2:CreateVpnGateway",
                "ec2:DeleteCarrierGateway",
                "ec2:DeleteCustomerGateway",
                "ec2:DeleteDhcpOptions",
                "ec2:DeleteEgressOnlyInternetGateway",
                "ec2:DeleteFlowLogs",
                "ec2:DeleteInternetGateway",
                "ec2:DeleteLocalGatewayRouteTableVpcAssociation",
                "ec2:DeleteNatGateway",
                "ec2:DeleteNetworkAcl",
                "ec2:DeleteNetworkAclEntry",
                "ec2:DeleteNetworkInterface",
                "ec2:DeleteNetworkInterfacePermission",
                "ec2:DeleteRoute",
                "ec2:DeleteRouteTable",
                "ec2:DeleteSecurityGroup",
                "ec2:DeleteSubnet",
                "ec2:DeleteTags",
                "ec2:DeleteVpc",
                "ec2:DeleteVpcEndpoints",
                "ec2:DeleteVpcEndpointConnectionNotifications",
                "ec2:DeleteVpcEndpointServiceConfigurations",
                "ec2:DeleteVpcPeeringConnection",
                "ec2:DeleteVpnConnection",
                "ec2:DeleteVpnConnectionRoute",
                "ec2:DeleteVpnGateway",
                "ec2:DescribeAccountAttributes",
                "ec2:DescribeAddresses",
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeCarrierGateways",
                "ec2:DescribeClassicLinkInstances",
                "ec2:DescribeCustomerGateways",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeEgressOnlyInternetGateways",
                "ec2:DescribeFlowLogs",
                "ec2:DescribeInstances",
                "ec2:DescribeInternetGateways",
                "ec2:DescribeIpv6Pools",
                "ec2:DescribeLocalGatewayRouteTables",
                "ec2:DescribeLocalGatewayRouteTableVpcAssociations",
                "ec2:DescribeKeyPairs",
                "ec2:DescribeMovingAddresses",
                "ec2:DescribeNatGateways",
                "ec2:DescribeNetworkAcls",
                "ec2:DescribeNetworkInterfaceAttribute",
                "ec2:DescribeNetworkInterfacePermissions",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribePrefixLists",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroupReferences",
                "ec2:DescribeSecurityGroupRules",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSecurityGroupVpcAssociations",
                "ec2:DescribeStaleSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeTags",
                "ec2:DescribeVpcAttribute",
                "ec2:DescribeVpcClassicLink",
                "ec2:DescribeVpcClassicLinkDnsSupport",
                "ec2:DescribeVpcEndpointConnectionNotifications",
                "ec2:DescribeVpcEndpointConnections",
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeVpcEndpointServiceConfigurations",
                "ec2:DescribeVpcEndpointServicePermissions",
                "ec2:DescribeVpcEndpointServices",
                "ec2:DescribeVpcPeeringConnections",
                "ec2:DescribeVpcs",
                "ec2:DescribeVpnConnections",
                "ec2:DescribeVpnGateways",
                "ec2:DetachClassicLinkVpc",
                "ec2:DetachInternetGateway",
                "ec2:DetachNetworkInterface",
                "ec2:DetachVpnGateway",
                "ec2:DisableVgwRoutePropagation",
                "ec2:DisableVpcClassicLink",
                "ec2:DisableVpcClassicLinkDnsSupport",
                "ec2:DisassociateAddress",
                "ec2:DisassociateRouteTable",
                "ec2:DisassociateSecurityGroupVpc",
                "ec2:DisassociateSubnetCidrBlock",
                "ec2:DisassociateVpcCidrBlock",
                "ec2:EnableVgwRoutePropagation",
                "ec2:EnableVpcClassicLink",
                "ec2:EnableVpcClassicLinkDnsSupport",
                "ec2:GetSecurityGroupsForVpc",
                "ec2:ModifyNetworkInterfaceAttribute",
                "ec2:ModifySecurityGroupRules",
                "ec2:ModifySubnetAttribute",
                "ec2:ModifyVpcAttribute",
                "ec2:ModifyVpcEndpoint",
                "ec2:ModifyVpcEndpointConnectionNotification",
                "ec2:ModifyVpcEndpointServiceConfiguration",
                "ec2:ModifyVpcEndpointServicePermissions",
                "ec2:ModifyVpcPeeringConnectionOptions",
                "ec2:ModifyVpcTenancy",
                "ec2:MoveAddressToVpc",
                "ec2:RejectVpcEndpointConnections",
                "ec2:RejectVpcPeeringConnection",
                "ec2:ReleaseAddress",
                "ec2:ReplaceNetworkAclAssociation",
                "ec2:ReplaceNetworkAclEntry",
                "ec2:ReplaceRoute",
                "ec2:ReplaceRouteTableAssociation",
                "ec2:ResetNetworkInterfaceAttribute",
                "ec2:RestoreAddressToClassic",
                "ec2:RevokeSecurityGroupEgress",
                "ec2:RevokeSecurityGroupIngress",
                "ec2:UnassignIpv6Addresses",
                "ec2:UnassignPrivateIpAddresses",
                "ec2:UpdateSecurityGroupRuleDescriptionsEgress",
                "ec2:UpdateSecurityGroupRuleDescriptionsIngress"
            ],
            "Resource": "*"
        },
        {
            "Sid": "CloudWatchPermissions",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:*",
                "logs:*",
                "sns:CreateTopic",
                "sns:ListSubscriptions",
                "sns:ListSubscriptionsByTopic",
                "sns:ListTopics",
                "sns:Subscribe",
                "iam:GetPolicy",
                "iam:GetPolicyVersion",
                "iam:GetRole",
                "oam:ListSinks",
                "rum:*",
                "synthetics:*",
                "xray:*"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:CreateBucket",
                "s3:DeleteBucket",
                "s3:PutBucketPolicy",
                "s3:PutBucketTagging",
                "s3:PutBucketPublicAccessBlock",
                "s3:PutBucketLogging",
                "s3:DeleteBucketPolicy",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:PutEncryptionConfiguration",
                "s3:AbortMultipartUpload",
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": [
                "arn:aws:s3:::*",
                "arn:aws:s3:::*/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "eks:CreateCluster",
                "eks:DeleteCluster",
                "eks:CreateNodegroup",
                "eks:DeleteNodegroup",
                "eks:UpdateNodegroupConfig",
                "eks:UpdateNodegroupVersion",
                "eks:UpdateClusterConfig",
                "eks:UpdateClusterVersion",
                "eks:CreateFargateProfile",
                "eks:DeleteFargateProfile",
                "eks:CreateAddon",
                "eks:DeleteAddon",
                "eks:UpdateAddon",
                "eks:CreateAccessEntry",
                "eks:DeleteAccessEntry",
                "eks:UpdateAccessEntry",
                "eks:AssociateAccessPolicy",
                "eks:AssociateIdentityProviderConfig",
                "eks:DisassociateIdentityProviderConfig",
                "eks:TagResource",
                "eks:UntagResource",
                "eks:AccessKubernetesApi",
                "eks:Describe*",
                "eks:List*"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ssm:GetParameter",
                "ssm:PutParameter",
                "ssm:DeleteParameter",
                "ssm:DescribeParameters"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:GenerateDataKey"
            ],
            "Resource": "*",
            "Condition": {
                "StringLike": {
                    "kms:ViaService": [
                        "sagemaker.*.amazonaws.com",
                        "ec2.*.amazonaws.com",
                        "s3.*.amazonaws.com",
                        "eks.*.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "lambda:CreateFunction",
                "lambda:DeleteFunction",
                "lambda:GetFunction",
                "lambda:UpdateFunctionCode",
                "lambda:UpdateFunctionConfiguration",
                "lambda:AddPermission",
                "lambda:RemovePermission",
                "lambda:PublishLayerVersion",
                "lambda:DeleteLayerVersion",
                "lambda:InvokeFunction",
                "lambda:Get*",
                "lambda:List*",
                "lambda:TagResource"
            ],
            "Resource": [
                "arn:aws:lambda:*:*:function:*",
                "arn:aws:lambda:*:*:layer:*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "iam:DeleteRole",
                "iam:DeleteRolePolicy"
            ],
            "Resource": [
                "arn:aws:iam::*:role/*sagemaker*",
                "arn:aws:iam::*:role/*eks*",
                "arn:aws:iam::*:role/*hyperpod*",
                "arn:aws:iam::*:policy/*sagemaker*",
                "arn:aws:iam::*:policy/*hyperpod*",
                "arn:aws:iam::*:role/*LifeCycleScriptStack*",
                "arn:aws:iam::*:role/*LifeCycleScript*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "iam:CreateRole",
                "iam:TagRole",
                "iam:PutRolePolicy",
                "iam:Get*",
                "iam:List*",
                "iam:AttachRolePolicy",
                "iam:DetachRolePolicy"
            ],
            "Resource": [
                "arn:aws:iam::*:role/*",
                "arn:aws:iam::*:policy/*"
            ]
        },
        {
            "Sid": "FullAccessToFSx",
            "Effect": "Allow",
            "Action": [
                "fsx:AssociateFileGateway",
                "fsx:AssociateFileSystemAliases",
                "fsx:CancelDataRepositoryTask",
                "fsx:CopyBackup",
                "fsx:CopySnapshotAndUpdateVolume",
                "fsx:CreateAndAttachS3AccessPoint",
                "fsx:CreateBackup",
                "fsx:CreateDataRepositoryAssociation",
                "fsx:CreateDataRepositoryTask",
                "fsx:CreateFileCache",
                "fsx:CreateFileSystem",
                "fsx:CreateFileSystemFromBackup",
                "fsx:CreateSnapshot",
                "fsx:CreateStorageVirtualMachine",
                "fsx:CreateVolume",
                "fsx:CreateVolumeFromBackup",
                "fsx:DetachAndDeleteS3AccessPoint",
                "fsx:DeleteBackup",
                "fsx:DeleteDataRepositoryAssociation",
                "fsx:DeleteFileCache",
                "fsx:DeleteFileSystem",
                "fsx:DeleteSnapshot",
                "fsx:DeleteStorageVirtualMachine",
                "fsx:DeleteVolume",
                "fsx:DescribeAssociatedFileGateways",
                "fsx:DescribeBackups",
                "fsx:DescribeDataRepositoryAssociations",
                "fsx:DescribeDataRepositoryTasks",
                "fsx:DescribeFileCaches",
                "fsx:DescribeFileSystemAliases",
                "fsx:DescribeFileSystems",
                "fsx:DescribeS3AccessPointAttachments",
                "fsx:DescribeSharedVpcConfiguration",
                "fsx:DescribeSnapshots",
                "fsx:DescribeStorageVirtualMachines",
                "fsx:DescribeVolumes",
                "fsx:DisassociateFileGateway",
                "fsx:DisassociateFileSystemAliases",
                "fsx:ListTagsForResource",
                "fsx:ManageBackupPrincipalAssociations",
                "fsx:ReleaseFileSystemNfsV3Locks",
                "fsx:RestoreVolumeFromSnapshot",
                "fsx:TagResource",
                "fsx:UntagResource",
                "fsx:UpdateDataRepositoryAssociation",
                "fsx:UpdateFileCache",
                "fsx:UpdateFileSystem",
                "fsx:UpdateSharedVpcConfiguration",
                "fsx:UpdateSnapshot",
                "fsx:UpdateStorageVirtualMachine",
                "fsx:UpdateVolume"
            ],
            "Resource": "*"
        }
    ]
}
```

------

## 叢集管理員的 IAM 使用者
<a name="sagemaker-hyperpod-prerequisites-iam-cluster-admin"></a>

叢集管理員會操作和設定 SageMaker HyperPod 叢集，執行 [SageMaker HyperPod Slurm 叢集操作](sagemaker-hyperpod-operate-slurm.md)中的任務。下列政策範例包含叢集管理員執行 SageMaker HyperPod 核心 API 和管理您 AWS 帳戶內 SageMaker HyperPod 叢集的最低許可集。

**注意**  
具有叢集管理員角色的 IAM 使用者可以使用條件金鑰，在專門針對 `CreateCluster` 和 `UpdateCluster` 動作管理 SageMaker HyperPod 叢集資源時提供精細的存取控制。若要尋找這些動作支援的條件金鑰，請在 [SageMaker AI 定義的動作](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonsagemaker.html#amazonsagemaker-actions-as-permissions)中搜尋 `CreateCluster` 或 `UpdateCluster`。

------
#### [ Slurm ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:CreateCluster",
                "sagemaker:ListClusters"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:DeleteCluster",
                "sagemaker:DescribeCluster",
                "sagemaker:DescribeClusterNode",
                "sagemaker:ListClusterNodes",
                "sagemaker:UpdateCluster",
                "sagemaker:UpdateClusterSoftware",
                "sagemaker:BatchDeleteClusterNodes"
            ],
            "Resource": "arn:aws:sagemaker:us-east-1:111122223333:cluster/*"
        }
    ]
}
```

------
#### [ Amazon EKS ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::111122223333:role/execution-role-name"
        },
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:CreateCluster",
                "sagemaker:DeleteCluster",
                "sagemaker:DescribeCluster",
                "sagemaker:DescribeClusterNode",
                "sagemaker:ListClusterNodes",
                "sagemaker:ListClusters",
                "sagemaker:UpdateCluster",
                "sagemaker:UpdateClusterSoftware",
                "sagemaker:BatchAddClusterNodes",
                "sagemaker:BatchDeleteClusterNodes",
                "sagemaker:ListComputeQuotas",
                "sagemaker:ListClusterSchedulerConfigs",
                "sagemaker:DeleteClusterSchedulerConfig",
                "sagemaker:DeleteComputeQuota",
                "eks:DescribeCluster",
                "eks:CreateAccessEntry",
                "eks:DescribeAccessEntry",
                "eks:DeleteAccessEntry",
                "eks:AssociateAccessPolicy",
                "iam:CreateServiceLinkedRole"
            ],
            "Resource": "*"
        }
    ]
}
```

------

若要授予存取 SageMaker AI 主控台的許可，請使用[使用 Amazon SageMaker AI 主控台所需的許可](https://docs.aws.amazon.com/sagemaker/latest/dg/security_iam_id-based-policy-examples.html#console-permissions)中提供的範例政策。

若要授予存取 Amazon EC2 Systems Manager 主控台的許可，請使用* AWS Systems Manager 《 使用者指南*》中的[使用 AWS Systems Manager 主控台](https://docs.aws.amazon.com/systems-manager/latest/userguide/security_iam_id-based-policy-examples.html#security_iam_id-based-policy-examples-console)所提供的範例政策。

您也可以考慮將 [`AmazonSageMakerFullAccess`](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonSageMakerFullAccess) 政策連接至角色；不過，請注意，`AmazonSageMakerFullAccess` 政策會將許可授予整個 SageMaker API 呼叫、功能和資源。

如需 IAM 使用者的一般指引，請參閱《AWS Identity and Access Management 使用者指南》**中的 [IAM 使用者](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users.html)。

## 科學家的 IAM 使用者
<a name="sagemaker-hyperpod-prerequisites-iam-cluster-user"></a>

科學家會在叢集管理員佈建的 SageMaker HyperPod 叢集節點上登入並執行 ML 工作負載。對於您 AWS 帳戶中的科學家，您應該授予`"ssm:StartSession"`執行 SSM `start-session`命令的許可。以下是 IAM 使用者的政策範例。

------
#### [ Slurm ]

新增下列政策以授予 SSM 工作階段許可，來連線至所有資源的 SSM 目標。這可讓您存取 HyperPod 叢集。

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	             
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ssm:StartSession",
                "ssm:TerminateSession"
            ],
            "Resource": "*"    
        }
    ]
}
```

------

------
#### [ Amazon EKS ]

授予下列 IAM 角色許可，讓資料科學家在 HyperPod CLI 命令之間執行 `hyperpod list-clusters` 和 `hyperpod connect-cluster` 命令。若要進一步了解 HyperPod CLI，請參閱 [在 Amazon EKS 協作的 SageMaker HyperPod 叢集上執行任務](sagemaker-hyperpod-eks-run-jobs.md)。它還包含 SSM 工作階段許可，以連線到所有資源的 SSM 目標。這可讓您存取 HyperPod 叢集。

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "DescribeHyerpodClusterPermissions",
            "Effect": "Allow",
            "Action": [
                "sagemaker:DescribeCluster"
            ],
            "Resource": "arn:aws:sagemaker:us-east-2:111122223333:cluster/hyperpod-cluster-name"
        },
        {
            "Sid": "UseEksClusterPermissions",
            "Effect": "Allow",
            "Action": [
                "eks:DescribeCluster"
            ],
            "Resource": "arn:aws:sagemaker:us-east-2:111122223333:cluster/eks-cluster-name"
        },
        {
            "Sid": "ListClustersPermission",
            "Effect": "Allow",
            "Action": [
                "sagemaker:ListClusters"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ssm:StartSession",
                "ssm:TerminateSession"
            ],
            "Resource": "*"    
        }
    ]
}
```

------

若要授予資料科學家 IAM 使用者或角色存取叢集中的 Kubernetes API，亦請參閱《Amazon EKS 使用者指南》**中的[授予 IAM 使用者和角色存取 Kubernetes API](https://docs.aws.amazon.com/eks/latest/userguide/grant-k8s-access.html)。

------

## SageMaker HyperPod 的 IAM 角色
<a name="sagemaker-hyperpod-prerequisites-iam-role-for-hyperpod"></a>

若要讓 SageMaker HyperPod 叢集執行並與必要的 AWS 資源通訊，您需要建立 HyperPod 叢集要擔任的 IAM 角色。

從連接受管角色 [AWS 受管政策：AmazonSageMakerHyperPodServiceRolePolicy](security-iam-awsmanpol-AmazonSageMakerHyperPodServiceRolePolicy.md) 開始。鑑於此 AWS 受管政策，SageMaker HyperPod 叢集執行個體群組會擔任與 Amazon CloudWatch、Amazon S3 和 AWS Systems Manager Agent (SSM Agent) 通訊的角色。此受管政策是 SageMaker HyperPod 資源正確執行的最低要求，因此您必須將 IAM 角色與此政策提供給所有執行個體群組。

**提示**  
根據您為多個執行個體群組設計許可層級時的喜好設定，您也可以設定多個 IAM 角色，並將其連接至不同的執行個體群組。當您設定叢集使用者存取特定的 SageMaker HyperPod 叢集節點時，節點擔任的角色具有您手動連接的選擇性許可。  
當您透過 [AWS Systems Manager](https://aws.amazon.com/systems-manager/) 設定科學家對特定叢集節點的存取 (另請參閱 [設定 AWS Systems Manager 和執行為叢集使用者存取控制](sagemaker-hyperpod-prerequisites.md#sagemaker-hyperpod-prerequisites-ssm)) 時，叢集節點擔任的角色具有您手動連接的選擇性許可。

完成建立 IAM 角色後，請記下其名稱和 ARN。您在建立 SageMaker HyperPod 叢集時使用 角色，授予每個執行個體群組與必要 AWS 資源通訊所需的正確許可。

------
#### [ Slurm ]

對於與 Slurm 協作的 HyperPod，您必須將下列受管政策連接至 SageMaker HyperPod IAM 角色。
+ [AmazonSageMakerClusterInstanceRolePolicy](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonSageMakerClusterInstanceRolePolicy.html)

**(選用) 搭配 Amazon Virtual Private Cloud 使用 SageMaker HyperPod 的其他許可**

如果您想要使用自己的 Amazon Virtual Private Cloud (VPC) 而非預設的 SageMaker AI VPC，則應將下列額外許可新增至 SageMaker HyperPod 的 IAM 角色。

```
{
    "Effect": "Allow",
    "Action": [
        "ec2:CreateNetworkInterface",
        "ec2:CreateNetworkInterfacePermission",
        "ec2:DeleteNetworkInterface",
        "ec2:DeleteNetworkInterfacePermission",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DescribeVpcs",
        "ec2:DescribeDhcpOptions",
        "ec2:DescribeSubnets",
        "ec2:DescribeSecurityGroups",
        "ec2:DetachNetworkInterface"
    ],
    "Resource": "*"
}
{
    "Effect": "Allow",
    "Action": "ec2:CreateTags",
    "Resource": [
        "arn:aws:ec2:*:*:network-interface/*"
    ]
}
```

以下清單詳細列出當您使用自己的 Amazon VPC 設定叢集時啟用 SageMaker HyperPod 叢集功能所需的許可。
+ 需要下列 `ec2` 許可，才能使用 VPC 設定 SageMaker HyperPod 叢集。

  ```
  {
      "Effect": "Allow",
      "Action": [
          "ec2:CreateNetworkInterface",
          "ec2:CreateNetworkInterfacePermission",
          "ec2:DeleteNetworkInterface",
          "ec2:DeleteNetworkInterfacePermission",
          "ec2:DescribeNetworkInterfaces",
          "ec2:DescribeVpcs",
          "ec2:DescribeDhcpOptions",
          "ec2:DescribeSubnets",
          "ec2:DescribeSecurityGroups"
      ],
      "Resource": "*"
  }
  ```
+ 需要下列 `ec2` 許可，才能啟用 [SageMaker HyperPod 自動繼續功能](sagemaker-hyperpod-resiliency-slurm-auto-resume.md)。

  ```
  {
      "Effect": "Allow",
      "Action": [
          "ec2:DetachNetworkInterface"
      ],
      "Resource": "*"
  }
  ```
+ 下列 `ec2` 許可允許 SageMaker HyperPod 在帳戶內的網路介面上建立標籤。

  ```
  {
      "Effect": "Allow",
      "Action": "ec2:CreateTags",
      "Resource": [
          "arn:aws:ec2:*:*:network-interface/*"
      ]
  }
  ```

------
#### [ Amazon EKS ]

對於與 Amazon EKS 協作的 HyperPod，您必須將下列受管政策連接至 SageMaker HyperPod IAM 角色。
+ [AmazonSageMakerClusterInstanceRolePolicy](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonSageMakerClusterInstanceRolePolicy.html)

除了受管政策之外，請將下列許可政策連接至角色。

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:AssignPrivateIpAddresses",
        "ec2:AttachNetworkInterface",
        "ec2:CreateNetworkInterface",
        "ec2:CreateNetworkInterfacePermission",
        "ec2:DeleteNetworkInterface",
        "ec2:DeleteNetworkInterfacePermission",
        "ec2:DescribeInstances",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DescribeTags",
        "ec2:DescribeVpcs",
        "ec2:DescribeDhcpOptions",
        "ec2:DescribeSubnets",
        "ec2:DescribeSecurityGroups",
        "ec2:DetachNetworkInterface",
        "ec2:ModifyNetworkInterfaceAttribute",
        "ec2:UnassignPrivateIpAddresses",
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:GetAuthorizationToken",
        "ecr:GetDownloadUrlForLayer",
        "eks-auth:AssumeRoleForPodIdentity"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateTags"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:network-interface/*"
      ]
    }
  ]
}
```

------

**注意**  
`"eks-auth:AssumeRoleForPodIdentity"` 許可是選用的。如果您打算使用 EKS Pod 身分識別，其是必要的。

**SageMaker HyperPod 服務連結角色**

對於 SageMaker HyperPod 中的 Amazon EKS 支援，HyperPod 會使用 [AWS 受管政策：AmazonSageMakerHyperPodServiceRolePolicy](security-iam-awsmanpol-AmazonSageMakerHyperPodServiceRolePolicy.md) 建立服務連結角色，以監控和支援 EKS 叢集上的彈性，例如取代節點和重新啟動任務。

**具有受限執行個體群組 (RIG) 的 Amazon EKS 叢集的其他 IAM 政策**

在受限執行個體群組中執行的工作負載依賴要從 Amazon S3 載入資料的執行角色。您必須將額外的 Amazon S3 許可新增至執行角色，以便在受限執行個體群組中執行的自訂任務可以正確擷取輸入資料。

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket"      
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket/*"
      ]
    }
  ]
}
```

------

------