本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。
设置 HyperPod 集群以进行模型部署
本指南为您提供了在 Amazon SageMaker HyperPod 集群上启用推理功能的全面设置指南。以下步骤可帮助您设置支持机器学习工程师部署和管理推理端点所需的基础架构、权限和操作员。
先决条件
在继续操作之前,请验证您的 AWS 凭证配置是否正确并具有必要的权限。确认您已使用创建 HyperPod集群创建集 SageMaker HyperPod 群。
-
将 kubectl 配置为连接到由 Amazon EKS HyperPod 集群编排的新创建的集群。指定区域和 HyperPod 集群名称。
export HYPERPOD_CLUSTER_NAME=<hyperpod-cluster-name> export REGION=<region> # S3 bucket where tls certificates will be uploaded BUCKET_NAME="<Enter name of your s3 bucket>" # This should be bucket name, not URI export EKS_CLUSTER_NAME=$(aws --region $REGION sagemaker describe-cluster --cluster-name $HYPERPOD_CLUSTER_NAME \ --query 'Orchestrator.Eks.ClusterArn' --output text | \ cut -d'/' -f2) aws eks update-kubeconfig --name $EKS_CLUSTER_NAME --region $REGION
-
设置默认环境变量。
LB_CONTROLLER_POLICY_NAME="AWSLoadBalancerControllerIAMPolicy-$HYPERPOD_CLUSTER_NAME" LB_CONTROLLER_ROLE_NAME="aws-load-balancer-controller-$HYPERPOD_CLUSTER_NAME" S3_MOUNT_ACCESS_POLICY_NAME="S3MountpointAccessPolicy-$HYPERPOD_CLUSTER_NAME" S3_CSI_ROLE_NAME="SM_HP_S3_CSI_ROLE-$HYPERPOD_CLUSTER_NAME" KEDA_OPERATOR_POLICY_NAME="KedaOperatorPolicy-$HYPERPOD_CLUSTER_NAME" KEDA_OPERATOR_ROLE_NAME="keda-operator-role-$HYPERPOD_CLUSTER_NAME" PRESIGNED_URL_ACCESS_POLICY_NAME="PresignedUrlAccessPolicy-$HYPERPOD_CLUSTER_NAME" HYPERPOD_INFERENCE_ACCESS_POLICY_NAME="HyperpodInferenceAccessPolicy-$HYPERPOD_CLUSTER_NAME" HYPERPOD_INFERENCE_ROLE_NAME="HyperpodInferenceRole-$HYPERPOD_CLUSTER_NAME" HYPERPOD_INFERENCE_SA_NAME="hyperpod-inference-service-account" HYPERPOD_INFERENCE_SA_NAMESPACE="kube-system" JUMPSTART_GATED_ROLE_NAME="JumpstartGatedRole-$HYPERPOD_CLUSTER_NAME" FSX_CSI_ROLE_NAME="AmazonEKSFSxLustreCSIDriverFullAccess-$HYPERPOD_CLUSTER_NAME"
-
从集群 ARN 中提取 Amazon EKS 集群名称,更新本地 kubeconfig,并通过列出命名空间中的所有 pod 来验证连接。
kubectl get pods --all-namespaces
-
(可选)安装 NVIDIA 设备插件以在集群上启用 GPU 支持。
#Install nvidia device plugin kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.5/nvidia-device-plugin.yml # Verify that GPUs are visible to k8s kubectl get nodes -o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia.com/gpu
为推理操作员安装做好环境准备
现在,您的 HyperPod 集群已配置完毕,下一步是安装推理运算符。推理运算符是一个 Kubernetes 运算符,支持在 Amazon EKS 集群上部署和管理机器学习推理终端节点。
完成接下来的关键准备步骤,确保您的 Amazon EKS 集群具有适当的安全配置和支持基础设施组件。这包括为跨服务身份验证配置 IAM 角色和安全策略、安装用于入口管理的 Loa AWS d Balancer Controller、设置 Amazon S3 和 Amazon FSx CSI 驱动程序以实现永久存储访问以及部署 KEDA 和证书管理器以实现自动扩展和证书管理功能。
-
收集在 EKS 和 IAM 组件之间配置服务集成 ARNs 所需的基本 AWS 资源标识符。 SageMaker
%%bash -x export ACCOUNT_ID=$(aws --region $REGION sts get-caller-identity --query 'Account' --output text) export OIDC_ID=$(aws --region $REGION eks describe-cluster --name $EKS_CLUSTER_NAME --query "cluster.identity.oidc.issuer" --output text | cut -d '/' -f 5) export EKS_CLUSTER_ROLE=$(aws eks --region $REGION describe-cluster --name $EKS_CLUSTER_NAME --query 'cluster.roleArn' --output text)
-
将 IAM OIDCidentity 提供商与您的 EKS 集群关联。
eksctl utils associate-iam-oidc-provider --region=$REGION --cluster=$EKS_CLUSTER_NAME --approve
-
创建 HyperPod 推理运算符 IAM 角色所需的信任策略和权限策略 JSON 文档。这些策略可在 EKS 和其他服务之间实现安全的跨 AWS 服务通信。 SageMaker
bash # Create trust policy JSON cat << EOF > trust-policy.json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": [ "sagemaker.amazonaws.com" ] }, "Action": "sts:AssumeRole" }, { "Effect": "Allow", "Principal": { "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/oidc.eks.${REGION}.amazonaws.com/id/${OIDC_ID}" }, "Action": "sts:AssumeRoleWithWebIdentity", "Condition": { "StringLike": { "oidc.eks.${REGION}.amazonaws.com/id/${OIDC_ID}:aud": "sts.amazonaws.com", "oidc.eks.${REGION}.amazonaws.com/id/${OIDC_ID}:sub": "system:serviceaccount:*:*" } } } ] } EOF # Create permission policy JSON cat << EOF > permission-policy.json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:ListBucket", "s3:PutObject", "s3:GetObject", "s3:DeleteObject" ], "Resource": [ "arn:aws:s3:::$BUCKET_NAME" "arn:aws:s3:::$BUCKET_NAME/*" ] }, { "Effect": "Allow", "Action": [ "ecr:GetAuthorizationToken" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "ecr:BatchCheckLayerAvailability", "ecr:GetDownloadUrlForLayer", "ecr:GetRepositoryPolicy", "ecr:DescribeRepositories", "ecr:ListImages", "ecr:DescribeImages", "ecr:BatchGetImage", "ecr:GetLifecyclePolicy", "ecr:GetLifecyclePolicyPreview", "ecr:ListTagsForResource", "ecr:DescribeImageScanFindings" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "ec2:AssignPrivateIpAddresses", "ec2:AttachNetworkInterface", "ec2:CreateNetworkInterface", "ec2:DeleteNetworkInterface", "ec2:DescribeInstances", "ec2:DescribeTags", "ec2:DescribeNetworkInterfaces", "ec2:DescribeInstanceTypes", "ec2:DescribeSubnets", "ec2:DetachNetworkInterface", "ec2:DescribeDhcpOptions", "ec2:ModifyNetworkInterfaceAttribute", "ec2:UnassignPrivateIpAddresses", "ec2:CreateTags", "ec2:DescribeRouteTables", "ec2:DescribeSecurityGroups", "ec2:DescribeVolumes", "ec2:DescribeVolumesModifications", "ec2:DescribeVpcs" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "eks:Describe*", "eks:List*", "eks:AssociateAccessPolicy", "eks:AccessKubernetesApi", "eks-auth:AssumeRoleForPodIdentity" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "elasticloadbalancing:Create*", "elasticloadbalancing:Describe*" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "sagemaker:CreateModel", "sagemaker:DescribeModel", "sagemaker:DeleteModel", "sagemaker:ListModels", "sagemaker:CreateEndpointConfig", "sagemaker:DescribeEndpointConfig", "sagemaker:DeleteEndpointConfig", "sagemaker:CreateEndpoint", "sagemaker:DeleteEndpoint", "sagemaker:DescribeEndpoint", "sagemaker:UpdateEndpoint", "sagemaker:ListTags", "sagemaker:EnableClusterInference", "sagemaker:DescribeClusterInference", "sagemaker:DescribeHubContent" ], "Resource": "arn:aws:sagemaker:$REGION:*:*" }, { "Effect": "Allow", "Action": [ "fsx:DescribeFileSystems" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "acm:ImportCertificate", "acm:DeleteCertificate" ], "Resource": "arn:aws:acm:$REGION:$ACCOUNT_ID:certificate/*" }, { "Sid": "AllowPassRoleToSageMaker", "Effect": "Allow", "Action": [ "iam:PassRole" ], "Resource": "arn:aws:iam::$ACCOUNT_ID:role/$HYPERPOD_INFERENCE_ROLE_NAME", "Condition": { "StringEquals": { "iam:PassedToService": "sagemaker.amazonaws.com" } } }, { "Sid": "CloudWatchEMFPermissions", "Effect": "Allow", "Action": [ "cloudwatch:PutMetricData", "logs:PutLogEvents", "logs:DescribeLogStreams", "logs:DescribeLogGroups", "logs:CreateLogStream", "logs:CreateLogGroup" ], "Resource": "*" } ] } EOF
-
为推理运算符创建执行角色。
aws iam create-policy --policy-name $HYPERPOD_INFERENCE_ACCESS_POLICY_NAME --policy-document file://permission-policy.json export policy_arn="arn:aws:iam::${ACCOUNT_ID}:policy/$HYPERPOD_INFERENCE_ACCESS_POLICY_NAME" # Create the IAM role eksctl create iamserviceaccount --approve --role-only --name=$HYPERPOD_INFERENCE_SA_NAME --namespace=$HYPERPOD_INFERENCE_SA_NAMESPACE --cluster=$EKS_CLUSTER_NAME --attach-policy-arn=$policy_arn --role-name=$HYPERPOD_INFERENCE_ROLE_NAME --region=$REGION
aws iam create-role --role-name $HYPERPOD_INFERENCE_ROLE_NAME --assume-role-policy-document file://trust-policy.json aws iam put-role-policy --role-name $HYPERPOD_INFERENCE_ROLE_NAME --policy-name InferenceOperatorInlinePolicy --policy-document file://permission-policy.json
-
下载并创建 Load Balancer Controller 管理您的 EKS 集群中的应用程序负 AWS 载均衡器和网络负载均衡器所需的 IAM 策略。
%%bash -x export ALBController_IAM_POLICY_NAME=HyperPodInferenceALBControllerIAMPolicy curl -o AWSLoadBalancerControllerIAMPolicy.json https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.13.0/docs/install/iam_policy.json aws iam create-policy --policy-name $ALBController_IAM_POLICY_NAME --policy-document file://AWSLoadBalancerControllerIAMPolicy.json
-
创建一个 IAM 服务账户,将 Kubernetes 服务账户与 IAM 策略关联起来,使 AWS 负载均衡器控制器能够通过 IRSA(服务账户的 IAM 角色)获得必要的 AWS 权限。
%%bash -x export ALB_POLICY_ARN="arn:aws:iam::$ACCOUNT_ID:policy/$ALBController_IAM_POLICY_NAME" # Create IAM service account with gathered values eksctl create iamserviceaccount \ --approve \ --override-existing-serviceaccounts \ --name=aws-load-balancer-controller \ --namespace=kube-system \ --cluster=$EKS_CLUSTER_NAME \ --attach-policy-arn=$ALB_POLICY_ARN \ --region=$REGION # Print the values for verification echo "Cluster Name: $EKS_CLUSTER_NAME" echo "Region: $REGION" echo "Policy ARN: $ALB_POLICY_ARN"
-
将标签 (
kubernetes.io.role/elb
) 应用于 EKS 集群中的所有子网(包括公有子网和私有子网)。export VPC_ID=$(aws --region $REGION eks describe-cluster --name $EKS_CLUSTER_NAME --query 'cluster.resourcesVpcConfig.vpcId' --output text) # Add Tags aws ec2 describe-subnets \ --filters "Name=vpc-id,Values=${VPC_ID}" "Name=map-public-ip-on-launch,Values=true" \ --query 'Subnets[*].SubnetId' --output text | \ tr '\t' '\n' | \ xargs -I{} aws ec2 create-tags --resources {} --tags Key=kubernetes.io/role/elb,Value=1 # Verify Tags are added aws ec2 describe-subnets \ --filters "Name=vpc-id,Values=${VPC_ID}" "Name=map-public-ip-on-launch,Values=true" \ --query 'Subnets[*].SubnetId' --output text | \ tr '\t' '\n' | xargs -n1 -I{} aws ec2 describe-tags --filters "Name=resource-id,Values={}" "Name=key,Values=kubernetes.io/role/elb" --query "Tags[0].Value" --output text
-
为 KEDA 和证书管理器创建命名空间。
kubectl create namespace keda kubectl create namespace cert-manager
-
创建 Amazon S3 VPC 端点。
aws ec2 create-vpc-endpoint \ --vpc-id ${VPC_ID} \ --vpc-endpoint-type Gateway \ --service-name "com.amazonaws.${REGION}.s3" \ --route-table-ids $(aws ec2 describe-route-tables --filters "Name=vpc-id,Values=${VPC_ID}" --query 'RouteTables[].Associations[].RouteTableId' --output text | tr ' ' '\n' | sort -u | tr '\n' ' ')
-
配置 S3 存储访问权限:
-
创建一个 IAM 策略,授予使用适用于 Amazon S3 的 Mountpoint 所必需的 S3 权限,从而允许文件系统从集群内访问 S3 存储桶。
%%bash -x cat <<EOF> s3accesspolicy.json { "Version": "2012-10-17", "Statement": [ { "Sid": "MountpointFullBucketAccess", "Effect": "Allow", "Action": [ "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::*", "arn:aws:s3:::*/*" ] }, { "Sid": "MountpointFullObjectAccess", "Effect": "Allow", "Action": [ "s3:GetObject", "s3:PutObject", "s3:AbortMultipartUpload", "s3:DeleteObject" ], "Resource": [ "arn:aws:s3:::*", "arn:aws:s3:::*/*" ] } ] } EOF aws iam create-policy \ --policy-name S3MountpointAccessPolicy \ --policy-document file://s3accesspolicy.json cat <<EOF>> s3accesstrustpolicy.json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Federated": "arn:aws:iam::$ACCOUNT_ID:oidc-provider/oidc.eks.$REGION.amazonaws.com/id/${OIDC_ID}" }, "Action": "sts:AssumeRoleWithWebIdentity", "Condition": { "StringEquals": { "oidc.eks.$REGION.amazonaws.com/id/${OIDC_ID}:aud": "sts.amazonaws.com", "oidc.eks.$REGION.amazonaws.com/id/${OIDC_ID}:sub": "system:serviceaccount:kube-system:${s3-csi-driver-sa}" } } } ] } EOF aws iam create-role --role-name $S3_CSI_ROLE_NAME --assume-role-policy-document file://s3accesstrustpolicy.json aws iam attach-role-policy --role-name $S3_CSI_ROLE_NAME --policy-arn "arn:aws:iam::$ACCOUNT_ID:policy/S3MountpointAccessPolicy"
-
(可选)为 Amazon S3 CSI 驱动程序创建 IAM 服务账户。Amazon S3 CSI 驱动程序需要一个具有适当权限的 IAM 服务账户,才能将 S3 存储桶作为永久卷挂载到您的 EKS 集群中。此步骤使用所需的 S3 访问策略创建必要的 IAM 角色和 Kubernetes 服务账户。
%%bash -x export S3_CSI_ROLE_NAME="SM_HP_S3_CSI_ROLE-$REGION" export S3_CSI_POLICY_ARN=$(aws iam list-policies --query 'Policies[?PolicyName==`S3MountpointAccessPolicy`]' | jq '.[0].Arn' | tr -d '"') eksctl create iamserviceaccount \ --name s3-csi-driver-sa \ --namespace kube-system \ --cluster $EKS_CLUSTER_NAME \ --attach-policy-arn $S3_CSI_POLICY_ARN \ --approve \ --role-name $S3_CSI_ROLE_NAME \ --region $REGION kubectl label serviceaccount s3-csi-driver-sa app.kubernetes.io/component=csi-driver app.kubernetes.io/instance=aws-mountpoint-s3-csi-driver app.kubernetes.io/managed-by=EKS app.kubernetes.io/name=aws-mountpoint-s3-csi-driver -n kube-system --overwrite
-
(可选)安装 Amazon S3 CSI 驱动程序附加组件。此驱动程序使您的 pod 能够将 S3 存储桶作为永久卷挂载,从而可以从 Kubernetes 工作负载中直接访问 S3 存储。
%%bash -x export S3_CSI_ROLE_ARN=$(aws iam get-role --role-name $S3_CSI_ROLE_NAME --query 'Role.Arn' --output text) eksctl create addon --name aws-mountpoint-s3-csi-driver --cluster $EKS_CLUSTER_NAME --service-account-role-arn $S3_CSI_ROLE_ARN --force
-
(可选)为 S3 存储创建永久卷声明 (PVC)。此 PVC 使您的 Pod 能够像传统文件系统一样请求和使用 S3 存储。
%%bash -x cat <<EOF> pvc_s3.yaml apiVersion: v1 kind: PersistentVolumeClaim metadata: name: s3-claim spec: accessModes: - ReadWriteMany # supported options: ReadWriteMany / ReadOnlyMany storageClassName: "" # required for static provisioning resources: requests: storage: 1200Gi # ignored, required volumeName: s3-pv EOF kubectl apply -f pvc_s3.yaml
-
-
(可选)配置 FSx 存储访问权限。为 Amazon FSx CSI 驱动程序创建一个 IAM 服务账户。 FSx CSI 驱动程序将使用此服务账户代表您的集群与 Amazon FSx 服务进行交互。
%%bash -x eksctl create iamserviceaccount \ --name fsx-csi-controller-sa \ --namespace kube-system \ --cluster $EKS_CLUSTER_NAME \ --attach-policy-arn arn:aws:iam::aws:policy/AmazonFSxFullAccess \ --approve \ --role-name FSXLCSI-${EKS_CLUSTER_NAME}-${REGION} \ --region $REGION
创建 KEDA 操作员角色
-
创建信任策略和权限策略。
# Create trust policy cat <<EOF > /tmp/keda-trust-policy.json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Federated": "arn:aws:iam::$ACCOUNT_ID:oidc-provider/oidc.eks.$REGION.amazonaws.com/id/$OIDC_ID" }, "Action": "sts:AssumeRoleWithWebIdentity", "Condition": { "StringLike": { "oidc.eks.$REGION.amazonaws.com/id/$OIDC_ID:sub": "system:serviceaccount:kube-system:keda-operator", "oidc.eks.$REGION.amazonaws.com/id/$OIDC_ID:aud": "sts.amazonaws.com" } } } ] } EOF # Create permissions policy cat <<EOF > /tmp/keda-policy.json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "aps:QueryMetrics", "aps:GetLabels", "aps:GetSeries", "aps:GetMetricMetadata" ], "Resource": "*" } ] } EOF # Create the role aws iam create-role \ --role-name keda-operator-role \ --assume-role-policy-document file:///tmp/keda-trust-policy.json # Create the policy KEDA_POLICY_ARN=$(aws iam create-policy \ --policy-name KedaOperatorPolicy \ --policy-document file:///tmp/keda-policy.json \ --query 'Policy.Arn' \ --output text) # Attach the policy to the role aws iam attach-role-policy \ --role-name keda-operator-role \ --policy-arn $KEDA_POLICY_ARN
-
如果您使用的是门控模型,请创建一个 IAM 角色来访问门控模型。
-
创建一个 IAM 策略。
%%bash -s $REGION cat <<EOF> /tmp/presignedurl-policy.json { "Version": "2012-10-17", "Statement": [ { "Sid": "CreatePresignedUrlAccess", "Effect": "Allow", "Action": [ "sagemaker:CreateHubContentPresignedUrls" ], "Resource": [ "arn:aws:sagemaker:$1:aws:hub/SageMakerPublicHub", "arn:aws:sagemaker:$1:aws:hub-content/SageMakerPublicHub/*/*" ] } ] } EOF aws iam create-policy --policy-name PresignedUrlAccessPolicy --policy-document file:///tmp/presignedurl-policy.json JUMPSTART_GATED_ROLE_NAME="JumpstartGatedRole-${REGION}-${HYPERPOD_CLUSTER_NAME}" cat <<EOF > /tmp/trust-policy.json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Federated": "arn:aws:iam::$ACCOUNT_ID:oidc-provider/oidc.eks.$REGION.amazonaws.com/id/$OIDC_ID" }, "Action": "sts:AssumeRoleWithWebIdentity", "Condition": { "StringLike": { "oidc.eks.$REGION.amazonaws.com/id/$OIDC_ID:sub": "system:serviceaccount:*:$HYPERPOD_INFERENCE_SA_NAME", "oidc.eks.$REGION.amazonaws.com/id/$OIDC_ID:aud": "sts.amazonaws.com" } } }, { "Effect": "Allow", "Principal": { "Service": "sagemaker.amazonaws.com" }, "Action": "sts:AssumeRole" } ] } EOF
-
创建一个 IAM 角色。
# Create the role using existing trust policy aws iam create-role \ --role-name $JUMPSTART_GATED_ROLE_NAME \ --assume-role-policy-document file:///tmp/trust-policy.json # Attach the existing PresignedUrlAccessPolicy to the role aws iam attach-role-policy \ --role-name $JUMPSTART_GATED_ROLE_NAME \ --policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/PresignedUrlAccessPolicy
JUMPSTART_GATED_ROLE_ARN_LIST= !aws iam get-role --role-name=$JUMPSTART_GATED_ROLE_NAME --query "Role.Arn" --output text JUMPSTART_GATED_ROLE_ARN = JUMPSTART_GATED_ROLE_ARN_LIST[0] !echo $JUMPSTART_GATED_ROLE_ARN
-
向执行角色添加
SageMakerFullAccess
策略。aws iam attach-role-policy --role-name=$HYPERPOD_INFERENCE_ROLE_NAME --policy-arn=arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
-
安装推理运算符
-
安装 HyperPod 推理运算符。此步骤收集所需的 AWS 资源标识符,并生成带有相应配置参数的 Helm 安装命令。
从 https://github.com/aws/sagemaker-hyperpod-cli/tree/main/helm_chart 访问掌舵图
。 git clone https://github.com/aws/sagemaker-hyperpod-cli cd sagemaker-hyperpod-cli cd helm_chart/HyperPodHelmChart
%%bash -x HYPERPOD_INFERENCE_ROLE_ARN=$(aws iam get-role --role-name=$HYPERPOD_INFERENCE_ROLE_NAME --query "Role.Arn" --output text) echo $HYPERPOD_INFERENCE_ROLE_ARN S3_CSI_ROLE_ARN=$(aws iam get-role --role-name=$S3_CSI_ROLE_NAME --query "Role.Arn" --output text) echo $S3_CSI_ROLE_ARN HYPERPOD_CLUSTER_ARN=$(aws sagemaker describe-cluster --cluster-name $HYPERPOD_CLUSTER_NAME --query "ClusterArn") # Verify values echo "Cluster Name: $EKS_CLUSTER_NAME" echo "Execution Role: $HYPERPOD_INFERENCE_ROLE_ARN" echo "Hyperpod ARN: $HYPERPOD_CLUSTER_ARN" # Run the the HyperPod inference operator installation. helm install hyperpod-inference-operator charts/inference-operator -n kube-system \ --set region=$REGION \ --set eksClusterName=$EKS_CLUSTER_NAME \ --set hyperpodClusterArn=$HYPERPOD_CLUSTER_ARN \ --set executionRoleArn=$HYPERPOD_INFERENCE_ROLE_ARN \ --set s3.serviceAccountRoleArn=$S3_CSI_ROLE_ARN \ --set s3.node.serviceAccount.create=false \ --set keda.podIdentity.aws.irsa.roleArn="arn:aws:iam::$ACCOUNT_ID:role/keda-operator-role" \ --set tlsCertificateS3Bucket="s3://$BUCKET_NAME" \ --set alb.region=$REGION \ --set alb.clusterName=$EKS_CLUSTER_NAME \ --set alb.vpcId=$VPC_ID # For JumpStart Gated Model usage, Add # --set jumpstartGatedModelDownloadRoleArn=$UMPSTART_GATED_ROLE_ARN
-
为 IAM 集成配置服务账户注释。此注解使操作员的服务账户能够获得管理推理端点和与 AWS 服务交互所必需的 IAM 权限。
%%bash -x EKS_CLUSTER_ROLE_NAME=$(echo $EKS_CLUSTER_ROLE | sed 's/.*\///') # Annotate service account kubectl annotate serviceaccount hyperpod-inference-operator-controller-manager \ -n hyperpod-inference-system \ eks.amazonaws.com/role-arn=arn:aws:iam::${ACCOUNT_ID}:role/${EKS_CLUSTER_ROLE_NAME} \ --overwrite
验证推理运算符是否正常工作
-
创建模型部署配置文件。这将创建一个 Kubernetes 清单文件,用于为推理运算符定义 JumpStart 模型部署。 HyperPod该配置指定了如何将来自亚马逊的预训练模型部署 SageMaker JumpStart 为 Amazon EKS 集群上的推理终端节点。
cat <<EOF>> simple_model_install.yaml --- apiVersion: inference.sagemaker.aws.amazon.com/v1alpha1 kind: JumpStartModel metadata: name: testing-deployment-bert namespace: default spec: model: modelId: "huggingface-eqa-bert-base-cased" sageMakerEndpoint: name: "hp-inf-ep-for-testing" server: instanceType: "ml.c5.2xlarge" environmentVariables: - name: SAMPLE_ENV_VAR value: "sample_value" maxDeployTimeInSeconds: 1800 EOF
-
部署模型并清理配置文件。此步骤创建 JumpStart 模型资源并删除临时配置文件以保持工作空间整洁。
%%bash -x kubectl create -f simple_model_install.yaml rm -rfv simple_model_install.yaml
-
检查模型是否已安装并正在运行。此验证可确保操作员能够成功获得管理推理端点的 AWS 权限。
%%bash # Get the service account details kubectl get serviceaccount -n hyperpod-inference-system # Check if the service account has the AWS annotations kubectl describe serviceaccount hyperpod-inference-operator-controller-manager -n hyperpod-inference-system
-
写入模型输入文件。这将创建一个包含示例数据的 JSON 输入文件,用于测试已部署模型的问答能力。
%%writefile demo-input.json {"question" :"what is the name of the planet?","context" : "earth"}
-
调用 SageMaker 端点执行负载测试,以验证推理端点的性能和可靠性。
%%bash #!/bin/bash for i in {1..1000} do echo "Invocation #$i" aws sagemaker-runtime invoke-endpoint \ --endpoint-name testing-deployment-jumpstart-9 \ --region {REGION} \ --body fileb://demo-input.json \ --content-type application/list-text \ --accept application/json \ "demoout_${i}.json" # Add a small delay to prevent throttling (optional) #sleep 0.5 rm -f "demoout_${i}.json" done