Updating the training plan of a SageMaker HyperPod cluster using the SageMaker API or the AWS CLI
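You can use the update-cluster AWS CLI command to update the instance groups of an existing cluster, which lets you add, update, or remove a training plan. The following example shows how to update a SageMaker HyperPod cluster and provide a new training plan for its instance groups.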
# Update a cluster.
# Replace cluster-name, source_s3_uri, customer_account_id, execution_role,
# and training_plan_arn with your own values.
aws sagemaker update-cluster \
    --cluster-name cluster-name \
    --instance-groups '[
      {
        "InstanceCount": 1,
        "InstanceGroupName": "controller-nodes",
        "InstanceType": "ml.t3.xlarge",
        "LifeCycleConfig": {"SourceS3Uri": "source_s3_uri", "OnCreate": "on_create.sh"},
        "ExecutionRole": "arn:aws:iam::customer_account_id:role/execution_role",
        "ThreadsPerCore": 1
      },
      {
        "InstanceCount": 2,
        "InstanceGroupName": "worker-nodes",
        "InstanceType": "ml.p4d.24xlarge",
        "LifeCycleConfig": {"SourceS3Uri": "source_s3_uri", "OnCreate": "on_create.sh"},
        "ExecutionRole": "arn:aws:iam::customer_account_id:role/execution_role",
        "ThreadsPerCore": 1,
        "TrainingPlanArn": "training_plan_arn"
      },
      {
        "InstanceCount": 1,
        "InstanceGroupName": "worker-nodes-2",
        "InstanceType": "ml.p4d.24xlarge",
        "LifeCycleConfig": {"SourceS3Uri": "source_s3_uri", "OnCreate": "on_create.sh"},
        "ExecutionRole": "arn:aws:iam::customer_account_id:role/execution_role",
        "ThreadsPerCore": 1,
        "TrainingPlanArn": "training_plan_arn"
      }
    ]'
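Quoting a long JSON array inline is easy to get wrong, so one option is to keep the instance groups in a file and pass it with the AWS CLI file:// prefix. The following is a minimal sketch, not an official procedure. The file name instance-groups.json and the RUN_UPDATE switch are hypothetical names introduced here, python3 is assumed to be installed for the JSON check, and every value in the JSON is a placeholder. Only one instance group is shown for brevity; use the same array you would otherwise pass inline.

```shell
set -eu

# Hypothetical file name; every value in the JSON below is a placeholder.
GROUPS_FILE="instance-groups.json"

# Write the instance group definition to a file (quoted heredoc: no expansion).
cat > "$GROUPS_FILE" <<'EOF'
[
  {
    "InstanceCount": 2,
    "InstanceGroupName": "worker-nodes",
    "InstanceType": "ml.p4d.24xlarge",
    "LifeCycleConfig": {"SourceS3Uri": "source_s3_uri", "OnCreate": "on_create.sh"},
    "ExecutionRole": "arn:aws:iam::customer_account_id:role/execution_role",
    "ThreadsPerCore": 1,
    "TrainingPlanArn": "training_plan_arn"
  }
]
EOF

# Fail early if the file is not valid JSON (assumes python3 is available).
python3 -m json.tool "$GROUPS_FILE" > /dev/null
echo "valid JSON: $GROUPS_FILE"

# Call the API only when explicitly requested, for example: RUN_UPDATE=1 sh update.sh
# RUN_UPDATE is a hypothetical switch for this sketch, not an AWS CLI feature.
if [ "${RUN_UPDATE:-0}" = "1" ]; then
  aws sagemaker update-cluster \
    --cluster-name cluster-name \
    --instance-groups "file://$GROUPS_FILE"
fi
```

Validating the file before the call surfaces trailing commas or unquoted placeholders locally, before the request is sent to the service.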