AWS::SageMaker::Cluster ClusterInstanceGroup - AWS CloudFormation

The configuration information of the instance group within the HyperPod cluster.

Syntax

To declare this entity in your CloudFormation template, use the following syntax:
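As a rough illustration of the shape this property takes, the following Python sketch builds one instance group containing only the properties marked "Required: Yes" in this reference and prints it as JSON. All values are hypothetical placeholders. The `OnCreate` and `SourceS3Uri` keys inside `LifeCycleConfig` are an assumption about the ClusterLifeCycleConfig type, which is documented separately, and the `missing_required` helper is illustrative only, not part of any AWS tooling.

```python
import json

# Property names marked "Required: Yes" in this reference.
REQUIRED = {
    "ExecutionRole",
    "InstanceCount",
    "InstanceGroupName",
    "InstanceType",
    "LifeCycleConfig",
}

# Hypothetical example values; replace them with your own.
instance_group = {
    "InstanceGroupName": "worker-group-1",
    "InstanceType": "ml.g5.8xlarge",
    "InstanceCount": 2,
    "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodExecutionRole",
    # Assumption: ClusterLifeCycleConfig carries OnCreate and SourceS3Uri.
    "LifeCycleConfig": {
        "OnCreate": "on_create.sh",
        "SourceS3Uri": "s3://amzn-s3-demo-bucket/lifecycle/",
    },
}


def missing_required(group: dict) -> set:
    """Return the required property names that are absent from the group."""
    return REQUIRED - set(group)


if __name__ == "__main__":
    print(json.dumps(instance_group, indent=2))
```

Calling `missing_required(instance_group)` on the dictionary above returns an empty set. Optional properties such as `ThreadsPerCore` or `OverrideVpcConfig` would be added as further keys.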

Properties

CapacityRequirements

The capacity requirements for the instance group, specifying On-Demand and Spot Instance configurations.

Required: No

Type: ClusterCapacityRequirements

Update requires: No interruption

CurrentCount

The number of instances that are currently in the instance group of a SageMaker HyperPod cluster.

Required: No

Type: Integer

Minimum: 0

Update requires: No interruption

ExecutionRole

The ARN of the IAM execution role that the instance group assumes.

Required: Yes

Type: String

Pattern: ^arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+$

Minimum length: 20

Maximum length: 2048

Update requires: Replacement

ImageId

The ID of the Amazon Machine Image (AMI) to use for the instances in the group.

Required: No

Type: String

Pattern: ^ami-[0-9a-fA-F]{8,17}|default$

Minimum length: 7

Maximum length: 21

Update requires: No interruption
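The ImageId pattern has no grouping around its alternation, so read literally it means "starts with an AMI ID" or "ends with `default`", not "is exactly one of the two". The sketch below compares that literal reading with a stricter grouped variant using Python's `re` module. The grouped pattern and the `accepts` helper are assumptions added for illustration. This reference does not state how the service evaluates the pattern, and Python's regex dialect may differ from it in edge cases.

```python
import re

# Pattern copied verbatim from the ImageId property above.
LITERAL = re.compile(r"^ami-[0-9a-fA-F]{8,17}|default$")

# Hypothetical stricter variant: the whole value must be an AMI ID or "default".
STRICT = re.compile(r"^(?:ami-[0-9a-fA-F]{8,17}|default)$")


def accepts(pattern: re.Pattern, value: str) -> bool:
    """Apply the documented length bounds (7 to 21) and then the pattern."""
    return 7 <= len(value) <= 21 and pattern.search(value) is not None


if __name__ == "__main__":
    for candidate in ("ami-0abcdef1234567890", "default",
                      "ami-12345678-oops", "my-default"):
        print(candidate, accepts(LITERAL, candidate), accepts(STRICT, candidate))
```

Under these assumptions, both patterns accept a well-formed AMI ID and the string `default`, while only the literal pattern accepts `ami-12345678-oops` and `my-default`. Validating against the stricter form before deployment is the safer choice.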

InstanceCount

The number of instances in an instance group of the SageMaker HyperPod cluster.

Required: Yes

Type: Integer

Minimum: 0

Update requires: No interruption

InstanceGroupName

The name of the instance group of a SageMaker HyperPod cluster.

Required: Yes

Type: String

Pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9])*$

Minimum length: 1

Maximum length: 63

Update requires: Replacement
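The name pattern requires an alphanumeric first and last character and allows hyphens only between alphanumeric characters. A minimal Python sketch of a pre-deployment check, combining the pattern with the 1 to 63 length bounds, might look like this. The function name is hypothetical, and Python's `re` module stands in for whatever engine the service uses.

```python
import re

# Pattern copied from the InstanceGroupName property above.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*$")


def is_valid_group_name(name: str) -> bool:
    """Check the documented length bounds (1 to 63) and the name pattern."""
    # fullmatch avoids Python's habit of letting "$" match before a trailing newline.
    return 1 <= len(name) <= 63 and NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ("worker-group-1", "a--b", "-worker", "worker-", "worker_group"):
        print(f"{name!r}: {is_valid_group_name(name)}")
```

Names such as `worker-group-1` and `a--b` pass, while a leading hyphen, a trailing hyphen, or an underscore causes the check to fail.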

InstanceStorageConfigs

The additional storage configurations for the instances (nodes) launched in the instance group.

Required: No

Type: Array of ClusterInstanceStorageConfig

Maximum number of items: 4

Update requires: No interruption

InstanceType

The instance type of the instance group of a SageMaker HyperPod cluster.

Required: Yes

Type: String

Allowed values: ml.p4d.24xlarge | ml.p4de.24xlarge | ml.p5.48xlarge | ml.p5.4xlarge | ml.p6e-gb200.36xlarge | ml.trn1.32xlarge | ml.trn1n.32xlarge | ml.g5.xlarge | ml.g5.2xlarge | ml.g5.4xlarge | ml.g5.8xlarge | ml.g5.12xlarge | ml.g5.16xlarge | ml.g5.24xlarge | ml.g5.48xlarge | ml.c5.large | ml.c5.xlarge | ml.c5.2xlarge | ml.c5.4xlarge | ml.c5.9xlarge | ml.c5.12xlarge | ml.c5.18xlarge | ml.c5.24xlarge | ml.c5n.large | ml.c5n.2xlarge | ml.c5n.4xlarge | ml.c5n.9xlarge | ml.c5n.18xlarge | ml.m5.large | ml.m5.xlarge | ml.m5.2xlarge | ml.m5.4xlarge | ml.m5.8xlarge | ml.m5.12xlarge | ml.m5.16xlarge | ml.m5.24xlarge | ml.t3.medium | ml.t3.large | ml.t3.xlarge | ml.t3.2xlarge | ml.g6.xlarge | ml.g6.2xlarge | ml.g6.4xlarge | ml.g6.8xlarge | ml.g6.16xlarge | ml.g6.12xlarge | ml.g6.24xlarge | ml.g6.48xlarge | ml.gr6.4xlarge | ml.gr6.8xlarge | ml.g6e.xlarge | ml.g6e.2xlarge | ml.g6e.4xlarge | ml.g6e.8xlarge | ml.g6e.16xlarge | ml.g6e.12xlarge | ml.g6e.24xlarge | ml.g6e.48xlarge | ml.p5e.48xlarge | ml.p5en.48xlarge | ml.p6-b200.48xlarge | ml.trn2.3xlarge | ml.trn2.48xlarge | ml.c6i.large | ml.c6i.xlarge | ml.c6i.2xlarge | ml.c6i.4xlarge | ml.c6i.8xlarge | ml.c6i.12xlarge | ml.c6i.16xlarge | ml.c6i.24xlarge | ml.c6i.32xlarge | ml.m6i.large | ml.m6i.xlarge | ml.m6i.2xlarge | ml.m6i.4xlarge | ml.m6i.8xlarge | ml.m6i.12xlarge | ml.m6i.16xlarge | ml.m6i.24xlarge | ml.m6i.32xlarge | ml.r6i.large | ml.r6i.xlarge | ml.r6i.2xlarge | ml.r6i.4xlarge | ml.r6i.8xlarge | ml.r6i.12xlarge | ml.r6i.16xlarge | ml.r6i.24xlarge | ml.r6i.32xlarge | ml.i3en.large | ml.i3en.xlarge | ml.i3en.2xlarge | ml.i3en.3xlarge | ml.i3en.6xlarge | ml.i3en.12xlarge | ml.i3en.24xlarge | ml.m7i.large | ml.m7i.xlarge | ml.m7i.2xlarge | ml.m7i.4xlarge | ml.m7i.8xlarge | ml.m7i.12xlarge | ml.m7i.16xlarge | ml.m7i.24xlarge | ml.m7i.48xlarge | ml.r7i.large | ml.r7i.xlarge | ml.r7i.2xlarge | ml.r7i.4xlarge | ml.r7i.8xlarge | ml.r7i.12xlarge | ml.r7i.16xlarge | ml.r7i.24xlarge | ml.r7i.48xlarge | ml.p6-b300.48xlarge

Update requires: Replacement

KubernetesConfig

The Kubernetes configuration for the instance group, including labels and taints.

Required: No

Type: ClusterKubernetesConfig

Update requires: No interruption

LifeCycleConfig

The lifecycle configuration for a SageMaker HyperPod cluster.

Required: Yes

Type: ClusterLifeCycleConfig

Update requires: No interruption

MinInstanceCount

The minimum number of instances to maintain in the instance group.

Required: No

Type: Integer

Minimum: 0

Update requires: No interruption

OnStartDeepHealthChecks

The deep health checks to perform when the HyperPod cluster instance group is created or updated, specified as a list of check names. Deep health checks are comprehensive, invasive tests that validate the health of the underlying hardware and infrastructure components.

Required: No

Type: Array of String

Update requires: No interruption

OverrideVpcConfig

The customized Amazon VPC configuration at the instance group level that overrides the default Amazon VPC configuration of the SageMaker HyperPod cluster.

Required: No

Type: VpcConfig

Update requires: Replacement

ScheduledUpdateConfig

Configuration for scheduled updates to the instance group.

Required: No

Type: ScheduledUpdateConfig

Update requires: No interruption

SlurmConfig

The Slurm workload manager configuration for the instance group.

Required: No

Type: ClusterSlurmConfig

Update requires: No interruption

ThreadsPerCore

The number of threads per CPU core for the instances in the instance group.

Required: No

Type: Integer

Minimum: 1

Maximum: 2

Update requires: Replacement

TrainingPlanArn

The Amazon Resource Name (ARN) of the training plan associated with the instance group.

Required: No

Type: String

Pattern: ^arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:training-plan/.*$

Minimum length: 50

Maximum length: 2048

Update requires: No interruption
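Because the pattern allows an empty Region segment and an empty plan name, the length bounds do part of the validation work for this property. The following Python sketch checks both. The function name and the sample ARNs are hypothetical, and Python's `re` module is used as a stand-in for the service's own pattern matching.

```python
import re

# Pattern copied from the TrainingPlanArn property above.
TRAINING_PLAN_ARN = re.compile(
    r"^arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:training-plan/.*$"
)


def is_valid_training_plan_arn(arn: str) -> bool:
    """Check the documented length bounds (50 to 2048) and the ARN pattern."""
    return 50 <= len(arn) <= 2048 and TRAINING_PLAN_ARN.fullmatch(arn) is not None


if __name__ == "__main__":
    samples = (
        "arn:aws:sagemaker:us-west-2:123456789012:training-plan/my-plan",
        # Matches the pattern but is shorter than 50 characters.
        "arn:aws:sagemaker::123456789012:training-plan/",
    )
    for arn in samples:
        print(arn, is_valid_training_plan_arn(arn))
```

The second sample satisfies the pattern on its own, so only the minimum length rejects it.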