Creating SageMaker HyperPod clusters using AWS CloudFormation templates
You can create SageMaker HyperPod clusters using the CloudFormation templates for HyperPod. You must install the AWS CLI to proceed.
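Before starting, it can help to confirm that the AWS CLI is actually available in your shell. The following is a minimal sketch, not part of the official procedure; the function name check_aws_cli is hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: confirm the AWS CLI is available before running the steps in this topic.
# check_aws_cli is a hypothetical helper name, not an AWS-provided command.

# Returns 0 and prints the CLI version if `aws` can be found, 1 otherwise.
check_aws_cli() {
    if command -v aws >/dev/null 2>&1; then
        echo "AWS CLI found: $(aws --version 2>&1)"
        return 0
    fi
    echo "AWS CLI not found. Install it before continuing." >&2
    return 1
}

# Example invocation (uncomment to run). The second command also confirms
# that credentials are configured:
# check_aws_cli && aws sts get-caller-identity
```

If the check fails, install and configure the AWS CLI before continuing with either procedure.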
Configure resources in the console and deploy using CloudFormation
You can configure resources using the AWS Management Console and deploy using the CloudFormation templates.
Follow these steps.
-
Follow the instructions in Creating a SageMaker HyperPod cluster with Amazon EKS orchestration to configure the AWS resources you need to create your cluster.
-
At the end of the Create cluster page, choose Download CloudFormation template parameters. This opens the "Using the configuration file to create the cluster using the AWS CLI" window on the right side of the page.
-
In the "Using the configuration file to create the cluster using the AWS CLI" window, choose Download configuration parameters file. The file is downloaded to your machine. You can edit the configuration JSON file as needed, or leave it unchanged if no changes are required.
-
Run the create-stack AWS CLI command to deploy the CloudFormation stack, which provisions the configured resources and creates the HyperPod cluster.
aws cloudformation create-stack \
    --stack-name my-stack \
    --template-url https://aws-sagemaker-hyperpod-cluster-setup.amazonaws.com/templates-slurm/main-stack-slurm-based-template.yaml \
    --parameters file://params.json \
    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM
-
To view the status of resource provisioning, navigate to the CloudFormation console.
After the cluster creation completes, view the new cluster under Clusters in the main pane of the SageMaker HyperPod console. Its status is displayed in the Status column.
-
After the status of the cluster changes to InService, you can start logging in to the cluster nodes. To access the cluster nodes and start running ML workloads, see Jobs on SageMaker HyperPod clusters.
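The provisioning status described in the steps above can also be checked from the AWS CLI rather than the console. The following is a sketch under stated assumptions: the function names stack_status and wait_for_stack are hypothetical, and my-stack is the same placeholder stack name used in the create-stack command.

```shell
#!/usr/bin/env bash
# Sketch: check stack provisioning from the CLI instead of the console.
# stack_status and wait_for_stack are hypothetical helper names.

# Print the current status of a CloudFormation stack, for example CREATE_IN_PROGRESS.
# $1: stack name
stack_status() {
    aws cloudformation describe-stacks \
        --stack-name "$1" \
        --query 'Stacks[0].StackStatus' \
        --output text
}

# Block until stack creation finishes. Returns non-zero if creation fails
# or the waiter gives up.
# $1: stack name
wait_for_stack() {
    aws cloudformation wait stack-create-complete --stack-name "$1"
}

# Example invocation (uncomment to run):
# wait_for_stack my-stack && stack_status my-stack
```

The waiter polls on your behalf, so it suits scripts; for a one-off check, calling stack_status alone is usually enough.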
Configure and deploy resources using CloudFormation
You can configure and deploy resources using the CloudFormation templates for SageMaker HyperPod.
Follow these steps.
-
Download a CloudFormation template for SageMaker HyperPod from the sagemaker-hyperpod-cluster-setup GitHub repository.
-
Run the create-stack AWS CLI command to deploy the CloudFormation stack, which provisions the configured resources and creates the HyperPod cluster.
aws cloudformation create-stack \
    --stack-name my-stack \
    --template-url URL_of_the_file_that_contains_the_template_body \
    --parameters file://params.json \
    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM
-
To view the status of resource provisioning, navigate to the CloudFormation console.
After the cluster creation completes, view the new cluster under Clusters in the main pane of the SageMaker HyperPod console. Its status is displayed in the Status column.
-
After the status of the cluster changes to InService, you can start logging in to the cluster nodes.
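Waiting for the InService status can be scripted instead of watched in the console. The following is a minimal polling sketch, assuming a cluster named my-cluster (a hypothetical name; use the name defined in your parameters). The function names and the default interval and attempt count are illustrative choices, not values from the HyperPod documentation.

```shell
#!/usr/bin/env bash
# Sketch: poll a HyperPod cluster until it reports InService.
# cluster_status and wait_for_in_service are hypothetical helper names.

# Print the status of a HyperPod cluster, for example Creating or InService.
# $1: cluster name
cluster_status() {
    aws sagemaker describe-cluster \
        --cluster-name "$1" \
        --query 'ClusterStatus' \
        --output text
}

# Poll until the cluster reports InService.
# $1: cluster name
# $2: seconds between polls (default 60, an arbitrary choice)
# $3: maximum number of polls (default 60, an arbitrary choice)
wait_for_in_service() {
    local name=$1 interval=${2:-60} attempts=${3:-60} i=0 status
    while [ "$i" -lt "$attempts" ]; do
        status=$(cluster_status "$name") || return 1
        echo "Cluster $name status: $status"
        case "$status" in
            InService) return 0 ;;
            Failed) return 1 ;;
        esac
        sleep "$interval"
        i=$((i + 1))
    done
    echo "Timed out waiting for cluster $name" >&2
    return 1
}

# Example invocation (uncomment to run):
# wait_for_in_service my-cluster
```

The function returns 0 only when the cluster reaches InService, so it can gate later steps such as logging in to the nodes.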