Setting up multiple controller nodes for a SageMaker HyperPod Slurm cluster - Amazon SageMaker AI

This topic explains how to configure multiple controller (head) nodes in a SageMaker HyperPod Slurm cluster using lifecycle scripts. Before you start, review the prerequisites listed in Prerequisites for using SageMaker HyperPod and familiarize yourself with the lifecycle scripts in Customizing SageMaker HyperPod clusters using lifecycle scripts. The instructions in this topic use AWS CLI commands in an Amazon Linux environment. Note that the environment variables used in these commands are available only in the current session unless you explicitly preserve them.
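
One way to preserve environment variables across sessions is to save them to a file and reload that file in each new shell. The following is a minimal sketch, not a required step of this setup. The variable name EXAMPLE_CLUSTER_NAME, its value, and the file path ~/.hyperpod_env are hypothetical placeholders chosen for illustration.

```shell
# Hypothetical name and value; substitute the variables your commands actually use.
export EXAMPLE_CLUSTER_NAME="my-hyperpod-cluster"

# File that holds the saved variables (hypothetical path; override ENV_FILE to change it).
ENV_FILE="${ENV_FILE:-$HOME/.hyperpod_env}"

# Append an export line so the variable outlives the current session.
echo "export EXAMPLE_CLUSTER_NAME=\"${EXAMPLE_CLUSTER_NAME}\"" >> "$ENV_FILE"

# In a new session, reload the saved variables before running further commands.
. "$ENV_FILE"
```

Keeping the variables in a separate file, rather than appending them to a shell startup file, makes them easy to inspect and to remove once the cluster setup is complete.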