Customize compute node network interfaces with launch template overrides
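Starting with AWS ParallelCluster 3.15.0, the LaunchTemplateOverrides parameter lets you customize the network interfaces of compute nodes by overriding the default network interface configuration with the configuration from a referenced launch template. The entire network interfaces section of the compute node is replaced by the network interfaces section of the launch template used for the override.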
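This tutorial walks through an example that overrides the default network configuration of p6-b300.48xlarge compute nodes. This customization is useful when you need a specific network interface configuration that differs from the AWS ParallelCluster default. The example follows use case 2 for configuring P6-B300 instances, as outlined in the Amazon EC2 documentation on EFA supported instance types.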
Note
For maximum flexibility, we recommend creating the launch template with the AWS CLI rather than the console.
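Note
The launch template should contain only the network interface overrides. AWS ParallelCluster has validation that prevents other parameters from being overridden.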
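Warning
If you use the override feature to configure network interfaces in a way that the instance type in use does not support, the instances will fail to launch.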
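Because a misconfigured override only surfaces when instances fail to launch, a quick local check before creating the launch template may save a round trip. The following is a minimal sketch, not the validation that AWS ParallelCluster itself performs. The function name `check_override_data` and the `network_card_count` parameter are hypothetical; the value 17 is an assumption taken from the network card indexes 0-16 used later in this tutorial.

```python
def check_override_data(launch_template_data, network_card_count):
    """Return a list of problems found in launch template data (hypothetical local pre-check)."""
    problems = []

    # The launch template should contain only the network interface overrides.
    extra_keys = sorted(set(launch_template_data) - {"NetworkInterfaces"})
    if extra_keys:
        problems.append(f"unexpected top-level keys: {extra_keys}")

    seen = set()
    for nic in launch_template_data.get("NetworkInterfaces", []):
        slot = (nic.get("NetworkCardIndex", 0), nic.get("DeviceIndex"))
        # Each (network card, device) pair should be defined at most once.
        if slot in seen:
            problems.append(f"duplicate interface slot: {slot}")
        seen.add(slot)
        # The card index must exist on the instance type (caller-supplied assumption).
        if not 0 <= slot[0] < network_card_count:
            problems.append(f"network card index out of range: {slot[0]}")
    return problems


# Example: a template that also sets an image ID is flagged.
example = {
    "NetworkInterfaces": [{"NetworkCardIndex": 0, "DeviceIndex": 0}],
    "ImageId": "ami-12345678",  # placeholder value for illustration
}
problems = check_override_data(example, network_card_count=17)
```

An empty list means the sketch found nothing to flag; it does not guarantee that the instance type supports the configuration.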
Step 1: Create security groups
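When you create the launch template used for the override, you must reference a security group. The default AWS ParallelCluster security group for compute resources does not exist until the cluster is created, so you must create a custom security group. The head node security group must then reference this security group to allow traffic between the head node and the compute nodes.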
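If you are updating an existing cluster to customize new capacity, you can use the default AWS ParallelCluster compute node security group in the launch template instead of creating a custom compute node security group.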
Create the following two security groups:
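- Head node additional security group (sg-1234abcd):
  - Ingress: all traffic from the compute security group
- Compute security group (sg-abcd1234):
  - Ingress: all traffic from the head node security group
  - Ingress: all traffic from itself (compute-to-compute)
  - Egress: allow all (default)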
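The ingress rules listed above can be written down as data before they are applied. The sketch below builds them in the `IpPermissions` shape that the EC2 API uses for security group rules, where protocol `-1` means all traffic. The helper names `all_traffic_from` and `plan_ingress_rules` are hypothetical, and the group IDs are the placeholder values from this tutorial.

```python
def all_traffic_from(source_group_id):
    """Ingress permission allowing all protocols and ports from a source security group."""
    return {"IpProtocol": "-1", "UserIdGroupPairs": [{"GroupId": source_group_id}]}


def plan_ingress_rules(head_sg_id, compute_sg_id):
    """Map each security group ID to the ingress permissions listed in this step."""
    return {
        # Head node additional group: accept everything from the compute group.
        head_sg_id: [all_traffic_from(compute_sg_id)],
        # Compute group: accept everything from the head node group and from itself.
        compute_sg_id: [
            all_traffic_from(head_sg_id),
            all_traffic_from(compute_sg_id),
        ],
    }


ingress_plan = plan_ingress_rules("sg-1234abcd", "sg-abcd1234")
```

Assuming both groups already exist and credentials are configured, each entry could be applied with a boto3 EC2 client call such as `authorize_security_group_ingress(GroupId=group_id, IpPermissions=permissions)`. No egress rule is planned because a newly created security group allows all outbound traffic by default.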
Step 2: Create the launch template
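Create a launch template that defines the network interface configuration for p6-b300.48xlarge compute nodes. For the primary network interface (network card index 0, device index 0), use an ENA (default) network interface. For each of the remaining network cards, create an EFA-only interface (network card index 1-16, device index 0) and an ENA (default) interface (network card index 1-16, device index 1).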
Run the following AWS CLI command to create the launch template (lt-123456789):
aws ec2 create-launch-template \
  --region us-east-1 \
  --launch-template-name override-lt \
  --launch-template-data '{
    "NetworkInterfaces": [
      {"NetworkCardIndex":0, "DeviceIndex":0, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":1, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":1, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":2, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":2, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":3, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":3, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":4, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":4, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":5, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":5, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":6, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":6, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":7, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":7, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":8, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":8, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":9, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":9, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":10, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":10, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":11, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":11, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":12, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":12, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":13, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":13, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":14, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":14, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":15, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":15, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":16, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
      {"NetworkCardIndex":16, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"}
    ]
  }'
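Writing 33 nearly identical interface entries by hand is error-prone. As a minimal sketch, the following Python script generates the same launch template data as the command above. The function name `build_network_interfaces` and its parameters are hypothetical, and the security group and subnet IDs are the placeholder values used in this tutorial.

```python
import json


def build_network_interfaces(security_group_id, subnet_id, efa_cards=range(1, 17)):
    """Build the NetworkInterfaces list described in this step (hypothetical helper)."""

    def interface(card, device, interface_type=None):
        entry = {"NetworkCardIndex": card, "DeviceIndex": device}
        if interface_type is not None:
            entry["InterfaceType"] = interface_type
        entry["Groups"] = [security_group_id]
        entry["SubnetId"] = subnet_id
        return entry

    # Primary interface: network card 0, device 0, ENA (default interface type).
    interfaces = [interface(0, 0)]
    for card in efa_cards:
        # Device 0 on each remaining card is an EFA-only interface.
        interfaces.append(interface(card, 0, "efa-only"))
        # Device 1 on the same card is an ENA (default) interface.
        interfaces.append(interface(card, 1))
    return interfaces


launch_template_data = {
    "NetworkInterfaces": build_network_interfaces("sg-abcd1234", "subnet-123456789")
}
launch_template_json = json.dumps(launch_template_data, indent=2)

if __name__ == "__main__":
    print(launch_template_json)
```

The printed JSON could be saved to a file, for example launch-template-data.json, and passed to the AWS CLI as `--launch-template-data file://launch-template-data.json` instead of the inline string.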
Step 3: Create the cluster with the launch template override
Create a cluster configuration that uses the LaunchTemplateOverrides parameter to reference the launch template you created.
Region: us-east-1
HeadNode:
  InstanceType: c5.xlarge
  Networking:
    SubnetId: subnet-abcdefghi
    AdditionalSecurityGroups:
      # Add the head node SG that allows traffic from the compute node SG
      - sg-1234abcd
  ...
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: queue0
      Networking:
        SubnetIds:
          - subnet-123456789
      ComputeResources:
        - Name: compute-resource1
          InstanceType: p6-b300.48xlarge
          Efa:
            Enabled: false # The override replaces all network interface configuration, so this setting is ignored
          LaunchTemplateOverrides:
            LaunchTemplateId: lt-123456789
            Version: 1 # If the launch template is updated, then the new version should be specified here.