Customize compute node network interfaces with launch template overrides - AWS ParallelCluster

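Starting with AWS ParallelCluster 3.15.0, the LaunchTemplateOverrides parameter lets you customize the network interfaces of compute nodes by overriding the default network interface configuration with the configuration from a referenced launch template. The network interfaces section of the launch template used for the override replaces the entire network interfaces section of the compute node.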

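This tutorial walks through an example of overriding the default network configuration of p6-b300.48xlarge compute nodes. This customization is useful when you need a specific network interface configuration that differs from the default AWS ParallelCluster setup. In this example, we configure use case 2 for P6-B300 instances, as described in the Amazon EC2 documentation on EFA-supported instance types.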

Note

We recommend that you use the AWS CLI rather than the console to create the launch template, for maximum flexibility.

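Note

The launch template should contain only the network interface overrides. AWS ParallelCluster has validation that prevents overriding other parameters.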

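Warning

If you use the override to configure network interfaces in a way that the instance type does not support, the instances will fail to launch.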

Prerequisites

Step 1: Create security groups

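When you create the launch template to use in the override, you must reference a security group. The default AWS ParallelCluster security group for compute resources does not exist until the cluster is created, so you must create a custom security group. The head node security group must then reference this security group to allow traffic between the head node and the compute nodes.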

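If you are updating an existing cluster to customize new capacity, you can use the default AWS ParallelCluster compute node security group in the launch template instead of creating a custom security group.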

Create the following two security groups:

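  • Head node additional security group (sg-1234abcd):

    • Inbound: all traffic from the compute security group

  • Compute security group (sg-abcd1234):

    • Inbound: all traffic from the head node security group

    • Inbound: all traffic from itself (compute-to-compute)

    • Outbound: default allow all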
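The two groups and their mutual references can be sketched with the AWS CLI as follows. This is a minimal sketch, not the procedure from the original tutorial: the function names, the group names `pcluster-head-extra` and `pcluster-compute`, and the VPC ID argument are hypothetical placeholders. It relies on newly created VPC security groups allowing all outbound traffic by default, so no egress rule is added.

```shell
#!/bin/sh
# Hypothetical helper: create a security group and print its ID.
# $1 = group name, $2 = description, $3 = VPC ID
create_sg() {
  aws ec2 create-security-group \
    --group-name "$1" \
    --description "$2" \
    --vpc-id "$3" \
    --query 'GroupId' \
    --output text
}

# Hypothetical helper: allow all inbound traffic to group $1
# from members of group $2 (IpProtocol=-1 means all protocols).
allow_all_from_sg() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --ip-permissions "IpProtocol=-1,UserIdGroupPairs=[{GroupId=$2}]"
}

# Create both groups in VPC $1 and wire up the three inbound rules.
setup_cluster_sgs() {
  vpc_id="$1"
  head_sg=$(create_sg pcluster-head-extra "Head node additional SG" "$vpc_id")
  compute_sg=$(create_sg pcluster-compute "Compute node SG" "$vpc_id")
  allow_all_from_sg "$head_sg" "$compute_sg"     # head node <- compute
  allow_all_from_sg "$compute_sg" "$head_sg"     # compute <- head node
  allow_all_from_sg "$compute_sg" "$compute_sg"  # compute <-> compute
  echo "head=$head_sg compute=$compute_sg"
}
```

Calling `setup_cluster_sgs` with your VPC ID prints the two group IDs, which take the place of sg-1234abcd and sg-abcd1234 in the steps that follow.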

Step 2: Create the launch template

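Create a launch template that defines the network interface configuration for p6-b300.48xlarge compute nodes. For the primary network interface (network card index 0, device index 0), use an ENA (default) network interface. For the remaining network cards, create EFA-only interfaces (network card indexes 1-16, device index 0) and ENA (default) interfaces (network card indexes 1-16, device index 1).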

Run the following AWS CLI command to create the launch template (lt-123456789):

    aws ec2 create-launch-template \
      --region us-east-1 \
      --launch-template-name override-lt \
      --launch-template-data '{
        "NetworkInterfaces": [
          {"NetworkCardIndex":0, "DeviceIndex":0, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":1, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":1, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":2, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":2, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":3, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":3, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":4, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":4, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":5, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":5, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":6, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":6, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":7, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":7, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":8, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":8, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":9, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":9, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":10, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":10, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":11, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":11, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":12, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":12, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":13, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":13, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":14, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":14, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":15, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":15, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":16, "DeviceIndex":0, "InterfaceType":"efa-only", "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"},
          {"NetworkCardIndex":16, "DeviceIndex":1, "Groups":["sg-abcd1234"], "SubnetId":"subnet-123456789"}
        ]
      }'
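The 33 interface entries in the command follow a regular pattern: one primary ENA interface on card 0, then one EFA-only and one ENA interface on each of cards 1-16. As a hedged alternative to typing them by hand, the launch template data could be generated with a small POSIX shell function. The function name `build_lt_data` and its arguments are hypothetical and are not part of the AWS CLI or AWS ParallelCluster.

```shell
#!/bin/sh
# Sketch: print launch template data with one primary ENA interface and,
# for each remaining network card, one EFA-only and one ENA interface.
# $1 = security group ID, $2 = subnet ID, $3 = highest card index (default 16)
build_lt_data() (
  sg="$1"
  subnet="$2"
  cards="${3:-16}"
  common="\"Groups\":[\"$sg\"],\"SubnetId\":\"$subnet\""

  printf '{"NetworkInterfaces":['
  # Primary interface: card 0, device 0, ENA (no InterfaceType given).
  printf '{"NetworkCardIndex":0,"DeviceIndex":0,%s}' "$common"
  card=1
  while [ "$card" -le "$cards" ]; do
    # Device 0 on each remaining card is EFA-only; device 1 is ENA.
    printf ',{"NetworkCardIndex":%d,"DeviceIndex":0,"InterfaceType":"efa-only",%s}' "$card" "$common"
    printf ',{"NetworkCardIndex":%d,"DeviceIndex":1,%s}' "$card" "$common"
    card=$((card + 1))
  done
  printf ']}\n'
)

# Example use (makes a real AWS call, so it is left commented out):
# aws ec2 create-launch-template \
#   --region us-east-1 \
#   --launch-template-name override-lt \
#   --launch-template-data "$(build_lt_data sg-abcd1234 subnet-123456789)"
```

Generating the list keeps the card and device indexes consistent, which matters because an interface layout that the instance type does not support prevents the instance from launching.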

Step 3: Create a cluster with the launch template override

Create a cluster configuration that uses the LaunchTemplateOverrides parameter to reference the launch template that you created.

    Region: us-east-1
    HeadNode:
      InstanceType: c5.xlarge
      Networking:
        SubnetId: subnet-abcdefghi
        AdditionalSecurityGroups:
          # Add the head node SG that allows traffic from the compute node SG
          - sg-1234abcd
      ...
    Scheduling:
      Scheduler: slurm
      SlurmQueues:
        - Name: queue0
          Networking:
            SubnetIds:
              - subnet-123456789
          ComputeResources:
            - Name: compute-resource1
              InstanceType: p6-b300.48xlarge
              Efa:
                Enabled: false # The override replaces all network interface configuration, so this setting is ignored
              LaunchTemplateOverrides:
                LaunchTemplateId: lt-123456789
                Version: 1 # If the launch template is updated, then the new version should be specified here.
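Before creating the cluster, it can help to confirm that the launch template version named in the configuration holds the expected number of network interfaces. The following is a minimal sketch under stated assumptions: the function names, the cluster name, and the configuration file name are hypothetical, and the expected count of 33 comes from the launch template in Step 2.

```shell
#!/bin/sh
# Hypothetical helper: count the network interfaces stored in a
# launch template version.
# $1 = launch template ID, $2 = version number
count_lt_interfaces() {
  aws ec2 describe-launch-template-versions \
    --launch-template-id "$1" \
    --versions "$2" \
    --query 'length(LaunchTemplateVersions[0].LaunchTemplateData.NetworkInterfaces)' \
    --output text
}

# Hypothetical helper: create the cluster only if the launch template
# version contains the expected number of interfaces.
# $1 = cluster name, $2 = config file, $3 = launch template ID,
# $4 = version, $5 = expected interface count
create_cluster_checked() {
  found=$(count_lt_interfaces "$3" "$4") || return 1
  if [ "$found" -ne "$5" ]; then
    echo "expected $5 network interfaces, found $found" >&2
    return 1
  fi
  pcluster create-cluster \
    --cluster-name "$1" \
    --cluster-configuration "$2"
}

# Example use (makes real AWS calls, so it is left commented out):
# create_cluster_checked my-cluster cluster-config.yaml lt-123456789 1 33
```

The version passed to the check should match the Version value under LaunchTemplateOverrides, because the cluster uses that specific version rather than the latest one.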