

# Run a multi-node MPI job with Slurm in AWS PCS
<a name="getting-started_run-mpi-job"></a>

These instructions show how to use Slurm to run a Message Passing Interface (MPI) job in AWS PCS.

Run the following commands at a shell prompt on your login node.
+  Become the default user. Change to its home directory.

  ```
  sudo su - ec2-user
  cd ~/
  ```
+  Create the source code in the C programming language.

  ```
  cat > hello.c << EOF
  // * mpi-hello-world - https://www.mpitutorial.com
  // Released under MIT License
  // 
  // Copyright (c) 2014 MPI Tutorial.
  //
  // Permission is hereby granted, free of charge, to any person obtaining a copy
  // of this software and associated documentation files (the "Software"), to 
  // deal in the Software without restriction, including without limitation the 
  // rights to use, copy, modify, merge, publish, distribute, sublicense, and/or 
  // sell copies of the Software, and to permit persons to whom the Software is 
  // furnished to do so, subject to the following conditions:
  // The above copyright notice and this permission notice shall be included in 
  // all copies or substantial portions of the Software.
  //
  // THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 
  // IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 
  // FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
  // AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 
  // LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
  // FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
  // DEALINGS IN THE SOFTWARE.
  
  #include <mpi.h>
  #include <stdio.h>
  #include <stddef.h>
  
  int main(int argc, char** argv) {
    // Initialize the MPI environment. The two arguments to MPI_Init are not
    // currently used by MPI implementations, but are there in case future
    // implementations might need the arguments.
    MPI_Init(NULL, NULL);
  
    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
  
    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
  
    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);
  
    // Print off a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);
  
    // Finalize the MPI environment. No more MPI calls can be made after this
    MPI_Finalize();
  }
  EOF
  ```
+ Load the OpenMPI module.

  ```
  module load openmpi
  ```
+ Compile the C program.

  ```
  mpicc -o hello hello.c
  ```
+ Write a Slurm job submission script.

  ```
  cat > hello.sh << EOF
  #!/bin/bash
  #SBATCH -J multi
  #SBATCH -o multi.out
  #SBATCH -e multi.err
  #SBATCH --exclusive
  #SBATCH --nodes=4
  #SBATCH --ntasks-per-node=1
  
  srun $HOME/hello
  EOF
  ```
+ Change to the shared directory.

  ```
  cd /shared
  ```
+ Submit the job script.

  ```
  sbatch -p demo ~/hello.sh
  ```
+ Use `squeue` to monitor the job until it is done.
+ Check the contents of `multi.out`:

  ```
  cat multi.out
  ```

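  The output is similar to the following. Note that each rank has its own IP address because it ran on a different node. The ranks can appear in any order.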

  ```
  Hello world from processor ip-10-3-133-204, rank 0 out of 4 processors
  Hello world from processor ip-10-3-128-219, rank 2 out of 4 processors
  Hello world from processor ip-10-3-141-26, rank 3 out of 4 processors
  Hello world from processor ip-10-3-143-52, rank 1 out of 4 processors
  ```
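
The monitoring and output-checking steps above can also be scripted. The following is a minimal sketch, not part of the original procedure: the function names `wait_for_job` and `check_hello_output` are hypothetical helpers introduced here for illustration, and the 5-second polling interval is an arbitrary choice. It assumes that `squeue -h -j` prints nothing once the job has left the queue, and that the output file uses the `Hello world from processor ...` line format shown above.

```shell
#!/bin/bash
# Hypothetical helpers (not part of AWS PCS or Slurm), sketched for illustration.

# wait_for_job JOB_ID
# Poll squeue until the job no longer appears in the queue.
# Assumption: an empty listing means the job has finished.
wait_for_job() {
  local job_id="$1"
  while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
    sleep 5   # arbitrary polling interval
  done
}

# check_hello_output FILE EXPECTED
# Count the distinct ranks and distinct host names reported in FILE.
# Succeeds only if both counts equal EXPECTED.
check_hello_output() {
  local file="$1" expected="$2"
  local ranks hosts
  # Extract the rank number from each line, then count unique values.
  ranks=$(sed -n 's/.*rank \([0-9][0-9]*\) out of.*/\1/p' "$file" | sort -u | wc -l)
  # Extract the host name from each line, then count unique values.
  hosts=$(sed -n 's/^Hello world from processor \([^,]*\),.*/\1/p' "$file" | sort -u | wc -l)
  # Normalize any whitespace padding that wc may add.
  ranks=$((ranks))
  hosts=$((hosts))
  echo "ranks=${ranks} hosts=${hosts}"
  [ "$ranks" -eq "$expected" ] && [ "$hosts" -eq "$expected" ]
}

# Example usage on the login node (left commented out so that sourcing
# this file does not submit a job):
#   job_id=$(sbatch --parsable -p demo ~/hello.sh)
#   wait_for_job "${job_id%%;*}"
#   check_hello_output multi.out 4
```

Checking host names as well as ranks confirms that the four tasks ran on four different nodes, which is what `--nodes=4` with `--ntasks-per-node=1` requests.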