本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。
AWS PCS 中的任務完成日誌
任務完成日誌提供您平行 AWS 運算服務 (AWS PCS) 任務完成時的金鑰詳細資訊,無需額外費用。您可以使用其他 AWS 服務來存取和處理您的日誌資料,例如 Amazon CloudWatch Logs、Amazon Simple Storage Service (Amazon S3) 和 Amazon Data Firehose; AWS PCS 會記錄任務的中繼資料,例如下列項目。
-
任務 ID 和名稱
-
使用者和群組資訊
-
任務狀態 (例如
COMPLETED
、FAILED
、CANCELLED
) -
使用的分割區
-
時間限制
-
開始、結束、提交和合格時間
-
節點清單和計數
-
處理器計數
-
工作目錄
-
資源用量 (CPU、記憶體)
-
結束代碼
-
節點詳細資訊 (名稱、執行個體 IDs、執行個體類型)
先決條件
管理 AWS PCS 叢集的 IAM 主體必須允許 pcs:AllowVendedLogDeliveryForResource
動作。
下列範例 IAM 政策會授予所需的許可。
{ "Version": "2012-10-17", "Statement": [ { "Sid": "PcsAllowVendedLogsDelivery", "Effect": "Allow", "Action": ["pcs:AllowVendedLogDeliveryForResource"], "Resource": [ "arn:aws:pcs:::cluster/*" ] } ] }
設定任務完成日誌
您可以使用 AWS Management Console 或 為您的 AWS PCS 叢集設定任務完成日誌 AWS CLI。
如何尋找任務完成日誌
您可以在 CloudWatch Logs 和 Amazon S3. AWS PCS 中設定日誌目的地,並使用下列結構化路徑名稱和檔案名稱。
CloudWatch Logs
AWS PCS 使用 CloudWatch Logs 串流的下列名稱格式:
AWSLogs/PCS/
cluster-id
/jobcomp.log
例如:AWSLogs/PCS/pcs_abc123de45/jobcomp.log
Amazon S3
AWS PCS 會針對 S3 路徑使用下列名稱格式:
AWSLogs/
account-id
/PCS/region
/cluster-id
/jobcomp/year
/month
/day
/hour
/
例如:AWSLogs/111122223333/PCS/us-east-1/pcs_abc123de45/jobcomp/2025/06/19/11/
AWS PCS 對日誌檔案使用以下名稱格式:
PCS_jobcomp_
year
-month
-day
-hour
_cluster-id
_random-id
.log.gz
例如:PCS_jobcomp_2025-06-19-11_pcs_abc123de45_04be080b.log.gz
任務完成日誌欄位
AWS PCS 會將任務完成日誌資料寫入為 JSON 物件。JSON 容器jobcomp
會保留任務詳細資訊。下表說明jobcomp
容器內的欄位。某些欄位僅在特定情況下存在,例如陣列任務或異質任務。
名稱 | 範例值 | 必要 | 備註 |
---|---|---|---|
job_id |
11 |
是 | 永遠具有 值 |
user |
"root" |
是 | 永遠具有 值 |
user_id |
0 |
是 | 永遠具有 值 |
group |
"root" |
是 | 永遠具有 值 |
group_id |
0 |
是 | 永遠具有 值 |
name |
"wrap" |
是 | 永遠具有 值 |
job_state |
"COMPLETED" |
是 | 永遠具有 值 |
partition |
"Hydra-MpiQueue-abcdef01-7" |
是 | 永遠具有 值 |
time_limit |
"UNLIMITED" |
是 | 永遠存在,但可能是 "UNLIMITED" |
start_time |
"2025-06-19T10:58:57" |
是 | 永遠存在,但可能是 "Unknown" |
end_time |
"2025-06-19T10:58:57" |
是 | 永遠存在,但可能是 "Unknown" |
node_list |
"Hydra-MpiNG-abcdef01-2345-1" |
是 | 永遠具有 值 |
node_cnt |
1 |
是 | 永遠具有 值 |
proc_cnt |
1 |
是 | 永遠具有 值 |
work_dir |
"/root" |
是 | 永遠存在,但可能是 "Unknown" |
reservation_name |
"weekly_maintenance" |
是 | 永遠存在,但可能是空字串 "" |
tres.cpu |
1 |
是 | 永遠具有 值 |
tres.mem.val |
600 |
是 | 永遠具有 值 |
tres.mem.unit |
"M" |
是 | 可以是 "M" 或 "bb" |
tres.node |
1 |
是 | 永遠具有 值 |
tres.billing |
1 |
是 | 永遠具有 值 |
account |
"finance" |
是 | 永遠存在,但可能是空字串 "" |
qos |
"normal" |
是 | 永遠存在,但可能是空字串 "" |
wc_key |
"project_1" |
是 | 永遠存在,但可能是空字串 "" |
cluster |
"unknown" |
是 | 永遠存在,但可能是 "unknown" |
submit_time |
"2025-06-19T10:55:46" |
是 | 永遠存在,但可能是 "Unknown" |
eligible_time |
"2025-06-19T10:55:46" |
是 | 永遠存在,但可能是 "Unknown" |
array_job_id |
12 |
編號 | 只有在任務是陣列任務時才會出現 |
array_task_id |
1 |
編號 | 只有在任務是陣列任務時才會出現 |
het_job_id |
10 |
編號 | 只有在任務是異質任務時才會出現 |
het_job_offset |
0 |
編號 | 只有在任務是異質任務時才會出現 |
derived_exit_code_status |
0 |
是 | 永遠具有 值 |
derived_exit_code_signal |
0 |
是 | 永遠具有 值 |
exit_code_status |
0 |
是 | 永遠具有 值 |
exit_code_signal |
0 |
是 | 永遠具有 值 |
node_details[0].name |
"Hydra-MpiNG-abcdef01-2345-1" |
編號 | 永遠存在,但node_details 可能是 "[]" |
node_details[0].instance_id |
"i-0abcdef01234567a" |
編號 | 永遠存在,但node_details 可能是 "[]" |
node_details[0].instance_type |
"t4g.micro" |
編號 | 永遠存在,但node_details 可能是 "[]" |
任務完成日誌範例
下列範例顯示各種任務類型和狀態的任務完成日誌:
{ "jobcomp": { "job_id": 1, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "COMPLETED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T16:32:57", "end_time": "2025-06-19T16:33:03", "node_list": "Hydra-MpiNG-abcdef01-2345-[1-2]", "node_cnt": 2, "proc_cnt": 2, "work_dir": "/usr/bin", "reservation_name": "", "tres": { "cpu": 2, "mem": { "val": 1944, "unit": "M" }, "node": 2, "billing": 2 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T16:29:40", "eligible_time": "2025-06-19T16:29:41", "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 0, "node_details": [ { "name": "Hydra-MpiNG-abcdef01-2345-1", "instance_id": "i-0abc123def45678", "instance_type": "t4g.micro" }, { "name": "Hydra-MpiNG-abcdef01-2345-2", "instance_id": "i-0def456abc78901", "instance_type": "t4g.micro" } ] } } { "jobcomp": { "job_id": 2, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "COMPLETED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T16:33:13", "end_time": "2025-06-19T16:33:14", "node_list": "Hydra-MpiNG-abcdef01-2345-[1-2]", "node_cnt": 2, "proc_cnt": 2, "work_dir": "/usr/bin", "reservation_name": "", "tres": { "cpu": 2, "mem": { "val": 1944, "unit": "M" }, "node": 2, "billing": 2 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T16:33:13", "eligible_time": "2025-06-19T16:33:13", "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 0, "node_details": [ { "name": "Hydra-MpiNG-abcdef01-2345-1", "instance_id": "i-0abc123def45678", "instance_type": "t4g.micro" }, { "name": "Hydra-MpiNG-abcdef01-2345-2", "instance_id": "i-0def456abc78901", "instance_type": "t4g.micro" } ] } } { "jobcomp": { "job_id": 3, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "COMPLETED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T22:58:57", "end_time": "2025-06-19T22:58:57", "node_list": "Hydra-MpiNG-abcdef01-2345-1", "node_cnt": 1, "proc_cnt": 1, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 1, "mem": { "val": 972, "unit": "M" }, "node": 1, "billing": 1 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T22:55:46", "eligible_time": "2025-06-19T22:55:46", "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 0, "node_details": [ { "name": "Hydra-MpiNG-abcdef01-2345-1", "instance_id": "i-0abc234def56789", "instance_type": "t4g.micro" } ] } } { "jobcomp": { "job_id": 4, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "COMPLETED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "525600", "start_time": "2025-06-19T23:04:27", "end_time": "2025-06-19T23:04:27", "node_list": "Hydra-MpiNG-abcdef01-2345-[1-2]", "node_cnt": 2, "proc_cnt": 2, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 2, "mem": { "val": 1944, "unit": "M" }, "node": 2, "billing": 2 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T23:01:38", "eligible_time": "2025-06-19T23:01:38", "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 0, "node_details": [ { "name": "Hydra-MpiNG-abcdef01-2345-1", "instance_id": "i-0abc234def56789", "instance_type": "t4g.micro" }, { "name": "Hydra-MpiNG-abcdef01-2345-2", "instance_id": "i-0def345abc67890", "instance_type": "t4g.micro" } ] } } { "jobcomp": { "job_id": 5, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "FAILED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T23:09:00", "end_time": "2025-06-19T23:09:00", "node_list": "(null)", "node_cnt": 0, "proc_cnt": 0, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 1, "mem": { "val": 1, "unit": "G" }, "node": 1, "billing": 1 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T23:09:00", "eligible_time": "2025-06-19T23:09:00", "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 1, "node_details": [] } } { "jobcomp": { "job_id": 6, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "CANCELLED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T23:09:36", "end_time": "2025-06-19T23:09:36", "node_list": "(null)", "node_cnt": 0, "proc_cnt": 0, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 1, "mem": { "val": 400, "unit": "M" }, "node": 1, "billing": 1 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T23:09:35", "eligible_time": "2025-06-19T23:09:36", "het_job_id": 6, "het_job_offset": 0, "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 1, "node_details": [] } } { "jobcomp": { "job_id": 7, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "CANCELLED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T23:10:03", "end_time": "2025-06-19T23:10:03", "node_list": "(null)", "node_cnt": 0, "proc_cnt": 0, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 1, "mem": { "val": 400, "unit": "M" }, "node": 1, "billing": 1 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T23:10:03", "eligible_time": "2025-06-19T23:10:03", "het_job_id": 7, "het_job_offset": 0, "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 1, "node_details": [] } } { "jobcomp": { "job_id": 8, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "COMPLETED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T23:11:24", "end_time": "2025-06-19T23:11:24", "node_list": "Hydra-MpiNG-abcdef01-2345-1", "node_cnt": 1, "proc_cnt": 1, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 1, "mem": { "val": 400, "unit": "M" }, "node": 1, "billing": 1 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T23:11:23", "eligible_time": "2025-06-19T23:11:23", "het_job_id": 8, "het_job_offset": 0, "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 0, "node_details": [ { "name": "Hydra-MpiNG-abcdef01-2345-1", "instance_id": "i-0abc234def56789", "instance_type": "t4g.micro" } ] } } { "jobcomp": { "job_id": 9, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "COMPLETED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T23:11:24", "end_time": "2025-06-19T23:11:24", "node_list": "Hydra-MpiNG-abcdef01-2345-2", "node_cnt": 1, "proc_cnt": 1, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 1, "mem": { "val": 400, "unit": "M" }, "node": 1, "billing": 1 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T23:11:23", "eligible_time": "2025-06-19T23:11:23", "het_job_id": 8, "het_job_offset": 1, "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 0, "node_details": [ { "name": "Hydra-MpiNG-abcdef01-2345-2", "instance_id": "i-0def345abc67890", "instance_type": "t4g.micro" } ] } } { "jobcomp": { "job_id": 10, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "COMPLETED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T23:12:24", "end_time": "2025-06-19T23:12:24", "node_list":"Hydra-MpiNG-abcdef01-2345-1", "node_cnt": 1, "proc_cnt": 1, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 1, "mem": { "val": 400, "unit": "M" }, "node": 1, "billing": 1 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T23:12:14", "eligible_time": "2025-06-19T23:12:14", "het_job_id": 10, "het_job_offset": 0, "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 0, "node_details": [ { "name": "Hydra-MpiNG-abcdef01-2345-1", "instance_id": "i-0abc234def56789", "instance_type": "t4g.micro" } ] } } { "jobcomp": { "job_id": 11, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "COMPLETED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T23:12:24", "end_time": "2025-06-19T23:12:24", "node_list":"Hydra-MpiNG-abcdef01-2345-2", "node_cnt": 1, "proc_cnt": 1, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 1, "mem": { "val": 600, "unit": "M" }, "node": 1, "billing": 1 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T23:12:14", "eligible_time": "2025-06-19T23:12:14", "het_job_id": 10, "het_job_offset": 1, "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 0, "node_details": [ { "name": "Hydra-MpiNG-abcdef01-2345-2", "instance_id": "i-0def345abc67890", "instance_type": "t4g.micro" } ] } } { "jobcomp": { "job_id": 13, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "COMPLETED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T23:47:57", "end_time": "2025-06-19T23:47:58", "node_list":"Hydra-MpiNG-abcdef01-2345-1", "node_cnt": 1, "proc_cnt": 1, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 1, "mem": { "val": 972, "unit": "M" }, "node": 1, "billing": 1 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T23:43:56", "eligible_time": "2025-06-19T23:43:56" , "array_job_id": 12, "array_task_id": 1, "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 0, "node_details": [ { "name": "Hydra-MpiNG-abcdef01-2345-1", "instance_id": "i-0abc345def67890", "instance_type": "t4g.micro" } ] } } { "jobcomp": { "job_id": 12, "user": "root", "user_id": 0, "group": "root", "group_id": 0, "name": "wrap", "job_state": "COMPLETED", "partition": "Hydra-MpiQueue-abcdef01-7", "time_limit": "UNLIMITED", "start_time": "2025-06-19T23:47:58", "end_time": "2025-06-19T23:47:58", "node_list":"Hydra-MpiNG-abcdef01-2345-1", "node_cnt": 1, "proc_cnt": 1, "work_dir": "/root", "reservation_name": "", "tres": { "cpu": 1, "mem": { "val": 972, "unit": "M" }, "node": 1, "billing": 1 }, "account": "", "qos": "", "wc_key": "", "cluster": "unknown", "submit_time": "2025-06-19T23:43:56", "eligible_time": "2025-06-19T23:43:56" , "array_job_id": 12, "array_task_id": 2, "derived_exit_code_status": 0, "derived_exit_code_signal": 0, "exit_code_status": 0, "exit_code_signal": 0, "node_details": [ { "name": "Hydra-MpiNG-abcdef01-2345-1", "instance_id": "i-0abc345def67890", "instance_type": "t4g.micro" } ] } }