-

Terjemahan disediakan oleh mesin penerjemah. Jika konten terjemahan yang diberikan bertentangan dengan versi bahasa Inggris aslinya, utamakan versi bahasa Inggris.

catatan

apiVersion: batch/v1 kind: Job metadata: name: test-tas-job namespace: hyperpod-ns-team-name labels: kueue.x-k8s.io/queue-name: hyperpod-ns-team-name-localqueue kueue.x-k8s.io/priority-class: PRIORITY_CLASS-priority spec: parallelism: 10 completions: 10 suspend: true template: metadata: labels: kueue.x-k8s.io/queue-name: hyperpod-ns-team-name-localqueue annotations: kueue.x-k8s.io/podset-required-topology: "topology.k8s.aws/network-node-layer-3" or kueue.x-k8s.io/podset-preferred-topology: "topology.k8s.aws/network-node-layer-3" spec: nodeSelector: topology.k8s.aws/network-node-layer-3: TOPOLOGY_LABEL_VALUE containers: - name: dummy-job image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0 args: ["3600s"] resources: requests: cpu: "100" restartPolicy: Never

Parameter Deskripsi

hyp create hyp-pytorch-job \ --version 1.1 \ --job-name sample-pytorch-job \ --image 123456789012.dkr.ecr.us-west-2.amazonaws.com/ptjob:latest \ --pull-policy "Always" \ --tasks-per-node 1 \ --max-retry 1 \ --priority high-priority \ --namespace hyperpod-ns-team-name \ --queue-name hyperpod-ns-team-name-localqueue \ --preferred-topology-label topology.k8s.aws/network-node-layer-1

apiVersion: kubeflow.org/v1 kind: PyTorchJob metadata: annotations: {} labels: kueue.x-k8s.io/queue-name: hyperpod-ns-team-name-localqueue name: tas-test-pytorch-job namespace: hyperpod-ns-team-name spec: pytorchReplicaSpecs: Master: replicas: 1 restartPolicy: OnFailure template: metadata: labels: kueue.x-k8s.io/queue-name: hyperpod-ns-team-name-localqueue spec: containers: - command: - python3 - /opt/pytorch-mnist/mnist.py - --epochs=1 image: docker.io/kubeflowkatib/pytorch-mnist:v1beta1-45c5727 imagePullPolicy: Always name: pytorch Worker: replicas: 10 restartPolicy: OnFailure template: metadata: # annotations: # kueue.x-k8s.io/podset-required-topology: "topology.k8s.aws/network-node-layer-3" labels: kueue.x-k8s.io/queue-name: hyperpod-ns-team-name-localqueue spec: containers: - command: - python3 - /opt/pytorch-mnist/mnist.py - --epochs=1 image: docker.io/kubeflowkatib/pytorch-mnist:v1beta1-45c5727 imagePullPolicy: Always name: pytorch resources: limits: cpu: 1 requests: memory: 200Mi cpu: 1 #nodeSelector: # topology.k8s.aws/network-node-layer-3: xxxxxxxxxxx

Konsol Manajemen AWS