# Deploy models from JumpStart using kubectl

The following steps show how to deploy a JumpStart model to a HyperPod cluster using kubectl. The instructions contain code cells and commands designed to run in a terminal. Before you run them, make sure that your environment is configured with AWS credentials.

## Prerequisites

Before you begin, verify that you have:

+ Set up inference capabilities on your Amazon SageMaker HyperPod cluster. For more information, see [Setting up your HyperPod clusters for model deployment](sagemaker-hyperpod-model-deployment-setup.md).
+ Installed the [kubectl](https://kubernetes.io/docs/reference/kubectl/) utility and configured [jq](https://jqlang.org/) in your terminal.

## Setup and configuration

1. Select your Region.

    ```
    export REGION=
    ```

1. View all SageMaker public hub models and HyperPod clusters.

1. Select a JumpStart model from the SageMaker public hub. The public hub contains many models, so the response is paginated. Pass the returned `NextToken` to successive calls to list every model available in the hub.

    ```
    aws sagemaker list-hub-contents --hub-name SageMakerPublicHub --hub-content-type Model --query '{Models: HubContentSummaries[].{ModelId:HubContentName,Version:HubContentVersion}, NextToken: NextToken}' --output json
    ```

    ```
    export MODEL_ID="deepseek-llm-r1-distill-qwen-1-5b"
    export MODEL_VERSION="2.0.4"
    ```

1. Set the model ID and cluster name that you selected in the following variables.

    **Note**
    Check with your cluster admin that permissions are granted for this role or user. To see which role or user your terminal is using, run `aws sts get-caller-identity --query "Arn"`.

    ```
    aws sagemaker list-clusters --output table

    # Select the cluster name where you want to deploy the model.
    export HYPERPOD_CLUSTER_NAME=""

    # Select an instance type that suits your model deployment and exists in the selected cluster.
    # List the instance types available in your HyperPod cluster.
    aws sagemaker describe-cluster --cluster-name "$HYPERPOD_CLUSTER_NAME" --query "InstanceGroups[].{InstanceType:InstanceType,Count:CurrentCount}" --output table

    # List the instance types supported by the selected model.
    aws sagemaker describe-hub-content --hub-name SageMakerPublicHub --hub-content-type Model --hub-content-name "$MODEL_ID" --output json | jq -r '.HubContentDocument | fromjson | {Default: .DefaultInferenceInstanceType, Supported: .SupportedInferenceInstanceTypes}'

    # Select an instance type from the cluster that is compatible with the model.
    # Make sure that it is either the default or a supported instance type for the JumpStart model.
    export INSTANCE_TYPE=""
    ```

1. Write the `JumpStartModel` manifest to `jumpstart_model.yaml`. The manifest is filled in from the shell variables set in the previous steps. It also reads `$CLUSTER_NAMESPACE` and `$SAGEMAKER_ENDPOINT_NAME`, so set both before you run this command.

    ```
    cat <<EOF > jumpstart_model.yaml
    ---
    apiVersion: inference.sagemaker.aws.amazon.com/v1
    kind: JumpStartModel
    metadata:
      name: $SAGEMAKER_ENDPOINT_NAME
      namespace: $CLUSTER_NAMESPACE
    spec:
      sageMakerEndpoint:
        name: $SAGEMAKER_ENDPOINT_NAME
      model:
        modelHubName: SageMakerPublicHub
        modelId: $MODEL_ID
        modelVersion: $MODEL_VERSION
      server:
        instanceType: $INSTANCE_TYPE
        # Optional: Specify GPU partition profile for MIG-enabled instances
        # acceleratorPartitionType: "1g.10gb"
      metrics:
        enabled: true
      environmentVariables:
        - name: SAMPLE_ENV_VAR
          value: "sample_value"
      maxDeployTimeInSeconds: 1800
      autoScalingSpec:
        cloudWatchTrigger:
          name: "SageMaker-Invocations"
          namespace: "AWS/SageMaker"
          useCachedMetrics: false
          metricName: "Invocations"
          targetValue: 10
          minValue: 0.0
          metricCollectionPeriod: 30
          metricStat: "Sum"
          metricType: "Average"
          dimensions:
            - name: "EndpointName"
              value: "$SAGEMAKER_ENDPOINT_NAME"
            - name: "VariantName"
              value: "AllTraffic"
    EOF
    ```

## Deploy your model

**Update your Kubernetes configuration and deploy your model**

1. 
    Configure kubectl to connect to the HyperPod cluster orchestrated by Amazon EKS.

    ```
    export EKS_CLUSTER_NAME=$(aws --region "$REGION" sagemaker describe-cluster --cluster-name "$HYPERPOD_CLUSTER_NAME" \
      --query 'Orchestrator.Eks.ClusterArn' --output text | \
      cut -d'/' -f2)
    aws eks update-kubeconfig --name "$EKS_CLUSTER_NAME" --region "$REGION"
    ```

1. Deploy your JumpStart model.

    ```
    kubectl apply -f jumpstart_model.yaml
    ```

**Monitor the status of your model deployment**

1. Verify that the inference model is successfully deployed.

    ```
    kubectl describe JumpStartModel "$SAGEMAKER_ENDPOINT_NAME" -n "$CLUSTER_NAMESPACE"
    ```

1. Verify that the endpoint is successfully created.

    ```
    aws sagemaker describe-endpoint --endpoint-name "$SAGEMAKER_ENDPOINT_NAME" --output table
    ```

1. Invoke your model endpoint. You can programmatically retrieve example payloads from the `JumpStartModel` object.

    ```
    aws sagemaker-runtime invoke-endpoint \
      --endpoint-name "$SAGEMAKER_ENDPOINT_NAME" \
      --content-type "application/json" \
      --body '{"inputs": "What is AWS SageMaker?"}' \
      --region "$REGION" \
      --cli-binary-format raw-in-base64-out \
      /dev/stdout
    ```

## Manage your deployment

Delete your JumpStart model deployment when you no longer need it.

```
kubectl delete JumpStartModel "$SAGEMAKER_ENDPOINT_NAME" -n "$CLUSTER_NAMESPACE"
```

**Troubleshooting**

If your deployment isn't working as expected, use the following debugging commands.

1. Check the status of the Kubernetes deployment. This command inspects the underlying Kubernetes deployment object that manages the pods running your model. Use it to troubleshoot pod scheduling, resource allocation, and container startup issues.

    ```
    kubectl describe deployment "$SAGEMAKER_ENDPOINT_NAME" -n "$CLUSTER_NAMESPACE"
    ```

1. Check the status of your JumpStart model resource. This command examines the custom `JumpStartModel` resource that manages the high-level model configuration and deployment lifecycle. Use it to troubleshoot model-specific issues such as configuration errors or SageMaker AI endpoint creation problems.

    ```
    kubectl describe JumpStartModel "$SAGEMAKER_ENDPOINT_NAME" -n "$CLUSTER_NAMESPACE"
    ```

1. Check the status of all Kubernetes objects. This command gives an overview of all related Kubernetes resources in your namespace. Use it for a quick health check of the pods, services, deployments, and custom resources associated with your model deployment.

    ```
    kubectl get pods,svc,deployment,JumpStartModel,sagemakerendpointregistration -n "$CLUSTER_NAMESPACE"
    ```
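The `jumpstart_model.yaml` manifest is filled in from shell variables such as `$MODEL_ID` and `$SAGEMAKER_ENDPOINT_NAME`, so an unset variable produces an empty field that may only surface when the manifest is applied. The following is a minimal pre-flight sketch, not part of the original procedure. `require_vars` is a hypothetical helper name, and the variables to check are simply the names used by the commands on this page.

```shell
# Hypothetical helper (an assumption, not from the original procedure):
# report every named variable that is unset or empty, and return
# non-zero if at least one is missing.
require_vars() {
    local name value missing=0
    for name in "$@"; do
        # Indirect lookup: read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In a script, one possible call before generating the manifest is `require_vars REGION MODEL_ID MODEL_VERSION HYPERPOD_CLUSTER_NAME INSTANCE_TYPE CLUSTER_NAMESPACE SAGEMAKER_ENDPOINT_NAME || exit 1`.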
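The setup steps mention using `NextToken` to list every model in the public hub. One way to sketch that loop is shown below. It assumes the AWS CLI exposes the API's `NextToken` request field as `--next-token` for `list-hub-contents`, and it uses `jq`, which the prerequisites already require. `list_all_models` is a hypothetical helper name.

```shell
# Hypothetical helper (an assumption, not from the original procedure):
# print "<model id><TAB><version>" for every model in SageMakerPublicHub,
# following NextToken until the service stops returning one.
list_all_models() {
    local token="" page
    while :; do
        # Rebuild the argument list for each page.
        set -- sagemaker list-hub-contents --hub-name SageMakerPublicHub \
            --hub-content-type Model --output json
        if [ -n "$token" ]; then
            # Assumption: --next-token is the CLI name of the NextToken field.
            set -- "$@" --next-token "$token"
        fi
        page=$(aws "$@") || return 1
        printf '%s\n' "$page" |
            jq -r '.HubContentSummaries[]? | "\(.HubContentName)\t\(.HubContentVersion)"'
        # An absent or null NextToken means this was the last page.
        token=$(printf '%s\n' "$page" | jq -r '.NextToken // empty')
        if [ -z "$token" ]; then
            break
        fi
    done
}
```

For example, `list_all_models | grep deepseek` would narrow the listing to model IDs that contain `deepseek`.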
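The `cut -d'/' -f2` in the kubeconfig step relies on the shape of an EKS cluster ARN, `arn:<partition>:eks:<region>:<account-id>:cluster/<name>`, where the cluster name is everything after the single `/`. The sketch below performs the same extraction with a guard, because the AWS CLI prints `None` under `--output text` when the queried field is null (for example, if the cluster has no EKS orchestrator). `eks_cluster_name_from_arn` is a hypothetical helper name, and the ARN in the usage note is a made-up example.

```shell
# Hypothetical helper (an assumption, not from the original procedure):
# print the cluster name from an EKS cluster ARN, or fail if the input
# does not look like one (for example, the literal string "None").
eks_cluster_name_from_arn() {
    case "$1" in
        arn:*:eks:*:*:cluster/?*)
            # Remove everything up to and including the last '/'.
            printf '%s\n' "${1##*/}"
            ;;
        *)
            echo "error: not an EKS cluster ARN: $1" >&2
            return 1
            ;;
    esac
}
```

With the made-up input `arn:aws:eks:us-west-2:111122223333:cluster/my-cluster`, the function prints `my-cluster`. With `None`, it returns a non-zero status instead of passing a bad name to `aws eks update-kubeconfig`.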
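Endpoint creation takes time, so a `describe-endpoint` call made right after `kubectl apply` may fail or still report `Creating`. A minimal polling sketch follows. `wait_for_endpoint` is a hypothetical helper name. The 1800-second default mirrors `maxDeployTimeInSeconds` in the manifest, and the 30-second interval is an arbitrary assumption.

```shell
# Hypothetical helper (an assumption, not from the original procedure):
# poll the endpoint status until it is InService (return 0), Failed
# (return 1), or the timeout elapses (return 2).
# Usage: wait_for_endpoint <endpoint-name> [timeout-seconds] [interval-seconds]
wait_for_endpoint() {
    local name="$1" timeout="${2:-1800}" interval="${3:-30}"
    local waited=0 status
    while [ "$waited" -le "$timeout" ]; do
        # A failed call is treated as "not created yet" and retried.
        status=$(aws sagemaker describe-endpoint --endpoint-name "$name" \
            --query 'EndpointStatus' --output text 2>/dev/null) || status="NotFound"
        case "$status" in
            InService)
                echo "endpoint $name is InService"
                return 0
                ;;
            Failed)
                echo "error: endpoint $name is in Failed status" >&2
                return 1
                ;;
        esac
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "error: timed out after ${timeout}s waiting for endpoint $name" >&2
    return 2
}
```

A possible call is `wait_for_endpoint "$SAGEMAKER_ENDPOINT_NAME"`, followed by the `invoke-endpoint` command once it returns 0. The AWS CLI also provides a built-in waiter, `aws sagemaker wait endpoint-in-service --endpoint-name "$SAGEMAKER_ENDPOINT_NAME"`, which may be enough when a custom timeout is not needed.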