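Deploy models from JumpStart using kubectl

The following steps show how to deploy a JumpStart model to a HyperPod cluster using kubectl.

The instructions below contain code cells and commands designed to run in a Jupyter notebook environment, such as Amazon SageMaker Studio or a SageMaker notebook instance. Each code block represents a notebook cell, and the cells should be run in order. Interactive elements, including the model discovery tables and status monitoring commands, are optimized for the notebook interface and might not work correctly in other environments. Before you proceed, make sure you have access to a notebook environment with the necessary AWS permissions.

Prerequisites

Make sure you have set up inference capabilities on your Amazon SageMaker HyperPod cluster. For more information, see Setting up your HyperPod clusters for model deployment.

Setup and configuration

Select your Region

Choose the Region where your HyperPod cluster is deployed and where you want to run your inference workloads. You can also add other customizations to sagemaker_client.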


region_name = "<REGION>"  # Replace with your Region, for example "us-west-2"

import boto3
from botocore.config import Config

# Configure retry options
boto3_config = Config(
    retries={
        'max_attempts': 10,  # Maximum number of retry attempts
        'mode': 'adaptive'   # Standard retries with backoff, plus client-side rate limiting
    }
)

sagemaker_client = boto3.client("sagemaker", region_name=region_name, config=boto3_config)

Choose your model and cluster

View all SageMaker public hub models and HyperPod clusters.


interactive_view(get_all_public_hub_model_data(sagemaker_client))

interactive_view(get_all_cluster_data(sagemaker_client))

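Set the model ID and cluster name you selected in the following variables.

Note

Check with your cluster administrator to make sure the execution role for this notebook has been granted the required permissions. You can run !aws sts get-caller-identity --query "Arn" to check which execution role you are using.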
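The ARN returned by that command usually identifies an assumed-role session rather than the role itself. The following is a minimal sketch of extracting the role name to pass on to your administrator. The helper name role_name_from_caller_arn is hypothetical and not part of any AWS SDK; it assumes the standard ARN layouts for assumed-role sessions and IAM roles.

```python
from typing import Optional


def role_name_from_caller_arn(caller_arn: str) -> Optional[str]:
    """Return the IAM role name embedded in a caller ARN, or None.

    Hypothetical helper. Handles two layouts:
      arn:aws:sts::<account>:assumed-role/<role>/<session>
      arn:aws:iam::<account>:role/<optional/path/><role>
    """
    # An ARN has six colon-separated fields; the last one is the resource.
    resource = caller_arn.split(":", 5)[-1]
    parts = resource.split("/")
    if parts[0] == "assumed-role" and len(parts) >= 2:
        return parts[1]
    if parts[0] == "role" and len(parts) >= 2:
        return parts[-1]
    return None  # For example, an IAM user ARN


# In a notebook with credentials, the ARN could be obtained with:
#   import boto3
#   arn = boto3.client("sts").get_caller_identity()["Arn"]
example_arn = "arn:aws:sts::123456789012:assumed-role/MyExecutionRole/SageMaker"
print(role_name_from_caller_arn(example_arn))
```

The account ID and role name in example_arn are placeholders.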


# Change the model_id based on your requirement. A list of model IDs is available in step 1 of this notebook.
# Proprietary models are not supported
model_id = "<insert model id here>"

from sagemaker.hyperpod.inference.notebook_utils import validate_public_hub_model_is_not_proprietary

validate_public_hub_model_is_not_proprietary(sagemaker_client, model_id)


# Select the cluster name where you want to deploy the model. List of clusters is available in step 1 of this notebook.
cluster_name = "<insert cluster name here>"

from sagemaker.hyperpod.inference.notebook_utils import validate_cluster_can_support_public_hub_model
from sagemaker.hyperpod.inference.notebook_utils import get_public_hub_model_compatible_instances

validate_cluster_can_support_public_hub_model(sagemaker_client, model_id, cluster_name)
interactive_view(get_public_hub_model_compatible_instances(sagemaker_client, model_id))


# Select the instance that is relevant for your model deployment and exists within the selected cluster.
instance_type = "ml.g5.8xlarge"

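Confirm with your cluster administrator which namespace you are permitted to use. The administrator should already have created a hyperpod-inference service account in that namespace.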
```
cluster_namespace = "default"             
```

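Configure the S3 bucket name

Set the name of the S3 bucket for certificates. The bucket must contain a folder named certificates, where the certificates will be uploaded. The bucket must also be in the same Region defined above.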


# Set the S3 bucket name where TLS certificates will be stored for secure model communication
certificate_bucket = "<insert bucket name here>"
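Because the bucket must be in the same Region as the cluster, it can help to check this before deploying. The sketch below uses the S3 GetBucketLocation API through a boto3 S3 client. The helper names bucket_region and bucket_matches_region are hypothetical, and the caller is assumed to have s3:GetBucketLocation permission on the bucket.

```python
def bucket_region(s3_client, bucket_name: str) -> str:
    """Return the Region a bucket lives in, based on S3 GetBucketLocation."""
    response = s3_client.get_bucket_location(Bucket=bucket_name)
    location = response.get("LocationConstraint")
    if location is None:
        return "us-east-1"  # S3 reports us-east-1 as a null constraint
    if location == "EU":
        return "eu-west-1"  # Legacy alias that S3 can return for older buckets
    return location


def bucket_matches_region(s3_client, bucket_name: str, expected_region: str) -> bool:
    """Return True when the bucket is in expected_region."""
    return bucket_region(s3_client, bucket_name) == expected_region


# In the notebook this could be called as:
#   import boto3
#   s3_client = boto3.client("s3", region_name=region_name)
#   assert bucket_matches_region(s3_client, certificate_bucket, region_name)
```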


import yaml
from datetime import datetime

# Get current time in format suitable for endpoint name
current_time = datetime.now().strftime("%Y%m%d-%H%M%S")
sagemaker_endpoint_name=f"{model_id}-{current_time}"
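Concatenating the model ID and a timestamp can produce a name that SageMaker rejects. The sketch below assumes the constraint documented for the CreateEndpoint API (at most 63 characters, alphanumerics and hyphens only) and trims or sanitizes the model ID to fit. The helper make_endpoint_name is hypothetical, and whether the HyperPod inference operator applies further restrictions is not covered here.

```python
import re
from datetime import datetime

# Pattern assumed from the SageMaker CreateEndpoint API reference.
ENDPOINT_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}$")


def make_endpoint_name(model_id: str, now: datetime, max_length: int = 63) -> str:
    """Build '<model-id>-<timestamp>' that fits the endpoint name rules."""
    suffix = now.strftime("%Y%m%d-%H%M%S")  # Always 15 characters
    # Replace runs of characters that are not alphanumeric or hyphen.
    prefix = re.sub(r"[^a-zA-Z0-9-]+", "-", model_id).strip("-")
    # Leave room for the joining hyphen and the timestamp.
    prefix = prefix[: max_length - len(suffix) - 1].rstrip("-")
    name = f"{prefix}-{suffix}" if prefix else suffix
    if not ENDPOINT_NAME_PATTERN.match(name):
        raise ValueError(f"Cannot build a valid endpoint name from {model_id!r}")
    return name


print(make_endpoint_name("huggingface-llm-mistral-7b", datetime.now()))
```

The model ID in the print call is only an illustration; substitute your own model_id.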


def generate_jumpstart_model_yaml(model_id, model_version, namespace, instance_type, output_file_path, certificate_bucket):
    """
    Generate a JumpStartModel YAML file with the provided parameters.

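    Args:
        model_id (str): The model ID
        model_version (str): The model version
        namespace (str): The namespace
        instance_type (str): The instance type
        output_file_path (str): Path where the YAML file will be saved
        certificate_bucket (str): S3 bucket where TLS certificates are stored

    Note: the endpoint name is read from the notebook-level
    sagemaker_endpoint_name variable defined above.
    """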

    # Create the YAML structure
    tlsCertificateOutputS3Uri = "s3://" + certificate_bucket + "/certificates/"
    model_config = {
        "apiVersion": "inference.sagemaker.aws.amazon.com/v1alpha1",
        "kind": "JumpStartModel",
        "metadata": {
            "name": model_id,
            "namespace": namespace
        },
        "spec": {
            "sageMakerEndpoint": {
                "name": sagemaker_endpoint_name
            },
            "model": {
                "modelHubName": "SageMakerPublicHub",
                "modelId": model_id,
                # modelVersion is optional
                "modelVersion": model_version
                # acceptEula is optional, set value to True when using a gated model
            },
            "server": {
                "instanceType": instance_type
            },
            "tlsConfig": {
                "tlsCertificateOutputS3Uri": tlsCertificateOutputS3Uri
            }
        }
    }

    # Write to YAML file
    with open(output_file_path, 'w') as file:
        yaml.dump(model_config, file, default_flow_style=False)

    print(f"YAML file created successfully at: {output_file_path}")


# Import JumpStart utilities to retrieve model specifications and version information
from sagemaker.jumpstart import utils
from sagemaker.jumpstart.enums import JumpStartScriptScope

model_specs = utils.verify_model_region_and_return_specs(
    model_id, "*", JumpStartScriptScope.INFERENCE, region=region_name
)
model_version = model_specs.version


# Generate the output filename for the Kubernetes YAML configuration
output_file_path=f"jumpstart-model-{model_id}.yaml"
generate_jumpstart_model_yaml(
    model_id=model_id,
    model_version=model_version,
    namespace=cluster_namespace,
    instance_type=instance_type,
    output_file_path=output_file_path,
    certificate_bucket=certificate_bucket
)
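The comment in the manifest above mentions that acceptEula is optional and should be set to True for gated models. The sketch below shows one way to add that field to a manifest dictionary before it is written to YAML. The helper with_accept_eula is hypothetical, and placing acceptEula under spec.model is an assumption based on where the comment sits in the function above.

```python
import copy


def with_accept_eula(manifest: dict, accept: bool = True) -> dict:
    """Return a copy of a JumpStartModel manifest with spec.model.acceptEula set."""
    updated = copy.deepcopy(manifest)  # Leave the caller's dictionary untouched
    updated["spec"]["model"]["acceptEula"] = accept
    return updated


# Trimmed-down manifest with the same shape as model_config above.
example_manifest = {
    "kind": "JumpStartModel",
    "spec": {
        "model": {
            "modelHubName": "SageMakerPublicHub",
            "modelId": "<insert model id here>",
        }
    },
}

gated_manifest = with_accept_eula(example_manifest)
print(gated_manifest["spec"]["model"])
```

Only set this field after you have read and accepted the model's license terms.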

import os
os.environ["JUMPSTART_YAML_FILE_PATH"]=output_file_path
os.environ["MODEL_ID"]=model_id

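Deploy your model

Update your Kubernetes configuration and deploy the model

Retrieve the ARN of the EKS cluster that backs your HyperPod cluster. The EKS cluster name is the final segment of the ARN, after cluster/.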


!aws sagemaker describe-cluster --cluster-name $cluster_name --query "Orchestrator.Eks.ClusterArn"
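The command above prints an ARN, while the next step needs only the cluster name. As a minimal sketch, the name can be extracted in Python from the same DescribeCluster response field that the CLI query reads. The helper names below are hypothetical.

```python
def eks_cluster_name_from_arn(cluster_arn: str) -> str:
    """Extract '<name>' from 'arn:aws:eks:<region>:<account>:cluster/<name>'."""
    resource = cluster_arn.split(":", 5)[-1]  # Expected form: "cluster/<name>"
    kind, _, name = resource.partition("/")
    if kind != "cluster" or not name:
        raise ValueError(f"Not an EKS cluster ARN: {cluster_arn!r}")
    return name


def eks_cluster_name_for_hyperpod(sagemaker_client, hyperpod_cluster_name: str) -> str:
    """Look up the EKS cluster that orchestrates a HyperPod cluster."""
    response = sagemaker_client.describe_cluster(ClusterName=hyperpod_cluster_name)
    return eks_cluster_name_from_arn(response["Orchestrator"]["Eks"]["ClusterArn"])


# In the notebook this could be called as:
#   eks_cluster_name = eks_cluster_name_for_hyperpod(sagemaker_client, cluster_name)
print(eks_cluster_name_from_arn("arn:aws:eks:us-west-2:123456789012:cluster/my-eks-cluster"))
```

The ARN in the print call is a placeholder.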

Configure kubectl to connect to the EKS cluster.


!aws eks update-kubeconfig --name "<insert name of eks cluster from above>" --region $region_name

Deploy your JumpStart model.


!kubectl apply -f $JUMPSTART_YAML_FILE_PATH

Monitor the status of your model deployment

Make sure the model was deployed successfully.


!kubectl describe JumpStartModel $model_id -n $cluster_namespace

Make sure the endpoint was created successfully.


!kubectl describe SageMakerEndPointRegistration $sagemaker_endpoint_name -n $cluster_namespace
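Instead of re-running the describe commands by hand, you can poll the endpoint status from Python. The sketch below uses the SageMaker DescribeEndpoint API and waits for the InService status. It assumes the endpoint is already visible to SageMaker; if it has not been registered yet, describe_endpoint raises an error that this sketch does not handle. The helper wait_for_endpoint and its default timings are hypothetical.

```python
import time


def wait_for_endpoint(sagemaker_client, endpoint_name, timeout_seconds=1800,
                      poll_seconds=30, sleep=time.sleep):
    """Poll DescribeEndpoint until the endpoint is InService.

    Raises RuntimeError if the endpoint fails and TimeoutError if it does
    not become InService within roughly timeout_seconds.
    """
    waited = 0
    while True:
        response = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} entered the Failed state")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Endpoint {endpoint_name} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds


# In the notebook this could be called as:
#   wait_for_endpoint(sagemaker_client, sagemaker_endpoint_name)
```

The sleep parameter exists only so the loop can be exercised without real delays.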

Invoke your model endpoint

You can programmatically retrieve example payloads from the JumpStartModel object.


import boto3

prompt = '{"inputs": "What is AWS SageMaker?"}'

runtime_client = boto3.client('sagemaker-runtime', region_name=region_name)
response = runtime_client.invoke_endpoint(
    EndpointName=sagemaker_endpoint_name,
    ContentType="application/json",
    Body=prompt
)
print(response["Body"].read().decode())
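Hand-written JSON strings are easy to get wrong, especially when the prompt itself contains quotation marks. A minimal sketch of building the request body with json.dumps follows. The inputs key comes from the example above, while the optional parameters key and the max_new_tokens setting are assumptions that apply only to models whose containers accept them. The helper build_payload is hypothetical.

```python
import json
from typing import Optional


def build_payload(text: str, parameters: Optional[dict] = None) -> str:
    """Serialize a prompt (and optional generation parameters) to JSON."""
    body = {"inputs": text}
    if parameters:
        # Assumption: the model container accepts a "parameters" object.
        body["parameters"] = parameters
    return json.dumps(body)


payload = build_payload('What does "InService" mean?', {"max_new_tokens": 64})
print(payload)

# The resulting string can be passed as Body to invoke_endpoint:
#   runtime_client.invoke_endpoint(
#       EndpointName=sagemaker_endpoint_name,
#       ContentType="application/json",
#       Body=payload,
#   )
```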

Manage your deployment

Clean up resources

Delete your JumpStart model deployment when you no longer need it.


!kubectl delete JumpStartModel $model_id -n $cluster_namespace

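Troubleshooting

If your deployment isn't working as expected, use these debugging commands.

Check the status of the Kubernetes deployment. This command inspects the underlying Kubernetes Deployment object that manages the pods running your model. Use it to troubleshoot pod scheduling, resource allocation, and container startup issues.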
```
!kubectl describe deployment $model_id -n $cluster_namespace               
```
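Check the status of your JumpStart model resource. This command inspects the custom JumpStartModel resource that manages the high-level model configuration and deployment lifecycle. Use it to troubleshoot model-specific issues such as configuration errors or SageMaker endpoint creation problems.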
```
!kubectl describe JumpStartModel $model_id -n $cluster_namespace               
```
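Check the status of all Kubernetes objects. This command gives a complete overview of the related Kubernetes resources in your namespace. Use it for a quick health check of the pods, services, deployments, and custom resources associated with your model deployment.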
```
!kubectl get pods,svc,deployment,JumpStartModel,sagemakerendpointregistration -n $cluster_namespace              
```
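The ! prefix and $variable expansion used in these commands are notebook features. Outside a notebook, the same commands can be run with the Python standard library, as in the sketch below. It assumes kubectl is installed and already configured for the cluster. The helper names are hypothetical.

```python
import subprocess


def kubectl_describe_command(kind: str, name: str, namespace: str) -> list:
    """Build the argument list for 'kubectl describe <kind> <name> -n <namespace>'."""
    return ["kubectl", "describe", kind, name, "-n", namespace]


def run_command(command: list) -> tuple:
    """Run a command without a shell and return (exit code, combined output)."""
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.returncode, completed.stdout + completed.stderr


command = kubectl_describe_command("JumpStartModel", "<insert model id here>", "default")
print(" ".join(command))

# With kubectl available, the command could be executed as:
#   exit_code, output = run_command(command)
#   print(output)
```

Passing the arguments as a list avoids shell quoting problems when a name contains unusual characters.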
