Deploy a custom model
You can deploy a custom model with the Amazon Bedrock console, AWS Command Line Interface, or AWS SDKs. For information about using the deployment for inference, see Use a deployment for on-demand inference.
Topics
- Deploy a custom model (console)
- Deploy a custom model (AWS Command Line Interface)
- Deploy a custom model (AWS SDKs)
Deploy a custom model (console)
You can deploy a custom model from the Custom models page, as described in the following procedure. You can also deploy a model from the Custom model on-demand page, which uses the same fields. To find that page, choose Custom model on-demand under Infer in the navigation pane.
To deploy a custom model

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at https://console.aws.amazon.com/bedrock.

2. From the left navigation pane, choose Custom models under Foundation models.

3. In the Models tab, choose the radio button for the model that you want to deploy.

4. Choose Set up inference, and then choose Deploy for on-demand.

5. In Deployment details, provide the following information:
   - Deployment Name (required) – Enter a unique name for your deployment.
   - Description (optional) – Enter a description for your deployment.
   - Tags (optional) – Add tags for cost allocation and resource management.

6. Choose Create. When the deployment's status is Active, your custom model is ready for on-demand inference. For more information about using the custom model, see Use a deployment for on-demand inference.
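The console displays the deployment status for you. If you prefer to verify status from a script instead, the following is a minimal sketch using the SDK for Python (Boto3). It assumes the GetCustomModelDeployment operation and its customModelDeploymentIdentifier parameter, and that the response includes a status field; check the API reference for the exact shapes, and the example ARN is purely illustrative.

import boto3

# Control-plane client for managing deployments.
bedrock_client = boto3.client("bedrock", region_name="us-east-1")

# Replace with your deployment's ARN. This example ARN is hypothetical.
response = bedrock_client.get_custom_model_deployment(
    customModelDeploymentIdentifier=(
        "arn:aws:bedrock:us-east-1:111122223333:custom-model-deployment/abc123"
    )
)

# The deployment is ready for on-demand inference once the status is Active.
print(response["status"])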
Deploy a custom model (AWS Command Line Interface)
To deploy a custom model for on-demand inference using the AWS Command Line Interface, use the create-custom-model-deployment command with your custom model's Amazon Resource Name (ARN). This command uses the CreateCustomModelDeployment API operation. The response includes the deployment's ARN. When the deployment is active, you use this ARN as the modelId when making inference requests. For information about using the deployment for inference, see Use a deployment for on-demand inference.
aws bedrock create-custom-model-deployment \
    --model-deployment-name "Unique name" \
    --model-arn "Custom Model ARN" \
    --description "Deployment description" \
    --tags '[
        {"key": "Environment", "value": "Production"},
        {"key": "Team", "value": "ML-Engineering"},
        {"key": "Project", "value": "CustomerSupport"}
    ]' \
    --client-request-token "unique-deployment-token" \
    --region region
Deploy a custom model (AWS SDKs)
To deploy a custom model for on-demand inference, use the CreateCustomModelDeployment API operation with your custom model's Amazon Resource Name (ARN). The response includes the deployment's ARN. When the deployment is active, you use this ARN as the modelId when making inference requests. For information about using the deployment for inference, see Use a deployment for on-demand inference.
The following code shows how to use the SDK for Python (Boto3) to deploy a custom model.
import uuid


def create_custom_model_deployment(bedrock_client):
    """Create a custom model deployment.

    Args:
        bedrock_client: A boto3 Amazon Bedrock client for making API calls

    Returns:
        str: The ARN of the new custom model deployment

    Raises:
        Exception: If there is an error creating the deployment
    """
    try:
        response = bedrock_client.create_custom_model_deployment(
            modelDeploymentName="Unique deployment name",
            modelArn="Custom Model ARN",
            description="Deployment description",
            tags=[
                {'key': 'Environment', 'value': 'Production'},
                {'key': 'Team', 'value': 'ML-Engineering'},
                {'key': 'Project', 'value': 'CustomerSupport'}
            ],
            # A unique token makes the request idempotent on retries.
            clientRequestToken=f"deployment-{uuid.uuid4()}"
        )
        deployment_arn = response['customModelDeploymentArn']
        print(f"Deployment created: {deployment_arn}")
        return deployment_arn
    except Exception as e:
        print(f"Error creating deployment: {str(e)}")
        raise