Deploy a custom model (console)Deploy a custom model (AWS Command Line Interface)Deploy a custom model (AWS SDKs)

Deploy a custom model

You can deploy a custom model with the Amazon Bedrock console, AWS Command Line Interface, or AWS SDKs. For information about using the deployment for inference, see Use a deployment for on-demand inference.

Topics

Deploy a custom model (console)
Deploy a custom model (AWS Command Line Interface)
Deploy a custom model (AWS SDKs)

Deploy a custom model (console)

You deploy a custom model from the Custom models page as follows. You can also deploy a model from the Custom model on-demand page with the same fields. To find this page, in Inference and Assessment in the navigation pane, choose Custom model on-demand.

To deploy a custom model

Sign in to the AWS Management Console using an IAM role with Amazon Bedrock permissions, and open the Amazon Bedrock console at https://console.aws.amazon.com/nova/.
From the left navigation pane, choose Custom models under Foundation models.
In the Models tab, choose the radio button for the model you want to deploy.
Choose Set up inference and choose Deploy for on-demand.
In Deployment details, provide the following information:
- Deployment Name (required) – Enter a unique name for your deployment.
- Description (optional) – Enter a description for your deployment.
- Tags (optional) – Add tags for cost allocation and resource management.
Choose Create. When the status shows Completed, your custom model is ready for on-demand inference. For more information about using the custom model, see Use a deployment for on-demand inference.

Deploy a custom model (AWS Command Line Interface)

To deploy a custom model for on-demand inference using the AWS Command Line Interface, use the create-custom-model-deployment command with your custom model's Amazon Resource Name (ARN). This command uses the CreateCustomModelDeployment API operation. It returns the deployment's ARN that you can use as the modelId when making inference requests. For information about using the deployment for inference, see Use a deployment for on-demand inference.


aws bedrock create-custom-model-deployment \
--model-deployment-name "Unique name" \
--model-arn "Custom Model ARN" \
--description "Deployment description" \
--tags '[
    {
        "key": "Environment",
        "value": "Production"
    },
    {
        "key": "Team",
        "value": "ML-Engineering"
    },
    {
        "key": "Project",
        "value": "CustomerSupport"
    }
]' \
--client-request-token "unique-deployment-token" \
--region region

Deploy a custom model (AWS SDKs)

To deploy a custom model for on-demand inference, use the CreateCustomModelDeployment API operation with your custom model's Amazon Resource Name (ARN). The response returns the deployment's ARN that you can use as the modelId when making inference requests. For information about using the deployment for inference, see Use a deployment for on-demand inference.

The following code shows how to use the SDK for Python (Boto3) to deploy a custom model.


def create_custom_model_deployment(bedrock_client):
    """Create a custom model deployment
    Args:
        bedrock_client: A boto3 Bedrock client for making API calls
 
    Returns:
        str: The ARN of the created custom model deployment
 
    Raises:
        Exception: If there is an error creating the deployment
    """
 
    try:
        response = bedrock_client.create_custom_model_deployment(
            modelDeploymentName="Unique deployment name",
            modelArn="Custom Model ARN",
            description="Deployment description",
            tags=[
                {'key': 'Environment', 'value': 'Production'},
                {'key': 'Team', 'value': 'ML-Engineering'},
                {'key': 'Project', 'value': 'CustomerSupport'}
            ],
            clientRequestToken=f"deployment-{uuid.uuid4()}"
        )
 
        deployment_arn = response['customModelDeploymentArn']
        print(f"Deployment created: {deployment_arn}")
        return deployment_arn
 
    except Exception as e:
        print(f"Error creating deployment: {str(e)}")
        raise

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Deploy a custom model for on-demand inference

Use a deployment for on-demand inference