

# Deploy a custom model for on-demand inference


After you successfully create a custom model with a model customization job (fine-tuning, distillation, or continued pre-training), you can set up on-demand inference for the model.

To set up on-demand inference for a custom model, you deploy the model with a custom model deployment. After you deploy your custom model, you use the deployment's Amazon Resource Name (ARN) as the `modelId` parameter in your `InvokeModel` or `Converse` API operations. You can use the deployed model for on-demand inference with Amazon Bedrock features such as playgrounds, Agents, and Knowledge Bases. 

**Topics**
+ [Supported models](#custom-model-inference-supported-models)
+ [Deploy a custom model](deploying-custom-model.md)
+ [Use a deployment for on-demand inference](use-custom-model-on-demand.md)
+ [Delete a custom model deployment](delete-custom-model-deployment.md)

## Supported models


You can set up on-demand inference for the following models:
+ Amazon Nova Canvas
+ Amazon Nova Lite
+ Amazon Nova Micro
+ Amazon Nova Pro

# Deploy a custom model


You can deploy a custom model with the Amazon Bedrock console, AWS Command Line Interface, or AWS SDKs. For information about using the deployment for inference, see [Use a deployment for on-demand inference](https://docs.aws.amazon.com/bedrock/latest/userguide/use-custom-model-on-demand.html).

**Topics**
+ [Deploy a custom model (console)](#deploy-custom-model-console)
+ [Deploy a custom model (AWS Command Line Interface)](#deploy-custom-model-cli)
+ [Deploy a custom model (AWS SDKs)](#deploy-custom-model-sdk)

## Deploy a custom model (console)


The following procedure deploys a custom model from the **Custom models** page. You can also deploy a model from the **Custom model on-demand** page, which uses the same fields. To find that page, choose **Custom model on-demand** under **Inference and Assessment** in the navigation pane.

**To deploy a custom model**

1. Sign in to the AWS Management Console using an [IAM role with Amazon Bedrock permissions](https://docs.aws.amazon.com/bedrock/latest/userguide/getting-started.html), and open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock/](https://console.aws.amazon.com/bedrock/).

1. From the left navigation pane, choose **Custom models** under **Foundation models**.

1. In the **Models** tab, choose the radio button for the model you want to deploy.

1. Choose **Set up inference** and choose **Deploy for on-demand**.

1. In **Deployment details**, provide the following information:
   + **Deployment Name** (required) – Enter a unique name for your deployment.
   + **Description** (optional) – Enter a description for your deployment.
   + **Tags** (optional) – Add tags for cost allocation and resource management.

1. Choose **Create**. When the status shows `Completed`, your custom model is ready for on-demand inference. For more information about using the custom model, see [Use a deployment for on-demand inference](https://docs.aws.amazon.com/bedrock/latest/userguide/use-custom-model-on-demand.html).

## Deploy a custom model (AWS Command Line Interface)


To deploy a custom model for on-demand inference using the AWS Command Line Interface, use the `create-custom-model-deployment` command with your custom model's Amazon Resource Name (ARN). This command uses the [CreateCustomModelDeployment](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateCustomModelDeployment.html) API operation and returns the deployment's ARN, which you can use as the `modelId` when making inference requests. For information about using the deployment for inference, see [Use a deployment for on-demand inference](https://docs.aws.amazon.com/bedrock/latest/userguide/use-custom-model-on-demand.html).

```
aws bedrock create-custom-model-deployment \
--model-deployment-name "Unique name" \
--model-arn "Custom Model ARN" \
--description "Deployment description" \
--tags '[
    {
        "key": "Environment",
        "value": "Production"
    },
    {
        "key": "Team",
        "value": "ML-Engineering"
    },
    {
        "key": "Project",
        "value": "CustomerSupport"
    }
]' \
--client-request-token "unique-deployment-token" \
--region region
```

## Deploy a custom model (AWS SDKs)


To deploy a custom model for on-demand inference, use the [CreateCustomModelDeployment](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateCustomModelDeployment.html) API operation with your custom model's Amazon Resource Name (ARN). The response returns the deployment's ARN, which you can use as the `modelId` when making inference requests. For information about using the deployment for inference, see [Use a deployment for on-demand inference](https://docs.aws.amazon.com/bedrock/latest/userguide/use-custom-model-on-demand.html).

The following code shows how to use the SDK for Python (Boto3) to deploy a custom model. 

```
import uuid


def create_custom_model_deployment(bedrock_client):
    """Create a custom model deployment.

    Args:
        bedrock_client: A boto3 Bedrock client for making API calls

    Returns:
        str: The ARN of the created custom model deployment

    Raises:
        Exception: If there is an error creating the deployment
    """
    try:
        response = bedrock_client.create_custom_model_deployment(
            modelDeploymentName="Unique deployment name",
            modelArn="Custom Model ARN",
            description="Deployment description",
            tags=[
                {'key': 'Environment', 'value': 'Production'},
                {'key': 'Team', 'value': 'ML-Engineering'},
                {'key': 'Project', 'value': 'CustomerSupport'}
            ],
            clientRequestToken=f"deployment-{uuid.uuid4()}"
        )

        deployment_arn = response['customModelDeploymentArn']
        print(f"Deployment created: {deployment_arn}")
        return deployment_arn

    except Exception as e:
        print(f"Error creating deployment: {str(e)}")
        raise
```
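Deployment creation is asynchronous, so callers typically poll until the deployment leaves its in-progress state before sending inference requests. The following sketch polls with the `GetCustomModelDeployment` operation; the exact `status` values (`Creating`, `Active`, `Failed`) and the polling cadence are assumptions to verify against the API reference, not a definitive implementation.

```python
import time


def wait_for_deployment(bedrock_client, deployment_arn,
                        poll_seconds=30, timeout_seconds=1800):
    """Poll a custom model deployment until it reaches a terminal status.

    Assumes the GetCustomModelDeployment response carries a top-level
    'status' field with 'Creating' as the in-progress value.
    """
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        response = bedrock_client.get_custom_model_deployment(
            customModelDeploymentIdentifier=deployment_arn
        )
        status = response["status"]
        if status != "Creating":
            return status  # for example, 'Active' or 'Failed'
        time.sleep(poll_seconds)
    raise TimeoutError(
        f"Deployment {deployment_arn} did not finish within {timeout_seconds}s"
    )
```

A caller would pass the ARN returned by `create_custom_model_deployment` and proceed to inference only when the returned status indicates success.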

# Use a deployment for on-demand inference


After you deploy your custom model for on-demand inference, you can use it to generate responses by making inference requests. For `InvokeModel` or `Converse` operations, you use the deployment Amazon Resource Name (ARN) as the `modelId`.

For information about making inference requests, see the following topics:
+ [Submit prompts and generate responses with model inference](https://docs.aws.amazon.com/bedrock/latest/userguide/inference.html)
+ [Prerequisites for running model inference](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-prereq.html)
+ [Submit prompts and generate responses using the API](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-api.html)
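As a concrete sketch of the request shape, the helper below builds the keyword arguments for a `Converse` call with the deployment ARN in the `modelId` position. The ARN and prompt are placeholders, and the commented-out lines show one way to send the request with the SDK for Python (Boto3).

```python
def build_converse_request(deployment_arn, prompt):
    """Build Converse keyword arguments that target a custom model
    deployment (the deployment ARN serves as the modelId)."""
    return {
        "modelId": deployment_arn,
        "messages": [
            {"role": "user", "content": [{"text": prompt}]}
        ],
    }


# To send the request (hypothetical deployment ARN):
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**build_converse_request(
#     "arn:aws:bedrock:us-east-1:111122223333:custom-model-deployment/abc123",
#     "Summarize our refund policy in one sentence."))
# print(response["output"]["message"]["content"][0]["text"])
```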

# Delete a custom model deployment


After you finish using your model for on-demand inference, you can delete the deployment. After you delete a deployment, you can no longer use it for on-demand inference, but the underlying custom model isn't deleted.

You can delete a custom model deployment with the Amazon Bedrock console, AWS Command Line Interface, or AWS SDKs.

**Important**  
Deleting a custom model deployment is irreversible. Make sure you no longer need the deployment before proceeding with the deletion. If you need to use the custom model for on-demand inference again, you must create a new deployment.

**Topics**
+ [Delete a custom model deployment (console)](#delete-deployment-console)
+ [Delete a custom model deployment (AWS Command Line Interface)](#delete-deployment-cli)
+ [Delete a custom model deployment (AWS SDKs)](#delete-deployment-sdk)

## Delete a custom model deployment (console)


**To delete a custom model deployment**

1. In the navigation pane, under **Inference and Assessment**, choose **Custom model on-demand**.

1. Choose the custom model deployment you want to delete.

1. Choose **Delete**.

1. In the confirmation dialog, enter the deployment name to confirm the deletion.

1. Choose **Delete** to confirm.

The deployment status changes to `Deleting` while the deletion is in progress. When the deletion completes, the deployment is removed from the list.

## Delete a custom model deployment (AWS Command Line Interface)


To delete a custom model deployment using the AWS Command Line Interface, use the `delete-custom-model-deployment` command with your deployment identifier.

```
aws bedrock delete-custom-model-deployment \
--custom-model-deployment-identifier "deployment-arn-or-name" \
--region region
```

## Delete a custom model deployment (AWS SDKs)


To delete a custom model deployment programmatically, use the [DeleteCustomModelDeployment](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_DeleteCustomModelDeployment.html) API operation with the deployment's Amazon Resource Name (ARN) or name. The following code shows how to use the SDK for Python (Boto3) to delete a custom model deployment.

```
def delete_custom_model_deployment(bedrock_client):
    """Delete a custom model deployment
 
    Args:
        bedrock_client: A boto3 Bedrock client for making API calls
 
    Returns:
        dict: The response from the delete operation
 
    Raises:
        Exception: If there is an error deleting the deployment
    """
 
    try:
        response = bedrock_client.delete_custom_model_deployment(
            customModelDeploymentIdentifier="Deployment identifier"
        )
 
        print("Deployment deletion initiated")
        return response
 
    except Exception as e:
        print(f"Error deleting deployment: {str(e)}")
        raise
```
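Because deletion is also asynchronous, you may want to confirm that the deployment is actually gone before cleaning up related resources. The sketch below assumes that describing a deleted deployment raises `ResourceNotFoundException`, surfaced by Boto3 on the client's `exceptions` attribute; verify that exception name against the API reference before relying on it.

```python
def confirm_deployment_deleted(bedrock_client, deployment_identifier):
    """Return True once the deployment can no longer be described.

    Assumes the service raises ResourceNotFoundException for a deleted
    deployment (an assumption; check the API reference).
    """
    try:
        bedrock_client.get_custom_model_deployment(
            customModelDeploymentIdentifier=deployment_identifier
        )
        return False  # still visible, for example with status 'Deleting'
    except bedrock_client.exceptions.ResourceNotFoundException:
        return True
```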