Request Syntax URI Request Parameters Request Body Response Syntax Response Elements Errors Examples See Also

CreateCustomModelDeployment

Deploys a custom model for on-demand inference in Amazon Bedrock. After you deploy your custom model, you use the deployment's Amazon Resource Name (ARN) as the modelId parameter when you submit prompts and generate responses with model inference.

For more information about setting up on-demand inference for custom models, see Set up inference for a custom model.

The following actions are related to the CreateCustomModelDeployment operation:

Request Syntax


POST /model-customization/custom-model-deployments HTTP/1.1
Content-type: application/json

{
   "clientRequestToken": "string",
   "description": "string",
   "modelArn": "string",
   "modelDeploymentName": "string",
   "tags": [ 
      { 
         "key": "string",
         "value": "string"
      }
   ]
}

URI Request Parameters

The request does not use any URI parameters.

Request Body

The request accepts the following data in JSON format.

clientRequestToken

A unique, case-sensitive identifier to ensure that the operation completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 256.

Pattern: [a-zA-Z0-9]([-a-zA-Z0-9]{0,254}[a-zA-Z0-9])?

Required: No

description

A description for the custom model deployment to help you identify its purpose.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 2048.

Pattern: .*

Required: No

modelArn

The Amazon Resource Name (ARN) of the custom model to deploy for on-demand inference. The custom model must be in the Active state.

Type: String

Length Constraints: Minimum length of 20. Maximum length of 1011.

Pattern: arn:aws(|-us-gov|-cn|-iso|-iso-b):bedrock:[a-z0-9-]{1,20}:[0-9]{12}:custom-model/(imported|[a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}([a-z0-9-]{1,63}[.]){0,2}[a-z0-9-]{1,63}([:][a-z0-9-]{1,63}){0,2})/[a-z0-9]{12}

Required: Yes

modelDeploymentName

The name for the custom model deployment. The name must be unique within your AWS account and Region.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 63.

Pattern: ([0-9a-zA-Z][_-]?){1,63}

Required: Yes

tags

Tags to assign to the custom model deployment. You can use tags to organize and track your AWS resources for cost allocation and management purposes.

Type: Array of Tag objects

Array Members: Minimum number of 0 items. Maximum number of 200 items.

Required: No

Response Syntax


HTTP/1.1 202
Content-type: application/json

{
   "customModelDeploymentArn": "string"
}

Response Elements

If the action is successful, the service sends back an HTTP 202 response.

The following data is returned in JSON format by the service.

customModelDeploymentArn

The Amazon Resource Name (ARN) of the custom model deployment. Use this ARN as the modelId parameter when invoking the model with the InvokeModel or Converse operations.

Type: String

Length Constraints: Minimum length of 0. Maximum length of 1011.

Pattern: arn:aws(-[^:]+)?:bedrock:[a-z0-9-]{1,20}:[0-9]{12}:custom-model-deployment/[a-z0-9]{12}

Errors

For information about the errors that are common to all actions, see Common Errors.

AccessDeniedException

The request is denied because of missing access permissions.

HTTP Status Code: 403

InternalServerException

An internal server error occurred. Retry your request.

HTTP Status Code: 500

ResourceNotFoundException

The specified resource Amazon Resource Name (ARN) was not found. Check the Amazon Resource Name (ARN) and try your request again.

HTTP Status Code: 404

ServiceQuotaExceededException

The number of requests exceeds the service quota. Resubmit your request later.

HTTP Status Code: 400

ThrottlingException

The number of requests exceeds the limit. Resubmit your request later.

HTTP Status Code: 429

TooManyTagsException

The request contains more tags than can be associated with a resource (50 tags per resource). The maximum number of tags includes both existing tags and those included in your current request.

resourceName: The name of the resource with too many tags.

HTTP Status Code: 400

ValidationException

Input validation failed. Check your request parameters and retry the request.

HTTP Status Code: 400

Examples

Example request

This example illustrates one usage of CreateCustomModelDeployment.


POST /model-customization/custom-model-deployments HTTP/1.1
Content-type: application/json

{
    "clientRequestToken": "unique-deployment-token-456",
    "description": "Production deployment of my custom model for customer support chatbot",
    "modelArn": "arn:aws:bedrock:us-west-2:123456789012:custom-model-deployment/abc123def456",
    "modelDeploymentName": "customer-support-model-deployment",
    "tags": [
        {
            "key": "Environment",
            "value": "Production"
        },
        {
            "key": "Application",
            "value": "CustomerSupport"
        },
        {
            "key": "CostCenter",
            "value": "Engineering"
        }
    ]
}