Create a multi-container endpoint (Boto 3)
Create a multi-container endpoint by calling the CreateModel,
CreateEndpointConfig, and CreateEndpoint APIs, as you would to create any
other endpoint. You can run the containers sequentially as an inference
pipeline, or invoke each container individually by using direct invocation.
Multi-container endpoints have the following requirements when you call
create_model:
- Use the Containers parameter instead of PrimaryContainer, and include more than one container in the Containers parameter.
- The ContainerHostname parameter is required for each container in a multi-container endpoint with direct invocation.
- Set the Mode parameter of the InferenceExecutionConfig field to Direct for direct invocation of each container, or Serial to use containers as an inference pipeline. The default mode is Serial.
Note
A multi-container endpoint currently supports a maximum of 15 containers.
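The requirements above can be checked locally before calling create_model. The following is a minimal sketch, not part of the SageMaker API: validate_multi_container_model and MAX_CONTAINERS are hypothetical names introduced here, and the checks only mirror the rules and the 15-container limit stated in this section.

```python
# Hypothetical pre-flight check; not a SageMaker or Boto 3 function.
MAX_CONTAINERS = 15  # limit stated in the note above


def validate_multi_container_model(containers, mode="Serial"):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if mode not in ("Serial", "Direct"):
        problems.append(f"Mode must be 'Serial' or 'Direct', got {mode!r}")
    if len(containers) < 2:
        problems.append("Containers must include more than one container")
    if len(containers) > MAX_CONTAINERS:
        problems.append(f"At most {MAX_CONTAINERS} containers are supported")
    if mode == "Direct":
        # ContainerHostname is required only for direct invocation.
        for index, container in enumerate(containers):
            if not container.get("ContainerHostname"):
                problems.append(f"Container {index} is missing ContainerHostname")
    return problems


containers = [
    {
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/myimage1:mytag",
        "ContainerHostname": "firstContainer",
    },
    {"Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/myimage2:mytag"},
]
# The second container has no hostname, so Direct mode reports one problem.
print(validate_multi_container_model(containers, mode="Direct"))
```

Running the check before create_model surfaces a missing hostname or an oversized container list without a round trip to the service; the service itself remains the authority on what it accepts.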
The following example creates a multi-container model for direct invocation.
- Create container elements and InferenceExecutionConfig with direct invocation.

      container1 = {
          'Image': '123456789012.dkr.ecr.us-east-1.amazonaws.com/myimage1:mytag',
          'ContainerHostname': 'firstContainer'
      }

      container2 = {
          'Image': '123456789012.dkr.ecr.us-east-1.amazonaws.com/myimage2:mytag',
          'ContainerHostname': 'secondContainer'
      }

      inferenceExecutionConfig = {'Mode': 'Direct'}

- Create the model with the container elements and set the InferenceExecutionConfig field.

      import boto3

      sm_client = boto3.Session().client('sagemaker')

      # role must hold the ARN of the IAM execution role; it is not defined in this example.
      response = sm_client.create_model(
          ModelName='my-direct-mode-model-name',
          InferenceExecutionConfig=inferenceExecutionConfig,
          ExecutionRoleArn=role,
          Containers=[container1, container2]
      )
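After the model exists, the endpoint configuration and endpoint are created with the same calls used for any other endpoint. The sketch below is illustrative only: the helper names, the endpoint name, the VariantName value, and the ml.m5.xlarge instance type are assumptions, not values from this guide.

```python
def build_endpoint_requests(model_name, endpoint_name,
                            instance_type="ml.m5.xlarge", instance_count=1):
    """Build keyword arguments for create_endpoint_config and create_endpoint.

    instance_type and instance_count are placeholder defaults (assumptions).
    """
    config_name = f"{endpoint_name}-config"  # hypothetical naming scheme
    endpoint_config_request = {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # illustrative variant name
                "ModelName": model_name,
                "InitialInstanceCount": instance_count,
                "InstanceType": instance_type,
            }
        ],
    }
    endpoint_request = {
        "EndpointName": endpoint_name,
        "EndpointConfigName": config_name,
    }
    return endpoint_config_request, endpoint_request


def create_endpoint_for_model(sm_client, model_name, endpoint_name):
    """Create the endpoint configuration, then the endpoint that uses it."""
    config_request, endpoint_request = build_endpoint_requests(
        model_name, endpoint_name
    )
    sm_client.create_endpoint_config(**config_request)
    return sm_client.create_endpoint(**endpoint_request)


# Example call (requires AWS credentials and the model created above):
#   import boto3
#   sm_client = boto3.Session().client('sagemaker')
#   create_endpoint_for_model(sm_client, 'my-direct-mode-model-name',
#                             'my-direct-mode-endpoint')
```

Keeping the request dictionaries separate from the client calls makes it possible to inspect or log them before anything is created.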
To create an endpoint, you would then call create_endpoint_config