Multi-Model Endpoint Security

Models and data in a multi-model endpoint are co-located on the instance storage volume and in container memory. All instances for Amazon SageMaker AI endpoints run on a single-tenant container that you own. Only your models can run on your multi-model endpoint. It's your responsibility to manage the mapping of requests to models and to give users access to the correct target models. SageMaker AI uses IAM roles to provide IAM identity-based policies, which you use to specify allowed or denied actions and resources and the conditions under which actions are allowed or denied.
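As a minimal sketch of managing that request-to-model mapping in application code: the InvokeEndpoint API accepts a TargetModel parameter naming the model artifact to use, so a caller-side lookup can decide which artifact each user reaches. The tenant IDs, artifact paths, endpoint name, and helper functions below are hypothetical and chosen only for illustration. The client is passed in (for example, one created with boto3.client("sagemaker-runtime")) so the sketch itself needs nothing beyond the standard library.

```python
import json

# Hypothetical mapping from a tenant ID to the model artifact (relative to the
# endpoint's S3 prefix) that the tenant is allowed to use.
TENANT_MODELS = {
    "company_a": "company_a/churn.tar.gz",
    "company_b": "company_b/churn.tar.gz",
}


def build_invoke_args(tenant_id, payload, endpoint_name="my-multi-model-endpoint"):
    """Return keyword arguments for the SageMaker runtime InvokeEndpoint call."""
    if tenant_id not in TENANT_MODELS:
        # Refuse the request instead of falling back to some default model.
        raise PermissionError(f"no model is mapped to tenant {tenant_id!r}")
    return {
        "EndpointName": endpoint_name,  # placeholder endpoint name
        "ContentType": "application/json",
        "TargetModel": TENANT_MODELS[tenant_id],
        "Body": json.dumps(payload),
    }


def invoke_for_tenant(runtime_client, tenant_id, payload):
    """Invoke the tenant's model and return the raw response bytes.

    runtime_client is assumed to expose invoke_endpoint(**kwargs), as the
    boto3 "sagemaker-runtime" client does.
    """
    response = runtime_client.invoke_endpoint(**build_invoke_args(tenant_id, payload))
    return response["Body"].read()
```

A lookup like this only controls what your own application sends. It does not stop a principal that holds InvokeEndpoint permissions from calling the endpoint directly with a different TargetModel value, which is why the IAM-level restrictions in this section still matter.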

By default, an IAM principal with InvokeEndpoint permissions on a multi-model endpoint can invoke any model at the address of the S3 prefix defined in the CreateModel operation, provided that the IAM execution role defined in that operation has permissions to download the model. If you need to restrict InvokeEndpoint access to a limited set of models in S3, you can do one of the following:

Restrict InvokeEndpoint calls to specific models hosted at the endpoint by using the sagemaker:TargetModel IAM condition key. For example, a policy can allow InvokeEndpoint requests only when the value of the TargetModel field matches one of a specified set of patterns.

For information about SageMaker AI condition keys, see Condition Keys for Amazon SageMaker AI in the AWS Identity and Access Management User Guide.

Create multi-model endpoints with more restrictive S3 prefixes.
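The first option above can be sketched as an identity-based policy built in Python. This is an illustration under stated assumptions, not a policy taken from the original page: the endpoint ARN, account ID, and the company_a/* and common/* patterns are placeholders, and the StringLike operator is one reasonable choice of IAM condition operator for wildcard matching. The matches_any helper is a hypothetical local approximation of StringLike (case-sensitive, with * matching any run of characters and ? matching a single character), useful only for sanity-checking patterns before attaching a policy.

```python
import json
import re


def build_target_model_policy(endpoint_arn, allowed_patterns):
    """Build a policy that allows InvokeEndpoint only for matching target models."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["sagemaker:InvokeEndpoint"],
                "Resource": endpoint_arn,
                "Condition": {
                    # Assumed operator: StringLike with '*' and '?' wildcards.
                    "StringLike": {"sagemaker:TargetModel": list(allowed_patterns)}
                },
            }
        ],
    }


def matches_any(target_model, patterns):
    """Approximate StringLike locally: '*' is any run of characters, '?' is one."""
    for pattern in patterns:
        regex = "".join(
            ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
            for ch in pattern
        )
        if re.fullmatch(regex, target_model, flags=re.DOTALL):
            return True
    return False


# Placeholder ARN and patterns, for illustration only.
policy = build_target_model_policy(
    "arn:aws:sagemaker:us-east-1:111122223333:endpoint/my-multi-model-endpoint",
    ["company_a/*", "common/*"],
)
print(json.dumps(policy, indent=2))
```

With these placeholder patterns, a request whose TargetModel is company_a/model.tar.gz would satisfy the condition, while company_b/model.tar.gz would not and would be denied unless another statement allows it.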

For more information about how SageMaker AI uses roles to manage access to endpoints and perform operations on your behalf, see How to use SageMaker AI execution roles. Your customers might also have data isolation requirements, dictated by their own compliance obligations, that can be satisfied using IAM identities.
