ServerlessProductionVariantProps
- class aws_cdk.aws_sagemaker_alpha.ServerlessProductionVariantProps(*, max_concurrency, memory_size_in_mb, model, variant_name, initial_variant_weight=None, provisioned_concurrency=None)
Bases:
object
(experimental) Construction properties for a serverless production variant.
- Parameters:
max_concurrency (Union[int,float]) – (experimental) The maximum number of concurrent invocations your serverless endpoint can process. Valid range: 1-200.
memory_size_in_mb (Union[int,float]) – (experimental) The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
model (IModel) – (experimental) The model to host.
variant_name (str) – (experimental) Name of the production variant.
initial_variant_weight (Union[int,float,None]) – (experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. The traffic to a production variant is determined by the ratio of the variant weight to the sum of all variant weight values across all production variants. Default: 1.0
provisioned_concurrency (Union[int,float,None]) – (experimental) The number of concurrent invocations that are provisioned and ready to respond to your endpoint. Valid range: 1-200; must be less than or equal to max_concurrency. Default: none
- Stability:
experimental
Example:
    import aws_cdk.aws_sagemaker_alpha as sagemaker

    # model: sagemaker.Model

    endpoint_config = sagemaker.EndpointConfig(self, "ServerlessEndpointConfig",
        serverless_production_variant=sagemaker.ServerlessProductionVariantProps(
            model=model,
            variant_name="serverlessVariant",
            max_concurrency=10,
            memory_size_in_mb=2048,
            provisioned_concurrency=5
        )
    )
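The documented ranges for max_concurrency, memory_size_in_mb, and provisioned_concurrency can be checked in plain Python before the props are constructed. The following is a minimal sketch under that assumption; check_serverless_variant and VALID_MEMORY_SIZES_MB are hypothetical names introduced here for illustration and are not part of aws_cdk, and the construct itself may validate differently.

```python
# Hypothetical pre-flight check; not part of aws_cdk.aws_sagemaker_alpha.
# It only mirrors the constraints stated in the parameter descriptions.
VALID_MEMORY_SIZES_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def check_serverless_variant(max_concurrency, memory_size_in_mb,
                             provisioned_concurrency=None):
    """Return a list of problems; an empty list means the values fit the documented ranges."""
    problems = []
    if not 1 <= max_concurrency <= 200:
        problems.append("max_concurrency must be in the range 1-200")
    if memory_size_in_mb not in VALID_MEMORY_SIZES_MB:
        problems.append("memory_size_in_mb must be one of %s" % (VALID_MEMORY_SIZES_MB,))
    if provisioned_concurrency is not None:
        if not 1 <= provisioned_concurrency <= 200:
            problems.append("provisioned_concurrency must be in the range 1-200")
        elif provisioned_concurrency > max_concurrency:
            problems.append("provisioned_concurrency must not exceed max_concurrency")
    return problems


# The values used in the example above pass every check.
print(check_serverless_variant(10, 2048, 5))
# A non-listed memory size and an oversized provisioned concurrency are both reported.
print(check_serverless_variant(10, 2000, 20))
```

Returning a list rather than raising on the first failure lets a caller report every out-of-range value at once.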
Attributes
- initial_variant_weight
(experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.
The traffic to a production variant is determined by the ratio of the variant weight to the sum of all variant weight values across all production variants.
- Default:
1.0
- Stability:
experimental
- max_concurrency
(experimental) The maximum number of concurrent invocations your serverless endpoint can process.
Valid range: 1-200
- Stability:
experimental
- memory_size_in_mb
(experimental) The memory size of your serverless endpoint.
Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
- Stability:
experimental
- model
(experimental) The model to host.
- Stability:
experimental
- provisioned_concurrency
(experimental) The number of concurrent invocations that are provisioned and ready to respond to your endpoint.
Valid range: 1-200; must be less than or equal to max_concurrency.
- Default:
none
- Stability:
experimental
- variant_name
(experimental) Name of the production variant.
- Stability:
experimental
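The ratio described under initial_variant_weight can be worked through with a short calculation. This is an illustrative sketch only: traffic_shares and the variant names are hypothetical, and the code simply applies the stated rule (each variant's weight divided by the sum of all variant weights) without calling any AWS API.

```python
def traffic_shares(weights):
    """Map each variant name to its fraction of traffic: weight / sum of all weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("the sum of variant weights must be positive")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical variants: one left at the default weight of 1.0, one set to 3.0.
shares = traffic_shares({"variantA": 1.0, "variantB": 3.0})
print(shares)  # {'variantA': 0.25, 'variantB': 0.75}
```

With a single variant, the share is 1.0 regardless of the weight chosen, since the weight is divided by itself.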