Interface ServerlessProductionVariantProps
- All Superinterfaces:
software.amazon.jsii.JsiiSerializable
- All Known Implementing Classes:
ServerlessProductionVariantProps.Jsii$Proxy
@Generated(value="jsii-pacmak/1.119.0 (build 1634eac)",
date="2025-11-17T14:41:05.250Z")
@Stability(Experimental)
public interface ServerlessProductionVariantProps
extends software.amazon.jsii.JsiiSerializable
(experimental) Construction properties for a serverless production variant.
Example:
import software.amazon.awscdk.services.sagemaker.alpha.*;

Model model;

EndpointConfig endpointConfig = EndpointConfig.Builder.create(this, "ServerlessEndpointConfig")
        .serverlessProductionVariant(ServerlessProductionVariantProps.builder()
                .model(model)
                .variantName("serverlessVariant")
                .maxConcurrency(10)
                .memorySizeInMB(2048)
                .provisionedConcurrency(5)
                .build())
        .build();
Nested Class Summary
Nested Classes
- static final class ServerlessProductionVariantProps.Builder: A builder for ServerlessProductionVariantProps
- static final class ServerlessProductionVariantProps.Jsii$Proxy: An implementation for ServerlessProductionVariantProps
Method Summary
- builder()
- default Number getInitialVariantWeight(): (experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.
- getMaxConcurrency(): (experimental) The maximum number of concurrent invocations your serverless endpoint can process.
- getMemorySizeInMB(): (experimental) The memory size of your serverless endpoint.
- getModel(): (experimental) The model to host.
- default Number getProvisionedConcurrency(): (experimental) The number of concurrent invocations that are provisioned and ready to respond to your endpoint.
- getVariantName(): (experimental) Name of the production variant.
Methods inherited from interface software.amazon.jsii.JsiiSerializable
$jsii$toJson
Method Details
getMaxConcurrency
(experimental) The maximum number of concurrent invocations your serverless endpoint can process.
Valid range: 1-200
getMemorySizeInMB
(experimental) The memory size of your serverless endpoint.
Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
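The memory size rule above can be checked locally before the value is passed to the builder. The following is a minimal sketch, not part of the CDK API: the class name `ServerlessMemorySize` and the method `isValid` are hypothetical helpers, and this page does not state whether the construct performs an equivalent check itself.

```java
// Hypothetical helper (not part of the CDK): checks a memory size against the
// documented values of 1024 MB to 6144 MB in 1 GB (1024 MB) increments.
final class ServerlessMemorySize {
    private static final int MIN_MB = 1024;
    private static final int MAX_MB = 6144;
    private static final int STEP_MB = 1024;

    static boolean isValid(int memorySizeInMB) {
        return memorySizeInMB >= MIN_MB
                && memorySizeInMB <= MAX_MB
                && memorySizeInMB % STEP_MB == 0;
    }

    public static void main(String[] args) {
        // Print the verdict for a few candidate sizes.
        for (int mb : new int[] {1024, 2048, 2500, 6144, 7168}) {
            System.out.println(mb + " MB valid: " + isValid(mb));
        }
    }
}
```

A check like this could guard the argument given to `memorySizeInMB(...)`, so that a value such as 2500 is caught before synthesis rather than at deployment.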
getModel
(experimental) The model to host.
getVariantName
(experimental) Name of the production variant.
getInitialVariantWeight
(experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.
The traffic to a production variant is determined by the ratio of the variant weight to the sum of all variant weight values across all production variants.
Default: 1.0
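The weight-to-traffic ratio described above can be illustrated with a short calculation. This is a sketch only: the class `VariantTrafficShare`, the method `shares`, and the variant names are hypothetical and are not part of the CDK; the arithmetic simply applies the stated rule of dividing each weight by the sum of all weights.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the CDK): turns variant weights into
// expected traffic fractions using weight / sum(all weights).
final class VariantTrafficShare {
    static Map<String, Double> shares(Map<String, Double> weights) {
        double total = 0.0;
        for (double w : weights.values()) {
            total += w;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("Sum of variant weights must be positive");
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            result.put(e.getKey(), e.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two illustrative variants: one at the default weight, one weighted 3.0.
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("variantA", 1.0);
        weights.put("variantB", 3.0);
        System.out.println(shares(weights));
    }
}
```

With weights of 1.0 and 3.0, the first variant is expected to receive one quarter of the traffic and the second three quarters.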
getProvisionedConcurrency
(experimental) The number of concurrent invocations that are provisioned and ready to respond to your endpoint.
Valid range: 1-200, must be less than or equal to maxConcurrency.
Default: - none
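The relationship between the two concurrency settings can be expressed as a small pre-check. The sketch below assumes nothing beyond the ranges documented on this page; `ServerlessConcurrencyCheck` and `validate` are hypothetical names, not CDK API, and a `null` provisioned value stands for the property being left unset.

```java
// Hypothetical helper (not part of the CDK): enforces the documented ranges
// for maxConcurrency and provisionedConcurrency.
final class ServerlessConcurrencyCheck {
    static void validate(int maxConcurrency, Integer provisionedConcurrency) {
        if (maxConcurrency < 1 || maxConcurrency > 200) {
            throw new IllegalArgumentException("maxConcurrency must be in 1-200");
        }
        if (provisionedConcurrency == null) {
            return; // optional property left unset
        }
        if (provisionedConcurrency < 1 || provisionedConcurrency > 200) {
            throw new IllegalArgumentException("provisionedConcurrency must be in 1-200");
        }
        if (provisionedConcurrency > maxConcurrency) {
            throw new IllegalArgumentException(
                    "provisionedConcurrency must be less than or equal to maxConcurrency");
        }
    }

    public static void main(String[] args) {
        validate(10, 5);    // accepted: 5 <= 10
        validate(10, null); // accepted: provisioned concurrency not set
        try {
            validate(10, 20); // rejected: 20 > 10
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```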
builder