All Superinterfaces: software.amazon.jsii.JsiiSerializable

All Known Implementing Classes: ServerlessProductionVariantProps.Jsii$Proxy

@Generated(value="jsii-pacmak/1.126.0 (build 206d44b)", date="2026-02-09T14:39:23.206Z") @Stability(Experimental) public interface ServerlessProductionVariantProps extends software.amazon.jsii.JsiiSerializable

(experimental) Construction properties for a serverless production variant.

Example:

 import software.amazon.awscdk.services.sagemaker.alpha.*;

 // "model" is declared here only as a placeholder; assign it elsewhere in your stack.
 Model model;

 // provisionedConcurrency (5) must be less than or equal to maxConcurrency (10).
 EndpointConfig endpointConfig = EndpointConfig.Builder.create(this, "ServerlessEndpointConfig")
         .serverlessProductionVariant(ServerlessProductionVariantProps.builder()
                 .model(model)
                 .variantName("serverlessVariant")
                 .maxConcurrency(10)
                 .memorySizeInMB(2048)
                 .provisionedConcurrency(5)
                 .build())
         .build();

Nested Class Summary

Nested Classes

| Modifier and Type | Interface | Description |
| --- | --- | --- |
| static final class | ServerlessProductionVariantProps.Builder | A builder for ServerlessProductionVariantProps |
| static final class | ServerlessProductionVariantProps.Jsii$Proxy | An implementation for ServerlessProductionVariantProps |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static ServerlessProductionVariantProps.Builder | builder() | |
| default Number | getInitialVariantWeight() | (experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. |
| Number | getMaxConcurrency() | (experimental) The maximum number of concurrent invocations your serverless endpoint can process. |
| Number | getMemorySizeInMB() | (experimental) The memory size of your serverless endpoint. |
| IModel | getModel() | (experimental) The model to host. |
| default Number | getProvisionedConcurrency() | (experimental) The number of concurrent invocations that are provisioned and ready to respond to your endpoint. |
| String | getVariantName() | (experimental) Name of the production variant. |

Methods inherited from interface software.amazon.jsii.JsiiSerializable
$jsii$toJson

Method Details
- getMaxConcurrency
  
  @Stability(Experimental) @NotNull Number getMaxConcurrency()
  
  (experimental) The maximum number of concurrent invocations your serverless endpoint can process.
  Valid range: 1-200
- getMemorySizeInMB
  
  @Stability(Experimental) @NotNull Number getMemorySizeInMB()
  
  (experimental) The memory size of your serverless endpoint.
  Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
- getModel
  
  @Stability(Experimental) @NotNull IModel getModel()
  
  (experimental) The model to host.
- getVariantName
  
  @Stability(Experimental) @NotNull String getVariantName()
  
  (experimental) Name of the production variant.
- getInitialVariantWeight
  
  @Stability(Experimental) @Nullable default Number getInitialVariantWeight()
  
  (experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.
  The traffic to a production variant is determined by the ratio of the variant weight to the sum of all variant weight values across all production variants.
  Default: 1.0
- getProvisionedConcurrency
  
  @Stability(Experimental) @Nullable default Number getProvisionedConcurrency()
  
  (experimental) The number of concurrent invocations that are provisioned and ready to respond to your endpoint.
  Valid range: 1-200, must be less than or equal to maxConcurrency.
  Default: none
- builder
  
  @Stability(Experimental) static ServerlessProductionVariantProps.Builder builder()
  
  Returns:
  
  a ServerlessProductionVariantProps.Builder of ServerlessProductionVariantProps
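
The constraints documented in the method details above (the 1-200 concurrency range, the allowed memory sizes, provisioned concurrency not exceeding maxConcurrency, and the weight-ratio rule for traffic) can be illustrated with a small standalone sketch. The class `ServerlessVariantChecks` and its methods are hypothetical helpers written for this illustration; they are not part of the CDK API, and the construct may well apply its own validation at synthesis time.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the AWS CDK: it only mirrors the
// constraints stated in the documentation above.
final class ServerlessVariantChecks {
    // Memory sizes listed under getMemorySizeInMB.
    private static final Set<Integer> VALID_MEMORY_MB =
            Set.of(1024, 2048, 3072, 4096, 5120, 6144);

    private ServerlessVariantChecks() {}

    // Throws IllegalArgumentException if a documented constraint is violated.
    // provisionedConcurrency may be null, meaning it is not set.
    static void validate(int maxConcurrency, int memorySizeInMB, Integer provisionedConcurrency) {
        if (maxConcurrency < 1 || maxConcurrency > 200) {
            throw new IllegalArgumentException("maxConcurrency must be in 1-200");
        }
        if (!VALID_MEMORY_MB.contains(memorySizeInMB)) {
            throw new IllegalArgumentException("memorySizeInMB must be one of " + VALID_MEMORY_MB);
        }
        if (provisionedConcurrency != null) {
            if (provisionedConcurrency < 1 || provisionedConcurrency > 200) {
                throw new IllegalArgumentException("provisionedConcurrency must be in 1-200");
            }
            if (provisionedConcurrency > maxConcurrency) {
                throw new IllegalArgumentException(
                        "provisionedConcurrency must be less than or equal to maxConcurrency");
            }
        }
    }

    // Traffic share of one variant: its weight divided by the sum of all variant weights.
    static double trafficShare(double weight, List<Double> allWeights) {
        double total = 0.0;
        for (double w : allWeights) {
            total += w;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("sum of variant weights must be positive");
        }
        return weight / total;
    }
}
```

With the values from the example at the top of this page, `ServerlessVariantChecks.validate(10, 2048, 5)` passes, while a provisioned concurrency of 11 would be rejected. For two variants with weights 1.0 and 3.0, `trafficShare(1.0, List.of(1.0, 3.0))` gives the first variant a quarter of the traffic.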
