Interface ServerlessProductionVariantProps

All Superinterfaces:
software.amazon.jsii.JsiiSerializable
All Known Implementing Classes:
ServerlessProductionVariantProps.Jsii$Proxy

@Generated(value="jsii-pacmak/1.119.0 (build 1634eac)", date="2025-11-17T14:41:05.250Z") @Stability(Experimental) public interface ServerlessProductionVariantProps extends software.amazon.jsii.JsiiSerializable
(experimental) Construction properties for a serverless production variant.

Example:

 import software.amazon.awscdk.services.sagemaker.alpha.*;
 Model model; // an existing Model construct, assumed to be defined elsewhere
 EndpointConfig endpointConfig = EndpointConfig.Builder.create(this, "ServerlessEndpointConfig")
         .serverlessProductionVariant(ServerlessProductionVariantProps.builder()
                 .model(model)
                 .variantName("serverlessVariant")
                 .maxConcurrency(10)
                 .memorySizeInMB(2048)
                 .provisionedConcurrency(5)
                 .build())
         .build();
 
  • Method Details

    • getMaxConcurrency

      @Stability(Experimental) @NotNull Number getMaxConcurrency()
      (experimental) The maximum number of concurrent invocations your serverless endpoint can process.

      Valid range: 1-200

    • getMemorySizeInMB

      @Stability(Experimental) @NotNull Number getMemorySizeInMB()
      (experimental) The memory size of your serverless endpoint, in MB.

      Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
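      The allowed memory sizes above can be checked up front with a small helper. This is a minimal sketch in plain Java, not part of the CDK API: the class and method names are hypothetical, and the source does not say whether the construct performs this validation itself.

```java
// Hypothetical helper, not part of software.amazon.awscdk.
final class ServerlessMemorySketch {
    private static final int STEP_MB = 1024; // sizes come in 1 GB increments
    private static final int MIN_MB = 1024;
    private static final int MAX_MB = 6144;

    private ServerlessMemorySketch() {
    }

    /** Returns true for 1024, 2048, 3072, 4096, 5120, and 6144. */
    static boolean isValidMemorySizeInMB(int memorySizeInMB) {
        return memorySizeInMB >= MIN_MB
                && memorySizeInMB <= MAX_MB
                && memorySizeInMB % STEP_MB == 0;
    }
}
```

      A caller could run this check on a value before passing it to memorySizeInMB(...) on the builder.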

    • getModel

      @Stability(Experimental) @NotNull IModel getModel()
      (experimental) The model to host.
    • getVariantName

      @Stability(Experimental) @NotNull String getVariantName()
      (experimental) Name of the production variant.
    • getInitialVariantWeight

      @Stability(Experimental) @Nullable default Number getInitialVariantWeight()
      (experimental) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.

      The traffic to a production variant is determined by the ratio of the variant weight to the sum of all variant weight values across all production variants.

      Default: 1.0
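      The ratio described above can be worked through in plain Java. This is an illustrative sketch only: the class name, method name, and variant names are hypothetical and not part of the CDK API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of software.amazon.awscdk.
final class VariantTrafficSketch {
    private VariantTrafficSketch() {
    }

    /** Maps each variant name to weight / (sum of all weights). */
    static Map<String, Double> trafficShares(Map<String, Double> weights) {
        double total = 0.0;
        for (double weight : weights.values()) {
            total += weight;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("The weights must sum to a positive value");
        }
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            shares.put(entry.getKey(), entry.getValue() / total);
        }
        return shares;
    }
}
```

      For example, two variants with weights 1.0 and 3.0 would receive 0.25 and 0.75 of the traffic, respectively.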

    • getProvisionedConcurrency

      @Stability(Experimental) @Nullable default Number getProvisionedConcurrency()
      (experimental) The number of concurrent invocations that are provisioned and ready to respond to your endpoint.

      Valid range: 1-200. Must be less than or equal to maxConcurrency.

      Default: - none
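      The documented constraints on maxConcurrency and provisionedConcurrency can be expressed as a pre-check. This is a minimal sketch in plain Java: the class and method names are hypothetical, and the source does not say whether or when the construct enforces these rules itself.

```java
// Hypothetical helper, not part of software.amazon.awscdk.
final class ServerlessConcurrencySketch {
    private static final int MIN = 1;
    private static final int MAX = 200;

    private ServerlessConcurrencySketch() {
    }

    /**
     * Throws IllegalArgumentException if the values violate the documented ranges.
     * A null provisionedConcurrency means the optional property is not set.
     */
    static void validate(int maxConcurrency, Integer provisionedConcurrency) {
        if (maxConcurrency < MIN || maxConcurrency > MAX) {
            throw new IllegalArgumentException("maxConcurrency must be in 1-200");
        }
        if (provisionedConcurrency == null) {
            return; // optional property left unset
        }
        if (provisionedConcurrency < MIN || provisionedConcurrency > MAX) {
            throw new IllegalArgumentException("provisionedConcurrency must be in 1-200");
        }
        if (provisionedConcurrency > maxConcurrency) {
            throw new IllegalArgumentException(
                    "provisionedConcurrency must be less than or equal to maxConcurrency");
        }
    }
}
```

      With the values from the example at the top of this page (maxConcurrency 10, provisionedConcurrency 5), validate returns normally; a provisionedConcurrency of 11 with the same maxConcurrency would be rejected.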

    • builder

      @Stability(Experimental) static ServerlessProductionVariantProps.Builder builder()
      Returns:
      a ServerlessProductionVariantProps.Builder of ServerlessProductionVariantProps