InferenceComponentAvailabilityZoneBalance - Amazon SageMaker

InferenceComponentAvailabilityZoneBalance

Configuration for balancing inference component copies across Availability Zones.

Contents

EnforcementMode

Determines how strictly the Availability Zone balance constraint is enforced.

PERMISSIVE

The endpoint attempts to balance copies across Availability Zones, but proceeds with scheduling even if balance can't be achieved because of limited capacity or uneven instance distribution across Availability Zones.

Type: String

Valid Values: PERMISSIVE

Required: Yes

MaxImbalance

The maximum allowed difference in the number of inference component copies between any two Availability Zones. This parameter applies only when the endpoint has instances in two or more Availability Zones. A copy placement is allowed if it reduces the imbalance or if the resulting imbalance does not exceed this value.

Default value: 0.

Type: Integer

Valid Range: Minimum value of 0. Maximum value of 100.

Required: No
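
The following Python sketch illustrates one reading of the fields described above. It is not the SageMaker scheduler and makes no service calls. The helper names (build_az_balance, imbalance, placement_allowed) and the Availability Zone labels are hypothetical, and "imbalance" is assumed to mean the largest copy count minus the smallest copy count across the Availability Zones that have instances.

```python
from typing import Dict


def build_az_balance(max_imbalance: int = 0,
                     enforcement_mode: str = "PERMISSIVE") -> dict:
    """Return a dict shaped like InferenceComponentAvailabilityZoneBalance.

    Hypothetical helper: it only checks the documented constraints
    (EnforcementMode must be PERMISSIVE, MaxImbalance must be 0..100).
    """
    if enforcement_mode != "PERMISSIVE":
        raise ValueError("EnforcementMode must be PERMISSIVE")
    if not isinstance(max_imbalance, int) or not 0 <= max_imbalance <= 100:
        raise ValueError("MaxImbalance must be an integer from 0 to 100")
    return {"EnforcementMode": enforcement_mode, "MaxImbalance": max_imbalance}


def imbalance(copies_per_az: Dict[str, int]) -> int:
    """Assumed definition: largest difference in copies between any two AZs."""
    counts = list(copies_per_az.values())
    return max(counts) - min(counts)


def placement_allowed(copies_per_az: Dict[str, int],
                      target_az: str,
                      max_imbalance: int) -> bool:
    """Apply the documented rule to a single candidate placement.

    copies_per_az must list every AZ in which the endpoint has instances,
    including target_az (a KeyError is raised otherwise).
    """
    # The parameter applies only with two or more Availability Zones.
    if len(copies_per_az) < 2:
        return True
    before = imbalance(copies_per_az)
    after_counts = dict(copies_per_az)
    after_counts[target_az] += 1
    after = imbalance(after_counts)
    # Allowed if the placement reduces imbalance or stays within the limit.
    return after < before or after <= max_imbalance
```

For example, with copies distributed as {"az-a": 2, "az-b": 0}, placing a copy in "az-b" is allowed for any MaxImbalance because it reduces the imbalance, while placing another copy in "az-a" with MaxImbalance set to 1 is not. Under PERMISSIVE enforcement, a placement that fails this check is not necessarily rejected: the endpoint still proceeds with scheduling when balance can't be achieved.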

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: