InferenceComponentAvailabilityZoneBalance
Configuration for balancing inference component copies across Availability Zones.
Contents
- EnforcementMode
Determines how strictly the Availability Zone balance constraint is enforced.
- PERMISSIVE
The endpoint attempts to balance copies across Availability Zones but proceeds with scheduling even if balance can't be achieved due to available capacity or instance distribution across Availability Zones.
Type: String
Valid Values: PERMISSIVE
Required: Yes
- MaxImbalance
The maximum allowed difference in the number of inference component copies between any two Availability Zones. This parameter applies only when the endpoint has instances across two or more Availability Zones. A copy placement is allowed if it reduces imbalance or the resulting imbalance is within this value.
Default value: 0
Type: Integer
Valid Range: Minimum value of 0. Maximum value of 100.
Required: No
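The MaxImbalance rule described above can be illustrated with a minimal Python sketch. It is not the service's scheduler: the function names imbalance and placement_allowed, the Availability Zone labels, and the copies_per_az mapping are hypothetical, and the sketch assumes the mapping lists every Availability Zone where the endpoint has instances, including zones that currently hold zero copies.

```python
from typing import Dict


def imbalance(copies_per_az: Dict[str, int]) -> int:
    """Largest difference in copy count between any two Availability Zones."""
    if len(copies_per_az) < 2:
        return 0
    counts = list(copies_per_az.values())
    return max(counts) - min(counts)


def placement_allowed(
    copies_per_az: Dict[str, int], target_az: str, max_imbalance: int = 0
) -> bool:
    """Illustrative check of the MaxImbalance rule (hypothetical helper).

    A placement is allowed if it reduces the imbalance or if the
    resulting imbalance is within max_imbalance.
    """
    if not 0 <= max_imbalance <= 100:
        raise ValueError("MaxImbalance must be between 0 and 100")

    # The parameter applies only with instances in two or more zones.
    if len(copies_per_az) < 2:
        return True

    before = imbalance(copies_per_az)
    after_counts = dict(copies_per_az)
    after_counts[target_az] += 1  # KeyError if target_az is not a known zone
    after = imbalance(after_counts)

    return after < before or after <= max_imbalance


# Shape of the structure documented above, with example values.
config = {"EnforcementMode": "PERMISSIVE", "MaxImbalance": 1}

current = {"az-a": 2, "az-b": 2}  # hypothetical zone labels
print(placement_allowed(current, "az-a", config["MaxImbalance"]))
```

Because the only valid EnforcementMode is PERMISSIVE, a False result here corresponds to a placement the endpoint tries to avoid, not one it refuses: scheduling still proceeds if balance can't be achieved.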
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: