Class: Aws::SageMaker::Types::ClusterTieredStorageConfig
- Inherits: Struct
  - Object
  - Struct
  - Aws::SageMaker::Types::ClusterTieredStorageConfig
- Defined in: gems/aws-sdk-sagemaker/lib/aws-sdk-sagemaker/types.rb
Overview
Defines the configuration for managed tier checkpointing in a HyperPod cluster. Managed tier checkpointing uses multiple storage tiers, including cluster CPU memory, to provide faster checkpoint operations and improved fault tolerance for large-scale model training. The system automatically saves checkpoints at high frequency to memory and periodically persists them to durable storage, like Amazon S3.
Constant Summary
- SENSITIVE = []
Instance Attribute Summary
- #instance_memory_allocation_percentage ⇒ Integer
  The percentage (int) of cluster memory to allocate for checkpointing.
- #mode ⇒ String
  Specifies whether managed tier checkpointing is enabled or disabled for the HyperPod cluster.
Instance Attribute Details
#instance_memory_allocation_percentage ⇒ Integer
The percentage (int) of cluster memory to allocate for checkpointing.
# File 'gems/aws-sdk-sagemaker/lib/aws-sdk-sagemaker/types.rb', line 5874
class ClusterTieredStorageConfig < Struct.new(
  :mode,
  :instance_memory_allocation_percentage)
  SENSITIVE = []
  include Aws::Structure
end
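The source listing above shows that the type is an ordinary Ruby Struct with two members. The following is a minimal sketch of how such a struct behaves, using a local stand-in so that it runs without the gem installed. The name TieredStorageConfigSketch and the value 25 are illustrative assumptions, not part of the SDK; the real class additionally includes Aws::Structure.

```ruby
# Stand-in for illustration only (hypothetical name). The real class is
# Aws::SageMaker::Types::ClusterTieredStorageConfig, which also mixes in
# Aws::Structure.
TieredStorageConfigSketch = Struct.new(
  :mode,
  :instance_memory_allocation_percentage
)

# Members are positional, in the order declared above.
# The value 25 is an arbitrary example, not a recommended setting.
config = TieredStorageConfigSketch.new("Enable", 25)

# Each member gets a reader named after the attribute.
puts config.mode
puts config.instance_memory_allocation_percentage

# A Struct converts to a Hash keyed by member name.
config_hash = config.to_h
```

Because the members are plain Struct fields, they can be read by name or converted to a Hash with to_h.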
#mode ⇒ String
Specifies whether managed tier checkpointing is enabled or disabled for the HyperPod cluster. When set to Enable, the system installs a memory management daemon that provides disaggregated memory as a service for checkpoint storage. When set to Disable, the feature is turned off and the memory management daemon is removed from the cluster.
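The two mode values described above, Enable and Disable, can be checked before a configuration is handed to the SDK. The following is a sketch only: tiered_storage_config and VALID_MODES are hypothetical names, not part of aws-sdk-sagemaker, and the 0..100 bound is a generic sanity check for a percentage. The service may enforce a narrower range.

```ruby
# Hypothetical helper, not part of the AWS SDK. It builds a plain Hash whose
# keys match the two attributes documented for ClusterTieredStorageConfig.
VALID_MODES = %w[Enable Disable].freeze

def tiered_storage_config(mode:, memory_percentage: nil)
  unless VALID_MODES.include?(mode)
    raise ArgumentError,
          "mode must be one of #{VALID_MODES.join(', ')}, got #{mode.inspect}"
  end

  config = { mode: mode }

  unless memory_percentage.nil?
    # Assumption: 0..100 is only a generic percentage check; the actual
    # limits accepted by the service are not stated in this page.
    unless memory_percentage.is_a?(Integer) && (0..100).cover?(memory_percentage)
      raise ArgumentError,
            "memory_percentage must be an Integer in 0..100, got #{memory_percentage.inspect}"
    end
    config[:instance_memory_allocation_percentage] = memory_percentage
  end

  config
end

# Example values are arbitrary.
enabled  = tiered_storage_config(mode: "Enable", memory_percentage: 25)
disabled = tiered_storage_config(mode: "Disable")
```

Rejecting unknown mode strings locally surfaces a typo such as "Enabled" as an ArgumentError instead of deferring the failure to the API call.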