
Custom model hyperparameters

The following reference content covers the hyperparameters that are available for training each Amazon Bedrock custom model.

A hyperparameter is a parameter that controls the training process, such as the learning rate or epoch count. You set hyperparameters for custom model training when you submit the fine-tuning job in the Amazon Bedrock console or by calling the CreateModelCustomizationJob API operation. For guidelines on hyperparameter settings, see Guidelines for model customization.
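For example, here is a minimal sketch of submitting a customization job with the AWS SDK for Python (Boto3). All hyperparameter values are passed as strings, even numeric ones. The role ARN, S3 URIs, and base model identifier are placeholders that you would replace with your own values; the hyperparameter names shown are the ones the Amazon Nova models accept (see the first table below).

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# All hyperparameter values are strings. The ARN, bucket names, and
# base model identifier below are placeholders, not working values.
response = bedrock.create_model_customization_job(
    jobName="my-finetuning-job",
    customModelName="my-custom-model",
    roleArn="arn:aws:iam::111122223333:role/MyBedrockCustomizationRole",
    baseModelIdentifier="amazon.nova-micro-v1:0",  # example base model
    customizationType="FINE_TUNING",
    trainingDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/output/"},
    hyperParameters={
        "epochCount": "2",
        "learningRate": "0.00001",
        "learningRateWarmupSteps": "10",
    },
)
print(response["jobArn"])
```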

The Amazon Nova Lite, Amazon Nova Micro, and Amazon Nova Pro models support the following three hyperparameters for model customization. For more information, see Customize your model to improve its performance for your use case.

For information about fine-tuning Amazon Nova models, see Fine-tuning Amazon Nova models.

Each epoch processes the entire training dataset once, so increasing the number of epochs increases your model customization cost because more tokens are processed. For information about pricing, see Amazon Bedrock pricing.
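A back-of-the-envelope sketch of that relationship, using a hypothetical dataset size:

```python
# Hypothetical illustration: tokens processed (and therefore training cost)
# scale linearly with epochCount. See Amazon Bedrock pricing for actual rates.
dataset_tokens = 1_000_000   # example: total tokens in the training dataset
epoch_count = 3              # epochCount hyperparameter
tokens_processed = dataset_tokens * epoch_count
print(tokens_processed)      # 3000000 tokens processed during training
```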

| Hyperparameter (console) | Hyperparameter (API) | Definition | Type | Minimum | Maximum | Default |
| --- | --- | --- | --- | --- | --- | --- |
| Epochs | epochCount | The number of iterations through the entire training dataset | integer | 1 | 5 | 2 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | float | 1.00E-6 | 1.00E-4 | 1.00E-5 |
| Learning rate warmup steps | learningRateWarmupSteps | The number of iterations over which the learning rate is gradually increased to the specified rate | integer | 0 | 100 | 10 |

The default epoch count is 2, which works for most cases. In general, larger datasets require fewer epochs to converge, while smaller datasets require more. You can also achieve faster convergence by increasing the learning rate, but this is less desirable because it can cause training instability near convergence. We recommend starting with the default hyperparameters, which are based on our assessment across tasks of different complexity and dataset sizes.

The learning rate gradually increases to the value you set over the course of the warmup steps. Because of this, we recommend that you avoid a large warmup value when the training dataset is small, because the learning rate might never reach the set value during training. We recommend setting the warmup steps by dividing the dataset size by 640 for Amazon Nova Micro, 160 for Amazon Nova Lite, and 320 for Amazon Nova Pro, as in the sketch below.
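A minimal sketch of that recommendation, using hypothetical dataset sizes. The divisors come from the guidance above, and the result is clamped to the 0–100 range that learningRateWarmupSteps accepts:

```python
# Recommended warmup steps: dataset size divided by a per-model divisor,
# clamped to the valid range (0-100) for learningRateWarmupSteps.
WARMUP_DIVISORS = {
    "nova-micro": 640,
    "nova-lite": 160,
    "nova-pro": 320,
}

def recommended_warmup_steps(model: str, dataset_size: int) -> int:
    steps = dataset_size // WARMUP_DIVISORS[model]
    return max(0, min(steps, 100))  # keep within the allowed range

# Example: a 6,400-sample dataset for Amazon Nova Micro -> 10 warmup steps
print(recommended_warmup_steps("nova-micro", 6400))
```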

The Amazon Nova Canvas model supports the following hyperparameters for model customization.

| Hyperparameter (console) | Hyperparameter (API) | Definition | Minimum | Maximum | Default |
| --- | --- | --- | --- | --- | --- |
| Batch size | batchSize | Number of samples processed before updating model parameters | 8 | 192 | 8 |
| Steps | stepCount | Number of times the model is exposed to each batch | 10 | 20,000 | 500 |
| Learning rate | learningRate | Rate at which model parameters are updated after each batch | 1.00E-7 | 1.00E-4 | 1.00E-5 |

The Amazon Titan Text Premier model supports the following hyperparameters for model customization. Each epoch processes the entire training dataset once, so increasing the number of epochs increases your model customization cost because more tokens are processed. For information about pricing, see Amazon Bedrock pricing.

| Hyperparameter (console) | Hyperparameter (API) | Definition | Type | Minimum | Maximum | Default |
| --- | --- | --- | --- | --- | --- | --- |
| Epochs | epochCount | The number of iterations through the entire training dataset | integer | 1 | 5 | 2 |
| Batch size (micro) | batchSize | The number of samples processed before updating model parameters | integer | 1 | 1 | 1 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | float | 1.00E-7 | 1.00E-5 | 1.00E-6 |
| Learning rate warmup steps | learningRateWarmupSteps | The number of iterations over which the learning rate is gradually increased to the specified rate | integer | 0 | 20 | 5 |

Amazon Titan Text models, such as Lite and Express, support the following hyperparameters for model customization. Each epoch processes the entire training dataset once, so increasing the number of epochs increases your model customization cost because more tokens are processed. For information about pricing, see Amazon Bedrock pricing.

| Hyperparameter (console) | Hyperparameter (API) | Definition | Type | Minimum | Maximum | Default |
| --- | --- | --- | --- | --- | --- | --- |
| Epochs | epochCount | The number of iterations through the entire training dataset | integer | 1 | 10 | 5 |
| Batch size (micro) | batchSize | The number of samples processed before updating model parameters | integer | 1 | 64 | 1 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | float | 0.0 | 1 | 1.00E-5 |
| Learning rate warmup steps | learningRateWarmupSteps | The number of iterations over which the learning rate is gradually increased to the specified rate | integer | 0 | 250 | 5 |

The Amazon Titan Image Generator G1 model supports the following hyperparameters for model customization.

Note

stepCount has no default value and must be specified. It also accepts the value auto, which prioritizes model performance over training cost by automatically determining the step count based on the size of your dataset. Training job costs depend on the number of steps that auto determines. To understand how job cost is calculated and to see examples, see Amazon Bedrock pricing.

| Hyperparameter (console) | Hyperparameter (API) | Definition | Minimum | Maximum | Default |
| --- | --- | --- | --- | --- | --- |
| Batch size | batchSize | Number of samples processed before updating model parameters | 8 | 192 | 8 |
| Steps | stepCount | Number of times the model is exposed to each batch | 10 | 40,000 | N/A |
| Learning rate | learningRate | Rate at which model parameters are updated after each batch | 1.00E-7 | 1 | 1.00E-5 |
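For example, a hyperParameters map for this model might set stepCount to auto, as the note above describes. This is an illustrative sketch; the other values shown are arbitrary examples within the allowed ranges:

```python
# Illustrative hyperParameters map for an Amazon Titan Image Generator G1
# customization job. stepCount accepts either a number or "auto".
hyper_parameters = {
    "stepCount": "auto",     # let the service choose based on dataset size
    "batchSize": "8",
    "learningRate": "0.00001",
}
```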

The Amazon Titan Multimodal Embeddings G1 model supports the following hyperparameters for model customization. Each epoch processes the entire training dataset once, so increasing the number of epochs increases your model customization cost because more tokens are processed. For information about pricing, see Amazon Bedrock pricing.

Note

epochCount has no default value and must be specified. It also accepts the value Auto, which prioritizes model performance over training cost by automatically determining the epoch count based on the size of your dataset. Training job costs depend on the number of epochs that Auto determines. To understand how job cost is calculated and to see examples, see Amazon Bedrock pricing.

| Hyperparameter (console) | Hyperparameter (API) | Definition | Type | Minimum | Maximum | Default |
| --- | --- | --- | --- | --- | --- | --- |
| Epochs | epochCount | The number of iterations through the entire training dataset | integer | 1 | 100 | N/A |
| Batch size | batchSize | The number of samples processed before updating model parameters | integer | 256 | 9,216 | 576 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | float | 5.00E-8 | 1 | 5.00E-5 |

Anthropic Claude 3 models support the following hyperparameters for model customization. Each epoch processes the entire training dataset once, so increasing the number of epochs increases your model customization cost because more tokens are processed. For information about pricing, see Amazon Bedrock pricing.

| Console Name | API Name | Definition | Default | Minimum | Maximum |
| --- | --- | --- | --- | --- | --- |
| Epoch count | epochCount | The maximum number of iterations through the entire training dataset | 2 | 1 | 10 |
| Batch size | batchSize | Number of samples processed before updating model parameters | 32 | 4 | 256 |
| Learning rate multiplier | learningRateMultiplier | Multiplier that influences the learning rate at which model parameters are updated after each batch | 1 | 0.1 | 2 |
| Early stopping threshold | earlyStoppingThreshold | Minimum improvement in validation loss required to prevent premature termination of the training process | 0.001 | 0 | 0.1 |
| Early stopping patience | earlyStoppingPatience | Tolerance for stagnation in the validation loss metric before stopping the training process | 2 | 1 | 10 |
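To make the two early-stopping hyperparameters concrete, here is a minimal sketch of the behavior they describe, not Amazon Bedrock's actual implementation: training stops once validation loss has failed to improve by at least earlyStoppingThreshold for earlyStoppingPatience consecutive evaluations.

```python
def should_stop_early(val_losses: list[float],
                      threshold: float = 0.001,   # earlyStoppingThreshold
                      patience: int = 2) -> bool: # earlyStoppingPatience
    """Illustrative early-stopping check (not Bedrock's implementation).

    Returns True when validation loss has not improved by at least
    `threshold` for `patience` consecutive evaluations."""
    best = float("inf")
    stalled = 0
    for loss in val_losses:
        if best - loss >= threshold:  # meaningful improvement resets patience
            best = loss
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:
                return True
    return False

# Example: loss plateaus after the second evaluation -> training stops
print(should_stop_early([0.90, 0.85, 0.8499, 0.8498]))  # True
```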

The Cohere Command and Cohere Command Light models support the following hyperparameters for model customization. Each epoch processes the entire training dataset once, so increasing the number of epochs increases your model customization cost because more tokens are processed. For information about pricing, see Amazon Bedrock pricing. For more information, see Customize your model to improve its performance for your use case.

For information about fine-tuning Cohere models, see the Cohere documentation at https://docs.cohere.com/docs/fine-tuning.

Note

The epochCount quota is adjustable.

| Hyperparameter (console) | Hyperparameter (API) | Definition | Type | Minimum | Maximum | Default |
| --- | --- | --- | --- | --- | --- | --- |
| Epochs | epochCount | The number of iterations through the entire training dataset | integer | 1 | 100 | 1 |
| Batch size | batchSize | The number of samples processed before updating model parameters | integer | 8 | 8 (Command), 32 (Light) | 8 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch. If you use a validation dataset, we recommend that you don't provide a value for learningRate. | float | 5.00E-6 | 0.1 | 1.00E-5 |
| Early stopping threshold | earlyStoppingThreshold | The minimum improvement in loss required to prevent premature termination of the training process | float | 0 | 0.1 | 0.01 |
| Early stopping patience | earlyStoppingPatience | The tolerance for stagnation in the loss metric before stopping the training process | integer | 1 | 10 | 6 |
| Evaluation percentage | evalPercentage | The percentage of the dataset allocated for model evaluation, if you don't provide a separate validation dataset | float | 5 | 50 | 20 |

The Meta Llama 3.1 8B and 70B models support the following hyperparameters for model customization. Each epoch processes the entire training dataset once, so increasing the number of epochs increases your model customization cost because more tokens are processed. For information about pricing, see Amazon Bedrock pricing. For more information, see Customize your model to improve its performance for your use case.

For information about fine-tuning Meta Llama models, see the Meta documentation at https://ai.meta.com/llama/get-started/#fine-tuning.

Note

The epochCount quota is adjustable.

| Hyperparameter (console) | Hyperparameter (API) | Definition | Minimum | Maximum | Default |
| --- | --- | --- | --- | --- | --- |
| Epochs | epochCount | The number of iterations through the entire training dataset | 1 | 10 | 5 |
| Batch size | batchSize | The number of samples processed before updating model parameters | 1 | 1 | 1 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | 5.00E-6 | 0.1 | 1.00E-4 |

The Meta Llama 3.2 1B, 3B, 11B, and 90B models support the following hyperparameters for model customization. Each epoch processes the entire training dataset once, so increasing the number of epochs increases your model customization cost because more tokens are processed. For information about pricing, see Amazon Bedrock pricing. For more information, see Customize your model to improve its performance for your use case.

For information about fine-tuning Meta Llama models, see the Meta documentation at https://ai.meta.com/llama/get-started/#fine-tuning.

| Hyperparameter (console) | Hyperparameter (API) | Definition | Minimum | Maximum | Default |
| --- | --- | --- | --- | --- | --- |
| Epochs | epochCount | The number of iterations through the entire training dataset | 1 | 10 | 5 |
| Batch size | batchSize | The number of samples processed before updating model parameters | 1 | 1 | 1 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | 5.00E-6 | 0.1 | 1.00E-4 |