Custom model hyperparameters
The following reference content covers the hyperparameters that are available for training each Amazon Bedrock custom model.
A hyperparameter is a parameter that controls the training process, such as the learning rate or epoch count. You set hyperparameters for custom model training when you submit the fine-tuning job in the Amazon Bedrock console or by calling the CreateModelCustomizationJob API operation. For guidelines on hyperparameter settings, see Guidelines for model customization.
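For example, the following Python (boto3) sketch shows how the hyperparameters documented below might be passed in a CreateModelCustomizationJob request. The job name, model name, role ARN, S3 URIs, and base model identifier are placeholders; replace them with your own values.

```python
import boto3

bedrock = boto3.client("bedrock")

# Hyperparameter values are passed as strings. The keys must match the
# "Hyperparameter (API)" names listed in the tables that follow.
response = bedrock.create_model_customization_job(
    jobName="my-fine-tuning-job",                  # placeholder job name
    customModelName="my-custom-model",             # placeholder model name
    roleArn="arn:aws:iam::111122223333:role/MyBedrockCustomizationRole",  # placeholder IAM role
    baseModelIdentifier="amazon.nova-micro-v1:0",  # placeholder base model identifier
    customizationType="FINE_TUNING",
    trainingDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/train.jsonl"},  # placeholder S3 URI
    outputDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/output/"},        # placeholder S3 URI
    hyperParameters={
        "epochCount": "2",
        "learningRate": "0.00001",
        "learningRateWarmupSteps": "10",
    },
)
print(response["jobArn"])
```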
The Amazon Nova Lite, Amazon Nova Micro, and Amazon Nova Pro models support the following three hyperparameters for model customization. For more information, see Customize your model to improve its performance for your use case.
For information about fine-tuning Amazon Nova models, see Fine-tuning Amazon Nova models.
The number of epochs you specify increases your model customization cost by processing more tokens. Each epoch processes the entire training dataset once. For information about pricing, see Amazon Bedrock pricing.
| Hyperparameter (console) | Hyperparameter (API) | Definition | Type | Minimum | Maximum | Default |
|---|---|---|---|---|---|---|
| Epochs | epochCount | The number of iterations through the entire training dataset | integer | 1 | 5 | 2 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | float | 1.00E-6 | 1.00E-4 | 1.00E-5 |
| Learning rate warmup steps | learningRateWarmupSteps | The number of iterations over which the learning rate is gradually increased to the specified rate | integer | 0 | 100 | 10 |
The default epoch number is 2, which works for most cases. In general, larger datasets require fewer epochs to converge, while smaller datasets require more epochs to converge. Faster convergence can also be achieved by increasing the learning rate, but this is less desirable because it might lead to training instability at convergence. We recommend starting with the default hyperparameters, which are based on our assessment across tasks of different complexity and data sizes.
The learning rate gradually increases to the value you set during warmup. Therefore, we recommend that you avoid a large warmup value when the training sample is small, because the learning rate might never reach the set value during training. We recommend setting the number of warmup steps to the dataset size divided by 640 for Amazon Nova Micro, 160 for Amazon Nova Lite, and 320 for Amazon Nova Pro.
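As an illustration of that recommendation, the following sketch computes a warmup-step estimate from a dataset size. The divisors come from the guidance above; the 16,000-sample dataset is a made-up example.

```python
# Recommended divisors from the guidance above, one per Amazon Nova model.
WARMUP_DIVISORS = {"nova-micro": 640, "nova-lite": 160, "nova-pro": 320}

def recommended_warmup_steps(dataset_size: int, model: str) -> int:
    """Estimate learningRateWarmupSteps as dataset size / divisor,
    clamped to the documented 0-100 range."""
    return min(100, max(0, dataset_size // WARMUP_DIVISORS[model]))

# Hypothetical training dataset of 16,000 samples.
print(recommended_warmup_steps(16_000, "nova-micro"))  # 25
print(recommended_warmup_steps(16_000, "nova-lite"))   # 100
print(recommended_warmup_steps(16_000, "nova-pro"))    # 50
```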
The Amazon Nova Canvas model supports the following hyperparameters for model customization.
| Hyperparameter (console) | Hyperparameter (API) | Definition | Minimum | Maximum | Default |
|---|---|---|---|---|---|
| Batch size | batchSize | Number of samples processed before updating model parameters | 8 | 192 | 8 |
| Steps | stepCount | Number of times the model is exposed to each batch | 10 | 20,000 | 500 |
| Learning rate | learningRate | Rate at which model parameters are updated after each batch | 1.00E-7 | 1.00E-4 | 1.00E-5 |
The Amazon Titan Text Premier model supports the following hyperparameters for model customization. The number of epochs you specify increases your model customization cost by processing more tokens. Each epoch processes the entire training dataset once. For information about pricing, see Amazon Bedrock pricing.
| Hyperparameter (console) | Hyperparameter (API) | Definition | Type | Minimum | Maximum | Default |
|---|---|---|---|---|---|---|
| Epochs | epochCount | The number of iterations through the entire training dataset | integer | 1 | 5 | 2 |
| Batch size (micro) | batchSize | The number of samples processed before updating model parameters | integer | 1 | 1 | 1 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | float | 1.00E-07 | 1.00E-05 | 1.00E-06 |
| Learning rate warmup steps | learningRateWarmupSteps | The number of iterations over which the learning rate is gradually increased to the specified rate | integer | 0 | 20 | 5 |
Amazon Titan Text models, such as Lite and Express, support the following hyperparameters for model customization. The number of epochs you specify increases your model customization cost by processing more tokens. Each epoch processes the entire training dataset once. For information about pricing, see Amazon Bedrock pricing.
| Hyperparameter (console) | Hyperparameter (API) | Definition | Type | Minimum | Maximum | Default |
|---|---|---|---|---|---|---|
| Epochs | epochCount | The number of iterations through the entire training dataset | integer | 1 | 10 | 5 |
| Batch size (micro) | batchSize | The number of samples processed before updating model parameters | integer | 1 | 64 | 1 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | float | 0.0 | 1 | 1.00E-5 |
| Learning rate warmup steps | learningRateWarmupSteps | The number of iterations over which the learning rate is gradually increased to the specified rate | integer | 0 | 250 | 5 |
The Amazon Titan Image Generator G1 model supports the following hyperparameters for model customization.
Note
stepCount has no default value and must be specified. stepCount supports the value auto. auto prioritizes model performance over training cost by automatically determining a number of steps based on the size of your dataset. Training job costs depend on the number that auto determines. To understand how job cost is calculated and to see examples, see Amazon Bedrock pricing.
| Hyperparameter (console) | Hyperparameter (API) | Definition | Minimum | Maximum | Default |
|---|---|---|---|---|---|
| Batch size | batchSize | Number of samples processed before updating model parameters | 8 | 192 | 8 |
| Steps | stepCount | Number of times the model is exposed to each batch | 10 | 40,000 | N/A |
| Learning rate | learningRate | Rate at which model parameters are updated after each batch | 1.00E-7 | 1 | 1.00E-5 |
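For example, a hyperParameters map for Amazon Titan Image Generator G1 might look like the following sketch when you let the service determine the step count. The batch size and learning rate shown are illustrative values within the documented ranges, not recommendations.

```python
# stepCount has no default; pass an explicit number or the value "auto".
titan_image_hyperparameters = {
    "stepCount": "auto",        # let the service choose a step count based on dataset size
    "batchSize": "8",           # illustrative value within the 8-192 range
    "learningRate": "0.00001",  # illustrative value within the documented range
}
```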
The Amazon Titan Multimodal Embeddings G1 model supports the following hyperparameters for model customization. The number of epochs you specify increases your model customization cost by processing more tokens. Each epoch processes the entire training dataset once. For information about pricing, see Amazon Bedrock pricing.
Note
epochCount has no default value and must be specified. epochCount supports the value Auto. Auto prioritizes model performance over training cost by automatically determining a number of epochs based on the size of your dataset. Training job costs depend on the number that Auto determines. To understand how job cost is calculated and to see examples, see Amazon Bedrock pricing.
| Hyperparameter (console) | Hyperparameter (API) | Definition | Type | Minimum | Maximum | Default |
|---|---|---|---|---|---|---|
| Epochs | epochCount | The number of iterations through the entire training dataset | integer | 1 | 100 | N/A |
| Batch size | batchSize | The number of samples processed before updating model parameters | integer | 256 | 9,216 | 576 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | float | 5.00E-8 | 1 | 5.00E-5 |
Anthropic Claude 3 models support the following hyperparameters for model customization. The number of epochs you specify increases your model customization cost by processing more tokens. Each epoch processes the entire training dataset once. For information about pricing, see Amazon Bedrock pricing.
| Hyperparameter (console) | Hyperparameter (API) | Definition | Default | Minimum | Maximum |
|---|---|---|---|---|---|
| Epoch count | epochCount | The maximum number of iterations through the entire training dataset | 2 | 1 | 10 |
| Batch size | batchSize | Number of samples processed before updating model parameters | 32 | 4 | 256 |
| Learning rate multiplier | learningRateMultiplier | Multiplier that influences the learning rate at which model parameters are updated after each batch | 1 | 0.1 | 2 |
| Early stopping threshold | earlyStoppingThreshold | Minimum improvement in validation loss required to prevent premature termination of the training process | 0.001 | 0 | 0.1 |
| Early stopping patience | earlyStoppingPatience | Tolerance for stagnation in the validation loss metric before stopping the training process | 2 | 1 | 10 |
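The two early-stopping hyperparameters work together: training stops when the validation loss has failed to improve by at least earlyStoppingThreshold for earlyStoppingPatience consecutive evaluations. The following sketch is a conceptual illustration of that pattern, not the service's internal implementation.

```python
def should_stop(validation_losses, threshold=0.001, patience=2):
    """Conceptual early stopping: stop once the validation loss has not
    improved by at least `threshold` for `patience` consecutive evaluations."""
    best = float("inf")
    stagnant = 0
    for loss in validation_losses:
        if best - loss >= threshold:   # meaningful improvement resets the counter
            best = loss
            stagnant = 0
        else:
            stagnant += 1
            if stagnant >= patience:
                return True
    return False

# Example: the loss plateaus after the third evaluation, so training would stop.
print(should_stop([0.90, 0.50, 0.30, 0.2995, 0.2993]))  # True
```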
The Cohere Command and Cohere Command Light models support the following hyperparameters for model customization. The number of epochs you specify increases your model customization cost by processing more tokens. Each epoch processes the entire training dataset once. For information about pricing, see Amazon Bedrock pricing.
For information about fine-tuning Cohere models, see the Cohere documentation at https://docs.cohere.com/docs/fine-tuning.
Note
The epochCount quota is adjustable.
| Hyperparameter (console) | Hyperparameter (API) | Definition | Type | Minimum | Maximum | Default |
|---|---|---|---|---|---|---|
| Epochs | epochCount | The number of iterations through the entire training dataset | integer | 1 | 100 | 1 |
| Batch size | batchSize | The number of samples processed before updating model parameters | integer | 8 | 8 (Command), 32 (Light) | 8 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch. If you use a validation dataset, we recommend that you don't provide a value for learningRate. | float | 5.00E-6 | 0.1 | 1.00E-5 |
| Early stopping threshold | earlyStoppingThreshold | The minimum improvement in loss required to prevent premature termination of the training process | float | 0 | 0.1 | 0.01 |
| Early stopping patience | earlyStoppingPatience | The tolerance for stagnation in the loss metric before stopping the training process | integer | 1 | 10 | 6 |
| Evaluation percentage | evalPercentage | The percentage of the dataset allocated for model evaluation, if you don't provide a separate validation dataset | float | 5 | 50 | 20 |
The Meta Llama 3.1 8B and 70B models support the following hyperparameters for model customization. The number of epochs you specify increases your model customization cost by processing more tokens. Each epoch processes the entire training dataset once. For information about pricing, see Amazon Bedrock pricing.
For information about fine-tuning Meta Llama models, see the Meta documentation at https://ai.meta.com/llama/get-started/#fine-tuning.
Note
The epochCount quota is adjustable.
| Hyperparameter (console) | Hyperparameter (API) | Definition | Minimum | Maximum | Default |
|---|---|---|---|---|---|
| Epochs | epochCount | The number of iterations through the entire training dataset | 1 | 10 | 5 |
| Batch size | batchSize | The number of samples processed before updating model parameters | 1 | 1 | 1 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | 5.00E-6 | 0.1 | 1.00E-4 |
The Meta Llama 3.2 1B, 3B, 11B, and 90B models support the following hyperparameters for model customization. The number of epochs you specify increases your model customization cost by processing more tokens. Each epoch processes the entire training dataset once. For information about pricing, see Amazon Bedrock pricing.
For information about fine-tuning Meta Llama models, see the Meta documentation at https://ai.meta.com/llama/get-started/#fine-tuning.
| Hyperparameter (console) | Hyperparameter (API) | Definition | Minimum | Maximum | Default |
|---|---|---|---|---|---|
| Epochs | epochCount | The number of iterations through the entire training dataset | 1 | 10 | 5 |
| Batch size | batchSize | The number of samples processed before updating model parameters | 1 | 1 | 1 |
| Learning rate | learningRate | The rate at which model parameters are updated after each batch | 5.00E-6 | 0.1 | 1.00E-4 |