Model requirements for training and validation datasets
The following tables list the requirements for training and validation datasets for each model. For information about dataset constraints for Amazon Nova models, see Fine-tuning Amazon Nova models.
| Description | Maximum (Fine-tuning) |
|---|---|
| Sum of input and output tokens when batch size is 1 | 4,096 |
| Sum of input and output tokens when batch size is 2, 3, or 4 | N/A |
| Character quota per sample in dataset | Token quota x 6 (estimated) |
| Training dataset file size | 1 GB |
| Validation dataset file size | 100 MB |
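The character quota is an estimate derived from the token quota. As an illustrative calculation (not an official figure), a 4,096-token sample corresponds to roughly 4,096 x 6 = 24,576 characters.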
| Description | Maximum (Continued Pre-training) | Maximum (Fine-tuning) |
|---|---|---|
| Sum of input and output tokens when batch size is 1 | 4,096 | 4,096 |
| Sum of input and output tokens when batch size is 2, 3, or 4 | 2,048 | 2,048 |
| Character quota per sample in dataset | Token quota x 6 (estimated) | Token quota x 6 (estimated) |
| Training dataset file size | 10 GB | 1 GB |
| Validation dataset file size | 100 MB | 100 MB |
| Description | Maximum (Continued Pre-training) | Maximum (Fine-tuning) |
|---|---|---|
| Sum of input and output tokens when batch size is 1 or 2 | 4,096 | 4,096 |
| Sum of input and output tokens when batch size is 3, 4, 5, or 6 | 2,048 | 2,048 |
| Character quota per sample in dataset | Token quota x 6 (estimated) | Token quota x 6 (estimated) |
| Training dataset file size | 10 GB | 1 GB |
| Validation dataset file size | 100 MB | 100 MB |
| Description | Minimum (Fine-tuning) | Maximum (Fine-tuning) |
|---|---|---|
| Text prompt length in training sample, in characters | 3 | 1,024 |
| Records in a training dataset | 5 | 10,000 |
| Input image size | 0 | 50 MB |
| Input image height in pixels | 512 | 4,096 |
| Input image width in pixels | 512 | 4,096 |
| Input image total pixels | 0 | 12,582,912 |
| Input image aspect ratio | 1:4 | 4:1 |
| Description | Minimum (Fine-tuning) | Maximum (Fine-tuning) |
|---|---|---|
| Text prompt length in training sample, in characters | 0 | 2,560 |
| Records in a training dataset | 1,000 | 500,000 |
| Input image size | 0 | 5 MB |
| Input image height in pixels | 128 | 4,096 |
| Input image width in pixels | 128 | 4,096 |
| Input image total pixels | 0 | 12,582,912 |
| Input image aspect ratio | 1:4 | 4:1 |
| Description | Minimum (Fine-tuning) | Maximum (Fine-tuning) |
|---|---|---|
| Input tokens | 0 | 16,000 |
| Output tokens | 0 | 16,000 |
| Character quota per sample in dataset | 0 | Token quota x 6 (estimated) |
| Sum of input and output tokens | 0 | 16,000 |
| Sum of training and validation records | 100 | 10,000 (adjustable using service quotas) |
Supported image formats for Meta Llama 3.2 11B Vision Instruct and Meta Llama 3.2 90B Vision Instruct are gif, jpeg, png, and webp. To estimate the image-to-token conversion during fine-tuning of these models, you can use the following formula as an approximation: Tokens = min(2, max(Height // 560, 1)) * min(2, max(Width // 560, 1)) * 1601. Depending on its size, an image converts to approximately 1,601 to 6,404 tokens.
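The following Python sketch applies that approximation; the function name and example dimensions are illustrative and are not part of the quota tables.

```python
def estimate_image_tokens(height_px: int, width_px: int) -> int:
    """Approximate tokens for one image using the formula above:
    Tokens = min(2, max(H // 560, 1)) * min(2, max(W // 560, 1)) * 1601."""
    h_tiles = min(2, max(height_px // 560, 1))
    w_tiles = min(2, max(width_px // 560, 1))
    return h_tiles * w_tiles * 1601

print(estimate_image_tokens(512, 512))    # 1601 (small image)
print(estimate_image_tokens(1120, 1400))  # 6404 (large image)
```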
| Description | Minimum (Fine-tuning) | Maximum (Fine-tuning) |
|---|---|---|
| Sum of input and output tokens | 0 | 16,000 (10,000 for Meta Llama 3.2 90B) |
| Sum of training and validation records | 100 | 10,000 (adjustable using service quotas) |
| Input image size for Meta Llama 3.2 11B and 90B Vision Instruct models | 0 | 10 MB |
| Input image height in pixels for Meta Llama 3.2 11B and 90B Vision Instruct models | 10 | 8,192 |
| Input image width in pixels for Meta Llama 3.2 11B and 90B Vision Instruct models | 10 | 8,192 |
| Description | Minimum (Fine-tuning) | Maximum (Fine-tuning) |
|---|---|---|
| Sum of input and output tokens | 0 | 16,000 |
| Sum of training and validation records | 100 | 10,000 (adjustable using service quotas) |
| Description | Maximum (Fine-tuning) |
|---|---|
| Input tokens | 4,096 |
| Output tokens | 2,048 |
| Character quota per sample in dataset | Token quota x 6 (estimated) |
| Records in a training dataset | 10,000 |
| Records in a validation dataset | 1,000 |
| Description | Quota (Fine-tuning) |
|---|---|
| Minimum number of records | 32 |
| Maximum training records | 10,000 |
| Maximum validation records | 1,000 |
| Maximum total records | 10,000 (adjustable using service quotas) |
| Maximum tokens | 32,000 |
| Maximum training dataset size | 10 GB |
| Maximum validation dataset size | 1 GB |
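As a rough pre-flight check against the record-count and file-size limits in the table above, the following is a minimal sketch. It assumes a JSON Lines dataset where each non-empty line is one record; the file name and hard-coded limits are placeholders to adjust for the model you are tuning.

```python
import os

MAX_TRAINING_RECORDS = 10_000            # from the table above
MAX_TRAINING_FILE_BYTES = 10 * 1024**3   # 10 GB

def check_training_dataset(path: str) -> None:
    """Report record count and file size against the quota table above."""
    size = os.path.getsize(path)
    with open(path, "r", encoding="utf-8") as f:
        records = sum(1 for line in f if line.strip())
    if size > MAX_TRAINING_FILE_BYTES:
        print(f"File is {size} bytes, above the 10 GB limit")
    if records > MAX_TRAINING_RECORDS:
        print(f"{records} records, above the 10,000-record limit")
    print(f"{records} records, {size} bytes")

check_training_dataset("train.jsonl")  # hypothetical file name
```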