Nova Customization SDK

The Nova Customization SDK is a Python SDK for customizing Amazon Nova models. It provides a single interface for training, evaluation, monitoring, deployment, and inference across platforms, including SageMaker AI and Amazon Bedrock, whether you are adapting a model to a domain-specific task or optimizing performance for your use case.

Benefits

  • One SDK for the entire model customization lifecycle—from data preparation to deployment and monitoring.

  • Support for multiple training methods: continued pre-training (CPT), supervised fine-tuning (SFT), direct preference optimization (DPO), and single-turn and multi-turn reinforcement fine-tuning (RFT), with LoRA and full-rank options for SFT, DPO, and RFT.

  • Built-in support for SageMaker Training Jobs and SageMaker HyperPod, with automatic resource management.

  • No need to search for the right recipe or container URI for each training technique.

  • Bring your own training recipes or use the SDK's intelligent defaults with parameter overrides.

  • The SDK validates your configuration against supported model and instance combinations, catching errors before training starts.

  • Integrated Amazon CloudWatch monitoring lets you track training progress in real time.

  • Integrated MLflow support for tracking training experiments with SageMaker AI MLflow tracking servers.
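
The "defaults with parameter overrides" benefit above can be pictured as merging user-supplied values into a default recipe. The following is a minimal sketch in plain Python, not the SDK's implementation; the recipe keys (`training`, `learning_rate`, `max_epochs`, `lora`, `alpha`) and the `apply_overrides` helper are hypothetical names chosen for illustration.

```python
from copy import deepcopy

# Hypothetical default recipe; the real SDK recipes may use different keys.
DEFAULT_RECIPE = {
    "training": {"learning_rate": 1e-5, "max_epochs": 2},
    "lora": {"alpha": 32},
}


def apply_overrides(defaults: dict, overrides: dict) -> dict:
    """Return a copy of `defaults` with `overrides` merged in recursively."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Override one value; every other default is kept.
recipe = apply_overrides(DEFAULT_RECIPE, {"training": {"max_epochs": 5}})
```

Merging recursively means an override touches only the values it names, so the remaining defaults stay in effect.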

Requirements

The SDK requires Python 3.12 or later.

Installation

To install the SDK, run the following command:

pip install amzn-nova-customization-sdk

Supported Models and Techniques

The SDK supports the following models and techniques within the Amazon Nova family:

Method                                          Supported Models
Continued Pre-training                          All Nova Models (SageMaker HyperPod only)
Supervised Fine-tuning LoRA                     All Nova Models
Supervised Fine-tuning Full-Rank                All Nova Models
Direct Preference Optimization LoRA             Nova 1.0 models
Direct Preference Optimization Full-Rank        Nova 1.0 models
Reinforcement Fine-tuning LoRA                  Nova Lite 2.0
Reinforcement Fine-tuning Full-Rank             Nova Lite 2.0
Multi-turn Reinforcement Fine-tuning LoRA       Nova Lite 2.0 (SageMaker HyperPod only)
Multi-turn Reinforcement Fine-tuning Full-Rank  Nova Lite 2.0 (SageMaker HyperPod only)
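
To make the table easier to reason about, it can be transcribed into a small lookup. This is an illustrative sketch only: the method keys, the `Rule` type, and the `is_supported` helper are hypothetical and are not part of the SDK, which performs its own validation. The platform labels `SMTJ` and `SMHP` are borrowed from the runtime manager class names used later on this page.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    models: str      # "all", "nova-1.0", or "nova-lite-2.0"
    smhp_only: bool  # True when the table restricts the row to HyperPod


# One entry per table row (keys are hypothetical labels, not SDK enums).
SUPPORT = {
    "CPT": Rule("all", True),
    "SFT_LORA": Rule("all", False),
    "SFT_FULL_RANK": Rule("all", False),
    "DPO_LORA": Rule("nova-1.0", False),
    "DPO_FULL_RANK": Rule("nova-1.0", False),
    "RFT_LORA": Rule("nova-lite-2.0", False),
    "RFT_FULL_RANK": Rule("nova-lite-2.0", False),
    "RFT_MULTITURN_LORA": Rule("nova-lite-2.0", True),
    "RFT_MULTITURN_FULL_RANK": Rule("nova-lite-2.0", True),
}


def is_supported(method: str, model_name: str, model_version: str, platform: str) -> bool:
    """Check a method/model/platform combination against the table."""
    rule = SUPPORT.get(method)
    if rule is None:
        return False
    if rule.smhp_only and platform != "SMHP":
        return False
    if rule.models == "nova-1.0":
        return model_version == "1.0"
    if rule.models == "nova-lite-2.0":
        return model_name == "lite" and model_version == "2.0"
    return True  # "all": any Nova model


ok = is_supported("SFT_LORA", "lite", "2.0", "SMTJ")
```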

Getting Started

1. Prepare Your Data

Load your dataset from local files or Amazon S3 and let the SDK transform it into the correct format for your chosen training method. Alternatively, provide data that is already formatted and start immediately.

from amzn_nova_customization_sdk.dataset.dataset_loader import JSONLDatasetLoader
from amzn_nova_customization_sdk.model.model_enums import Model, TrainingMethod

loader = JSONLDatasetLoader(question="input", answer="output")
loader.load("s3://your-bucket/training-data.jsonl")
loader.transform(method=TrainingMethod.SFT_LORA, model=Model.NOVA_LITE)
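
For a quick local experiment, a small JSONL file with `input` and `output` fields, matching the `question="input", answer="output"` arguments above, could be written with the standard library. This sketch assumes the loader reads one JSON object per line and looks up those field names; the sample records, the `write_jsonl` helper, and the file location are made up for illustration.

```python
import json
import tempfile
from pathlib import Path

# Made-up sample records using the "input"/"output" field names from the step above.
records = [
    {"input": "What is the capital of France?", "output": "Paris"},
    {"input": "What is 2 + 2?", "output": "4"},
]


def write_jsonl(path: Path, rows: list[dict]) -> int:
    """Write one JSON object per line and return the number of rows written."""
    with path.open("w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(rows)


# Write to the system temp directory so the example leaves the project tree untouched.
output_path = Path(tempfile.gettempdir()) / "training-data.jsonl"
written = write_jsonl(output_path, records)
```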

2. Configure Your Infrastructure

Choose your compute resources. The SDK validates the configuration before training starts.

from amzn_nova_customization_sdk.manager.runtime_manager import SMTJRuntimeManager, SMHPRuntimeManager

# Option 1: SageMaker Training Jobs
runtime = SMTJRuntimeManager(
    instance_type="ml.p5.48xlarge",
    instance_count=4
)

# Option 2: SageMaker HyperPod (use instead of option 1)
runtime = SMHPRuntimeManager(
    instance_type="ml.p5.48xlarge",
    instance_count=4,
    cluster_name="my-hyperpod-cluster",
    namespace="kubeflow"
)

3. Train

Start training with just a few lines of code.

from amzn_nova_customization_sdk.model import NovaModelCustomizer
from amzn_nova_customization_sdk.model.model_enums import Model, TrainingMethod

customizer = NovaModelCustomizer(
    model=Model.NOVA_LITE_2,
    method=TrainingMethod.SFT_LORA,
    infra=runtime,
    data_s3_path="s3://your-bucket/prepared-data.jsonl"
)

result = customizer.train(job_name="my-training-job")

4. Monitor

Track your training progress directly from the SDK.

from amzn_nova_customization_sdk.monitor.log_monitor import CloudWatchLogMonitor

# Monitor training logs
customizer.get_logs()

# Or monitor directly via CloudWatchLogMonitor
monitor = CloudWatchLogMonitor.from_job_result(result)
monitor.show_logs(limit=10)

# Check job status
result.get_job_status()  # InProgress, Completed, Failed
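
When a script needs to block until training finishes, the status check can be wrapped in a polling loop. The helper below is a generic sketch, not an SDK feature: `wait_for_job` is a hypothetical name, and it assumes the status callable returns one of the strings listed in the comment above (`InProgress`, `Completed`, `Failed`).

```python
import time
from typing import Callable

# Terminal states taken from the status comment in the monitoring step.
TERMINAL_STATES = {"Completed", "Failed"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
) -> str:
    """Poll `get_status` until a terminal state is reported or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds} seconds")
        time.sleep(poll_seconds)


# Demonstration with a stand-in status source instead of a real training job.
_fake_statuses = iter(["InProgress", "InProgress", "Completed"])
final_status = wait_for_job(lambda: next(_fake_statuses), poll_seconds=0)
```

With the objects from the earlier steps, this might be called as `wait_for_job(result.get_job_status)`, provided that method returns plain status strings; if it returns another type, the comparison would need adjusting.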

5. Evaluate

Evaluate model performance with a variety of built-in benchmarks, or design your own evaluations.

from amzn_nova_customization_sdk.recipe_config.eval_config import EvaluationTask

# Evaluate on benchmark tasks
eval_result = customizer.evaluate(
    job_name="model-eval",
    eval_task=EvaluationTask.MMLU,
    model_path=result.model_artifacts.checkpoint_s3_path
)

6. Deploy

Deploy your customized model to production with built-in support for Amazon Bedrock or SageMaker.

from amzn_nova_customization_sdk.model.model_enums import DeployPlatform

# Bedrock Provisioned Throughput
deployment = customizer.deploy(
    model_artifact_path=result.model_artifacts.checkpoint_s3_path,
    deploy_platform=DeployPlatform.BEDROCK_PT,
    unit_count=10
)

# Bedrock On-Demand
deployment = customizer.deploy(
    model_artifact_path=result.model_artifacts.checkpoint_s3_path,
    deploy_platform=DeployPlatform.BEDROCK_OD
)

# SageMaker Real-time Inference
deployment = customizer.deploy(
    model_artifact_path=result.model_artifacts.checkpoint_s3_path,
    deploy_platform=DeployPlatform.SAGEMAKER,
    unit_count=10,
    sagemaker_instance_type="ml.p5.48xlarge",
    sagemaker_environment_variables={
        "CONTEXT_LENGTH": "12000",
        "MAX_CONCURRENCY": "16",
    }
)

Key Capabilities

On-the-Fly Recipe Creation

The SDK eliminates the need to search for the appropriate recipes or container URI for specific techniques.

Intelligent Data Processing

The SDK automatically transforms your data into the correct format for training. Whether you're working with JSON, JSONL, or CSV files, the data loader handles the conversion. It supports text as well as multimodal data (images and videos).
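
The kind of conversion described above can be illustrated with a CSV-to-JSONL transform written against the standard library. This is a conceptual sketch, not the SDK's data loader: the `csv_to_jsonl` helper and the `question`/`answer` output keys are hypothetical, and the actual target schema depends on the training method.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text: str, question: str, answer: str) -> str:
    """Convert CSV text to JSONL, keeping only the named question and answer columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Rename the source columns to a fixed pair of keys (hypothetical schema).
        lines.append(json.dumps({"question": row[question], "answer": row[answer]}))
    return "\n".join(lines)


sample = "input,output\nWhat is 2 + 2?,4\nCapital of France?,Paris\n"
jsonl = csv_to_jsonl(sample, question="input", answer="output")
```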

Enterprise Infrastructure Support

The SDK works with both SageMaker Training Jobs and SageMaker HyperPod, automatically managing:

  • Instance type validation

  • Recipe validation

  • Dataset validation

  • Job orchestration and monitoring
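
As an illustration of what a dataset validation step might check, the sketch below scans JSONL lines for parse errors and missing fields before any job is launched. It is not the SDK's validator; the `validate_jsonl` helper and the default field names (`input`, `output`, borrowed from the data preparation example) are assumptions.

```python
import json


def validate_jsonl(lines: list[str], required: tuple[str, ...] = ("input", "output")) -> list[str]:
    """Return a list of problems found; an empty list means every line passed."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {number}: not valid JSON")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        for field in required:
            # Treat absent and empty values the same way.
            if not record.get(field):
                problems.append(f"line {number}: missing or empty field {field!r}")
    return problems


issues = validate_jsonl(['{"input": "q", "output": "a"}', '{"input": "q"}', "oops"])
```

Collecting every problem in one pass, rather than stopping at the first, makes it cheaper to fix a dataset before paying for compute.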

Comprehensive Evaluation

Evaluate your customized models against standard benchmarks including:

  • MMLU (Massive Multitask Language Understanding)

  • BBH (BIG-Bench Hard, advanced reasoning tasks)

  • GPQA (Graduate-Level Google-Proof Q&A)

Either use the benchmark defaults, or modify them to fit your needs:

  • BYOM (Bring Your Own Metric)

  • BYOD (Bring Your Own Dataset)
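
To give a sense of what a custom metric might look like, here is a simple exact-match scorer. It is a hypothetical example: this page does not describe how the SDK registers or invokes a custom metric, so the function name and signature are assumptions rather than the SDK's interface.

```python
def exact_match(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions equal to their reference, ignoring case and outer whitespace."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        pred.strip().lower() == ref.strip().lower()
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


score = exact_match(["Paris", " 4 ", "blue"], ["paris", "4", "red"])
```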

Production Deployment

Deploy your models to Amazon Bedrock or SageMaker with options for:

  • Bedrock Provisioned Throughput - Dedicated capacity for consistent performance

  • Bedrock On-Demand (only applicable to LoRA-based customization) - Pay-per-use pricing

  • SageMaker Real-time Inference - Dedicated capacity for consistent performance
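
The LoRA restriction on Bedrock On-Demand can be captured in a small helper that lists the options for a given customization. This is a sketch with a hypothetical function name; it assumes the other two options are available for both LoRA and full-rank models, which the list above implies but does not state outright.

```python
def deployment_options(uses_lora: bool) -> list[str]:
    """Return the deployment options listed above for a LoRA or full-rank model."""
    options = ["Bedrock Provisioned Throughput", "SageMaker Real-time Inference"]
    if uses_lora:
        # On-Demand applies only to LoRA-based customization.
        options.insert(1, "Bedrock On-Demand")
    return options


lora_options = deployment_options(uses_lora=True)
full_rank_options = deployment_options(uses_lora=False)
```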

Batch Inference

Run large-scale inference jobs efficiently:

  • Process thousands of requests in parallel

  • Automatic result aggregation

  • Cost-effective batch processing
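
The pattern behind parallel processing with result aggregation can be sketched with the standard library. This example does not use the SDK's batch inference API, which is not shown on this page; `run_batch` and `fake_infer` are hypothetical names, and `fake_infer` stands in for a real model call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_batch(prompts: list[str], infer: Callable[[str], str], max_workers: int = 8) -> list[dict]:
    """Run `infer` over all prompts in parallel and aggregate results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input prompts.
        outputs = list(pool.map(infer, prompts))
    return [{"prompt": p, "output": o} for p, o in zip(prompts, outputs)]


def fake_infer(prompt: str) -> str:
    """Stand-in for a real model invocation."""
    return prompt.upper()


results = run_batch(["hello", "world"], fake_infer)
```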

Nova Forge

For Nova Forge subscribers, the SDK supports data mixing recipes.

Learn More

Ready to start customizing Nova models with the Nova Customization SDK? Check out our GitHub repository for detailed guides, API references, and additional examples: https://github.com/aws/nova-customization-sdk