
Nova Customization SDK

The Nova Customization SDK is a Python SDK for customizing Amazon Nova models. It provides a unified interface for training, evaluation, monitoring, deployment, and inference across platforms including SageMaker AI and Amazon Bedrock, whether you're adapting models to domain-specific tasks or optimizing performance for your use case.

Benefits

One SDK for the entire model customization lifecycle, from data preparation to deployment and monitoring.
Support for multiple training methods including continued pre-training (CPT), supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement fine-tuning (RFT), both single-turn and multi-turn, with both LoRA and full-rank approaches.
Built-in support for SageMaker Training Jobs and SageMaker HyperPod, with automatic resource management.
No need to look up the right recipe or container URI for your training technique.
Bring your own training recipes or use the SDK's intelligent defaults with parameter overrides.
The SDK validates your configuration against supported model and instance combinations, catching errors before training starts.
Integrated Amazon CloudWatch monitoring lets you track training progress in real time.
Integrated MLflow support lets you track training experiments with SageMaker AI MLflow tracking servers.

Requirements

The SDK requires at least Python 3.12.
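
Before installing, it can help to confirm that the interpreter meets this requirement. The check below is a minimal sketch using only the standard library; the helper name `meets_minimum` is illustrative and is not part of the SDK.

```python
import sys

# Minimum version taken from the requirement stated above.
MIN_PYTHON = (3, 12)


def meets_minimum(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


# Report whether the running interpreter is new enough.
current = ".".join(str(part) for part in sys.version_info[:3])
if meets_minimum():
    print(f"Python {current} meets the 3.12+ requirement")
else:
    print(f"Python {current} is too old; the SDK requires 3.12 or later")
```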

Installation

To install the SDK, run the following command.


pip install amzn-nova-customization-sdk

Supported Models and Techniques

The SDK supports the following models and techniques within the Amazon Nova family (SMHP stands for SageMaker HyperPod):

Method	Supported Models
Continued Pre-training	All Nova Models (SMHP only)
Supervised Fine-tuning LoRA	All Nova Models
Supervised Fine-tuning Full-Rank	All Nova Models
Direct Preference Optimization LoRA	Nova 1.0 models
Direct Preference Optimization Full-Rank	Nova 1.0 models
Reinforcement Fine-tuning LoRA	Nova Lite 2.0
Reinforcement Fine-tuning Full-Rank	Nova Lite 2.0
Multi-turn Reinforcement Fine-tuning LoRA	Nova Lite 2.0 (SMHP only)
Multi-turn Reinforcement Fine-tuning Full-Rank	Nova Lite 2.0 (SMHP only)
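
The table can be read as a set of rules keyed by method. The sketch below encodes it in plain Python to show how the combinations fit together. The keys, the `Rule` class, and `is_supported` are illustrative names invented for this example; they are not SDK identifiers, and the SDK performs its own validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    models: str              # "all", "nova-1.0", or "nova-lite-2.0"
    smhp_only: bool = False  # True when the table says "SMHP only"


# One entry per table row (hypothetical keys).
RULES = {
    "cpt": Rule("all", smhp_only=True),
    "sft_lora": Rule("all"),
    "sft_full_rank": Rule("all"),
    "dpo_lora": Rule("nova-1.0"),
    "dpo_full_rank": Rule("nova-1.0"),
    "rft_lora": Rule("nova-lite-2.0"),
    "rft_full_rank": Rule("nova-lite-2.0"),
    "rft_multiturn_lora": Rule("nova-lite-2.0", smhp_only=True),
    "rft_multiturn_full_rank": Rule("nova-lite-2.0", smhp_only=True),
}


def is_supported(method, size, version, platform):
    """Check a (method, model, platform) combination against the table.

    `size` is e.g. "lite", `version` is "1.0" or "2.0", and `platform`
    is "smtj" (Training Jobs) or "smhp" (HyperPod).
    """
    rule = RULES[method]
    if rule.smhp_only and platform != "smhp":
        return False
    if rule.models == "all":
        return True
    if rule.models == "nova-1.0":
        return version == "1.0"
    return size == "lite" and version == "2.0"


print(is_supported("dpo_lora", "lite", "1.0", "smtj"))
print(is_supported("rft_multiturn_lora", "lite", "2.0", "smtj"))
```

The second call returns False because multi-turn RFT is listed as SMHP only.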

Getting Started

1. Prepare Your Data

Load your dataset from local files or S3, and let the SDK handle the transformation to the correct format for your chosen training method. Or, provide formatted data and get started immediately.


from amzn_nova_customization_sdk.dataset.dataset_loader import JSONLDatasetLoader
from amzn_nova_customization_sdk.model.model_enums import Model, TrainingMethod

loader = JSONLDatasetLoader(question="input", answer="output")
loader.load("s3://your-bucket/training-data.jsonl")
loader.transform(method=TrainingMethod.SFT_LORA, model=Model.NOVA_LITE)
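
For reference, the loader call above maps `question` to an `input` field and `answer` to an `output` field. A raw JSONL file with that shape could be produced as sketched below, using only the standard library. The record contents and the helper names `write_jsonl` and `read_jsonl` are made up for illustration, and the assumption that each line is a flat object with exactly these two keys comes from the example rather than from a documented schema.

```python
import json
import tempfile
from pathlib import Path

# Illustrative records; the "input"/"output" keys match the field names
# passed to JSONLDatasetLoader in the example above.
records = [
    {"input": "What is the capital of France?", "output": "Paris"},
    {"input": "Translate 'hello' to Spanish.", "output": "hola"},
]


def write_jsonl(rows, path):
    """Write one JSON object per line and return the path."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")
    return path


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open("r", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Write to a temporary directory so the sketch has no side effects elsewhere.
out_dir = Path(tempfile.mkdtemp())
path = write_jsonl(records, out_dir / "training-data.jsonl")
print(path, len(read_jsonl(path)))
```

The resulting file can be uploaded to S3 or, since the loader accepts local files, loaded from disk.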

2. Configure Your Infrastructure

Choose your compute resources. The SDK validates the configuration before a job starts.


from amzn_nova_customization_sdk.manager.runtime_manager import SMTJRuntimeManager, SMHPRuntimeManager

# Option 1: SageMaker Training Jobs
runtime = SMTJRuntimeManager(
    instance_type="ml.p5.48xlarge",
    instance_count=4
)

# Option 2: SageMaker HyperPod
# Create only the runtime you plan to use; running both assignments
# leaves `runtime` pointing at the HyperPod manager.
runtime = SMHPRuntimeManager(
    instance_type="ml.p5.48xlarge",
    instance_count=4,
    cluster_name="my-hyperpod-cluster",
    namespace="kubeflow"
)

3. Train

Start training with just a few lines of code.


from amzn_nova_customization_sdk.model import NovaModelCustomizer
from amzn_nova_customization_sdk.model.model_enums import Model, TrainingMethod

customizer = NovaModelCustomizer(
    model=Model.NOVA_LITE_2,
    method=TrainingMethod.SFT_LORA,
    infra=runtime,
    data_s3_path="s3://your-bucket/prepared-data.jsonl"
)

result = customizer.train(job_name="my-training-job")

4. Monitor

Track your training progress directly from the SDK.


from amzn_nova_customization_sdk.monitor.log_monitor import CloudWatchLogMonitor

# Monitor training logs
customizer.get_logs()

# Or monitor directly via CloudWatchLogMonitor
monitor = CloudWatchLogMonitor.from_job_result(result)
monitor.show_logs(limit=10)

# Check job status (for example InProgress, Completed, or Failed)
result.get_job_status()
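
Training jobs can run for a long time, so a script often needs to wait until the status reaches a terminal value. The helper below is a minimal polling sketch, not an SDK feature. `wait_for_job` is a hypothetical name, and the sketch assumes the status getter returns one of the plain strings InProgress, Completed, or Failed; if `get_job_status()` returns an enum or another type, the comparison would need adjusting.

```python
import time

# Assumed terminal status strings, taken from the comment in the example above.
TERMINAL_STATES = {"Completed", "Failed"}


def wait_for_job(get_status, poll_seconds=30.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll `get_status()` until it returns a terminal state or the timeout elapses."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still {status!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Demonstration with a fake status source instead of a real job.
fake_statuses = iter(["InProgress", "InProgress", "Completed"])
final = wait_for_job(lambda: next(fake_statuses), poll_seconds=0, sleep=lambda _: None)
print(final)
```

With a real job, the call would look like `wait_for_job(result.get_job_status)`, under the same assumption about the return value.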

5. Evaluate

Evaluate model performance with a variety of built-in benchmarks, or design your own evaluations.


from amzn_nova_customization_sdk.recipe_config.eval_config import EvaluationTask

# Evaluate on benchmark tasks
eval_result = customizer.evaluate(
    job_name="model-eval",
    eval_task=EvaluationTask.MMLU,
    model_path=result.model_artifacts.checkpoint_s3_path
)

6. Deploy

Deploy your customized model to production with built-in support for Amazon Bedrock or SageMaker.


from amzn_nova_customization_sdk.model.model_enums import DeployPlatform

# Bedrock provisioned throughput
deployment = customizer.deploy(
    model_artifact_path=result.model_artifacts.checkpoint_s3_path,
    deploy_platform=DeployPlatform.BEDROCK_PT,
    unit_count=10
)

# Bedrock On-Demand
deployment = customizer.deploy(
    model_artifact_path=result.model_artifacts.checkpoint_s3_path,
    deploy_platform=DeployPlatform.BEDROCK_OD
)

# SageMaker Real-time Inference
deployment = customizer.deploy(
    model_artifact_path=result.model_artifacts.checkpoint_s3_path,
    deploy_platform=DeployPlatform.SAGEMAKER,
    unit_count=10,
    sagemaker_instance_type="ml.p5.48xlarge",
    sagemaker_environment_variables={
        "CONTEXT_LENGTH": "12000",
        "MAX_CONCURRENCY": "16",
    }
)

Key Capabilities

On-the-Fly Recipe Creation

The SDK eliminates the need to search for the appropriate recipes or container URI for specific techniques.

Intelligent Data Processing

The SDK automatically transforms your data into the correct format for training. Whether you're working with JSON, JSONL, or CSV files, the data loader handles the conversion. It supports text as well as multimodal data (images and videos).

Enterprise Infrastructure Support

The SDK works with both SageMaker Training Jobs and SageMaker HyperPod, automatically managing:

Instance type validation
Recipe validation
Dataset validation
Job orchestration and monitoring

Comprehensive Evaluation

Evaluate your customized models against standard benchmarks including:

MMLU (Massive Multitask Language Understanding)
BBH (BIG-Bench Hard, advanced reasoning tasks)
GPQA (Graduate-Level Google-Proof Q&A)

Either use the benchmark defaults, or modify them to fit your needs:

BYOM (Bring Your Own Metric)
BYOD (Bring Your Own Dataset)
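
To make the idea of a custom metric concrete, the function below computes exact-match accuracy over paired predictions and references. It is a plain Python sketch for illustration only: the name `exact_match_accuracy` is hypothetical, and the interface the SDK expects for a bring-your-own metric is not described here, so treat this as the scoring logic rather than a drop-in plugin.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference, ignoring case and outer whitespace."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        pred.strip().lower() == ref.strip().lower()
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


# One of the two predictions matches its reference after normalization.
score = exact_match_accuracy(["Paris", "4 "], ["paris", "5"])
print(score)
```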

Production Deployment

Deploy your models to Amazon Bedrock or SageMaker with options for:

Bedrock Provisioned Throughput - Dedicated capacity for consistent performance
Bedrock On-Demand (only applicable to LoRA based customization) - Pay-per-use pricing
SageMaker Real-time Inference - Dedicated capacity for consistent performance
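
Because Bedrock On-Demand applies only to LoRA-based customization, the available targets depend on how the model was trained. The sketch below captures that rule. The strings mirror the `DeployPlatform` member names used in the Deploy step, while the function `deployment_options` is a hypothetical helper and not part of the SDK.

```python
def deployment_options(uses_lora: bool) -> list[str]:
    """List deployment targets for a customized model.

    Bedrock On-Demand is included only for LoRA-based customization,
    following the list above.
    """
    options = ["BEDROCK_PT", "SAGEMAKER"]
    if uses_lora:
        options.insert(1, "BEDROCK_OD")
    return options


print(deployment_options(uses_lora=True))
print(deployment_options(uses_lora=False))
```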

Batch Inference

Run large-scale inference jobs efficiently:

Process thousands of requests in parallel
Automatic result aggregation
Cost-effective batch processing
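
The general pattern of fanning requests out in parallel and collecting the results in order can be sketched with the standard library. This is not how the SDK implements batch inference, and it does not use the SDK's batch API. `run_batch` and the `invoke` callable are hypothetical stand-ins for whatever function sends a single request to a deployed model.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(prompts, invoke, max_workers=8):
    """Call `invoke` on each prompt in parallel and aggregate results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        outputs = list(pool.map(invoke, prompts))
    return [{"prompt": p, "output": o} for p, o in zip(prompts, outputs)]


# Demonstration with a local stand-in for a model call.
results = run_batch(["alpha", "beta", "gamma"], invoke=str.upper, max_workers=2)
print(results)
```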

Nova Forge

For Nova Forge subscribers, the SDK supports data mixing recipes.

Learn More

Ready to start customizing Nova models with the Nova Customization SDK? Check out our GitHub repository for detailed guides, API references, and additional examples: https://github.com/aws/nova-customization-sdk
