GENOPS04-BP02 Implement GenAIOps to optimize the application lifecycle

Implement GenAIOps to automate the development, deployment, and management of foundation models. GenAIOps establishes CI/CD pipelines for training, tuning, and deploying foundation models, which gives you an automated framework that covers the generative AI lifecycle from development to production deployment. The result is better operational efficiency, shorter time-to-market, more consistent model quality, and faster adaptation to changing business requirements.

Desired outcome: After implementing GenAIOps, you have an automated framework for managing the entire lifecycle of your generative AI workloads.

Benefits of establishing this best practice: Safely automate where possible - automate the lifecycle of your foundation models.

Level of risk exposed if this best practice is not established: High

Implementation guidance

GenAIOps is a specialized subset of machine learning operations (MLOps) that focuses on the processes and techniques for managing and operationalizing foundation models in production environments. It helps organizations use foundation models while reducing risk and optimizing deployments. GenAIOps has two categories: operationalizing foundation model consumption, and operationalizing foundation model training and tuning. Concerns common to both categories include CI/CD, prompt management, versioning of artifacts, model upgrades, evaluation, and monitoring.

Applications that consume foundation models follow traditional DevOps processes. These applications are often built using orchestration patterns such as retrieval-augmented generation (RAG) and agents. Operationalizing a RAG application involves choosing a vector database, indexing pipelines, and retrieval strategies.

To operationalize foundation model training and tuning, use automation to train, tune, and deploy foundation models efficiently. Foundation model operations (FMOps), the operationalization of foundation models, and large language model operations (LLMOps), the operationalization of LLMs specifically, fall under this category. This involves model selection, continuous tuning and training of models, experiment tracking, a central model registry, prompt management and evaluation, and deployment of the models.
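Prompt management and artifact versioning apply to both categories. As a minimal sketch of the idea, and not a description of any AWS feature, the following standard-library example keeps prompt templates in an in-memory registry and identifies each version by a content hash. The class name `PromptRegistry` and its methods are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class PromptRegistry:
    """Hypothetical in-memory store: prompt name -> ordered (version_id, template) pairs."""

    _versions: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)

    def register(self, name: str, template: str) -> str:
        """Store a template and return a short content hash identifying this version."""
        version_id = hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]
        history = self._versions.setdefault(name, [])
        # Registering identical text again does not create a duplicate version.
        if all(existing_id != version_id for existing_id, _ in history):
            history.append((version_id, template))
        return version_id

    def get(self, name: str, version_id: Optional[str] = None) -> str:
        """Return a pinned version, or the most recently added distinct version."""
        history = self._versions[name]
        if version_id is None:
            return history[-1][1]
        for existing_id, template in history:
            if existing_id == version_id:
                return template
        raise KeyError(f"{name}@{version_id} is not registered")
```

An application can pin the version identifier it was evaluated with, so a prompt change becomes an explicit, reviewable update rather than a silent edit.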

Amazon SageMaker AI Pipelines is a serverless workflow orchestration service designed for MLOps and LLMOps automation. Use SageMaker AI Pipelines to build, run, and monitor repeatable end-to-end ML workflows for LLMs, from data preparation to model deployment. The service can scale to run tens of thousands of concurrent ML workflows in production, which is particularly useful when working with resource-intensive LLMs. Self-managed MLflow or SageMaker AI managed MLflow is well suited for tracking experiments, cataloging models, approving them, and deploying them to production.

Amazon Bedrock provides a managed RAG feature called Knowledge Bases, which automates ingestion and indexing into a choice of vector databases and orchestrates the retrieval process. Amazon Bedrock Agents use the reasoning of foundation models, APIs, and data to break down user requests, gather relevant information, and complete tasks. Amazon Bedrock also has managed features for continued pre-training and fine-tuning of foundation models.
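As an illustration of the managed retrieval flow, the following sketch queries a knowledge base through the `retrieve_and_generate` operation of the `bedrock-agent-runtime` client. The function name `ask_knowledge_base` is hypothetical, the knowledge base ID and model ARN are placeholders that you supply, and the sketch assumes the knowledge base is backed by documents in Amazon S3. The client is passed in as an argument so the logic can be exercised without AWS credentials.

```python
def ask_knowledge_base(agent_runtime, knowledge_base_id, model_arn, question):
    """Ask a Bedrock knowledge base a question; return (answer_text, source_uris)."""
    response = agent_runtime.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    )
    # Collect the S3 URIs of the retrieved chunks that the answer cites (assumes S3 sources).
    source_uris = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in source_uris:
                source_uris.append(uri)
    return response["output"]["text"], source_uris
```

In practice the client would typically be created with `boto3.client("bedrock-agent-runtime")`. Returning the cited sources alongside the answer makes it easier to log and evaluate retrieval quality over time.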

Implementation steps

For SageMaker AI, implement pipelines.
- Use the SageMaker AI SDK to add steps, which may include data preparation, model training, model evaluation, and model deployment
- Use SageMaker AI Processing to run evaluation scripts on the trained model with SageMaker AI Clarify
- Automate testing with integration and performance tests. Consider AWS Step Functions to orchestrate them
- Start the pipeline execution
- Use Amazon SageMaker AI Studio to view the pipeline's progress
- Set up notifications for pipeline status updates using Amazon EventBridge (formerly Amazon CloudWatch Events)
- Integrate this into the larger application's CI/CD pipeline using AWS CodePipeline, AWS CodeBuild, and AWS CodeDeploy with Amazon SageMaker AI Projects
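The start, monitor, and notify steps above can be sketched with the AWS SDK for Python. This example assumes a pipeline has already been defined and created, and that an Amazon SNS topic exists for notifications. The function names, parameter names, and the target ID `notify` are hypothetical. The clients are passed in as arguments so the logic can be tested with stand-ins. The `events` client shown here serves Amazon EventBridge, the current name for CloudWatch Events.

```python
import json
import time

TERMINAL_STATUSES = {"Succeeded", "Failed", "Stopped"}


def run_pipeline(sm_client, pipeline_name, parameters, poll_seconds=30):
    """Start a pipeline execution and poll until it reaches a terminal status."""
    started = sm_client.start_pipeline_execution(
        PipelineName=pipeline_name,
        # Pipeline parameter values are passed as strings.
        PipelineParameters=[
            {"Name": name, "Value": str(value)} for name, value in parameters.items()
        ],
    )
    execution_arn = started["PipelineExecutionArn"]
    while True:
        described = sm_client.describe_pipeline_execution(
            PipelineExecutionArn=execution_arn
        )
        status = described["PipelineExecutionStatus"]
        if status in TERMINAL_STATUSES:
            return execution_arn, status
        time.sleep(poll_seconds)


def notify_on_failure(events_client, rule_name, pipeline_arn, sns_topic_arn):
    """Create a rule that forwards failed or stopped executions to an SNS topic."""
    pattern = {
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Model Building Pipeline Execution Status Change"],
        "detail": {
            "pipelineArn": [pipeline_arn],
            "currentPipelineExecutionStatus": ["Failed", "Stopped"],
        },
    }
    events_client.put_rule(
        Name=rule_name, EventPattern=json.dumps(pattern), State="ENABLED"
    )
    events_client.put_targets(
        Rule=rule_name, Targets=[{"Id": "notify", "Arn": sns_topic_arn}]
    )
    return pattern
```

In practice the clients would typically come from `boto3.client("sagemaker")` and `boto3.client("events")`, and the SNS topic's access policy must allow EventBridge to publish to it. Polling suits a CI job that needs to block on the result, while the event rule suits asynchronous alerting.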
Enable MLflow experiment tracking.
- In Amazon SageMaker AI Studio, configure MLflow tracking
- Use MLflow to log parameters, metrics, and artifacts during your model training process
- The logged parameters, metrics, and artifacts are tracked and stored automatically in your SageMaker AI managed MLflow tracking server
- Use the MLflow UI in SageMaker AI Studio to analyze metrics and artifacts to determine the best model iterations
- Register your best models in the MLflow Model Registry
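A minimal sketch of the logging step follows. It assumes the `mlflow` package is installed and, for a SageMaker AI managed tracking server, that the `sagemaker-mlflow` plugin is installed so that the tracking server ARN can be used as the tracking URI. The function name `log_tuning_run` is hypothetical. The `tracker` argument is expected to be the imported `mlflow` module, or any object that exposes the same functions, which keeps the sketch testable without a tracking server.

```python
def log_tuning_run(tracker, tracking_uri, experiment_name, params, metrics):
    """Log one training or tuning run and return its MLflow run ID.

    `tracker` is the imported `mlflow` module (or a compatible stand-in).
    """
    tracker.set_tracking_uri(tracking_uri)
    tracker.set_experiment(experiment_name)
    with tracker.start_run() as run:
        # Hyperparameters and configuration, for example learning rate or base model ID.
        tracker.log_params(params)
        # Evaluation results, for example validation loss.
        tracker.log_metrics(metrics)
        return run.info.run_id
```

A call might look like `log_tuning_run(mlflow, tracking_server_arn, "fm-tuning", {"learning_rate": 1e-5}, {"eval_loss": 0.42})`, where the experiment name and values are illustrative. Registering the best model would then be a call to `mlflow.register_model` with that run's model URI, assuming a model artifact was logged during the run.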
Use a version control system.
- Use a Git-compatible repository to manage code and configurations effectively
- Set up SageMaker AI Model Registry to catalog and version models
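Cataloging and approving a model version can be sketched with the SageMaker API operations `create_model_package` and `update_model_package`. The example assumes the model package group already exists, that the model artifact is in Amazon S3, and that the inference container accepts and returns JSON. The function names and the default description are hypothetical. The client is passed in as an argument so the logic can be tested with a stand-in.

```python
def register_model_version(
    sm_client,
    group_name,
    image_uri,
    model_data_url,
    description="Candidate registered by pipeline",
):
    """Add a new version to an existing model package group, pending manual approval."""
    response = sm_client.create_model_package(
        ModelPackageGroupName=group_name,
        ModelPackageDescription=description,
        ModelApprovalStatus="PendingManualApproval",
        InferenceSpecification={
            "Containers": [{"Image": image_uri, "ModelDataUrl": model_data_url}],
            # Assumption: the container speaks JSON in both directions.
            "SupportedContentTypes": ["application/json"],
            "SupportedResponseMIMETypes": ["application/json"],
        },
    )
    return response["ModelPackageArn"]


def approve_model_version(sm_client, model_package_arn):
    """Mark a registered version as approved, for example after evaluation passes."""
    sm_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
    )
```

In practice the client would typically come from `boto3.client("sagemaker")`. Keeping registration and approval as separate steps lets an evaluation stage or a human reviewer gate what reaches production.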
Set up monitoring and logging.
- Monitor real-time FM metrics with Amazon CloudWatch
- Centralize logging with Amazon CloudWatch Logs
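As one possible form of metric monitoring, the following sketch creates a CloudWatch alarm on model invocation latency. It assumes the foundation model is invoked through Amazon Bedrock, which publishes the `InvocationLatency` metric (in milliseconds) in the `AWS/Bedrock` namespace with a `ModelId` dimension. The function name, the alarm naming scheme, the evaluation window, and the SNS topic are hypothetical choices for illustration.

```python
def create_latency_alarm(cw_client, model_id, threshold_ms, sns_topic_arn):
    """Alarm when average invocation latency stays above a threshold for 15 minutes."""
    alarm_name = f"bedrock-latency-{model_id}"
    cw_client.put_metric_alarm(
        AlarmName=alarm_name,
        Namespace="AWS/Bedrock",
        MetricName="InvocationLatency",
        Dimensions=[{"Name": "ModelId", "Value": model_id}],
        Statistic="Average",
        Period=300,  # seconds per datapoint
        EvaluationPeriods=3,  # three consecutive 5-minute periods
        Threshold=float(threshold_ms),
        ComparisonOperator="GreaterThanThreshold",
        # Periods without traffic are not treated as a breach.
        TreatMissingData="notBreaching",
        AlarmActions=[sns_topic_arn],
    )
    return alarm_name
```

In practice the client would typically come from `boto3.client("cloudwatch")`. The same pattern can be applied to other metrics, such as throttling or error counts, by changing `MetricName` and the threshold.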
Create a feedback loop for continuous improvement.
- Gather user feedback and model performance data
- Automate retraining and model updates based on new data
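How feedback triggers retraining depends on the workload. As a minimal sketch under stated assumptions, the following standard-library example decides whether to retrain based on the share of responses that users marked as helpful. The `Feedback` record, the function name `should_retrain`, and both thresholds are hypothetical and would need to be tuned for a real application.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Feedback:
    """Hypothetical feedback record: whether a user rated a response as helpful."""

    helpful: bool


def should_retrain(
    feedback: Sequence[Feedback],
    min_samples: int = 200,
    min_helpful_rate: float = 0.8,
) -> bool:
    """Return True when enough feedback exists and the helpful rate is below target."""
    # Avoid reacting to noise when only a few ratings have been collected.
    if len(feedback) < min_samples:
        return False
    helpful_rate = sum(1 for item in feedback if item.helpful) / len(feedback)
    return helpful_rate < min_helpful_rate
```

A scheduled job could evaluate this check over a recent window of feedback and, when it returns `True`, start the retraining pipeline. Requiring a minimum sample size keeps a handful of negative ratings from triggering an expensive tuning run.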

