

# Best practices by ML lifecycle phase
<a name="best-practices-by-ml-lifecycle-phase"></a>

The machine learning lifecycle has six phases. The following index lists the best practices for each phase, grouped by pillar.

## Business goal identification
<a name="business-goal-identification-phase"></a>

### Operational excellence
<a name="operational-excellence-pillar"></a>
+  [MLOPS01-BP01 Develop the right skills with accountability and empowerment](mlops01-bp01.md) 
+  [MLOPS01-BP02 Discuss and agree on the level of model explainability](mlops01-bp02.md) 
+  [MLOPS01-BP03 Monitor model adherence to business requirements](mlops01-bp03.md) 

### Security
<a name="security-pillar"></a>
+  [MLSEC01-BP01 Validate ML data permissions, privacy, software, and license terms](mlsec01-bp01.md) 

### Reliability
<a name="reliability-pillar"></a>

 There are no reliability pillar best practices for business goal identification. 

### Performance efficiency
<a name="performance-efficiency-pillar"></a>
+  [MLPERF01-BP01 Determine key performance indicators](mlperf01-bp01.md) 

### Cost optimization
<a name="cost-optimization-pillar"></a>
+  [MLCOST01-BP01 Define overall return on investment (ROI) and opportunity cost](mlcost01-bp01.md) 
+  [MLCOST01-BP02 Use managed services to reduce total cost of ownership (TCO)](mlcost01-bp02.md) 

### Sustainability
<a name="sustainability-pillar"></a>
+  [MLSUS01-BP01 Define the overall environmental impact or benefit](mlsus01-bp01.md) 

## ML problem framing
<a name="ml-problem-framing-phase"></a>

### Operational excellence
<a name="operational-excellence-pillar-1"></a>
+  [MLOPS02-BP01 Establish ML roles and responsibilities](mlops02-bp01.md) 
+  [MLOPS02-BP02 Prepare an ML profile template](mlops02-bp02.md) 
+  [MLOPS02-BP03 Establish model improvement strategies](mlops02-bp03.md) 
+  [MLOPS02-BP04 Establish a lineage tracker system](mlops02-bp04.md) 
+  [MLOPS02-BP05 Establish feedback loops across ML lifecycle phases](mlops02-bp05.md) 
+  [MLOPS02-BP06 Review fairness and explainability](mlops02-bp06.md) 

### Security
<a name="security-pillar-1"></a>
+  [MLSEC02-BP01 Design data encryption and obfuscation](mlsec02-bp01.md) 

### Reliability
<a name="reliability-pillar-1"></a>
+  [MLREL01-BP01 Use APIs to abstract change from model consuming applications](mlrel01-bp01.md) 
+  [MLREL01-BP02 Adopt a machine learning microservice strategy](mlrel01-bp02.md) 

### Performance efficiency
<a name="performance-efficiency-pillar-1"></a>
+  [MLPERF02-BP01 Define relevant evaluation metrics](mlperf02-bp01.md) 
+  [MLPERF02-BP02 Use purpose-built AI and ML services and resources](mlperf02-bp02.md) 

### Cost optimization
<a name="cost-optimization-pillar-1"></a>
+  [MLCOST02-BP01 Identify if machine learning is the right solution](mlcost02-bp01.md) 
+  [MLCOST02-BP02 Perform a tradeoff analysis between custom and pre-trained models](mlcost02-bp02.md) 

### Sustainability
<a name="sustainability-pillar-1"></a>
+  [MLSUS02-BP01 Consider AI services and pre-trained models](mlsus02-bp01.md) 
+  [MLSUS02-BP02 Select sustainable Regions](mlsus02-bp02.md) 

## Data processing
<a name="data-processing-phase"></a>

### Operational excellence
<a name="operational-excellence-pillar-2"></a>
+  [MLOPS03-BP01 Profile data to improve quality](mlops03-bp01.md) 
+  [MLOPS03-BP02 Create tracking and version control mechanisms](mlops03-bp02.md) 

### Security
<a name="security-pillar-2"></a>
+  [MLSEC03-BP01 Provide least privilege access](mlsec03-bp01.md) 
+  [MLSEC03-BP02 Secure data and modeling environment](mlsec03-bp02.md) 
+  [MLSEC03-BP03 Protect sensitive data privacy](mlsec03-bp03.md) 
+  [MLSEC03-BP04 Enforce data lineage](mlsec03-bp04.md) 
+  [MLSEC03-BP05 Keep only relevant data](mlsec03-bp05.md) 

### Reliability
<a name="reliability-pillar-2"></a>
+  [MLREL02-BP01 Use a data catalog](mlrel02-bp01.md) 
+  [MLREL02-BP02 Use a data pipeline](mlrel02-bp02.md) 
+  [MLREL02-BP03 Automate managing data changes](mlrel02-bp03.md) 

### Performance efficiency
<a name="performance-efficiency-pillar-2"></a>
+  [MLPERF03-BP01 Use a modern data architecture](mlperf03-bp01.md) 

### Cost optimization
<a name="cost-optimization-pillar-2"></a>
+  [MLCOST03-BP01 Use managed data labeling](mlcost03-bp01.md) 
+  [MLCOST03-BP02 Use no-code or low-code and code generation tools for interactive analysis](mlcost03-bp02.md) 
+  [MLCOST03-BP03 Use managed data processing capabilities](mlcost03-bp03.md) 
+  [MLCOST03-BP04 Enable feature reusability](mlcost03-bp04.md) 

### Sustainability
<a name="sustainability-pillar-2"></a>
+  [MLSUS03-BP01 Minimize idle resources](mlsus03-bp01.md) 
+  [MLSUS03-BP02 Implement data lifecycle policies aligned with your sustainability goals](mlsus03-bp02.md) 
+  [MLSUS03-BP03 Adopt sustainable storage options](mlsus03-bp03.md) 

## Model development
<a name="model-development-phase"></a>

### Operational excellence
<a name="operational-excellence-pillar-3"></a>
+  [MLOPS04-BP01 Automate operations through MLOps and CI/CD](mlops04-bp01.md) 
+  [MLOPS04-BP02 Establish reliable packaging patterns to access approved public libraries](mlops04-bp02.md) 

### Security
<a name="security-pillar-3"></a>
+  [MLSEC04-BP01 Secure governed ML environment](mlsec04-bp01.md) 
+  [MLSEC04-BP02 Secure inter-node cluster communications](mlsec04-bp02.md)
+  [MLSEC04-BP03 Protect against data poisoning threats](mlsec04-bp03.md)

### Reliability
<a name="reliability-pillar-3"></a>
+  [MLREL03-BP01 Enable CI/CD/CT automation with traceability](mlrel03-bp01.md) 
+  [MLREL03-BP02 Verify feature consistency across training and inference](mlrel03-bp02.md)
+  [MLREL03-BP03 Validate models with relevant data](mlrel03-bp03.md)
+  [MLREL03-BP04 Establish data bias detection and mitigation](mlrel03-bp04.md)

### Performance efficiency
<a name="performance-efficiency-pillar-3"></a>
+  [MLPERF04-BP01 Optimize training and inference instance types](mlperf04-bp01.md) 
+  [MLPERF04-BP02 Explore alternatives for performance improvement](mlperf04-bp02.md)
+  [MLPERF04-BP03 Establish a model performance evaluation pipeline](mlperf04-bp03.md)
+  [MLPERF04-BP04 Establish feature statistics](mlperf04-bp04.md)
+  [MLPERF04-BP05 Perform a performance trade-off analysis](mlperf04-bp05.md)
+  [MLPERF04-BP06 Detect performance issues when using transfer learning](mlperf04-bp06.md)

### Cost optimization
<a name="cost-optimization-pillar-3"></a>
+  [MLCOST04-BP01 Select optimal computing instance size](mlcost04-bp01.md)
+  [MLCOST04-BP02 Use managed build environments](mlcost04-bp02.md)
+  [MLCOST04-BP03 Select local training for small scale experiments](mlcost04-bp03.md)
+  [MLCOST04-BP04 Select an optimal ML framework](mlcost04-bp04.md)
+  [MLCOST04-BP05 Use automated machine learning](mlcost04-bp05.md)
+  [MLCOST04-BP06 Use managed training capabilities](mlcost04-bp06.md)
+  [MLCOST04-BP07 Use distributed training](mlcost04-bp07.md)
+  [MLCOST04-BP08 Stop resources when not in use](mlcost04-bp08.md)
+  [MLCOST04-BP09 Start training with small datasets](mlcost04-bp09.md)
+  [MLCOST04-BP10 Use warm start and checkpointing hyperparameter tuning](mlcost04-bp10.md)
+  [MLCOST04-BP11 Use hyperparameter optimization technologies](mlcost04-bp11.md)
+  [MLCOST04-BP12 Set up a budget and use resource tagging to track costs](mlcost04-bp12.md)
+  [MLCOST04-BP13 Enable data and compute proximity](mlcost04-bp13.md)
+  [MLCOST04-BP14 Select optimal algorithms](mlcost04-bp14.md)

### Sustainability
<a name="sustainability-pillar-3"></a>
+  [MLSUS04-BP01 Define sustainable performance criteria](mlsus04-bp01.md) 
+  [MLSUS04-BP02 Select energy-efficient algorithms](mlsus04-bp02.md)
+  [MLSUS04-BP03 Archive or delete unnecessary training artifacts](mlsus04-bp03.md)
+  [MLSUS04-BP04 Use efficient model tuning methods](mlsus04-bp04.md)

## Model deployment
<a name="model-deployment-phase"></a>

### Operational excellence
<a name="operational-excellence-pillar-4"></a>
+  [MLOPS05-BP01 Establish deployment environment metrics](mlops05-bp01.md) 

### Security
<a name="security-pillar-4"></a>
+  [MLSEC05-BP01 Protect against adversarial and malicious activities](mlsec05-bp01.md) 

### Reliability
<a name="reliability-pillar-4"></a>
+  [MLREL04-BP01 Automate endpoint changes through a pipeline](mlrel04-bp01.md) 
+  [MLREL04-BP02 Use an appropriate deployment and testing strategy](mlrel04-bp02.md) 

### Performance efficiency
<a name="performance-efficiency-pillar-4"></a>
+  [MLPERF05-BP01 Evaluate cloud versus edge options for machine learning deployment](mlperf05-bp01.md) 
+  [MLPERF05-BP02 Choose an optimal deployment option in the cloud](mlperf05-bp02.md) 

### Cost optimization
<a name="cost-optimization-pillar-4"></a>
+  [MLCOST05-BP01 Use an appropriate deployment option](mlcost05-bp01.md) 
+  [MLCOST05-BP02 Explore cost effective hardware options](mlcost05-bp02.md)
+  [MLCOST05-BP03 Right-size the model hosting instance fleet](mlcost05-bp03.md)

### Sustainability
<a name="sustainability-pillar-4"></a>
+  [MLSUS05-BP01 Align SLAs with sustainability goals](mlsus05-bp01.md) 
+  [MLSUS05-BP02 Use efficient silicon](mlsus05-bp02.md)
+  [MLSUS05-BP03 Optimize models for inference](mlsus05-bp03.md)
+  [MLSUS05-BP04 Deploy multiple models behind a single endpoint](mlsus05-bp04.md)

## Model monitoring
<a name="model-monitoring-phase"></a>

### Operational excellence
<a name="operational-excellence-pillar-5"></a>
+  [MLOPS06-BP01 Synchronize architecture and configuration, and check for skew across environments](mlops06-bp01.md) 
+  [MLOPS06-BP02 Enable model observability and tracking](mlops06-bp02.md)

### Security
<a name="security-pillar-5"></a>
+  [MLSEC06-BP01 Restrict access to intended legitimate consumers](mlsec06-bp01.md) 
+  [MLSEC06-BP02 Monitor human interactions with data for anomalous activity](mlsec06-bp02.md) 

### Reliability
<a name="reliability-pillar-5"></a>
+  [MLREL05-BP01 Allow automatic scaling of the model endpoint](mlrel05-bp01.md) 
+  [MLREL05-BP02 Create a recoverable endpoint with a managed version control strategy](mlrel05-bp02.md) 

### Performance efficiency
<a name="performance-efficiency-pillar-5"></a>
+  [MLPERF06-BP01 Include human-in-the-loop monitoring](mlperf06-bp01.md) 
+  [MLPERF06-BP02 Evaluate model explainability](mlperf06-bp02.md) 
+  [MLPERF06-BP03 Evaluate data drift](mlperf06-bp03.md) 
+  [MLPERF06-BP04 Monitor, detect, and handle model performance degradation](mlperf06-bp04.md) 
+  [MLPERF06-BP05 Establish an automated re-training framework](mlperf06-bp05.md) 
+  [MLPERF06-BP06 Review for updated data and features for retraining](mlperf06-bp06.md) 

### Cost optimization
<a name="cost-optimization-pillar-5"></a>
+  [MLCOST06-BP01 Monitor usage and cost by ML activity](mlcost06-bp01.md) 
+  [MLCOST06-BP02 Monitor return on investment for ML models](mlcost06-bp02.md)
+  [MLCOST06-BP03 Monitor endpoint usage and right-size the instance fleet](mlcost06-bp03.md)
+  [MLCOST06-BP04 Enable debugging and logging](mlcost06-bp04.md)

### Sustainability
<a name="sustainability-pillar-5"></a>
+  [MLSUS06-BP01 Measure material efficiency](mlsus06-bp01.md)
+  [MLSUS06-BP02 Retrain only when necessary](mlsus06-bp02.md) 