MLSEC03-BP02 Secure data and modeling environment - Machine Learning Lens

Secure your machine learning data and development environments to protect valuable information assets throughout the ML lifecycle. By implementing proper security measures for storage, compute, and network resources, you can maintain data integrity and confidentiality while enabling data scientists to work effectively.

Desired outcome: You have a secure foundation for storing, processing, and utilizing data for machine learning workloads. Your data is encrypted at rest and in transit, with access tightly controlled through identity management, infrastructure isolation, and secure coding practices. Your development environments are protected from unauthorized access while providing the necessary tools for your ML practitioners.

Common anti-patterns:

  • Storing unencrypted training data in publicly accessible storage.

  • Using default security configurations for ML environments.

  • Allowing unrestricted internet access from ML environments.

  • Using hard-coded credentials in ML code and notebooks.

  • Installing ML packages from untrusted sources without validation.

  • Granting excessive permissions to development environments.

Benefits of establishing this best practice:

  • Protection of sensitive training data from unauthorized access or exfiltration.

  • Reduced risk of compromised ML models and systems.

  • Improved adherence to regulatory requirements for data handling.

  • Improved governance of ML development environments.

  • Enhanced ability to detect and respond to security events.

Level of risk exposed if this best practice is not established: High

Implementation guidance

Securing your ML environments requires a comprehensive approach addressing data storage, compute resources, network isolation, and access controls. The ML lifecycle involves multiple stages where data could be exposed if proper security measures aren't implemented. By establishing secure foundations for your ML infrastructure, you can protect valuable intellectual property while still enabling productivity.

Start by securing your data repositories with encryption and access controls. Then build secure compute environments for model development that maintain isolation through private networking. Implement proper credential management to avoid exposure of secrets. Finally, verify that your package management practices block the introduction of malicious code into your ML pipeline.

Modern ML workloads often involve large datasets and complex algorithms, so the impact of a breach can be substantial. By implementing the measures in this best practice, you create a secure foundation for your ML initiatives.

Implementation steps

  1. Build a secure analysis environment. During the data preparation and feature engineering phases, use secure data exploration options on AWS. Use Amazon SageMaker AI Studio managed environments or Amazon EMR for data processing. Alternatively, use managed services like Amazon Athena and AWS Glue to explore data without moving it out of your data lake. For smaller datasets, explore, visualize, and engineer features in Amazon SageMaker AI Studio, then scale up your feature engineering with managed ETL services such as Amazon EMR or AWS Glue.
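
As a sketch of exploring data in place, the parameters for an Athena `start_query_execution` call can be built so that query results are always KMS-encrypted. The database, workgroup, output URI, and key ARN below are placeholders:

```python
def athena_query_params(query, database, workgroup, output_s3_uri, kms_key_arn):
    """Build parameters for athena.start_query_execution so data is
    queried in place and results are encrypted with SSE-KMS."""
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "WorkGroup": workgroup,
        "ResultConfiguration": {
            "OutputLocation": output_s3_uri,
            "EncryptionConfiguration": {
                "EncryptionOption": "SSE_KMS",
                "KmsKey": kms_key_arn,
            },
        },
    }
```

With a real client this would be invoked as `boto3.client("athena").start_query_execution(**athena_query_params(...))`; enforcing the same result encryption at the workgroup level is a stronger control.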

  2. Create dedicated IAM and KMS resources. Limit the scope and impact of credentials and keys by creating dedicated AWS IAM roles and AWS KMS keys for ML workloads. Create private Amazon S3 buckets with versioning enabled to protect your data and intellectual property. Implement a centralized data lake using AWS Lake Formation on Amazon S3. Secure your data lake using a combination of services to encrypt data in transit and at rest. Monitor access with granular AWS IAM policies, S3 bucket policies, S3 Access Logs, Amazon CloudWatch, and AWS CloudTrail.
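
A minimal sketch of locking down such a bucket with boto3 (the bucket name and KMS key ARN are placeholders). The helper returns the parameter sets, and a second function applies them through an injected S3 client:

```python
def bucket_security_config(bucket_name, kms_key_arn):
    """Parameter sets for a private, versioned, KMS-encrypted S3 bucket."""
    return {
        "public_access_block": {
            "Bucket": bucket_name,
            "PublicAccessBlockConfiguration": {
                "BlockPublicAcls": True,
                "IgnorePublicAcls": True,
                "BlockPublicPolicy": True,
                "RestrictPublicBuckets": True,
            },
        },
        "versioning": {
            "Bucket": bucket_name,
            "VersioningConfiguration": {"Status": "Enabled"},
        },
        "encryption": {
            "Bucket": bucket_name,
            "ServerSideEncryptionConfiguration": {
                "Rules": [{
                    "ApplyServerSideEncryptionByDefault": {
                        "SSEAlgorithm": "aws:kms",
                        "KMSMasterKeyID": kms_key_arn,
                    },
                    "BucketKeyEnabled": True,  # reduce KMS request costs
                }]
            },
        },
    }

def apply_bucket_security(s3_client, bucket_name, kms_key_arn):
    """Apply the three controls to an existing bucket."""
    cfg = bucket_security_config(bucket_name, kms_key_arn)
    s3_client.put_public_access_block(**cfg["public_access_block"])
    s3_client.put_bucket_versioning(**cfg["versioning"])
    s3_client.put_bucket_encryption(**cfg["encryption"])
```

In practice you would pass `boto3.client("s3")` as `s3_client`; injecting the client keeps the sketch testable.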

  3. Use Secrets Manager and Parameter Store to protect credentials. Replace hard-coded secrets in your code with API calls to programmatically retrieve and decrypt secrets using AWS Secrets Manager. Use AWS Systems Manager Parameter Store to store application configuration variables such as AMI IDs or license keys. Grant permissions to your SageMaker AI IAM role to access these services from your ML environments.
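
A sketch of the retrieval pattern, with the clients injected so no credentials live in code (the secret and parameter names are illustrative):

```python
import json

def load_secret(secretsmanager_client, secret_id):
    """Fetch and parse a JSON secret at runtime instead of hard-coding it."""
    resp = secretsmanager_client.get_secret_value(SecretId=secret_id)
    return json.loads(resp["SecretString"])

def load_parameter(ssm_client, name):
    """Read a configuration value (for example, an AMI ID) from Parameter Store."""
    resp = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return resp["Parameter"]["Value"]
```

With real clients this becomes `load_secret(boto3.client("secretsmanager"), "ml/db-credentials")`; the SageMaker AI execution role needs `secretsmanager:GetSecretValue` and `ssm:GetParameter` permissions.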

  4. Automate managing configuration. Use lifecycle configuration scripts to manage ML environments. These scripts run when environments are created or restarted, allowing you to install custom packages, preload datasets, and set up source code repositories. Lifecycle configurations can be reused across multiple environments and updated centrally. Use AWS CloudFormation infrastructure as code and Service Catalog to simplify configuration for end users while maintaining security standards.
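
As a sketch, a lifecycle configuration can be registered with `create_studio_lifecycle_config`, which expects the script content base64-encoded. The mirror URL in the script and the app type default are assumptions for illustration:

```python
import base64

ON_START_SCRIPT = """#!/bin/bash
set -eux
# Install approved packages from an internal mirror (hypothetical URL).
pip install --index-url https://pypi.internal.example.com/simple pandas scikit-learn
"""

def lifecycle_config_params(name, script, app_type="JupyterLab"):
    """Build parameters for sagemaker.create_studio_lifecycle_config;
    the script body must be base64-encoded."""
    return {
        "StudioLifecycleConfigName": name,
        "StudioLifecycleConfigContent": base64.b64encode(script.encode()).decode(),
        "StudioLifecycleConfigAppType": app_type,
    }
```

A real call would be `boto3.client("sagemaker").create_studio_lifecycle_config(**lifecycle_config_params("install-approved-packages", ON_START_SCRIPT))`.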

  5. Create private, isolated, network environments. Use Amazon Virtual Private Cloud (Amazon VPC) to limit connectivity to only essential services and users. Deploy Amazon SageMaker AI resources in a VPC to enable network-level controls and capture network activity in VPC Flow Logs. For distributed training workloads, use Amazon SageMaker AI HyperPod which provides managed, resilient clusters with built-in VPC integration and multi-AZ deployment for enhanced security and availability. This deployment model also enables secure queries to data sources within your VPC, such as Amazon RDS databases or Amazon Redshift data warehouses. Use IAM to restrict access to ML environment web UIs so they can only be accessed from within your VPC. Implement AWS PrivateLink to privately connect your SageMaker AI resources with supported AWS services, facilitating secure communication within the AWS network. Use AWS KMS to encrypt data on the Amazon EBS volumes attached to SageMaker AI resources.
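
A minimal sketch of the parameters for a VPC-only SageMaker AI domain (all identifiers are placeholders); setting `AppNetworkAccessType` to `VpcOnly` routes traffic through your VPC instead of the SageMaker-managed internet path:

```python
def vpc_only_domain_params(domain_name, vpc_id, subnet_ids,
                           execution_role_arn, kms_key_id):
    """Build parameters for sagemaker.create_domain with no direct
    internet route; traffic flows through your VPC and any PrivateLink
    endpoints you provision."""
    return {
        "DomainName": domain_name,
        "AuthMode": "IAM",
        "AppNetworkAccessType": "VpcOnly",  # disable the managed internet route
        "VpcId": vpc_id,
        "SubnetIds": subnet_ids,
        "KmsKeyId": kms_key_id,  # encrypt the domain's attached storage
        "DefaultUserSettings": {"ExecutionRole": execution_role_arn},
    }
```

With this mode you must provision VPC endpoints (for example, for the SageMaker API and runtime, S3, and CloudWatch Logs) so the environment remains functional.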

  6. Restrict access. ML development environments provide web-based access to the underlying compute resources, typically with elevated privileges. Restrict this access to remove the ability to assume root permissions while still allowing users to control their local environment. Implement least privilege access controls for ML resources.
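
One way to sketch such a restriction is an IAM policy that denies creating Studio presigned URLs unless the request originates from your VPC (the VPC ID is a placeholder):

```python
def vpc_restricted_presigned_url_policy(vpc_id):
    """IAM policy (as a dict) denying sagemaker:CreatePresignedDomainUrl
    for requests that do not originate from the given VPC."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "sagemaker:CreatePresignedDomainUrl",
            "Resource": "*",
            "Condition": {"StringNotEquals": {"aws:SourceVpc": vpc_id}},
        }],
    }
```

An explicit Deny with a `StringNotEquals` condition is used because it cannot be overridden by a broader Allow elsewhere in the identity's policies.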

  7. Secure ML algorithms. Amazon SageMaker AI uses container technology to train and host algorithms and models. When creating custom containers, publish them to a private container registry hosted on Amazon Elastic Container Registry (Amazon ECR). Encrypt containers hosted on Amazon ECR at rest using AWS KMS. Regularly scan containers for vulnerabilities and implement a secure container update process.
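
A sketch of the repository settings that implement these controls (repository name and key ARN are placeholders): KMS encryption at rest, scan on push, and immutable tags so a pushed image cannot be silently replaced:

```python
def secure_ecr_repository_params(repo_name, kms_key_arn):
    """Build parameters for ecr.create_repository with encryption at rest,
    vulnerability scanning on push, and immutable image tags."""
    return {
        "repositoryName": repo_name,
        "imageTagMutability": "IMMUTABLE",  # prevent overwriting pushed tags
        "imageScanningConfiguration": {"scanOnPush": True},
        "encryptionConfiguration": {
            "encryptionType": "KMS",
            "kmsKey": kms_key_arn,
        },
    }
```

A real call would be `boto3.client("ecr").create_repository(**secure_ecr_repository_params(...))`; ECR's enhanced scanning via Amazon Inspector can supplement the basic scan-on-push setting.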

  8. Enforce code best practices. Use secure git repositories for storing code. Implement code reviews, automated security scanning, and version control for ML code. Integrate security checks into your ML CI/CD pipeline to detect potential security issues early in the development process.
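
As a minimal example of one such automated check, a pre-commit scan can flag strings matching the AWS access key ID format (`AKIA` followed by 16 uppercase alphanumeric characters); dedicated tools cover far more patterns:

```python
import re

# AWS access key IDs begin with "AKIA" followed by 16 uppercase alphanumerics.
AKID_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")

def scan_for_credentials(source):
    """Return any substrings that look like hard-coded AWS access key IDs."""
    return AKID_PATTERN.findall(source)
```

Such a scan runs as a pre-commit hook or CI step, failing the build when it returns matches.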

  9. Implement a package mirror for consuming approved packages. Evaluate license terms to determine appropriate ML packages for your business across the ML lifecycle phases. Common ML Python packages include Pandas, PyTorch, Keras, NumPy, and Scikit-learn. Build an automated validation mechanism to check packages for security issues. Only download packages from approved and private repos. Validate package contents before importing. SageMaker AI supports modifying package channel paths to a private repository. When appropriate, use an internal repository as a proxy for public repositories to minimize network traffic and reduce overhead.
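
A sketch of validating package contents before installation, assuming a hypothetical allow-list of SHA-256 digests published by your internal mirror:

```python
import hashlib

def sha256_of(path):
    """Compute the SHA-256 digest of a file, streaming to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def is_approved(path, filename, approved_digests):
    """Check a downloaded artifact against the allow-list before installing.
    approved_digests maps artifact filenames to expected SHA-256 digests."""
    expected = approved_digests.get(filename)
    return expected is not None and sha256_of(path) == expected
```

Installation proceeds only when `is_approved` returns True; pip's own `--require-hashes` mode with a pinned requirements file achieves the same guarantee natively.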

  10. Implement model security monitoring. Deploy continuous monitoring solutions to detect unauthorized access attempts, unusual data access patterns, and potential data exfiltration from your ML environments. Use Amazon CloudWatch, AWS Security Hub CSPM, and Amazon GuardDuty to create a comprehensive security monitoring solution for ML resources.
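
One building block, sketched here, is a CloudWatch alarm on a metric published from a CloudTrail metric filter; the metric and namespace names below are placeholders that assume such a filter already exists:

```python
def unauthorized_call_alarm_params(alarm_name, sns_topic_arn):
    """Build parameters for cloudwatch.put_metric_alarm that notify an SNS
    topic when unauthorized API calls are recorded. Assumes a CloudTrail
    metric filter publishes UnauthorizedAttemptCount (placeholder names)."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "CloudTrailMetrics",
        "MetricName": "UnauthorizedAttemptCount",
        "Statistic": "Sum",
        "Period": 300,  # evaluate over 5-minute windows
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }
```

This complements, rather than replaces, the managed detections from Amazon GuardDuty and AWS Security Hub.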

  11. Implement additional security controls for AI workloads. For AI workloads, implement additional security controls around input validation and data leakage prevention. Implement Amazon SageMaker AI Model Monitor to detect drift in production AI systems. Consider using Amazon SageMaker AI Model Cards to document model security characteristics and limitations.
