# Model observability for training jobs on SageMaker HyperPod clusters orchestrated by Amazon EKS

SageMaker HyperPod clusters orchestrated with Amazon EKS can integrate with the [MLflow application on Amazon SageMaker Studio](https://docs.aws.amazon.com/sagemaker/latest/dg/mlflow.html). Cluster admins set up the MLflow server and connect it to the SageMaker HyperPod clusters. Data scientists can then gain insights into their models.

**To set up an MLflow server using the AWS CLI**

A cluster admin must create an MLflow tracking server.

1. Create a SageMaker AI MLflow tracking server, following the instructions at [Create a tracking server using the AWS CLI](https://docs.aws.amazon.com/sagemaker/latest/dg/mlflow-create-tracking-server-cli.html#mlflow-create-tracking-server-cli-infra-setup).

1. Make sure that the [eks-auth:AssumeRoleForPodIdentity](https://docs.aws.amazon.com/eks/latest/APIReference/API_auth_AssumeRoleForPodIdentity.html) permission exists in the IAM execution role for SageMaker HyperPod.

1. If the `eks-pod-identity-agent` add-on is not already installed on your EKS cluster, install it.

   ```
   aws eks create-addon \
       --cluster-name {{}} \
       --addon-name eks-pod-identity-agent \
       --addon-version {{vx.y.z-eksbuild.1}}
   ```

1. Create a `trust-relationship.json` file for a new role that Pods use to call MLflow APIs.

   ```
   cat >trust-relationship.json <hyperpod-mlflow-policy.json <}}"
           },
           {
               "Effect": "Allow",
               "Action": [
                   "s3:PutObject"
               ],
               "Resource": "{{arn:aws:s3:::}}"
           }
       ]
   }
   EOF
   ```

   **Note**

   The ARNs are those of the MLflow server and of the S3 bucket that you set up with it when you created the server by following the instructions in [Set up MLflow infrastructure](https://docs.aws.amazon.com/sagemaker/latest/dg/mlflow-create-tracking-server-cli.html#mlflow-create-tracking-server-cli-infra-setup).

1. Attach the `mlflow-metrics-emit-policy` policy to the `hyperpod-mlflow-role` using the policy document saved in the previous step.

   ```
   aws iam put-role-policy \
       --role-name {{hyperpod-mlflow-role}} \
       --policy-name {{mlflow-metrics-emit-policy}} \
       --policy-document {{file://hyperpod-mlflow-policy.json}}
   ```

1. Create a Kubernetes service account for Pods to access the MLflow server.

   ```
   cat >{{mlflow-service-account.yaml}} <