

# Starting an automatic model evaluation job in Amazon Bedrock

You can create an automatic model evaluation job using the AWS Management Console, AWS CLI, or a supported AWS SDK. In an automatic model evaluation job, the model you select performs inference using either prompts from a supported built-in dataset or your own custom prompt dataset. Each job also requires you to select a task type, which determines the recommended metrics and built-in prompt datasets. To learn more about available task types and metrics, see [Model evaluation task types in Amazon Bedrock](model-evaluation-tasks.md).

The following examples show you how to create an automatic model evaluation job using the Amazon Bedrock console, the AWS CLI, and the SDK for Python.

All automatic model evaluation jobs require that you create an IAM service role. To learn more about the IAM requirements for setting up a model evaluation job, see [Service role requirements for model evaluation jobs](model-evaluation-security-service-roles.md).

In the API, you can also include an [inference profile](cross-region-inference.md) in the job by specifying its ARN in the `modelIdentifier` field.

------
#### [ Amazon Bedrock console ]

Use the following procedure to create a model evaluation job using the Amazon Bedrock console. To successfully complete this procedure, make sure that your IAM user, group, or role has sufficient permissions to access the console. To learn more, see [Required console permissions to create an automatic model evaluation job](model-evaluation-type-automatic.md#base-for-automatic).

Also, any custom prompt datasets that you want to specify in the model evaluation job must have the required CORS permissions added to the Amazon S3 bucket. To learn more about adding the required CORS permissions, see [Required Cross Origin Resource Sharing (CORS) permissions on S3 buckets](model-evaluation-security-cors.md).
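As a sketch of what such a CORS configuration can look like, the following Python snippet builds a permissive rule set; the bucket name is a placeholder, and you should check the linked CORS documentation for the exact rules your evaluation job requires.

```python
# Placeholder bucket name; replace with your prompt-dataset bucket.
BUCKET = "amzn-s3-demo-bucket"

# Sketch of a permissive CORS rule set; consult the linked CORS
# documentation for the exact rules your evaluation job requires.
cors_configuration = {
    "CORSRules": [
        {
            "AllowedHeaders": ["*"],
            "AllowedMethods": ["GET", "PUT", "POST", "DELETE"],
            "AllowedOrigins": ["*"],
            "ExposeHeaders": ["Access-Control-Allow-Origin"],
        }
    ]
}

# Applying it with Boto3 is a single call:
# import boto3
# boto3.client("s3").put_bucket_cors(
#     Bucket=BUCKET, CORSConfiguration=cors_configuration
# )
```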

**To create an automatic model evaluation job**

1. Open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock/home](https://console.aws.amazon.com/bedrock/home).

1. In the navigation pane, choose **Model evaluation**.

1. In the **Build an evaluation** card, under **Automatic** choose **Create automatic evaluation**.

1. On the **Create automatic evaluation** page, provide the following information:

   1. **Evaluation name** — Give the model evaluation job a name that describes the job. This name is shown in your model evaluation job list. The name must be unique in your account in an AWS Region.

   1. **Description** — (Optional) Provide a description of the model evaluation job.

   1. **Models** — Choose the model you want to use in the model evaluation job.

      To learn more about available models and accessing them in Amazon Bedrock, see [Access Amazon Bedrock foundation models](model-access.md).

   1. (Optional) To change the inference configuration, choose **update**.

      Changing the inference configuration changes the responses generated by the selected model. To learn more about the available inference parameters, see [Inference request parameters and response fields for foundation models](model-parameters.md).

   1. **Task type** — Choose the type of task you want the model to perform during the model evaluation job.

   1. **Metrics and datasets** — The list of available metrics and built-in prompt datasets changes based on the task you select. You can choose from the list of **Available built-in datasets**, or you can choose **Use your own prompt dataset**. If you choose to use your own prompt dataset, enter the exact S3 URI of your prompt dataset file or choose **Browse S3** to search for your prompt dataset.

   1. **Evaluation results** — Specify the S3 URI of the directory where you want the results saved. Choose **Browse S3** to search for a location in Amazon S3.

   1. (Optional) To enable the use of a customer managed key, choose **Customize encryption settings (advanced)**. Then, provide the ARN of the AWS KMS key you want to use.

   1. **Amazon Bedrock IAM role** — Choose **Use an existing role** to use an IAM service role that already has the required permissions, or choose **Create a new role** to create a new IAM service role.

1. Then, choose **Create**.

When the job's status changes to **Completed**, you can view its report card.
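If you bring your own prompt dataset, the file uses the JSON Lines format, with one prompt record per line. The following Python sketch writes such a file; the record contents are illustrative, and the `prompt`, `referenceResponse`, and `category` keys follow the custom prompt dataset format described in the prompt dataset documentation.

```python
import json

# Illustrative records: "prompt" holds the model input, "referenceResponse"
# the ground-truth answer, and "category" an optional grouping label.
records = [
    {"prompt": "What is the capital of France?",
     "referenceResponse": "Paris",
     "category": "Geography"},
    {"prompt": "Does water boil at 100 degrees Celsius at sea level?",
     "referenceResponse": "Yes",
     "category": "Science"},
]

# Write one JSON object per line (JSON Lines).
with open("custom-prompts.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```

Upload the resulting file to Amazon S3 and provide its S3 URI as the prompt dataset location.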

------
#### [ SDK for Python ]

The following example creates an automatic model evaluation job using the SDK for Python (Boto3).

```
import boto3
client = boto3.client('bedrock')

job_request = client.create_evaluation_job(
    jobName="api-auto-job-titan",
    jobDescription="two different task types",
    roleArn="arn:aws:iam::111122223333:role/role-name",
    inferenceConfig={
        "models": [
            {
                "bedrockModel": {
                    "modelIdentifier": "arn:aws:bedrock:us-west-2::foundation-model/amazon.titan-text-lite-v1",
                    "inferenceParams": "{\"inferenceConfig\":{\"maxTokens\": 512,\"temperature\":0.7,\"topP\":0.9}}"
                }
            }
        ]
    },
    outputDataConfig={
        "s3Uri":"s3://amzn-s3-demo-bucket-model-evaluations/outputs/"
    },
    evaluationConfig={
        "automated": {
            "datasetMetricConfigs": [
                {
                    "taskType": "QuestionAndAnswer",
                    "dataset": {
                        "name": "Builtin.BoolQ"
                    },
                    "metricNames": [
                        "Builtin.Accuracy",
                        "Builtin.Robustness"
                    ]
                }
            ]
        }
    }
)

print(job_request)
```
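The `create_evaluation_job` response includes a `jobArn` that you can pass to the `get_evaluation_job` operation to track progress. The following is a sketch of a polling helper; the stub class stands in for the real `boto3.client('bedrock')` purely for illustration.

```python
import time

def wait_for_eval_job(client, job_arn, poll_seconds=60):
    # Poll until the job leaves the InProgress state, then return the
    # terminal status (for example, Completed or Failed).
    while True:
        status = client.get_evaluation_job(jobIdentifier=job_arn)["status"]
        if status != "InProgress":
            return status
        time.sleep(poll_seconds)

# Stand-in for boto3.client('bedrock'), for illustration only: it reports
# InProgress twice, then Completed.
class StubBedrockClient:
    def __init__(self):
        self.calls = 0

    def get_evaluation_job(self, jobIdentifier):
        self.calls += 1
        return {"status": "InProgress" if self.calls < 3 else "Completed"}

print(wait_for_eval_job(StubBedrockClient(), "arn:aws:bedrock:us-west-2:111122223333:evaluation-job/demo", poll_seconds=0))
```

With a real Boto3 client, pass the `jobArn` value from the `create_evaluation_job` response and keep the default polling interval.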

------
#### [ AWS CLI ]

In the AWS CLI, you can use the `help` command to see which parameters are required and which are optional when calling `create-evaluation-job`.

```
aws bedrock create-evaluation-job help
```

```
aws bedrock create-evaluation-job \
--job-name 'automatic-eval-job-cli-001' \
--role-arn 'arn:aws:iam::111122223333:role/role-name' \
--evaluation-config '{"automated": {"datasetMetricConfigs": [{"taskType": "QuestionAndAnswer","dataset": {"name": "Builtin.BoolQ"},"metricNames": ["Builtin.Accuracy","Builtin.Robustness"]}]}}' \
--inference-config '{"models": [{"bedrockModel": {"modelIdentifier":"arn:aws:bedrock:us-west-2::foundation-model/amazon.titan-text-lite-v1","inferenceParams":"{\"inferenceConfig\":{\"maxTokens\": 512,\"temperature\":0.7,\"topP\":0.9}}"}}]}' \
--output-data-config '{"s3Uri":"s3://automatic-eval-jobs/outputs"}'
```

------