# Code example for batch inference
The code example in this chapter shows how to create a batch inference job, view information about it, and stop it. This example uses the InvokeModel API format. For information about using the Converse API format, see Format and upload your batch inference data.
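Each line of a batch input file is a standalone JSON record in the InvokeModel request format of the chosen model. As a minimal sketch of producing such a file with the standard library (the file name, transcript text, and record count here are illustrative; a real job needs at least the minimum number of records required by the batch inference quota):

```python
import json

# Illustrative transcripts; replace with your real data.
transcripts = [
    "Agent: Hello, how can I help you today? ...",
    "Agent: Thanks for calling, what seems to be the problem? ...",
]

with open("abc.jsonl", "w") as f:
    for i, text in enumerate(transcripts, start=1):
        # One InvokeModel-format record per line, keyed by a unique recordId.
        record = {
            "recordId": f"CALL{i:07d}",
            "modelInput": {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": 1024,
                "messages": [
                    {
                        "role": "user",
                        "content": [
                            {
                                "type": "text",
                                "text": f"Summarize the following call transcript: {text}",
                            }
                        ],
                    }
                ],
            },
        }
        f.write(json.dumps(record) + "\n")
```

Because each line is an independent JSON object, records can be appended incrementally until the file reaches the required minimum.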
1. Create a JSONL file named `abc.jsonl` and include one JSON object per record. The file must contain at least the minimum number of records (see the **Minimum number of records per batch inference job for {Model}** quota in Quotas for Amazon Bedrock). In this example, you'll use the Anthropic Claude 3 Haiku model. The following example shows the first input JSON in the file:

   ```json
   {
       "recordId": "CALL0000001",
       "modelInput": {
           "anthropic_version": "bedrock-2023-05-31",
           "max_tokens": 1024,
           "messages": [
               {
                   "role": "user",
                   "content": [
                       {
                           "type": "text",
                           "text": "Summarize the following call transcript: ..."
                       }
                   ]
               }
           ]
       }
   }
   ...
   ```

   Add records to the file until you reach the minimum.

2. Create an S3 bucket called `amzn-s3-demo-bucket-input` and upload the file to it. Then create an S3 bucket called `amzn-s3-demo-bucket-output` to write your output files to. Run the following code snippet to submit a job and get the `jobArn` from the response:

   ```python
   import boto3

   bedrock = boto3.client(service_name="bedrock")

   inputDataConfig = {
       "s3InputDataConfig": {
           "s3Uri": "s3://amzn-s3-demo-bucket-input/abc.jsonl"
       }
   }

   outputDataConfig = {
       "s3OutputDataConfig": {
           "s3Uri": "s3://amzn-s3-demo-bucket-output/"
       }
   }

   response = bedrock.create_model_invocation_job(
       roleArn="arn:aws:iam::123456789012:role/MyBatchInferenceRole",
       modelId="anthropic.claude-3-haiku-20240307-v1:0",
       jobName="my-batch-job",
       inputDataConfig=inputDataConfig,
       outputDataConfig=outputDataConfig
   )

   jobArn = response.get('jobArn')
   ```

3. Return the `status` of the job.

   ```python
   bedrock.get_model_invocation_job(jobIdentifier=jobArn)['status']
   ```

4. List batch inference jobs whose status is `Failed`.

   ```python
   bedrock.list_model_invocation_jobs(
       maxResults=10,
       statusEquals="Failed",
       sortOrder="Descending"
   )
   ```

5. Stop the job that you started.

   ```python
   bedrock.stop_model_invocation_job(jobIdentifier=jobArn)
   ```
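In practice you usually want to poll the job status until it finishes rather than checking it once. The helper below is a sketch, not part of the Bedrock API; the set of statuses treated as terminal is an assumption based on the job statuses documented for batch inference and may need adjusting:

```python
import time

# Assumption: these statuses mean the job will not change state again.
TERMINAL_STATUSES = {"Completed", "PartiallyCompleted", "Failed", "Stopped", "Expired"}

def wait_for_job(client, job_arn, poll_seconds=60, timeout_seconds=24 * 60 * 60):
    """Poll a batch inference job until it reaches a terminal status.

    `client` is a boto3 Bedrock client; this is a hypothetical helper,
    not a Bedrock operation.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = client.get_model_invocation_job(jobIdentifier=job_arn)["status"]
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_arn} did not finish within {timeout_seconds} seconds")
```

Batch jobs can take hours, so choose a generous timeout and a polling interval that stays well under your API rate limits.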