Amazon Bedrock AgentCore is in preview release and is subject to change.
Invoke an AgentCore Runtime agent
The InvokeAgentRuntime operation lets you send requests to specific AgentCore Runtime
endpoints identified by their Amazon Resource Name (ARN) and receive streaming
responses containing the agent's output. The API supports session management through session
identifiers, enabling you to maintain conversation context across multiple interactions. You
can target specific agent endpoints using optional qualifiers.
To call InvokeAgentRuntime, you need the bedrock-agentcore:InvokeAgentRuntime
permission. In the call you can also pass a bearer token that the agent can use
for user authentication.
The InvokeAgentRuntime operation accepts your request payload as binary data
up to 100 MB in size and returns a streaming response that delivers chunks of data in
real-time as the agent processes your request. This streaming approach allows you to receive
partial results immediately rather than waiting for the complete response, making it ideal
for interactive applications.
If you plan to integrate your agent with OAuth, you can't use the AWS SDK to call
InvokeAgentRuntime. Instead, make an HTTPS request to InvokeAgentRuntime. For more
information, see Authenticate and authorize with Inbound Auth and Outbound Auth.
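As a rough illustration of the HTTPS path, the following sketch builds (but does not send) such a request using only the Python standard library. The endpoint host pattern, the /runtimes/{arn}/invocations path, the qualifier query parameter, and the session header name are assumptions and are not stated on this page, so verify them against the InvokeAgentRuntime API reference before relying on them. The function name build_invoke_request is hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_invoke_request(region, agent_arn, session_id, bearer_token,
                         prompt, qualifier="DEFAULT"):
    """Build an HTTPS request for InvokeAgentRuntime without sending it.

    ASSUMPTIONS (not confirmed by this page): the endpoint host, the URL path,
    the 'qualifier' query parameter, and the session header name.
    """
    # The ARN contains ':' and '/', so percent-encode it for use in the path.
    escaped_arn = urllib.parse.quote(agent_arn, safe="")
    query = urllib.parse.urlencode({"qualifier": qualifier})
    url = (
        f"https://bedrock-agentcore.{region}.amazonaws.com"
        f"/runtimes/{escaped_arn}/invocations?{query}"
    )
    headers = {
        # The OAuth access token is passed as a bearer token.
        "Authorization": f"Bearer {bearer_token}",
        "Content-Type": "application/json",
        "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id": session_id,
    }
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

You could then pass the returned object to urllib.request.urlopen and read the response body incrementally, in the same way as the streaming response returned by the SDK.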
Invoke streaming agents
The following example shows how to use boto3 to invoke an agent runtime:
import json
import uuid

import boto3

# Placeholder inputs: replace these with your own values
agent_arn = "<your-agent-runtime-arn>"
session_id = str(uuid.uuid4())  # reuse this value to continue a conversation
prompt = "Hello"

# Initialize the Bedrock AgentCore client
agent_core_client = boto3.client('bedrock-agentcore')

# Prepare the payload
payload = json.dumps({"prompt": prompt}).encode()

# Invoke the agent
response = agent_core_client.invoke_agent_runtime(
    agentRuntimeArn=agent_arn,
    runtimeSessionId=session_id,
    payload=payload
)

# Process and print the response
if "text/event-stream" in response.get("contentType", ""):
    # Handle streaming response
    content = []
    for line in response["response"].iter_lines(chunk_size=10):
        if line:
            line = line.decode("utf-8")
            # Strip the server-sent events "data: " prefix
            if line.startswith("data: "):
                line = line[6:]
                print(line)
                content.append(line)
    print("\nComplete response:", "\n".join(content))
elif response.get("contentType") == "application/json":
    # Handle standard JSON response
    content = []
    for chunk in response.get("response", []):
        content.append(chunk.decode('utf-8'))
    print(json.loads(''.join(content)))
else:
    # Print raw response for other content types
    print(response)
Invoke multi-modal agents
You can use the InvokeAgentRuntime operation to send multi-modal requests
that include both text and images. The following example shows how to invoke a
multi-modal agent:
import base64
import json
import uuid

import boto3

# Placeholder inputs: replace these with your own values
agent_arn = "<your-agent-runtime-arn>"
session_id = str(uuid.uuid4())

# Initialize the Bedrock AgentCore client
agent_core_client = boto3.client('bedrock-agentcore')

# Read and encode image
with open("image.jpg", "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode('utf-8')

# Prepare multi-modal payload
payload = json.dumps({
    "prompt": "Describe what you see in this image",
    "media": {
        "type": "image",
        "format": "jpeg",
        "data": image_data
    }
}).encode()

# Invoke the agent
response = agent_core_client.invoke_agent_runtime(
    agentRuntimeArn=agent_arn,
    runtimeSessionId=session_id,
    payload=payload
)
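Because Base64 encoding increases the size of binary data by about one third, a large image can push the payload toward the 100 MB limit. The following sketch wraps the payload construction in a helper that checks the encoded size before you send it. The field names mirror the multi-modal example above; the helper name build_multimodal_payload and the interpretation of 100 MB as 100 * 1024 * 1024 bytes are assumptions.

```python
import base64
import json

# ASSUMPTION: "100 MB" is interpreted here as 100 * 1024 * 1024 bytes.
MAX_PAYLOAD_BYTES = 100 * 1024 * 1024


def build_multimodal_payload(prompt, image_bytes, image_format="jpeg",
                             max_bytes=MAX_PAYLOAD_BYTES):
    """Return the encoded JSON payload, or raise ValueError if it is too large."""
    payload = json.dumps({
        "prompt": prompt,
        "media": {
            "type": "image",
            "format": image_format,
            # Base64 output is roughly 4/3 the size of the raw bytes.
            "data": base64.b64encode(image_bytes).decode("utf-8"),
        },
    }).encode("utf-8")
    if len(payload) > max_bytes:
        raise ValueError(
            f"Payload is {len(payload)} bytes, which exceeds the "
            f"{max_bytes}-byte limit"
        )
    return payload
```

The returned bytes can be passed directly as the payload argument of invoke_agent_runtime.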
Session management
The InvokeAgentRuntime operation supports session management through the
runtimeSessionId parameter. By providing the same session identifier
across multiple requests, you can maintain conversation context, allowing the agent to
reference previous interactions.
To start a new conversation, generate a unique session identifier. To continue an existing conversation, use the same session identifier from previous requests. This approach enables you to build interactive applications that maintain context over time.
Tip
For best results, use a UUID or other unique identifier for your session IDs to avoid collisions between different users or conversations.
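One way to apply this is to keep the session identifier alongside the client so that every call in a conversation reuses it. The following is a minimal sketch under that assumption; the Conversation class is hypothetical, and client is expected to be a boto3 bedrock-agentcore client (or any object with a compatible invoke_agent_runtime method).

```python
import json
import uuid


class Conversation:
    """Reuses one runtimeSessionId across calls so the agent keeps context."""

    def __init__(self, client, agent_arn, session_id=None):
        self.client = client
        self.agent_arn = agent_arn
        # A new UUID starts a new conversation; pass an existing ID to resume one.
        self.session_id = session_id or str(uuid.uuid4())

    def send(self, prompt):
        """Invoke the agent with the conversation's session ID."""
        return self.client.invoke_agent_runtime(
            agentRuntimeArn=self.agent_arn,
            runtimeSessionId=self.session_id,
            payload=json.dumps({"prompt": prompt}).encode("utf-8"),
        )
```

Creating a second Conversation object without a session_id argument starts an independent conversation with its own identifier.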
Error handling
When using the InvokeAgentRuntime operation, you might encounter various
errors. Here are some common errors and how to handle them:
- ValidationException: Occurs when the request parameters are invalid. Check that your agent ARN, session ID, and payload are correctly formatted.
- ResourceNotFoundException: Occurs when the specified agent runtime cannot be found. Verify that the agent ARN is correct and that the agent exists in your AWS account.
- AccessDeniedException: Occurs when you don't have the necessary permissions. Ensure that your IAM policy includes the bedrock-agentcore:InvokeAgentRuntime permission.
- ThrottlingException: Occurs when you exceed the request rate limits. Implement exponential backoff and retry logic in your application.
Implement proper error handling in your application to provide a better user experience and to troubleshoot issues effectively.
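The retry advice for ThrottlingException can be sketched as follows. This is an illustrative helper, not part of the SDK: the names error_code and invoke_with_backoff, the attempt count, and the delay values are assumptions. A botocore ClientError carries its error code at response['Error']['Code']; the sketch reads that field by duck typing so that it runs without AWS dependencies.

```python
import random
import time


def error_code(exc):
    """Return the AWS error code carried by a botocore ClientError, else None."""
    response = getattr(exc, "response", None)
    if isinstance(response, dict):
        return response.get("Error", {}).get("Code")
    return None


def invoke_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0,
                        sleep=time.sleep):
    """Run call(), retrying with exponential backoff only when throttled."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            is_last_attempt = attempt == max_attempts - 1
            # Other errors (validation, access denied, not found) are not
            # fixed by retrying, so re-raise them immediately.
            if error_code(exc) != "ThrottlingException" or is_last_attempt:
                raise
            # The delay ceiling doubles each attempt; random jitter spreads
            # out retries from concurrent callers.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

For example, you might wrap the SDK call in a lambda: invoke_with_backoff(lambda: agent_core_client.invoke_agent_runtime(agentRuntimeArn=agent_arn, runtimeSessionId=session_id, payload=payload)).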
Best practices
Follow these best practices when using the InvokeAgentRuntime operation:
- Use session management to maintain conversation context for a better user experience.
- Process streaming responses incrementally to provide real-time feedback to users.
- Implement proper error handling and retry logic for a robust application.
- Consider payload size limitations (100 MB) when sending requests, especially for multi-modal content.
- Use appropriate qualifiers to target specific agent versions or endpoints.
- Implement authentication mechanisms when necessary using bearer tokens.