

本文為英文版的機器翻譯版本，如內容有任何歧義或不一致之處，概以英文版為準。

# 處理對話中來自代理程式的電腦使用工具請求
<a name="agent-computer-use-handle-tools"></a>

當您的代理程式請求工具時，對 InvokeAgent API 操作的回應會包含 `returnControl` 承載，並在 invocationInputs 中包含要使用的工具和工具動作。如需將控制權傳回給代理程式開發人員的詳細資訊，請參閱[透過在 InvokeAgent 回應中傳送取得的資訊，將控制權傳回給代理程式開發人員](agents-returncontrol.md)。

**Topics**
+ [傳回控制權範例](#agent-computer-use-tool-request-format)
+ [剖析工具請求的程式碼範例](#agent-computer-use-implementation-example)

## 傳回控制權範例
<a name="agent-computer-use-tool-request-format"></a>

以下是 `returnControl` 承載範例，其中包含使用 `ANTHROPIC.Computer` 工具搭配 `screenshot` 動作的請求。

```
{
    "returnControl": {
        "invocationId": "invocationIdExample",
        "invocationInputs": [{
            "functionInvocationInput": {
                "actionGroup": "my_computer",
                "actionInvocationType": "RESULT",
                "agentId": "agentIdExample",
                "function": "computer",
                "parameters": [{
                    "name": "action",
                    "type": "string",
                    "value": "screenshot"
                }]
            }
        }]
    }
}
```

## 剖析工具請求的程式碼範例
<a name="agent-computer-use-implementation-example"></a>

下列程式碼示範如何在 InvokeAgent 回應中擷取電腦使用工具選擇、將其映射至不同工具的模擬工具實作，然後在後續 InvokeAgent 請求中傳送工具使用的結果。
+ `manage_computer_interaction` 函數會執行迴圈，呼叫 InvocationAgent API 操作並剖析回應，直到沒有任務要完成為止。剖析回應時，它會從 `returnControl` 承載擷取要使用的任何工具，並傳遞 `handle_computer_action` 函數。
+ `handle_computer_action` 會將函數名稱映射至四個動作的模擬實作。如需工具實作範例，請參閱 Anthropic GitHub 儲存庫中的 [computer-use-demo](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo/computer_use_demo/)。

如需電腦使用工具的詳細資訊，包括實作範例和工具描述，請參閱 Anthropic 文件中的[電腦使用 (測試版)](https://docs.anthropic.com/en/docs/agents-and-tools/computer-use)。

```
import boto3
from botocore.exceptions import ClientError
import json


def handle_computer_action(action_params):
    """
    Maps computer actions, like taking screenshots and moving the mouse to mock implementations and returns
    the result.

    Args:
        action_params (dict): Dictionary containing the action parameters
            Keys:
                - action (str, required): The type of action to perform (for example 'screenshot' or 'mouse_move')
                - coordinate (str, optional): JSON string containing [x,y] coordinates for mouse_move

    Returns:
        dict: Response containing the action result.
    """

    action = action_params.get('action')
    if action == 'screenshot':
        # Mock screenshot response
        with open("mock_screenshot.png", 'rb') as image_file:
            image_bytes = image_file.read()
        return {
            "IMAGES": {
                "images": [
                    {
                        "format": "png",
                        "source": {
                            "bytes": image_bytes
                        },
                    }
                ]
            }
        }
    elif action == 'mouse_move':
        # Mock mouse movement
        coordinate = json.loads(action_params.get('coordinate', '[0, 0]'))
        return {
            "TEXT": {
                "body": f"Mouse moved to coordinates {coordinate}"
            }
        }
    elif action == 'left_click':
        # Mock mouse left click
        return {
            "TEXT": {
                "body": f"Mouse left clicked"
            }
        }
    elif action == 'right_click':
        # Mock mouse right click
        return {
            "TEXT": {
                "body": f"Mouse right clicked"
            }
        }

    ### handle additional actions here


def manage_computer_interaction(bedrock_agent_runtime_client, agent_id, alias_id):
    """
    Manages interaction between an Amazon Bedrock agent and computer use functions.

    Args:
        bedrock_agent_runtime_client: Boto3 client for Bedrock agent runtime
        agent_id (str): The ID of the agent
        alias_id (str): The Alias ID of the agent

    The function:
    - Initiates a session with initial prompt
    - Makes agent requests with appropriate parameters
    - Processes response chunks and return control events
    - Handles computer actions via handle_computer_action()
    - Continues interaction until task completion
    """
    session_id = "session123"
    initial_prompt = "Open a browser and go to a website"
    computer_use_results = None
    current_prompt = initial_prompt

    while True:
        # Make agent request with appropriate parameters
        invoke_params = {
            "agentId": agent_id,
            "sessionId": session_id,
            "inputText": current_prompt,
            "agentAliasId": alias_id,
        }

        # Include session state if we have results from previous iteration
        if computer_use_results:
            invoke_params["sessionState"] = computer_use_results["sessionState"]

        try:
            response = bedrock_agent_runtime_client.invoke_agent(**invoke_params)
        except ClientError as e:
            print(f"Error: {e}")

        has_return_control = False

        # Process the response
        for event in response.get('completion'):
            if 'chunk' in event:
                chunk_content = event['chunk'].get('bytes', b'').decode('utf-8')
                if chunk_content:
                    print("\nAgent:", chunk_content)

            if 'returnControl' in event:
                has_return_control = True
                invocationId = event["returnControl"]["invocationId"]
                if "invocationInputs" in event["returnControl"]:
                    for invocationInput in event["returnControl"]["invocationInputs"]:
                        func_input = invocationInput["functionInvocationInput"]

                        # Extract action parameters
                        params = {p['name']: p['value'] for p in func_input['parameters']}

                        # Handle computer action and get result
                        action_result = handle_computer_action(params)

                        # Print action result for testing
                        print("\nExecuting function:", func_input['function'])
                        print("Parameters:", params)

                        # Prepare the session state for the next request
                        computer_use_results = {
                            "sessionState": {
                                "invocationId": invocationId,
                                "returnControlInvocationResults": [{
                                    "functionResult": {
                                        "actionGroup": func_input['actionGroup'],
                                        "responseState": "REPROMPT",
                                        "agentId": func_input['agentId'],
                                        "function": func_input['function'],
                                        "responseBody": action_result
                                    }
                                }]
                            }
                        }

        # If there's no return control event, the task is complete
        if not has_return_control:
            print("\nTask completed!")
            break

        # Use empty string as prompt for subsequent iterations
        current_prompt = ""
def main():
    bedrock_agent_runtime_client = boto3.client(service_name="bedrock-agent-runtime",
                                         region_name="REGION"
                                         )

    agent_id = "AGENT_ID"
    alias_id = "ALIAS_ID"

    manage_computer_interaction(bedrock_agent_runtime_client, agent_id, alias_id)


if __name__ == "__main__":
    main()
```

輸出格式應類似以下內容：

```
Executing function: computer
Parameters: {'action': 'screenshot'}

Executing function: computer
Parameters: {'coordinate': '[467, 842]', 'action': 'mouse_move'}

Executing function: computer
Parameters: {'action': 'left_click'}

Agent: I've opened Firefox browser. Which website would you like to visit?

Task completed!
```