invoke_browser
- BedrockAgentCore.Client.invoke_browser(**kwargs)
Invokes an operating system-level action on a browser session in Amazon Bedrock AgentCore. This operation provides direct OS-level control over browser sessions, enabling mouse actions, keyboard input, and screenshots that the WebSocket-based Chrome DevTools Protocol (CDP) cannot handle — such as interacting with print dialogs, context menus, and JavaScript alerts.
You send a request with exactly one action in the BrowserAction union, and receive a corresponding result in the BrowserActionResult union.
The following operations are related to InvokeBrowser:
See also: AWS API Documentation
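The one-action-in, one-result-out contract described above can be sketched as a small wrapper. This is a minimal sketch, not part of the SDK: `invoke_single_action` is a hypothetical helper name, and `client` is assumed to be a BedrockAgentCore client (for example one created with `boto3.client("bedrock-agentcore")`). The helper rejects an action dict that does not have exactly one member before calling the service, then returns the result member that matches the action.

```python
def invoke_single_action(client, browser_identifier, session_id, action):
    """Send exactly one browser action and return its matching result member.

    Hypothetical helper: `client` is assumed to expose `invoke_browser`
    as documented on this page.
    """
    if len(action) != 1:
        raise ValueError(
            f"action must set exactly one member, got {sorted(action)}"
        )
    (action_name,) = action  # the single top-level key, e.g. "mouseMove"
    response = client.invoke_browser(
        browserIdentifier=browser_identifier,
        sessionId=session_id,
        action=action,
    )
    # The result union carries the member that corresponds to the action.
    return response["result"][action_name]


# Example call (needs AWS credentials and an active browser session):
# result = invoke_single_action(
#     client, "<browser-id>", "<session-id>", {"mouseMove": {"x": 100, "y": 200}}
# )
```

Checking the member count locally is a design choice that surfaces mistakes before a network round trip; the service still performs its own validation.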
Request Syntax
response = client.invoke_browser(
    browserIdentifier='string',
    sessionId='string',
    action={
        'mouseClick': {
            'x': 123,
            'y': 123,
            'button': 'LEFT'|'RIGHT'|'MIDDLE',
            'clickCount': 123
        },
        'mouseMove': {
            'x': 123,
            'y': 123
        },
        'mouseDrag': {
            'endX': 123,
            'endY': 123,
            'startX': 123,
            'startY': 123,
            'button': 'LEFT'|'RIGHT'|'MIDDLE'
        },
        'mouseScroll': {
            'x': 123,
            'y': 123,
            'deltaX': 123,
            'deltaY': 123
        },
        'keyType': {
            'text': 'string'
        },
        'keyPress': {
            'key': 'string',
            'presses': 123
        },
        'keyShortcut': {
            'keys': [
                'string',
            ]
        },
        'screenshot': {
            'format': 'PNG'
        }
    }
)
- Parameters:
browserIdentifier (string) –
[REQUIRED]
The unique identifier of the browser associated with the session. This must match the identifier used when creating the session with StartBrowserSession.
sessionId (string) –
[REQUIRED]
The unique identifier of the browser session on which to perform the action. This must be an active session created with StartBrowserSession.
action (dict) –
[REQUIRED]
The browser action to perform. Exactly one member of the BrowserAction union must be set per request.
Note
This is a Tagged Union structure. Only one of the following top level keys can be set: mouseClick, mouseMove, mouseDrag, mouseScroll, keyType, keyPress, keyShortcut, screenshot.
mouseClick (dict) –
Click at the specified coordinates.
x (integer) – [REQUIRED]
The X coordinate on screen where the click occurs.
y (integer) – [REQUIRED]
The Y coordinate on screen where the click occurs.
button (string) –
The mouse button to use. Defaults to LEFT.
clickCount (integer) –
The number of clicks to perform. Valid range: 1–10. Defaults to 1.
mouseMove (dict) –
Move the cursor to the specified coordinates.
x (integer) – [REQUIRED]
The target X coordinate on screen.
y (integer) – [REQUIRED]
The target Y coordinate on screen.
mouseDrag (dict) –
Drag from a start position to an end position.
endX (integer) – [REQUIRED]
The ending X coordinate for the drag.
endY (integer) – [REQUIRED]
The ending Y coordinate for the drag.
startX (integer) – [REQUIRED]
The starting X coordinate for the drag.
startY (integer) – [REQUIRED]
The starting Y coordinate for the drag.
button (string) –
The mouse button to use for the drag. Defaults to LEFT.
mouseScroll (dict) –
Scroll at the specified position.
x (integer) – [REQUIRED]
The X coordinate on screen where the scroll occurs.
y (integer) – [REQUIRED]
The Y coordinate on screen where the scroll occurs.
deltaX (integer) –
The horizontal scroll delta. Valid range: -1000 to 1000.
deltaY (integer) –
The vertical scroll delta. Valid range: -1000 to 1000. Negative values scroll down.
keyType (dict) –
Type a string of text.
text (string) – [REQUIRED]
The text string to type. Maximum length: 10,000 characters.
keyPress (dict) –
Press a key one or more times.
key (string) – [REQUIRED]
The key name to press (for example, enter, tab, escape).
presses (integer) –
The number of times to press the key. Valid range: 1–100. Defaults to 1.
keyShortcut (dict) –
Press a key combination.
keys (list) – [REQUIRED]
The key combination to press (for example, ["ctrl", "s"]). Maximum 5 keys.
(string) –
screenshot (dict) –
Capture a full-screen screenshot.
format (string) –
The image format for the screenshot. Defaults to PNG.
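The documented limits on the action members (click count, scroll deltas, text length, key presses, shortcut size) can be checked client-side before a request is sent. The builders below are a hedged sketch: every function name is hypothetical and none of them is part of boto3. Each one returns a dict suitable for the `action` parameter, with exactly one top-level key set.

```python
MOUSE_BUTTONS = {"LEFT", "RIGHT", "MIDDLE"}


def _check_range(name, value, low, high):
    """Return value if low <= value <= high, else raise ValueError."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value


def mouse_click(x, y, button="LEFT", click_count=1):
    if button not in MOUSE_BUTTONS:
        raise ValueError(f"button must be one of {sorted(MOUSE_BUTTONS)}")
    return {"mouseClick": {
        "x": x, "y": y, "button": button,
        "clickCount": _check_range("clickCount", click_count, 1, 10),
    }}


def mouse_scroll(x, y, delta_x=None, delta_y=None):
    args = {"x": x, "y": y}
    # The deltas are optional, so they are only sent when given.
    if delta_x is not None:
        args["deltaX"] = _check_range("deltaX", delta_x, -1000, 1000)
    if delta_y is not None:
        args["deltaY"] = _check_range("deltaY", delta_y, -1000, 1000)
    return {"mouseScroll": args}


def key_type(text):
    if len(text) > 10_000:
        raise ValueError("text must be at most 10,000 characters")
    return {"keyType": {"text": text}}


def key_press(key, presses=1):
    return {"keyPress": {
        "key": key,
        "presses": _check_range("presses", presses, 1, 100),
    }}


def key_shortcut(*keys):
    if len(keys) > 5:
        raise ValueError("a shortcut may contain at most 5 keys")
    return {"keyShortcut": {"keys": list(keys)}}
```

A call such as `client.invoke_browser(browserIdentifier=..., sessionId=..., action=key_shortcut("ctrl", "s"))` would then pass the validated dict straight through.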
- Return type:
dict
- Returns:
Response Syntax
{
    'result': {
        'mouseClick': {
            'status': 'SUCCESS'|'FAILED',
            'error': 'string'
        },
        'mouseMove': {
            'status': 'SUCCESS'|'FAILED',
            'error': 'string'
        },
        'mouseDrag': {
            'status': 'SUCCESS'|'FAILED',
            'error': 'string'
        },
        'mouseScroll': {
            'status': 'SUCCESS'|'FAILED',
            'error': 'string'
        },
        'keyType': {
            'status': 'SUCCESS'|'FAILED',
            'error': 'string'
        },
        'keyPress': {
            'status': 'SUCCESS'|'FAILED',
            'error': 'string'
        },
        'keyShortcut': {
            'status': 'SUCCESS'|'FAILED',
            'error': 'string'
        },
        'screenshot': {
            'status': 'SUCCESS'|'FAILED',
            'error': 'string',
            'data': b'bytes'
        }
    },
    'sessionId': 'string'
}
Response Structure
(dict) –
Response for the InvokeBrowser operation.
result (dict) –
The result of the browser action. The member set in the result corresponds to the action that was performed.
Note
This is a Tagged Union structure. Only one of the following top level keys will be set: mouseClick, mouseMove, mouseDrag, mouseScroll, keyType, keyPress, keyShortcut, screenshot. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
mouseClick (dict) –
The result of a mouse click action.
status (string) –
The status of the action execution.
error (string) –
The error message. Present only when the action failed.
mouseMove (dict) –
The result of a mouse move action.
status (string) –
The status of the action execution.
error (string) –
The error message. Present only when the action failed.
mouseDrag (dict) –
The result of a mouse drag action.
status (string) –
The status of the action execution.
error (string) –
The error message. Present only when the action failed.
mouseScroll (dict) –
The result of a mouse scroll action.
status (string) –
The status of the action execution.
error (string) –
The error message. Present only when the action failed.
keyType (dict) –
The result of a key type action.
status (string) –
The status of the action execution.
error (string) –
The error message. Present only when the action failed.
keyPress (dict) –
The result of a key press action.
status (string) –
The status of the action execution.
error (string) –
The error message. Present only when the action failed.
keyShortcut (dict) –
The result of a key shortcut action.
status (string) –
The status of the action execution.
error (string) –
The error message. Present only when the action failed.
screenshot (dict) –
The result of a screenshot action.
status (string) –
The status of the action execution.
error (string) –
The error message. Present only when the action failed.
data (bytes) –
The base64-encoded image data. Present only when the action succeeded.
sessionId (string) –
The unique identifier of the browser session on which the action was performed.
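Reading the response structure above comes down to finding the single member set in `result`, checking its `status`, and handling the `SDK_UNKNOWN_MEMBER` case. The following is a minimal sketch with hypothetical helper names (`unpack_result`, `save_screenshot`). It assumes that boto3 hands the `data` blob back as raw `bytes`, so the bytes are written unchanged; if the value you receive is still base64 text, decode it with `base64.b64decode` first.

```python
from pathlib import Path


def unpack_result(response):
    """Return (action_name, payload) for the single member set in `result`.

    Raises RuntimeError when the action failed or the member is unknown.
    """
    result = response["result"]
    if len(result) != 1:
        raise RuntimeError(f"expected one result member, got {sorted(result)}")
    ((action_name, payload),) = result.items()
    if action_name == "SDK_UNKNOWN_MEMBER":
        raise RuntimeError(f"unknown result member: {payload.get('name')}")
    if payload.get("status") != "SUCCESS":
        # `error` is present only when the action failed.
        raise RuntimeError(
            f"{action_name} failed: {payload.get('error', 'no error message')}"
        )
    return action_name, payload


def save_screenshot(response, path):
    """Write the screenshot bytes to `path` and return the byte count."""
    action_name, payload = unpack_result(response)
    if action_name != "screenshot":
        raise ValueError(f"response holds a {action_name} result, not a screenshot")
    data = payload["data"]  # assumed to be raw image bytes
    Path(path).write_bytes(data)
    return len(data)
```

Raising on `FAILED` keeps the calling code linear; a caller that prefers to inspect failures can read `response["result"]` directly instead.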
Exceptions
BedrockAgentCore.Client.exceptions.ServiceQuotaExceededException
BedrockAgentCore.Client.exceptions.AccessDeniedException
BedrockAgentCore.Client.exceptions.ValidationException
BedrockAgentCore.Client.exceptions.ResourceNotFoundException
BedrockAgentCore.Client.exceptions.ThrottlingException
BedrockAgentCore.Client.exceptions.InternalServerException
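Of the exceptions listed above, `ThrottlingException` is the one most naturally handled by retrying. The sketch below shows one possible approach and is not an SDK feature: `invoke_with_retry` is a hypothetical helper, the attempt count and delays are arbitrary defaults, and boto3 clients also apply their own configurable retry behavior independently of this wrapper. It catches the exception through `client.exceptions`, as named in the list, and lets every other error propagate.

```python
import time


def invoke_with_retry(client, max_attempts=3, base_delay=0.5,
                      sleep=time.sleep, **request):
    """Call invoke_browser, retrying only when the call is throttled.

    `request` holds the invoke_browser keyword arguments. `sleep` is
    injectable so the backoff can be observed without real waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return client.invoke_browser(**request)
        except client.exceptions.ThrottlingException:
            if attempt == max_attempts:
                raise  # out of attempts: surface the throttling error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Errors such as `ValidationException` or `ResourceNotFoundException` are deliberately not retried here, since repeating the same request would presumably fail the same way.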