Include a guardrail with the Converse API
You can use a guardrail to guard conversational apps that you create with the Converse API. For example, if you create a chat app with the Converse API, you can use a guardrail to block inappropriate content entered by the user and inappropriate content generated by the model. For information about the Converse API, see Carry out a conversation with the Converse API operations.
Call the Converse API with guardrails
To use a guardrail, you include configuration information for the guardrail in calls to the Converse or ConverseStream (for streaming responses) operations. Optionally, you can select specific content in the message that you want the guardrail to assess. For information about the models that you can use with guardrails and the Converse API, see Supported models and model features.
Configure a guardrail to work with the Converse API
You specify guardrail configuration information in the guardrailConfig input parameter. The configuration includes the ID and the version of the guardrail that you want to use. You can also enable tracing for the guardrail, which provides information about the content that the guardrail blocked.
With the Converse operation, guardrailConfig is a GuardrailConfiguration object, as shown in the following example.
{ "guardrailIdentifier": "
Guardrail ID
", "guardrailVersion": "Guardrail version
", "trace": "enabled" }
If you use ConverseStream, you pass a GuardrailStreamConfiguration object. Optionally, you can use the streamProcessingMode field to specify that you want the model to complete the guardrail assessment before returning streaming response chunks. Or, you can have the model respond asynchronously while the guardrail continues its assessment in the background. For more information, see Configure streaming response behavior to filter content.
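The following sketch shows one way to set this from Python with boto3. It assumes the same placeholder guardrail and model IDs as before and requests asynchronous processing so that chunks stream while the assessment runs.

# A minimal sketch, assuming boto3 and placeholder IDs. "async" returns chunks
# while the guardrail assessment runs in the background; "sync" (the default)
# completes the assessment before chunks are returned.
import boto3

client = boto3.client("bedrock-runtime")

response = client.converse_stream(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model
    messages=[{"role": "user", "content": [{"text": "Create a playlist of 2 pop songs."}]}],
    guardrailConfig={
        "guardrailIdentifier": "abc1234567",  # placeholder guardrail ID
        "guardrailVersion": "1",              # placeholder guardrail version
        "trace": "enabled",
        "streamProcessingMode": "async",      # or "sync"
    },
)

for event in response["stream"]:
    if "contentBlockDelta" in event:
        print(event["contentBlockDelta"]["delta"]["text"], end="")
    elif "metadata" in event:
        # The guardrail trace, when enabled, arrives in the final metadata event.
        print("\n", event["metadata"].get("trace"))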
Evaluate only specific content in a message
When you pass a Message to a model, your guardrail assesses the content in the message. You can also assess specific parts of a message by using the guardContent (GuardrailConverseContentBlock) field.
Tip
Using the guardContent field is similar to using input tags with InvokeModel and InvokeModelWithResponseStream. For more information, see Apply tags to user input to filter content.
For example, in the following message list, the guardrail evaluates only the content in the guardContent field and not the rest of the message. This is useful when you want the guardrail to assess only the most recent message in a conversation.
[ { "role": "user", "content": [ { "text": "Create a playlist of 2 pop songs." } ] }, { "role": "assistant", "content": [ { "text": "Sure! Here are two pop songs:\n1. \"Bad Habits\" by Ed Sheeran\n2. \"All Of The Lights\" by Kanye West\n\nWould you like to add any more songs to this playlist?" } ] }, { "role": "user", "content": [ { "guardContent": { "text": { "text": "Create a playlist of 2 heavy metal songs." } } } ] } ]
Another use case for guardContent is providing additional context for a message without having your guardrail assess that context. In the following example, the guardrail assesses only "Create a playlist of heavy metal songs." and ignores "Only answer with a list of songs.".
messages = [
    {
        "role": "user",
        "content": [
            {"text": "Only answer with a list of songs."},
            {
                "guardContent": {
                    "text": {"text": "Create a playlist of heavy metal songs."}
                }
            },
        ],
    }
]
If content isn't in a guardContent block, that doesn't necessarily mean it won't be evaluated. This behavior depends on which filtering policies the guardrail uses.
The following example shows two guardContent blocks with contextual grounding checks (based on the qualifiers fields). The contextual grounding checks in the guardrail will evaluate only the content in these blocks. However, if the guardrail also has a word filter that blocks the word "background", the text "Some additional background information." will still be evaluated, even though it's not in a guardContent block.
[{ "role": "user", "content": [{ "guardContent": { "text": { "text": "London is the capital of UK. Tokyo is the capital of Japan.", "qualifiers": ["grounding_source"] } } }, { "text": "Some additional background information." }, { "guardContent": { "text": { "text": "What is the capital of Japan?", "qualifiers": ["query"] } } } ] }]
Guarding a system prompt sent to the Converse API
You can use guardrails with system prompts that you send to the Converse API. To guard a system prompt, specify the guardContent (SystemContentBlock) field in the system prompt that you pass to the API, as shown in the following example.
[ { "guardContent": { "text": { "text": "Only respond with Welsh heavy metal songs." } } } ]
If you don't provide the guardContent field, the guardrail doesn't assess the system prompt message.
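Putting this together in Python, a call that guards the system prompt might look like the following sketch (again with placeholder IDs).

# A minimal sketch, assuming boto3 and placeholder IDs. The guardContent block
# in the system list is the only part of the system prompt the guardrail assesses.
import boto3

client = boto3.client("bedrock-runtime")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model
    system=[
        {"guardContent": {"text": {"text": "Only respond with Welsh heavy metal songs."}}}
    ],
    messages=[{"role": "user", "content": [{"text": "Create a playlist of 2 songs."}]}],
    guardrailConfig={
        "guardrailIdentifier": "abc1234567",  # placeholder guardrail ID
        "guardrailVersion": "1",              # placeholder guardrail version
    },
)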
Message and system prompt guardrail behavior
The guardrail assesses the guardContent field differently depending on whether it appears in the system prompt or in the messages, as shown in the following table.
| | System prompt has guardrail block | System prompt doesn't have guardrail block |
|---|---|---|
| Messages have guardrail block | System: Guardrail investigates content in guardrail block. Messages: Guardrail investigates content in guardrail block. | System: Guardrail investigates nothing. Messages: Guardrail investigates content in guardrail block. |
| Messages don't have guardrail block | System: Guardrail investigates content in guardrail block. Messages: Guardrail investigates everything. | System: Guardrail investigates nothing. Messages: Guardrail investigates everything. |
Processing the response when using the Converse API
When you call the Converse operation, the guardrail assesses the message that you send. If the guardrail detects blocked content, the following happens.
- The stopReason field in the response is set to guardrail_intervened.
- If you enabled tracing, the trace is available in the trace (ConverseTrace) field. With ConverseStream, the trace is in the metadata (ConverseStreamMetadataEvent) that the operation returns.
- The blocked content text that you configured in the guardrail is returned in the output (ConverseOutput) field. With ConverseStream, the blocked content text is in the streamed message.
The following partial response shows the blocked content text and the trace from the guardrail assessment. The guardrail blocked the Heavy metal topic in the message.
{ "output": { "message": { "role": "assistant", "content": [ { "text": "Sorry, I can't answer questions about heavy metal music." } ] } }, "stopReason": "guardrail_intervened", "usage": { "inputTokens": 0, "outputTokens": 0, "totalTokens": 0 }, "metrics": { "latencyMs": 721 }, "trace": { "guardrail": { "inputAssessment": { "3o06191495ze": { "topicPolicy": { "topics": [ { "name": "Heavy metal", "type": "DENY", "action": "BLOCKED" } ] }, "invocationMetrics": { "guardrailProcessingLatency": 240, "usage": { "topicPolicyUnits": 1, "contentPolicyUnits": 0, "wordPolicyUnits": 0, "sensitiveInformationPolicyUnits": 0, "sensitiveInformationPolicyFreeUnits": 0, "contextualGroundingPolicyUnits": 0 }, "guardrailCoverage": { "textCharacters": { "guarded": 39, "total": 72 } } } } } } } }
Code example for using Converse API with guardrails
This example shows how to guard a conversation with the Converse and ConverseStream operations. The example shows how to prevent a model from creating a playlist that includes songs from the heavy metal genre.
To guard a conversation

1. Create a guardrail by following the instructions at Create your guardrail.
   - Name – Enter Heavy metal.
   - Definition for topic – Enter Avoid mentioning songs that are from the heavy metal genre of music.
   - Add sample phrases – Enter Create a playlist of heavy metal songs.

   In step 9, enter the following:
   - Messaging shown for blocked prompts – Enter Sorry, I can't answer questions about heavy metal music.
   - Messaging for blocked responses – Enter Sorry, the model generated an answer that mentioned heavy metal music.

   You can configure other guardrail options, but that isn't required for this example.

2. Create a version of the guardrail by following the instructions at Create a version of a guardrail.

3. In the following code examples (Converse and ConverseStream), set the following variables:
   - guardrail_id – The ID of the guardrail that you created in step 1.
   - guardrail_version – The version of the guardrail that you created in step 2.
   - text – Use Create a playlist of heavy metal songs.

4. Run the code examples. The output should display the guardrail assessment and the output message Text: Sorry, I can't answer questions about heavy metal music. The guardrail input assessment shows that the guardrail detected the heavy metal topic in the input message.

5. (Optional) Test that the guardrail blocks inappropriate text that the model generates by changing the value of text to List all genres of rock music. Run the examples again. You should see an output assessment in the response.
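The full code examples referenced in step 3 aren't reproduced here; the following is a minimal Python (boto3) sketch of the Converse variant, under the assumptions stated in the comments.

# A minimal sketch of the Converse example, assuming boto3 is installed and AWS
# credentials are configured. Replace the placeholder values with your own;
# ConverseStream works the same way with client.converse_stream.
import json

import boto3
from botocore.exceptions import ClientError

guardrail_id = "abc1234567"   # the ID of the guardrail from step 1 (placeholder)
guardrail_version = "1"       # the version from step 2 (placeholder)
model_id = "anthropic.claude-3-haiku-20240307-v1:0"  # example model that supports guardrails
text = "Create a playlist of heavy metal songs."

client = boto3.client("bedrock-runtime")

try:
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": text}]}],
        guardrailConfig={
            "guardrailIdentifier": guardrail_id,
            "guardrailVersion": guardrail_version,
            "trace": "enabled",
        },
    )
except ClientError as err:
    raise SystemExit(f"Converse call failed: {err}")

# Print the guardrail assessment (present only because trace is enabled).
print(json.dumps(response.get("trace", {}), indent=2))

print(f"Stop reason: {response['stopReason']}")
print(f"Text: {response['output']['message']['content'][0]['text']}")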