

# AI21 Labs models
<a name="model-parameters-ai21"></a>

This section describes the request parameters and response fields for AI21 Labs models. Use this information to make inference calls to AI21 Labs models with the [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) and [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) (streaming) operations. This section also includes Python code examples that show how to call AI21 Labs models. To use a model in an inference operation, you need the model ID for the model. To get the model ID, see [Supported foundation models in Amazon Bedrock](models-supported.md). Some models also work with the [Converse API](conversation-inference.md). To check whether the Converse API supports a specific AI21 Labs model, see [Supported models and model features](conversation-inference-supported-models-features.md). For more code examples, see [Code examples for Amazon Bedrock using AWS SDKs](service_code_examples.md).

Foundation models in Amazon Bedrock support input and output modalities, which vary from model to model. To check the modalities that AI21 Labs models support, the Amazon Bedrock features they work with, and the AWS Regions in which they are available, see [Supported foundation models in Amazon Bedrock](models-supported.md).

When you make inference calls with AI21 Labs models, you include a prompt for the model. For general information about creating prompts for the models that Amazon Bedrock supports, see [Prompt engineering concepts](prompt-engineering-guidelines.md). For AI21 Labs specific prompt information, see the [AI21 Labs prompt engineering guide](https://docs.ai21.com/docs/prompt-engineering).

**Topics**
+ [AI21 Labs Jurassic-2 models](model-parameters-jurassic2.md)
+ [AI21 Labs Jamba models](model-parameters-jamba.md)

# AI21 Labs Jurassic-2 models
<a name="model-parameters-jurassic2"></a>

This section provides inference parameters and a code example for using AI21 Labs Jurassic-2 models.

**Topics**
+ [Inference parameters](#model-parameters-jurassic2-request-response)
+ [Code example](#api-inference-examples-a2i-jurassic)

## Inference parameters
<a name="model-parameters-jurassic2-request-response"></a>

The AI21 Labs Jurassic-2 models support the following inference parameters.

**Topics**
+ [Randomness and Diversity](#model-parameters-jurassic2-random)
+ [Length](#model-parameters-jurassic2-length)
+ [Repetitions](#model-parameters-jurassic2-reps)
+ [Model invocation request body field](#model-parameters-jurassic2-request-body)
+ [Model invocation response body field](#model-parameters-jurassic2-response-body)

### Randomness and Diversity
<a name="model-parameters-jurassic2-random"></a>

The AI21 Labs Jurassic-2 models support the following parameters to control randomness and diversity in the response.
+ **Temperature** (`temperature`) – Use a lower value to decrease randomness in the response.
+ **Top P** (`topP`) – Use a lower value to ignore less probable options.

### Length
<a name="model-parameters-jurassic2-length"></a>

The AI21 Labs Jurassic-2 models support the following parameters to control the length of the generated response.
+ **Max completion length** (`maxTokens`) – Specify the maximum number of tokens to use in the generated response.
+ **Stop sequences** (`stopSequences`) – Configure stop sequences that the model recognizes and after which it stops generating further tokens. Press the Enter key to insert a newline character in a stop sequence. Use the Tab key to finish inserting a stop sequence.

### Repetitions
<a name="model-parameters-jurassic2-reps"></a>

The AI21 Labs Jurassic-2 models support the following parameters to control repetition in the generated response.
+ **Presence penalty** (`presencePenalty`) – Use a higher value to lower the probability of generating new tokens that already appear at least once in the prompt or in the completion.
+ **Count penalty** (`countPenalty`) – Use a higher value to lower the probability of generating new tokens that already appear at least once in the prompt or in the completion. Proportional to the number of appearances.
+ **Frequency penalty** (`frequencyPenalty`) – Use a high value to lower the probability of generating new tokens that already appear at least once in the prompt or in the completion. The value is proportional to the frequency of the token appearances (normalized to text length).
+ **Penalize special tokens** – Reduce the probability of repetition of special characters. The default value for each is `true`.
  + **Whitespaces** (`applyToWhitespaces`) – A `true` value applies the penalty to whitespaces and new lines.
  + **Punctuations** (`applyToPunctuations`) – A `true` value applies the penalty to punctuation.
  + **Numbers** (`applyToNumbers`) – A `true` value applies the penalty to numbers.
  + **Stop words** (`applyToStopwords`) – A `true` value applies the penalty to stop words.
  + **Emojis** (`applyToEmojis`) – A `true` value excludes emojis from the penalty.

### Model invocation request body field
<a name="model-parameters-jurassic2-request-body"></a>

When you make an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) or [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) call using an AI21 Labs model, fill the `body` field with a JSON object that conforms to the one below. Enter the prompt in the `prompt` field.

```
{
    "prompt": string,
    "temperature": float,
    "topP": float,
    "maxTokens": int,
    "stopSequences": [string],
    "countPenalty": {
        "scale": float
    },
    "presencePenalty": {
        "scale": float
    },
    "frequencyPenalty": {
        "scale": float
    }
}
```

To penalize special tokens, add those fields to any of the penalty objects. For example, you can modify the `countPenalty` field as follows.

```
"countPenalty": {
    "scale": float,
    "applyToWhitespaces": boolean,
    "applyToPunctuations": boolean,
    "applyToNumbers": boolean,
    "applyToStopwords": boolean,
    "applyToEmojis": boolean
}
```
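As a sketch, the request body above can be assembled as a Python dictionary and serialized for the `body` parameter of `InvokeModel`. The prompt text and parameter values here are illustrative, not recommendations:

```python
import json

# Illustrative Jurassic-2 request body with a count penalty that
# also applies to whitespace, punctuation, and other special tokens.
request = {
    "prompt": "Write a one-line product tagline for a hiking boot.",
    "temperature": 0.5,
    "topP": 0.9,
    "maxTokens": 200,
    "stopSequences": ["##"],
    "countPenalty": {
        "scale": 0.3,
        "applyToWhitespaces": True,
        "applyToPunctuations": True,
        "applyToNumbers": True,
        "applyToStopwords": True,
        "applyToEmojis": True
    }
}

# InvokeModel expects the body as a JSON string.
body = json.dumps(request)
print(body)
```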

The following table shows the minimum, maximum, and default values for the numerical parameters.


[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-jurassic2.html)

### Model invocation response body field
<a name="model-parameters-jurassic2-response-body"></a>

For information about the format of the `body` field in the response, see [https://docs.ai21.com/reference/j2-complete-api-ref](https://docs.ai21.com/reference/j2-complete-api-ref).

**Note**  
Amazon Bedrock returns the response identifier (`id`) as an integer value.

## Code example
<a name="api-inference-examples-a2i-jurassic"></a>

This example shows how to call the *AI21 Labs Jurassic-2 Mid* model.

```
import boto3
import json

brt = boto3.client(service_name='bedrock-runtime')

body = json.dumps({
    "prompt": "Translate to spanish: 'Amazon Bedrock is the easiest way to build and scale generative AI applications with base models (FMs)'.", 
    "maxTokens": 200,
    "temperature": 0.5,
    "topP": 0.5
})

modelId = 'ai21.j2-mid-v1'
accept = 'application/json'
contentType = 'application/json'

response = brt.invoke_model(
    body=body, 
    modelId=modelId, 
    accept=accept, 
    contentType=contentType
)

response_body = json.loads(response.get('body').read())

# text
print(response_body.get('completions')[0].get('data').get('text'))
```

# AI21 Labs Jamba models
<a name="model-parameters-jamba"></a>

This section provides inference parameters and a code example for using AI21 Labs Jamba models.

**Topics**
+ [Required fields](#model-parameters-jamba-required-fields)
+ [Inference parameters](#model-parameters-jamba-request-response)
+ [Model invocation request body field](#model-parameters-jamba-request-body)
+ [Model invocation response body field](#model-parameters-jamba-response-body)
+ [Code example](#api-inference-examples-a2i-jamba)
+ [Code example for Jamba 1.5 Large](#api-inference-examples-a2i-jamba15-large)

## Required fields
<a name="model-parameters-jamba-required-fields"></a>

The AI21 Labs Jamba models support the following required fields:
+ **Messages** (`messages`) – The previous messages in this chat, from oldest (index 0) to newest. The list must contain at least one user or assistant message. Include both user inputs and system responses. The maximum total size for the list is about 256K tokens. Each message includes the following members:
  + **Role** (`role`) – The role of the message author. One of the following values:
    + **User** (`user`) – Input provided by the user. Any instructions given here that conflict with instructions given in the `system` prompt take precedence over the `system` prompt instructions.
    + **Assistant** (`assistant`) – Response generated by the model.
    + **System** (`system`) – Initial instructions provided to the system to provide general guidance on the tone and voice of the generated message. An initial system message is optional but recommended to provide guidance on the tone of the chat. For example, "You are a helpful chatbot with a background in earth sciences and a charming French accent."
  + **Content** (`content`) – The content of the message.
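For example, a minimal `messages` list with an optional system message followed by a user turn might look like this (the message text is illustrative):

```python
import json

# Oldest message first; the optional system message sets the tone,
# and the user message carries the actual question.
messages = [
    {
        "role": "system",
        "content": "You are a helpful chatbot with a background in earth sciences."
    },
    {
        "role": "user",
        "content": "What are the main causes of earthquakes?"
    }
]

body = json.dumps({"messages": messages})
print(body)
```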

## Inference parameters
<a name="model-parameters-jamba-request-response"></a>

The AI21 Labs Jamba models support the following inference parameters.

**Topics**
+ [Randomness and Diversity](#model-parameters-jamba-random)
+ [Length](#model-parameters-jamba-length)
+ [Repetitions](#model-parameters-jamba-reps)

### Randomness and Diversity
<a name="model-parameters-jamba-random"></a>

The AI21 Labs Jamba models support the following parameters to control randomness and diversity in the response.
+ **Temperature** (`temperature`) – How much variation to provide in each answer. Setting this value to 0 guarantees the same response to the same question every time. Setting a higher value encourages more variation. Modifies the distribution from which tokens are sampled. Default: 1.0, Range: 0.0 – 2.0
+ **Top P** (`top_p`) – Limit the pool of next tokens in each step to the top N percentile of possible tokens, where 1.0 means the pool of all possible tokens, and 0.01 means the pool of only the most likely next tokens.

### Length
<a name="model-parameters-jamba-length"></a>

The AI21 Labs Jamba models support the following parameters to control the length of the generated response.
+ **Max completion length** (`max_tokens`) – The maximum number of tokens to allow for each generated response message. Typically the best way to limit output length is by providing a length limit in the system prompt (for example, "limit your answers to three sentences"). Default: 4096, Range: 0 – 4096.
+ **Stop sequences** (`stop`) – End the message when the model generates one of these strings. The stop sequence is not included in the generated message. Each sequence can be up to 64K characters long, and can contain newlines as `\n` characters.

  Examples:
  + Single stop string with a word and a period: "monkeys."
  + Multiple stop strings and a newline: ["cat", "dog", " .", "####", "\n"]
+ **Number of responses** (`n`) – How many chat responses to generate. Note that `n` must be 1 for streaming responses. If `n` is set to larger than 1, setting `temperature=0` will always fail because all answers are guaranteed to be duplicates. Default: 1, Range: 1 – 16

### Repetitions
<a name="model-parameters-jamba-reps"></a>

The AI21 Labs Jamba models support the following parameters to control repetition in the generated response.
+ **Frequency penalty** (`frequency_penalty`) – Reduce the frequency of repeated words within a single response message by increasing this number. This penalty gradually increases the more times a word appears during response generation. Setting this to 2.0 produces a string with few, if any, repeated words.
+ **Presence penalty** (`presence_penalty`) – Reduce the frequency of repeated words within a single message by increasing this number. Unlike frequency penalty, the presence penalty is the same no matter how many times a word appears.

## Model invocation request body field
<a name="model-parameters-jamba-request-body"></a>

When you make an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) or [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) call using an AI21 Labs Jamba model, fill the `body` field with a JSON object that conforms to the one below. Enter the conversation in the `messages` field.

```
{
  "messages": [
    {
      "role":"system", // Non-printing contextual information for the model
      "content":"You are a helpful history teacher. You are kind and you respond with helpful content in a professional manner. Limit your answers to three sentences. Your listener is a high school student."
    },
    {
      "role":"user", // The question we want answered.
      "content":"Who was the first emperor of rome?"
    }
  ],
  "n":1 // Limit response to one answer
}
```

## Model invocation response body field
<a name="model-parameters-jamba-response-body"></a>

For information about the format of the `body` field in the response, see [https://docs.ai21.com/reference/jamba-instruct-api#response-details](https://docs.ai21.com/reference/jamba-instruct-api#response-details).
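Per the response format documented at that link, the generated text arrives in a `choices` list whose entries hold an assistant `message`. As a sketch, using a mock response body (the field values here are illustrative, and the exact shape is defined by the AI21 reference above):

```python
import json

# Mock response body with the documented shape; a real call returns
# this JSON in the InvokeModel response's body stream.
raw_body = json.dumps({
    "id": "chat-1234",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Rome's first emperor was Augustus."
            },
            "finish_reason": "stop"
        }
    ]
})

response_body = json.loads(raw_body)
text = response_body["choices"][0]["message"]["content"]
print(text)
```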

## Code example
<a name="api-inference-examples-a2i-jamba"></a>

This example shows how to call the *AI21 Labs Jamba-Instruct* model.

**`invoke_model`**

```
import boto3
import json

# Create a Bedrock Runtime client in the AWS Region of your choice.
bedrock = boto3.client('bedrock-runtime', 'us-east-1')

response = bedrock.invoke_model(
    modelId='ai21.jamba-instruct-v1:0',
    body=json.dumps({
        'messages': [
            {
                'role': 'user',
                'content': 'which llm are you?'
            }
        ],
    })
)

print(json.dumps(json.loads(response['body'].read()), indent=4))
```

**converse**

```
import boto3
import json

# Create a Bedrock Runtime client in the AWS Region of your choice.
bedrock = boto3.client('bedrock-runtime', 'us-east-1')

response = bedrock.converse(
    modelId='ai21.jamba-instruct-v1:0',
    messages=[
        {
            'role': 'user',
            'content': [
                {
                    'text': 'which llm are you?'
                }
            ]
        }
    ]
)

# converse returns a parsed dictionary, not a body stream.
print(json.dumps(response['output'], indent=4))
```

## Code example for Jamba 1.5 Large
<a name="api-inference-examples-a2i-jamba15-large"></a>

This example shows how to call the *AI21 Labs Jamba 1.5 Large* model.

**`invoke_model`**

```
POST https://bedrock-runtime.us-east-1.amazonaws.com/model/ai21.jamba-1-5-large-v1:0/invoke HTTP/1.1
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful chatbot with a background in earth sciences and a charming French accent."
    },
    {
      "role": "user",
      "content": "What are the main causes of earthquakes?"
    }
  ],
  "max_tokens": 512,
  "temperature": 0.7,
  "top_p": 0.9,
  "stop": ["###"],
  "n": 1
}
```
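The raw HTTP body above maps directly onto a Python dictionary that you can serialize and pass as the `body` parameter of `invoke_model`, as in the Jamba-Instruct example earlier in this section:

```python
import json

# The same Jamba 1.5 request body as the raw HTTP example above.
request = {
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful chatbot with a background in earth sciences and a charming French accent."
        },
        {
            "role": "user",
            "content": "What are the main causes of earthquakes?"
        }
    ],
    "max_tokens": 512,
    "temperature": 0.7,
    "top_p": 0.9,
    "stop": ["###"],
    "n": 1
}

body = json.dumps(request)
print(body)
```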