

# AI21 Labs Jamba models


This section provides inference parameters and a code example for using AI21 Labs Jamba models.

**Topics**
+ [Required fields](#model-parameters-jamba-required-fields)
+ [Inference parameters](#model-parameters-jamba-request-response)
+ [Model invocation request body field](#model-parameters-jamba-request-body)
+ [Model invocation response body field](#model-parameters-jamba-response-body)
+ [Code example](#api-inference-examples-a2i-jamba)
+ [Code example for Jamba 1.5 Large](#api-inference-examples-a2i-jamba15-large)

## Required fields


The AI21 Labs Jamba models support the following required fields:
+ **Messages** (`messages`) – The previous messages in this chat, from oldest (index 0) to newest. The list must contain at least one user or assistant message. Include both user inputs and system responses. The maximum total size of the list is about 256K tokens. Each message includes the following members:
  + **Role** (`role`) – The role of the message author. One of the following values:
    + **User** (`user`) – Input provided by the user. Instructions given here take precedence over any conflicting instructions in the `system` prompt.
    + **Assistant** (`assistant`) – Response generated by the model.
    + **System** (`system`) – Initial instructions that give the model general guidance on the tone and voice of the generated message. An initial system message is optional but recommended. For example, "You are a helpful chatbot with a background in earth sciences and a charming French accent."
  + **Content** (`content`) – The content of the message.
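
As a minimal sketch of the structure described above, the following Python helper checks a message list before it is sent. The name `validate_messages` is hypothetical and not part of any SDK, and the check that `content` is a string is an assumption based on the examples on this page. The service performs its own validation.

```python
ALLOWED_ROLES = {"user", "assistant", "system"}


def validate_messages(messages):
    """Check a chat history against the required fields described above.

    Hypothetical helper for illustration only.
    """
    # The list must contain at least one user or assistant message.
    if not any(m.get("role") in ("user", "assistant") for m in messages):
        raise ValueError("messages needs at least one user or assistant message")
    for i, m in enumerate(messages):
        if m.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {i}: unknown role {m.get('role')!r}")
        # Assumption: content is plain text, as in the examples on this page.
        if not isinstance(m.get("content"), str):
            raise ValueError(f"message {i}: content must be a string")
    return messages


messages = validate_messages([
    {"role": "system", "content": "You are a helpful chatbot with a background in earth sciences."},
    {"role": "user", "content": "What are the main causes of earthquakes?"},
])
```

A list that holds only a `system` message raises `ValueError`, because it has no user or assistant message.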

## Inference parameters


The AI21 Labs Jamba models support the following inference parameters.

**Topics**
+ [Randomness and Diversity](#model-parameters-jamba-random)
+ [Length](#model-parameters-jamba-length)
+ [Repetitions](#model-parameters-jamba-reps)

### Randomness and Diversity


The AI21 Labs Jamba models support the following parameters to control randomness and diversity in the response.
+ **Temperature** (`temperature`) – How much variation to provide in each answer. This parameter modifies the distribution from which tokens are sampled. Setting this value to 0 guarantees the same response to the same question every time. Setting a higher value encourages more variation. Default: 1.0, Range: 0.0 – 2.0.
+ **Top P** (`top_p`) – Limit the pool of next tokens in each step to the top N percentile of possible tokens, where 1.0 means the pool of all possible tokens, and 0.01 means the pool of only the most likely next tokens.
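
To make the two parameters concrete, here is a toy sketch of temperature scaling followed by top-p filtering over a made-up score table. It assumes the commonly used softmax and nucleus-sampling formulation. The function name `next_token_pool` and the scores are hypothetical, and this is not a description of how the Jamba models are implemented internally.

```python
import math


def next_token_pool(logits, temperature=1.0, top_p=1.0):
    """Return {token: probability} after temperature scaling and top-p filtering.

    Illustrative only; assumes the standard softmax / nucleus formulation.
    """
    if temperature == 0:
        # Temperature 0 is treated as always picking the single best token.
        best = max(logits, key=logits.get)
        return {best: 1.0}
    # Divide scores by the temperature, then apply a numerically stable softmax.
    scaled = {t: score / temperature for t, score in logits.items()}
    peak = max(scaled.values())
    weights = {t: math.exp(s - peak) for t, s in scaled.items()}
    total = sum(weights.values())
    probs = {t: w / total for t, w in weights.items()}
    # Keep the most likely tokens until their cumulative probability reaches top_p.
    pool, cumulative = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        pool[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving pool sums to 1.
    norm = sum(pool.values())
    return {t: p / norm for t, p in pool.items()}


toy_logits = {"the": 2.0, "a": 1.0, "zebra": -1.0}  # made-up scores
greedy = next_token_pool(toy_logits, temperature=0)
narrow = next_token_pool(toy_logits, temperature=1.0, top_p=0.5)
full = next_token_pool(toy_logits, temperature=1.0, top_p=1.0)
```

In this toy setup, `greedy` and `narrow` both keep only `"the"`, while `full` keeps all three tokens.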

### Length


The AI21 Labs Jamba models support the following parameters to control the length of the generated response.
+ **Max completion length** (`max_tokens`) – The maximum number of tokens to allow for each generated response message. Typically the best way to limit output length is by providing a length limit in the system prompt (for example, "limit your answers to three sentences"). Default: 4096, Range: 0 – 4096.
+ **Stop sequences** (`stop`) – End the message when the model generates one of these strings. The stop sequence is not included in the generated message. Each sequence can be up to 64K long, and can contain newlines as `\n` characters.

  Examples:
  + Single stop string with a word and a period: "monkeys."
  + Multiple stop strings and a newline: ["cat", "dog", " .", "####", "\n"]
+ **Number of responses** (`n`) – How many chat responses to generate. Note that `n` must be 1 for streaming responses. If `n` is greater than 1, setting `temperature=0` always fails because all answers are guaranteed to be duplicates. Default: 1, Range: 1 – 16.
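
The constraints in this list can be checked on the client before a request is sent. The following is a minimal sketch under that assumption. `check_generation_params` is a hypothetical helper, and its ranges are copied from the text above rather than queried from the service.

```python
def check_generation_params(max_tokens=4096, n=1, temperature=1.0, streaming=False):
    """Raise ValueError for values the text above describes as invalid.

    Hypothetical client-side check; the service remains the source of truth.
    """
    if not 0 <= max_tokens <= 4096:
        raise ValueError("max_tokens must be in the range 0-4096")
    if not 1 <= n <= 16:
        raise ValueError("n must be in the range 1-16")
    if streaming and n != 1:
        raise ValueError("n must be 1 for streaming responses")
    if n > 1 and temperature == 0:
        raise ValueError("temperature=0 with n > 1 fails: all answers would be duplicates")
    return {"max_tokens": max_tokens, "n": n, "temperature": temperature}


params = check_generation_params(max_tokens=512, n=2, temperature=0.7)
```

Calling the helper with `n=2, temperature=0` or with `n=2, streaming=True` raises `ValueError`.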

### Repetitions


The AI21 Labs Jamba models support the following parameters to control repetition in the generated response.
+ **Frequency Penalty** (`frequency_penalty`) – Reduce the frequency of repeated words within a single response message by increasing this number. The penalty grows the more times a word appears during response generation. Setting this value to 2.0 produces a string with few, if any, repeated words.
+ **Presence Penalty** (`presence_penalty`) – Reduce the frequency of repeated words within a single message by increasing this number. Unlike the frequency penalty, the presence penalty is the same no matter how many times a word appears.
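
The difference between the two penalties can be illustrated with a toy calculation. The sketch below assumes a common formulation in which the frequency penalty is multiplied by a token's count and the presence penalty is subtracted once for any token that has already appeared. The function `penalized_scores` and the scores are hypothetical, and the formula is an assumption rather than the documented Jamba implementation.

```python
from collections import Counter


def penalized_scores(scores, generated_tokens, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the score of tokens that already appear in the generated text.

    Assumed formulation, for illustration only.
    """
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, score in scores.items():
        count = counts[token]
        # Frequency penalty scales with the count; presence penalty is flat.
        penalty = frequency_penalty * count + (presence_penalty if count > 0 else 0.0)
        adjusted[token] = score - penalty
    return adjusted


scores = {"cat": 1.0, "dog": 1.0, "fish": 1.0}  # made-up scores
history = ["cat", "cat", "cat", "dog"]
by_frequency = penalized_scores(scores, history, frequency_penalty=0.5)
by_presence = penalized_scores(scores, history, presence_penalty=0.5)
```

With the frequency penalty, `"cat"` (seen three times) is penalized more than `"dog"` (seen once). With the presence penalty, both receive the same reduction, and `"fish"` is untouched in either case.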

## Model invocation request body field


When you make an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) or [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) call using an AI21 Labs Jamba model, fill the `body` field with a JSON object that conforms to the one below. Provide the conversation, including the user's prompt, in the `messages` field.

```
{
  "messages": [
    {
      "role":"system", // Non-printing contextual information for the model
      "content":"You are a helpful history teacher. You are kind and you respond with helpful content in a professional manner. Limit your answers to three sentences. Your listener is a high school student."
    },
    {
      "role":"user", // The question we want answered.
      "content":"Who was the first emperor of Rome?"
    }
  ],
  "n":1 // Limit response to one answer
}
```
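
The `//` annotations in the sample above are explanatory. Standard JSON does not allow comments, so they should not appear in the string you send. As a minimal sketch, the same body can be built in Python and serialized with `json.dumps`. The helper name `build_jamba_body` is hypothetical.

```python
import json


def build_jamba_body(system_prompt, user_prompt, n=1):
    """Serialize a request body with an optional system message and one user message."""
    messages = []
    if system_prompt:
        # The system message is optional but recommended.
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return json.dumps({"messages": messages, "n": n})


body = build_jamba_body(
    "You are a helpful history teacher. Limit your answers to three sentences.",
    "Who was the first emperor of Rome?",
)
```

The resulting `body` string is what you pass as the `body` argument of the invocation call.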

## Model invocation response body field


For information about the format of the `body` field in the response, see [https://docs.ai21.com/reference/jamba-instruct-api#response-details](https://docs.ai21.com/reference/jamba-instruct-api#response-details).
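
As a rough sketch of reading the response, the helper below pulls the generated text out of a response body. It assumes the body contains a `choices` list whose entries hold the text under `message.content`. Confirm this shape against the reference linked above before relying on it. The function name and the sample payload are hypothetical.

```python
import json


def extract_reply_texts(raw_body):
    """Return the text of each generated message in a response body.

    Assumes a `choices` list with `message.content` entries (unverified here).
    """
    payload = json.loads(raw_body)
    return [choice["message"]["content"] for choice in payload.get("choices", [])]


# Hypothetical payload used only to exercise the helper.
sample_body = json.dumps({
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Augustus was the first Roman emperor."},
            "finish_reason": "stop",
        }
    ]
})
replies = extract_reply_texts(sample_body)
```

When `n` is greater than 1, the list would hold one entry per generated response under the same assumption.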

## Code example


This example shows how to call the *AI21 Labs Jamba-Instruct* model.

**`invoke_model`**

```
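import json

import boto3

# Create a Bedrock Runtime client in the Region where the model is available.
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

response = bedrock.invoke_model(
    modelId='ai21.jamba-instruct-v1:0',
    body=json.dumps({
        'messages': [
            {
                'role': 'user',
                'content': 'which llm are you?'
            }
        ],
    })
)

# The response body is a streaming object; read it before parsing the JSON.
print(json.dumps(json.loads(response['body'].read()), indent=4))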
```

**`converse`**

```
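import json

import boto3

# Create a Bedrock Runtime client in the Region where the model is available.
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

response = bedrock.converse(
    modelId='ai21.jamba-instruct-v1:0',
    messages=[
        {
            'role': 'user',
            'content': [
                {
                    'text': 'which llm are you?'
                }
            ]
        }
    ]
)

# Converse returns a parsed dictionary; the reply is under output.message.
print(json.dumps(response['output']['message'], indent=4))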
```

## Code example for Jamba 1.5 Large


This example shows how to call the *AI21 Labs Jamba 1.5 Large* model.

**`invoke_model`**

```
POST https://bedrock-runtime.us-east-1.amazonaws.com/model/ai21.jamba-1-5-large-v1:0/invoke HTTP/1.1
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful chatbot with a background in earth sciences and a charming French accent."
    },
    {
      "role": "user",
      "content": "What are the main causes of earthquakes?"
    }
  ],
  "max_tokens": 512,
  "temperature": 0.7,
  "top_p": 0.9,
  "stop": ["###"],
  "n": 1
}
```
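
The same request can be sketched in Python with a Bedrock Runtime client. The helper name `invoke_jamba` is hypothetical. The client is passed in as an argument so that the function can be exercised without AWS credentials, and the inference parameters are forwarded as keyword arguments.

```python
import json

JAMBA_1_5_LARGE = "ai21.jamba-1-5-large-v1:0"


def invoke_jamba(client, messages, model_id=JAMBA_1_5_LARGE, **params):
    """Send a chat request through a Bedrock Runtime client and return the parsed body.

    Hypothetical helper; `params` carries inference parameters such as max_tokens.
    """
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps({"messages": messages, **params}),
    )
    # The response body is a streaming object, so read it before parsing.
    return json.loads(response["body"].read())


# Usage (requires AWS credentials and access to the model):
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
#   result = invoke_jamba(
#       client,
#       [{"role": "user", "content": "What are the main causes of earthquakes?"}],
#       max_tokens=512, temperature=0.7, top_p=0.9, stop=["###"], n=1,
#   )
```

Passing the client in keeps Region and credential configuration outside the helper.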