Stability.ai Stable Diffusion 3.5 Large - Amazon Bedrock

The Stable Diffusion 3.5 Large model has 8 billion parameters and supports text-to-image and image-to-image generation at up to 1-megapixel output resolution.

The request body is passed in the body field of a request to InvokeModel.

Model invocation request body field

When you make an InvokeModel call using a Stable Diffusion 3.5 Large model, fill the body field with a JSON object like the following.

  • prompt – (string) Text description of the desired output image. Maximum 10,000 characters.

    Minimum Maximum
    0 10,000

Model invocation response body field

When you make an InvokeModel call using a Stable Diffusion 3.5 Large model, the response looks like the following:

{ "seeds": [2130420379], "finish_reasons": [null], "images": ["..."] }

A response with a non-null finish reason looks like the following:

{ "finish_reasons":["Filter reason: prompt"] }
  • seeds – (list) The seeds used to generate the images.

  • finish_reasons – Enum indicating whether the request was filtered. A value of null indicates the request succeeded. Current possible values: "Filter reason: prompt", "Filter reason: output image", "Filter reason: input image", "Inference error", null.

  • images – A list of generated images in base64 string format.
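As a sketch of how a caller might consume these fields, the snippet below parses an already-decoded response body; the seed value and image bytes are stand-ins for illustration, and in a real call the dictionary would come from `json.loads(response['body'].read())` after InvokeModel returns.

```python
import base64

# Stand-in for a decoded response body; a real one comes from
# json.loads(response['body'].read()) after an InvokeModel call.
response_body = {
    "seeds": [2130420379],
    "finish_reasons": [None],  # JSON null maps to Python None
    "images": [base64.b64encode(b"<image bytes>").decode("utf-8")],
}

finish_reason = response_body["finish_reasons"][0]
if finish_reason is not None:
    # e.g. "Filter reason: prompt" when the request was filtered
    raise RuntimeError(f"Generation failed: {finish_reason}")

# Images arrive base64-encoded; decode to raw bytes before saving or displaying
image_bytes = base64.b64decode(response_body["images"][0])
```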

Text to image

The Stability.ai Stable Diffusion 3.5 Large model has the following inference parameters for a text-to-image inference call.

  • prompt (string) – Text description of the desired output image. Maximum 10,000 characters.

    Minimum Maximum
    0 10,000

Optional parameters

  • aspect_ratio (string) – Controls the aspect ratio of the generated image. Valid for text-to-image requests only. Enum: 16:9, 1:1, 21:9, 2:3, 3:2, 4:5, 5:4, 9:16, 9:21. Default 1:1.

  • mode (string) (GenerationMode) – Enum: text-to-image, image-to-image. Default: text-to-image. Controls whether this is a text-to-image or image-to-image generation, which affects which parameters are required:

    • text-to-image requires only the prompt parameter.

    • image-to-image requires the prompt, image, and strength parameters.

  • seed (number) – Value to control randomness in generation. Range 0 to 4294967294. Default 0 (random seed).

    Minimum Maximum Default
    0 4294967294 0
  • negative_prompt (string) – Text describing elements to exclude from the output image. Maximum 10,000 characters.

    Minimum Maximum
    0 10,000
  • cfg_scale (number) – Controls adherence to the prompt text. Higher values increase prompt adherence. Range 1 to 10. Default 4.

    Minimum Maximum Default
    1 10 4
  • style_preset (string) – Applies a specific visual style. Enum: 3d-model, analog-film, anime, cinematic, comic-book, digital-art, enhance, fantasy-art, isometric, line-art, low-poly, modeling-compound, neon-punk, origami, photographic, pixel-art, tile-texture.

  • output_format (string) – Output image format. Enum: jpeg, png, webp. Default png.

import boto3
import json

bedrock = boto3.client('bedrock-runtime', region_name='us-west-2')

response = bedrock.invoke_model(
    modelId='us.stability.sd3-5-large-v1:0',
    body=json.dumps({'prompt': 'A car made out of vegetables.'})
)
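For a fuller request, the sketch below wraps body construction in a helper so the optional parameters above can be validated before the InvokeModel call. The helper names (build_text_to_image_body, generate_image) and the validation logic are illustrative, not part of the Bedrock API; the model ID and region follow the minimal example above.

```python
import base64
import json

# Allowed values from the aspect_ratio parameter description above
VALID_ASPECT_RATIOS = {'16:9', '1:1', '21:9', '2:3', '3:2', '4:5', '5:4', '9:16', '9:21'}


def build_text_to_image_body(prompt, **options):
    # Hypothetical helper: assembles the required prompt plus any optional
    # parameters (seed, cfg_scale, negative_prompt, aspect_ratio, ...)
    ratio = options.get('aspect_ratio')
    if ratio is not None and ratio not in VALID_ASPECT_RATIOS:
        raise ValueError(f'invalid aspect_ratio: {ratio}')
    return json.dumps({'prompt': prompt, **options})


def generate_image(prompt, output_path='output.png', **options):
    # boto3 is imported here so the body-building helper above has no
    # AWS dependency and can be tested locally.
    import boto3
    bedrock = boto3.client('bedrock-runtime', region_name='us-west-2')
    response = bedrock.invoke_model(
        modelId='us.stability.sd3-5-large-v1:0',
        body=build_text_to_image_body(prompt, **options),
    )
    response_body = json.loads(response['body'].read())
    finish_reason = response_body['finish_reasons'][0]
    if finish_reason is not None:
        raise RuntimeError(f'generation failed: {finish_reason}')
    with open(output_path, 'wb') as f:
        f.write(base64.b64decode(response_body['images'][0]))
```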
Image to image

The Stability.ai Stable Diffusion 3.5 Large model has the following inference parameters for an image-to-image inference call.

  • prompt (string) – Text description of the desired output image. Maximum 10,000 characters.

    Minimum Maximum
    0 10,000
  • image (string) – Base64-encoded input image. Minimum 64 pixels per side. Supported formats: jpeg, png, webp.

  • mode (string) (GenerationMode) – Enum: text-to-image, image-to-image. Default: text-to-image. Controls whether this is a text-to-image or image-to-image generation, which affects which parameters are required:

    • text-to-image requires only the prompt parameter.

    • image-to-image requires the prompt, image, and strength parameters.

  • strength (number) – Controls the influence of the input image on the output. Range 0 to 1. A value of 0 preserves the input image; a value of 1 ignores it.

    Minimum Maximum
    0 1
  • seed (number) – Value to control randomness in generation. Range 0 to 4294967294. Default 0 (random seed).

    Minimum Maximum Default
    0 4294967294 0
  • negative_prompt (string) – Text describing elements to exclude from the output image. Maximum 10,000 characters.

    Minimum Maximum
    0 10,000
  • cfg_scale (number) – Controls adherence to the prompt text. Higher values increase prompt adherence. Range 1 to 10. Default 4.

    Minimum Maximum Default
    1 10 4
  • style_preset (string) – Applies a specific visual style. Enum: 3d-model, analog-film, anime, cinematic, comic-book, digital-art, enhance, fantasy-art, isometric, line-art, low-poly, modeling-compound, neon-punk, origami, photographic, pixel-art, tile-texture.

  • output_format (string) – Output image format. Enum: jpeg, png, webp. Default png.

import boto3
import base64
import json

# Load and encode the input image
with open('input_image.jpg', 'rb') as image_file:
    image_base64 = base64.b64encode(image_file.read()).decode('utf-8')

bedrock = boto3.client('bedrock-runtime', region_name='us-west-2')

response = bedrock.invoke_model(
    modelId='us.stability.sd3-5-large-v1:0',
    body=json.dumps({
        'prompt': 'A car made out of vegetables.',
        'mode': 'image-to-image',  # required; the default is text-to-image
        'image': image_base64,
        'strength': 0.7
    })
)
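The same body-building pattern can validate an image-to-image request before it is sent. The helper below is a sketch (the function name and range check are illustrative, not part of the API): it encodes raw image bytes and sets the mode parameter that the description above says this call requires.

```python
import base64
import json


def build_image_to_image_body(prompt, image_bytes, strength, **options):
    # Hypothetical helper: strength must lie in [0, 1] per the parameter table
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f'strength must be between 0 and 1, got {strength}')
    return json.dumps({
        'prompt': prompt,
        'mode': 'image-to-image',  # required for image-to-image generation
        'image': base64.b64encode(image_bytes).decode('utf-8'),
        'strength': strength,
        **options,
    })
```

Passing the returned string as the body argument of invoke_model mirrors the call shown above.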