View a markdown version of this page

SampleWithResponseStream - Amazon SageMaker

SampleWithResponseStream

Sends a streaming inference request to the model during a job execution. Returns the response as a stream of payload chunks. Each turn is captured for later use.

Request Syntax

POST /sample-with-response-stream HTTP/1.1 X-Amzn-SageMaker-Job-Arn: JobArn X-Amzn-SageMaker-Trajectory-Id: TrajectoryId Body

URI Request Parameters

The request uses the following URI parameters.

JobArn

The job ARN that identifies which model session to route the inference request to.

Length Constraints: Minimum length of 1. Maximum length of 256.

Pattern: arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:job/[a-zA-Z0-9_\-]+/[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}

Required: Yes

TrajectoryId

The trajectory ID for grouping turns into a single rollout. Each turn is captured for later use.

Length Constraints: Minimum length of 1. Maximum length of 256.

Required: Yes

Request Body

The request accepts the following binary data.

Body

The raw inference request body in OpenAI-compatible JSON format.

Required: Yes

Response Syntax

HTTP/1.1 200 Content-Type: ContentType Body

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The response returns the following HTTP headers.

ContentType

MIME type of the streaming inference result.

The response returns the following as the HTTP body.

Body

The streaming response body, delivered as a series of PayloadPart events.

Errors

For information about the errors that are common to all actions, see Common Error Types.

AccessDeniedException

You do not have permission to perform this operation.

HTTP Status Code: 403

InternalServiceError

An internal service error occurred. Retry the request.

HTTP Status Code: 500

ResourceNotFoundException

The specified resource was not found.

HTTP Status Code: 404

ServiceQuotaExceededException

You have exceeded a service quota.

HTTP Status Code: 402

ThrottlingException

The request was throttled. Retry the request after a brief wait.

HTTP Status Code: 429

ValidationException

The request is not valid. Check the request syntax and parameters.

HTTP Status Code: 400

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: