SampleWithResponseStream
Sends a streaming inference request to the model during a job execution. Returns the response as a stream of payload chunks. Each turn is captured for later use.
Request Syntax
POST /sample-with-response-stream HTTP/1.1
X-Amzn-SageMaker-Job-Arn: JobArn
X-Amzn-SageMaker-Trajectory-Id: TrajectoryId
Body
URI Request Parameters
The request uses the following URI parameters.
- JobArn
-
The job ARN that identifies which model session to route the inference request to.
Length Constraints: Minimum length of 1. Maximum length of 256.
Pattern:
arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:job/[a-zA-Z0-9_\-]+/[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}Required: Yes
- TrajectoryId
-
The trajectory ID for grouping turns into a single rollout. Each turn is captured for later use.
Length Constraints: Minimum length of 1. Maximum length of 256.
Required: Yes
Request Body
The request accepts the following binary data.
- Body
-
The raw inference request body in OpenAI-compatible JSON format.
Required: Yes
Response Syntax
HTTP/1.1 200
Content-Type: ContentType
Body
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The response returns the following HTTP headers.
- ContentType
-
MIME type of the streaming inference result.
The response returns the following as the HTTP body.
- Body
-
The streaming response body, delivered as a series of PayloadPart events.
Errors
For information about the errors that are common to all actions, see Common Error Types.
- AccessDeniedException
-
You do not have permission to perform this operation.
HTTP Status Code: 403
- InternalServiceError
-
An internal service error occurred. Retry the request.
HTTP Status Code: 500
- ResourceNotFoundException
-
The specified resource was not found.
HTTP Status Code: 404
- ServiceQuotaExceededException
-
You have exceeded a service quota.
HTTP Status Code: 402
- ThrottlingException
-
The request was throttled. Retry the request after a brief wait.
HTTP Status Code: 429
- ValidationException
-
The request is not valid. Check the request syntax and parameters.
HTTP Status Code: 400
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: