EvaluationReferenceInput

A reference input containing ground truth data for evaluation, scoped to a specific context level (session or trace) through its span context.

context

The contextual information associated with an evaluation, including span context details that identify the specific traces and sessions being evaluated within the agent's execution flow.

Type: Context object

Note: This object is a Union. Only one member of this object can be specified or returned.

Required: Yes

assertions

A list of assertion statements for session-level evaluation. Each assertion describes an expected behavior or outcome the agent should demonstrate during the session.

Type: Array of EvaluationContent objects

Array Members: Minimum number of 1 item. Maximum number of 100 items.

Required: No
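
The array bounds above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any SDK: `validate_assertions` is a hypothetical helper, and representing each EvaluationContent member as a dict with a `text` key is an assumption, since this section does not list the union's members.

```python
from typing import Any, Optional

# Bounds taken from the "Array Members" constraint for assertions.
MIN_ASSERTIONS = 1
MAX_ASSERTIONS = 100


def validate_assertions(assertions: Optional[list[dict[str, Any]]]) -> None:
    """Raise ValueError if the assertions list violates the documented bounds.

    Hypothetical helper for illustration only.
    """
    if assertions is None:
        return  # The field is optional, so omitting it is valid.
    if not MIN_ASSERTIONS <= len(assertions) <= MAX_ASSERTIONS:
        raise ValueError(
            f"assertions must contain between {MIN_ASSERTIONS} and "
            f"{MAX_ASSERTIONS} items, got {len(assertions)}"
        )


# Assumed item shape: {"text": "..."} (not confirmed by this section).
validate_assertions([{"text": "The agent confirms the order number before refunding."}])
```

Note that an empty list is rejected here: if there are no assertions to supply, the field should be omitted rather than sent as an empty array.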

expectedResponse

The expected response for trace-level evaluation. Built-in evaluators that support this field compare the agent's actual response against this value. Custom evaluators can access it through the {expected_response} placeholder in their instructions.

Type: EvaluationContent object

Note: This object is a Union. Only one member of this object can be specified or returned.

Required: No

expectedTrajectory

The expected tool call sequence for session-level trajectory evaluation. Contains a list of tool names representing the tools the agent is expected to invoke.

Type: EvaluationExpectedTrajectory object

Required: No
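
To show how the fields fit together, here is a minimal sketch that assembles one session-level and one trace-level reference input as plain dictionaries. The top-level keys (`context`, `assertions`, `expectedResponse`, `expectedTrajectory`) come from this page. Everything nested inside them is an assumption made for illustration: the `spanContext`, `sessionId`, `traceId`, `text`, and `toolNames` member names, the example IDs, and the tool names are not defined in this section, and `build_reference_input` is a hypothetical helper rather than an SDK function.

```python
from typing import Any, Optional


def build_reference_input(
    context: dict[str, Any],
    *,
    assertions: Optional[list[dict[str, Any]]] = None,
    expected_response: Optional[dict[str, Any]] = None,
    expected_trajectory: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a reference input, including only the optional fields provided.

    Hypothetical helper for illustration only.
    """
    if not context:
        raise ValueError("context is required")
    reference: dict[str, Any] = {"context": context}
    if assertions is not None:
        reference["assertions"] = assertions
    if expected_response is not None:
        reference["expectedResponse"] = expected_response
    if expected_trajectory is not None:
        reference["expectedTrajectory"] = expected_trajectory
    return reference


# Session-level ground truth: assertions plus an expected tool call sequence.
# Nested member names below are assumed, not taken from this page.
session_reference = build_reference_input(
    {"spanContext": {"sessionId": "example-session-id"}},
    assertions=[{"text": "The agent verifies the order before issuing a refund."}],
    expected_trajectory={"toolNames": ["lookup_order", "issue_refund"]},
)

# Trace-level ground truth: a single expected response.
trace_reference = build_reference_input(
    {"spanContext": {"sessionId": "example-session-id", "traceId": "example-trace-id"}},
    expected_response={"text": "Your refund has been issued."},
)
```

Keeping the session-level fields (`assertions`, `expectedTrajectory`) and the trace-level field (`expectedResponse`) in separate reference inputs mirrors the scoping described above, where each reference input is tied to one context level through its span context.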
