Interface LlmAsAJudgeOptions
- All Superinterfaces:
software.amazon.jsii.JsiiSerializable
- All Known Implementing Classes:
LlmAsAJudgeOptions.Jsii$Proxy
Uses a foundation model to assess agent performance based on custom instructions and a rating scale.
Example:
// Create a custom LLM-as-a-Judge evaluator
Evaluator evaluator = Evaluator.Builder.create(this, "MyEvaluator")
        .evaluatorName("my_custom_evaluator")
        .level(EvaluationLevel.SESSION)
        .evaluatorConfig(EvaluatorConfig.llmAsAJudge(LlmAsAJudgeOptions.builder()
                .instructions("Evaluate whether the agent response is helpful and accurate.")
                .modelId("us.anthropic.claude-sonnet-4-6")
                .ratingScale(EvaluatorRatingScale.categorical(List.of(
                        CategoricalRatingOption.builder()
                                .label("Good")
                                .definition("The response is helpful and accurate.")
                                .build(),
                        CategoricalRatingOption.builder()
                                .label("Bad")
                                .definition("The response is not helpful or contains errors.")
                                .build())))
                .build()))
        .build();
// Use the custom evaluator in an online evaluation configuration
OnlineEvaluationConfig.Builder.create(this, "MyEvaluation")
        .onlineEvaluationConfigName("my_evaluation")
        .evaluators(List.of(
                EvaluatorSelector.builtin(BuiltinEvaluator.HELPFULNESS),
                EvaluatorSelector.custom(evaluator)))
        .dataSource(DataSourceConfig.fromCloudWatchLogs(CloudWatchLogsDataSourceConfig.builder()
                .logGroupNames(List.of("/aws/bedrock-agentcore/my-agent"))
                .serviceNames(List.of("my-agent.default"))
                .build()))
        .build();
-
Nested Class Summary
Nested Classes
Modifier and Type     Interface                        Description
static final class    LlmAsAJudgeOptions.Builder       A builder for LlmAsAJudgeOptions
static final class    LlmAsAJudgeOptions.Jsii$Proxy    An implementation for LlmAsAJudgeOptions
-
Method Summary
Modifier and Type                    Method                               Description
static LlmAsAJudgeOptions.Builder    builder()
                                     getAdditionalModelRequestFields()    Additional model-specific request fields.
default EvaluatorInferenceConfig     getInferenceConfig()                 Optional inference configuration parameters that control model behavior during evaluation.
                                     getInstructions()                    The evaluation instructions that guide the language model in assessing agent performance.
                                     getModelId()                         The identifier of the Amazon Bedrock model to use for evaluation.
                                     getRatingScale()                     The rating scale that defines how the evaluator should score agent performance.
Methods inherited from interface software.amazon.jsii.JsiiSerializable
$jsii$toJson
-
Method Details
-
getInstructions
The evaluation instructions that guide the language model in assessing agent performance.
These instructions define the evaluation criteria, context, and expected behavior. Instructions must contain placeholders appropriate for the evaluation level (e.g., {context}, {available_tools} for SESSION level).
Note: Evaluators using reference-input placeholders (e.g., {expected_tool_trajectory}, {assertions}, {expected_response}) are only compatible with on-demand evaluation, not online evaluation.
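The placeholder rule above can be checked before an instructions string is passed to the builder. The following is a minimal sketch in plain Java, not part of the CDK API: the class InstructionPlaceholders and its methods are hypothetical names, and the set of reference-input placeholders contains only the three examples named in the note, so it may be incomplete.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; it is not part of the CDK library.
final class InstructionPlaceholders {
    // Matches {name} tokens such as {context} or {available_tools}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([a-z_]+)\\}");

    // Assumption: only the reference-input placeholders named in the note above.
    private static final Set<String> REFERENCE_INPUTS =
            Set.of("expected_tool_trajectory", "assertions", "expected_response");

    // Returns the placeholder names in order of first appearance.
    static Set<String> extract(String instructions) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = PLACEHOLDER.matcher(instructions);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    // True when the instructions use no reference-input placeholder.
    static boolean usableOnline(String instructions) {
        for (String name : extract(instructions)) {
            if (REFERENCE_INPUTS.contains(name)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String online = "Given {context} and {available_tools}, rate the response.";
        String onDemand = "Compare the response with {expected_response}.";
        System.out.println(extract(online));        // [context, available_tools]
        System.out.println(usableOnline(online));   // true
        System.out.println(usableOnline(onDemand)); // false
    }
}
```

A check like this only inspects the text of the instructions; whether a given placeholder is valid for a particular evaluation level is still decided by the service.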
-
getModelId
The identifier of the Amazon Bedrock model to use for evaluation.
Accepts standard model IDs (e.g., 'anthropic.claude-sonnet-4-6') and cross-region inference profile IDs with region prefixes (e.g., 'us.anthropic.claude-sonnet-4-6', 'eu.anthropic.claude-sonnet-4-6').
-
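To make the two accepted ID forms concrete, here is a small plain-Java sketch that tells a standard model ID from a cross-region inference profile ID by its leading region prefix. The class ModelIdForms is a hypothetical name and not part of the CDK API, and the prefix set is an assumption limited to the two prefixes shown in the examples above ('us' and 'eu').

```java
import java.util.Set;

// Hypothetical helper for illustration; it is not part of the CDK library.
final class ModelIdForms {
    // Assumption: only the region prefixes that appear in the examples above.
    private static final Set<String> REGION_PREFIXES = Set.of("us", "eu");

    // True when the ID starts with a known region prefix followed by a dot.
    static boolean isCrossRegionProfile(String modelId) {
        int dot = modelId.indexOf('.');
        return dot > 0 && REGION_PREFIXES.contains(modelId.substring(0, dot));
    }

    // Strips the region prefix, if present, leaving the standard model ID.
    static String baseModelId(String modelId) {
        return isCrossRegionProfile(modelId)
                ? modelId.substring(modelId.indexOf('.') + 1)
                : modelId;
    }

    public static void main(String[] args) {
        System.out.println(isCrossRegionProfile("anthropic.claude-sonnet-4-6"));    // false
        System.out.println(isCrossRegionProfile("us.anthropic.claude-sonnet-4-6")); // true
        System.out.println(baseModelId("eu.anthropic.claude-sonnet-4-6"));          // anthropic.claude-sonnet-4-6
    }
}
```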
getRatingScale
The rating scale that defines how the evaluator should score agent performance.
-
getAdditionalModelRequestFields
Additional model-specific request fields.
Default: - No additional fields
-
getInferenceConfig
Optional inference configuration parameters that control model behavior during evaluation.
When not specified, the foundation model uses its own default values for maxTokens, temperature, and topP.
Default: - The foundation model's default inference parameters are used
-
builder
- Returns:
- a LlmAsAJudgeOptions.Builder of LlmAsAJudgeOptions
-