Returns
Returns a Metric object for the given parameters, or None if an error occurs.
Examples
- LLM-as-a-Judge
- Self Hosted
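The example bodies for these two tabs were not captured here. The sketch below is a minimal illustration only; the client class, import path, and create_metric method name are assumptions and may differ from the actual SDK.

```python
# Hypothetical client and method names -- substitute your SDK's actual imports.
from my_eval_sdk import Client  # assumption: not the real package name

client = Client(api_key="YOUR_API_KEY")

# LLM-as-a-Judge: providing judge_prompt makes this an LLM-based evaluation,
# so a judge model must also be supplied.
judge_metric = client.create_metric(
    name="answer_relevancy",
    test_type="QUALITY",
    model="GPT-4o-mini",  # required because judge_prompt is set
    judge_prompt=(
        "Given the question:\n{input}\n"
        "and the answer:\n{actual_output}\n"
        "score the relevancy of the answer from 0 to 1."
    ),
    tags=["relevancy", "quality"],
    description="Scores how relevant the answer is to the question.",
)

# Self Hosted: no judge_prompt and no model -- a deterministic "Custom Score" metric.
custom_metric = client.create_metric(
    name="exact_match",
    test_type="QUALITY",
    tags=["deterministic"],
    description="Checks whether the output exactly matches the expected answer.",
)

# Both calls return a Metric object, or None if an error occurs.
if judge_metric is None or custom_metric is None:
    print("Metric creation failed; check the API response for details.")
```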
Parameters
- The name of the metric.
- The type of test this metric is designed for. Possible values: QUALITY, RED_TEAMING, SCENARIOS.
- The name of the model used to evaluate the metric. Required for metrics that use judge_prompt. Available models: "GPT-35-turbo", "GPT-4o", "GPT-4o-mini", "GPT-4.1", "Gemini-2.0-flash", "Gemini-2.5-Flash", "Gemini-2.5-Flash-Lite". It should not be provided if the metric is “self hosted” (has no judge_prompt), since such metrics do not require a model for evaluation.
- A custom prompt that defines the evaluation logic for an LLM-as-a-judge metric. You can use placeholders such as {input} and {actual_output}, which will be populated at evaluation time (see the sketch after this list). If you provide a judge_prompt, the metric will be an LLM-based evaluation. If omitted, the metric is considered a deterministic “Custom Score” metric.
- Tags to categorize the metric.
- A brief description of what the metric evaluates.
- A URL pointing to more detailed documentation about the metric.
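To make the placeholder behavior concrete, the snippet below shows how {input} and {actual_output} might be substituted into a judge_prompt at evaluation time. This is only an illustration of the templating idea; the substitution is performed by the platform itself, and its internal mechanism is not documented here.

```python
# Illustration only: the platform populates these placeholders server-side
# at evaluation time; this local rendering just mimics the idea.
judge_prompt = (
    "Question: {input}\n"
    "Answer: {actual_output}\n"
    "Rate the factual accuracy of the answer from 1 to 5 and explain briefly."
)

rendered = judge_prompt.format(
    input="What is the capital of France?",
    actual_output="Paris is the capital of France.",
)
print(rendered)  # the fully populated prompt the judge model would receive
```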