actual_output addresses the user’s original query (input). It helps determine whether the model is generating responses that are focused, appropriate, and directly useful in the context of the question.
This is essential for ensuring the model doesn’t drift into unrelated topics or provide verbose but irrelevant information.
Evaluation Parameters
To compute the answer_relevancy metric, the following parameters are required:
- input: The user’s query or instruction.
- actual_output: The response generated by your LLM application.
How Is It Calculated?
The score is computed via an LLM-as-a-judge process:
- Intent Extraction: An LLM identifies the core informational need in the input.
- Relevancy Judgment: The LLM evaluates whether the actual_output appropriately and directly addresses that need.
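The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Galtea or deepeval implementation: the `Judge` callable, the prompt wording, the `RelevancyResult` type, and the `toy_judge` stand-in are all hypothetical, and the sketch returns a plain yes/no verdict because the way the judgment is turned into a numeric score is not described here.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical judge interface: takes a prompt, returns the judge model's text reply.
Judge = Callable[[str], str]


@dataclass
class RelevancyResult:
    intent: str     # core informational need extracted from the input
    relevant: bool  # whether the output was judged to address that need


def answer_relevancy(user_input: str, actual_output: str, judge: Judge) -> RelevancyResult:
    """Two-step LLM-as-a-judge sketch (prompt wording is an assumption)."""
    # Step 1, Intent Extraction: ask the judge for the core informational need.
    intent = judge(
        "State the core informational need in this query in one sentence.\n"
        f"Query: {user_input}"
    ).strip()

    # Step 2, Relevancy Judgment: ask whether the response addresses that need.
    verdict = judge(
        "Does the response appropriately and directly address the need? "
        "Answer 'yes' or 'no'.\n"
        f"Need: {intent}\nResponse: {actual_output}"
    ).strip().lower()

    return RelevancyResult(intent=intent, relevant=verdict.startswith("yes"))


def toy_judge(prompt: str) -> str:
    """Deterministic stand-in for a real LLM call, for demonstration only."""
    if prompt.startswith("State the core"):
        return "The user wants to know the capital of France."
    return "yes" if "Paris" in prompt else "no"


if __name__ == "__main__":
    question = "What is the capital of France?"
    print(answer_relevancy(question, "Paris is the capital of France.", toy_judge))
    print(answer_relevancy(question, "I enjoy hiking on weekends.", toy_judge))
```

In practice, `toy_judge` would be replaced by a function that calls whichever LLM serves as the judge. Keeping the judge as an injected callable makes the two-step flow easy to test without network access.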
This metric was incorporated into the Galtea platform from the open-source library deepeval. For more information, see the deepeval documentation.