actual_output addresses the user’s original query (input). It helps determine whether the model is generating responses that are focused, appropriate, and directly useful in the context of the question.
This is essential for ensuring the model doesn’t drift into unrelated topics or provide verbose but irrelevant information.
Evaluation Parameters
To compute the answer_relevancy metric, the following parameters are required:
- input: The user’s query or instruction.
- actual_output: The response generated by your LLM application.
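As a minimal sketch, the two required parameters can be bundled into one record before evaluation. The `RelevancyCase` class and its validation are hypothetical names introduced for illustration only. They are not part of any particular evaluation library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelevancyCase:
    """Hypothetical container for the two required parameters."""

    input: str          # the user's query or instruction
    actual_output: str  # the response generated by the LLM application

    def __post_init__(self) -> None:
        # Both fields are required, so reject empty or whitespace-only values.
        for name in ("input", "actual_output"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} must be a non-empty string")


case = RelevancyCase(
    input="What is the capital of France?",
    actual_output="The capital of France is Paris.",
)
```

Validating up front keeps a missing field from being mistaken for an irrelevant answer later on.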
How Is It Calculated?
The answer_relevancy score is derived using an LLM-as-a-judge approach with explicit pass criteria:
- Intent Alignment: Does the actual_output directly address the core informational need expressed in the input?
- Relevancy Check: Is the response focused and on-topic, without drifting into unrelated information?
- 1 (Relevant): The response directly and appropriately addresses the user’s query.
- 0 (Not Relevant): The response fails to address the query, is off-topic, or provides irrelevant information.
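The judging step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the metric's actual implementation: the prompt wording, the `answer_relevancy` function, and the `judge` callable (any function that sends a prompt to an LLM and returns its text reply) are hypothetical, and the sketch assumes both pass criteria must hold for a score of 1.

```python
from typing import Callable

# Hypothetical judge prompt that restates the two pass criteria.
JUDGE_PROMPT = """You are grading an answer for relevancy.
User input: {input}
Model output: {actual_output}

Pass criteria:
1. Intent alignment: the output directly addresses the core informational need in the input.
2. Relevancy check: the output stays focused and on-topic.

Reply with a single character: 1 if both criteria pass, otherwise 0."""


def answer_relevancy(
    input_text: str,
    actual_output: str,
    judge: Callable[[str], str],
) -> int:
    """Return 1 (relevant) or 0 (not relevant) based on the judge's verdict."""
    prompt = JUDGE_PROMPT.format(input=input_text, actual_output=actual_output)
    verdict = judge(prompt).strip()
    # Accept only the two documented scores; anything else is an error.
    if verdict not in {"0", "1"}:
        raise ValueError(f"unexpected judge verdict: {verdict!r}")
    return int(verdict)


def fake_judge(prompt: str) -> str:
    """Stand-in for a real LLM call, used here so the sketch runs offline."""
    return "1"


score = answer_relevancy(
    "What is the capital of France?",
    "The capital of France is Paris.",
    fake_judge,
)
```

Passing the judge in as a callable keeps the scoring logic independent of any specific model provider. In practice `fake_judge` would be replaced with a function that calls your chosen LLM.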