actual_output) and the information found in the retrieval_context. It is a core indicator of hallucination risk in retrieval-augmented generation systems.
A high faithfulness score indicates that the model grounds its answer in retrieved content, rather than introducing unsupported or fabricated information.
Evaluation Parameters
To compute the faithfulness metric, the following inputs are required:
- input: The user’s original prompt.
- actual_output: The LLM-generated response.
- retrieval_context: The retrieved passages or nodes used by the model.
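As a rough illustration of how these three inputs fit together, the sketch below groups them in a small Python container. The class name FaithfulnessInputs is hypothetical and is not a Galtea or deepeval type. Representing retrieval_context as a list with one string per passage is also an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FaithfulnessInputs:
    """Hypothetical container for the three inputs; not a Galtea or deepeval type."""

    input: str                    # the user's original prompt
    actual_output: str            # the LLM-generated response
    retrieval_context: List[str]  # retrieved passages (assumed: one string per passage)

    def __post_init__(self) -> None:
        # Faithfulness compares the answer with the retrieved content,
        # so both have to be present for the comparison to make sense.
        if not self.actual_output.strip():
            raise ValueError("actual_output must not be empty")
        if not self.retrieval_context:
            raise ValueError("retrieval_context must contain at least one passage")


example = FaithfulnessInputs(
    input="When was the Eiffel Tower completed?",
    actual_output="The Eiffel Tower was completed in 1889.",
    retrieval_context=["The Eiffel Tower was completed in 1889 for the World's Fair."],
)
```

The validation in __post_init__ is a design choice for this sketch only: it rejects cases where there is nothing to compare, rather than silently producing a score.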
How Is It Calculated?
The score is computed using the following steps:
- Fact Comparison: An LLM analyzes whether the statements made in actual_output are substantiated by the retrieval_context.
- Hallucination Check: The LLM flags any unsupported claims or discrepancies.
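The two steps above can be sketched in Python under some loudly stated assumptions. The text does not say how the verdicts are aggregated, so the sketch assumes the score is the fraction of claims judged supported. It also assumes the claims have already been extracted from actual_output. The LLM judge is replaced by a pluggable callable, and naive_judge is a toy substring matcher standing in for it. All function names here are hypothetical and are not part of Galtea or deepeval.

```python
from typing import Callable, List, Tuple

# judge(claim, retrieval_context) -> True if the context supports the claim.
# In the real metric an LLM plays this role; here it is a pluggable callable.
Judge = Callable[[str, List[str]], bool]


def faithfulness_score(
    claims: List[str], retrieval_context: List[str], judge: Judge
) -> Tuple[float, List[str]]:
    """Return (score, unsupported_claims) for claims taken from actual_output."""
    if not claims:
        # Assumption: an answer that makes no claims has nothing unsupported.
        return 1.0, []
    # Fact comparison: ask the judge about each claim.
    # Hallucination check: collect the claims the judge could not support.
    unsupported = [c for c in claims if not judge(c, retrieval_context)]
    # Assumed aggregation: share of claims that are supported.
    score = (len(claims) - len(unsupported)) / len(claims)
    return score, unsupported


def naive_judge(claim: str, retrieval_context: List[str]) -> bool:
    # Toy stand-in for the LLM: case-insensitive substring match.
    return any(claim.lower() in passage.lower() for passage in retrieval_context)


context = ["The Eiffel Tower was completed in 1889. It is located in Paris."]
claims = ["the eiffel tower was completed in 1889", "it is 500 metres tall"]
score, unsupported = faithfulness_score(claims, context, naive_judge)
print(score, unsupported)  # 0.5 ['it is 500 metres tall']
```

A substring match is far too strict for real use, since it misses paraphrases entirely. It is used here only so that the example runs without a model or an API key.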
This metric was incorporated into the Galtea platform from the open-source library deepeval. For more information, see the deepeval documentation.