The Contextual Recall metric is one of several non-deterministic Metric Types Galtea uses to evaluate whether the information retrieved by your RAG pipeline sufficiently covers the knowledge needed to produce the correct or expected output. It measures completeness rather than ranking.

This metric is valuable for identifying retrieval gaps—cases where important information was missing from the context entirely.


Evaluation Parameters

To compute the contextual_recall metric, the following inputs are required:

  • input: The user’s query.
  • retrieval_context: The set of documents or nodes retrieved by the system.
  • expected_output: The reference or target response that should be generated.
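As an illustration of how these three inputs relate, here is a minimal sketch of one evaluation record. The `ContextualRecallInputs` class and the sample values are hypothetical and used only for illustration; they are not part of the Galtea SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContextualRecallInputs:
    """Hypothetical container for the three inputs; not a Galtea SDK type."""

    input: str                    # the user's query
    retrieval_context: List[str]  # documents or nodes returned by the retriever
    expected_output: str          # the reference response


# Made-up sample record, for illustration only.
example = ContextualRecallInputs(
    input="What is the capital of France?",
    retrieval_context=["Paris is the capital and largest city of France."],
    expected_output="The capital of France is Paris.",
)
```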

How Is It Calculated?

The score is computed using an LLM that performs two steps:

  1. Information Need Inference: Determines what key facts or concepts are necessary to produce the expected_output.
  2. Coverage Check: Verifies whether those pieces of information exist in the retrieval_context.

The metric is calculated as:

$$
\text{Contextual Recall} = \frac{\text{Number of Attributable Statements}}{\text{Total number of expressed statements}}
$$

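Higher scores reflect better recall: the retriever captured more of the supporting information needed to produce the expected_output. A score of 1 means every statement in the expected_output can be attributed to the retrieval_context.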
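The ratio itself is simple arithmetic once the LLM has judged each statement. The sketch below assumes those judgments are already available as a list of booleans, one per statement in the expected_output. The function name and the sample verdicts are hypothetical, and the LLM steps are not reproduced here.

```python
from typing import Sequence


def contextual_recall(attributable: Sequence[bool]) -> float:
    """Return the fraction of statements attributable to the retrieval context.

    Each element is an assumed LLM verdict for one statement in the
    expected output: True if the retrieval context supports it.
    """
    if not attributable:
        raise ValueError("at least one statement is required")
    return sum(attributable) / len(attributable)


# Hypothetical verdicts: 3 of 4 statements are supported by the context.
verdicts = [True, True, False, True]
score = contextual_recall(verdicts)
print(score)  # 0.75
```

In this made-up case, the one unsupported statement marks a retrieval gap: information needed for the expected output that the retriever did not return.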

This metric was incorporated into the Galtea platform from the open-source library deepeval. For more information, you can also visit its documentation.