Knowledge Retention
Checks whether your product can follow a multi-turn conversation without losing information.
The Knowledge Retention metric is one of several non-deterministic Metric Types Galtea uses to evaluate your LLM-based chatbot’s ability to retain and consistently apply factual information shared earlier in a conversation. It analyzes the entire conversational history to determine whether the model recalls and reuses relevant facts when generating new responses.
This is particularly useful for long, multi-turn dialogues where context accumulation and memory play a crucial role in the user experience.
Evaluation Parameters
To compute the `knowledge_retention` metric, the following parameters are required:
- `input`: The most recent user input in the conversation.
- `actual_output`: The most recent LLM-generated response.
- `conversational_turns`: The complete list of preceding user and assistant messages in the conversation (i.e., the context/history).
These inputs allow the metric to simulate a memory-check process across multiple turns.
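As a concrete illustration, the parameters above might be assembled like this. The field names follow the docs, but the dict layout, the order-tracking conversation, and the variable names are illustrative, not the Galtea SDK's actual API:

```python
# Hypothetical shape of the evaluation parameters (illustrative example,
# not the Galtea SDK's API). Field names follow the metric's documentation.
conversational_turns = [
    {"role": "user", "content": "My order number is 48213."},
    {"role": "assistant", "content": "Got it - order 48213."},
    {"role": "user", "content": "It still hasn't arrived."},
    {"role": "assistant", "content": "Sorry about that, let me look into it."},
]

evaluation_params = {
    "input": "Can you check its status?",                        # latest user input
    "actual_output": "Checking the status of order 48213 now.",  # latest LLM response
    "conversational_turns": conversational_turns,                # full prior history
}
```

Note that the order number shared in the first turn is the kind of fact the metric checks for in later responses.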
How Is It Calculated?
The `knowledge_retention` score is based on a two-step LLM-driven process:
- **Knowledge Extraction**: An LLM extracts key facts, assertions, and informational snippets from the preceding `conversational_turns`.
- **Retention Evaluation**: For each subsequent LLM response, the system assesses whether the model has retained the relevant knowledge or has exhibited information attrition (e.g., contradictions, omissions, or inconsistencies).
The final score is calculated as the fraction of evaluated responses that are free of knowledge attrition:

Knowledge Retention = (Number of responses without knowledge attrition) / (Total number of evaluated responses)

This yields a score between 0 (no retention) and 1 (perfect retention), where higher values indicate stronger memory consistency over the course of the conversation.
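The scoring step can be sketched as follows, assuming the per-response attrition judgments from the Retention Evaluation step have already been collected (in the real metric, an evaluator LLM produces them):

```python
def knowledge_retention_score(attrition_flags):
    """Compute the retention score from per-response judgments.

    attrition_flags: one bool per evaluated assistant response; True means
    the response contradicted, omitted, or misused earlier knowledge.
    (Illustrative sketch, not the Galtea implementation.)
    """
    if not attrition_flags:
        return 1.0  # nothing to evaluate: vacuously perfect retention
    retained = attrition_flags.count(False)
    return retained / len(attrition_flags)

# Worked example: 3 of 4 evaluated responses are attrition-free.
knowledge_retention_score([False, False, True, False])  # → 0.75
```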