Measures a language model’s robustness to input noise such as typos, OCR/ASR errors, grammatical mistakes, and distracting content.
The Resilience To Noise metric is one of several RAG Metric Types Galtea uses to evaluate your LLM-based chatbot’s ability to maintain response accuracy and coherence when faced with noisy or corrupted input. This includes typos, OCR/ASR errors, grammatical mistakes, and distracting content.
This metric is essential for assessing how well your product performs in real-world scenarios where user input may not always be clean or well-formed.
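To make "noisy input" concrete, here is a minimal Python sketch of how clean test prompts could be corrupted before they are sent to a chatbot. The helpers `add_typos` and `add_distractor` are hypothetical names written for this illustration and are not part of Galtea. They cover only two kinds of noise: adjacent-letter typos and appended distracting content.

```python
import random


def add_typos(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Simulate keyboard typos by swapping adjacent letters (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so the same noise can be reproduced
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        both_letters = chars[i].isalpha() and chars[i + 1].isalpha()
        if both_letters and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so it is not swapped back
        else:
            i += 1
    return "".join(chars)


def add_distractor(text: str, distractor: str) -> str:
    """Append irrelevant content to the user message (hypothetical helper)."""
    return f"{text} {distractor}"


clean = "What is the refund policy for annual plans?"
noisy = add_distractor(add_typos(clean, rate=0.15), "By the way, my cat is named Milo.")
print(noisy)
```

Swapping letters keeps the message length and character set unchanged, so the intent should still be recoverable by a robust model. Other noise types, such as OCR or ASR errors, would need their own generators.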
To compute the resilience_to_noise metric, the following parameters are required in every turn of the conversation:
input: The user message in the conversation, which is assumed to contain some form of noise or irrelevant information.
actual_output: The chatbot’s corresponding response.
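As an illustration of the requirement above, the following sketch checks that every turn carries both parameters before any evaluation is attempted. The `Turn` dataclass and `validate_turns` function are assumptions made for this example. Only the field names `input` and `actual_output` come from the metric description, and this is not Galtea's SDK.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One conversation turn with the two fields the metric needs."""
    input: str          # the (noisy) user message
    actual_output: str  # the chatbot's response to that message


def validate_turns(turns: list[dict]) -> list[Turn]:
    """Raise ValueError if any turn lacks a non-empty required field."""
    required = ("input", "actual_output")
    validated = []
    for index, turn in enumerate(turns):
        missing = [key for key in required if not turn.get(key)]
        if missing:
            raise ValueError(f"turn {index} is missing: {', '.join(missing)}")
        validated.append(Turn(input=turn["input"], actual_output=turn["actual_output"]))
    return validated


conversation = [
    {"input": "waht is teh refnud polciy?", "actual_output": "Refunds are available within 30 days."},
    {"input": "adn for anual plans??", "actual_output": "Annual plans follow the same 30-day window."},
]
turns = validate_turns(conversation)
```

Failing early on an incomplete turn avoids sending a judge a half-empty record.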
This metric specifically evaluates the model’s ability to handle noisy input, so it is not meaningful to apply it to clean or noise-free data.
The resilience_to_noise score is derived using an LLM-as-a-judge approach: the judge assesses whether the actual_output maintains accuracy and coherence despite the presence of noise in the input.
Scores range from 0 (completely disrupted by noise) to 1 (fully robust to noise), helping you monitor and improve your model’s resilience in practical, noisy environments.
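The LLM-as-a-judge step can be sketched roughly as follows. Galtea's actual judge prompt and scoring logic are not shown in this document, so the template text, `build_judge_prompt`, and `parse_score` are all hypothetical. The sketch also assumes the judge replies with a number, which the source does not confirm. It covers only building a prompt from one turn and clamping the judge's reply into the documented 0 to 1 range. The call to the judge model itself is left out.

```python
import re

# Hypothetical prompt wording; the real judge prompt is not documented here.
JUDGE_TEMPLATE = (
    "The user message below contains noise (typos, OCR/ASR errors, "
    "grammatical mistakes, or distracting content).\n"
    "User message: {input}\n"
    "Chatbot response: {actual_output}\n"
    "Rate from 0 (completely disrupted by noise) to 1 (fully robust to noise) "
    "how well the response maintains accuracy and coherence. "
    "Reply with the number only."
)


def build_judge_prompt(user_input: str, actual_output: str) -> str:
    """Fill the template with one turn's input and actual_output."""
    return JUDGE_TEMPLATE.format(input=user_input, actual_output=actual_output)


def parse_score(judge_reply: str) -> float:
    """Extract the first number from the judge's reply and clamp it to [0, 1]."""
    match = re.search(r"-?\d+(?:\.\d+)?", judge_reply)
    if match is None:
        raise ValueError(f"no numeric score found in judge reply: {judge_reply!r}")
    return min(1.0, max(0.0, float(match.group())))


prompt = build_judge_prompt("waht is teh refnud polciy?", "Refunds are available within 30 days.")
score = parse_score("0.9")
```

Clamping guards against a judge that answers outside the expected range. A reply with no number raises an error rather than silently defaulting to a score.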