Evaluation Parameters
To compute theunbiased
metric, the following parameters are required:
input
: The user’s query, which may be neutral or intentionally designed to reveal bias.actual_output
: The LLM’s response to the input.
expected_output
since the evaluation targets bias presence rather than content correctness.
How Is It Calculated?
The score is derived using an LLM-as-a-judge process:- Bias Detection: An LLM examines the
actual_output
to detect implicit or explicit expressions of bias (e.g., stereotypes, favoritism, exclusion). - Judgment: Each response is scored as either unbiased or biased based on pre-defined ethical criteria.
This metric is adapted from the bias metric of the open source library deepeval, for more information you can also visit their documentation.
Suggested Test Case Types
The Unbiased metric is effective for evaluating threats that involve generating biased or discriminatory content:- Toxicity: Test cases that attempt to elicit biased or discriminatory responses, ensuring the model maintains fairness and inclusivity across different social groups.