Role Adherence

The Role Adherence metric is one of several non-deterministic Metric Types Galtea uses to evaluate whether your LLM-based chatbot stays consistent with its assigned role throughout a conversation. The role can be defined by a system prompt (e.g., “you are a travel assistant”) or by contextual constraints (e.g., tone, domain, responsibilities). This is especially important in enterprise and safety-sensitive applications, where the chatbot must not deviate from its designated behavior or scope.

Evaluation Parameters

To compute the role_adherence metric, the following inputs are required in every turn of the conversation:

input: The current user message.
actual_output: The corresponding chatbot response.

The metric considers the whole conversation, including all turns, to assess consistency with the assigned role over time.
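As an illustration of the per-turn inputs described above, the following minimal sketch represents a conversation as a list of plain Python dictionaries and checks that every turn carries both fields. Only the field names `input` and `actual_output` come from this page; the `missing_fields` helper and the sample conversation are hypothetical and are not part of the Galtea SDK.

```python
from typing import Dict, List

# Field names taken from the "Evaluation Parameters" list above.
REQUIRED_KEYS = ("input", "actual_output")


def missing_fields(turns: List[Dict[str, str]]) -> List[str]:
    """Return one message per required field that is absent or empty in a turn.

    Hypothetical helper for illustration only, not a Galtea API.
    """
    problems = []
    for index, turn in enumerate(turns):
        for key in REQUIRED_KEYS:
            if not turn.get(key):
                problems.append(f"turn {index}: missing '{key}'")
    return problems


# Sample conversation for a chatbot assigned the role of a travel assistant.
conversation = [
    {
        "input": "Can you suggest a weekend trip from Madrid?",
        "actual_output": "Toledo is a short train ride away and works well for two days.",
    },
    {
        "input": "What is the best stock to buy right now?",
        "actual_output": "I can only help with travel planning, but I can suggest budget-friendly destinations.",
    },
]

print(missing_fields(conversation))  # prints []
```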

How Is It Calculated?

The role_adherence score is computed through the following LLM-based steps:

Role Identification: The system extracts the chatbot’s assigned role from the initial context or system prompt.
Deviation Check: For each turn, the LLM determines whether the actual_output deviates from or contradicts the expected behavior of that role.

The metric is computed as:

$$
\text{Role Adherence} = \frac{\text{Number of in-role responses}}{\text{Total number of evaluated responses}}
$$

Scores close to 1 indicate strong consistency with the chatbot’s intended persona or function.
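The ratio above can be sketched in a few lines of Python. This is a minimal illustration, not Galtea's implementation: the per-turn deviation check is performed by an LLM in the real metric, so here it is replaced by a hypothetical `is_in_role` callable, and `keyword_judge` is a deliberately naive stand-in used only to make the example runnable.

```python
from typing import Callable, Dict, List

Turn = Dict[str, str]


def role_adherence_score(turns: List[Turn], is_in_role: Callable[[Turn], bool]) -> float:
    """Fraction of evaluated responses judged to be in role.

    `is_in_role` is a hypothetical stand-in for the LLM-based deviation check.
    """
    if not turns:
        raise ValueError("at least one turn is required to compute the score")
    in_role = sum(1 for turn in turns if is_in_role(turn))
    return in_role / len(turns)


def keyword_judge(turn: Turn) -> bool:
    """Naive placeholder judge: a travel assistant should not discuss stocks."""
    return "stock" not in turn["actual_output"].lower()


turns = [
    {"input": "Plan a day in Lisbon.", "actual_output": "Start in Alfama, then visit Belem."},
    {"input": "Any food tips?", "actual_output": "Try a pastel de nata near the monastery."},
    {"input": "What should I invest in?", "actual_output": "Tech stocks look strong this year."},
    {"input": "How do I get to Sintra?", "actual_output": "Take the train from Rossio station."},
]

print(role_adherence_score(turns, keyword_judge))  # prints 0.75
```

Three of the four responses stay within the travel-assistant role, so the score is 3/4 = 0.75.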

This metric is adapted from the role adherence metric of the open-source library deepeval. For more information, see the deepeval documentation.
