Role Adherence

The Role Adherence metric is one of several non-deterministic metrics Galtea uses to evaluate whether your LLM-based chatbot stays consistent with its assigned role throughout a conversation. The role may be defined by a system prompt (e.g., “you are a travel assistant”) or by contextual constraints (e.g., tone, domain, responsibilities). This matters most in enterprise and safety-sensitive applications, where the chatbot must not deviate from its designated behavior or scope.

Evaluation Parameters

To compute the role_adherence metric, the following inputs are required in every turn of the conversation:

input: The current user message.
actual_output: The corresponding chatbot response.

The metric evaluates the whole conversation, across all turns, to assess consistency with the assigned role over time.
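
As a minimal sketch of the per-turn inputs described above, the snippet below models a conversation as a list of turns and checks that each one carries both fields. The `Turn` class and `find_incomplete_turns` helper are hypothetical names used for illustration only; they are not part of the Galtea SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """One conversation turn (hypothetical container, not a Galtea SDK type)."""
    input: str          # the current user message
    actual_output: str  # the corresponding chatbot response


def find_incomplete_turns(turns: List[Turn]) -> List[int]:
    """Return the indices of turns where either required field is empty."""
    return [
        i for i, turn in enumerate(turns)
        if not turn.input.strip() or not turn.actual_output.strip()
    ]


conversation = [
    Turn(input="Can you suggest a hotel in Lisbon?",
         actual_output="Of course. Here are two well-reviewed options."),
    Turn(input="What about flights?",
         actual_output="I found three direct flights for your dates."),
]
```

Running a check like this before evaluation helps ensure the judge sees a complete conversation, since the metric looks at every turn rather than a single exchange.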

How Is It Calculated?

The role_adherence score is computed using an LLM-as-a-judge approach:

Define the Persona: Based on the product_description, the LLM identifies the expected persona, tone, professional boundaries, and style.
Audit the Conversation: The LLM reviews every response from the agent in the conversation history.
Check for Deviations: The LLM evaluates whether the agent broke character, violated tone constraints, strayed from its designated responsibilities, or suddenly deviated from the inferred role.
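
To make the three steps more concrete, here is a hedged sketch of how a judge prompt could be assembled from the product_description and the conversation history. The wording of the prompt and the `build_judge_prompt` function are assumptions for illustration; the prompt Galtea actually uses is not shown in this document.

```python
from typing import List, Tuple


def build_judge_prompt(product_description: str,
                       turns: List[Tuple[str, str]]) -> str:
    """Assemble an illustrative judge prompt (hypothetical wording).

    Each turn is an (input, actual_output) pair.
    """
    # Render the full history so the judge can audit every agent response.
    transcript = "\n".join(
        f"Turn {i}\nUser: {user_msg}\nAgent: {reply}"
        for i, (user_msg, reply) in enumerate(turns, start=1)
    )
    return (
        "Product description:\n"
        f"{product_description}\n\n"
        "Step 1. From the product description, infer the expected persona, "
        "tone, professional boundaries, and style.\n"
        "Step 2. Review every agent response in the conversation below.\n"
        "Step 3. Report whether the agent broke character, violated tone "
        "constraints, or strayed from its responsibilities at any point.\n\n"
        f"Conversation:\n{transcript}\n"
    )


prompt = build_judge_prompt(
    "A travel assistant that helps users plan trips.",
    [("Find me a flight to Rome.", "Sure, here are three options.")],
)
```

The resulting string would then be sent to whichever LLM acts as the judge; that call is omitted here because it depends on the provider.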

The metric assigns a binary score:

Score 1.0 (Adherent): The agent consistently maintained its role, tone, and persona throughout all turns.
Score 0.0 (Non-Adherent): The agent deviated from its role, broke character, or adopted an inconsistent tone at any point.
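
The binary rule above can be sketched as follows: a single deviation in any turn is enough to produce 0.0. The `role_adherence_score` function is a hypothetical illustration, and `toy_judge` is a keyword-based stand-in for the LLM judge so the example runs without a model. It is not how Galtea detects deviations.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]                    # (input, actual_output)
Judge = Callable[[str, str, str], bool]   # returns True if the reply deviates


def role_adherence_score(product_description: str,
                         turns: List[Turn],
                         deviates: Judge) -> Tuple[float, List[int]]:
    """Return (score, indices of deviating turns) under the binary rule."""
    deviating_turns = [
        i for i, (user_msg, reply) in enumerate(turns)
        if deviates(product_description, user_msg, reply)
    ]
    # Any deviation at any point makes the whole conversation non-adherent.
    score = 0.0 if deviating_turns else 1.0
    return score, deviating_turns


def toy_judge(product_description: str, user_msg: str, reply: str) -> bool:
    """Stand-in judge (assumption): flags replies that mention stocks."""
    return "stock" in reply.lower()


description = "A travel assistant that helps users plan trips."
on_role = [("Find me a flight to Rome.", "Sure, here are three options for Rome.")]
off_role = on_role + [("Any investing tips?", "You should buy tech stocks right now.")]
```

Returning the deviating turn indices alongside the score is a design choice of this sketch; it makes a 0.0 result easier to trace back to the response that caused it.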
