Evaluation Parameters
To compute theuser_satisfaction
metric, the following parameter is required:
conversation_turns
: The complete history of user inputs and chatbot responses.
How Is It Calculated?
Theuser_satisfaction
score is derived using an LLM-as-a-judge approach with explicit pass criteria:
- Efficiency Check: Was the interaction smooth and direct, or did the user have to rephrase, repeat, or correct the chatbot?
- Sentiment Analysis: Did the user display positive/neutral sentiment or negative sentiment?
- 1 (Satisfied): The interaction was efficient and the user’s sentiment was neutral-to-positive.
- 0 (Not Satisfied): The interaction was inefficient, the user expressed frustration, or both.