Evaluation Parameters
To compute the user_satisfaction metric, the following parameters are required:
- input: The user messages sent to the chatbot.
- actual_output: The chatbot’s corresponding responses.
How Is It Calculated?
The user_satisfaction score is derived using an LLM-as-a-judge approach with explicit pass criteria:
- Efficiency Check: Was the interaction smooth and direct, or did the user have to rephrase, repeat, or correct the chatbot?
- Sentiment Analysis: Did the user display positive/neutral sentiment or negative sentiment?
Based on these two checks, the judge assigns a binary score:
- 1 (Satisfied): The interaction was efficient and the user’s sentiment was neutral-to-positive.
- 0 (Not Satisfied): The interaction was inefficient, the user expressed frustration, or both.
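
The scoring rule above can be sketched in Python. This is only an illustration under stated assumptions, not the metric's actual implementation: `JudgeVerdict`, `build_judge_prompt`, and `user_satisfaction_score` are hypothetical names, the prompt wording is invented, and the call to the LLM judge is left out. The sketch assumes the judge returns an efficiency flag and a sentiment label.

```python
from dataclasses import dataclass

# Assumed labels: the criteria only separate positive/neutral from negative.
ACCEPTABLE_SENTIMENTS = {"positive", "neutral"}


@dataclass
class JudgeVerdict:
    """Hypothetical container for the two checks an LLM judge would report."""
    efficient: bool  # False if the user had to rephrase, repeat, or correct the chatbot
    sentiment: str   # "positive", "neutral", or "negative"


def build_judge_prompt(inputs: list[str], actual_outputs: list[str]) -> str:
    """Pair each user message with its response and format a prompt for the judge."""
    turns = []
    for user_msg, bot_msg in zip(inputs, actual_outputs):
        turns.append(f"User: {user_msg}")
        turns.append(f"Chatbot: {bot_msg}")
    transcript = "\n".join(turns)
    return (
        "Assess the conversation below.\n"
        "1. Efficiency: did the user have to rephrase, repeat, or correct the chatbot?\n"
        "2. Sentiment: was the user's sentiment positive, neutral, or negative?\n\n"
        + transcript
    )


def user_satisfaction_score(verdict: JudgeVerdict) -> int:
    """Return 1 only when both checks pass; otherwise return 0."""
    sentiment_ok = verdict.sentiment.lower() in ACCEPTABLE_SENTIMENTS
    return 1 if verdict.efficient and sentiment_ok else 0
```

For example, a verdict of `JudgeVerdict(efficient=True, sentiment="neutral")` scores 1. An inefficient interaction or a negative sentiment alone is enough to score 0.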