Learn how to evaluate multi-turn conversations using Galtea’s session-based workflow.
Create a Session
Log Inference Results
Evaluate the Session
Evaluate the session by calling the evaluation_tasks.create() method. This generates evaluation tasks for each turn against the specified metrics.
For production monitoring, create the session with is_production=True and omit the test_case_id. See the Monitor Production Responses guide for an example.
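The three steps above can be sketched as one helper. This is an illustration of the call order only, not the SDK's documented interface: of the names used here, only evaluation_tasks.create(), is_production, and test_case_id come from this guide. The sessions.create() and inference_results.create() calls, and the version_id, session_id, input, output, and metrics parameter names, are assumptions, and the client is a hypothetical recording stand-in so the sketch runs without network access or credentials.

```python
from types import SimpleNamespace


class _Recorder:
    """Hypothetical stand-in for one SDK service: records calls, sends nothing."""

    def __init__(self, name, log):
        self._name = name
        self._log = log

    def create(self, **kwargs):
        self._log.append((self._name, kwargs))
        # Return an object carrying an id, as a real service plausibly would.
        return SimpleNamespace(id=f"{self._name}-{len(self._log)}", **kwargs)


def make_fake_client():
    """Build a fake client exposing the three services the workflow touches."""
    log = []
    client = SimpleNamespace(
        sessions=_Recorder("sessions", log),
        inference_results=_Recorder("inference_results", log),
        evaluation_tasks=_Recorder("evaluation_tasks", log),
    )
    return client, log


def evaluate_conversation(client, version_id, turns, metrics, test_case_id=None):
    """Create a session, log every turn, then request evaluation.

    All parameter names are assumptions made for this sketch.
    """
    # 1. Create a session. With no test case, mark it as production traffic.
    session_args = {"version_id": version_id}
    if test_case_id is None:
        session_args["is_production"] = True
    else:
        session_args["test_case_id"] = test_case_id
    session = client.sessions.create(**session_args)

    # 2. Log one inference result per conversation turn.
    for user_input, model_output in turns:
        client.inference_results.create(
            session_id=session.id, input=user_input, output=model_output
        )

    # 3. Evaluate the whole session against the chosen metrics.
    return client.evaluation_tasks.create(session_id=session.id, metrics=metrics)


client, log = make_fake_client()
turns = [
    ("Hi", "Hello! How can I help?"),
    ("I forgot my password", "I have sent a reset link to your email."),
]
# "metric-a" and the ids below are placeholders, not real Galtea values.
evaluate_conversation(client, "version-123", turns, ["metric-a"], test_case_id="tc-1")
print([name for name, _ in log])
# prints ['sessions', 'inference_results', 'inference_results', 'evaluation_tasks']
```

Calling evaluate_conversation() without test_case_id takes the production path: the session is created with is_production=True and no test case. To use the real SDK, pass the actual client in place of the fake one and check each call and parameter name against the SDK reference.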