Create evaluations for single-turn interactions. See Evaluations.
API key authorization. Pass your API key in the Authorization header as a Bearer token. Both new (gsk_*) and legacy (gsk-*) API keys are accepted, e.g. Authorization: Bearer gsk_... or Authorization: Bearer gsk-....
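As a minimal sketch of the header format described above, the helper below builds the Authorization header from a key. The function name `build_auth_header` is hypothetical, and the client-side prefix check is an assumption added for illustration; the API itself may validate keys differently.

```python
def build_auth_header(api_key: str) -> dict[str, str]:
    """Return the Authorization header for a new (gsk_) or legacy (gsk-) key."""
    # Assumption: rejecting other prefixes locally is a convenience check,
    # not documented server behavior.
    if not api_key.startswith(("gsk_", "gsk-")):
        raise ValueError("API key must start with 'gsk_' or 'gsk-'")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.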
"ver_123"
"Model response text"
Test case ID. Required when isProduction is false. Must be omitted when isProduction is true.
"tc_123"
User input/prompt. Required when isProduction is true. Must be omitted when isProduction is false.
"What is the capital of France?"
RAG retrieval context used to generate the actual output
"Retrieved context document"
When true, creates a production evaluation (input required, testCaseId must be omitted). When false (default), testCaseId is required and input must be omitted.
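The rule above (input and testCaseId are mutually exclusive, selected by isProduction) can be checked before sending a request. The sketch below uses only the three field names that appear in this reference; the function name `check_evaluation_mode` is hypothetical, and treating a missing isProduction as false follows the stated default.

```python
from typing import Any, Mapping


def check_evaluation_mode(item: Mapping[str, Any]) -> None:
    """Raise ValueError if input/testCaseId do not match isProduction."""
    # isProduction defaults to false when it is not supplied.
    is_production = bool(item.get("isProduction", False))
    has_input = "input" in item
    has_test_case = "testCaseId" in item

    if is_production:
        if not has_input:
            raise ValueError("input is required when isProduction is true")
        if has_test_case:
            raise ValueError("testCaseId must be omitted when isProduction is true")
    else:
        if not has_test_case:
            raise ValueError("testCaseId is required when isProduction is false")
        if has_input:
            raise ValueError("input must be omitted when isProduction is false")
```

For example, `check_evaluation_mode({"isProduction": True, "input": "What is the capital of France?"})` passes, while the same call with a testCaseId added raises ValueError.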
Evaluations created successfully
"eval_123"
"metric_123"
"session_123"
"user_123"
One of: PENDING, PENDING_HUMAN, SUCCESS, FAILED, SKIPPED
"SUCCESS"
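When handling responses, the five status values listed above can be modeled as an enumeration so that unexpected strings are caught early. This is only a sketch: the names `EvaluationStatus` and `parse_status` are hypothetical and are not part of the API.

```python
from enum import Enum


class EvaluationStatus(str, Enum):
    """Status values listed in this reference."""
    PENDING = "PENDING"
    PENDING_HUMAN = "PENDING_HUMAN"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"
    SKIPPED = "SKIPPED"


def parse_status(raw: str) -> EvaluationStatus:
    """Convert a status string from a response; unknown values raise ValueError."""
    return EvaluationStatus(raw)
```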
"tc_123"
"ir_123"
0.95
"High quality response"
false
1
"1.0.0"
User ID of the human evaluator
Conversation turns that failed