POST /evaluations/singleTurn
Create single-turn evaluations
curl --request POST \
  --url https://api.galtea.ai/evaluations/singleTurn \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "metrics": [
    {
      "id": "id_123",
      "name": "Example Name",
      "score": 0.95
    }
  ],
  "versionId": "ver_123",
  "actualOutput": "Model response text",
  "input": "What is the capital of France?",
  "retrievalContext": "Retrieved context document",
  "isProduction": true
}
'
[
  {
    "id": "eval_123",
    "metricId": "metric_123",
    "sessionId": "session_123",
    "userId": "user_123",
    "status": "SUCCESS",
    "testCaseId": "tc_123",
    "inferenceResultId": "ir_123",
    "score": 0.95,
    "reason": "High quality response",
    "error": "<string>",
    "canRetry": false,
    "creditsUsed": 1,
    "conversationSimulatorVersion": "1.0.0",
    "humanEvaluatorId": "<string>",
    "humanEvaluatorStartedAt": "2023-11-07T05:31:56Z",
    "failedTurns": [
      "<string>"
    ],
    "createdAt": "2023-11-07T05:31:56Z",
    "deletedAt": "2023-11-07T05:31:56Z",
    "evaluatedAt": "2023-11-07T05:31:56Z",
    "metricLegacyAt": "2023-11-07T05:31:56Z",
    "metricDisabledAt": "2023-11-07T05:31:56Z"
  }
]

Authorizations

Authorization
string
header
required

API key authorization. Pass your API key in the Authorization header as a Bearer token. Keys with the new gsk_ prefix and keys with the legacy gsk- prefix are both accepted, e.g. Authorization: Bearer gsk_... or Authorization: Bearer gsk-...
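As an illustration of the header format described above, here is a minimal Python sketch that builds the request headers. The function name `auth_headers` is hypothetical, and the client-side prefix check is an assumption added for convenience; the API itself is the authority on which keys are valid.

```python
from __future__ import annotations


def auth_headers(api_key: str) -> dict[str, str]:
    """Build headers for a JSON request authenticated with a Bearer token.

    The prefix check mirrors the two key formats named in the docs
    (gsk_ and gsk-). It is a client-side sanity check only (assumption),
    not a guarantee that the server will accept the key.
    """
    if not api_key.startswith(("gsk_", "gsk-")):
        raise ValueError("API key should start with 'gsk_' or 'gsk-'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed to whichever HTTP client you use.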

Body

application/json
metrics
object[]
required
versionId
string
required
Example:

"ver_123"

actualOutput
string
required
Example:

"Model response text"

testCaseId
string

Required when isProduction is false. Must be omitted when isProduction is true.

Example:

"tc_123"

input
string

User input/prompt. Required when isProduction is true. Must be omitted when isProduction is false.

Example:

"What is the capital of France?"

retrievalContext
string

RAG retrieval context used to generate the actual output

Example:

"Retrieved context document"

isProduction
boolean

When true, creates a production evaluation (input required, testCaseId must be omitted). When false (default), testCaseId is required and input must be omitted.
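The mutual-exclusion rule between testCaseId and input is easy to get wrong, so it can help to check it before sending the request. The following Python sketch builds a request body and enforces the rule as documented above. The function name `build_single_turn_payload` and its parameter names are hypothetical; only the JSON field names come from this page.

```python
from __future__ import annotations

from typing import Any, Optional


def build_single_turn_payload(
    metrics: list[dict[str, Any]],
    version_id: str,
    actual_output: str,
    *,
    is_production: bool = False,
    test_case_id: Optional[str] = None,
    input_text: Optional[str] = None,
    retrieval_context: Optional[str] = None,
) -> dict[str, Any]:
    """Return a JSON-serializable body for POST /evaluations/singleTurn."""
    if is_production:
        # Production evaluations carry the raw user input, not a test case.
        if input_text is None:
            raise ValueError("input is required when isProduction is true")
        if test_case_id is not None:
            raise ValueError("testCaseId must be omitted when isProduction is true")
    else:
        # Non-production evaluations reference an existing test case.
        if test_case_id is None:
            raise ValueError("testCaseId is required when isProduction is false")
        if input_text is not None:
            raise ValueError("input must be omitted when isProduction is false")

    payload: dict[str, Any] = {
        "metrics": metrics,
        "versionId": version_id,
        "actualOutput": actual_output,
        "isProduction": is_production,
    }
    if is_production:
        payload["input"] = input_text
    else:
        payload["testCaseId"] = test_case_id
    if retrieval_context is not None:
        # Optional: only sent when a RAG context was actually used.
        payload["retrievalContext"] = retrieval_context
    return payload
```

Serializing the result with `json.dumps` gives a body suitable for the `--data` argument of the curl request.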

Response

Evaluations created successfully

id
string
Example:

"eval_123"

metricId
string
Example:

"metric_123"

sessionId
string
Example:

"session_123"

userId
string | null
Example:

"user_123"

status
enum<string>
Available options:
PENDING,
PENDING_HUMAN,
SUCCESS,
FAILED,
SKIPPED
Example:

"SUCCESS"

testCaseId
string | null
Example:

"tc_123"

inferenceResultId
string | null
Example:

"ir_123"

score
number | null
Example:

0.95

reason
string | null
Example:

"High quality response"

error
string | null
canRetry
boolean | null
Example:

false

creditsUsed
integer | null
Example:

1

conversationSimulatorVersion
string | null
Example:

"1.0.0"

humanEvaluatorId
string | null

User ID of the human evaluator

humanEvaluatorStartedAt
string<date-time> | null
failedTurns
string[]

Conversation turns that failed

createdAt
string<date-time>
deletedAt
string<date-time> | null
evaluatedAt
string<date-time> | null
metricLegacyAt
string<date-time> | null
metricDisabledAt
string<date-time> | null
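
Because the response is an array of evaluation objects, a client will usually want to sort the results by status. The Python sketch below groups a parsed response by its status field and picks out failed evaluations flagged with canRetry. The helper names are hypothetical, and reading canRetry as "safe to resubmit" is an assumption inferred from the field name, not something this page states.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any


def group_by_status(evaluations: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group evaluation objects by their status value (e.g. SUCCESS, FAILED)."""
    grouped: defaultdict[str, list[dict[str, Any]]] = defaultdict(list)
    for evaluation in evaluations:
        grouped[evaluation["status"]].append(evaluation)
    return dict(grouped)


def retryable_failures(evaluations: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return FAILED evaluations whose canRetry flag is true.

    canRetry is nullable, so a missing or null value is treated as
    "not retryable" here (assumption).
    """
    return [
        evaluation
        for evaluation in evaluations
        if evaluation["status"] == "FAILED" and evaluation.get("canRetry") is True
    ]
```

Both helpers expect the already-decoded JSON array, for example the result of `json.loads` on the response body.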