GET /evaluations

Get evaluations
curl --request GET \
  --url https://api.galtea.ai/evaluations \
  --header 'Authorization: Bearer <token>'
[
  {
    "id": "eval_123",
    "metricId": "metric_123",
    "sessionId": "session_123",
    "userId": "user_123",
    "status": "SUCCESS",
    "testCaseId": "tc_123",
    "inferenceResultId": "ir_123",
    "score": 0.95,
    "reason": "High quality response",
    "error": "<string>",
    "canRetry": false,
    "creditsUsed": 1,
    "conversationSimulatorVersion": "1.0.0",
    "humanEvaluatorId": "<string>",
    "humanEvaluatorStartedAt": "2023-11-07T05:31:56Z",
    "failedTurns": [
      "<string>"
    ],
    "createdAt": "2023-11-07T05:31:56Z",
    "deletedAt": "2023-11-07T05:31:56Z",
    "evaluatedAt": "2023-11-07T05:31:56Z",
    "metricLegacyAt": "2023-11-07T05:31:56Z",
    "metricDisabledAt": "2023-11-07T05:31:56Z"
  }
]
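The same request can be sketched in Python using only the standard library. The endpoint and header mirror the curl example above; `YOUR_API_KEY` is a placeholder for your real key.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder; substitute your Galtea API key

# Build the GET request with the Bearer token, mirroring the curl call
req = urllib.request.Request(
    "https://api.galtea.ai/evaluations",
    headers={"Authorization": f"Bearer {API_KEY}"},
    method="GET",
)

# Sending it and decoding the JSON array would look like:
#   with urllib.request.urlopen(req) as resp:
#       evaluations = json.load(resp)
```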

Authorizations

Authorization
string
header
required

API key authorization. Pass your API key in the Authorization header as a Bearer token. Both new (gsk_*) and legacy (gsk-*) API keys are accepted, e.g. Authorization: Bearer gsk_... or Authorization: Bearer gsk-....

Query Parameters

ids
string[]

Filter by evaluation IDs

productIds
string[]

Filter by product IDs

sessionIds
string[]

Filter by session IDs

inferenceResultIds
string[]

Filter by inference result IDs (for single-turn evaluations)

metricIds
string[]

Filter by metric IDs

testCaseIds
string[]

Filter by test case IDs

testIds
string[]

Filter by test IDs (include only). Use an empty string ("") to select evaluations without a test

excludeTestIds
string[]

Omit evaluations linked to the specified test IDs. Use an empty string ("") to omit evaluations without a test

versionIds
string[]

Filter by version IDs

specificationIds
string[]

Filter by specification IDs (returns evaluations whose metric is linked to any of the given specifications)

sort
string[]

Sort instructions (field and direction pairs)

limit
integer

Maximum number of results

offset
integer

Number of results to skip

statuses
enum<string>[]

Filter by evaluation statuses

Available options:
PENDING,
PENDING_HUMAN,
SUCCESS,
FAILED,
SKIPPED

canRetry
boolean

Filter evaluations that can be retried

evaluationTypes
string[]

Filter by evaluation types

humanEvaluatorId
string

Filter by human evaluator user ID

metricSources
string[]

Filter by metric sources

fromCreatedAt
string<date-time>

Filter evaluations created at or after this timestamp (ISO 8601 format)

toCreatedAt
string<date-time>

Filter evaluations created at or before this timestamp (ISO 8601 format)
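As a sketch, the list parameters above can be combined into a query string. This assumes array parameters are sent as repeated keys (e.g. `statuses=SUCCESS&statuses=FAILED`), which is a common convention but not confirmed by this page.

```python
from urllib.parse import urlencode

# Key/value pairs; repeating a key encodes an array parameter
params = [
    ("statuses", "SUCCESS"),
    ("statuses", "FAILED"),
    ("metricIds", "metric_123"),
    ("fromCreatedAt", "2023-11-01T00:00:00Z"),
    ("limit", "50"),
    ("offset", "0"),
]

# urlencode percent-encodes the colons in the ISO 8601 timestamp
query = urlencode(params)
url = f"https://api.galtea.ai/evaluations?{query}"
```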

Response

Evaluations retrieved successfully

id
string
Example:

"eval_123"

metricId
string
Example:

"metric_123"

sessionId
string
Example:

"session_123"

userId
string | null
Example:

"user_123"

status
enum<string>
Available options:
PENDING,
PENDING_HUMAN,
SUCCESS,
FAILED,
SKIPPED
Example:

"SUCCESS"

testCaseId
string | null
Example:

"tc_123"

inferenceResultId
string | null
Example:

"ir_123"

score
number | null
Example:

0.95

reason
string | null
Example:

"High quality response"

error
string | null
canRetry
boolean | null
Example:

false

creditsUsed
integer | null
Example:

1

conversationSimulatorVersion
string | null
Example:

"1.0.0"

humanEvaluatorId
string | null

User ID of the human evaluator

humanEvaluatorStartedAt
string<date-time> | null
failedTurns
string[]

Conversation turns that failed

createdAt
string<date-time>
deletedAt
string<date-time> | null
evaluatedAt
string<date-time> | null
metricLegacyAt
string<date-time> | null
metricDisabledAt
string<date-time> | null
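To illustrate working with the response shape, a short sketch that parses a trimmed-down version of the example payload above and keeps only successful evaluations that carry a score (since `score` and `status` are nullable):

```python
import json

# Abbreviated version of the example response from this page
payload = """
[
  {
    "id": "eval_123",
    "metricId": "metric_123",
    "sessionId": "session_123",
    "status": "SUCCESS",
    "score": 0.95,
    "reason": "High quality response",
    "canRetry": false,
    "creditsUsed": 1
  }
]
"""

evaluations = json.loads(payload)

# Guard against null fields before filtering and comparing
successful = [
    e for e in evaluations
    if e.get("status") == "SUCCESS" and e.get("score") is not None
]

scores = {e["id"]: e["score"] for e in successful}
```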