GET /inferenceResults
Get inference results
curl --request GET \
  --url https://api.galtea.ai/inferenceResults \
  --header 'Authorization: Bearer <token>'
[
  {
    "id": "ir_123",
    "sessionId": "session_123",
    "userId": "user_123",
    "index": 0,
    "status": "PENDING",
    "input": {
      "user_message": "User input text"
    },
    "actualOutput": "Model response",
    "retrievalContext": "<string>",
    "latency": 150,
    "inputTokens": 100,
    "outputTokens": 50,
    "cacheReadInputTokens": 20,
    "tokens": 150,
    "costPerInputToken": 0.00001,
    "costPerOutputToken": 0.00003,
    "costPerCacheReadInputToken": 0.000005,
    "cost": 0.001,
    "creditsUsed": 1,
    "conversationSimulatorVersion": "1.0.0",
    "traceId": "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4",
    "createdAt": "2023-11-07T05:31:56Z",
    "deletedAt": "2023-11-07T05:31:56Z"
  }
]

Authorizations

Authorization
string
header
required

API key authorization. Pass your API key in the Authorization header as a Bearer token. Both new (gsk_*) and legacy (gsk-*) API keys are accepted, for example Authorization: Bearer gsk_... or Authorization: Bearer gsk-....
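
As a minimal sketch of the header described above, the following Python builds the same request as the curl example without sending it. The helper name build_request and the client-side prefix check are illustrative assumptions, not part of the API; the server remains the authority on which keys are valid.

```python
import urllib.request

BASE_URL = "https://api.galtea.ai"


def build_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET /inferenceResults request."""
    # Optional client-side sanity check (an assumption of this sketch):
    # the documentation mentions keys starting with "gsk_" or "gsk-".
    if not api_key.startswith(("gsk_", "gsk-")):
        raise ValueError("expected an API key starting with 'gsk_' or 'gsk-'")
    return urllib.request.Request(
        f"{BASE_URL}/inferenceResults",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# "gsk_example" is a placeholder, not a real key.
req = build_request("gsk_example")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the returned object to urllib.request.urlopen would perform the call; that step is left out here so the sketch runs without network access or a real key.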

Query Parameters

ids
string[]

Filter by inference result IDs

sessionIds
string[]

Filter by session IDs

evaluationIds
string[]

Filter by evaluation IDs

limit
integer

Maximum number of results to return

offset
integer

Number of results to skip

fromCreatedAt
string<date-time>

Filter inference results created at or after this timestamp (ISO 8601 format)

toCreatedAt
string<date-time>

Filter inference results created at or before this timestamp (ISO 8601 format)

sort
string[]

Sort instructions (field and direction pairs)
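
The query parameters above can be combined into one URL. The sketch below shows one way to do that in Python and to walk through pages with limit and offset. Several details are assumptions, since this page does not specify them: array parameters are serialized as repeated keys (sessionIds=a&sessionIds=b), a page shorter than limit is treated as the last page, and fetch_page is a hypothetical callable standing in for the real HTTP call. The sort parameter is omitted because its exact format is not documented here.

```python
from datetime import datetime, timezone
from typing import Callable, Iterator, Optional, Sequence
from urllib.parse import urlencode

BASE_URL = "https://api.galtea.ai/inferenceResults"


def to_iso(dt: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC timestamp."""
    if dt.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_url(
    session_ids: Sequence[str] = (),
    limit: Optional[int] = None,
    offset: Optional[int] = None,
    from_created_at: Optional[datetime] = None,
    to_created_at: Optional[datetime] = None,
) -> str:
    params = []
    # ASSUMPTION: string[] parameters are sent as repeated keys.
    params.extend(("sessionIds", sid) for sid in session_ids)
    if limit is not None:
        params.append(("limit", str(limit)))
    if offset is not None:
        params.append(("offset", str(offset)))
    if from_created_at is not None:
        params.append(("fromCreatedAt", to_iso(from_created_at)))
    if to_created_at is not None:
        params.append(("toCreatedAt", to_iso(to_created_at)))
    query = urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


def iter_results(
    fetch_page: Callable[[int, int], list], page_size: int = 100
) -> Iterator[dict]:
    """Yield results page by page; fetch_page(limit, offset) is hypothetical."""
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        # ASSUMPTION: a short page means there is nothing left to fetch.
        if len(page) < page_size:
            return
        offset += page_size


url = build_url(
    session_ids=["session_123", "session_456"],
    limit=50,
    offset=0,
    from_created_at=datetime(2023, 11, 7, 5, 31, 56, tzinfo=timezone.utc),
)
print(url)

# In-memory stand-in for the API, used only to exercise the paging loop.
fake_store = [{"id": f"ir_{i}"} for i in range(5)]
ids = [r["id"] for r in iter_results(lambda lim, off: fake_store[off:off + lim], page_size=2)]
print(ids)
```

Injecting fetch_page keeps the paging logic independent of any particular HTTP client, so it can be checked against in-memory data before being pointed at the real endpoint.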

Response

Inference results retrieved successfully

id
string
Example:

"ir_123"

sessionId
string
Example:

"session_123"

userId
string | null
Example:

"user_123"

index
integer

Order index within the session

Example:

0

status
enum<string>
Available options:
PENDING,
GENERATED,
FAILED
Example:

"PENDING"

input
object

Structured input data. For plain-text input, the format is { "user_message": "..." }.

Example:

{ "user_message": "User input text" }

actualOutput
string | null
Example:

"Model response"

retrievalContext
string | null

The RAG retrieval context (retrieved documents/snippets) used to generate the actual output. Used by RAG-aware evaluation metrics.

latency
integer | null
Example:

150

inputTokens
integer | null
Example:

100

outputTokens
integer | null
Example:

50

cacheReadInputTokens
integer | null
Example:

20

tokens
integer | null

Total tokens

Example:

150

costPerInputToken
number | null
Example:

0.00001

costPerOutputToken
number | null
Example:

0.00003

costPerCacheReadInputToken
number | null
Example:

0.000005

cost
number | null
Example:

0.001

creditsUsed
integer | null
Example:

1

conversationSimulatorVersion
string | null
Example:

"1.0.0"

traceId
string | null

W3C trace ID for the root span created during direct inference. The same trace ID is propagated to the user endpoint via the traceparent header.

Example:

"a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4"
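
Because the trace ID is propagated in the traceparent header, an endpoint that receives that header can recover the ID and match it against traceId. The sketch below parses the four-field layout defined by W3C Trace Context (version, trace ID, parent ID, flags). The function name is an illustrative assumption, and the parent ID in the sample header is made up for the example.

```python
import re
from typing import Optional

# Four-field traceparent layout: version-traceid-parentid-flags,
# all lowercase hex (2, 32, 16 and 2 characters respectively).
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def trace_id_from_traceparent(header: str) -> Optional[str]:
    """Return the trace ID from a traceparent header, or None if invalid."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, _flags = match.groups()
    # Version "ff" and all-zero IDs are invalid per W3C Trace Context.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id


# The trace ID is the documented example; the parent ID is illustrative.
header = "00-a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4-00f067aa0ba902b7-01"
inference_result = {"traceId": "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4"}

matches = trace_id_from_traceparent(header) == inference_result["traceId"]
print(matches)
```

This sketch only accepts the four-field layout; headers from future versions that carry extra fields would be rejected rather than partially parsed.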

createdAt
string<date-time>

deletedAt
string<date-time> | null
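
Many response fields are nullable, so client code should not assume they are present. The following sketch parses a response body into typed records, converts the timestamps, and orders results by their index within each session. The InferenceResult dataclass and helper names are assumptions of this example, only a subset of fields is modeled, and SAMPLE_BODY is illustrative data shaped like the example response rather than real API output.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class InferenceResult:
    id: str
    session_id: str
    index: int
    status: str  # one of PENDING, GENERATED, FAILED
    actual_output: Optional[str]
    tokens: Optional[int]
    created_at: datetime
    deleted_at: Optional[datetime]


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as 2023-11-07T05:31:56Z."""
    if value is None:
        return None
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_result(raw: dict) -> InferenceResult:
    return InferenceResult(
        id=raw["id"],
        session_id=raw["sessionId"],
        index=raw["index"],
        status=raw["status"],
        actual_output=raw.get("actualOutput"),  # nullable
        tokens=raw.get("tokens"),  # nullable
        created_at=parse_timestamp(raw["createdAt"]),
        deleted_at=parse_timestamp(raw.get("deletedAt")),  # nullable
    )


# Illustrative body only, modeled on the example response.
SAMPLE_BODY = """
[
  {"id": "ir_124", "sessionId": "session_123", "userId": null, "index": 1,
   "status": "PENDING", "input": {"user_message": "Second turn"},
   "actualOutput": null, "tokens": null,
   "createdAt": "2023-11-07T05:32:10Z", "deletedAt": null},
  {"id": "ir_123", "sessionId": "session_123", "userId": "user_123", "index": 0,
   "status": "GENERATED", "input": {"user_message": "User input text"},
   "actualOutput": "Model response", "tokens": 150,
   "createdAt": "2023-11-07T05:31:56Z", "deletedAt": null}
]
"""

results = sorted(
    (parse_result(raw) for raw in json.loads(SAMPLE_BODY)),
    key=lambda r: (r.session_id, r.index),
)
generated = [r for r in results if r.status == "GENERATED"]

for r in results:
    print(r.index, r.id, r.status, r.actual_output)
```

Sorting on (session_id, index) reconstructs the turn order of each session regardless of the order in which the API returned the items.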