Create inference results batch

curl --request POST \
  --url https://api.galtea.ai/inferenceResults/batch \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "sessionId": "session_123",
  "conversationTurns": [
    {}
  ]
}
'

[
  {
    "id": "ir_123",
    "sessionId": "session_123",
    "userId": "user_123",
    "index": 0,
    "status": "PENDING",
    "input": {
      "user_message": "User input text"
    },
    "actualOutput": "Model response",
    "retrievalContext": "<string>",
    "latency": 150,
    "inputTokens": 100,
    "outputTokens": 50,
    "cacheReadInputTokens": 20,
    "tokens": 150,
    "costPerInputToken": 0.00001,
    "costPerOutputToken": 0.00003,
    "costPerCacheReadInputToken": 0.000005,
    "cost": 0.001,
    "creditsUsed": 1,
    "conversationSimulatorVersion": "1.0.0",
    "traceId": "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4",
    "createdAt": "2023-11-07T05:31:56Z",
    "deletedAt": "2023-11-07T05:31:56Z"
  }
]

Authorizations

Authorization

string

header

required

API key authorization. Pass your API key in the Authorization header as a Bearer token. Both new (gsk_*) and legacy (gsk-*) API keys are accepted, for example Authorization: Bearer gsk_... or Authorization: Bearer gsk-...
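A minimal sketch of building the request headers described above. The helper name build_auth_headers and the sample key are hypothetical, and the client-side prefix check is an assumption added for illustration; the API itself decides which keys are valid.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return headers for a JSON request authenticated with a Bearer token."""
    # Assumption: a local sanity check on the two documented key prefixes,
    # "gsk_" (new) and "gsk-" (legacy). The server remains the authority.
    if not api_key.startswith(("gsk_", "gsk-")):
        raise ValueError("expected an API key starting with 'gsk_' or 'gsk-'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Hypothetical key, used only to show the resulting header shape.
headers = build_auth_headers("gsk_example_key")
```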

Body

application/json

sessionId

string

required

Example:

"session_123"

conversationTurns

object[]

required
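The request body can be assembled in Python roughly as follows. This is a sketch using only the standard library; the function name build_batch_request is hypothetical, and because the schema of each conversationTurns item is not given here, the turn objects are passed through unchanged. The request is built but not sent.

```python
import json
import urllib.request

API_URL = "https://api.galtea.ai/inferenceResults/batch"


def build_batch_request(token: str, session_id: str, conversation_turns: list[dict]) -> urllib.request.Request:
    """Build (but do not send) the POST request for a batch of inference results."""
    # Both body fields are marked as required in the reference above.
    body = {"sessionId": session_id, "conversationTurns": conversation_turns}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Mirrors the sample request: one empty turn object as a placeholder.
request = build_batch_request("<token>", "session_123", [{}])
```

To send it, pass the request to urllib.request.urlopen and decode the JSON array from the response body.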

Response

Inference results created successfully

id

string

Example:

"ir_123"

sessionId

string

Example:

"session_123"

userId

string | null

Example:

"user_123"

index

integer

Order index within the session

Example:

0

status

enum<string>

Available options:

PENDING, GENERATED, FAILED

Example:

"PENDING"
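Since the response is an array with one status per result, a caller may want to bucket results by status. The sketch below assumes the response has already been parsed into a list of dicts; group_by_status and the sample data are hypothetical, and only the three listed status values are used.

```python
from collections import defaultdict


def group_by_status(results: list[dict]) -> dict[str, list[str]]:
    """Map each status value to the ids of the results that carry it."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for result in results:
        grouped[result["status"]].append(result["id"])
    return dict(grouped)


# Hypothetical parsed response with three results.
sample = [
    {"id": "ir_123", "status": "PENDING"},
    {"id": "ir_124", "status": "GENERATED"},
    {"id": "ir_125", "status": "PENDING"},
]
by_status = group_by_status(sample)
```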

input

object

Structured input data. For plain-text input, the format is { "user_message": "..." }.

Example:

{ "user_message": "User input text" }

actualOutput

string | null

Example:

"Model response"

retrievalContext

string | null

The RAG retrieval context (retrieved documents/snippets) used to generate the actual output. Used by RAG-aware evaluation metrics.

latency

integer | null

Example:

150

inputTokens

integer | null

Example:

100

outputTokens

integer | null

Example:

50

cacheReadInputTokens

integer | null

Example:

20

tokens

integer | null

Total tokens

Example:

150

costPerInputToken

number | null

Example:

0.00001

costPerOutputToken

number | null

Example:

0.00003

costPerCacheReadInputToken

number | null

Example:

0.000005

cost

number | null

Example:

0.001

creditsUsed

integer | null

Example:

1

conversationSimulatorVersion

string | null

Example:

"1.0.0"

traceId

string | null

W3C trace ID for the root span created during direct inference. The same trace ID is propagated to the user endpoint via the traceparent header.

Example:

"a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4"
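For readers correlating traceId with the traceparent header, here is a sketch based on the W3C Trace Context format (version, 32-hex trace ID, 16-hex parent span ID, 2-hex flags, joined by hyphens). The function names and the span ID below are hypothetical; how this API chooses span IDs and flags is not stated here.

```python
import re

TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")
SPAN_ID_RE = re.compile(r"[0-9a-f]{16}")


def is_valid_trace_id(trace_id: str) -> bool:
    """Check for 32 lowercase hex characters that are not all zeros."""
    return bool(TRACE_ID_RE.fullmatch(trace_id)) and trace_id != "0" * 32


def build_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Assemble a version-00 traceparent header value."""
    if not is_valid_trace_id(trace_id):
        raise ValueError("invalid trace ID")
    if not SPAN_ID_RE.fullmatch(span_id) or span_id == "0" * 16:
        raise ValueError("invalid span ID")
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


# The trace ID is the documented example; the span ID is made up for illustration.
traceparent = build_traceparent("a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4", "00f067aa0ba902b7")
```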

createdAt

string<date-time>

deletedAt

string<date-time> | null
