What is an Inference Result?

An inference result is a single turn in a conversation — one user input paired with one AI output. It belongs to a session and can optionally include metadata like latency, token usage, cost, and retrieval context (for RAG evaluations). You can create inference results from the Galtea dashboard (using Endpoint Connections) or programmatically using the Galtea SDK.
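As a sketch of the programmatic path, the snippet below packages one turn into an inference-result record. The function name `create_inference_result` and the exact field names are illustrative stand-ins, not the actual Galtea SDK API; consult the SDK reference for the real call and signature.

```python
# Hypothetical stand-in for an SDK call that logs one conversation turn.
# Field names mirror the properties documented on this page; the real
# Galtea SDK signature may differ.
def create_inference_result(session_id, input, output,
                            retrieval_context=None, latency=None,
                            usage_info=None, cost_info=None):
    """Package one user input / AI output pair as a plain dict record."""
    return {
        "session_id": session_id,
        "input": input,
        "output": output,
        "retrieval_context": retrieval_context,  # RAG context, if any
        "latency": latency,                      # milliseconds
        "usage_info": usage_info,                # e.g. token counts
        "cost_info": cost_info,                  # e.g. cost breakdown
    }

result = create_inference_result(
    session_id="session-123",
    input="What is the capital of France?",
    output="The capital of France is Paris.",
    latency=412.5,
)
```

Optional fields such as `retrieval_context` simply stay `None` when they do not apply, matching the "if applicable" wording in the property list.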

SDK Integration

Inference Result

A single turn in a conversation between a user and the AI.

Inference Result Properties

Session (Session, required): The session to which the inference result belongs.
Input (string): The input text or prompt for the inference result.
Output (string): The generated output or response for the inference result.
Retrieval Context (string): The context retrieved by a RAG system, if applicable.
Latency (float): The latency in milliseconds of the model's response.
Usage Info (dict): Token usage information for the inference result.
Cost Info (dict): Cost information for the inference result.
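The Usage Info and Cost Info dicts are free-form. One plausible shape, with illustrative key names and prices that are assumptions rather than a documented Galtea schema, is token counts plus a cost derived from them:

```python
# Illustrative shapes for usage_info and cost_info; the key names and
# prices below are assumptions, not a documented Galtea schema.
usage_info = {"input_tokens": 120, "output_tokens": 45}

# Hypothetical per-token prices (USD per 1K tokens).
PRICE_IN_PER_1K = 0.0005
PRICE_OUT_PER_1K = 0.0015

cost_info = {
    "input_cost": usage_info["input_tokens"] / 1000 * PRICE_IN_PER_1K,
    "output_cost": usage_info["output_tokens"] / 1000 * PRICE_OUT_PER_1K,
}
cost_info["total_cost"] = cost_info["input_cost"] + cost_info["output_cost"]
```

Keeping token counts and cost as separate dicts lets evaluations aggregate them independently, e.g. total tokens per session versus total spend per test run.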