
What is a Version?

A version in Galtea represents a specific iteration of a product. Versions allow you to track changes to your product over time and compare different implementations against the same tests. You can create, view and manage your versions on the Galtea dashboard or programmatically using the Galtea SDK.

Comparing Versions

One of the key benefits of tracking versions in Galtea is the ability to compare different implementations of your product. This allows you to:
  • Measure improvements between versions
  • Identify regressions in newer versions
  • Compare different model providers or approaches
  • Make data-driven decisions about which version to deploy

Run Evaluations

Learn how to run evaluations for your versions

SDK Integration

The Galtea SDK allows you to create, view, and manage versions programmatically. This is particularly useful for organizations that want to automate their versioning process or integrate it into their CI/CD pipeline.

Version Properties

Version Name
Text
required
The name of the version. Example: “v1.2.0” or “GPT-4 Implementation”
Version Description
Text
A description of the version, typically highlighting what makes it different from other versions. Example: “Improved summarization algorithm with better fact retention”
Model
Model
required
The AI Model used by this version. Galtea uses this to track costs, calculate per-evaluation inference spend, and associate the version with the model’s pricing and tokenization characteristics.
System Prompt
Text
The system prompt used for this version. Example: “You are an expert legal document summarizer. Provide concise summaries that capture all key legal points.”
Dataset URI
Text
The URI of the dataset used to train or fine-tune this version. Example: “s3://company-datasets/legal-documents-v2/”
Dataset Description
Text
A description of the dataset used in the version. Example: “Collection of 10,000 legal contracts and agreements with expert-created summaries”
Guardrails
Text
The guardrails applied to the version, separated by commas. Example: “content filtering, citation checking, legal compliance”
Conversation Endpoint Connection
EndpointConnection
The primary Endpoint Connection used for the main conversational interactions with your AI product. This is the only required endpoint connection. Used for:
  • Sending user messages
  • Receiving AI responses
  • (Often) creating and maintaining the external session state
Initialization Endpoint Connection
EndpointConnection
An optional Endpoint Connection executed before the conversation begins. Used to initialize a session with your AI product. Used for:
  • Creating a session on the external API
  • Obtaining a session ID that will be used in subsequent conversation calls
  • Setting up initial context or configuration
The initialization endpoint must return a session_id in its response. Configure the outputMapping with a session_id key pointing to the session identifier in the response. This value is stored in Galtea and made available in subsequent calls via {{ session_id }}.
Finalization Endpoint Connection
EndpointConnection
An optional Endpoint Connection executed after the conversation ends (including after errors). Used to clean up resources on your AI product. Used for:
  • Closing sessions on the external API
  • Releasing resources
  • Triggering post-conversation processing
The finalization step runs in a finally block, meaning it executes even if the conversation encounters an error. Errors in the finalization step are logged but do not fail the overall evaluation.

Endpoint Connections

The Conversation Endpoint Connection is the main way Galtea talks to your AI system, so it is the only endpoint connection required to generate inferences from the platform. In many products, a single Conversation endpoint is enough to handle:
  • Session creation
  • Conversation turns
  • Carrying state between requests
If your product requires separate endpoints for setup or cleanup, you can also add Initialization and/or Finalization endpoint connections. Both are optional. See Multi-Step Session Lifecycle (Advanced) for details.

Single Conversation Endpoint

Most integrations only need one Conversation endpoint. At its simplest, you just need to configure:
  • Input Template — How to format the request body (using {{ input }} for the simulated user message).
  • Output Mapping — How to extract your product’s AI response (using a JSONPath expression for the output key).

Basic example (stateless)

For a simple API that doesn’t require session state, the Input Template can be as small as:
{
  "message": "{{ input }}"
}
This is all you need to start running evaluations against your endpoint.
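To make the placeholder substitution concrete, here is a minimal sketch of how an Input Template like the one above could be rendered into a request body. The `render_template` helper is hypothetical and not part of the Galtea SDK. The JSON escaping of the inserted value is also an assumption about what a safe renderer would do, not a description of Galtea's internals.

```python
import json
import re

# Hypothetical helper, not a Galtea API: matches {{ name }} placeholders.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, variables: dict) -> str:
    """Replace each {{ name }} with a JSON-escaped value; unknown names become ""."""

    def substitute(match):
        value = str(variables.get(match.group(1), ""))
        # json.dumps escapes quotes and newlines; drop the surrounding quotes.
        return json.dumps(value)[1:-1]

    return PLACEHOLDER.sub(substitute, template)


input_template = '{"message": "{{ input }}"}'
body = render_template(input_template, {"input": 'What does "force majeure" mean?'})

# The rendered body is valid JSON and carries the simulated user message.
payload = json.loads(body)
print(payload["message"])
```

Escaping the value before insertion keeps the body valid JSON even when the simulated user message contains quotes.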

State management (extracting and reusing values)

If your API returns values that need to be sent in subsequent requests (e.g., session_id, tenant_id), Galtea can automatically manage this state:
  1. Extract — Use outputMapping to pull values from the API response using JSONPath expressions.
  2. Store — Extracted values are saved in the session and become available as template variables.
  3. Reuse — Reference any stored value in the inputTemplate or URL using {{ variable_name }} syntax.
On the first turn, undefined placeholders resolve to empty strings. After the first response, all extracted values become available for subsequent turns.
Example: capture session_id and tenant_id from responses:
{
  "output": "$.text",
  "session_id": "$.session_id",
  "tenant_id": "$.tenant"
}
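The extract, store, and reuse steps can be illustrated with a small simulation. This is a sketch under stated assumptions, not Galtea's implementation: the `extract` helper handles only simple dotted paths such as `$.a.b` (a small subset of JSONPath), the `render` helper is hypothetical, and the API responses are canned, made-up values.

```python
import re


def extract(response: dict, path: str):
    """Resolve a simple dotted path such as "$.a.b"; return None if it is absent."""
    node = response
    dotted = path[2:] if path.startswith("$.") else path
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def render(template: str, variables: dict) -> str:
    """Fill {{ name }} placeholders; undefined names resolve to empty strings."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(variables.get(m.group(1), "")),
        template,
    )


output_mapping = {"output": "$.text", "session_id": "$.session_id", "tenant_id": "$.tenant"}
input_template = '{"message": "{{ input }}", "session_id": "{{ session_id }}"}'

# Canned responses standing in for the real API (illustrative values only).
responses = [
    {"text": "Hello!", "session_id": "abc-123", "tenant": "acme"},
    {"text": "Sure.", "session_id": "abc-123", "tenant": "acme"},
]

state = {}   # values stored between turns (for simplicity, every mapped key)
bodies = []  # request bodies that would be sent on each turn
for user_message, response in zip(["Hi", "Summarize this"], responses):
    # Reuse: stored values are available as template variables.
    bodies.append(render(input_template, {**state, "input": user_message}))
    # Extract and store: pull mapped values out of the response.
    for key, path in output_mapping.items():
        value = extract(response, path)
        if value is not None:
            state[key] = value

for body in bodies:
    print(body)
```

On the first turn the `session_id` placeholder is empty. From the second turn on it carries the value extracted from the first response.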

Special keys in Output Mapping

Key | Behavior
output | Required. The AI’s response content.
session_id | Stored as the external session identifier, accessible via custom_id.
retrieval_context | Stored as retrieval context for RAG evaluations.
Any other key | Stored in session metadata and available as {{ key }} in templates.
For the full list of available template variables and detailed configuration options, see Endpoint Connection — Input Template.
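As a rough illustration of the special keys, the sketch below splits a set of already-extracted values into the four destinations described above. The function name and the field names of the returned dictionary (`external_session_id`, `metadata`) are hypothetical and chosen only for this example. They are not Galtea's actual storage schema.

```python
SPECIAL_KEYS = {"output", "session_id", "retrieval_context"}


def route_mapped_values(mapped: dict) -> dict:
    """Split extracted values by special key (hypothetical field names)."""
    if "output" not in mapped:
        # "output" is the only required key in the Output Mapping.
        raise ValueError("the output mapping must provide an 'output' key")
    return {
        "output": mapped["output"],
        "external_session_id": mapped.get("session_id"),
        "retrieval_context": mapped.get("retrieval_context"),
        # Every other key lands in session metadata and can be reused in templates.
        "metadata": {k: v for k, v in mapped.items() if k not in SPECIAL_KEYS},
    }


routed = route_mapped_values(
    {"output": "Hello!", "session_id": "abc-123", "tenant_id": "acme"}
)
print(routed["metadata"])
```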

Multi-Step Session Lifecycle (Advanced)

Some AI products expose separate endpoints for session setup and cleanup. In those cases, you can configure up to three endpoint connections:
  • Initialization (optional): runs before conversation
  • Conversation (required): runs for each turn
  • Finalization (optional): runs after conversation

Session Lifecycle Flow

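When a version has initialization and/or finalization endpoints configured, the evaluation follows this lifecycle:
  1. Initialization (if configured) runs once, before the conversation begins.
  2. Conversation runs for each turn.
  3. Finalization (if configured) runs after the conversation ends, including after errors.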
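The ordering and error handling of this lifecycle can be sketched with a `try`/`finally` block. This is a minimal illustration, not Galtea's code: `run_session` and the three callables are hypothetical stand-ins for the endpoint connections. Running initialization outside the `try` block (so cleanup only happens once a session exists) is an assumption.

```python
import logging

logger = logging.getLogger("lifecycle_sketch")


def run_session(conversation, turns, initialization=None, finalization=None):
    """Run optional setup, one conversation call per turn, then optional cleanup."""
    state = {}
    outputs = []
    # Assumption: setup runs before the guarded block.
    if initialization is not None:
        state["session_id"] = initialization()
    try:
        for user_message in turns:
            outputs.append(conversation(user_message, state))
    finally:
        # Cleanup runs even if a conversation turn raised.
        if finalization is not None:
            try:
                finalization(state)
            except Exception:
                # Finalization errors are logged but do not fail the run.
                logger.exception("finalization failed")
    return outputs


events = []


def init():
    events.append("init")
    return "abc-123"


def converse(message, state):
    events.append(f"turn:{message}")
    if message == "boom":
        raise RuntimeError("conversation failed")
    return f"echo {message} ({state['session_id']})"


def cleanup(state):
    events.append("finalize")
    raise RuntimeError("cleanup failed")  # logged, not re-raised


outputs = run_session(converse, ["hello"], initialization=init, finalization=cleanup)
print(outputs, events)
```

Because the cleanup error is caught inside the `finally` block, a failing conversation turn still surfaces its own exception rather than the cleanup one.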

Example Use Case: Multi-Step Chatbot (high level)

Some chatbot APIs require multiple steps:
  • Initialization: create a session and return a session_id
  • Conversation: send messages using that session_id
  • Finalization: clean up the session
In this setup:
  • Your Initialization connection extracts session_id via outputMapping.
  • Your Conversation connection can reuse it in the URL/body using {{ session_id }}.
  • Any additional fields extracted via outputMapping are stored in session metadata and can also be reused in later turns.
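To show how the pieces of this use case fit together, the sketch below models three connections as plain Python dictionaries and renders the Conversation and Finalization URLs from a `session_id` extracted during initialization. The `url` field, the example.com URLs, the response shape, and the `render` helper are all assumptions made for illustration. Only the `outputMapping` and `inputTemplate` names and the `{{ session_id }}` syntax come from the text above.

```python
import re


def render(template: str, variables: dict) -> str:
    """Fill {{ name }} placeholders; undefined names resolve to empty strings."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(variables.get(m.group(1), "")),
        template,
    )


# Hypothetical connection configs; field layout and URLs are illustrative only.
initialization = {
    "url": "https://api.example.com/sessions",
    "outputMapping": {"session_id": "$.id"},
}
conversation = {
    "url": "https://api.example.com/sessions/{{ session_id }}/messages",
    "inputTemplate": '{"message": "{{ input }}"}',
}
finalization = {"url": "https://api.example.com/sessions/{{ session_id }}/close"}

# Canned initialization response standing in for the real API.
init_response = {"id": "abc-123"}

# Extract: resolve the single-level path "$.id" against the response.
field = initialization["outputMapping"]["session_id"][2:]
session = {"session_id": init_response[field]}

# Reuse: the stored session_id fills the URL placeholders of later steps.
conversation_url = render(conversation["url"], session)
conversation_body = render(conversation["inputTemplate"], {**session, "input": "Hi"})
finalization_url = render(finalization["url"], session)

print(conversation_url)
print(conversation_body)
print(finalization_url)
```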