Direct Inferences and Evaluations from the Platform

Galtea allows you to run inferences against your AI system and evaluate its responses directly from the Dashboard, without writing any SDK code. This is made possible by Endpoint Connections, which tell Galtea how to call your API, extract the response, and manage session state across turns.

This guide covers the platform-based workflow. If you prefer to generate inferences programmatically (e.g., in a CI/CD pipeline or custom script), see the SDK tutorials instead.

Prerequisites

Before you begin, make sure you have the following set up in the Galtea Dashboard:

A Product representing your AI system
A Test with at least one Test Case to run against your endpoint

Workflow Overview

Create an Endpoint Connection

Define how Galtea should call your AI endpoint — URL, authentication, request format, and response extraction.

Create a Version with the Endpoint Connection

Create a new version of your product and attach the endpoint connection to it.

Run a Test from the Dashboard

Select a test and run it against the version. Galtea calls your endpoint for each test case and records the inference results.

Evaluate the Results

Once inferences are generated, trigger evaluations with the metrics of your choice to assess your AI’s performance.

Step 1: Create an Endpoint Connection

Navigate to your product in the Dashboard and go to the Endpoint Connections section. Click New Endpoint Connection and configure the following:

Name — A descriptive name (e.g., “Production Chat API”).
Type — Select CONVERSATION for the primary request/response endpoint.
URL — The full URL of your AI endpoint (e.g., https://api.company.com/v1/chat).
HTTP Method — Typically POST.
Authentication — Choose the auth type (Bearer, API_KEY, Basic, or None) and provide the token.
Input Template — A Jinja2 template that defines the request body Galtea will send.
Output Mapping — JSONPath expressions that tell Galtea how to extract values from the response.

Input Template

The input template uses Jinja2 syntax with placeholders that Galtea fills automatically. At minimum, use {{ input }} to inject the test case input:

{
  "model": "gpt-4",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "{{ input }}"}
  ]
}

For multi-turn conversations, use past_turns to include conversation history:

{
  "model": "gpt-4",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {% for turn in past_turns %}
    {"role": "user", "content": "{{ turn.input }}"},
    {"role": "assistant", "content": "{{ turn.output }}"},
    {% endfor %}
    {"role": "user", "content": "{{ input }}"}
  ]
}

See Endpoint Connection — Input Template for the full list of available placeholders and advanced template examples.

Output Mapping

The output mapping tells Galtea how to extract values from the API response using JSONPath expressions. The output key is required:

{
  "output": "$.choices[0].message.content"
}

You can also extract additional values to store as session metadata:

{
  "output": "$.choices[0].message.content",
  "retrieval_context": "$.choices[0].retrieval_context",
  "session_id": "$.metadata.session_id"
}

Any extra keys beyond output and retrieval_context are saved to the session metadata and become available as {{ key }} placeholders in subsequent turns.

See Version — Special keys in Output Mapping for a complete reference of how extracted values are stored and reused.

Step 2: Create a Version with the Endpoint Connection

Navigate to your product and create a new Version. When configuring the version:

Fill in the version name, model, and any other relevant properties.
In the Conversation Endpoint Connection field, select the endpoint connection you created in Step 1.

The Conversation Endpoint Connection is the only required endpoint connection. For most integrations, this single endpoint handles the entire interaction lifecycle.

If your AI system requires separate endpoints for session initialization or cleanup, you can optionally configure Initialization and Finalization endpoint connections. See Version — Multi-Step Session Lifecycle for details.

Step 3: Run a Test

Once your version is set up with an endpoint connection, you can run tests directly from the Dashboard:

Navigate to your product’s Tests section.
Select the test you want to run.
Choose the version with the configured endpoint connection.
Start the test run.

Galtea will iterate through each test case, call your endpoint using the configured endpoint connection, and record the resulting Inference Results. Each test case produces a session with one or more inference results depending on whether it’s a single-turn or multi-turn test.

Step 4: Evaluate the Results

After the inferences have been generated, you can trigger evaluations:

Navigate to the session results in the Dashboard.
Select the Metrics you want to use for the evaluation.
Run the evaluation.

Galtea will assess each inference result using the selected metrics and provide scores and explanations.

For single-turn tests, metrics like Factual Accuracy and Answer Relevancy work well. For multi-turn conversations, consider Knowledge Retention, Role Adherence, and Conversation Completeness.

Learn More

Endpoint Connection

Full reference for configuring endpoint connections

Version

Learn about versions and how endpoint connections integrate with them

Evaluations

Understand how evaluations work

Metrics

Browse available metrics for evaluating your AI

Getting Started

Tutorials

Integrations

Direct Inferences and Evaluations from the Platform

Prerequisites

Workflow Overview

Step 1: Create an Endpoint Connection

Input Template

Output Mapping

Step 2: Create a Version with the Endpoint Connection

Step 3: Run a Test

Step 4: Evaluate the Results

Learn More

Endpoint Connection

Version

Evaluations

Metrics

Getting Started

Tutorials

Integrations

​Prerequisites

​Workflow Overview

​Step 1: Create an Endpoint Connection

​Input Template

​Output Mapping

​Step 2: Create a Version with the Endpoint Connection

​Step 3: Run a Test

​Step 4: Evaluate the Results

​Learn More

Endpoint Connection

Version

Evaluations

Metrics

Prerequisites

Workflow Overview

Step 1: Create an Endpoint Connection

Input Template

Output Mapping

Step 2: Create a Version with the Endpoint Connection

Step 3: Run a Test

Step 4: Evaluate the Results

Learn More