This page covers the detailed configuration for Endpoint Connection request formatting, response extraction, and retry behavior.

Input Template

The Input Template is a Jinja2 template string that defines how Galtea formats the request body before sending it to your endpoint.

Available Placeholders

| Placeholder | Description |
| --- | --- |
| {{ input.<field> }} | The test case input (required) |
| {{ context.<field> }} | The test case context |
| {{ session_id }} | The external session ID (if available) |
| {{ test_case_id }} | The test case ID |
| {{ test_id }} | The test ID |
| {{ galtea_session_id }} | The Galtea session ID |
| {{ inference_result_id }} | The inference result ID for this turn. It is also sent automatically as the X-Galtea-Inference-Id HTTP header, which is useful for trace collection. |
| Any metadata key | Any field stored in session metadata (e.g., {{ sessionId }}, {{ tenant }}, {{ conversation_token }}) |

Conversation History

Use {% for turn in past_turns %}...{% endfor %} to loop through previous conversation turns. Each turn exposes:
  • {{ turn.input }} — Previous user input
  • {{ turn.output }} — Previous assistant response

Examples

OpenAI-compatible format:
{
  "model": "gpt-4",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {% for turn in past_turns %}
    {"role": "user", "content": "{{ turn.input }}"},
    {"role": "assistant", "content": "{{ turn.output }}"},
    {% endfor %}
    {"role": "user", "content": "{{ input.user_message }}"}
  ]
}
Previous queries array:
{
  "model": "gpt-4",
  "previous_queries": [
    {% for turn in past_turns %}
    "{{ turn.input }}"{% if not loop.last %},{% endif %}
    {% endfor %}
  ],
  "current_query": "{{ input.user_message }}"
}
Custom format:
{
  "data": {
    "names": ["query", "contexto"],
    "ndarray": [
      [
        "{{ input.user_message }}",
        [
          {% for turn in past_turns %}
          "{{ turn.input }}",
          "{{ turn.output }}"{% if not loop.last %},{% endif %}
          {% endfor %}
        ]
      ]
    ]
  }
}
When a loop produces the last items of a JSON array, add {% if not loop.last %},{% endif %} after each iteration so that the final item is not followed by a trailing comma, which would make the JSON invalid. When another element follows the loop (as in the OpenAI example above), the comma after each iteration separates it from that element, so this pattern is not required.
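The effect of the loop.last guard can be checked without an endpoint. The sketch below does not use Jinja2; it reproduces in plain Python what the "previous queries" loop emits for each turn and then parses the result with json.loads. The function names and sample turns are hypothetical and exist only for this illustration.

```python
import json

# Hypothetical conversation history, shaped like past_turns in the template.
past_turns = [
    {"input": "Hi", "output": "Hello"},
    {"input": "Thanks", "output": "You are welcome"},
]

def render_queries(turns, guard_last):
    """Emulate the loop body: one quoted input per turn, followed by a comma."""
    items = []
    for i, turn in enumerate(turns):
        is_last = i == len(turns) - 1  # plays the role of loop.last
        comma = "" if (guard_last and is_last) else ","
        items.append(f'"{turn["input"]}"{comma}')
    return '{"previous_queries": [' + " ".join(items) + "]}"

def is_valid_json(text):
    """Return True when text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

guarded = render_queries(past_turns, guard_last=True)
unguarded = render_queries(past_turns, guard_last=False)
```

Here `guarded` parses cleanly, while `unguarded` ends with `"Thanks",]` and is rejected by the JSON parser.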

Output Mapping

The Output Mapping is a JSON object that defines how to extract values from the API response using JSONPath expressions.

Special Keys

| Key | Behavior |
| --- | --- |
| output | Required. The AI’s response content. |
| session_id | Stored as the external session identifier, accessible via custom_id. |
| retrieval_context | Stored as retrieval context for RAG evaluations. |
| traces | Extracts an array of trace objects and stores them linked to the inference result. Each object should contain at least a name field and can include any of the Trace properties: type, description, inputData, outputData, error, latencyMs, metadata, startTime, endTime. |
| Any other key | Stored in session metadata and available as {{ key }} in templates. |

JSONPath Syntax Reference

| Expression | Description |
| --- | --- |
| $ | Root object |
| . | Child operator |
| [] | Array index or child operator |
| [*] | Wildcard (all elements) |
| [0] | First array element |
| [-1] | Last array element |

Example

{
  "output": "$.choices[0].message.content",
  "retrieval_context": "$.choices[0].retrieval_context",
  "session_id": "$.metadata.session_id",
  "traces": "$.metadata.traces"
}
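To preview what a mapping like the one above would pull out of a response, it can help to evaluate the expressions locally. The following is a minimal sketch of a resolver for a small JSONPath subset ($, .child, [n], [-n], [*]); it is not Galtea's implementation, and the `extract` function and the sample response are assumptions made for illustration.

```python
import re

# One path step: .name, [index], or [*]
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(-?\d+|\*)\]")

def extract(path, data):
    """Resolve a small JSONPath subset against already-parsed JSON."""
    if not path.startswith("$"):
        raise ValueError("path must start with $")
    nodes = [data]
    pos = 1
    while pos < len(path):
        m = _STEP.match(path, pos)
        if m is None:
            raise ValueError(f"unsupported syntax at position {pos}: {path!r}")
        name, index = m.group(1), m.group(2)
        if name is not None:
            nodes = [n[name] for n in nodes if isinstance(n, dict) and name in n]
        elif index == "*":
            nodes = [item for n in nodes if isinstance(n, list) for item in n]
        else:
            i = int(index)
            nodes = [n[i] for n in nodes
                     if isinstance(n, list) and -len(n) <= i < len(n)]
        pos = m.end()
    # Wildcard paths return every match; other paths return one value or None.
    if "[*]" in path:
        return nodes
    return nodes[0] if nodes else None

# Hypothetical endpoint response matching the mapping above.
response = {
    "choices": [{
        "message": {"content": "Hello!"},
        "retrieval_context": ["doc-1", "doc-2"],
    }],
    "metadata": {
        "session_id": "abc-123",
        "traces": [{"name": "retrieve"}, {"name": "generate"}],
    },
}
mapping = {
    "output": "$.choices[0].message.content",
    "retrieval_context": "$.choices[0].retrieval_context",
    "session_id": "$.metadata.session_id",
    "traces": "$.metadata.traces",
}
extracted = {key: extract(path, response) for key, path in mapping.items()}
```

A path that does not match anything, such as `$.missing`, yields `None` in this sketch; how Galtea treats a missing value is not specified here.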

State Management

If your API returns values that need to be sent in subsequent requests (e.g., session_id, tenant_id), Galtea can automatically manage this state:
  1. Extract — Use Output Mapping to pull values from the API response using JSONPath expressions
  2. Store — Extracted values are saved in the session and become available as template variables
  3. Reuse — Reference any stored value in the Input Template or URL using {{ variable_name }} syntax
On the first turn, undefined placeholders resolve to empty strings. After the first response, all extracted values become available for subsequent turns.
Example: capture session_id and tenant_id from responses:
{
  "output": "$.text",
  "session_id": "$.session_id",
  "tenant_id": "$.tenant"
}
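The extract, store, reuse cycle can be sketched as a two-turn simulation. This is an illustration under stated assumptions, not Galtea's code: `render` handles only flat {{ name }} placeholders, `lookup` handles only dotted `$.key` paths, and the template and response are hypothetical.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def render(template, state):
    """Substitute {{ name }} from state; unknown names become empty strings."""
    return PLACEHOLDER.sub(lambda m: str(state.get(m.group(1), "")), template)

def lookup(path, response):
    """Resolve a simple '$.a.b' path against a parsed JSON object."""
    if not path.startswith("$."):
        raise ValueError("only '$.key' style paths are supported in this sketch")
    value = response
    for key in path[2:].split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value

output_mapping = {
    "output": "$.text",
    "session_id": "$.session_id",
    "tenant_id": "$.tenant",
}
# Hypothetical input template that sends the stored values back.
input_template = '{"session_id": "{{ session_id }}", "tenant_id": "{{ tenant_id }}"}'

state = {}

# Turn 1: nothing has been extracted yet, so placeholders render as "".
first_body = render(input_template, state)

# Hypothetical response from the endpoint to the first request.
response = {"text": "Hi there", "session_id": "s-42", "tenant": "acme"}

# Extract and store every mapped value except the output itself.
for key, path in output_mapping.items():
    if key != "output":
        state[key] = lookup(path, response)

# Turn 2: the stored values are now available to the template.
second_body = render(input_template, state)
```

After the first response, `second_body` carries `s-42` and `acme`, while `first_body` contains empty strings in both positions.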

Custom Headers

Custom headers are sent with every request to your endpoint. Header values support placeholder substitution for injecting authentication credentials, so you can place tokens in any header — not just the standard Authorization or X-API-Key headers.

Available Placeholders

| Placeholder | Resolves to | Available when auth type is |
| --- | --- | --- |
| {{ auth_token }} | The authentication token | BEARER or API_KEY |
| {{ api_key }} | The authentication token | BEARER or API_KEY (alias for {{ auth_token }}) |
| {{ bearer_token }} | The authentication token | BEARER or API_KEY (alias for {{ auth_token }}) |
| {{ username }} | The Basic auth username | BASIC |
| {{ password }} | The Basic auth password | BASIC |
Placeholders are case-insensitive and work with both {{ placeholder }} and { placeholder } syntax.

Example

{
  "Authorization": "Bearer {{ bearer_token }}",
  "X-Custom-Api-Key": "{{ api_key }}",
  "X-Tenant-Id": "my-tenant"
}
If you define an Authorization or X-API-Key header in custom headers, Galtea will not add the default authentication header — your custom header takes precedence.
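The substitution and precedence rules described in this section can be sketched as follows. This is a rough model rather than Galtea's implementation: the function names are hypothetical, unknown placeholders are left untouched as an assumption, and header names are compared case-insensitively because HTTP header names are case-insensitive.

```python
import re

# Matches {{ name }} or { name }, tolerating inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}|\{\s*(\w+)\s*\}")

# api_key and bearer_token are aliases for auth_token.
ALIASES = {"api_key": "auth_token", "bearer_token": "auth_token"}

def resolve_headers(custom_headers, credentials):
    """Replace credential placeholders in header values, ignoring case."""
    def substitute(match):
        name = (match.group(1) or match.group(2)).lower()
        name = ALIASES.get(name, name)
        # Assumption: an unknown placeholder is left exactly as written.
        return credentials.get(name, match.group(0))
    return {k: PLACEHOLDER.sub(substitute, v) for k, v in custom_headers.items()}

def needs_default_auth_header(custom_headers):
    """False when the custom headers already define Authorization or X-API-Key."""
    names = {name.lower() for name in custom_headers}
    return not ({"authorization", "x-api-key"} & names)

custom_headers = {
    "Authorization": "Bearer {{ BEARER_TOKEN }}",
    "X-Custom-Api-Key": "{ api_key }",
    "X-Tenant-Id": "my-tenant",
}
credentials = {"auth_token": "secret-token"}  # hypothetical BEARER credential

resolved = resolve_headers(custom_headers, credentials)
```

Because `custom_headers` defines Authorization, `needs_default_auth_header(custom_headers)` returns `False`, mirroring the rule that a custom header takes precedence over the default one.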

W3C Trace Context Propagation

When W3C trace context propagation is enabled, Galtea adds a traceparent header to every request following the W3C Trace Context specification. This allows your endpoint to correlate requests with Galtea’s traces without any code changes. You can add a traceparent key in your custom headers as a placeholder — Galtea will replace its value with the correct trace context at request time.
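Tracing libraries usually read the traceparent header for you. If your endpoint is not instrumented and you only want to log the trace ID, the header can be parsed by hand. The sketch below follows the version 00 layout from the W3C Trace Context specification (version, 32-hex trace ID, 16-hex parent ID, 2-hex flags); `parse_traceparent` is a hypothetical helper, and the sample value is the example from the specification, not one produced by Galtea.

```python
import re

# version - trace-id - parent-id - trace-flags, all lowercase hex
TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)

def parse_traceparent(value):
    """Return the parsed fields of a traceparent header, or None if invalid."""
    m = TRACEPARENT.match(value.strip())
    if m is None:
        return None
    version, trace_id, parent_id, flags = m.groups()
    # The specification forbids version ff and all-zero IDs.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }

header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
parsed = parse_traceparent(header)
```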

Retry Configuration

Configure automatic retry behavior for failed requests. When enabled, Galtea will automatically retry requests that fail with specific HTTP status codes.
Retry Enabled
Boolean
Whether automatic retry is enabled. Default: false
Max Retry Attempts
Number
Maximum number of retry attempts. Default: 3
Initial Delay
Number
Initial delay in milliseconds before the first retry. Default: 2000 (2 seconds)
Backoff Strategy
Enum
Strategy for increasing delay between retry attempts:
  • exponential - Delay doubles with each attempt (recommended)
  • linear - Delay increases linearly
  • fixed - Constant delay between attempts
Default: exponential
Max Delay
Number
Maximum delay cap in milliseconds. Default: 30000 (30 seconds)
Retryable Status Codes
Array of Numbers
HTTP status codes that should trigger a retry. Default: [429, 500, 502, 503, 504]
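To see how these settings interact, the delay before each retry can be computed ahead of time. The sketch below is an illustration, not Galtea's scheduler: the function and parameter names are hypothetical, and the linear formula (initial delay multiplied by the attempt number) is an assumption, since the exact increment is not stated above.

```python
def retry_delays(max_attempts=3, initial_delay_ms=2000,
                 strategy="exponential", max_delay_ms=30000):
    """Return the delay in milliseconds before each retry attempt."""
    delays = []
    for attempt in range(1, max_attempts + 1):
        if strategy == "exponential":
            delay = initial_delay_ms * 2 ** (attempt - 1)  # doubles each time
        elif strategy == "linear":
            delay = initial_delay_ms * attempt  # assumption: grows by the initial delay
        elif strategy == "fixed":
            delay = initial_delay_ms
        else:
            raise ValueError(f"unknown backoff strategy: {strategy}")
        delays.append(min(delay, max_delay_ms))  # apply the Max Delay cap
    return delays

def should_retry(status_code, retryable=(429, 500, 502, 503, 504)):
    """True when the status code is in the retryable list."""
    return status_code in retryable

default_schedule = retry_delays()
capped_schedule = retry_delays(max_attempts=6)
```

With the defaults, the retries wait 2000, 4000, and 8000 ms. With six attempts, the fifth and sixth delays would be 32000 and 64000 ms uncapped, so both are limited to 30000 ms.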
