
Overview

The generate() method is the recommended way to execute your agent when you want automatic trace collection, timing, and inference result creation. It handles the complete lifecycle:
  1. Initializes trace collection
  2. Executes your agent with timing measurement
  3. Creates/updates the inference result with all metadata
  4. Saves collected traces to the platform
  5. Cleans up trace context (prevents memory leaks)
Because it manages the entire trace lifecycle for you, generate() is the right choice for most production use.

Agent Options

The simple signature is the quickest way to get started: your function receives just the latest user message as a string.
def my_agent(user_message: str) -> str:
    # In a real scenario, call your model here
    return f"Your model output to: {user_message}"
All three signatures work with evaluations.run(), inference_results.generate(), and simulator.simulate(). Both sync and async functions are supported. The SDK auto-detects which signature you're using from the type hint on the first parameter.

For the full list of fields available on AgentInput (including structured input access via message metadata), see the AgentInput reference.

Parameters

agent (AgentType, required)
The agent function to execute. Supported signatures:
  • Simple: (user_message: str) -> str — receives last user message
  • Chat History: (messages: list[dict]) -> str — receives full chat history (OpenAI format)
  • Structured: (input_data: AgentInput) -> AgentResponse — structured input/output. See the AgentInput reference for all available fields, including structured input access via message metadata.
Both sync and async functions are supported. Use the structured signature when you need usage_info, cost_info, or retrieval_context.

session (Session, required)
The session object to associate the inference result with. Obtain one via galtea.sessions.create() or galtea.sessions.get().

input (str | dict)
The input to send to the agent. Accepts a plain string or a dict with structured fields (e.g. {"user_message": "hello", "chat_type": "support"}). Stored directly in the inference result. If not provided and the session is test-based, the test case input is used automatically. Required for production/monitoring sessions.

throw_if_empty_agent_response (bool)
If True, raises a ValueError when the agent returns an empty response. Defaults to False.
context_data is automatically fetched from the session’s test case and passed to the agent via AgentInput.context_data. No need to provide it explicitly.

Returns

Returns an InferenceResult object with all captured data.

Example

# Define your agent function
def my_agent(user_message: str) -> str:
    return f"Response to: {user_message}"


inference_result = galtea.inference_results.generate(
    agent=my_agent, session=session, input="Generate something"
)

What Gets Captured

The generate() method automatically captures and saves:
  • Input — from the input parameter: the user's input (string or structured dict)
  • Output — from AgentResponse.content: the agent's response
  • Retrieval Context — from AgentResponse.retrieval_context: context used (for RAG)
  • Latency — measured by the SDK: end-to-end execution time in ms
  • Usage Info — from AgentResponse.usage_info: token counts (if provided)
  • Cost Info — from AgentResponse.cost_info: cost data (if provided)
  • Traces — from @trace decorators: all traced operations

Providing Usage and Cost Information

Your agent can return usage and cost information in the AgentResponse:
# assumes a configured galtea client and the SDK's AgentInput,
# AgentResponse, and TraceType types are already imported
@galtea.trace(name="main", type=TraceType.AGENT)
def my_agent_with_usage(input_data: AgentInput) -> AgentResponse:
    # Your agent logic...

    return AgentResponse(
        content="Response content",
        usage_info={
            "input_tokens": 150,
            "output_tokens": 75,
            "cache_read_input_tokens": 50,
        },
        cost_info={
            "cost_per_input_token": 0.00001,
            "cost_per_output_token": 0.00003,
            "cost_per_cache_read_input_token": 0.000001,
        },
    )
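For intuition, one plausible way to combine these fields into a total cost is a per-category multiply-and-sum. This is our own illustration, not Galtea's documented billing formula; in particular, we assume cache-read tokens are billed separately from regular input tokens:

```python
usage_info = {"input_tokens": 150, "output_tokens": 75,
              "cache_read_input_tokens": 50}
cost_info = {"cost_per_input_token": 0.00001,
             "cost_per_output_token": 0.00003,
             "cost_per_cache_read_input_token": 0.000001}

# Assumption: each token category is priced independently and summed.
total_cost = (
    usage_info["input_tokens"] * cost_info["cost_per_input_token"]
    + usage_info["output_tokens"] * cost_info["cost_per_output_token"]
    + usage_info["cache_read_input_tokens"]
    * cost_info["cost_per_cache_read_input_token"]
)
# 150 * 0.00001 + 75 * 0.00003 + 50 * 0.000001 = 0.0038
```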

Comparison with Manual Approach

  • Context initialization — generate(): automatic; manual: set_context()
  • Agent execution — generate(): automatic; manual: call the agent yourself
  • Timing measurement — generate(): automatic; manual: measure it yourself
  • Inference result — generate(): auto-created; manual: inference_results.create()
  • Trace export — generate(): automatic; manual: automatic on clear_context()
  • Memory cleanup — generate(): automatic; manual: clear_context()
  • Error handling — generate(): built-in; manual: your own try/finally
Use generate() for most use cases. Use manual set_context()/clear_context() only when you need to manage inference result creation separately or have complex multi-step workflows.

Error Handling

If the agent raises an exception, generate() ensures trace context is cleaned up:
def failing_agent(input_data: AgentInput) -> AgentResponse:
    raise ValueError("Agent failed intentionally")


try:
    result = galtea.inference_results.generate(
        agent=failing_agent, session=session, input="test"
    )
except Exception as e:
    # Trace context is cleaned up before the exception propagates
    print(f"Agent failed: {e}")