What is a Trace?

A trace in Galtea represents a single operation or function call that occurs during an AI agent's execution. Traces capture the internal workings of your agent, such as tool calls, retrieval operations, chain orchestrations, and LLM invocations, providing deep visibility into how your agent processes requests. Every trace must belong to a specific inference result, so you can understand not just what your agent responded with, but how it arrived at that response.

Why Use Traces?

Debugging

Identify exactly where and why your agent failed or produced unexpected results.

Performance Optimization

Pinpoint slow operations with latency tracking at every step.

Compliance & Auditing

Maintain a complete audit trail of all operations for regulatory requirements.

Cost Analysis

Understand which operations consume the most resources.

Trace Hierarchy

Traces support parent-child relationships, allowing you to visualize the complete execution flow of your agent. When a traced function calls another traced function, the hierarchy is automatically captured.
Agent Call (root)
├── Route Query (CHAIN)
├── Retrieve Context (RETRIEVER)
│   └── Vector Search (TOOL)
├── Fetch Product Data (TOOL)
└── Calculate Discount (TOOL)
Each trace includes:
  • id: Unique identifier for the trace
  • parent_trace_id: Reference to the parent trace (null for root traces)
  • name: The operation name
  • type: Classification of the operation (TraceType)
  • description: Human-readable description of what the operation does
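One common way to capture parent-child relationships automatically is to keep the current trace's id in a context variable, so any nested traced call picks it up as its parent. The following is a minimal, hypothetical sketch of that mechanism in plain Python; it is not the Galtea SDK's actual implementation, and `traced`, `TRACES`, and the id scheme are invented for illustration:

```python
import contextvars
import itertools

_current_parent = contextvars.ContextVar("current_parent", default=None)
_ids = itertools.count(1)
TRACES = []  # flat list of trace records, linked via parent_trace_id


def traced(name):
    """Record a trace per call; nested calls inherit the caller's trace id."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            trace_id = next(_ids)
            TRACES.append({
                "id": trace_id,
                "parent_trace_id": _current_parent.get(),  # None for root traces
                "name": name,
            })
            token = _current_parent.set(trace_id)  # become the parent for nested calls
            try:
                return fn(*args, **kwargs)
            finally:
                _current_parent.reset(token)
        return wrapper
    return decorator


@traced("vector_search")
def vector_search():
    pass


@traced("retrieve_context")
def retrieve_context():
    vector_search()  # recorded as a child of retrieve_context


@traced("agent_call")
def agent_call():
    retrieve_context()  # recorded as a child of agent_call


agent_call()
```

Because the parent is tracked per call context rather than passed explicitly, the hierarchy falls out of ordinary function nesting, which matches the behavior described above.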

Trace Types

Traces are classified by type to help you understand the nature of each operation and debug issues more effectively.
| Type | Definition | Why This Matters for Tracing |
| --- | --- | --- |
| SPAN | Generic durations of work in a trace. | Default type for general operations that don't fit other categories. Useful for grouping related work. |
| GENERATION | AI model generations including prompts, token usage, and costs. | This is where cost (tokens) and latency come from. Clearly see these operations and identify expensive calls and bottlenecks. |
| EVENT | Discrete point-in-time events. | Capture important moments without duration, like user interactions or state changes. |
| AGENT | Agent that orchestrates flow and uses tools with LLM guidance. | High-level orchestration nodes that coordinate multiple operations and make decisions. |
| TOOL | Tool/function calls (e.g., external APIs, calculations). | Deterministic or external calls where inputs, outputs, and side effects determine correctness. |
| CHAIN | Links between different application steps. | Composite orchestration nodes that run multiple internal steps and pass data between stages. |
| RETRIEVER | Data retrieval steps (vector store, database). | Operations that fetch contextual data which directly affect prompt relevance and the context window. |
| EVALUATOR | Functions that assess LLM outputs. | Operations that evaluate quality, safety, or correctness of generated content. |
| EMBEDDING | Embedding model calls. | Vector embedding operations for semantic search or similarity. |
| GUARDRAIL | Components that protect against malicious content. | Safety checks that filter or validate inputs/outputs. |
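Once traces are collected, the type field makes aggregate analysis straightforward. A small sketch over plain dicts, used here as stand-ins for the trace objects the SDK returns (field names mirror the properties documented below):

```python
from collections import defaultdict

# Stand-in records; in practice these would come from galtea.traces.list(...)
traces = [
    {"type": "GENERATION", "latency_ms": 124.2},
    {"type": "RETRIEVER", "latency_ms": 120.5},
    {"type": "GENERATION", "latency_ms": 80.0},
]

# Sum latency per trace type to spot the most expensive operation classes
latency_by_type = defaultdict(float)
for t in traces:
    latency_by_type[t["type"]] += t["latency_ms"]

for trace_type, total in sorted(latency_by_type.items()):
    print(f"{trace_type}: {total:.1f}ms")
```

Grouping by GENERATION, for example, isolates the token-cost and latency hotspots the table above calls out.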

The @trace Decorator

The @trace decorator automatically captures function inputs, outputs, timing, errors, and parent-child relationships.

Syntax Options

from galtea import Galtea, TraceType, trace


# Full specification
@trace(name="my_operation", type=TraceType.TOOL)
def my_function_1():
    # Function implementation ...
    print("Doing something...")


# Name only (type defaults to None, for unclassified operations)
@trace(name="custom_name")
def my_function_2():
    # Function implementation ...
    print("Doing something else...")


# Include function docstring as trace description
@trace(type=TraceType.TOOL, include_docstring=True)
def my_function_3(user_id: str):
    """Fetch user data from the database given a user ID."""
    # Function implementation ...
    print(f"Fetching data for user {user_id}...")


# Bare decorator (uses function name)
@trace()
def my_function_4():
    # Function implementation ...
    print("Doing another thing...")

Error Tracking

The decorator automatically captures exceptions. When an error occurs, the trace records:
  • The error message in the error field
  • The execution time until the error
  • Input data that caused the error
@trace(name="risky_operation", type=TraceType.TOOL)
def risky_call(data: str) -> str:
    if not data:
        raise ValueError("Data cannot be empty")
    return f"Processed: {data}"
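To make the error-capture behavior concrete, here is a minimal, hypothetical decorator that records the same three things (error message, latency until failure, and input data) into a local list. It is an illustrative sketch, not the Galtea SDK's implementation; `trace_sketch` and `TRACE_LOG` are invented names:

```python
import functools
import time

TRACE_LOG = []  # stand-in for the SDK's trace store


def trace_sketch(name):
    """Record inputs, output or error, and latency for each call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            record = {
                "name": name,
                "input": {"args": args, "kwargs": kwargs},
                "output": None,
                "error": None,
            }
            start = time.perf_counter()
            try:
                record["output"] = fn(*args, **kwargs)
                return record["output"]
            except Exception as exc:
                record["error"] = str(exc)  # error message lands on the trace
                raise  # the exception still propagates to the caller
            finally:
                # runs on success and on error: latency is always recorded
                record["latency_ms"] = (time.perf_counter() - start) * 1000
                TRACE_LOG.append(record)
        return wrapper
    return decorator


@trace_sketch(name="risky_operation")
def risky_call(data: str) -> str:
    if not data:
        raise ValueError("Data cannot be empty")
    return f"Processed: {data}"
```

Note that the decorator re-raises: tracing observes failures without swallowing them, so your normal error handling is unaffected.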

Viewing Trace Hierarchy

After collecting traces, you can visualize the execution in the Dashboard or by using the following code:
traces = galtea.traces.list(inference_result_id=inference_result.id)

print("Trace Hierarchy:")


def print_trace_tree(traces, parent_id=None, indent=0):
    for trace in traces:
        if trace.parent_trace_id == parent_id:
            prefix = "  " * indent + ("└─ " if indent > 0 else "")
            print(f"{prefix}{trace.name} ({trace.type}) - {trace.latency_ms:.2f}ms")
            print_trace_tree(traces, trace.id, indent + 1)


print_trace_tree(traces)
Example output:
main_agent (AGENT) - 245.30ms
  └─ route_query (CHAIN) - 0.15ms
  └─ search_documents (RETRIEVER) - 120.50ms
  └─ call_llm (GENERATION) - 124.20ms

SDK Integration

Tracing Tutorial

Step-by-step guide to instrumenting your agent and collecting traces.

Trace Service

Manage and collect traces for your AI agent operations using the SDK.

Trace Properties

  • Inference Result (InferenceResult, required): The inference result this trace belongs to. Every trace must be linked to an inference result.
  • Name (string, required): The name of the traced operation (e.g., function name).
  • Type (TraceType): The type of operation: SPAN, GENERATION, EVENT, AGENT, TOOL, CHAIN, RETRIEVER, EVALUATOR, EMBEDDING, or GUARDRAIL.
  • Description (string): A human-readable description of the operation. Can be set manually via start_trace(description=...) or automatically from function docstrings using @trace(include_docstring=True). Maximum size: 32KB.
  • Parent Trace ID (string): The ID of the parent trace for hierarchical relationships.
  • Input Data (any): The input parameters passed to the operation. Maximum size: 128KB.
  • Output Data (any): The result returned by the operation. Maximum size: 128KB.
  • Error (string): Error message if the operation failed.
  • Latency (ms) (float): The execution time of the operation in milliseconds.
  • Start Time (string): ISO 8601 timestamp when the operation started.
  • End Time (string): ISO 8601 timestamp when the operation completed.
  • Metadata (any): Additional custom metadata about the trace. Maximum size: 128KB.

Best Practices

Choose descriptive names that clearly indicate the operation being traced:
# ✅ Good - descriptive
@trace(name="fetch_customer_orders")
# ❌ Bad - generic
@trace(name="step_1")
Trace operations that represent logical units of work, not every single function:
# ✅ Good - meaningful operation
@trace(name="search_products", type=TraceType.RETRIEVER)
def search_products(self, query):
    results = self._query_vector_db(query)  # Internal, not traced
    return self._format_results(results)  # Internal, not traced
Classify operations correctly to enable better filtering and analysis in the dashboard.
# ✅ Good - correct classification
@trace(name="fetch_user_data", type=TraceType.RETRIEVER)
def fetch_user_data(self, user_id: str) -> dict:
    # Function implementation ...
    return {"user_id": user_id, "data": "Sample data"}


# ❌ Bad - incorrect classification
@trace(name="fetch_user_data", type=TraceType.TOOL)
def fetch_user_data_incorrect(self, user_id: str) -> dict:
    # Function implementation ...
    return {"user_id": user_id, "data": "Sample data"}
The decorator captures function arguments automatically. Consider what’s useful for debugging:
@trace(name="process_document", type=TraceType.TOOL)
def process(self, doc_id: str) -> dict:
    # Only doc_id is captured as input, not the full document
    doc = self.fetch_document(doc_id)
    return {"summary": doc.summary, "status": "processed"}