What is a Trace?

A trace in Galtea represents a single operation or function call that occurs during an AI agent’s execution. Traces capture the internal workings of your agent—such as tool calls, retrieval operations, chain orchestrations, and LLM invocations—providing deep visibility into how your agent processes requests. Traces are linked to inference results, enabling you to understand not only what your agent responded with, but also how it arrived at that response.

Why Use Traces?

Debugging

Identify exactly where and why your agent failed or produced unexpected results.

Performance Optimization

Pinpoint slow operations with latency tracking at every step.

Compliance & Auditing

Maintain a complete audit trail of all operations for regulatory requirements.

Cost Analysis

Understand which operations consume the most resources.

Trace Hierarchy

Traces support parent-child relationships, allowing you to visualize the complete execution flow of your agent. When a traced function calls another traced function, the hierarchy is automatically captured.
Agent Call (root)
├── Route Query (CHAIN)
├── Retrieve Context (RETRIEVER)
│   └── Vector Search (TOOL)
├── Fetch Product Data (TOOL)
└── Calculate Discount (TOOL)
Each trace includes:
  • id: Unique identifier for the trace
  • parent_trace_id: Reference to the parent trace (null for root traces)
  • name: The operation name
  • node_type: Classification of the operation
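This hierarchy emerges from ordinary function calls: decorate each operation with @trace (described in detail below) and nested calls are recorded as child traces automatically. A minimal sketch, with illustrative names mirroring part of the diagram above:
from galtea import trace, NodeType

@trace(name="Vector Search", node_type=NodeType.TOOL)
def vector_search(query):
    ...  # query the vector store (implementation omitted)

@trace(name="Retrieve Context", node_type=NodeType.RETRIEVER)
def retrieve_context(query):
    # Calling another traced function records it as a child trace
    return vector_search(query)

@trace(name="Agent Call", node_type=NodeType.CHAIN)
def agent_call(query):
    context = retrieve_context(query)
    ...  # route the query, call tools, and invoke the LLM (omitted)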

Node Types

Traces are classified by node type to help you understand the nature of each operation and debug issues more effectively:
  • CHAIN: A node composed of multiple smaller steps (e.g., a LangChain RunnableSequence or LangGraph node). Why this matters for tracing: composite orchestration nodes run multiple internal steps and pass data between stages; exposing each internal step uncovers hidden failures, clarifies data flow, and pinpoints performance hotspots.
  • TOOL: A node executing deterministic code (search, calculator, API). Why this matters for tracing: deterministic or external calls are correct or not based on their inputs, outputs, and side effects; tracing them makes failures reproducible, reveals integration issues, and simplifies debugging.
  • RETRIEVER: A node specifically for fetching context (RAG operations). Why this matters for tracing: retrieved context directly affects prompt relevance and the context window; making retrieval inputs, results, and ranks visible helps diagnose low-quality context, improve relevance, and avoid wasted context and cost.
  • LLM: A node that wraps a direct model API call (OpenAI, Anthropic, etc.). Why this matters for tracing: model calls are the main source of token cost and latency; tracing them makes expensive calls and bottlenecks easy to identify.
  • CUSTOM: Any generic Python function that doesn’t fit the above (e.g., data formatting, state parsing). Why this matters for tracing: utility or glue code still shapes runtime behavior; tracing it surfaces subtle bugs, data transformations, and state changes that would otherwise be opaque.
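For instance, an LLM node typically wraps the model client call directly. A minimal sketch (the OpenAI client usage is illustrative and not part of the Galtea SDK):
from galtea import trace, NodeType
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

@trace(name="call_llm", node_type=NodeType.LLM)
def call_llm(prompt: str) -> str:
    # The trace captures the prompt as input, the completion as output, and the call's latency
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content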

The @trace Decorator

The @trace decorator automatically captures function inputs, outputs, timing, errors, and parent-child relationships.

Syntax Options

from galtea import trace, NodeType

# Full specification
@trace(name="my_operation", node_type=NodeType.TOOL)
def my_function(): ...

# Name only (node_type defaults to None, for unclassified operations)
@trace(name="custom_name")
def my_function(): ...

# Bare decorator (uses function name)
@trace
def my_function(): ...

# Empty parentheses
@trace()
def my_function(): ...

Error Tracking

The decorator automatically captures exceptions. When an error occurs, the trace records:
  • The error message in the error field
  • The execution time until the error
  • Input data that caused the error
@trace(name="risky_operation", node_type=NodeType.TOOL)
def risky_call(self, data: str) -> str:
    if not data:
        raise ValueError("Data cannot be empty")
    return f"Processed: {data}"

Viewing Trace Hierarchy

After collecting traces, you can visualize the execution flow:
# `galtea` is an initialized Galtea SDK client; `inference_result` comes from a prior inference call
traces = galtea.traces.list(inference_result_id=inference_result.id)

def print_trace_tree(traces, parent_id=None, indent=0):
    for trace in traces:
        if trace.parent_trace_id == parent_id:
            prefix = "  " * indent + ("└─ " if indent > 0 else "")
            print(f"{prefix}{trace.name} ({trace.node_type}) - {trace.latency_ms:.2f}ms")
            print_trace_tree(traces, trace.id, indent + 1)

print_trace_tree(traces)
Example output:
main_agent (CHAIN) - 245.30ms
  └─ route_query (CHAIN) - 0.15ms
  └─ search_documents (RETRIEVER) - 120.50ms
  └─ call_llm (LLM) - 124.20ms

SDK Integration

  • Tracing Tutorial: Step-by-step guide to instrumenting your agent and collecting traces.
  • Trace Service: Manage and collect traces for your AI agent operations using the SDK.

Trace Properties

  • Session (Session, required): The session to which the trace belongs.
  • Inference Result (InferenceResult): The inference result this trace is associated with.
  • Name (string, required): The name of the traced operation (e.g., function name).
  • Node Type (string): The type of operation: TOOL, CHAIN, RETRIEVER, LLM, or CUSTOM.
  • Parent Trace ID (string): The ID of the parent trace for hierarchical relationships.
  • Input Data (object): The input parameters passed to the operation.
  • Output Data (object): The result returned by the operation.
  • Error (string): Error message if the operation failed.
  • Latency (ms) (float): The execution time of the operation in milliseconds.
  • Start Time (string): ISO 8601 timestamp when the operation started.
  • End Time (string): ISO 8601 timestamp when the operation completed.
  • Metadata (object): Additional custom metadata about the trace.
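Put together, a single trace record has roughly this shape (field values are illustrative and the exact serialization may differ):
example_trace = {
    "id": "trc_123",                            # unique identifier
    "parent_trace_id": None,                    # None for root traces
    "name": "search_documents",
    "node_type": "RETRIEVER",
    "input_data": {"query": "warranty policy"},
    "output_data": {"documents_found": 3},
    "error": None,                              # set only if the operation failed
    "latency_ms": 120.5,
    "start_time": "2025-01-01T12:00:00.000Z",
    "end_time": "2025-01-01T12:00:00.121Z",
    "metadata": {"index": "products-v2"},
}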

Best Practices

Use descriptive names that clearly identify the operation:
# ✅ Good - descriptive
@trace(name="fetch_customer_orders", node_type=NodeType.TOOL)

# ❌ Bad - generic
@trace(name="step_1", node_type=NodeType.TOOL)
Trace operations that represent logical units of work, not every single function:
# ✅ Good - meaningful operation
@trace(name="search_products", node_type=NodeType.RETRIEVER)
def search_products(self, query):
    results = self._query_vector_db(query)  # Internal, not traced
    return self._format_results(results)     # Internal, not traced
Classify operations correctly to enable better filtering and analysis in the dashboard.
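For example (function names are illustrative; NodeType.RETRIEVER and NodeType.CUSTOM are assumed to mirror the node types listed above):
# ✅ Good - a retrieval step classified as RETRIEVER shows up in retrieval-focused filters
@trace(name="search_knowledge_base", node_type=NodeType.RETRIEVER)
def search_knowledge_base(self, query): ...

# ❌ Bad - the same step tagged as CUSTOM is harder to find and analyze
@trace(name="search_knowledge_base", node_type=NodeType.CUSTOM)
def search_knowledge_base(self, query): ...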
The decorator captures function arguments automatically. Consider what’s useful for debugging:
@trace(name="process_document", node_type=NodeType.TOOL)
def process(self, doc_id: str) -> dict:
    # Only doc_id is captured as input, not the full document
    doc = self.fetch_document(doc_id)
    return {"summary": doc.summary, "status": "processed"}