You can use Galtea to log and evaluate real user interactions from your production environment. This helps you monitor your product’s performance over time.

Single-Turn Production Monitoring

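For simple, single-turn interactions, create a production session and use galtea.inference_results.create_and_evaluate() to log and evaluate the interaction in a single call.

# Assumes `galtea` is an already-initialized Galtea SDK client and VERSION_ID
# holds the ID of the product version being monitored.
# In your application's request handler...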
def handle_user_query(user_query: str, retrieval_context: str | None = None) -> str:
    # Your logic to get a response from your model
    model_response = your_product_function(user_query, retrieval_context)

    # Log and evaluate the interaction in Galtea
    session = galtea.sessions.create(version_id=VERSION_ID, is_production=True)
    galtea.inference_results.create_and_evaluate(
        session_id=session.id,
        input=user_query,
        output=model_response,
        retrieval_context=retrieval_context,
        metrics=[
            {"name": "Role Adherence"},
            {"name": "Answer Relevancy"},
            {"name": "Faithfulness"},
        ],
    )

    return model_response


# Test the handler
handle_user_query(
    "What are your business hours?", "Business hours: 9am-5pm Monday-Friday"
)
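
In a live request handler, a failure in the monitoring call should usually not break the user-facing response. The following is a minimal sketch of one way to isolate it, not something the Galtea SDK provides: `log_safely` is a hypothetical helper, and it assumes the Galtea calls shown above are wrapped in a zero-argument callable.

```python
import logging
from typing import Callable

logger = logging.getLogger("monitoring")


def log_safely(log_call: Callable[[], object]) -> bool:
    """Run a monitoring call and report failures instead of raising.

    `log_call` is a hypothetical zero-argument callable wrapping the
    logging calls (for example, session creation plus create_and_evaluate).
    Returns True if the call succeeded, False if it raised.
    """
    try:
        log_call()
        return True
    except Exception:
        # Record the failure, but keep the request path alive.
        logger.exception("Failed to log interaction to the monitoring backend")
        return False
```

Inside a handler, this could be used as `log_safely(lambda: galtea.inference_results.create_and_evaluate(...))`, so that the model response is still returned if logging fails.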

Multi-Turn Production Monitoring (Conversations)

For multi-turn conversations, use the session-based workflow to log the entire interaction first and then evaluate it.

1. Create a Session

First, create a session at the start of the conversation. For production monitoring, make sure to set is_production=True.
# Use is_production=True for real user interactions
session = galtea.sessions.create(
    # Optional: a custom ID that links this Galtea session to the matching
    # session in your own application.
    custom_id="CLIENT_PROVIDED_SESSION_ID",
    version_id=VERSION_ID,
    is_production=True,
)

2. Log Conversation Turns

Next, log the user-assistant interactions. You can log each turn individually as it happens, or log all turns in a single batch after the conversation ends.
Logging each turn individually, as in this example, is useful for recording interactions in real time in a live application.
def get_model_response(user_input: str) -> str:
    # Replace this with your actual model call
    model_output = f"This is a simulated response to '{user_input}'"
    return model_output


# This would happen dynamically in your application.
user_questions = [
    "What are some lower-risk investment strategies?",
    "With age, should the investment strategy change?",
    "Great, thanks!",
]

for question in user_questions:
    model_response = get_model_response(question)
    # Log the turn to Galtea right after it happens
    inference_result = galtea.inference_results.create(
        session_id=session.id, input=question, output=model_response
    )
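
If you would rather log after the conversation ends, one option is to buffer turns in memory and send them afterwards. This is a minimal sketch under stated assumptions: `TurnBuffer` is a hypothetical helper, not part of the Galtea SDK, and it only relies on the same `inference_results.create(session_id=..., input=..., output=...)` call used above. It does not assume any dedicated batch endpoint.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TurnBuffer:
    """Collects (input, output) pairs in memory until the conversation ends."""

    turns: list[tuple[str, str]] = field(default_factory=list)

    def add(self, user_input: str, model_output: str) -> None:
        self.turns.append((user_input, model_output))

    def flush(self, client: Any, session_id: str) -> int:
        """Log buffered turns in order and return how many were logged.

        `client` is assumed to expose inference_results.create(...) with the
        keyword arguments used in the per-turn example. A turn is removed
        from the buffer only after its call returns, so a retry after a
        failure does not log earlier turns twice.
        """
        logged = 0
        while self.turns:
            user_input, model_output = self.turns[0]
            client.inference_results.create(
                session_id=session_id, input=user_input, output=model_output
            )
            self.turns.pop(0)
            logged += 1
        return logged
```

In an application, you would call `buffer.add(question, model_response)` after each turn and `buffer.flush(galtea, session.id)` once the conversation is over. The trade-off is that buffered turns are lost if the process exits before the flush.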

3. Evaluate the Session

Finally, once the conversation is complete and all turns are logged, you can run an evaluation on the entire session.
# METRICS_TO_EVALUATE is a placeholder for the list of metrics to run on the session.
galtea.evaluations.create(session_id=session.id, metrics=METRICS_TO_EVALUATE)

print(f"Logged and evaluated production session {session.id}")
For more details on evaluating multi-turn conversations, see the Evaluating Conversations guide.
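
The evaluation snippet above leaves METRICS_TO_EVALUATE undefined. As a sketch only, it could be built in the same list-of-dicts shape as the `metrics` argument in the single-turn example. This assumes that `galtea.evaluations.create()` accepts that shape; `build_metrics` is a hypothetical helper, and the metric names are copied from the single-turn example, so they may not all apply to session-level evaluation.

```python
def build_metrics(names: list[str]) -> list[dict[str, str]]:
    """Turn metric names into the {"name": ...} entries used earlier."""
    return [{"name": name} for name in names]


# Hypothetical selection; replace with the metrics configured for your product.
METRICS_TO_EVALUATE = build_metrics(["Role Adherence", "Answer Relevancy"])
```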

Next Steps

Evaluating Conversations

Evaluate full multi-turn production conversations.

Tracing Agent Operations

Capture and analyze internal agent operations in production.