You can use Galtea to log and evaluate real user interactions from your production environment. This helps you monitor your product’s performance over time.

Single-Turn Production Monitoring

For simple, single-turn interactions, you can send the details to Galtea using create_single_turn with is_production=True.
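# Assumes the Galtea SDK client is already initialized, for example:
#   from galtea import Galtea
#   galtea = Galtea(api_key="YOUR_API_KEY")
# VERSION_ID is the ID of the product version you are monitoring.

# In your application's request handler...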
def handle_user_query(user_query: str, retrieval_context: str | None = None) -> str:
    # Your logic to get a response from your model
    model_response = your_product_function(user_query, retrieval_context)

    # Log and evaluate the interaction in Galtea
    galtea.evaluations.create_single_turn(
        version_id=VERSION_ID,
        is_production=True,
        metrics=[
            {"name": "Role Adherence"},
            {"name": "Answer Relevancy"},
            {"name": "Faithfulness"},
        ],
        input=user_query,
        actual_output=model_response,
        retrieval_context=retrieval_context,
    )

    return model_response


# Test the handler
handle_user_query(
    "What are your business hours?", "Business hours: 9am-5pm Monday-Friday"
)
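
In a live request handler, a failure in the monitoring call should usually not fail the user's request. The sketch below shows one way to isolate it. The names `log_safely`, `handle_user_query_safely`, `answer_fn`, and `log_fn` are hypothetical helpers introduced for illustration and are not part of the Galtea SDK.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def log_safely(log_call: Callable[[], object]) -> bool:
    """Run a monitoring call without letting its failure reach the user.

    Returns True if the call succeeded, False if it raised.
    """
    try:
        log_call()
        return True
    except Exception:
        # Record the failure locally instead of failing the user request.
        logger.exception("Failed to log interaction for monitoring")
        return False


def handle_user_query_safely(
    user_query: str,
    answer_fn: Callable[[str], str],
    log_fn: Callable[[str, str], object],
) -> str:
    # Produce the answer first, then attempt to log it.
    model_response = answer_fn(user_query)
    log_safely(lambda: log_fn(user_query, model_response))
    return model_response
```

Here `log_fn` would wrap the `galtea.evaluations.create_single_turn(...)` call shown above, so the handler still returns the model response if that call raises.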

Multi-Turn Production Monitoring (Conversations)

For multi-turn conversations, use the session-based workflow to log the entire interaction first and then evaluate it.

1. Create a Session

First, create a session at the start of the conversation. For production monitoring, make sure to set is_production=True.
# Use is_production=True for real user interactions
session = galtea.sessions.create(
    custom_id="CLIENT_PROVIDED_SESSION_ID",  # Optional: your application's own session ID, so the Galtea session can be matched to it.
    version_id=VERSION_ID,
    is_production=True,
)
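
In an application that serves many conversations at once, each conversation needs exactly one session, created on its first turn and reused afterwards. The following is a minimal in-memory sketch of that bookkeeping. `SessionRegistry` is a hypothetical helper, not part of the Galtea SDK, and it assumes a single process with no concurrent access to the same conversation.

```python
from typing import Callable, Dict, Generic, Optional, TypeVar

S = TypeVar("S")


class SessionRegistry(Generic[S]):
    """Keeps one monitoring session per application conversation (hypothetical helper)."""

    def __init__(self, create_session: Callable[[str], S]) -> None:
        self._create_session = create_session
        self._sessions: Dict[str, S] = {}

    def get(self, conversation_id: str) -> S:
        # Create the session on the first turn only; reuse it afterwards.
        if conversation_id not in self._sessions:
            self._sessions[conversation_id] = self._create_session(conversation_id)
        return self._sessions[conversation_id]

    def close(self, conversation_id: str) -> Optional[S]:
        # Forget the session when the conversation ends and hand it back
        # so the caller can evaluate it. Returns None if it was never created.
        return self._sessions.pop(conversation_id, None)
```

The factory passed to the registry could be `lambda cid: galtea.sessions.create(custom_id=cid, version_id=VERSION_ID, is_production=True)`, which reuses the call shown above with your own conversation ID as `custom_id`.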

2. Log Conversation Turns

Next, log the user-assistant interactions. You can log each turn as it happens or send all turns in a single batch after the conversation ends.
Logging turn by turn, as in the example below, suits live applications where interactions should be recorded in real time.
def get_model_response(user_input: str) -> str:
    # Replace this with your actual model call
    model_output = f"This is a simulated response to '{user_input}'"
    return model_output


# This would happen dynamically in your application.
user_questions = [
    "What are some lower-risk investment strategies?",
    "With age, should the investment strategy change?",
    "Great, thanks!",
]

for question in user_questions:
    model_response = get_model_response(question)
    # Log the turn to Galtea right after it happens
    inference_result = galtea.inference_results.create(
        session_id=session.id, input=question, output=model_response
    )
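
If you prefer to log after the conversation ends, one option is to buffer the turns locally and send them in order at the end. The sketch below does this by replaying the same per-turn call. `TurnBuffer` and `log_turn` are hypothetical names introduced for illustration; if your SDK version offers a dedicated batch call, it could replace the loop in `flush`.

```python
from collections import deque
from typing import Callable, Deque, Tuple


class TurnBuffer:
    """Holds (input, output) pairs until the conversation ends (hypothetical helper)."""

    def __init__(self) -> None:
        self._turns: Deque[Tuple[str, str]] = deque()

    def add(self, user_input: str, model_output: str) -> None:
        self._turns.append((user_input, model_output))

    def __len__(self) -> int:
        return len(self._turns)

    def flush(self, log_turn: Callable[[str, str], object]) -> int:
        """Send buffered turns in their original order; return how many were sent."""
        sent = 0
        while self._turns:
            user_input, model_output = self._turns[0]
            # If this raises, the turn stays buffered so a later flush can retry it.
            log_turn(user_input, model_output)
            self._turns.popleft()
            sent += 1
        return sent
```

At the end of the conversation, `buffer.flush(lambda i, o: galtea.inference_results.create(session_id=session.id, input=i, output=o))` would log every buffered turn with the call used above. Keeping unsent turns in the buffer on failure means a retry does not lose or reorder them.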

3. Evaluate the Session

Finally, once the conversation is complete and all turns are logged, you can run an evaluation on the entire session.
# METRICS_TO_EVALUATE is the list of metrics you want to run on the session, defined by you.
galtea.evaluations.create(session_id=session.id, metrics=METRICS_TO_EVALUATE)

print(f"Logged and evaluated production session {session.id}")
For more details on evaluating multi-turn conversations, see the Evaluating Conversations guide.