Single-Turn Production Monitoring

Multi-Turn Production Monitoring (Conversations)

Learn how to log and evaluate user queries from your production environment.

Monitor Production Responses

Galtea Docs

Welcome to Galtea, a comprehensive AI evaluation platform that helps enterprises improve AI reliability, reduce risks, streamline compliance, and accelerate time to market.

Introduction

How to request access to the Galtea platform

Registration

All you need to get started with Galtea evaluations

Quickstart

Learn how to create effective product descriptions that power comprehensive AI evaluation.

Creating Product Descriptions

Learn how to create and upload custom tests using the SDK

Create a Custom Test

Learn how to run evaluation tasks for a single-turn, test-based workflow.

Run a Test-Based Evaluation

Learn how to evaluate multi-turn conversations using Galtea's session-based workflow.

Evaluating Conversations

Learn how to run evaluations with your own pre-calculated scores.

Evaluate with Custom Scores

Learn how to integrate Galtea's evaluation capabilities into your GitHub Actions workflow

GitHub Actions

A functionality or service evaluated by Galtea

Product

A specific iteration of a product in Galtea

Version

A set of test cases for evaluating product performance

Test

A single challenge for evaluating product performance

Test Case

A group of inference results from a session to be used by evaluation tasks

Evaluation

A task that evaluates a group of inference results using a metric type

Evaluation Task

A group of inference results that make up a full conversation

Session

A single turn in a conversation between a user and an AI system

Inference Result

Ways to evaluate and score product performance

Metric Type

A representation of an LLM, with cost information used to calculate cost estimates

Model

Welcome to the Galtea SDK, a powerful toolkit that enables developers to integrate Galtea's AI evaluation capabilities directly into their workflows. Our SDK provides programmatic access to comprehensive testing, evaluation, and compliance features to improve AI reliability and accelerate development.

