Test Case

A single challenge for evaluating product performance

What is a Test Case?

Using Test Cases in Evaluations

Related page: Create an Evaluation

SDK Integration

Related page: Test Case Service SDK

Test Case Properties
