Creating an Evaluation

Evaluations allow you to assess how well a specific version of your product performs against a set of test cases by running individual evaluation tasks.

Here is how to create an evaluation:

from galtea import Galtea

# Initialize Galtea SDK
galtea = Galtea(api_key="YOUR_API_KEY")

# Create an evaluation
evaluation = galtea.evaluations.create(
    test_id="YOUR_TEST_ID",
    version_id="YOUR_VERSION_ID"
)

print(f"Evaluation created with ID: {evaluation.id}")

An evaluation links a specific version of your product to a test, providing the context in which individual evaluation tasks are run.

Running Evaluation Tasks with Self-Calculated Scores

Once you've created an evaluation, you can run evaluation tasks against it and assign each one a self-calculated score directly.
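
Here is a minimal sketch of a single task. The test case ID, output string, and score value are placeholders; in practice the score comes from your own logic:

# Run a single evaluation task with a self-calculated score
galtea.evaluation_tasks.create(
    evaluation_id=evaluation.id,
    test_case_id="YOUR_TEST_CASE_ID",
    actual_output="Your product's answer for this test case",
    metrics=["metric_accuracy"],
    scores=[0.85],  # a score between 0.0 and 1.0, computed by your own criteria
)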

For efficiency, you can process multiple evaluation tasks using a loop and the galtea.evaluation_tasks.create method:

from datetime import datetime

# Load your test cases
test_cases = galtea.test_cases.list(test_id="YOUR_TEST_ID")

# Evaluate all test cases
for test_case in test_cases:
    # Retrieve relevant context for RAG. This may not apply to all products.
    retrieval_context = your_retriever_function(test_case.input)

    # Your product's actual response to the input, timed to capture latency
    time_before_call = datetime.now()
    response = your_product_function(test_case.input, test_case.context, retrieval_context)
    time_after_call = datetime.now()

    # Run evaluation task
    galtea.evaluation_tasks.create(
        evaluation_id=evaluation.id,
        test_case_id=test_case.id,
        actual_output=response.output,
        retrieval_context=retrieval_context,

        latency=(time_after_call - time_before_call).total_seconds() * 1000,  # milliseconds
        usage_info={
            "input_tokens": response.input_tokens,
            "output_tokens": response.output_tokens,
            "cache_read_input_tokens": response.cache_read_input_tokens,
        },

        metrics=["metric_accuracy", "metric_relevance"],
        # Your own functions that return a score between 0.0 and 1.0 based on your criteria
        scores=[get_score_accuracy(response.output), get_score_relevance(response.output)],
    )

The metrics parameter specifies which metric types to use for evaluating the task. You can use multiple metrics simultaneously to get different perspectives on performance.
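
In the loop above, each entry in scores supplies the value for the metric at the same position: get_score_accuracy for metric_accuracy and get_score_relevance for metric_relevance. The scoring logic itself is entirely yours. As a purely illustrative sketch (the keyword set and overlap heuristic below are assumptions, not part of the SDK), a self-calculated accuracy scorer could be as simple as:

# Purely illustrative scoring helper: any logic that maps an output to a value
# between 0.0 and 1.0 works. Here, a toy keyword-overlap heuristic.
EXPECTED_KEYWORDS = {"paris", "capital", "france"}  # hypothetical criteria

def get_score_accuracy(output: str) -> float:
    tokens = set(output.lower().split())
    return len(EXPECTED_KEYWORDS & tokens) / len(EXPECTED_KEYWORDS)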