Welcome to Galtea, the comprehensive AI evaluation platform that empowers enterprises to improve AI reliability, reduce risk, streamline compliance, and accelerate time to market.
Register your organization on Galtea’s platform
Execute your first Evaluation in less than 5 minutes
The typical workflow in Galtea follows these steps (sketched with the SDK after the list):
Create Product
Create Version
Create Test
Run Evaluation Tasks
Analyze Results
Iterate
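A minimal sketch of this loop using the Python SDK might look like the following. The client class, method names, and parameters here (`Galtea`, `products.create`, `versions.create`, and so on) are assumptions for illustration; consult the SDK reference for the actual API.

```python
import os

from galtea import Galtea  # assumed client class; check the SDK reference

# Authenticate with your organization's API key (assumed constructor).
galtea = Galtea(api_key=os.environ["GALTEA_API_KEY"])

# 1. Create Product: the functionality or service being evaluated.
product = galtea.products.create(name="support-chatbot")

# 2. Create Version: a specific iteration of that product.
version = galtea.versions.create(product_id=product.id, name="v1.0.0")

# 3. Create Test: a set of test cases for evaluating performance.
test = galtea.tests.create(product_id=product.id, name="smoke-tests")

# 4. Run Evaluation Tasks: evaluate the version against the test.
evaluation = galtea.evaluations.create(version_id=version.id, test_id=test.id)

# 5. Analyze Results: inspect the outcome, then iterate with a new version.
print(evaluation.status)
```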
You can interact with Galtea through multiple channels:
Manage your products and access insights via the dashboard.
Seamlessly integrate our services using the Python SDK.
Automate your workflows by integrating with GitHub Actions (see the CI sketch after this list).
Documentation is coming soon.
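For instance, a script like the one below could run as a step in a GitHub Actions job and fail the build when scores regress. Everything here is an assumption for illustration: the environment variables, the `evaluation_tasks.list` call, the `score` attribute, and the 0.8 threshold.

```python
import os
import sys

from galtea import Galtea  # assumed client class

# The API key comes from a repository secret exposed as an env var.
galtea = Galtea(api_key=os.environ["GALTEA_API_KEY"])

# Hypothetical lookup: fetch evaluation tasks for the version built in CI.
tasks = galtea.evaluation_tasks.list(version_id=os.environ["GALTEA_VERSION_ID"])

# Gate the pipeline: a non-zero exit makes the Actions job fail.
if any(task.score is not None and task.score < 0.8 for task in tasks):
    sys.exit("Galtea evaluation scores fell below the 0.8 threshold")
```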
Galtea is built around several key concepts that work together to provide comprehensive evaluation of AI products (a conceptual sketch follows the list):
Product: A functionality or service being evaluated.
Version: A specific iteration of a product.
Test: A set of test cases for evaluating product performance.
Session: A full conversation between a user and an AI system.
Inference Result: A single turn in a conversation between a user and the AI.
Evaluation: A group of evaluable Inference Results from a particular session.
Evaluation Task: The assessment of an evaluation using a specific metric type’s criteria.
Metric Type: A way to evaluate and score product performance.
Model: A way to keep track of your models’ costs.
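To make the relationships concrete, here is a purely illustrative sketch of how these concepts nest. These dataclasses are not the SDK’s actual types, just a conceptual model of the hierarchy described above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Conceptual model only: the real objects live in the Galtea platform.
# Names mirror the concepts listed above.

@dataclass
class Version:
    name: str  # a specific iteration of a product

@dataclass
class Product:
    name: str  # the functionality or service being evaluated
    versions: list[Version] = field(default_factory=list)

@dataclass
class Test:
    name: str  # a set of test cases for evaluating product performance

@dataclass
class InferenceResult:
    input: str   # what the user said in this turn
    output: str  # what the AI answered

@dataclass
class Session:
    inference_results: list[InferenceResult] = field(default_factory=list)

@dataclass
class Evaluation:
    session: Session  # groups evaluable inference results from one session

@dataclass
class EvaluationTask:
    evaluation: Evaluation
    metric_type: str  # a way to evaluate and score product performance
```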