Welcome to Galtea, the comprehensive AI evaluation platform that empowers enterprises to improve AI reliability, reduce risk, streamline compliance, and accelerate time to market.
Register your organization on Galtea’s platform
Execute your first Evaluation in less than 5 minutes
The typical workflow in Galtea follows these steps (sketched with the SDK after the list):
Create Product
Create Version
Create Test
Run Evaluation Tasks
Analyze Results
Iterate
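A minimal sketch of this loop using the Python SDK might look like the following. The client class, method names, and parameters here (`Galtea`, `products.create`, `versions.create`, and so on) are assumptions for illustration; consult the SDK reference for the actual API.

```python
import os

from galtea import Galtea  # assumed client class; check the SDK reference

# Authenticate with your organization's API key (assumed constructor).
galtea = Galtea(api_key=os.environ["GALTEA_API_KEY"])

# 1. Create Product: the functionality or service being evaluated.
product = galtea.products.create(name="support-chatbot")

# 2. Create Version: a specific iteration of that product.
version = galtea.versions.create(product_id=product.id, name="v1.0.0")

# 3. Create Test: a set of test cases for evaluating performance.
test = galtea.tests.create(product_id=product.id, name="smoke-tests")

# 4. Run Evaluation Tasks: evaluate the version against the test.
evaluation = galtea.evaluations.create(version_id=version.id, test_id=test.id)

# 5. Analyze Results: inspect the outcome, then iterate with a new version.
print(evaluation.status)
```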
You can interact with Galtea through multiple channels:
Manage your products and access insights via the dashboard.
Seamlessly integrate our services using the Python SDK.
Automate your workflows by integrating with GitHub Actions (see the CI sketch after this list).
Documentation is coming soon.
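For instance, a script like the one below could run as a step in a GitHub Actions job and fail the build when scores regress. Everything here is an assumption for illustration: the environment variables, the `evaluation_tasks.list` call, the `score` attribute, and the 0.8 threshold.

```python
import os
import sys

from galtea import Galtea  # assumed client class

# The API key comes from a repository secret exposed as an env var.
galtea = Galtea(api_key=os.environ["GALTEA_API_KEY"])

# Hypothetical lookup: fetch evaluation tasks for the version built in CI.
tasks = galtea.evaluation_tasks.list(version_id=os.environ["GALTEA_VERSION_ID"])

# Gate the pipeline: a non-zero exit makes the Actions job fail.
if any(task.score is not None and task.score < 0.8 for task in tasks):
    sys.exit("Galtea evaluation scores fell below the 0.8 threshold")
```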
Galtea is built around several key concepts that work together to provide comprehensive evaluation of AI products (a conceptual sketch follows the list):
Product: A functionality or service being evaluated.
Version: A specific iteration of a product.
Test: A set of test cases for evaluating product performance.
Session: A full conversation between a user and an AI system.
Inference Result: A single turn in a conversation between a user and the AI.
Evaluation: A group of evaluable Inference Results from a particular session.
Evaluation Task: The assessment of an evaluation using a specific metric type’s criteria.
Metric Type: A way to evaluate and score product performance.
Model: A way to keep track of your models’ costs.
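To make the relationships concrete, here is a purely illustrative sketch of how these concepts nest. These dataclasses are not the SDK’s actual types, just a conceptual model of the hierarchy described above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Conceptual model only: the real objects live in the Galtea platform.
# Names mirror the concepts listed above.

@dataclass
class Version:
    name: str  # a specific iteration of a product

@dataclass
class Product:
    name: str  # the functionality or service being evaluated
    versions: list[Version] = field(default_factory=list)

@dataclass
class Test:
    name: str  # a set of test cases for evaluating product performance

@dataclass
class InferenceResult:
    input: str   # what the user said in this turn
    output: str  # what the AI answered

@dataclass
class Session:
    inference_results: list[InferenceResult] = field(default_factory=list)

@dataclass
class Evaluation:
    session: Session  # groups evaluable inference results from one session

@dataclass
class EvaluationTask:
    evaluation: Evaluation
    metric_type: str  # a way to evaluate and score product performance
```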