Test
A set of test cases for evaluating product performance
What is a Test?
A test in Galtea is a set of test cases designed to evaluate the performance of a product. A test file provides simulations of interactions with the product (and, in quality tests, expected outcomes for each interaction).
Test Origin
When creating a test in the Galtea dashboard, you’ll be asked to specify the test origin:
Generated
Galtea will take the knowledge base file and generate a set of test cases that will define the test.
Uploaded
The test is uploaded by you as a complete set of test cases.
Your selection determines which file you must provide: a Knowledge Base File for a generated test, or a Test File for an uploaded test.
More information on how to create tests can be found in the Quality Tests and Red Teaming Tests sections.
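The relationship between the test origin and the file you provide can be sketched in a few lines of Python. This is only an illustration of the rule described above, not part of the Galtea SDK: the `Origin` enum and the `required_file` function are hypothetical names introduced for this example.

```python
from enum import Enum


class Origin(Enum):
    """The two test origins offered in the dashboard (hypothetical enum)."""
    GENERATED = "Generated"
    UPLOADED = "Uploaded"


def required_file(origin: Origin) -> str:
    """Return the name of the file that the chosen origin calls for."""
    if origin is Origin.GENERATED:
        # Galtea generates the test cases from this file.
        return "Knowledge Base File"
    # The test cases are supplied ready-made.
    return "Test File"
```

For instance, `required_file(Origin.UPLOADED)` evaluates to `"Test File"`.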
Test Types
Galtea supports two main types of tests:
Quality Tests
Tests that evaluate the quality and correctness of outputs
Red Teaming Tests
Tests that evaluate security, safety, and bias aspects
Using Tests in Evaluations
Tests are used in evaluation tasks to assess the performance of specific versions of your product.
Reuse the data in the Test File across the evaluations of different product versions so that those versions can be compared consistently. In other words, the evaluation tasks of each of those evaluations should be composed of the same input and context.
You are not strictly required to take the data from the Test File when running evaluation tasks. What matters is that every evaluation linked to the same test consists of evaluation tasks that ran with the same input and context.
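One way to guard against accidental drift between evaluations is to compare the input and context of their evaluation tasks before comparing results. The following is a minimal sketch under assumptions: it represents each evaluation task as a plain dictionary with `input`, `context`, and `output` keys, which is a layout chosen for this example and not a format defined by Galtea, and `ran_on_same_data` is a hypothetical helper.

```python
from collections import Counter


def ran_on_same_data(tasks_a, tasks_b):
    """Return True when both lists of tasks cover the same (input, context) pairs.

    Outputs are ignored on purpose: they are expected to differ between versions.
    """
    def key(task):
        return (task["input"], task.get("context"))

    # Counter compares the pairs regardless of order and keeps duplicates.
    return Counter(key(t) for t in tasks_a) == Counter(key(t) for t in tasks_b)


# Example data (invented for illustration): two product versions, same test case.
v1_tasks = [{"input": "What is the return policy?",
             "context": "Returns are accepted within 30 days.",
             "output": "You can return items within 30 days."}]
v2_tasks = [{"input": "What is the return policy?",
             "context": "Returns are accepted within 30 days.",
             "output": "Items may be returned up to 30 days after purchase."}]
```

Here `ran_on_same_data(v1_tasks, v2_tasks)` is `True`, because only the outputs differ; changing the input or context of either task would make it `False`.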
Create an Evaluation
Learn how to use tests in evaluations
SDK Integration
Test Service SDK
SDK methods for managing tests
Test Properties
When creating a test in Galtea, you’ll need to provide the following information:
The name of the test. Example: “Legal Document Quality Test” or “Customer Support Safety Evaluation”
The type of the test. Possible values:
- Quality: Tests that evaluate the quality and correctness of outputs
- Red Teaming: Tests that evaluate security, safety, and bias aspects
The knowledge base file, that is, the reference file from which the test cases are generated. Required if Galtea generates the test cases. Example: a PDF containing a product catalog.
The language that will be used for the generation of the synthetic data. Defaults to the language used in the knowledge base file.
Available languages are: “English”, “Spanish”, “Portuguese”, “Italian”, “French”, “German”, “Korean”, “Japanese” and “Chinese”.
This field applies only if the test is generated by Galtea.
The file containing the test data (the test cases): essentially the “questions” and “answers”, adversarial inputs, and so on. Required if you upload a pre-made test. Example: a CSV file with a predefined set of test cases.
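The properties above can be collected and sanity-checked before a test is created. The sketch below is a local illustration only: `DraftTest`, its field names, and `find_problems` are hypothetical and do not correspond to Galtea SDK classes or parameters. The allowed types and languages are copied from the lists on this page.

```python
from dataclasses import dataclass
from typing import List, Optional

# Values taken from this page; the constant names are assumptions.
TEST_TYPES = {"Quality", "Red Teaming"}
SUPPORTED_LANGUAGES = {
    "English", "Spanish", "Portuguese", "Italian", "French",
    "German", "Korean", "Japanese", "Chinese",
}


@dataclass
class DraftTest:
    """Hypothetical container for the properties listed above."""
    name: str
    type: str
    knowledge_base_file: Optional[str] = None  # needed for generated tests
    language: Optional[str] = None             # only relevant for generated tests
    test_file: Optional[str] = None            # needed for uploaded tests


def find_problems(draft: DraftTest) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not draft.name.strip():
        problems.append("name is empty")
    if draft.type not in TEST_TYPES:
        problems.append(f"unknown type: {draft.type}")
    if draft.knowledge_base_file is None and draft.test_file is None:
        problems.append("provide a knowledge base file or a test file")
    if draft.language is not None and draft.language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language: {draft.language}")
    return problems
```

A draft such as `DraftTest(name="Legal Document Quality Test", type="Quality", knowledge_base_file="catalog.pdf")` yields an empty list, while a draft with neither file produces a single message asking for one.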