Understanding Quality Test Evolutions

When Galtea generates Quality Tests from a ground_truth_file_path, it can also create variations of the base test cases to enhance test coverage and robustness. This is controlled by the variants parameter in the SDK and API.
In the SDK, the evolutions are known as variants, and the same parameter is used for both Quality Tests and Red Teaming Tests. However, the available options differ between the two types of tests.

What Are Quality Test Evolutions?

Quality test evolutions (also called “variants”) are systematic modifications of test cases that help evaluate your product’s robustness against a wide range of real-world user behaviors and edge cases. By selecting different evolution types, you can generate test case variations that probe for weaknesses and ensure your product performs reliably under diverse conditions.

Available Evolution Types

Below is the full list of evolution types you can select, along with their descriptions and suggested metrics:
| Evolution Type | Description | Suggested Metrics |
| --- | --- | --- |
| Paraphrased | Rephrase the question to maintain the same meaning but use different wording. | Factual Accuracy, Answer Relevancy, Faithfulness |
| Expanded Question | Expand the query by adding more details or making it broader. | Factual Accuracy, Answer Relevancy, Faithfulness |
| Specific Focus Question | Focus on a specific part of the context to generate a more detailed query. | Factual Accuracy, Answer Relevancy, Faithfulness |
| Ambiguous | Lack of clarity or specificity in the query. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
| Incorrect | Factual inaccuracies or misunderstandings in the query. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
| Incomplete | Missing key details in the query. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
| Typos | Misspelled words or accidental letter swaps in the query. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
| Slang | Use of colloquial or informal expressions. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
| Abbreviations | Shortened forms of words or acronyms. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
| Unconventional Phrasing | Rearranging words or using uncommon sentence structures. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
| Combined Topics | Combine multiple related topics into a single query. | Factual Accuracy, Answer Relevancy, Faithfulness |
| Novel Phrasing | Use creative or novel phrasing not typically seen in training data. | Factual Accuracy, Answer Relevancy, Faithfulness |
| Hypothetical Scenarios | Introduce hypothetical or edge-case scenarios related to the original query. | Factual Accuracy, Answer Relevancy, Faithfulness |
| Informal | Incorporate vernacular, text speak, and abbreviations. Use informal, everyday language specific to a region or community, including slang and shorthand. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
| Linguistic Diverse | Mix languages or dialects within the same query, reflecting the user’s bilingual or dialectal background. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
| Typographic Error | Introduce grammatical errors, misspellings, or unconventional sentence structures, including phonetic spellings and mixed-up letters. Considers neurodivergent forms of writing such as those associated with dyslexia or learning disabilities. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
| Cognitively Diverse | Present thoughts in a non-linear fashion, with rapid shifts in topics or ideas, reflecting variable attention spans. | Resilience To Noise, Factual Accuracy, Answer Relevancy, Faithfulness |
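
The suggested metrics above follow a simple pattern: every evolution type lists Factual Accuracy, Answer Relevancy, and Faithfulness, and the types that inject noise into the query also list Resilience To Noise. The sketch below encodes that pattern as a lookup so you can work out which metrics to attach for a given selection of evolutions. It is an illustration only: the function and constant names are hypothetical and are not part of the Galtea SDK.

```python
# Illustrative helper, not part of the Galtea SDK: all names here are hypothetical.
BASE_METRICS = ["Factual Accuracy", "Answer Relevancy", "Faithfulness"]

# Evolution types whose table row lists only the three base metrics.
CLEAN_EVOLUTIONS = {
    "Paraphrased", "Expanded Question", "Specific Focus Question",
    "Combined Topics", "Novel Phrasing", "Hypothetical Scenarios",
}

# Evolution types whose table row also lists "Resilience To Noise".
NOISY_EVOLUTIONS = {
    "Ambiguous", "Incorrect", "Incomplete", "Typos", "Slang", "Abbreviations",
    "Unconventional Phrasing", "Informal", "Linguistic Diverse",
    "Typographic Error", "Cognitively Diverse",
}


def suggested_metrics(evolution_type):
    """Return the suggested metrics for one evolution type, in table order."""
    if evolution_type in NOISY_EVOLUTIONS:
        return ["Resilience To Noise", *BASE_METRICS]
    if evolution_type in CLEAN_EVOLUTIONS:
        return list(BASE_METRICS)
    raise ValueError(f"Unknown evolution type: {evolution_type!r}")


def metrics_for_selection(evolution_types):
    """Return the de-duplicated metrics for several evolution types, in first-seen order."""
    seen = []
    for evolution_type in evolution_types:
        for metric in suggested_metrics(evolution_type):
            if metric not in seen:
                seen.append(metric)
    return seen
```

For example, selecting only Paraphrased yields the three base metrics, while adding Typos to the selection also brings in Resilience To Noise.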
Note: The available options in the Galtea UI may use slightly different names than the variants parameter in the SDK/API. For example, paraphrased, typos, incorrect, cognitively_diverse, and linguistic_diverse are directly supported as variants. Other evolution types may be available in the UI for future expansion or custom use cases.
These evolutions help assess your product’s understanding and resilience by simulating real-world user behavior and edge cases. For more details on configuring variants, see the Test Creation API documentation.
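
As a rough illustration of how a variants selection might be prepared before it is passed to the SDK or API, the sketch below checks a list of names against the five identifiers the note above gives as examples of directly supported variants. It deliberately does not call the Galtea SDK: the helper name and the returned dictionary are assumptions made for this example, only the ground_truth_file_path and variants parameter names come from this page, and the real set of accepted values may be larger. Check the Test Creation API documentation for the actual call signature.

```python
# Hypothetical pre-flight check; this does NOT call the Galtea SDK or API.
# The five names below are the examples given in the note above and may not
# be the complete set of accepted values.
KNOWN_VARIANTS = (
    "paraphrased",
    "typos",
    "incorrect",
    "cognitively_diverse",
    "linguistic_diverse",
)


def build_quality_test_args(ground_truth_file_path, variants):
    """Validate the variants and return arguments for a test creation call.

    The returned dict uses the two parameter names mentioned on this page;
    any other arguments the real call needs are out of scope for this sketch.
    """
    unknown = [name for name in variants if name not in KNOWN_VARIANTS]
    if unknown:
        raise ValueError(f"Unrecognized variants: {unknown}")
    # Drop duplicates while keeping the order the caller chose.
    unique_variants = list(dict.fromkeys(variants))
    return {
        "ground_truth_file_path": ground_truth_file_path,
        "variants": unique_variants,
    }


if __name__ == "__main__":
    args = build_quality_test_args(
        "path/to/ground_truth.pdf",  # placeholder path, for illustration only
        ["paraphrased", "typos", "paraphrased"],
    )
    print(args)
```

Validating the names locally surfaces a misspelled variant before any request is made. If the SDK accepts more variants than the five listed here, extend KNOWN_VARIANTS accordingly.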