The Galtea platform tests the robustness of your product against multiple types of threats and their variants. The same parameter is used for both Quality Tests and Red Teaming Tests; however, the available options differ between the two types of tests.

Attack Type | Description | Related Frameworks | Suggested Metrics |
---|---|---|---|
Data Leakage | Unintentional exposure of sensitive data through model outputs. | OWASP Top 10 for LLMs 2025: LLM02: Sensitive Information Disclosure; MITRE ATLAS: Exfiltration via Inference API; MITRE ATLAS: LLM Data Leakage; NIST AI RMF: Data Privacy | Data Leakage, Jailbreak Resilience |
Financial Attacks | Exploiting the model for financial gain, such as generating fake reviews or phishing attacks. | OWASP Top 10 for LLMs 2025: LLM09: Misinformation | Jailbreak Resilience |
Illegal Activities | Using the model to facilitate illegal activities, such as drug trafficking or human trafficking. | MITRE ATLAS: Jailbreak; MITRE ATLAS: External Harms; NIST AI RMF: CBRN Information or Capabilities; NIST AI RMF: Dangerous, Violent or Hateful Content; NIST AI RMF: Environmental Impact | Jailbreak Resilience, Misuse Resilience |
Misuse | Using the model for unintended purposes, such as generating fake news or misinformation. | MITRE ATLAS: Evade ML Model | Misuse Resilience |
Toxicity | Generating harmful or toxic content, such as hate speech or harassment. | MITRE ATLAS: Erode ML Model Integrity; NIST AI RMF: Harmful Bias or Homogenization; NIST AI RMF: Obscene, Degrading and/or Abusive Content | Non-Toxic, Unbiased |
Custom | Allows the generation of highly specific adversarial tests that target the unique vulnerabilities and edge cases of your AI product. | - | - |
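The table above can be sketched as a simple lookup from attack type to suggested metrics. This is an illustrative snippet only: the identifiers (`SUGGESTED_METRICS`, `metrics_for`, and the lowercase threat keys) are hypothetical and not part of the Galtea SDK, whose actual parameter names may differ.

```python
# Hypothetical mapping of attack types to their suggested metrics,
# transcribed from the table above. Keys are illustrative identifiers,
# not official Galtea parameter values.
SUGGESTED_METRICS: dict[str, list[str]] = {
    "data_leakage": ["Data Leakage", "Jailbreak Resilience"],
    "financial_attacks": ["Jailbreak Resilience"],
    "illegal_activities": ["Jailbreak Resilience", "Misuse Resilience"],
    "misuse": ["Misuse Resilience"],
    "toxicity": ["Non-Toxic", "Unbiased"],
}


def metrics_for(attack_type: str) -> list[str]:
    """Return the suggested metrics for an attack type.

    The "Custom" attack type has no suggested metrics in the table,
    so unknown keys fall back to an empty list.
    """
    return SUGGESTED_METRICS.get(attack_type, [])
```

For example, `metrics_for("toxicity")` yields `["Non-Toxic", "Unbiased"]`, while `metrics_for("custom")` yields an empty list, matching the table's `-` entry for Custom.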