What is a threat?

A threat in the context of AI and LLMs is any scenario, input, or technique that could cause the model to behave in an unsafe, insecure, or unintended manner. Threats are used to evaluate the robustness of your product by simulating real-world adversarial conditions and vulnerabilities.
In the SDK, threats are known as variants, and the same parameter is used for both Quality Tests and Red Teaming Tests; however, the available options differ between the two test types.
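
For orientation, here is a minimal sketch of how that shared variants parameter might be used when creating both test types. The client, method, and variant names below are assumptions for illustration, not verified Galtea SDK identifiers; consult the SDK reference for the actual signatures.

```python
# Hypothetical sketch: client, method, and variant names below are
# assumptions for illustration, not verified Galtea SDK identifiers.
from galtea import Galtea  # assumed SDK entry point

galtea = Galtea(api_key="YOUR_API_KEY")

# The same `variants` parameter drives both test types,
# but each type accepts a different set of values.
quality_test = galtea.tests.create(
    name="support-bot-quality",
    type="QUALITY",                         # assumed test-type value
    variants=["paraphrased"],               # assumed quality-test variant
)

red_teaming_test = galtea.tests.create(
    name="support-bot-red-teaming",
    type="RED_TEAMING",                     # assumed test-type value
    variants=["data_leakage", "toxicity"],  # assumed threat variants (see table below)
)
```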

Threat Types

Below are the main threat types evaluated by Galtea, with references to industry standards and the metrics suggested for each; a short sketch after the list shows one way to collect those metrics for a chosen set of threats.

Data Leakage
  • Description: Unintentional exposure of sensitive data through model outputs.
  • Related Frameworks: OWASP Top 10 for LLMs 2025: LLM02: Sensitive Information Disclosure; MITRE ATLAS: Exfiltration via Inference API; MITRE ATLAS: LLM Data Leakage; NIST AI RMF: Data Privacy
  • Suggested Metrics: Data Leakage, Jailbreak Resilience

Financial Attacks
  • Description: Exploiting the model for financial gain, such as generating fake reviews or phishing attacks.
  • Related Frameworks: OWASP Top 10 for LLMs 2025: LLM09: Misinformation
  • Suggested Metrics: Jailbreak Resilience

Illegal Activities
  • Description: Using the model to facilitate illegal activities, such as drug trafficking or human trafficking.
  • Related Frameworks: MITRE ATLAS: Jailbreak; MITRE ATLAS: External Harms; NIST AI RMF: CBRN Information or Capabilities; NIST AI RMF: Dangerous, Violent or Hateful Content; NIST AI RMF: Environmental Impact
  • Suggested Metrics: Jailbreak Resilience, Misuse Resilience

Misuse
  • Description: Using the model for unintended purposes, such as generating fake news or misinformation.
  • Related Frameworks: MITRE ATLAS: Evade ML Model
  • Suggested Metrics: Misuse Resilience

Toxicity
  • Description: Generating harmful or toxic content, such as hate speech or harassment.
  • Related Frameworks: MITRE ATLAS: Erode ML Model Integrity; NIST AI RMF: Harmful Bias or Homogenization; NIST AI RMF: Obscene, Degrading and/or Abusive Content
  • Suggested Metrics: Non-Toxic, Unbiased

Custom
  • Description: Allows the generation of highly specific adversarial tests that target the unique vulnerabilities and edge cases of your AI product.
  • Related Frameworks: -
  • Suggested Metrics: -
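
To make the threat-to-metric pairing above concrete, the following self-contained Python sketch collects the suggested metrics for a chosen set of threats. The snake_case threat identifiers are illustrative placeholders, not necessarily the variant names the SDK expects.

```python
# Threat-to-metric mapping taken from the table above.
# The snake_case keys are illustrative, not official variant names.
THREAT_METRICS = {
    "data_leakage": ["Data Leakage", "Jailbreak Resilience"],
    "financial_attacks": ["Jailbreak Resilience"],
    "illegal_activities": ["Jailbreak Resilience", "Misuse Resilience"],
    "misuse": ["Misuse Resilience"],
    "toxicity": ["Non-Toxic", "Unbiased"],
}

def metrics_for(threats: list[str]) -> list[str]:
    """Return the deduplicated suggested metrics for the chosen threats, in order."""
    metrics: list[str] = []
    for threat in threats:
        for metric in THREAT_METRICS.get(threat, []):
            if metric not in metrics:
                metrics.append(metric)
    return metrics

print(metrics_for(["data_leakage", "toxicity"]))
# ['Data Leakage', 'Jailbreak Resilience', 'Non-Toxic', 'Unbiased']
```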

Why Evaluate Against Threats?

Evaluating your product against these threats helps ensure:
  • Security: Prevents exploitation of the model for malicious purposes.
  • Privacy: Reduces the risk of leaking sensitive or private information.
  • Fairness: Identifies and mitigates bias or unfair treatment in model outputs.
  • Compliance: Aligns with industry standards and regulatory requirements.

References
  • OWASP Top 10 for LLM Applications 2025: https://genai.owasp.org/llm-top-10/
  • MITRE ATLAS: https://atlas.mitre.org/
  • NIST AI Risk Management Framework (AI RMF): https://www.nist.gov/itl/ai-risk-management-framework