What is a threat?

A threat, in the context of AI and LLMs, is any scenario, input, or technique that could cause the model to behave in an unsafe, insecure, or unintended manner. Threats are used to evaluate your product's robustness by simulating the real-world adversarial conditions and vulnerabilities it may face.

In the SDK, threats are known as variants, and the same variants parameter is used for both Quality Tests and Red Teaming Tests. However, the available options differ between the two test types, as illustrated in the sketch below.
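
As a rough illustration, the following sketch shows how the variants parameter might be passed when creating tests through the Python SDK. The client class, method names, test types, and variant identifiers are assumptions made for the example, not confirmed SDK API; consult the SDK reference for the exact usage.

```python
# Hypothetical sketch only -- the client class, method names, parameter names,
# and variant identifiers below are assumptions for illustration, not the
# confirmed Galtea SDK API. Check the SDK reference for the exact signatures.
from galtea import Galtea  # assumed entry point

galtea = Galtea(api_key="YOUR_API_KEY")

# Red Teaming Test: variants select the threat types to simulate.
red_teaming_test = galtea.tests.create(
    name="support-bot-red-teaming",
    type="RED_TEAMING",
    product_id="YOUR_PRODUCT_ID",
    variants=["jailbreak", "data_leakage", "toxicity"],  # assumed identifiers
)

# Quality Test: the same parameter is used, but it accepts a different set of options.
quality_test = galtea.tests.create(
    name="support-bot-quality",
    type="QUALITY",
    product_id="YOUR_PRODUCT_ID",
    variants=["original", "paraphrased"],  # assumed identifiers
)
```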

Threat Types

Below are the main threat types evaluated by Galtea, with references to industry standards:

  • Jailbreak: Bypassing the model’s safety mechanisms to generate harmful content.

    • OWASP Top 10 for LLMs 2025: LLM01: Prompt Injection
    • OWASP Top 10 for LLMs 2025: LLM06: Excessive Agency
    • MITRE ATLAS: Prompt Injections
    • MITRE ATLAS: Jailbreak
    • NIST AI RMF: Information Security
  • Data Leakage: Unintentional exposure of sensitive data through model outputs.

    • OWASP Top 10 for LLMs 2025: LLM02: Sensitive Information Disclosure
    • MITRE ATLAS: Exfiltration via Inference API
    • MITRE ATLAS: LLM Data Leakage
    • NIST AI RMF: Data Privacy
  • Financial Attacks: Exploiting the model for financial gain, for example by generating fake reviews or crafting phishing content.

    • OWASP Top 10 for LLMs 2025: LLM09: Misinformation
  • Illegal Activities: Using the model to facilitate illegal activities, such as drug trafficking or human trafficking.

    • MITRE ATLAS: Jailbreak
    • MITRE ATLAS: External Harms
    • NIST AI RMF: CBRN Information or Capabilities
    • NIST AI RMF: Dangerous, Violent or Hateful Content
    • NIST AI RMF: Environmental Impact
  • Misuse: Using the model for unintended purposes, such as generating fake news or misinformation.

    • MITRE ATLAS: Evade ML Model
  • Toxicity: Generating harmful or toxic content, such as hate speech or harassment.

    • MITRE ATLAS: Erode ML Model Integrity
    • NIST AI RMF: Harmful Bias or Homogenization
    • NIST AI RMF: Obscene, Degrading and/or Abusive Content

Why Evaluate Against Threats?

Evaluating your product against these threats helps ensure:

  • Security: Prevents exploitation of the model for malicious purposes.
  • Privacy: Reduces the risk of leaking sensitive or private information.
  • Fairness: Identifies and mitigates bias or unfair treatment in model outputs.
  • Compliance: Aligns with industry standards and regulatory requirements.
