When creating or configuring a metric, you select which parameters are relevant for your evaluation. These parameters are made available to the evaluator during scoring.
- For AI Evaluation, the selected parameters are automatically prepended to your judge_prompt.
- For Human Evaluation, they determine which data fields are displayed to annotators.
- For Self-Hosted metrics, evaluation parameters do not apply.
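To make the AI Evaluation behavior concrete, here is a minimal sketch of prepending selected parameters to a judge prompt. The parameter names match the reference table below, but the exact prompt format Galtea uses internally is an assumption.

```python
def build_judge_prompt(judge_prompt: str, selected: dict[str, str]) -> str:
    """Prepend the selected evaluation parameters to a judge prompt.

    Illustrative only: the real formatting Galtea applies may differ.
    """
    header = "\n".join(f"{name}: {value}" for name, value in selected.items())
    return f"{header}\n\n{judge_prompt}"

prompt = build_judge_prompt(
    "Rate the answer's factual accuracy from 1 to 5.",
    {
        "input": "What is the capital of France?",
        "actual_output": "Paris",
        "expected_output": "Paris",
    },
)
print(prompt)
```

The judge model then sees the selected parameters as labeled context before its scoring instructions.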
## Parameter Reference
| Parameter | Description | Availability |
|---|---|---|
| input | The prompt or query sent to the model. | Accuracy, Security & Safety, and Behavior |
| actual_output | The actual output generated by the model. | Accuracy, Security & Safety, and Behavior |
| expected_output | The ideal answer for the given input. | Accuracy and Security & Safety |
| context | Additional background information provided to the model alongside the input. | All metrics |
| retrieval_context | The context retrieved by your RAG system before sending the user query to your LLM. | Accuracy, Security & Safety, and Behavior |
| traces | Execution traces from the agent, including tool calls, LLM invocations, and other internal operations. | All metrics |
| expected_tools | List of tools expected to be used by the agent to accomplish the task. | All metrics |
| tools_used | List of tools actually used by the agent during execution (automatically inferred from traces). | All metrics |
| product_description | The description of the product. | All metrics |
| product_capabilities | The capabilities of the product. | All metrics |
| product_inabilities | The product’s known inabilities or restrictions. | All metrics |
| product_security_boundaries | The security boundaries of the product. | All metrics |
| user_persona | Information about the user interacting with the agent. | Behavior tests |
| goal | The user’s objective in the conversation. | Behavior tests |
| scenario | The context or situation for the conversation. | Behavior tests |
| stopping_criterias | List of criteria that define when a conversation should end. | Behavior tests |
| conversation_turns | All turns in a conversation, including user and assistant messages. | Behavior tests (Human Evaluation only) |
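As the table notes, tools_used is inferred automatically from trace data rather than provided directly. A minimal sketch of that inference, assuming each trace entry is a dict with "type" and "name" keys (the actual trace schema is described on the Trace concept page):

```python
def infer_tools_used(traces: list[dict]) -> list[str]:
    """Collect tool names from trace entries of type TOOL, preserving
    first-seen order and dropping duplicates. Illustrative only: the
    real trace entry shape is an assumption here."""
    seen: list[str] = []
    for entry in traces:
        if entry.get("type") == "TOOL" and entry.get("name") not in seen:
            seen.append(entry["name"])
    return seen

traces = [
    {"type": "LLM", "name": "gpt-4o"},
    {"type": "TOOL", "name": "search_flights"},
    {"type": "TOOL", "name": "book_flight"},
    {"type": "TOOL", "name": "search_flights"},
]
tools = infer_tools_used(traces)  # ["search_flights", "book_flight"]
```

If your traces contain no entries of type TOOL, tools_used will be empty, which is one way metrics requiring it end up skipped (see below).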
## Troubleshooting skipped evaluations
When an evaluation runs against a session that does not have all the data the metric needs, Galtea marks the evaluation as SKIPPED instead of producing a misleading score. The evaluation’s error field describes which parameters are missing, grouped by where you provide them. The categorized error message looks like this:

| Section | Where to fix it |
|---|---|
| Missing from product settings | Open the product and fill in the missing field (Description, Capabilities, Inabilities, or Security Boundaries). |
| Missing from test case | Edit the test case and provide the missing value. Some fields (Goal, User Persona, Scenario, Stopping Criteria) are only populated on SCENARIOS-type test cases. If your test cases use a different type, use a metric that does not require these parameters. |
| Missing from endpoint connection’s output mapping | Edit the endpoint connection and add the missing key to the output mapping. actual_output is extracted from the output key; retrieval_context from a retrieval_context key. See Templates & Mapping for the JSONPath syntax used in mapping values. |
| Missing trace data | The metric needs traces or tools_used. Configure your product to send traces — see the Tracing Agent Operations tutorial for setup, and the Trace concept page for what gets captured. tools_used is automatically extracted from your trace data (specifically from trace entries of type TOOL); you do not provide it directly. |
| Missing inference data | The session has no inference results yet. Run the metric against a session that contains at least one conversation turn. |
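The grouping above can be sketched as a simple lookup from parameter name to the place it is provided. This is illustrative only: the groupings mirror the troubleshooting table, but the function and data structure are assumptions, not Galtea's implementation.

```python
# Hypothetical mapping of parameters to where they are provided,
# mirroring the troubleshooting table above.
PARAM_SOURCE = {
    "product_description": "product settings",
    "product_capabilities": "product settings",
    "product_inabilities": "product settings",
    "product_security_boundaries": "product settings",
    "goal": "test case",
    "user_persona": "test case",
    "scenario": "test case",
    "stopping_criterias": "test case",
    "actual_output": "endpoint connection's output mapping",
    "retrieval_context": "endpoint connection's output mapping",
    "traces": "trace data",
    "tools_used": "trace data",
}

def group_missing(required: list[str], available: set[str]) -> dict[str, list[str]]:
    """Group a metric's missing parameters by where the user should fix them."""
    groups: dict[str, list[str]] = {}
    for param in required:
        if param not in available:
            source = PARAM_SOURCE.get(param, "test case")
            groups.setdefault(source, []).append(param)
    return groups

missing = group_missing(
    ["input", "actual_output", "product_description"],
    {"input"},
)
# e.g. {"endpoint connection's output mapping": ["actual_output"],
#       "product settings": ["product_description"]}
```

Reading the error field section by section tells you exactly which screen to open to make the metric runnable.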
## Related

- Evaluation Types: Understand AI Evaluation, Human Evaluation, and Self-Hosted scoring.
- Metrics Overview: Browse all available metrics and create custom ones.