Get metrics
Get list of metrics with pagination and filtering. See Metrics.
Documentation Index
Fetch the complete documentation index at: https://docs.galtea.ai/llms.txt
Use this file to discover all available pages before exploring further.
Authorizations
API key authorization. Pass your API key in the Authorization header as a Bearer token. Both new (gsk_*) and legacy (gsk-) API keys are accepted, e.g. Authorization: Bearer gsk_... or Authorization: Bearer gsk-....
Query Parameters
Filter by metric IDs
Filter by organization IDs
Filter by product IDs
Filter by metric names (exact match, multiple)
Filter by metric name (partial match)
Filter by metric description (partial match)
Filter by tags
Filter by metric sources
SELF_HOSTED, FULL_PROMPT, PARTIAL_PROMPT, HUMAN_EVALUATION, GEVAL, DEEPEVAL, DETERMINISTIC Filter by specification IDs
Filter by user group IDs
Filter metrics created at or after this timestamp (ISO 8601 format)
Filter metrics created at or before this timestamp (ISO 8601 format)
Include default/predefined metrics
Include metrics owned by the user's organization
Include legacy/deprecated metrics
Filter metrics suitable for monitoring
Sort instructions (field and direction pairs)
Maximum number of results
Number of results to skip
Response
Metrics retrieved successfully
"metric_123"
Identifier shared by every metric in the same revision family. Server-managed — derived from parentMetricId on create (or generated for roots). Cannot be set by the caller.
"metric_123"
Id of the direct parent metric. On create, providing this value turns the new metric into a revision: it joins the parent's family and (if the parent is active) flips the parent to legacy. Omit or null to create a root metric in a fresh group. On responses, this is the recorded parent edge (null for roots).
"metric_122"
"org_123"
"user_123"
"Accuracy"
Ordered list of inference-result fields the evaluator needs (e.g. input, actualOutput, expectedOutput, retrievalContext). Determines which data the evaluation engine extracts from each inference result.
["input", "actualOutput", "expectedOutput"]SELF_HOSTED, FULL_PROMPT, PARTIAL_PROMPT, HUMAN_EVALUATION, GEVAL, DEEPEVAL, DETERMINISTIC "FULL_PROMPT"
"Evaluate the accuracy of the response"
["accuracy", "quality"]"Measures the accuracy of responses"
"https://docs.example.com/metrics/accuracy"
"GPT-4"
When true, evaluationParams are injected at the top level of the evaluator prompt instead of nested inside the conversation context.
Whether the metric is currently being optimized.
["spec_123"]["ug_123"]