Get a list of metrics with pagination and filtering. See Metrics.
API key authorization. Pass your API key in the Authorization header as a Bearer token. Both new (gsk_*) and legacy (gsk-*) API keys are accepted, e.g. Authorization: Bearer gsk_... or Authorization: Bearer gsk-....
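A minimal sketch of attaching the Bearer token, using only the Python standard library. The base URL and key value are placeholders for illustration, not documented endpoints.

```python
import urllib.request

# Hypothetical base URL and placeholder key for illustration only.
BASE_URL = "https://api.example.com/v1/metrics"
API_KEY = "gsk_your_key_here"  # a legacy gsk- key is passed the same way

# Build the request with the Authorization header; nothing is sent yet.
req = urllib.request.Request(
    BASE_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
)
print(req.get_header("Authorization"))  # → Bearer gsk_your_key_here
```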
Filter by metric IDs
Filter by organization IDs
Filter by product IDs
Filter by metric names (exact match, multiple)
Filter by metric name (partial match)
Filter by metric description (partial match)
Filter by tags
Filter by metric sources. Allowed values: SELF_HOSTED, FULL_PROMPT, PARTIAL_PROMPT, HUMAN_EVALUATION, GEVAL, DEEPEVAL, DETERMINISTIC
Filter by specification IDs
Filter by user group IDs
Filter metrics created at or after this timestamp (ISO 8601 format)
Filter metrics created at or before this timestamp (ISO 8601 format)
Include default/predefined metrics
Include metrics owned by the user's organization
Include legacy/deprecated metrics
Filter metrics suitable for monitoring
Sort instructions (field and direction pairs)
Maximum number of results
Number of results to skip
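The filter and pagination parameters above can be assembled into a query string; a sketch follows. The parameter names used here are assumptions inferred from the descriptions, not the confirmed wire format.

```python
from urllib.parse import urlencode

# Assumed parameter names; repeated values (e.g. sources) are sent
# as repeated query keys via doseq=True.
params = {
    "name": "accuracy",                      # partial-match name filter
    "sources": ["FULL_PROMPT", "GEVAL"],     # metric source filter
    "createdAtGte": "2024-01-01T00:00:00Z",  # ISO 8601 timestamp
    "limit": 25,                             # maximum number of results
    "offset": 50,                            # number of results to skip
}
query = urlencode(params, doseq=True)
print(query)
```

With `doseq=True`, each entry in the `sources` list becomes its own `sources=` pair, which is a common convention for multi-valued query parameters.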
Metrics retrieved successfully
"metric_123"
"org_123"
"user_123"
"Accuracy"
Ordered list of inference-result fields the evaluator needs (e.g. input, actualOutput, expectedOutput, retrievalContext). Determines which data the evaluation engine extracts from each inference result.
["input", "actualOutput", "expectedOutput"]SELF_HOSTED, FULL_PROMPT, PARTIAL_PROMPT, HUMAN_EVALUATION, GEVAL, DEEPEVAL, DETERMINISTIC "FULL_PROMPT"
"Evaluate the accuracy of the response"
["accuracy", "quality"]"Measures the accuracy of responses"
"https://docs.example.com/metrics/accuracy"
"GPT-4"
When true, evaluationParams are injected at the top level of the evaluator prompt instead of nested inside the conversation context.
["spec_123"]