Evaluate multiple sessions in one call
Batch-evaluate existing sessions in a single call. Pick exactly one mode: (a) sessionIds, an explicit list of session IDs; or (b) versionId, which selects every session attached to that version. Each selected session is evaluated with the same logic as POST /evaluations/fromSession, so the inference results already stored on the session are scored. Unlike POST /evaluations/fromVersion, this endpoint does NOT require a conversation endpoint connection on the version; use it when the inferences already exist (e.g. imported traces). To evaluate a single session, use POST /evaluations/fromSession. See Evaluations.
Authorizations
API key authorization. Pass your API key in the Authorization header as a Bearer token. Both new (gsk_*) and legacy (gsk-) API keys are accepted, e.g. Authorization: Bearer gsk_... or Authorization: Bearer gsk-....
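As a minimal sketch of the header described above, the helper below builds the Authorization header from an API key. The function name auth_headers is hypothetical, and the prefix check is only an illustrative client-side guard based on the two key formats mentioned; the API itself may validate keys differently.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Build request headers carrying the API key as a Bearer token."""
    # The text above says both new (gsk_) and legacy (gsk-) keys are accepted.
    # Rejecting other prefixes here is an assumption made for this sketch.
    if not api_key.startswith(("gsk_", "gsk-")):
        raise ValueError("expected an API key starting with 'gsk_' or 'gsk-'")
    return {"Authorization": f"Bearer {api_key}"}
```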
Body
Exactly one of sessionIds or versionId is required.
Explicit list of session IDs to evaluate. Mutually exclusive with versionId.
["ses_123", "ses_456"]
Evaluate every session attached to this version. Mutually exclusive with sessionIds.
"ver_123"
Metrics to evaluate per session. Optional if specificationIds is provided or the product has specifications with linked metrics.
Specification IDs whose linked metrics will be evaluated against each session. Can be combined with metrics; the API merges and deduplicates.
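The body rules above can be sketched as a small builder that enforces the "exactly one of sessionIds or versionId" constraint before a request is sent. The function name build_batch_body and its snake_case parameters are hypothetical. The JSON keys sessionIds, versionId, metrics, and specificationIds come from the text, but the shape of each metrics entry is not specified here, so the sketch passes metrics through unchanged as a list.

```python
from typing import Any, Optional, Sequence


def build_batch_body(
    session_ids: Optional[Sequence[str]] = None,
    version_id: Optional[str] = None,
    metrics: Optional[Sequence[Any]] = None,
    specification_ids: Optional[Sequence[str]] = None,
) -> dict[str, Any]:
    """Assemble a request body for the batch evaluation call."""
    # Exactly one selection mode may be used: sessionIds or versionId.
    if (session_ids is None) == (version_id is None):
        raise ValueError("provide exactly one of session_ids or version_id")

    body: dict[str, Any] = {}
    if session_ids is not None:
        body["sessionIds"] = list(session_ids)
    else:
        body["versionId"] = version_id

    # Both fields are optional and may be combined; per the text above,
    # the API merges and deduplicates the resulting metrics.
    if metrics is not None:
        body["metrics"] = list(metrics)  # entry shape is an assumption
    if specification_ids is not None:
        body["specificationIds"] = list(specification_ids)
    return body
```

For example, build_batch_body(session_ids=["ses_123", "ses_456"]) yields a body with only the sessionIds key, while passing both session_ids and version_id raises ValueError on the client instead of relying on the server to reject the request.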
Response
Batch evaluation processed
Number of sessions whose evaluations were dispatched successfully.
Number of sessions whose evaluation failed.
Total number of Evaluation records created across all successful sessions.
Per-session failure details.
Total sessions matching the request (versionId mode only). When this is greater than the number of sessions actually evaluated, the request was truncated to the first 1000 sessions; re-issue the request with explicit sessionIds to cover the remainder. null in sessionIds mode.
True when the version had more sessions than the per-request page limit (1000).
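One way to act on the truncation fields is sketched below: detect truncation from the total and the evaluated count, then split the remaining session IDs into batches for follow-up sessionIds requests. The names is_truncated and chunk_session_ids are hypothetical, and the response values are passed in as plain arguments because the field names are not shown here. Reusing the 1000 limit as the batch size for sessionIds mode is an assumption; the text only states that limit for versionId mode.

```python
from typing import Iterator, Optional, Sequence

PAGE_LIMIT = 1000  # per-request limit stated above for versionId mode


def is_truncated(total_matching: Optional[int], evaluated_count: int) -> bool:
    """Return True when more sessions matched than were actually evaluated."""
    # The total is null (None) in sessionIds mode, so nothing was truncated.
    return total_matching is not None and total_matching > evaluated_count


def chunk_session_ids(
    session_ids: Sequence[str], size: int = PAGE_LIMIT
) -> Iterator[list[str]]:
    """Yield consecutive batches of at most `size` session IDs."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(session_ids), size):
        yield list(session_ids[start:start + size])
```

The caller is assumed to know which session IDs were not covered, for instance by listing the version's sessions separately, and would send one request per batch yielded by chunk_session_ids.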