Create multiple inference results from conversation turns. See Inference Results.
API key authorization. Pass your API key in the Authorization header as a Bearer token. Both new (gsk_*) and legacy (gsk-*) API keys are accepted, e.g. Authorization: Bearer gsk_... or Authorization: Bearer gsk-...
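A minimal sketch of building the header described above. The function name `build_auth_headers` is hypothetical, and the client-side prefix check is only a convenience assumed here; the service itself decides which keys are valid.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return an Authorization header carrying the key as a Bearer token."""
    key = api_key.strip()
    # Both the new (gsk_) and legacy (gsk-) prefixes are accepted per the text above.
    # Rejecting other prefixes locally is an assumption made for early error reporting.
    if not key.startswith(("gsk_", "gsk-")):
        raise ValueError("expected an API key starting with 'gsk_' or 'gsk-'")
    return {"Authorization": f"Bearer {key}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.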
Inference results created successfully
"ir_123"
"session_123"
"user_123"
Order index within the session
0
Allowed values: PENDING, GENERATED, FAILED
"PENDING"
Structured input data. For plain-text input, the format is { "user_message": "..." }
{ "user_message": "User input text" }
"Model response"
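As an illustration of the plain-text input format described above, the following sketch wraps a string in the `user_message` structure and serializes it to JSON. The helper name `to_structured_input` is hypothetical and not part of the API.

```python
import json


def to_structured_input(text: str) -> dict[str, str]:
    """Wrap plain text in the structured input shape: {"user_message": "..."}."""
    return {"user_message": text}


# Serialize the structured input for use in a JSON request body.
payload = json.dumps(to_structured_input("User input text"))
```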
The RAG retrieval context (retrieved documents/snippets) used to generate the actual output. Used by RAG-aware evaluation metrics.
150
100
50
20
Total tokens
150
0.00001
0.00003
0.000005
0.001
1
"1.0.0"
W3C trace ID for the root span created during direct inference. The same trace ID is propagated to the user endpoint via the traceparent header.
"a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4"