Returns
Returns an InferenceResult object.

Example
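No example survives in this extract, so the sketch below is a plausible reconstruction. The client object and the method name `log_inference_result` are illustrative assumptions, not the SDK's confirmed API; only the parameter meanings come from the reference itself. The latency measurement is the one runnable part.

```python
import time

# Measure latency in milliseconds around the model call, as the
# "latency" parameter below expects.
start = time.perf_counter()
output = "The capital of France is Paris."  # stand-in for a real model call
latency_ms = (time.perf_counter() - start) * 1000

# Hypothetical logging call -- client name, method name, and argument
# names are assumptions for illustration only:
# galtea.log_inference_result(
#     session_id="sess_123",               # session to log the result to
#     input="What is the capital of France?",
#     output=output,
#     retrieval_context=None,              # context retrieved for RAG systems
#     latency=latency_ms,
# )
```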
Parameters

- The session ID to log the inference result to.
- The input text/prompt.
- The generated output/response.
- Context retrieved for RAG systems.
- Latency in milliseconds.
- Information about token usage during the model call. Possible keys include:
  - input_tokens: Number of input tokens sent to the model.
  - output_tokens: Number of output tokens generated by the model.
  - cache_read_input_tokens: Number of input tokens read from the cache.
- Information about the cost per token during the model call. Possible keys include:
  - cost_per_input_token: Cost per input token sent to the model.
  - cost_per_output_token: Cost per output token generated by the model.
  - cost_per_cache_read_input_token: Cost per input token read from the cache.
- The version of Galtea's conversation simulator used to generate the user message (input). This should only be provided when logging a conversation that was generated using the simulator.
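The token-usage and cost parameters above can be sketched as plain dicts with the documented keys. The token counts and per-token prices here are made-up figures, and the total-cost accounting (treating the three counts as billed independently) is an illustrative assumption, not something this reference specifies:

```python
# Keys taken from the parameter descriptions; values are invented examples.
usage_info = {
    "input_tokens": 1200,
    "output_tokens": 350,
    "cache_read_input_tokens": 400,
}
cost_info = {
    "cost_per_input_token": 3e-6,
    "cost_per_output_token": 15e-6,
    "cost_per_cache_read_input_token": 3e-7,
}

# One plausible accounting, assuming the three counts are billed
# independently at their respective per-token rates.
total_cost = (
    usage_info["input_tokens"] * cost_info["cost_per_input_token"]
    + usage_info["output_tokens"] * cost_info["cost_per_output_token"]
    + usage_info["cache_read_input_tokens"]
    * cost_info["cost_per_cache_read_input_token"]
)
```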