Returns

Returns an InferenceResult object.

Example

# `galtea` is an initialized Galtea client instance
inference_result = galtea.inference_results.create(
    session_id="YOUR_SESSION_ID",
    input="What is the capital of France?",
    output="Paris is the capital of France."
)

Parameters

session_id
string
required
The session ID to log the inference result to.
input
string
required
The input text/prompt.
output
string
required
The generated output/response.
retrieval_context
string
Context retrieved for RAG systems.
latency
float
Latency of the model call, in milliseconds.
usage_info
dict[str, int]
Information about token usage during the model call. Possible keys include:
  • input_tokens: Number of input tokens sent to the model.
  • output_tokens: Number of output tokens generated by the model.
  • cache_read_input_tokens: Number of input tokens read from the cache.
cost_info
dict[str, float]
Information about the cost per token during the model call. Possible keys include:
  • cost_per_input_token: Cost per input token sent to the model.
  • cost_per_output_token: Cost per output token generated by the model.
  • cost_per_cache_read_input_token: Cost per input token read from the cache.
conversation_simulator_version
string
The version of Galtea’s conversation simulator used to generate the user message (input). This should only be provided when logging a conversation that was generated using the simulator.
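The keys of `usage_info` and `cost_info` pair up: each token count has a matching per-token cost. The sketch below builds both dictionaries and computes a total cost client-side for illustration; the token counts and per-token prices are placeholder values, and the total-cost calculation is not part of the SDK.

```python
# Token counts for a single model call, matching the usage_info keys above.
usage_info = {
    "input_tokens": 120,
    "output_tokens": 48,
    "cache_read_input_tokens": 0,
}

# Per-token prices, matching the cost_info keys above (placeholder values).
cost_info = {
    "cost_per_input_token": 0.000002,
    "cost_per_output_token": 0.000006,
    "cost_per_cache_read_input_token": 0.000001,
}

# Client-side illustration of how the two dictionaries relate:
# total cost = sum of (token count x matching per-token price).
total_cost = (
    usage_info["input_tokens"] * cost_info["cost_per_input_token"]
    + usage_info["output_tokens"] * cost_info["cost_per_output_token"]
    + usage_info["cache_read_input_tokens"]
    * cost_info["cost_per_cache_read_input_token"]
)
```

These dictionaries are what you would pass as the `usage_info` and `cost_info` arguments to `galtea.inference_results.create(...)`.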