Update Inference Result

Overview

The update() method allows you to modify an existing inference result after it has been created. This is useful when you need to add or update output, latency, token usage, or cost information.

Parameters

inference_result_id

string

required

The ID of the inference result to update.

Output Fields

output

string

The generated output or response from the AI model.

input

string

The input text or prompt for the inference result.

retrieval_context

string

The context retrieved by a RAG system, if applicable.

Performance Fields

latency

float

The time in milliseconds from request to response.

Usage Fields

input_tokens

int

Number of input tokens sent to the model.

output_tokens

int

Number of output tokens generated by the model.

cache_read_input_tokens

int

Number of input tokens read from the cache.

tokens

int

Total tokens used in the model call.

Cost Fields

cost

float

The total cost associated with the model call.

cost_per_input_token

float

Cost per input token sent to the model.

cost_per_output_token

float

Cost per output token generated by the model.

cost_per_cache_read_input_token

float

Cost per input token read from the cache.
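The per-token prices and token counts above can be combined into a total when a provider does not report one. The sketch below is one hedged way to do that. The helper name estimate_cost is hypothetical and not part of the SDK, and it assumes that cache-read tokens are billed separately from regular input tokens, which may not match your provider's pricing.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    cost_per_input_token: float,
    cost_per_output_token: float,
    cache_read_input_tokens: int = 0,
    cost_per_cache_read_input_token: float = 0.0,
) -> float:
    """Hypothetical helper: sum the cost of each token category.

    Assumption: cache-read tokens are NOT already counted in input_tokens.
    """
    return (
        input_tokens * cost_per_input_token
        + output_tokens * cost_per_output_token
        + cache_read_input_tokens * cost_per_cache_read_input_token
    )


# 100 input tokens and 50 output tokens at the prices used elsewhere on this page
total = estimate_cost(100, 50, 0.00001, 0.00003)
```

The resulting value could then be passed as cost to update(), alongside the per-token prices used to compute it.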

Returns

Returns the updated InferenceResult object.

Example

inference_result = galtea.inference_results.update(
    inference_result_id=inference_result_id,
    output="Paris is the capital.",
    cost=0.0001,
)

Use Cases

Deferred Output Update

Create an inference result first, then update it after processing completes:

import time

# Create inference result with just input
user_input = "What is the weather today?"
inference_result = galtea.inference_results.create(
    session_id=session.id, input=user_input
)
if inference_result is None:
    raise ValueError("inference_result is None")

# Process with your model (my_model_generate stands in for your own model call)
start_time = time.perf_counter()  # monotonic clock, suited to measuring elapsed time
response = my_model_generate(user_input)
latency_ms = (time.perf_counter() - start_time) * 1000  # seconds to milliseconds

# Update with output and metrics
galtea.inference_results.update(
    inference_result_id=inference_result.id, output=response, latency=latency_ms
)
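If several call sites need the same measurement, the timing step above can be factored into a small wrapper. This is a minimal sketch: timed_generate and fake_model are hypothetical names introduced for illustration, and fake_model merely stands in for a real model call.

```python
import time
from typing import Callable, Tuple


def timed_generate(generate: Callable[[str], str], prompt: str) -> Tuple[str, float]:
    """Call generate(prompt) and return (response, latency in milliseconds)."""
    start = time.perf_counter()
    response = generate(prompt)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return response, latency_ms


def fake_model(prompt: str) -> str:
    # Stand-in for a real model call, used only to make the sketch runnable
    return f"echo: {prompt}"


response, latency_ms = timed_generate(fake_model, "What is the weather today?")
```

The returned pair maps directly onto the output and latency arguments of update().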

Adding Cost Information

Update an inference result with cost data after receiving billing info:

galtea.inference_results.update(
    inference_result_id=inference_result.id,
    cost=0.0025,
    cost_per_input_token=0.00001,
    cost_per_output_token=0.00003,
)

Notes

Only include fields you want to update. Fields not specified will remain unchanged.

Pass None explicitly to clear a field’s value.
The creditsUsed field cannot be modified through this method.
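Because omitting a field and passing None mean different things here, code that assembles update arguments dynamically needs a way to tell the two apart. One common pattern is a sentinel default, sketched below. The helper build_update_fields and the _UNSET sentinel are hypothetical and not part of the SDK; only three of the documented fields are shown for brevity.

```python
_UNSET = object()  # sentinel meaning "caller did not supply this field"


def build_update_fields(output=_UNSET, latency=_UNSET, cost=_UNSET) -> dict:
    """Hypothetical helper: keep only the fields the caller actually passed.

    A field left at _UNSET is dropped (stays unchanged on update);
    a field explicitly set to None is kept (requests clearing the value).
    """
    candidates = {"output": output, "latency": latency, "cost": cost}
    return {name: value for name, value in candidates.items() if value is not _UNSET}


# Change the output, clear the cost, leave latency untouched
fields = build_update_fields(output="Paris is the capital.", cost=None)
```

The resulting dictionary could then be unpacked into the call, for example galtea.inference_results.update(inference_result_id=inference_result.id, **fields).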

Related Methods

Create Inference Result - Create a new inference result
Generate Inference Result - Create with automatic trace collection
Get Inference Result - Retrieve an inference result
