Overview
The update() method modifies an existing inference result. Use it to add or change output, latency, token usage, or cost information after the result has been created.
Method Signature
galtea.inference_results.update(
    inference_result_id: str,
    actual_output: Optional[str] = None,
    actual_input: Optional[str] = None,
    retrieval_context: Optional[str] = None,
    latency: Optional[float] = None,
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
    cache_read_input_tokens: Optional[int] = None,
    tokens: Optional[int] = None,
    cost: Optional[float] = None,
    cost_per_input_token: Optional[float] = None,
    cost_per_output_token: Optional[float] = None,
    cost_per_cache_read_input_token: Optional[float] = None
) -> InferenceResult
Parameters
inference_result_id
The ID of the inference result to update.
Output Fields
actual_output
The generated output or response from the AI model.
actual_input
The input text or prompt for the inference result.
retrieval_context
The context retrieved by a RAG system, if applicable.
latency
The time in milliseconds from request to response.
Usage Fields
input_tokens
Number of input tokens sent to the model.
output_tokens
Number of output tokens generated by the model.
cache_read_input_tokens
Number of input tokens read from the cache.
tokens
Total tokens used in the model call.
Cost Fields
cost
The total cost associated with the model call.
cost_per_input_token
Cost per input token sent to the model.
cost_per_output_token
Cost per output token generated by the model.
cost_per_cache_read_input_token
Cost per input token read from the cache.
Returns
Returns the updated InferenceResult object.
Example
from galtea import Galtea

galtea = Galtea(api_key="YOUR_API_KEY")

# Update an inference result with output and metrics
updated_result = galtea.inference_results.update(
    inference_result_id="inf_abc123",
    actual_output="Here is the response from the model.",
    latency=245.5,
    input_tokens=150,
    output_tokens=75,
    cost=0.002
)

print(f"Updated: {updated_result.id}")
print(f"Output: {updated_result.actual_output}")
print(f"Latency: {updated_result.latency}ms")
Use Cases
Deferred Output Update
Create an inference result first, then update it after processing completes:
import time

# Assume 'session' and 'my_model' are defined
# session = galtea.sessions.create(...)

# Create inference result with just input
user_input = "What is the weather today?"
inference_result = galtea.inference_results.create(
    session_id=session.id,
    input=user_input
)

# Process with your model
start_time = time.time()
response = my_model.generate(user_input)
latency_ms = (time.time() - start_time) * 1000

# Update with output and metrics
galtea.inference_results.update(
    inference_result_id=inference_result.id,
    actual_output=response,
    latency=latency_ms
)
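The timing step above can be factored into a small reusable helper. The following is a minimal sketch, not part of the Galtea SDK: timed_call and fake_generate are hypothetical names introduced here for illustration, and time.perf_counter() is swapped in for time.time() because it is a monotonic clock intended for measuring elapsed time.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    latency_ms = (time.perf_counter() - start) * 1000
    return result, latency_ms


# Hypothetical stand-in for a real model call such as my_model.generate
def fake_generate(prompt: str) -> str:
    time.sleep(0.01)
    return f"echo: {prompt}"


response, latency_ms = timed_call(fake_generate, "What is the weather today?")
print(f"{response!r} took {latency_ms:.1f} ms")
```

The returned values map directly onto the actual_output and latency arguments of update(), since latency is expressed in milliseconds.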
Update an inference result with cost data after receiving billing info:
galtea.inference_results.update(
    inference_result_id=inference_result.id,
    cost=0.0025,
    cost_per_input_token=0.00001,
    cost_per_output_token=0.00003
)
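If billing data arrives as per-token prices rather than a total, the cost fields could be derived before calling update(). The sketch below assumes the total is the sum of each token count multiplied by its price. That formula, and the helper name build_cost_fields, are assumptions for illustration. Providers differ on whether cached tokens are already counted in input_tokens, so check that before relying on the total.

```python
from typing import Dict


def build_cost_fields(
    input_tokens: int,
    output_tokens: int,
    cost_per_input_token: float,
    cost_per_output_token: float,
    cache_read_input_tokens: int = 0,
    cost_per_cache_read_input_token: float = 0.0,
) -> Dict[str, float]:
    """Return cost-related keyword arguments for update().

    Assumption: total cost = sum of (token count * per-token price).
    """
    cost = (
        input_tokens * cost_per_input_token
        + output_tokens * cost_per_output_token
        + cache_read_input_tokens * cost_per_cache_read_input_token
    )
    fields = {
        "cost": cost,
        "cost_per_input_token": cost_per_input_token,
        "cost_per_output_token": cost_per_output_token,
    }
    # Only report the cache price when cached tokens were actually used
    if cache_read_input_tokens:
        fields["cost_per_cache_read_input_token"] = cost_per_cache_read_input_token
    return fields


fields = build_cost_fields(
    input_tokens=150,
    output_tokens=75,
    cost_per_input_token=0.00001,
    cost_per_output_token=0.00003,
)
print(fields)
```

The resulting dictionary can be unpacked into the call, for example galtea.inference_results.update(inference_result_id=inference_result.id, **fields).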
Notes
Only include fields you want to update. Fields not specified will remain unchanged.
- Pass None explicitly to clear a field’s value
- The creditsUsed field cannot be modified through this method
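Because omitting a field and passing None mean different things according to the notes above, calling code may want to keep the two cases apart when it assembles arguments dynamically. One possible approach, sketched here with a hypothetical sentinel _UNSET and helper update_kwargs (neither is part of the Galtea SDK), is to drop unset fields and keep explicit None values:

```python
from typing import Any, Dict

# Hypothetical sentinel meaning "leave this field unchanged"
_UNSET = object()


def update_kwargs(**fields: Any) -> Dict[str, Any]:
    """Keep only fields that were set, including explicit None values."""
    return {name: value for name, value in fields.items() if value is not _UNSET}


kwargs = update_kwargs(
    actual_output="done",  # field to update
    latency=_UNSET,        # omitted from the call
    cost=None,             # kept, passed explicitly as None
)
print(kwargs)
```

The filtered dictionary can then be unpacked into galtea.inference_results.update(inference_result_id=..., **kwargs), so that only the intended fields are sent.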