The Metrics Service in the Galtea SDK allows you to manage the metrics used to evaluate your products. It is exposed through the galtea.metrics object, whose API is described below.

Remember that we will be using the galtea object. More information here.

Create Metric

This method allows you to create a metric.

metric = galtea.metrics.create(
    name="accuracy_v1",
    criteria="Determine whether the actual output is equivalent to the expected output.",
    evaluation_params=["input", "expected output", "actual output"],
)
name
string
required

The name of the metric.

evaluation_params
list[string]
required

Standard parameters that define the inputs and outputs available during the evaluation process. These parameters should be explicitly mentioned in your evaluation criteria or steps to ensure they’re taken into account during assessment.

  • input: The original query or prompt sent to your product. This should always be provided.
  • expected output: The expected output or ideal response for the test case
  • actual output: The actual output produced by the product
  • retrieval context: The context retrieved by your RAG system that was used to generate the actual output
  • context: Additional background information provided with the input and related to the ground truth

“input” must always be included in the evaluation params.

You can directly reference these parameters in your criteria or evaluation steps. For example: “Evaluate if the Actual Output contains factually correct information that aligns with verified sources in the Retrieval Context.”

To ensure accurate evaluation results, include in your evaluation_params list only those parameters you explicitly reference in your criteria or evaluation steps. You may refer to a parameter by another descriptive name, such as ‘response’ instead of ‘actual output’; that is fine, but you must still include actual output in the evaluation_params list.
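The rules above (only the five standard parameters are valid, and input must always be present) can be checked locally before calling the SDK. The helper below is an illustrative sketch, not part of the Galtea SDK:

```python
# Illustrative pre-flight check (not part of the Galtea SDK): validates
# an evaluation_params list before passing it to galtea.metrics.create().
ALLOWED_PARAMS = {
    "input",
    "expected output",
    "actual output",
    "retrieval context",
    "context",
}

def validate_evaluation_params(params: list[str]) -> list[str]:
    """Raise ValueError if params contains unknown values or omits 'input'."""
    unknown = [p for p in params if p not in ALLOWED_PARAMS]
    if unknown:
        raise ValueError(f"Unknown evaluation params: {unknown}")
    if "input" not in params:
        raise ValueError("'input' must always be part of evaluation_params")
    return params

# These params are valid, so the call returns them unchanged:
params = validate_evaluation_params(["input", "actual output", "retrieval context"])
```

A check like this catches typos such as "outputs" or a missing "input" entry with a clear error, rather than an ambiguous evaluation result later.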

criteria
string
required

The natural-language criteria the metric is evaluated against. Reference the evaluation parameters explicitly, as in the example above.

evaluation_steps
list[string]
required

An ordered list of steps the evaluator follows when scoring the metric. Use this as an alternative to criteria.

You need to provide either Criteria or Evaluation Steps, but not both. Your choice depends on your preferred evaluation approach.
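Because exactly one of criteria or evaluation_steps must be supplied, you may want to enforce that choice before calling the SDK. The guard below is an illustrative sketch (the function name and return shape are this example's own, not part of the Galtea SDK):

```python
def choose_evaluation_mode(criteria=None, evaluation_steps=None):
    """Return keyword arguments for a metric-creation call, enforcing that
    exactly one of criteria or evaluation_steps is provided."""
    if (criteria is None) == (evaluation_steps is None):
        raise ValueError("Provide either criteria or evaluation_steps, not both")
    if criteria is not None:
        return {"criteria": criteria}
    return {"evaluation_steps": evaluation_steps}

# Using evaluation steps instead of criteria:
kwargs = choose_evaluation_mode(
    evaluation_steps=[
        "Check whether the actual output addresses the input.",
        "Compare the actual output against the expected output.",
    ]
)
```

The resulting kwargs dict could then be splatted into the create call alongside name and evaluation_params.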

List Metrics

This method allows you to list your metrics, with optional pagination.

metrics = galtea.metrics.list()
offset
int

The number of metrics to skip before starting to collect the result set.

limit
int

The maximum number of metrics to return.
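With the offset and limit parameters documented above, you can page through all metrics with a small loop. The sketch below assumes list() returns fewer than limit items on the final page; it is demonstrated against a stand-in function rather than a live galtea client:

```python
def iter_all_metrics(list_fn, page_size=50):
    """Yield every metric by paging with offset/limit until a short page,
    assuming the final page contains fewer than page_size items."""
    offset = 0
    while True:
        page = list_fn(offset=offset, limit=page_size)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size

# Stand-in for galtea.metrics.list, backed by hypothetical data:
_fake_store = [f"metric_{i}" for i in range(120)]
def fake_list(offset=0, limit=50):
    return _fake_store[offset : offset + limit]

all_metrics = list(iter_all_metrics(fake_list, page_size=50))  # 120 items
```

With a real client you would pass galtea.metrics.list as list_fn instead of the stub.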

Retrieve Metric

This method allows you to retrieve a specific metric by its ID.

metric = galtea.metrics.get(metric_type_id="YOUR_METRIC_ID")
metric_type_id
string
required

The ID of the metric you want to retrieve.

Retrieve Metric by Name

This method allows you to retrieve a specific metric by its name.

metric = galtea.metrics.get_by_name(name="YOUR_METRIC_NAME")
name
string
required

The name of the metric you want to retrieve.

Delete Metric

This method allows you to delete a specific metric by its ID.

galtea.metrics.delete(metric_type_id="YOUR_METRIC_ID")
metric_type_id
string
required

The ID of the metric you want to delete.