The JSON Field Match (Normalized) metric is one of the Deterministic Metric options in Galtea. It performs a field-level comparison between two JSON objects, just like JSON Field Match, but with case-insensitive and accent-insensitive string comparison. This is ideal for entity extraction use cases where superficial text differences (e.g., “Sí” vs “SI”) should not count as mismatches.

Evaluation Parameters

To compute the JSON Field Match (Normalized) metric, the following parameters are required:
  • actual_output: The JSON output generated by the model (string or object). Supports lenient parsing — JSON wrapped in markdown code fences or embedded in surrounding text is automatically extracted.
  • expected_output: The reference JSON object to compare against. Must be a valid JSON object.
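The lenient parsing of `actual_output` could be sketched as follows. This is a hypothetical illustration of the behavior described above, not Galtea's actual implementation; the function name `parse_lenient` and the extraction order (strict parse, then fenced block, then embedded object) are assumptions.

```python
import json
import re

def parse_lenient(text: str) -> dict:
    """Sketch: resolve a string to a JSON object, tolerating markdown
    fences and surrounding prose. Assumes a single JSON object."""
    # 1. Try strict parsing first.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # 2. Look for a fenced code block (```json ... ``` or ``` ... ```).
    fence = re.search(r"```(?:json)?\s*(\{.*\})\s*```", text, re.DOTALL)
    if fence:
        return json.loads(fence.group(1))
    # 3. Fall back to the first-to-last brace span in surrounding text.
    brace = re.search(r"\{.*\}", text, re.DOTALL)
    if brace:
        return json.loads(brace.group(0))
    raise ValueError("actual_output could not be resolved to a JSON object")
```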

How Is It Calculated?

The metric compares top-level fields between the expected and actual JSON outputs:
  1. Parse Inputs: expected_output is parsed as a strict JSON object. actual_output is parsed leniently — the metric will attempt to extract a JSON object from markdown code fences (e.g., ```json ... ```) or surrounding text before parsing. If either cannot be resolved to a valid JSON object, the evaluation raises an error.
  2. Normalize and Compare Fields: For each top-level key in the expected output, check whether the same key exists in the actual output with a matching value. String values are normalized before comparison:
    • Accent removal: Unicode NFD decomposition strips combining marks (e.g., “é” becomes “e”).
    • Case folding: Strings are compared case-insensitively (e.g., “SI” matches “si”).
    This means “Sí”, “SI”, and “si” are all considered equal after normalization. For non-string values (numbers, booleans, null), strict equality is used — 30 does not match "30". For nested objects and arrays, string values are recursively normalized before deep equality comparison. Note that array order is significant: ["ADMIN", "USER"] and ["user", "admin"] are not considered equal, even though the individual elements normalize to the same strings.
  3. Compute Score: The score is calculated as:
    score = 1 - (non_matching_fields / total_fields)
    
    Where total_fields is the number of top-level keys in the expected output, and non_matching_fields is the count of keys that are missing or have different values in the actual output.
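The normalization and scoring steps can be sketched in Python. This is a minimal illustration of the rules described above (NFD accent stripping, case folding, recursive normalization, top-level scoring), not Galtea's actual implementation; the function names are assumptions.

```python
import unicodedata

def normalize(value):
    """Recursively normalize string values; leave other types unchanged,
    so 30 still differs from "30"."""
    if isinstance(value, str):
        # NFD decomposition splits accented characters into base + combining
        # marks; dropping the combining marks strips the accents.
        stripped = "".join(c for c in unicodedata.normalize("NFD", value)
                           if not unicodedata.combining(c))
        return stripped.casefold()
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize(v) for v in value]  # order is preserved
    return value

def json_field_match_normalized(expected: dict, actual: dict) -> float:
    if not expected:
        return 1.0  # empty expected object scores 1.0
    non_matching = sum(
        1 for key, value in expected.items()
        if key not in actual or normalize(value) != normalize(actual[key])
    )
    return 1 - non_matching / len(expected)
```

Under this sketch, `json_field_match_normalized({"lang": "Sí"}, {"lang": "SI"})` yields 1.0, while swapping array element order (e.g., `["ADMIN", "USER"]` vs `["user", "admin"]`) yields a mismatch for that field.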

Interpretation of Scores

  • 1.0 — All expected fields are present in the actual output with matching values (after normalization). Also returned when the expected output is an empty JSON object {}.
  • 0.5 — Half of the expected fields match.
  • 0.0 — No fields match.
Extra keys in the actual output that are not in the expected output are ignored. Missing keys in the actual output count as non-matching.
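As a concrete illustration of these rules (with hypothetical field values):

```python
expected = {"name": "Ana", "age": 30}
actual = {"name": "ANA", "role": "admin"}

# "name" matches after normalization ("ANA" -> "ana").
# "age" is missing from the actual output -> non-matching.
# The extra "role" key is ignored entirely.
score = 1 - 1 / 2  # 1 non-matching field out of 2 expected -> 0.5
```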

When to Use This vs JSON Field Match

| Scenario | Recommended Metric |
| --- | --- |
| Values must match exactly (e.g., IDs, codes) | JSON Field Match |
| String values may differ in case or accents (e.g., entity extraction, user-facing text) | JSON Field Match (Normalized) |

Suggested Test Case Types

Use JSON Field Match (Normalized) when evaluating:
  • Entity extraction tasks where the model extracts names, cities, or labels that may vary in accents or capitalization.
  • Multilingual extraction where accent marks are inconsistently applied (e.g., “José” vs “Jose”).
  • Form-filling or slot-filling agents where case normalization is acceptable.
  • Golden dataset evaluations where partial credit is useful and minor text variations should not penalize the score.