Evaluation Parameters
To compute the JSON Field Match (Normalized) metric, the following parameters are required:

- `actual_output`: The JSON output generated by the model (string or object). Supports lenient parsing: JSON wrapped in markdown code fences or embedded in surrounding text is automatically extracted.
- `expected_output`: The reference JSON object to compare against. Must be a valid JSON object.
How Is It Calculated?
The metric compares top-level fields between the expected and actual JSON outputs:
Parse Inputs
`expected_output` is parsed as a strict JSON object. `actual_output` is parsed leniently: the metric attempts to extract a JSON object from markdown code fences (e.g., ```json ... ```) or from surrounding text before parsing. If either input cannot be resolved to a valid JSON object, the evaluation raises an error.
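The lenient extraction step can be sketched as follows. This is an illustrative sketch, not the metric's actual implementation, and the helper name `parse_lenient` is hypothetical:

```python
import json
import re

def parse_lenient(text):
    """Best-effort extraction of a JSON object from raw model output.

    Tries strict parsing first, then a ```json fenced block, then the
    first {...} span in the surrounding text.
    """
    if isinstance(text, dict):
        return text  # already a parsed object
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Look for a markdown code fence such as ```json ... ```
    fence = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", text, re.DOTALL)
    if fence:
        return json.loads(fence.group(1))
    # Fall back to the first brace-delimited span in the surrounding text
    brace = re.search(r"\{.*\}", text, re.DOTALL)
    if brace:
        return json.loads(brace.group(0))
    raise ValueError("actual_output could not be resolved to a JSON object")
```

For example, `parse_lenient('```json\n{"city": "Paris"}\n```')` and `parse_lenient('The answer is {"city": "Paris"}.')` both yield the same object as `parse_lenient('{"city": "Paris"}')`.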
Normalize and Compare Fields
For each top-level key in the expected output, check whether the same key exists in the actual output with a matching value. String values are normalized before comparison:
- Accent removal: Unicode NFD decomposition strips combining marks (e.g., “é” becomes “e”).
- Case folding: Strings are compared case-insensitively (e.g., “SI” matches “si”).
- Non-string values: For numbers, booleans, and `null`, strict equality is used; `30` does not match `"30"`.

For nested objects and arrays, string values are recursively normalized before deep equality comparison. Note that array order is preserved: `["ADMIN", "USER"]` and `["user", "admin"]` are not considered equal, even though the individual elements normalize to the same strings.
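The normalization rules above can be sketched in Python. This is a simplified illustration; `normalize` and `fields_match` are hypothetical helper names, not the metric's real API:

```python
import unicodedata

def normalize(value):
    """Recursively normalize string values: strip accents via NFD
    decomposition, then compare case-insensitively via casefold.
    Non-string scalars pass through unchanged."""
    if isinstance(value, str):
        decomposed = unicodedata.normalize("NFD", value)
        stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
        return stripped.casefold()
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize(v) for v in value]  # order is preserved
    return value

def fields_match(expected, actual, key):
    """A top-level field matches if the key exists in the actual output
    and the normalized values are deeply equal."""
    return key in actual and normalize(expected[key]) == normalize(actual[key])
```

Under these rules `normalize("José")` yields `"jose"`, so it matches `"Jose"`, while `30` still fails to match `"30"` because neither value is a string.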
Compute Score
The score is calculated as:

score = (total_fields - non_matching_fields) / total_fields

where `total_fields` is the number of top-level keys in the expected output, and `non_matching_fields` is the count of keys that are missing or have different values in the actual output.
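A minimal sketch of the score arithmetic, assuming both outputs have already been parsed and their values normalized as described above:

```python
def score(expected, actual):
    """Fraction of expected top-level fields matched in the actual output.

    Assumes `expected` and `actual` are dicts whose values have already
    been normalized.
    """
    total_fields = len(expected)
    if total_fields == 0:
        return 1.0  # empty expected object scores 1.0 by convention
    non_matching_fields = sum(
        1 for key in expected
        if key not in actual or expected[key] != actual[key]
    )
    return (total_fields - non_matching_fields) / total_fields
```

With two expected fields and one mismatch, this yields (2 - 1) / 2 = 0.5, which is the partial-credit behavior described below.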
Interpretation of Scores
- 1.0 — All expected fields are present in the actual output with matching values (after normalization). Also returned when the expected output is an empty JSON object `{}`.
- 0.5 — Half of the expected fields match.
- 0.0 — No fields match.
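Putting the steps together, a compact end-to-end sketch (hypothetical helper names, not the product's API) reproduces the partial-credit behavior described above:

```python
import unicodedata

def _norm(value):
    """Accent stripping (NFD) plus casefold, applied recursively."""
    if isinstance(value, str):
        decomposed = unicodedata.normalize("NFD", value)
        return "".join(c for c in decomposed
                       if not unicodedata.combining(c)).casefold()
    if isinstance(value, dict):
        return {k: _norm(v) for k, v in value.items()}
    if isinstance(value, list):
        return [_norm(v) for v in value]
    return value

def json_field_match_normalized(expected, actual):
    """Score = matched top-level fields / total expected fields."""
    if not expected:
        return 1.0
    matched = sum(1 for key in expected
                  if key in actual and _norm(expected[key]) == _norm(actual[key]))
    return matched / len(expected)

# Worked example: the city matches after normalization, the age does not
expected = {"city": "São Paulo", "age": 30}
actual = {"city": "sao paulo", "age": "30"}
print(json_field_match_normalized(expected, actual))  # 0.5
```

The city field matches because both values normalize to `"sao paulo"`, while the age field fails because `30` and `"30"` are compared with strict equality, giving one match out of two fields.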
When to Use This vs JSON Field Match
| Scenario | Recommended Metric |
|---|---|
| Values must match exactly (e.g., IDs, codes) | JSON Field Match |
| String values may differ in case or accents (e.g., entity extraction, user-facing text) | JSON Field Match (Normalized) |
Suggested Test Case Types
Use JSON Field Match (Normalized) when evaluating:
- Entity extraction tasks where the model extracts names, cities, or labels that may vary in accents or capitalization.
- Multilingual extraction where accent marks are inconsistently applied (e.g., “José” vs “Jose”).
- Form-filling or slot-filling agents where case normalization is acceptable.
- Golden dataset evaluations where partial credit is useful and minor text variations should not penalize the score.