Evaluation Parameters
To compute the IoU metric, the following parameters must be provided:
actual_output: A list of predicted bounding boxes. Each box must be in [x1, y1, x2, y2] format (coordinates of the top-left and bottom-right corners). Accepts two formats:
- JSON array: "[[10, 10, 50, 50], [100, 100, 120, 120]]"
- JSON object with "bboxes" key: '{"bboxes": [[10, 10, 50, 50], [100, 100, 120, 120]]}'
expected_output: A single ground truth bounding box in [x1, y1, x2, y2] format. Accepts two formats:
- JSON array: "[10, 10, 50, 50]"
- JSON object with "bbox" key (singular): '{"bbox": [10, 10, 50, 50]}'
actual_output and expected_output must be valid JSON. Truncated or malformed JSON will cause evaluation failures.
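One reliable way to guarantee valid JSON is to serialize the boxes with json.dumps rather than hand-writing the strings. A minimal Python sketch (the variable names simply mirror the parameter names above):

```python
import json

# Predicted boxes: a list of [x1, y1, x2, y2] boxes, serialized as JSON.
actual_output = json.dumps([[10, 10, 50, 50], [100, 100, 120, 120]])
# Equivalent object form:
# actual_output = json.dumps({"bboxes": [[10, 10, 50, 50], [100, 100, 120, 120]]})

# Ground truth: a single box, serialized as JSON.
expected_output = json.dumps([10, 10, 50, 50])
# Equivalent object form:
# expected_output = json.dumps({"bbox": [10, 10, 50, 50]})
```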
How Is It Calculated?
- Each predicted bounding box is compared against the ground truth box.
- For each comparison, the IoU is calculated as the area of intersection divided by the area of union: IoU(A, B) = |A ∩ B| / |A ∪ B|.
- The maximum IoU value across all comparisons is returned as the final score.
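A minimal sketch of this computation under the conventions above (axis-aligned [x1, y1, x2, y2] boxes, maximum taken over all predicted boxes). The function names are illustrative, not the library's API:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes in [x1, y1, x2, y2] format."""
    # Intersection rectangle (may be empty if the boxes do not overlap).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def iou_score(predicted_boxes, ground_truth_box):
    """Final score: the best IoU any predicted box achieves."""
    return max((iou(p, ground_truth_box) for p in predicted_boxes), default=0.0)

# The first predicted box matches the ground truth exactly, so the score is 1.0.
print(iou_score([[10, 10, 50, 50], [100, 100, 120, 120]], [10, 10, 50, 50]))
```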
Interpretation of Scores
- ≥ 0.7 – Strong spatial alignment.
- 0.4 – 0.7 – Moderate overlap; may require refinement.
- < 0.4 – Poor alignment; predicted box diverges from reference.
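If you want to surface these bands programmatically, a small helper is enough; the labels below just mirror the thresholds above and are not part of the metric itself:

```python
def interpret_iou(score):
    """Map an IoU score to the qualitative bands above."""
    if score >= 0.7:
        return "strong spatial alignment"
    if score >= 0.4:
        return "moderate overlap; may require refinement"
    return "poor alignment; predicted box diverges from reference"
```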
Suggested Test Case Types
Use IoU when evaluating:
- Layout-aware predictions, such as bounding boxes in OCR or form extraction.
- Visual document understanding, where spatial positioning is essential.
- Object or text region detection in image or PDF-based tasks.