HIPE-OCRepair 2026 evaluation results
Results are in! The overall scores on the test split, ranked by overall cMER, are:
| Rank | System | Overall cMER ↓ | 95% CI | Overall Pref Macro ↑ | 95% CI | Test sets |
|---|---|---|---|---|---|---|
| 1 | bnf-mistral_hipe-ocrepair-bench_v0.9_run1 | 0.0050 | [0.004, 0.007] | 0.9000 | [0.835, 0.952] | 8/8 |
| 2 | bnf-mistral_hipe-ocrepair-bench_v0.9_run3 | 0.0071 | [0.005, 0.009] | 0.8737 | [0.803, 0.931] | 8/8 |
| 3 | bnf-mistral_hipe-ocrepair-bench_v0.9_run2 | 0.0090 | [0.007, 0.011] | 0.8739 | [0.791, 0.942] | 8/8 |
| 4 | blocr_hipe-ocrepair-bench_v0.9_run1 | 0.0106 | [0.008, 0.013] | 0.7028 | [0.574, 0.816] | 8/8 |
| 5 | l3i_hipe-ocrepair-bench_v0.9_run1 | 0.0176 | [0.014, 0.022] | 0.3591 | [0.229, 0.486] | 8/8 |
| 8 | baseline-no-correction_hipe-ocrepair-bench_v0.9_run1 | 0.0226 | [0.019, 0.026] | 0.0000 | [0.000, 0.000] | 8/8 |
CI = confidence interval, computed with the bootstrap method. ↓ = lower is better; ↑ = higher is better.
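
For readers unfamiliar with bootstrap intervals, here is a minimal sketch of a percentile bootstrap for the mean of per-item scores. It is an illustration only: the results above do not state which bootstrap variant, resampling unit, or number of resamples the organizers used, so the function name `bootstrap_ci`, its parameters, and the sample scores are all hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of `scores` (illustrative only)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(scores)
    # Resample with replacement, recompute the mean each time, then sort.
    means = sorted(mean(rng.choices(scores, k=n)) for _ in range(n_resamples))
    # Take the alpha/2 and 1 - alpha/2 empirical quantiles of the resampled means.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical per-document error rates, NOT taken from the benchmark.
scores = [0.002, 0.010, 0.004, 0.000, 0.007, 0.012, 0.003, 0.005]
low, high = bootstrap_ci(scores)
print(f"mean={mean(scores):.4f}  95% CI=[{low:.4f}, {high:.4f}]")
```

The percentile variant was chosen here only because it is the simplest to read; other variants (for example BCa) would give somewhat different bounds.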
Per-split results and further details are in this markdown file.