binarization-segformer-b3

This model is a fine-tuned version of nvidia/segformer-b3-1024-1024 on the same ensemble of 13 datasets used in the SauvolaNet work, publicly available in its GitHub repository.

It achieves the following results on the evaluation set, measured with the DIBCO metrics:

  • loss: 0.0743
  • DRD: 5.9548
  • F-measure: 0.9840
  • pseudo F-measure: 0.9740
  • PSNR: 16.0119

where PSNR is the peak signal-to-noise ratio and DRD is the distance reciprocal distortion.

For more information on the above metrics, see the DIBCO 2017 introductory paper.
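
For orientation, the two simplest of these metrics can be computed directly from a predicted binary image and its ground truth, as in the minimal NumPy sketch below (it assumes text pixels are encoded as 1; DRD and the pseudo F-measure additionally need the weight-matrix and skeleton-based definitions from the DIBCO papers and are omitted here):

```python
import numpy as np

def f_measure(pred: np.ndarray, gt: np.ndarray) -> float:
    """Harmonic mean of precision and recall over text (foreground) pixels."""
    tp = np.logical_and(pred == 1, gt == 1).sum()
    fp = np.logical_and(pred == 1, gt == 0).sum()
    fn = np.logical_and(pred == 0, gt == 1).sum()
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    return 2 * precision * recall / max(precision + recall, 1e-12)

def psnr(pred: np.ndarray, gt: np.ndarray) -> float:
    """Peak signal-to-noise ratio between two binary images (peak value C = 1)."""
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return float(10.0 * np.log10(1.0 / mse))
```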

Model description

This model is part of ongoing research on pure semantic segmentation models as a formulation of document image binarization (DIBCO). This contrasts with the recent trend of adapting classical binarization algorithms with neural networks, such as DeepOtsu and SauvolaNet, which extend Otsu's method and Sauvola's thresholding algorithm, respectively.
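
In this formulation, binarization is plain two-class semantic segmentation (background vs. ink), so inference is a single SegFormer forward pass followed by a per-pixel argmax. The sketch below illustrates this with the transformers API; the file names, the assumption that the repository ships an image processor config, and the convention that class 1 is ink are illustrative, not guaranteed by this card:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, SegformerForSemanticSegmentation

model_id = "DiTo97/binarization-segformer-b3"
processor = AutoImageProcessor.from_pretrained(model_id)  # assumes a processor config is available
model = SegformerForSemanticSegmentation.from_pretrained(model_id).eval()

image = Image.open("degraded_page.png").convert("RGB")  # hypothetical input file
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # (1, num_labels, H/4, W/4)

# upsample the low-resolution logits back to the input size and take the per-pixel argmax
logits = torch.nn.functional.interpolate(
    logits, size=image.size[::-1], mode="bilinear", align_corners=False
)
classes = logits.argmax(dim=1)[0]  # (H, W) predicted class per pixel

# assumption: class 1 is ink/text; render ink as black on a white background
binary = ((classes != 1).cpu().numpy() * 255).astype("uint8")
Image.fromarray(binary).save("binarized_page.png")
```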

Intended uses & limitations

TBC

Training and evaluation data

TBC

Training procedure

Training hyperparameters

TBC
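
Until the actual recipe is documented, the sketch below only illustrates how a SegFormer binarization fine-tune is typically wired up with transformers; the base checkpoint id and every hyperparameter value in it are placeholders, not the configuration used for this model:

```python
from transformers import SegformerForSemanticSegmentation, TrainingArguments

# hypothetical base checkpoint with a freshly initialized 2-class head (background vs. ink)
model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/mit-b3",
    num_labels=2,
    id2label={0: "background", 1: "ink"},
    label2id={"background": 0, "ink": 1},
)

# placeholder hyperparameters; not the values used to train this model
args = TrainingArguments(
    output_dir="binarization-segformer-b3",
    learning_rate=6e-5,
    per_device_train_batch_size=4,
    num_train_epochs=15,
    evaluation_strategy="steps",
    eval_steps=10,
)

# a transformers Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)
# over examples with pixel_values and per-pixel labels would complete the loop
```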

Training results

| Training loss | Epoch | Step | Validation loss | DRD | F-measure | Pseudo F-measure | PSNR |
|---------------|-------|------|-----------------|----------|-----------|------------------|---------|
| 0.6983 | 0.26 | 10 | 0.7079 | 199.5096 | 0.5945 | 0.5801 | 3.4552 |
| 0.6657 | 0.52 | 20 | 0.6755 | 149.2346 | 0.7006 | 0.6165 | 4.6752 |
| 0.6145 | 0.77 | 30 | 0.6433 | 109.7298 | 0.7831 | 0.6520 | 5.5489 |
| 0.5553 | 1.03 | 40 | 0.5443 | 53.7149 | 0.8952 | 0.8000 | 8.1736 |
| 0.4627 | 1.29 | 50 | 0.4896 | 32.7649 | 0.9321 | 0.8603 | 9.8706 |
| 0.3969 | 1.55 | 60 | 0.4327 | 21.5508 | 0.9526 | 0.8985 | 11.3400 |
| 0.3414 | 1.81 | 70 | 0.3002 | 11.0094 | 0.9732 | 0.9462 | 13.5901 |
| 0.2898 | 2.06 | 80 | 0.2839 | 10.1064 | 0.9748 | 0.9563 | 13.9796 |
| 0.2292 | 2.32 | 90 | 0.2427 | 9.4437 | 0.9761 | 0.9584 | 14.2161 |
| 0.2153 | 2.58 | 100 | 0.2095 | 8.8696 | 0.9771 | 0.9621 | 14.4319 |
| 0.1767 | 2.84 | 110 | 0.1916 | 8.6152 | 0.9776 | 0.9646 | 14.5528 |
| 0.1509 | 3.1 | 120 | 0.1704 | 8.0761 | 0.9791 | 0.9632 | 14.7961 |
| 0.1265 | 3.35 | 130 | 0.1561 | 8.5627 | 0.9784 | 0.9655 | 14.7400 |
| 0.132 | 3.61 | 140 | 0.1318 | 8.1849 | 0.9788 | 0.9670 | 14.8469 |
| 0.1115 | 3.87 | 150 | 0.1317 | 7.8438 | 0.9790 | 0.9657 | 14.9072 |
| 0.0983 | 4.13 | 160 | 0.1273 | 7.9405 | 0.9791 | 0.9673 | 14.9701 |
| 0.1001 | 4.39 | 170 | 0.1234 | 8.4132 | 0.9788 | 0.9691 | 14.8573 |
| 0.0862 | 4.65 | 180 | 0.1147 | 8.0838 | 0.9797 | 0.9678 | 15.0433 |
| 0.0713 | 4.9 | 190 | 0.1134 | 7.6027 | 0.9806 | 0.9687 | 15.2235 |
| 0.0905 | 5.16 | 200 | 0.1061 | 7.2973 | 0.9803 | 0.9699 | 15.1646 |
| 0.0902 | 5.42 | 210 | 0.1061 | 8.4049 | 0.9787 | 0.9699 | 14.8460 |
| 0.0759 | 5.68 | 220 | 0.1062 | 7.7147 | 0.9809 | 0.9695 | 15.2426 |
| 0.0638 | 5.94 | 230 | 0.1019 | 7.7449 | 0.9806 | 0.9695 | 15.2195 |
| 0.0852 | 6.19 | 240 | 0.0962 | 7.0221 | 0.9817 | 0.9693 | 15.4730 |
| 0.0677 | 6.45 | 250 | 0.0961 | 7.2520 | 0.9814 | 0.9710 | 15.3878 |
| 0.0668 | 6.71 | 260 | 0.0972 | 6.6658 | 0.9823 | 0.9689 | 15.6106 |
| 0.0701 | 6.97 | 270 | 0.0909 | 6.9454 | 0.9820 | 0.9713 | 15.5458 |
| 0.0567 | 7.23 | 280 | 0.0925 | 6.5498 | 0.9824 | 0.9718 | 15.5965 |
| 0.0624 | 7.48 | 290 | 0.0899 | 7.3125 | 0.9813 | 0.9717 | 15.3255 |
| 0.0649 | 7.74 | 300 | 0.0932 | 7.4915 | 0.9816 | 0.9684 | 15.5666 |
| 0.0524 | 8.0 | 310 | 0.0905 | 7.1666 | 0.9815 | 0.9711 | 15.4526 |
| 0.0693 | 8.26 | 320 | 0.0901 | 6.5627 | 0.9827 | 0.9704 | 15.7335 |
| 0.0528 | 8.52 | 330 | 0.0845 | 6.6690 | 0.9826 | 0.9734 | 15.5950 |
| 0.0632 | 8.77 | 340 | 0.0822 | 6.2661 | 0.9833 | 0.9723 | 15.8631 |
| 0.0522 | 9.03 | 350 | 0.0844 | 6.0073 | 0.9836 | 0.9715 | 15.9393 |
| 0.0568 | 9.29 | 360 | 0.0817 | 5.9460 | 0.9837 | 0.9721 | 15.9523 |
| 0.057 | 9.55 | 370 | 0.0900 | 7.9726 | 0.9812 | 0.9730 | 15.1229 |
| 0.052 | 9.81 | 380 | 0.0836 | 6.5444 | 0.9822 | 0.9712 | 15.6388 |
| 0.0568 | 10.06 | 390 | 0.0810 | 6.0359 | 0.9836 | 0.9714 | 15.9796 |
| 0.0481 | 10.32 | 400 | 0.0784 | 6.2110 | 0.9835 | 0.9724 | 15.9235 |
| 0.0513 | 10.58 | 410 | 0.0803 | 6.0990 | 0.9835 | 0.9715 | 15.9502 |
| 0.0595 | 10.84 | 420 | 0.0798 | 6.0829 | 0.9835 | 0.9720 | 15.9052 |
| 0.047 | 11.1 | 430 | 0.0779 | 5.8847 | 0.9838 | 0.9725 | 16.0043 |
| 0.0406 | 11.35 | 440 | 0.0802 | 5.7944 | 0.9838 | 0.9713 | 16.0620 |
| 0.0493 | 11.61 | 450 | 0.0781 | 6.0947 | 0.9836 | 0.9731 | 15.9033 |
| 0.064 | 11.87 | 460 | 0.0769 | 6.1257 | 0.9837 | 0.9736 | 15.9080 |
| 0.0622 | 12.13 | 470 | 0.0765 | 6.2964 | 0.9835 | 0.9739 | 15.8188 |
| 0.0457 | 12.39 | 480 | 0.0773 | 5.9826 | 0.9838 | 0.9728 | 16.0119 |
| 0.0447 | 12.65 | 490 | 0.0761 | 5.7977 | 0.9841 | 0.9728 | 16.0900 |
| 0.0515 | 12.9 | 500 | 0.0750 | 5.8569 | 0.9840 | 0.9729 | 16.0633 |
| 0.0357 | 13.16 | 510 | 0.0796 | 5.7990 | 0.9837 | 0.9713 | 16.0818 |
| 0.0503 | 13.42 | 520 | 0.0749 | 5.8323 | 0.9841 | 0.9736 | 16.0510 |
| 0.0508 | 13.68 | 530 | 0.0746 | 6.0361 | 0.9839 | 0.9735 | 15.9709 |
| 0.0533 | 13.94 | 540 | 0.0768 | 6.1596 | 0.9836 | 0.9740 | 15.9193 |
| 0.0503 | 14.19 | 550 | 0.0739 | 5.5900 | 0.9843 | 0.9723 | 16.1883 |
| 0.0515 | 14.45 | 560 | 0.0740 | 5.4660 | 0.9845 | 0.9727 | 16.2745 |
| 0.0502 | 14.71 | 570 | 0.0740 | 5.5895 | 0.9844 | 0.9736 | 16.2054 |
| 0.0401 | 14.97 | 580 | 0.0741 | 5.9694 | 0.9840 | 0.9747 | 15.9603 |
| 0.0495 | 15.23 | 590 | 0.0745 | 5.9136 | 0.9841 | 0.9740 | 16.0458 |
| 0.0413 | 15.48 | 600 | 0.0743 | 5.9548 | 0.9840 | 0.9740 | 16.0119 |

Framework versions

  • transformers 4.31.0
  • torch 2.0.0
  • datasets 2.13.1
  • tokenizers 0.13.3