metadata

license: openrail
tags:
  - document-image-binarization
  - image-segmentation
  - generated_from_trainer
model-index:
  - name: binarization-segformer-b3
    results: []
pipeline_tag: image-segmentation

binarization-segformer-b3

This model is a fine-tuned version of nvidia/segformer-b3, trained on the same ensemble of 13 datasets used in the SauvolaNet work, which are publicly available in its GitHub repository.

On the evaluation set, it achieves the following results on the DIBCO metrics:

loss: 0.1017
F-measure: 0.9776
pseudo F-measure: 0.9531
PSNR: 14.5040
DRD: 5.3749

Here, PSNR is the peak signal-to-noise ratio and DRD is the distance reciprocal distortion.

For more information on the above DIBCO metrics, see the 2017 introductory paper.
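
To make two of these metrics concrete, here is a minimal sketch of the F-measure and PSNR for a pair of binary images. It is not the evaluation code used for this model: the function names `f_measure` and `psnr` are hypothetical, pixels are assumed to be flat sequences of 0/1 values with 1 marking the text class, and PSNR assumes a peak value of 1. Pseudo F-measure and DRD need weighting schemes that are left out here.

```python
import math


def f_measure(pred, truth):
    """F-measure of the positive class; inputs are flat sequences of 0/1 pixels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        # No correctly predicted positive pixel: precision and recall are both 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def psnr(pred, truth):
    """PSNR in dB for 0/1 images, assuming a peak signal value of 1."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(1.0 / mse)


# Toy example: 8 pixels, one missed text pixel and one false alarm.
truth = [1, 1, 0, 0, 1, 0, 0, 0]
pred = [1, 0, 0, 0, 1, 1, 0, 0]
print(round(f_measure(pred, truth), 4))  # 0.6667
print(round(psnr(pred, truth), 2))       # 6.02
```

Values computed this way may differ from the ones reported above if the evaluation used a different class or peak-value convention.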

Warning: This model only accepts images with a resolution of 640, a limit imposed by GPU compute constraints on the Colab free tier during training.
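
One possible way to work within this limit on larger scans is to process the page in overlapping tiles. The sketch below is only an illustration, not a procedure documented for this model: it assumes that "a resolution of 640" means 640×640 pixel inputs, and `tile_boxes` is a hypothetical helper that only computes crop coordinates.

```python
def tile_boxes(width, height, tile=640):
    """Return (left, top, right, bottom) boxes of size tile x tile covering an image.

    Boxes overlap where needed so that none extends past the image border.
    An image smaller than the tile along an axis gets a single box on that
    axis, which then extends past the border and would need padding.
    """
    def starts(length):
        if length <= tile:
            return [0]
        count = -(-length // tile)  # ceiling division: tiles needed on this axis
        last = length - tile        # start of the final tile, flush with the border
        # Spread the starts evenly between 0 and `last` using integer arithmetic.
        return [(i * last) // (count - 1) for i in range(count)]

    return [(x, y, x + tile, y + tile)
            for y in starts(height)
            for x in starts(width)]


# A 1000 x 640 page is covered by two overlapping tiles.
print(tile_boxes(1000, 640))  # [(0, 0, 640, 640), (360, 0, 1000, 640)]
```

Each box can be cropped, passed through the model, and the predictions pasted back. How overlapping regions are merged is a design choice that this sketch leaves open.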

Model description

This model is part of ongoing research on pure semantic segmentation models as a formulation of document image binarization (DIBCO). This contrasts with the recent trend of adapting classical binarization algorithms with neural networks, as in DeepOtsu and the aforementioned SauvolaNet, which extend Otsu's method and the Sauvola thresholding algorithm, respectively.
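
Under the segmentation formulation, binarization reduces to picking the highest-scoring class for each pixel. The sketch below illustrates that step only. It assumes a two-class output in which each pixel carries a pair of scores, and `logits_to_mask` is a hypothetical helper; which class index corresponds to text is not stated in this card.

```python
def logits_to_mask(logits):
    """Per-pixel argmax over two class scores.

    `logits` is a list of rows; each pixel is a (class0_score, class1_score) pair.
    Returns a mask of class indices (0 or 1); ties go to class 0.
    """
    return [[1 if c1 > c0 else 0 for (c0, c1) in row] for row in logits]


# Toy 2 x 2 "image" of per-pixel class scores.
scores = [
    [(2.0, -1.0), (0.1, 0.3)],
    [(-0.5, 0.5), (1.0, 1.0)],
]
print(logits_to_mask(scores))  # [[0, 1], [1, 0]]
```

Unlike a thresholding method, no per-image or per-window threshold is estimated: the decision comes directly from the class scores.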

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 10
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 50
num_epochs: 50
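
Two of these settings interact in ways worth spelling out: the total batch size is the per-step batch size times the gradient accumulation steps (4 × 4 = 16), and the learning rate follows a linear warmup over 50 steps followed by a cosine decay. The sketch below reproduces that schedule shape in plain Python. `lr_at` is a hypothetical helper, and `total_steps=500` is an illustrative value, not the actual number of optimizer steps in this run.

```python
import math


def lr_at(step, total_steps, base_lr=5e-5, warmup_steps=50):
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


train_batch_size = 4
gradient_accumulation_steps = 4
# Samples contributing to each optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 16

# total_steps=500 is an assumed value chosen only to show the shape of the curve.
for step in (0, 25, 50, 275, 500):
    print(step, lr_at(step, total_steps=500))
```

The peak rate of 5e-05 is reached at step 50, when warmup ends, and the rate then decays smoothly toward zero at the final step.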

Training results

training loss	epoch	step	validation loss	F-measure	pseudo F-measure	PSNR	DRD
0.6667	1.03	10	0.6683	0.7127	0.6831	4.8248	107.2894
0.6371	2.05	20	0.6390	0.8173	0.7360	6.1079	69.7770
0.587	3.08	30	0.5652	0.8934	0.8187	7.9143	40.5464
0.5288	4.1	40	0.4926	0.9240	0.8554	9.2247	27.4220
0.4601	5.13	50	0.4244	0.9490	0.8944	10.8830	16.8051
0.3864	6.15	60	0.3446	0.9638	0.9218	12.3460	10.6997
0.3331	7.18	70	0.3055	0.9693	0.9317	13.0531	8.5298
0.2821	8.21	80	0.2512	0.9736	0.9427	13.6929	6.8343
0.2392	9.23	90	0.2112	0.9744	0.9462	13.8825	6.4094
0.2126	10.26	100	0.1948	0.9743	0.9433	13.8424	6.5637
0.1889	11.28	110	0.1710	0.9749	0.9499	13.9784	6.1757
0.1662	12.31	120	0.1604	0.9753	0.9495	14.0450	6.0929
0.1506	13.33	130	0.1451	0.9750	0.9550	14.0028	6.1031
0.1359	14.36	140	0.1362	0.9759	0.9501	14.1383	5.9699
0.1321	15.38	150	0.1351	0.9761	0.9485	14.1907	5.9045
0.1283	16.41	160	0.1266	0.9758	0.9541	14.1515	5.8287
0.1198	17.44	170	0.1232	0.9763	0.9535	14.2411	5.7300
0.1151	18.46	180	0.1232	0.9765	0.9482	14.2788	5.8266
0.1146	19.49	190	0.1183	0.9764	0.9530	14.2363	5.7922
0.1027	20.51	200	0.1162	0.9765	0.9535	14.2867	5.6246
0.1051	21.54	210	0.1146	0.9766	0.9551	14.2963	5.6159
0.1095	22.56	220	0.1159	0.9767	0.9497	14.3153	5.8966
0.1076	23.59	230	0.1106	0.9768	0.9533	14.3267	5.6436
0.1006	24.62	240	0.1113	0.9769	0.9483	14.3683	5.6679
0.1077	25.64	250	0.1086	0.9770	0.9544	14.3843	5.4949
0.0966	26.67	260	0.1077	0.9770	0.9553	14.3660	5.5337
0.0958	27.69	270	0.1071	0.9773	0.9529	14.4405	5.4582
0.0984	28.72	280	0.1055	0.9772	0.9536	14.4405	5.4365
0.0936	29.74	290	0.1056	0.9774	0.9528	14.4634	5.4066
0.0958	30.77	300	0.1049	0.9772	0.9544	14.4138	5.4854
0.0896	31.79	310	0.1043	0.9774	0.9533	14.4593	5.4351
0.0973	32.82	320	0.1035	0.9774	0.9528	14.4633	5.4430
0.0943	33.85	330	0.1033	0.9775	0.9527	14.4809	5.4193
0.0956	34.87	340	0.1026	0.9774	0.9543	14.4576	5.4070
0.0936	35.9	350	0.1031	0.9775	0.9531	14.4827	5.4137
0.0937	36.92	360	0.1028	0.9773	0.9551	14.4420	5.4084
0.0952	37.95	370	0.1023	0.9775	0.9541	14.4809	5.3769
0.0952	38.97	380	0.1023	0.9776	0.9525	14.5086	5.3839
0.0948	40.0	390	0.1020	0.9774	0.9546	14.4667	5.3800
0.0931	41.03	400	0.1020	0.9776	0.9534	14.5043	5.3728
0.0906	42.05	410	0.1023	0.9774	0.9544	14.4771	5.3773
0.0974	43.08	420	0.1019	0.9776	0.9536	14.5024	5.3718
0.0908	44.1	430	0.1025	0.9776	0.9536	14.4995	5.3730
0.0935	45.13	440	0.1024	0.9775	0.9537	14.4978	5.3715
0.0927	46.15	450	0.1017	0.9776	0.9531	14.5040	5.3749

Framework versions

Transformers 4.27.4
Pytorch 2.0.0+cu118
Datasets 2.11.0
Tokenizers 0.13.3