---
license: openrail
tags:
- document-image-binarization
- image-segmentation
- generated_from_trainer
model-index:
- name: binarization-segformer-b3
  results: []
pipeline_tag: image-segmentation
---
binarization-segformer-b3
This model is a version of nvidia/segformer-b3 fine-tuned on the same ensemble of 13 datasets used in the SauvolaNet work, which the authors make publicly available in their GitHub repository.
On the evaluation set, it achieves the following results, measured with the DIBCO metrics:
- loss: 0.1017
- F-measure: 0.9776
- pseudo F-measure: 0.9531
- PSNR: 14.5040
- DRD: 5.3749
where PSNR is the peak signal-to-noise ratio and DRD is the distance reciprocal distortion.
For more information on the above DIBCO metrics, see the 2017 introductory paper.
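As a rough illustration of two of the metrics above, the sketch below computes the F-measure and PSNR for a pair of binary images. It follows the usual definitions (F-measure as the harmonic mean of precision and recall on foreground pixels, PSNR as 10·log10(C²/MSE) with C = 1 for 0/1 images). The function name `binarization_scores` and the nested-list encoding with 1 for ink are assumptions of this example, not the evaluation code used for this model; pseudo F-measure and DRD are omitted because they need weighting schemes not described here.

```python
import math


def binarization_scores(pred, truth):
    """Return (F-measure, PSNR) for two same-sized binary images.

    Both images are nested lists of 0/1 values, where 1 marks a
    foreground (ink) pixel. This encoding is an assumption of the sketch.
    """
    tp = fp = fn = errors = total = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            if p != t:
                errors += 1
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    if total == 0:
        raise ValueError("images must contain at least one pixel")

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    # For 0/1 images the mean squared error is the fraction of wrong pixels,
    # and the foreground/background difference C equals 1.
    mse = errors / total
    psnr = math.inf if mse == 0 else 10 * math.log10(1.0 / mse)
    return f_measure, psnr


truth = [[1, 1, 0, 0], [0, 0, 0, 0]]
pred = [[1, 0, 0, 0], [0, 0, 1, 0]]
f_measure, psnr = binarization_scores(pred, truth)
print(f"F-measure={f_measure:.4f}  PSNR={psnr:.4f} dB")
```

For real page images a vectorized implementation (for example with NumPy) would be more practical; the loop form is used here only to keep the definitions visible.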
Warning: This model only accepts images with a resolution of 640, because training was limited by the GPU compute available on the Colab free tier.
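One possible way to work within this constraint on larger pages is to split the page into fixed-size tiles, binarize each tile, and stitch the results. The sketch below only computes the tile coordinates. It assumes that "a resolution of 640" means 640×640 pixel inputs, and the helper name `tile_boxes` is hypothetical; tiling is not described as part of this model's own pipeline.

```python
def tile_boxes(width, height, tile=640):
    """Return (left, top, right, bottom) boxes of size tile x tile
    that together cover a width x height image.

    The last tile in each direction is shifted back so it stays inside
    the image, which means edge tiles may overlap their neighbours.
    """
    if width < tile or height < tile:
        raise ValueError("pad or upscale images smaller than the tile size first")

    def starts(length):
        # Regular grid of tile origins along one axis.
        positions = list(range(0, length - tile + 1, tile))
        # Add one shifted tile if the grid leaves an uncovered remainder.
        if positions[-1] + tile < length:
            positions.append(length - tile)
        return positions

    return [(x, y, x + tile, y + tile)
            for y in starts(height)
            for x in starts(width)]


for box in tile_boxes(1000, 640):
    print(box)
```

Where tiles overlap, the stitched output needs a rule for the shared pixels, such as keeping the later tile or averaging the predicted probabilities; which works best here is untested.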
Model description
This model is part of ongoing research into formulating document image binarization, the task evaluated by the DIBCO benchmarks, as pure semantic segmentation. This contrasts with the recent trend of adapting classical binarization algorithms with neural networks, as in DeepOtsu and the aforementioned SauvolaNet, which extend Otsu's method and the Sauvola thresholding algorithm, respectively.
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 10
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 50
- num_epochs: 50
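To make the interaction of these settings concrete, the sketch below reproduces the arithmetic behind the total batch size and the shape of a cosine schedule with linear warmup, using the usual formula for this scheduler type. The function name `learning_rate` and the value `total_steps=500` are assumptions for illustration only (the real number of optimizer steps is not stated above), and a single training device is assumed.

```python
import math

BASE_LR = 5e-05       # learning_rate
WARMUP_STEPS = 50     # lr_scheduler_warmup_steps

# total_train_batch_size = train_batch_size * gradient_accumulation_steps
# (assuming a single device).
effective_batch_size = 4 * 4


def learning_rate(step, total_steps, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print("effective batch size:", effective_batch_size)
for step in (0, 25, 50, 275, 500):
    # total_steps=500 is a hypothetical value chosen for this example.
    print(step, learning_rate(step, total_steps=500))
```

With 50 warmup steps, the learning rate reaches its peak of 5e-05 at step 50 and then decays, which is consistent with the gradual flattening of the validation loss in the later rows of the results table.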
Training results
training loss | epoch | step | validation loss | F-measure | pseudo F-measure | PSNR | DRD |
---|---|---|---|---|---|---|---|
0.6667 | 1.03 | 10 | 0.6683 | 0.7127 | 0.6831 | 4.8248 | 107.2894 |
0.6371 | 2.05 | 20 | 0.6390 | 0.8173 | 0.7360 | 6.1079 | 69.7770 |
0.587 | 3.08 | 30 | 0.5652 | 0.8934 | 0.8187 | 7.9143 | 40.5464 |
0.5288 | 4.1 | 40 | 0.4926 | 0.9240 | 0.8554 | 9.2247 | 27.4220 |
0.4601 | 5.13 | 50 | 0.4244 | 0.9490 | 0.8944 | 10.8830 | 16.8051 |
0.3864 | 6.15 | 60 | 0.3446 | 0.9638 | 0.9218 | 12.3460 | 10.6997 |
0.3331 | 7.18 | 70 | 0.3055 | 0.9693 | 0.9317 | 13.0531 | 8.5298 |
0.2821 | 8.21 | 80 | 0.2512 | 0.9736 | 0.9427 | 13.6929 | 6.8343 |
0.2392 | 9.23 | 90 | 0.2112 | 0.9744 | 0.9462 | 13.8825 | 6.4094 |
0.2126 | 10.26 | 100 | 0.1948 | 0.9743 | 0.9433 | 13.8424 | 6.5637 |
0.1889 | 11.28 | 110 | 0.1710 | 0.9749 | 0.9499 | 13.9784 | 6.1757 |
0.1662 | 12.31 | 120 | 0.1604 | 0.9753 | 0.9495 | 14.0450 | 6.0929 |
0.1506 | 13.33 | 130 | 0.1451 | 0.9750 | 0.9550 | 14.0028 | 6.1031 |
0.1359 | 14.36 | 140 | 0.1362 | 0.9759 | 0.9501 | 14.1383 | 5.9699 |
0.1321 | 15.38 | 150 | 0.1351 | 0.9761 | 0.9485 | 14.1907 | 5.9045 |
0.1283 | 16.41 | 160 | 0.1266 | 0.9758 | 0.9541 | 14.1515 | 5.8287 |
0.1198 | 17.44 | 170 | 0.1232 | 0.9763 | 0.9535 | 14.2411 | 5.7300 |
0.1151 | 18.46 | 180 | 0.1232 | 0.9765 | 0.9482 | 14.2788 | 5.8266 |
0.1146 | 19.49 | 190 | 0.1183 | 0.9764 | 0.9530 | 14.2363 | 5.7922 |
0.1027 | 20.51 | 200 | 0.1162 | 0.9765 | 0.9535 | 14.2867 | 5.6246 |
0.1051 | 21.54 | 210 | 0.1146 | 0.9766 | 0.9551 | 14.2963 | 5.6159 |
0.1095 | 22.56 | 220 | 0.1159 | 0.9767 | 0.9497 | 14.3153 | 5.8966 |
0.1076 | 23.59 | 230 | 0.1106 | 0.9768 | 0.9533 | 14.3267 | 5.6436 |
0.1006 | 24.62 | 240 | 0.1113 | 0.9769 | 0.9483 | 14.3683 | 5.6679 |
0.1077 | 25.64 | 250 | 0.1086 | 0.9770 | 0.9544 | 14.3843 | 5.4949 |
0.0966 | 26.67 | 260 | 0.1077 | 0.9770 | 0.9553 | 14.3660 | 5.5337 |
0.0958 | 27.69 | 270 | 0.1071 | 0.9773 | 0.9529 | 14.4405 | 5.4582 |
0.0984 | 28.72 | 280 | 0.1055 | 0.9772 | 0.9536 | 14.4405 | 5.4365 |
0.0936 | 29.74 | 290 | 0.1056 | 0.9774 | 0.9528 | 14.4634 | 5.4066 |
0.0958 | 30.77 | 300 | 0.1049 | 0.9772 | 0.9544 | 14.4138 | 5.4854 |
0.0896 | 31.79 | 310 | 0.1043 | 0.9774 | 0.9533 | 14.4593 | 5.4351 |
0.0973 | 32.82 | 320 | 0.1035 | 0.9774 | 0.9528 | 14.4633 | 5.4430 |
0.0943 | 33.85 | 330 | 0.1033 | 0.9775 | 0.9527 | 14.4809 | 5.4193 |
0.0956 | 34.87 | 340 | 0.1026 | 0.9774 | 0.9543 | 14.4576 | 5.4070 |
0.0936 | 35.9 | 350 | 0.1031 | 0.9775 | 0.9531 | 14.4827 | 5.4137 |
0.0937 | 36.92 | 360 | 0.1028 | 0.9773 | 0.9551 | 14.4420 | 5.4084 |
0.0952 | 37.95 | 370 | 0.1023 | 0.9775 | 0.9541 | 14.4809 | 5.3769 |
0.0952 | 38.97 | 380 | 0.1023 | 0.9776 | 0.9525 | 14.5086 | 5.3839 |
0.0948 | 40.0 | 390 | 0.1020 | 0.9774 | 0.9546 | 14.4667 | 5.3800 |
0.0931 | 41.03 | 400 | 0.1020 | 0.9776 | 0.9534 | 14.5043 | 5.3728 |
0.0906 | 42.05 | 410 | 0.1023 | 0.9774 | 0.9544 | 14.4771 | 5.3773 |
0.0974 | 43.08 | 420 | 0.1019 | 0.9776 | 0.9536 | 14.5024 | 5.3718 |
0.0908 | 44.1 | 430 | 0.1025 | 0.9776 | 0.9536 | 14.4995 | 5.3730 |
0.0935 | 45.13 | 440 | 0.1024 | 0.9775 | 0.9537 | 14.4978 | 5.3715 |
0.0927 | 46.15 | 450 | 0.1017 | 0.9776 | 0.9531 | 14.5040 | 5.3749 |
Framework versions
- Transformers 4.27.4
- PyTorch 2.0.0+cu118
- Datasets 2.11.0
- Tokenizers 0.13.3