metadata
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased
tags:
  - generated_from_trainer
datasets:
  - gokulsrinivasagan/processed_book_corpus_cleaned
metrics:
  - accuracy
model-index:
  - name: bert_base_train_book
    results:
      - task:
          name: Masked Language Modeling
          type: fill-mask
        dataset:
          name: gokulsrinivasagan/processed_book_corpus_cleaned
          type: gokulsrinivasagan/processed_book_corpus_cleaned
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.7530519758192171

bert_base_train_book

This model is a fine-tuned version of distilbert-base-uncased on the gokulsrinivasagan/processed_book_corpus_cleaned dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0775
  • Accuracy: 0.7531
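
To put the loss in more familiar terms, it can be converted to a perplexity-style number. The sketch below assumes the reported loss is a mean cross-entropy in nats over the masked tokens, which is the usual convention for masked language modeling but is not stated in this card.

```python
import math

# Final evaluation loss reported above.
eval_loss = 1.0775

# Assumption: the loss is a mean cross-entropy in nats over masked tokens,
# so exp(loss) can be read as a masked-token (pseudo-)perplexity.
perplexity = math.exp(eval_loss)

print(f"pseudo-perplexity: {perplexity:.2f}")
```

Under that assumption the value comes out a little below 3, meaning the model is on average about as uncertain as a uniform choice among roughly three tokens at each masked position.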

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 96
  • eval_batch_size: 96
  • seed: 10
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10000
  • num_epochs: 25
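
The learning-rate settings above (linear scheduler, 10000 warmup steps, peak rate 0.0001) describe a ramp up followed by a linear decay to zero. A minimal pure-Python sketch of that shape follows. The function name `linear_warmup_decay_lr` is hypothetical, and `total_steps=593_000` is an assumption: it is a rough estimate from 25 epochs at about 23,720 steps per epoch, as suggested by the training log, not a value stated in this card.

```python
def linear_warmup_decay_lr(step, base_lr=1e-4, warmup_steps=10_000,
                           total_steps=593_000):
    """Learning rate at a given optimizer step.

    total_steps is an ASSUMED estimate, not a value from the model card.
    """
    if step < warmup_steps:
        # Warmup: rise linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 5_000, 10_000, 300_000, 593_000):
    print(step, linear_warmup_decay_lr(step))
```

The peak rate is reached at step 10000, which is the first row of the training log, so every logged evaluation happens during the decay phase.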

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 5.6277 | 0.4215 | 10000 | 5.4679 | 0.1648 |
| 5.5308 | 0.8431 | 20000 | 5.3921 | 0.1656 |
| 5.4819 | 1.2646 | 30000 | 5.3559 | 0.1668 |
| 5.4576 | 1.6861 | 40000 | 5.3327 | 0.1669 |
| 5.434 | 2.1077 | 50000 | 5.3193 | 0.1671 |
| 5.423 | 2.5292 | 60000 | 5.3064 | 0.1676 |
| 5.4078 | 2.9507 | 70000 | 5.3011 | 0.1670 |
| 5.3996 | 3.3723 | 80000 | 5.2891 | 0.1675 |
| 5.3864 | 3.7938 | 90000 | 5.2806 | 0.1672 |
| 5.3883 | 4.2153 | 100000 | 5.2894 | 0.1641 |
| 5.3743 | 4.6369 | 110000 | 5.2662 | 0.1678 |
| 5.3614 | 5.0584 | 120000 | 5.2495 | 0.1677 |
| 2.7786 | 5.4799 | 130000 | 2.4132 | 0.5314 |
| 2.191 | 5.9014 | 140000 | 1.8931 | 0.6135 |
| 1.997 | 6.3230 | 150000 | 1.7234 | 0.6414 |
| 1.8894 | 6.7445 | 160000 | 1.6208 | 0.6582 |
| 1.801 | 7.1660 | 170000 | 1.5466 | 0.6709 |
| 1.7429 | 7.5876 | 180000 | 1.4959 | 0.6795 |
| 1.6988 | 8.0091 | 190000 | 1.4521 | 0.6867 |
| 1.6587 | 8.4306 | 200000 | 1.4160 | 0.6930 |
| 1.6247 | 8.8522 | 210000 | 1.3884 | 0.6977 |
| 1.5996 | 9.2737 | 220000 | 1.3623 | 0.7023 |
| 1.5686 | 9.6952 | 230000 | 1.3387 | 0.7062 |
| 1.5445 | 10.1168 | 240000 | 1.3201 | 0.7099 |
| 1.5316 | 10.5383 | 250000 | 1.3002 | 0.7128 |
| 1.51 | 10.9598 | 260000 | 1.2850 | 0.7156 |
| 1.4938 | 11.3814 | 270000 | 1.2728 | 0.7178 |
| 1.4864 | 11.8029 | 280000 | 1.2574 | 0.7205 |
| 1.4641 | 12.2244 | 290000 | 1.2453 | 0.7228 |
| 1.4549 | 12.6460 | 300000 | 1.2324 | 0.7250 |
| 1.4394 | 13.0675 | 310000 | 1.2212 | 0.7270 |
| 1.4298 | 13.4890 | 320000 | 1.2135 | 0.7284 |
| 1.4227 | 13.9106 | 330000 | 1.2044 | 0.7299 |
| 1.414 | 14.3321 | 340000 | 1.1946 | 0.7319 |
| 1.4028 | 14.7536 | 350000 | 1.1855 | 0.7333 |
| 1.3929 | 15.1751 | 360000 | 1.1794 | 0.7344 |
| 1.3863 | 15.5967 | 370000 | 1.1696 | 0.7360 |
| 1.3762 | 16.0182 | 380000 | 1.1627 | 0.7372 |
| 1.3697 | 16.4397 | 390000 | 1.1562 | 0.7387 |
| 1.36 | 16.8613 | 400000 | 1.1513 | 0.7395 |
| 1.3566 | 17.2828 | 410000 | 1.1425 | 0.7411 |
| 1.3482 | 17.7043 | 420000 | 1.1388 | 0.7417 |
| 1.3398 | 18.1259 | 430000 | 1.1331 | 0.7430 |
| 1.3332 | 18.5474 | 440000 | 1.1295 | 0.7436 |
| 1.3316 | 18.9689 | 450000 | 1.1221 | 0.7448 |
| 1.3235 | 19.3905 | 460000 | 1.1177 | 0.7457 |
| 1.321 | 19.8120 | 470000 | 1.1127 | 0.7464 |
| 1.3123 | 20.2335 | 480000 | 1.1087 | 0.7474 |
| 1.3069 | 20.6551 | 490000 | 1.1046 | 0.7480 |
| 1.3016 | 21.0766 | 500000 | 1.0994 | 0.7486 |
| 1.2977 | 21.4981 | 510000 | 1.0952 | 0.7497 |
| 1.2929 | 21.9197 | 520000 | 1.0932 | 0.7500 |
| 1.2924 | 22.3412 | 530000 | 1.0899 | 0.7505 |
| 1.2862 | 22.7627 | 540000 | 1.0887 | 0.7510 |
| 1.2853 | 23.1843 | 550000 | 1.0847 | 0.7517 |
| 1.2827 | 23.6058 | 560000 | 1.0813 | 0.7523 |
| 1.2787 | 24.0273 | 570000 | 1.0805 | 0.7524 |
| 1.276 | 24.4488 | 580000 | 1.0765 | 0.7532 |
| 1.2732 | 24.8704 | 590000 | 1.0770 | 0.7530 |
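
The log shows a long plateau followed by an abrupt improvement. As a rough illustration of how to read it, the sketch below takes a consecutive excerpt of the rows (steps 100000 to 150000, values copied from the results above), estimates the number of optimizer steps per epoch, and locates the largest drop in validation loss between neighboring evaluations. The variable names are illustrative only, and the card gives no explanation for the jump.

```python
# (step, epoch, validation_loss) for a consecutive excerpt of the log.
rows = [
    (100000, 4.2153, 5.2894),
    (110000, 4.6369, 5.2662),
    (120000, 5.0584, 5.2495),
    (130000, 5.4799, 2.4132),
    (140000, 5.9014, 1.8931),
    (150000, 6.3230, 1.7234),
]

# Steps per epoch, estimated from the last row of the excerpt.
last_step, last_epoch, _ = rows[-1]
steps_per_epoch = last_step / last_epoch

# Drop in validation loss between each pair of neighboring evaluations.
drops = [
    (prev[2] - cur[2], prev[0], cur[0])
    for prev, cur in zip(rows, rows[1:])
]
largest_drop, start_step, end_step = max(drops)

print(f"about {steps_per_epoch:.0f} steps per epoch")
print(f"largest drop: {largest_drop:.4f} between steps {start_step} and {end_step}")
```

On this excerpt the largest drop falls between steps 120000 and 130000, which matches the point where accuracy rises from about 0.17 to above 0.53.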

Framework versions

  • Transformers 4.46.1
  • PyTorch 2.2.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.1