---
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased
tags:
  - generated_from_trainer
datasets:
  - gokulsrinivasagan/processed_book_corpus-ld
metrics:
  - accuracy
model-index:
  - name: distilbert_base_train_book
    results:
      - task:
          name: Masked Language Modeling
          type: fill-mask
        dataset:
          name: gokulsrinivasagan/processed_book_corpus-ld
          type: gokulsrinivasagan/processed_book_corpus-ld
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.7294869281732683
---

# distilbert_base_train_book

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [gokulsrinivasagan/processed_book_corpus-ld](https://huggingface.co/datasets/gokulsrinivasagan/processed_book_corpus-ld) dataset. It achieves the following results on the evaluation set:

- Loss: 1.2047
- Accuracy: 0.7295
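
Since the model is trained for masked language modeling, it can be loaded with the `fill-mask` pipeline. The snippet below is a minimal usage sketch; the repo id `gokulsrinivasagan/distilbert_base_train_book` is inferred from the model name above and may differ from the actual checkpoint location.

```python
from transformers import pipeline

# Repo id assumed from the model name above; adjust if the checkpoint
# lives elsewhere. The tokenizer inherits [MASK] from distilbert-base-uncased.
fill_mask = pipeline("fill-mask", model="gokulsrinivasagan/distilbert_base_train_book")

for pred in fill_mask("The capital of France is [MASK]."):
    print(f"{pred['token_str']!r}: {pred['score']:.4f}")
```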

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a matching `TrainingArguments` sketch follows the list):

- learning_rate: 0.0001
- train_batch_size: 160
- eval_batch_size: 160
- seed: 10
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10000
- num_epochs: 25
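
For reference, the listed values map onto `TrainingArguments` roughly as follows. This is a sketch, not the exact training script: `output_dir` and any settings not listed above are assumptions, and the batch sizes are assumed to be per device.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert_base_train_book",  # assumed; not stated above
    learning_rate=1e-4,
    per_device_train_batch_size=160,  # assumed per device
    per_device_eval_batch_size=160,
    seed=10,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10000,
    num_train_epochs=25,
)
```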

### Training results

| Training Loss | Epoch   | Step   | Validation Loss | Accuracy |
|:-------------:|:-------:|:------:|:---------------:|:--------:|
| 5.6039        | 0.7025  | 10000  | 5.4506          | 0.1653   |
| 4.4684        | 1.4051  | 20000  | 3.7450          | 0.3849   |
| 2.3547        | 2.1076  | 30000  | 2.0441          | 0.5926   |
| 2.0785        | 2.8102  | 40000  | 1.7986          | 0.6309   |
| 1.938         | 3.5127  | 50000  | 1.6650          | 0.6520   |
| 1.8476        | 4.2153  | 60000  | 1.5862          | 0.6648   |
| 1.7905        | 4.9178  | 70000  | 1.5281          | 0.6746   |
| 1.74          | 5.6203  | 80000  | 1.4845          | 0.6815   |
| 1.7042        | 6.3229  | 90000  | 1.4543          | 0.6868   |
| 1.6725        | 7.0254  | 100000 | 1.4226          | 0.6917   |
| 1.6516        | 7.7280  | 110000 | 1.4016          | 0.6957   |
| 1.6269        | 8.4305  | 120000 | 1.3791          | 0.6991   |
| 1.6032        | 9.1331  | 130000 | 1.3647          | 0.7016   |
| 1.5903        | 9.8356  | 140000 | 1.3465          | 0.7051   |
| 1.5759        | 10.5381 | 150000 | 1.3326          | 0.7074   |
| 1.5641        | 11.2407 | 160000 | 1.3235          | 0.7090   |
| 1.5487        | 11.9432 | 170000 | 1.3103          | 0.7110   |
| 1.5384        | 12.6458 | 180000 | 1.2964          | 0.7133   |
| 1.527         | 13.3483 | 190000 | 1.2920          | 0.7144   |
| 1.5186        | 14.0509 | 200000 | 1.2808          | 0.7160   |
| 1.5086        | 14.7534 | 210000 | 1.2729          | 0.7174   |
| 1.4991        | 15.4560 | 220000 | 1.2637          | 0.7191   |
| 1.4936        | 16.1585 | 230000 | 1.2589          | 0.7198   |
| 1.4843        | 16.8610 | 240000 | 1.2534          | 0.7209   |
| 1.4763        | 17.5636 | 250000 | 1.2467          | 0.7219   |
| 1.4701        | 18.2661 | 260000 | 1.2408          | 0.7230   |
| 1.4668        | 18.9687 | 270000 | 1.2353          | 0.7240   |
| 1.458         | 19.6712 | 280000 | 1.2307          | 0.7249   |
| 1.4547        | 20.3738 | 290000 | 1.2251          | 0.7258   |
| 1.4466        | 21.0763 | 300000 | 1.2207          | 0.7266   |
| 1.4446        | 21.7788 | 310000 | 1.2153          | 0.7275   |
| 1.4375        | 22.4814 | 320000 | 1.2119          | 0.7281   |
| 1.4343        | 23.1839 | 330000 | 1.2086          | 0.7286   |
| 1.4325        | 23.8865 | 340000 | 1.2057          | 0.7293   |
| 1.4294        | 24.5890 | 350000 | 1.2024          | 0.7297   |
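
For context, the Accuracy column in a masked language modeling evaluation is typically token-level accuracy over masked positions only, with a label of -100 marking positions that are ignored. A minimal sketch of such a metric function, assuming predictions reach it as token ids (e.g. argmaxed beforehand via `preprocess_logits_for_metrics`); the exact wiring used for this run is not stated above.

```python
def compute_metrics(eval_pred):
    # predictions: token ids (or logits) per position; labels: -100 marks
    # positions that were not masked and should not count toward the score.
    predictions, labels = eval_pred
    if predictions.ndim == labels.ndim + 1:
        predictions = predictions.argmax(-1)  # collapse logits to token ids
    mask = labels != -100
    accuracy = (predictions[mask] == labels[mask]).mean()
    return {"accuracy": float(accuracy)}
```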

### Framework versions

- Transformers 4.46.1
- Pytorch 2.2.0+cu121
- Datasets 3.1.0
- Tokenizers 0.20.1