distilbert_lda_v1_book

This model is a fine-tuned version of an unspecified base model (the card does not name it), trained on the gokulsrinivasagan/processed_book_corpus-ld dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1051
  • Accuracy: 0.7315

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 96
  • eval_batch_size: 96
  • seed: 10
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10000
  • num_epochs: 25
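
The linear scheduler with 10000 warmup steps can be illustrated with a small pure-Python sketch. It assumes the usual shape of a linear schedule: the learning rate rises linearly from 0 to the peak value during warmup, then decays linearly to 0 at the end of training. The function name `linear_schedule_lr` is hypothetical, and `total_steps=593_000` is an assumed round figure (roughly 25 epochs at about 23,700 steps per epoch, estimated from the training results); the card does not state the exact total.

```python
def linear_schedule_lr(step, base_lr=1e-4, warmup_steps=10_000, total_steps=593_000):
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    base_lr and warmup_steps come from the hyperparameter list;
    total_steps is an assumption, not a value reported in the card.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 5_000, 10_000, 300_000, 590_000):
        print(f"step {step:>7}: lr = {linear_schedule_lr(step):.2e}")
```

Under these assumptions, the first logged checkpoint (step 10000) coincides with the end of warmup, which is consistent with the large drop in loss between the first and second rows of the training results.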

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 7.6644 | 0.4215 | 10000 | 7.4872 | 0.1609 |
| 4.6569 | 0.8431 | 20000 | 4.2985 | 0.5609 |
| 4.2453 | 1.2646 | 30000 | 3.9352 | 0.6094 |
| 4.0748 | 1.6861 | 40000 | 3.7740 | 0.6315 |
| 3.9464 | 2.1077 | 50000 | 3.6555 | 0.6453 |
| 3.8728 | 2.5292 | 60000 | 3.5809 | 0.6566 |
| 3.814 | 2.9507 | 70000 | 3.5364 | 0.6635 |
| 3.771 | 3.3723 | 80000 | 3.4922 | 0.6700 |
| 3.735 | 3.7938 | 90000 | 3.4582 | 0.6753 |
| 3.7016 | 4.2153 | 100000 | 3.4345 | 0.6790 |
| 3.681 | 4.6369 | 110000 | 3.4123 | 0.6824 |
| 3.6573 | 5.0584 | 120000 | 3.3854 | 0.6861 |
| 3.6373 | 5.4799 | 130000 | 3.3676 | 0.6889 |
| 3.6238 | 5.9014 | 140000 | 3.3501 | 0.6915 |
| 3.6004 | 6.3230 | 150000 | 3.3354 | 0.6939 |
| 3.5931 | 6.7445 | 160000 | 3.3241 | 0.6959 |
| 3.5703 | 7.1660 | 170000 | 3.3077 | 0.6986 |
| 3.5616 | 7.5876 | 180000 | 3.3021 | 0.6993 |
| 3.5502 | 8.0091 | 190000 | 3.2892 | 0.7014 |
| 3.5388 | 8.4306 | 200000 | 3.2785 | 0.7033 |
| 3.5264 | 8.8522 | 210000 | 3.2708 | 0.7046 |
| 3.5212 | 9.2737 | 220000 | 3.2598 | 0.7061 |
| 3.5045 | 9.6952 | 230000 | 3.2526 | 0.7073 |
| 3.4939 | 10.1168 | 240000 | 3.2483 | 0.7087 |
| 3.4934 | 10.5383 | 250000 | 3.2361 | 0.7101 |
| 3.4833 | 10.9598 | 260000 | 3.2301 | 0.7111 |
| 3.4747 | 11.3814 | 270000 | 3.2252 | 0.7120 |
| 3.4753 | 11.8029 | 280000 | 3.2172 | 0.7129 |
| 3.46 | 12.2244 | 290000 | 3.2102 | 0.7141 |
| 3.457 | 12.6460 | 300000 | 3.2041 | 0.7154 |
| 3.4464 | 13.0675 | 310000 | 3.1984 | 0.7163 |
| 3.4446 | 13.4890 | 320000 | 3.1933 | 0.7171 |
| 3.4398 | 13.9106 | 330000 | 3.1897 | 0.7174 |
| 3.436 | 14.3321 | 340000 | 3.1838 | 0.7185 |
| 3.4289 | 14.7536 | 350000 | 3.1784 | 0.7193 |
| 3.4223 | 15.1751 | 360000 | 3.1748 | 0.7198 |
| 3.4187 | 15.5967 | 370000 | 3.1676 | 0.7208 |
| 3.414 | 16.0182 | 380000 | 3.1651 | 0.7216 |
| 3.409 | 16.4397 | 390000 | 3.1609 | 0.7222 |
| 3.4022 | 16.8613 | 400000 | 3.1584 | 0.7226 |
| 3.4019 | 17.2828 | 410000 | 3.1511 | 0.7238 |
| 3.395 | 17.7043 | 420000 | 3.1483 | 0.7241 |
| 3.3878 | 18.1259 | 430000 | 3.1473 | 0.7248 |
| 3.3833 | 18.5474 | 440000 | 3.1439 | 0.7250 |
| 3.3828 | 18.9689 | 450000 | 3.1381 | 0.7260 |
| 3.3795 | 19.3905 | 460000 | 3.1349 | 0.7265 |
| 3.3746 | 19.8120 | 470000 | 3.1318 | 0.7272 |
| 3.3704 | 20.2335 | 480000 | 3.1287 | 0.7275 |
| 3.366 | 20.6551 | 490000 | 3.1248 | 0.7283 |
| 3.3621 | 21.0766 | 500000 | 3.1214 | 0.7286 |
| 3.3582 | 21.4981 | 510000 | 3.1189 | 0.7291 |
| 3.3547 | 21.9197 | 520000 | 3.1174 | 0.7294 |
| 3.3561 | 22.3412 | 530000 | 3.1152 | 0.7298 |
| 3.3516 | 22.7627 | 540000 | 3.1145 | 0.7300 |
| 3.3517 | 23.1843 | 550000 | 3.1110 | 0.7303 |
| 3.349 | 23.6058 | 560000 | 3.1087 | 0.7309 |
| 3.3446 | 24.0273 | 570000 | 3.1080 | 0.7311 |
| 3.342 | 24.4488 | 580000 | 3.1042 | 0.7317 |
| 3.3397 | 24.8704 | 590000 | 3.1048 | 0.7314 |
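
The Epoch and Step columns above can be used to estimate the number of optimizer steps per epoch, and from that a rough count of training sequences per epoch. The sketch below does this arithmetic for the first and last logged rows. It assumes that train_batch_size: 96 is the effective global batch size (no gradient accumulation and no additional per-device multiplier), which the card does not confirm; the variable and function names are illustrative only.

```python
# (epoch, step) pairs copied from the first and last rows of the results table.
checkpoints = [(0.4215, 10_000), (24.8704, 590_000)]

# Assumption: the reported train_batch_size is the effective global batch size.
TRAIN_BATCH_SIZE = 96


def steps_per_epoch(epoch, step):
    """Optimizer steps in one epoch, inferred from a logged (epoch, step) pair."""
    return step / epoch


estimates = [steps_per_epoch(epoch, step) for epoch, step in checkpoints]
avg_steps = sum(estimates) / len(estimates)
examples_per_epoch = avg_steps * TRAIN_BATCH_SIZE

print(f"steps per epoch (per row): {[round(e, 1) for e in estimates]}")
print(f"average steps per epoch:   {avg_steps:.1f}")
print(f"approx. sequences/epoch:   {examples_per_epoch:,.0f}")
```

Both rows give a consistent figure of a little over 23,700 steps per epoch, which under the stated batch-size assumption corresponds to roughly 2.3 million training sequences per epoch.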

Framework versions

  • Transformers 4.46.1
  • Pytorch 2.2.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.1

Model size

  • Parameters: 67.6M
  • Tensor type: F32
  • Weights format: Safetensors