Adapting `google-bert/bert-large-uncased` for `swag`.
metadata
base_model: google-bert/bert-large-uncased
library_name: peft
license: apache-2.0
metrics:
  - accuracy
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: bert-large-uncased-swag
    results: []

bert-large-uncased-swag

This model is a fine-tuned version of google-bert/bert-large-uncased, trained with PEFT. The card generator recorded the dataset as unknown, but the model name indicates `swag`. It achieves the following results on the evaluation set:

  • Loss: 0.4643
  • Accuracy: 0.8295
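As a rough illustration of what the accuracy figure measures, here is a minimal sketch that assumes the usual SWAG setup, where each example has four candidate endings and the prediction is the ending with the highest score. The function name and the toy scores are hypothetical and are not taken from the training code.

```python
from typing import Sequence


def multiple_choice_accuracy(
    logits: Sequence[Sequence[float]], labels: Sequence[int]
) -> float:
    """Fraction of examples whose highest-scoring ending matches the label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    correct = 0
    for scores, label in zip(logits, labels):
        # Index of the largest score, i.e. the predicted ending.
        predicted = max(range(len(scores)), key=lambda i: scores[i])
        correct += int(predicted == label)
    return correct / len(labels)


# Made-up scores for three examples with four candidate endings each.
toy_logits = [
    [0.1, 2.3, -0.5, 0.0],  # predicts ending 1
    [1.7, 0.2, 0.3, 0.1],   # predicts ending 0
    [0.0, 0.1, 0.2, 0.9],   # predicts ending 3
]
toy_labels = [1, 0, 2]
accuracy = multiple_choice_accuracy(toy_logits, toy_labels)
```

With four endings per example, uniform random guessing would score about 0.25, which gives some context for the 0.8295 reported above.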

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
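The `linear` scheduler with the listed learning rate can be sketched in plain Python. This is an approximation under two assumptions: no warmup (none is listed in the card), and a steps-per-epoch count estimated from the last logged row of the training results (step 22500 at epoch 4.8945). `linear_lr` is a hypothetical helper, not part of any library.

```python
BASE_LR = 5e-05
NUM_EPOCHS = 5

# Assumption: steps per epoch estimated from the last logged row
# (step 22500 at epoch 4.8945); the card does not state it directly.
STEPS_PER_EPOCH = round(22500 / 4.8945)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(
    step: int,
    total_steps: int = TOTAL_STEPS,
    base_lr: float = BASE_LR,
    warmup_steps: int = 0,  # assumption: the card lists no warmup
) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * (step / warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Learning rate at the start, at the halfway point, and at the end of training.
lr_start = linear_lr(0)
lr_middle = linear_lr(TOTAL_STEPS // 2)
lr_end = linear_lr(TOTAL_STEPS)
```

Under these assumptions the rate starts at 5e-05, falls to roughly half that value midway through training, and reaches zero at the final step.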

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Accuracy |
|---------------|--------|-------|-----------------|----------|
| 1.2132        | 0.1088 | 500   | 0.8717          | 0.6959   |
| 0.908         | 0.2175 | 1000  | 0.7149          | 0.7473   |
| 0.8353        | 0.3263 | 1500  | 0.6474          | 0.7575   |
| 0.8075        | 0.4351 | 2000  | 0.6142          | 0.7798   |
| 0.8011        | 0.5438 | 2500  | 0.5785          | 0.7867   |
| 0.7727        | 0.6526 | 3000  | 0.5643          | 0.7936   |
| 0.7647        | 0.7614 | 3500  | 0.5698          | 0.7956   |
| 0.7731        | 0.8701 | 4000  | 0.5453          | 0.8011   |
| 0.7489        | 0.9789 | 4500  | 0.5336          | 0.8052   |
| 0.7496        | 1.0877 | 5000  | 0.5431          | 0.8033   |
| 0.735         | 1.1964 | 5500  | 0.5231          | 0.8083   |
| 0.7194        | 1.3052 | 6000  | 0.5147          | 0.8096   |
| 0.7307        | 1.4140 | 6500  | 0.5102          | 0.8112   |
| 0.7355        | 1.5227 | 7000  | 0.5223          | 0.8133   |
| 0.7085        | 1.6315 | 7500  | 0.5054          | 0.8142   |
| 0.7206        | 1.7403 | 8000  | 0.5026          | 0.8157   |
| 0.7143        | 1.8490 | 8500  | 0.5126          | 0.8144   |
| 0.7045        | 1.9578 | 9000  | 0.5035          | 0.8162   |
| 0.6972        | 2.0666 | 9500  | 0.4948          | 0.8178   |
| 0.6885        | 2.1753 | 10000 | 0.4890          | 0.8202   |
| 0.7079        | 2.2841 | 10500 | 0.4910          | 0.8193   |
| 0.6874        | 2.3929 | 11000 | 0.4907          | 0.8222   |
| 0.6832        | 2.5016 | 11500 | 0.4875          | 0.8217   |
| 0.6807        | 2.6104 | 12000 | 0.4824          | 0.8224   |
| 0.6865        | 2.7192 | 12500 | 0.4877          | 0.8227   |
| 0.6863        | 2.8279 | 13000 | 0.4821          | 0.8232   |
| 0.6913        | 2.9367 | 13500 | 0.4914          | 0.8229   |
| 0.6996        | 3.0455 | 14000 | 0.4843          | 0.8241   |
| 0.687         | 3.1542 | 14500 | 0.4753          | 0.8250   |
| 0.6896        | 3.2630 | 15000 | 0.4762          | 0.8251   |
| 0.6745        | 3.3718 | 15500 | 0.4753          | 0.8242   |
| 0.6735        | 3.4805 | 16000 | 0.4713          | 0.8267   |
| 0.6764        | 3.5893 | 16500 | 0.4715          | 0.8259   |
| 0.6521        | 3.6981 | 17000 | 0.4669          | 0.8285   |
| 0.6686        | 3.8068 | 17500 | 0.4726          | 0.8269   |
| 0.6721        | 3.9156 | 18000 | 0.4703          | 0.8273   |
| 0.6682        | 4.0244 | 18500 | 0.4660          | 0.8274   |
| 0.6533        | 4.1331 | 19000 | 0.4690          | 0.8281   |
| 0.6547        | 4.2419 | 19500 | 0.4697          | 0.8282   |
| 0.6589        | 4.3507 | 20000 | 0.4640          | 0.8291   |
| 0.6518        | 4.4594 | 20500 | 0.4638          | 0.8294   |
| 0.6739        | 4.5682 | 21000 | 0.4669          | 0.8285   |
| 0.6763        | 4.6770 | 21500 | 0.4628          | 0.8304   |
| 0.6503        | 4.7857 | 22000 | 0.4640          | 0.8296   |
| 0.6659        | 4.8945 | 22500 | 0.4643          | 0.8295   |
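
The logged rows can be scanned to find the strongest evaluation step. The sketch below copies only the last five rows of the results above; `EvalRow` is a hypothetical record type introduced for illustration, and the same selection works over the full log.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    validation_loss: float
    accuracy: float


# Last five logged rows, copied from the training results.
rows = [
    EvalRow(0.6518, 4.4594, 20500, 0.4638, 0.8294),
    EvalRow(0.6739, 4.5682, 21000, 0.4669, 0.8285),
    EvalRow(0.6763, 4.6770, 21500, 0.4628, 0.8304),
    EvalRow(0.6503, 4.7857, 22000, 0.4640, 0.8296),
    EvalRow(0.6659, 4.8945, 22500, 0.4643, 0.8295),
]

# Pick the row with the highest accuracy and the row with the lowest validation loss.
best_by_accuracy = max(rows, key=lambda r: r.accuracy)
best_by_loss = min(rows, key=lambda r: r.validation_loss)

print(f"best accuracy: {best_by_accuracy.accuracy} at step {best_by_accuracy.step}")
print(f"lowest validation loss: {best_by_loss.validation_loss} at step {best_by_loss.step}")
```

Across the whole log, step 21500 has both the lowest validation loss (0.4628) and the highest accuracy (0.8304), slightly ahead of the final evaluation at step 22500 (0.4643 and 0.8295), which is the result reported at the top of the card.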

Framework versions

  • PEFT 0.12.1.dev0
  • Transformers 4.45.0.dev0
  • Pytorch 2.3.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1