gpt-2-alzh
---
license: mit
base_model: gpt2
tags:
  - generated_from_trainer
model-index:
  - name: custom_model
    results: []
---

# custom_model

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 1.7420
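For a causal language model, cross-entropy loss converts directly to perplexity via `exp(loss)`. A quick check of what the evaluation loss above implies:

```python
import math

# Perplexity is the exponential of the cross-entropy loss.
eval_loss = 1.7420  # evaluation loss reported above
perplexity = math.exp(eval_loss)
print(f"eval perplexity: {perplexity:.2f}")  # ~5.71
```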

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
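As a sanity check, the step counts in the results table below follow from these hyperparameters: 369 optimizer steps per epoch over 50 epochs gives 18,450 total steps, and with a train batch size of 4 that implies roughly 1,476 training examples (assuming no gradient accumulation, which the card does not mention):

```python
# Hedged reconstruction of the training schedule from the listed
# hyperparameters; gradient accumulation is assumed to be 1 (not stated).
train_batch_size = 4
num_epochs = 50
steps_per_epoch = 369          # from the results table: step 369 at epoch 1.0

total_steps = steps_per_epoch * num_epochs
approx_train_examples = steps_per_epoch * train_batch_size

print(total_steps)             # 18450, matching the final step in the table
print(approx_train_examples)   # 1476
```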

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 1.4865        | 1.0   | 369   | 1.3129          |
| 1.2021        | 2.0   | 738   | 1.1809          |
| 1.0322        | 3.0   | 1107  | 1.1234          |
| 0.9008        | 4.0   | 1476  | 1.0991          |
| 0.8134        | 5.0   | 1845  | 1.0827          |
| 0.7293        | 6.0   | 2214  | 1.0923          |
| 0.6539        | 7.0   | 2583  | 1.0942          |
| 0.5962        | 8.0   | 2952  | 1.1175          |
| 0.546         | 9.0   | 3321  | 1.1365          |
| 0.4915        | 10.0  | 3690  | 1.1490          |
| 0.4523        | 11.0  | 4059  | 1.1860          |
| 0.4204        | 12.0  | 4428  | 1.1977          |
| 0.3831        | 13.0  | 4797  | 1.2311          |
| 0.357         | 14.0  | 5166  | 1.2499          |
| 0.3378        | 15.0  | 5535  | 1.2674          |
| 0.3203        | 16.0  | 5904  | 1.2902          |
| 0.2943        | 17.0  | 6273  | 1.3226          |
| 0.2796        | 18.0  | 6642  | 1.3355          |
| 0.2679        | 19.0  | 7011  | 1.3618          |
| 0.2479        | 20.0  | 7380  | 1.3775          |
| 0.2361        | 21.0  | 7749  | 1.3995          |
| 0.2274        | 22.0  | 8118  | 1.4151          |
| 0.2102        | 23.0  | 8487  | 1.4315          |
| 0.1994        | 24.0  | 8856  | 1.4490          |
| 0.1943        | 25.0  | 9225  | 1.4714          |
| 0.1777        | 26.0  | 9594  | 1.4906          |
| 0.1697        | 27.0  | 9963  | 1.5078          |
| 0.1602        | 28.0  | 10332 | 1.5293          |
| 0.1497        | 29.0  | 10701 | 1.5457          |
| 0.1403        | 30.0  | 11070 | 1.5652          |
| 0.1315        | 31.0  | 11439 | 1.5814          |
| 0.124         | 32.0  | 11808 | 1.5987          |
| 0.1142        | 33.0  | 12177 | 1.6151          |
| 0.1057        | 34.0  | 12546 | 1.6354          |
| 0.1002        | 35.0  | 12915 | 1.6508          |
| 0.093         | 36.0  | 13284 | 1.6641          |
| 0.0867        | 37.0  | 13653 | 1.6808          |
| 0.081         | 38.0  | 14022 | 1.6866          |
| 0.076         | 39.0  | 14391 | 1.7061          |
| 0.0716        | 40.0  | 14760 | 1.7150          |
| 0.067         | 41.0  | 15129 | 1.7232          |
| 0.0638        | 42.0  | 15498 | 1.7322          |
| 0.0598        | 43.0  | 15867 | 1.7388          |
| 0.0575        | 44.0  | 16236 | 1.7446          |
| 0.0539        | 45.0  | 16605 | 1.7524          |
| 0.0525        | 46.0  | 16974 | 1.7580          |
| 0.0505        | 47.0  | 17343 | 1.7609          |
| 0.0479        | 48.0  | 17712 | 1.7612          |
| 0.0473        | 49.0  | 18081 | 1.7642          |
| 0.0462        | 50.0  | 18450 | 1.7644          |
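Note that validation loss bottoms out early and then rises steadily while training loss keeps falling, a classic overfitting pattern; the best checkpoint by validation loss is at epoch 5. A minimal sketch that picks it out of the (epoch, validation loss) pairs above:

```python
# (epoch, validation_loss) pairs for the first ten epochs of the table above;
# validation loss only increases beyond this point.
val_losses = [
    (1, 1.3129), (2, 1.1809), (3, 1.1234), (4, 1.0991), (5, 1.0827),
    (6, 1.0923), (7, 1.0942), (8, 1.1175), (9, 1.1365), (10, 1.1490),
]

# Select the epoch with the lowest validation loss.
best_epoch, best_loss = min(val_losses, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # 5 1.0827
```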

### Framework versions

- Transformers 4.39.3
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2