gpt2-finetuned-justification-v1

This model is a fine-tuned version of a base model that this card does not name, trained on an unspecified dataset (recorded as None). It achieves the following results on the evaluation set:

  • Loss: 0.4104
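
For readers who prefer perplexity, the loss above can be converted with a one-line formula. This is a minimal sketch that assumes the reported value is a mean per-token cross-entropy in nats, which is typical for causal language models but is not stated on this card. The function name `perplexity` is illustrative.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


final_eval_loss = 0.4104  # evaluation loss reported on this card
print(f"perplexity: {perplexity(final_eval_loss):.3f}")
```

Under that assumption the evaluation perplexity is roughly 1.5.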

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
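
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (none is listed above) and a decay to zero over 67,600 total steps, the final step count in the results table. The function `linear_lr` is a hypothetical helper written for illustration, not part of any library.

```python
def linear_lr(step: int,
              base_lr: float = 5e-05,
              total_steps: int = 67_600,
              warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero.

    warmup_steps=0 is an assumption: the card lists no warmup setting.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 33_800, 67_600):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5e-05, is halved by the end of epoch 50, and reaches zero at the last step.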

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.2403 | 1.0 | 676 | 0.1991 |
| 0.1824 | 2.0 | 1352 | 0.1990 |
| 0.1366 | 3.0 | 2028 | 0.2091 |
| 0.1098 | 4.0 | 2704 | 0.2222 |
| 0.0997 | 5.0 | 3380 | 0.2386 |
| 0.0724 | 6.0 | 4056 | 0.2535 |
| 0.0608 | 7.0 | 4732 | 0.2694 |
| 0.0516 | 8.0 | 5408 | 0.2861 |
| 0.0409 | 9.0 | 6084 | 0.2941 |
| 0.0356 | 10.0 | 6760 | 0.3040 |
| 0.0319 | 11.0 | 7436 | 0.3124 |
| 0.0265 | 12.0 | 8112 | 0.3184 |
| 0.0242 | 13.0 | 8788 | 0.3235 |
| 0.0225 | 14.0 | 9464 | 0.3261 |
| 0.0197 | 15.0 | 10140 | 0.3330 |
| 0.0183 | 16.0 | 10816 | 0.3372 |
| 0.0185 | 17.0 | 11492 | 0.3410 |
| 0.0157 | 18.0 | 12168 | 0.3394 |
| 0.0155 | 19.0 | 12844 | 0.3468 |
| 0.0147 | 20.0 | 13520 | 0.3522 |
| 0.0135 | 21.0 | 14196 | 0.3532 |
| 0.0135 | 22.0 | 14872 | 0.3538 |
| 0.0125 | 23.0 | 15548 | 0.3605 |
| 0.0123 | 24.0 | 16224 | 0.3594 |
| 0.012 | 25.0 | 16900 | 0.3635 |
| 0.0116 | 26.0 | 17576 | 0.3649 |
| 0.0114 | 27.0 | 18252 | 0.3665 |
| 0.011 | 28.0 | 18928 | 0.3685 |
| 0.0108 | 29.0 | 19604 | 0.3689 |
| 0.0108 | 30.0 | 20280 | 0.3724 |
| 0.0103 | 31.0 | 20956 | 0.3719 |
| 0.0102 | 32.0 | 21632 | 0.3717 |
| 0.01 | 33.0 | 22308 | 0.3764 |
| 0.0102 | 34.0 | 22984 | 0.3751 |
| 0.0094 | 35.0 | 23660 | 0.3787 |
| 0.0099 | 36.0 | 24336 | 0.3789 |
| 0.0096 | 37.0 | 25012 | 0.3857 |
| 0.0094 | 38.0 | 25688 | 0.3825 |
| 0.0093 | 39.0 | 26364 | 0.3831 |
| 0.0091 | 40.0 | 27040 | 0.3878 |
| 0.0091 | 41.0 | 27716 | 0.3857 |
| 0.0089 | 42.0 | 28392 | 0.3863 |
| 0.0089 | 43.0 | 29068 | 0.3878 |
| 0.0089 | 44.0 | 29744 | 0.3895 |
| 0.0087 | 45.0 | 30420 | 0.3885 |
| 0.0088 | 46.0 | 31096 | 0.3900 |
| 0.0084 | 47.0 | 31772 | 0.3930 |
| 0.0087 | 48.0 | 32448 | 0.3916 |
| 0.0084 | 49.0 | 33124 | 0.3907 |
| 0.0083 | 50.0 | 33800 | 0.3922 |
| 0.0083 | 51.0 | 34476 | 0.3937 |
| 0.0082 | 52.0 | 35152 | 0.3934 |
| 0.0082 | 53.0 | 35828 | 0.3976 |
| 0.0081 | 54.0 | 36504 | 0.3959 |
| 0.008 | 55.0 | 37180 | 0.3996 |
| 0.0079 | 56.0 | 37856 | 0.3999 |
| 0.0079 | 57.0 | 38532 | 0.3997 |
| 0.0079 | 58.0 | 39208 | 0.4024 |
| 0.0078 | 59.0 | 39884 | 0.4027 |
| 0.0079 | 60.0 | 40560 | 0.3980 |
| 0.0077 | 61.0 | 41236 | 0.4019 |
| 0.0077 | 62.0 | 41912 | 0.4019 |
| 0.0078 | 63.0 | 42588 | 0.4020 |
| 0.0076 | 64.0 | 43264 | 0.4062 |
| 0.0077 | 65.0 | 43940 | 0.4041 |
| 0.0077 | 66.0 | 44616 | 0.4011 |
| 0.0076 | 67.0 | 45292 | 0.4029 |
| 0.0075 | 68.0 | 45968 | 0.4046 |
| 0.0074 | 69.0 | 46644 | 0.4043 |
| 0.0075 | 70.0 | 47320 | 0.4066 |
| 0.0075 | 71.0 | 47996 | 0.4055 |
| 0.0074 | 72.0 | 48672 | 0.4064 |
| 0.0075 | 73.0 | 49348 | 0.4089 |
| 0.0074 | 74.0 | 50024 | 0.4089 |
| 0.0072 | 75.0 | 50700 | 0.4087 |
| 0.0073 | 76.0 | 51376 | 0.4066 |
| 0.0073 | 77.0 | 52052 | 0.4035 |
| 0.0072 | 78.0 | 52728 | 0.4050 |
| 0.0072 | 79.0 | 53404 | 0.4059 |
| 0.0071 | 80.0 | 54080 | 0.4104 |
| 0.0071 | 81.0 | 54756 | 0.4095 |
| 0.0072 | 82.0 | 55432 | 0.4081 |
| 0.0072 | 83.0 | 56108 | 0.4095 |
| 0.0071 | 84.0 | 56784 | 0.4092 |
| 0.007 | 85.0 | 57460 | 0.4099 |
| 0.007 | 86.0 | 58136 | 0.4070 |
| 0.007 | 87.0 | 58812 | 0.4070 |
| 0.007 | 88.0 | 59488 | 0.4057 |
| 0.0069 | 89.0 | 60164 | 0.4090 |
| 0.0069 | 90.0 | 60840 | 0.4106 |
| 0.007 | 91.0 | 61516 | 0.4096 |
| 0.0069 | 92.0 | 62192 | 0.4106 |
| 0.0069 | 93.0 | 62868 | 0.4101 |
| 0.0069 | 94.0 | 63544 | 0.4099 |
| 0.0068 | 95.0 | 64220 | 0.4104 |
| 0.0068 | 96.0 | 64896 | 0.4106 |
| 0.0068 | 97.0 | 65572 | 0.4102 |
| 0.0067 | 98.0 | 66248 | 0.4102 |
| 0.0067 | 99.0 | 66924 | 0.4104 |
| 0.0067 | 100.0 | 67600 | 0.4104 |
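
The results above can be summarized with a short script. The sketch below copies a handful of rows from the table (the first five epochs and the last) and finds the epoch with the lowest validation loss, along with the number of optimizer steps per epoch. The variable names are illustrative.

```python
# (training_loss, epoch, step, validation_loss), copied from the results table.
rows = [
    (0.2403, 1, 676, 0.1991),
    (0.1824, 2, 1352, 0.1990),
    (0.1366, 3, 2028, 0.2091),
    (0.1098, 4, 2704, 0.2222),
    (0.0997, 5, 3380, 0.2386),
    (0.0067, 100, 67600, 0.4104),
]

# Epoch with the lowest validation loss among the rows listed here.
best = min(rows, key=lambda r: r[3])

# Optimizer steps per epoch, derived from the final row.
final = rows[-1]
steps_per_epoch = final[2] // final[1]

print(f"best epoch: {best[1]} (validation loss {best[3]:.4f})")
print(f"final epoch: {final[1]} (validation loss {final[3]:.4f})")
print(f"steps per epoch: {steps_per_epoch}")
```

In the full table the validation loss is lowest at epoch 2 (0.1990) and rises to 0.4104 by epoch 100 while the training loss keeps falling, a pattern consistent with overfitting. With a training batch size of 1, 676 steps per epoch would correspond to about 676 training examples, assuming a single device and no gradient accumulation, neither of which the card states.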

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.2.2+cu121
  • Datasets 2.16.0
  • Tokenizers 0.15.2

Model size: 301M params (Safetensors, tensor type F32)