# OpenELM-1_1B-DPO-full-1-5
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.1836
- Rewards/chosen: -14.0
- Rewards/rejected: -17.625
- Rewards/accuracies: 0.7227
- Rewards/margins: 3.625
- Logps/rejected: -2048.0
- Logps/chosen: -1720.0
- Logits/rejected: 4.2812
- Logits/chosen: 2.625
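The reward metrics above are related: in DPO, Rewards/margins is simply the chosen reward minus the rejected reward. A minimal sketch checking that against the reported values (numbers copied from the list above):

```python
# Values copied from the evaluation results above (logged at reduced
# precision, so they round cleanly).
rewards_chosen = -14.0
rewards_rejected = -17.625

# The DPO reward margin is the gap between chosen and rejected rewards.
margin = rewards_chosen - rewards_rejected
print(margin)  # 3.625, matching the reported Rewards/margins
```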
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
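Two derived quantities follow from the list above: the effective batch size, and the shape of the learning-rate schedule. A minimal sketch (per-device values copied from the list; the schedule function is an illustration of a cosine schedule with linear warmup, not the exact Transformers implementation, and the step counts in it are examples only):

```python
import math

# Effective batch size: per-device batch x number of GPUs x gradient
# accumulation steps (values from the hyperparameter list above).
train_batch_size = 8
num_devices = 4
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # 64, matching total_train_batch_size above

# Illustrative cosine schedule with linear warmup over the first 10% of
# steps (lr_scheduler_warmup_ratio: 0.1). This is a sketch, not the code
# Transformers actually runs.
def lr_at(step, total_steps, base_lr=5e-5, warmup_ratio=0.1):
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # linear ramp to base_lr
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay to 0

print(lr_at(470, 4700))  # peak LR of 5e-05 right at the end of warmup
```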
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|
0.6268 | 0.1047 | 100 | 0.6449 | -0.4805 | -0.6680 | 0.6406 | 0.1885 | -356.0 | -366.0 | -9.5625 | -10.0 |
0.5924 | 0.2093 | 200 | 0.5985 | -1.2031 | -1.6172 | 0.6875 | 0.4199 | -450.0 | -438.0 | -12.875 | -13.125 |
0.6197 | 0.3140 | 300 | 0.5811 | -1.375 | -1.8438 | 0.7090 | 0.4668 | -474.0 | -456.0 | -11.75 | -12.1875 |
0.5968 | 0.4186 | 400 | 0.5933 | -2.3125 | -2.8438 | 0.6934 | 0.5273 | -572.0 | -548.0 | -8.5625 | -9.25 |
0.5854 | 0.5233 | 500 | 0.5737 | -1.7422 | -2.2812 | 0.6953 | 0.5352 | -516.0 | -492.0 | -7.7188 | -8.625 |
0.5524 | 0.6279 | 600 | 0.5768 | -3.0156 | -3.7031 | 0.6914 | 0.6953 | -660.0 | -620.0 | -7.0312 | -7.7188 |
0.5602 | 0.7326 | 700 | 0.5756 | -3.1562 | -3.9062 | 0.7168 | 0.75 | -680.0 | -636.0 | -5.125 | -6.3438 |
0.5581 | 0.8373 | 800 | 0.5854 | -3.3906 | -4.0312 | 0.6914 | 0.6289 | -692.0 | -656.0 | -5.0938 | -5.9688 |
0.5793 | 0.9419 | 900 | 0.5657 | -3.1719 | -3.9062 | 0.7207 | 0.7383 | -680.0 | -636.0 | -3.9531 | -5.0312 |
0.2783 | 1.0466 | 1000 | 0.6053 | -4.75 | -5.875 | 0.7188 | 1.125 | -876.0 | -792.0 | -2.2188 | -3.3594 |
0.2417 | 1.1512 | 1100 | 0.6139 | -4.7812 | -5.8125 | 0.7070 | 1.0469 | -872.0 | -796.0 | -2.3594 | -4.125 |
0.2429 | 1.2559 | 1200 | 0.5897 | -5.7188 | -6.8125 | 0.7227 | 1.0781 | -968.0 | -892.0 | -0.7188 | -2.1719 |
0.2508 | 1.3605 | 1300 | 0.5948 | -5.4062 | -6.4062 | 0.6914 | 1.0 | -928.0 | -860.0 | -0.0104 | -1.5156 |
0.2169 | 1.4652 | 1400 | 0.6104 | -5.7812 | -6.9062 | 0.7031 | 1.1016 | -976.0 | -896.0 | 0.0820 | -1.75 |
0.2107 | 1.5699 | 1500 | 0.6062 | -6.0625 | -7.2812 | 0.6973 | 1.1953 | -1016.0 | -924.0 | -0.4590 | -2.1719 |
0.2472 | 1.6745 | 1600 | 0.6158 | -5.625 | -6.7188 | 0.7070 | 1.1016 | -960.0 | -880.0 | -2.0312 | -3.9688 |
0.2545 | 1.7792 | 1700 | 0.6170 | -6.25 | -7.5 | 0.7031 | 1.25 | -1040.0 | -944.0 | -1.2578 | -3.2031 |
0.2383 | 1.8838 | 1800 | 0.6061 | -5.625 | -6.75 | 0.7012 | 1.1172 | -964.0 | -880.0 | 0.7383 | -1.1328 |
0.2107 | 1.9885 | 1900 | 0.6135 | -6.5 | -7.7812 | 0.7383 | 1.2578 | -1064.0 | -968.0 | 0.3027 | -1.4297 |
0.0186 | 2.0931 | 2000 | 0.7473 | -8.0625 | -9.875 | 0.7090 | 1.8594 | -1280.0 | -1120.0 | 2.2812 | 0.4980 |
0.03 | 2.1978 | 2100 | 0.8345 | -9.9375 | -12.25 | 0.7070 | 2.2812 | -1512.0 | -1312.0 | 3.2031 | 1.5938 |
0.0284 | 2.3025 | 2200 | 0.7741 | -9.1875 | -11.3125 | 0.7012 | 2.0781 | -1416.0 | -1240.0 | 2.7812 | 1.0156 |
0.0352 | 2.4071 | 2300 | 0.7983 | -9.3125 | -11.3125 | 0.7090 | 2.0156 | -1424.0 | -1248.0 | 2.6406 | 0.9961 |
0.0345 | 2.5118 | 2400 | 0.8249 | -9.8125 | -12.0 | 0.7266 | 2.1719 | -1488.0 | -1304.0 | 3.2656 | 1.5625 |
0.0192 | 2.6164 | 2500 | 0.8865 | -10.25 | -12.5625 | 0.6973 | 2.2969 | -1544.0 | -1344.0 | 3.5938 | 1.9609 |
0.0261 | 2.7211 | 2600 | 0.7963 | -9.1875 | -11.4375 | 0.7129 | 2.25 | -1432.0 | -1240.0 | 2.7031 | 0.8672 |
0.0315 | 2.8257 | 2700 | 0.7619 | -9.0 | -10.9375 | 0.7109 | 1.9766 | -1384.0 | -1216.0 | 2.8594 | 0.8320 |
0.0293 | 2.9304 | 2800 | 0.8241 | -9.75 | -12.0625 | 0.7070 | 2.2656 | -1496.0 | -1296.0 | 3.1719 | 1.3359 |
0.0071 | 3.0351 | 2900 | 0.8609 | -10.0625 | -12.5 | 0.7188 | 2.3906 | -1536.0 | -1328.0 | 3.1719 | 1.3125 |
0.0099 | 3.1397 | 3000 | 0.9558 | -11.5 | -14.1875 | 0.7051 | 2.6875 | -1704.0 | -1472.0 | 3.4062 | 1.6484 |
0.0079 | 3.2444 | 3100 | 0.9341 | -11.125 | -13.75 | 0.7090 | 2.6562 | -1664.0 | -1432.0 | 3.25 | 1.5078 |
0.0104 | 3.3490 | 3200 | 0.9926 | -11.9375 | -14.8125 | 0.7090 | 2.9062 | -1768.0 | -1512.0 | 3.6719 | 1.9922 |
0.0089 | 3.4537 | 3300 | 0.9665 | -11.9375 | -14.8125 | 0.7188 | 2.875 | -1768.0 | -1512.0 | 3.8594 | 2.2656 |
0.0098 | 3.5583 | 3400 | 0.9548 | -11.1875 | -13.875 | 0.7109 | 2.75 | -1680.0 | -1432.0 | 4.0 | 2.3438 |
0.0109 | 3.6630 | 3500 | 1.0670 | -12.5625 | -15.6875 | 0.7168 | 3.1406 | -1856.0 | -1576.0 | 4.1875 | 2.5312 |
0.0081 | 3.7677 | 3600 | 1.0376 | -12.375 | -15.4375 | 0.7188 | 3.0938 | -1832.0 | -1552.0 | 4.125 | 2.4844 |
0.0081 | 3.8723 | 3700 | 1.0725 | -13.0 | -16.25 | 0.7168 | 3.25 | -1912.0 | -1616.0 | 4.1875 | 2.5938 |
0.0041 | 3.9770 | 3800 | 1.1346 | -13.5 | -17.0 | 0.7188 | 3.4688 | -1984.0 | -1672.0 | 4.2188 | 2.5781 |
0.0036 | 4.0816 | 3900 | 1.1589 | -13.8125 | -17.375 | 0.7168 | 3.5156 | -2024.0 | -1696.0 | 4.25 | 2.625 |
0.0016 | 4.1863 | 4000 | 1.1790 | -14.0625 | -17.625 | 0.7168 | 3.5781 | -2048.0 | -1720.0 | 4.2812 | 2.6719 |
0.0037 | 4.2909 | 4100 | 1.1847 | -14.0625 | -17.625 | 0.7168 | 3.6094 | -2064.0 | -1728.0 | 4.3125 | 2.6562 |
0.007 | 4.3956 | 4200 | 1.1905 | -14.1875 | -17.75 | 0.7227 | 3.6406 | -2064.0 | -1736.0 | 4.3125 | 2.6719 |
0.0038 | 4.5003 | 4300 | 1.1835 | -14.0625 | -17.75 | 0.7207 | 3.6406 | -2064.0 | -1728.0 | 4.2812 | 2.6406 |
0.0093 | 4.6049 | 4400 | 1.1819 | -14.0625 | -17.625 | 0.7207 | 3.625 | -2048.0 | -1720.0 | 4.2812 | 2.625 |
0.006 | 4.7096 | 4500 | 1.1817 | -14.0 | -17.625 | 0.7227 | 3.6406 | -2048.0 | -1720.0 | 4.2812 | 2.6094 |
0.0037 | 4.8142 | 4600 | 1.1826 | -14.0 | -17.625 | 0.7227 | 3.6406 | -2048.0 | -1720.0 | 4.25 | 2.6094 |
0.0059 | 4.9189 | 4700 | 1.1836 | -14.0 | -17.625 | 0.7227 | 3.625 | -2048.0 | -1720.0 | 4.2812 | 2.625 |
### Framework versions
- Transformers 4.44.2
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.19.1