results

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on an unspecified dataset (recorded as "None" in the training metadata). It achieves the following results on the evaluation set:

  • Loss: 2.2477
  • Rewards/chosen: -0.2025
  • Rewards/rejected: -0.2831
  • Rewards/accuracies: 0.8875
  • Rewards/margins: 0.0806
  • Logps/rejected: -2.8313
  • Logps/chosen: -2.0249
  • Logits/rejected: -2.1125
  • Logits/chosen: -1.7341
  • Nll Loss: 2.2267
  • Log Odds Ratio: -0.3842
  • Log Odds Chosen: 0.8874
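
The reported values are related to one another, and a short check makes that visible. The sketch below uses only the numbers listed above. The metric names (Nll Loss, Log Odds Ratio) resemble those logged by odds-ratio preference training, but the card does not state the training method, so treat that reading as an assumption. The scale of about 0.1 between rewards and log-probabilities is inferred from the numbers, not documented.

```python
# Final evaluation metrics copied from the list above.
final_eval = {
    "rewards/chosen": -0.2025,
    "rewards/rejected": -0.2831,
    "rewards/margins": 0.0806,
    "logps/chosen": -2.0249,
    "logps/rejected": -2.8313,
}

# The margin is the gap between the chosen and rejected rewards.
margin = final_eval["rewards/chosen"] - final_eval["rewards/rejected"]

# Ratio of reward to log-probability; both come out near 0.1, which would be
# consistent with a scaling coefficient of 0.1 (inferred, not stated in the card).
scale_chosen = final_eval["rewards/chosen"] / final_eval["logps/chosen"]
scale_rejected = final_eval["rewards/rejected"] / final_eval["logps/rejected"]

print(f"margin         = {margin:.4f}")
print(f"scale_chosen   = {scale_chosen:.4f}")
print(f"scale_rejected = {scale_rejected:.4f}")
```

A positive margin means the model assigns a higher reward to the chosen responses than to the rejected ones on average, which is in line with the reported accuracy of 0.8875.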

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 10
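
To illustrate what the linear scheduler with 50 warmup steps does to the learning rate, here is a minimal pure-Python sketch of a linear warmup followed by a linear decay to zero. The total of 1720 optimizer steps is an assumption inferred from the training results (step 50 corresponds to epoch 0.2907, which suggests about 172 steps per epoch over 10 epochs); the function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=1e-6, warmup_steps=50, total_steps=1720):
    """Learning rate at a given optimizer step.

    Ramps linearly from 0 to base_lr over warmup_steps, then decays
    linearly to 0 at total_steps. total_steps=1720 is an inferred value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at a few representative steps.
for s in (0, 25, 50, 885, 1700, 1720):
    print(s, linear_lr(s))
```

Under these assumptions the rate peaks at 1e-06 at step 50 and is already close to zero by the last logged step (1700), which may partly explain why the validation loss changes so little in the final epochs.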

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5.5008 | 0.2907 | 50 | 5.6262 | -0.5231 | -0.6023 | 0.8250 | 0.0792 | -6.0233 | -5.2314 | -2.0311 | -1.8904 | 5.5816 | -0.4363 | 0.7951 |
| 4.92 | 0.5814 | 100 | 5.1023 | -0.4828 | -0.5584 | 0.8250 | 0.0756 | -5.5836 | -4.8278 | -2.1181 | -2.0055 | 5.0596 | -0.4441 | 0.7604 |
| 4.6969 | 0.8721 | 150 | 4.6774 | -0.4489 | -0.5171 | 0.8500 | 0.0682 | -5.1705 | -4.4885 | -2.1660 | -2.0410 | 4.6355 | -0.4630 | 0.6879 |
| 3.9492 | 1.1628 | 200 | 3.8213 | -0.3674 | -0.4438 | 0.875 | 0.0765 | -4.4384 | -3.6736 | -2.2855 | -1.9961 | 3.8167 | -0.4302 | 0.7799 |
| 3.45 | 1.4535 | 250 | 3.4864 | -0.3342 | -0.4227 | 0.9125 | 0.0885 | -4.2266 | -3.3420 | -2.2557 | -1.8804 | 3.4837 | -0.3910 | 0.9067 |
| 3.2561 | 1.7442 | 300 | 3.2679 | -0.3119 | -0.3956 | 0.9000 | 0.0837 | -3.9559 | -3.1191 | -2.2849 | -1.9045 | 3.2595 | -0.4022 | 0.8630 |
| 3.0471 | 2.0349 | 350 | 3.1300 | -0.3005 | -0.3768 | 0.9000 | 0.0763 | -3.7679 | -3.0046 | -2.2584 | -1.8626 | 3.1220 | -0.4214 | 0.7911 |
| 2.9312 | 2.3256 | 400 | 2.9729 | -0.2816 | -0.3469 | 0.875 | 0.0653 | -3.4686 | -2.8161 | -2.2750 | -1.8891 | 2.9539 | -0.4551 | 0.6823 |
| 2.6856 | 2.6163 | 450 | 2.8281 | -0.2630 | -0.3133 | 0.8375 | 0.0503 | -3.1333 | -2.6298 | -2.2692 | -1.8896 | 2.8010 | -0.5058 | 0.5330 |
| 2.7304 | 2.9070 | 500 | 2.7191 | -0.2493 | -0.2893 | 0.7875 | 0.0400 | -2.8928 | -2.4927 | -2.2573 | -1.8775 | 2.6907 | -0.5448 | 0.4286 |
| 2.6224 | 3.1977 | 550 | 2.6362 | -0.2406 | -0.2809 | 0.7750 | 0.0403 | -2.8089 | -2.4062 | -2.2342 | -1.8500 | 2.6066 | -0.5412 | 0.4341 |
| 2.5026 | 3.4884 | 600 | 2.5858 | -0.2354 | -0.2761 | 0.7750 | 0.0407 | -2.7606 | -2.3537 | -2.2217 | -1.8389 | 2.5555 | -0.5383 | 0.4406 |
| 2.6062 | 3.7791 | 650 | 2.5413 | -0.2315 | -0.2783 | 0.7875 | 0.0468 | -2.7833 | -2.3151 | -2.2000 | -1.8150 | 2.5111 | -0.5115 | 0.5079 |
| 2.3809 | 4.0698 | 700 | 2.4987 | -0.2264 | -0.2712 | 0.8000 | 0.0448 | -2.7123 | -2.2642 | -2.1931 | -1.8048 | 2.4689 | -0.5187 | 0.4884 |
| 2.4307 | 4.3605 | 750 | 2.4637 | -0.2232 | -0.2721 | 0.8000 | 0.0489 | -2.7213 | -2.2323 | -2.1814 | -1.7947 | 2.4350 | -0.5014 | 0.5339 |
| 2.4116 | 4.6512 | 800 | 2.4364 | -0.2203 | -0.2709 | 0.8000 | 0.0506 | -2.7095 | -2.2034 | -2.1728 | -1.7871 | 2.4081 | -0.4942 | 0.5536 |
| 2.3713 | 4.9419 | 850 | 2.4145 | -0.2180 | -0.2716 | 0.8125 | 0.0535 | -2.7157 | -2.1803 | -2.1681 | -1.7788 | 2.3873 | -0.4823 | 0.5863 |
| 2.3885 | 5.2326 | 900 | 2.3904 | -0.2160 | -0.2735 | 0.8250 | 0.0575 | -2.7352 | -2.1603 | -2.1621 | -1.7749 | 2.3630 | -0.4664 | 0.6301 |
| 2.3782 | 5.5233 | 950 | 2.3710 | -0.2141 | -0.2735 | 0.8250 | 0.0595 | -2.7355 | -2.1408 | -2.1522 | -1.7627 | 2.3448 | -0.4588 | 0.6524 |
| 2.2396 | 5.8140 | 1000 | 2.3565 | -0.2130 | -0.2767 | 0.8500 | 0.0637 | -2.7666 | -2.1295 | -2.1432 | -1.7523 | 2.3312 | -0.4429 | 0.6988 |
| 2.2947 | 6.1047 | 1050 | 2.3363 | -0.2109 | -0.2761 | 0.8625 | 0.0652 | -2.7607 | -2.1086 | -2.1430 | -1.7592 | 2.3118 | -0.4374 | 0.7162 |
| 2.2506 | 6.3953 | 1100 | 2.3212 | -0.2094 | -0.2765 | 0.8625 | 0.0671 | -2.7653 | -2.0941 | -2.1394 | -1.7585 | 2.2969 | -0.4304 | 0.7376 |
| 2.2421 | 6.6860 | 1150 | 2.3090 | -0.2084 | -0.2781 | 0.8625 | 0.0697 | -2.7808 | -2.0840 | -2.1324 | -1.7495 | 2.2853 | -0.4213 | 0.7657 |
| 2.2733 | 6.9767 | 1200 | 2.2972 | -0.2072 | -0.2788 | 0.875 | 0.0715 | -2.7878 | -2.0724 | -2.1276 | -1.7452 | 2.2739 | -0.4147 | 0.7865 |
| 2.269 | 7.2674 | 1250 | 2.2879 | -0.2064 | -0.2803 | 0.875 | 0.0738 | -2.8025 | -2.0641 | -2.1251 | -1.7449 | 2.2651 | -0.4067 | 0.8118 |
| 2.1922 | 7.5581 | 1300 | 2.2843 | -0.2056 | -0.2779 | 0.875 | 0.0723 | -2.7791 | -2.0565 | -2.1274 | -1.7480 | 2.2614 | -0.4121 | 0.7953 |
| 2.1969 | 7.8488 | 1350 | 2.2745 | -0.2050 | -0.2797 | 0.875 | 0.0748 | -2.7975 | -2.0497 | -2.1249 | -1.7453 | 2.2520 | -0.4034 | 0.8228 |
| 2.1968 | 8.1395 | 1400 | 2.2674 | -0.2043 | -0.2805 | 0.875 | 0.0762 | -2.8054 | -2.0433 | -2.1219 | -1.7424 | 2.2452 | -0.3987 | 0.8385 |
| 2.2984 | 8.4302 | 1450 | 2.2618 | -0.2038 | -0.2810 | 0.8875 | 0.0772 | -2.8104 | -2.0379 | -2.1210 | -1.7416 | 2.2398 | -0.3952 | 0.8501 |
| 2.2809 | 8.7209 | 1500 | 2.2636 | -0.2041 | -0.2852 | 0.9125 | 0.0811 | -2.8523 | -2.0408 | -2.1185 | -1.7341 | 2.2419 | -0.3823 | 0.8918 |
| 2.2605 | 9.0116 | 1550 | 2.2537 | -0.2032 | -0.2833 | 0.9000 | 0.0801 | -2.8331 | -2.0316 | -2.1153 | -1.7363 | 2.2324 | -0.3857 | 0.8816 |
| 2.1305 | 9.3023 | 1600 | 2.2505 | -0.2028 | -0.2832 | 0.9000 | 0.0804 | -2.8322 | -2.0279 | -2.1129 | -1.7336 | 2.2294 | -0.3849 | 0.8848 |
| 2.1614 | 9.5930 | 1650 | 2.2487 | -0.2026 | -0.2833 | 0.9000 | 0.0807 | -2.8330 | -2.0261 | -2.1129 | -1.7343 | 2.2276 | -0.3841 | 0.8878 |
| 2.1278 | 9.8837 | 1700 | 2.2478 | -0.2025 | -0.2832 | 0.8875 | 0.0807 | -2.8322 | -2.0250 | -2.1129 | -1.7345 | 2.2268 | -0.3839 | 0.8882 |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1
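
Because PEFT appears in the framework versions, the published weights are presumably an adapter applied on top of the base model. The following is a minimal loading sketch under that assumption. The adapter repository ID `retinol/results` and the helper name `load_model` are assumptions made for illustration, and the bfloat16 dtype is an arbitrary choice, not a setting documented by this card.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B"  # base model named in this card
ADAPTER_ID = "retinol/results"  # assumed adapter repository ID


def load_model(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Load the base model and attach the adapter (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.bfloat16,  # assumed dtype; adjust for your hardware
    )
    # Wrap the base model with the adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_model()` downloads both the base weights and the adapter, so it needs enough memory for an 8B-parameter model.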
