
qwen2.5-0.5b-expo-L2EXPO-W0-noES3-0.1

This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise_weighted dataset. It achieves the following results on the evaluation set:

  • Loss: 191.8617
  • Logps: -86.1321
  • Logits: -1.2576
  • Objective: 186.8551
  • Dpo Loss: 0.6807
  • Regularize: 0.4245
  • Ranking Simple: 0.5336
  • Wo Beta: 15.8722
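
A usage example is not part of the original card. As a minimal sketch, assuming the checkpoint loads with the standard transformers causal-LM classes (like its Qwen2.5-0.5B base), the model can be tried out as follows; the prompt is only illustrative:

```python
# Sketch only: assumes the standard transformers causal-LM interface.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hZzy/qwen2.5-0.5b-expo-L2EXPO-W0-noES3-0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt; the SFT base was tuned on a news-style IFT dataset.
inputs = tokenizer("Write a short news headline about renewable energy.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```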

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 3
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 144
  • total_eval_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 7
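
For reference, the sketch below shows how these settings might map onto the Hugging Face TrainingArguments API. This is an assumption about the setup: the actual training script (including the L2EXPO/DPO-style objective and the output_dir name used here) is not part of this card.

```python
# Sketch only: maps the listed hyperparameters onto standard TrainingArguments
# fields. The preference/EXPO objective itself lives in the training script,
# which is not included in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-0.5b-expo-L2EXPO-W0-noES3-0.1",  # hypothetical
    learning_rate=1e-6,
    per_device_train_batch_size=4,   # 4 x 12 accumulation steps x 3 GPUs = 144 effective
    per_device_eval_batch_size=4,    # 4 x 3 GPUs = 12 effective
    gradient_accumulation_steps=12,
    num_train_epochs=7,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```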

Training results

| Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | Dpo Loss | Regularize | Ranking Simple | Wo Beta |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------:|:---------:|:--------:|:----------:|:--------------:|:-------:|
| 183.2287 | 0.1417 | 50 | 182.4718 | -90.7112 | -1.4185 | 180.4123 | 0.6896 | 0.4091 | 0.5243 | 16.2952 |
| 160.3991 | 0.2834 | 100 | 181.4201 | -91.1796 | -1.4582 | 179.4420 | 0.6854 | 0.4074 | 0.5305 | 16.3106 |
| 153.0553 | 0.4251 | 150 | 180.9289 | -90.2802 | -1.4606 | 178.3229 | 0.6809 | 0.4027 | 0.5326 | 16.6558 |
| 136.9477 | 0.5668 | 200 | 179.7231 | -90.2826 | -1.4280 | 176.9630 | 0.6796 | 0.4004 | 0.5316 | 16.3329 |
| 133.9615 | 0.7085 | 250 | 185.2480 | -90.3239 | -1.5209 | 181.9659 | 0.6804 | 0.4148 | 0.5367 | 16.5328 |
| 117.2675 | 0.8503 | 300 | 183.6018 | -92.0978 | -1.4559 | 181.2600 | 0.6830 | 0.4138 | 0.5280 | 16.5618 |
| 113.618 | 0.9920 | 350 | 187.3962 | -90.2357 | -1.4778 | 183.1441 | 0.6818 | 0.4156 | 0.5295 | 16.3125 |
| 108.282 | 1.1337 | 400 | 186.7854 | -88.5629 | -1.3931 | 183.6067 | 0.6814 | 0.4168 | 0.5347 | 16.2558 |
| 90.0262 | 1.2754 | 450 | 184.4520 | -87.5954 | -1.3706 | 179.7387 | 0.6794 | 0.4079 | 0.5331 | 16.2049 |
| 97.8439 | 1.4171 | 500 | 186.6105 | -87.7391 | -1.3773 | 181.2539 | 0.6799 | 0.4116 | 0.5290 | 16.1404 |
| 91.5957 | 1.5588 | 550 | 185.8633 | -89.6898 | -1.3445 | 180.9778 | 0.6797 | 0.4120 | 0.5347 | 16.1335 |
| 89.0238 | 1.7005 | 600 | 185.6100 | -86.9355 | -1.3632 | 179.5353 | 0.6773 | 0.4080 | 0.5347 | 16.2059 |
| 90.7044 | 1.8422 | 650 | 186.1243 | -87.1991 | -1.3882 | 180.0248 | 0.6776 | 0.4102 | 0.5342 | 16.1165 |
| 84.5287 | 1.9839 | 700 | 188.5602 | -87.6351 | -1.3019 | 183.1935 | 0.6814 | 0.4164 | 0.5352 | 16.0500 |
| 76.9421 | 2.1256 | 750 | 188.9042 | -88.4364 | -1.3259 | 183.6094 | 0.6794 | 0.4182 | 0.5326 | 15.9422 |
| 73.209 | 2.2674 | 800 | 188.3336 | -86.2484 | -1.3130 | 183.7086 | 0.6811 | 0.4180 | 0.5321 | 16.0113 |
| 66.2169 | 2.4091 | 850 | 192.0453 | -86.7490 | -1.3156 | 186.8341 | 0.6831 | 0.4251 | 0.5316 | 15.8832 |
| 60.5689 | 2.5508 | 900 | 190.1148 | -85.9587 | -1.2951 | 185.2343 | 0.6801 | 0.4219 | 0.5321 | 15.9341 |
| 61.9855 | 2.6925 | 950 | 190.6609 | -86.4854 | -1.3163 | 185.6429 | 0.6812 | 0.4229 | 0.5321 | 15.9612 |
| 60.2402 | 2.8342 | 1000 | 190.4743 | -85.4829 | -1.3084 | 184.9089 | 0.6796 | 0.4209 | 0.5316 | 15.8681 |
| 59.5621 | 2.9759 | 1050 | 191.3895 | -85.2853 | -1.2977 | 186.0318 | 0.6818 | 0.4236 | 0.5311 | 15.9189 |
| 57.3013 | 3.1176 | 1100 | 191.3520 | -86.2308 | -1.3160 | 186.1591 | 0.6791 | 0.4230 | 0.5367 | 15.8460 |
| 48.599 | 3.2593 | 1150 | 190.8563 | -86.5047 | -1.2764 | 185.6679 | 0.6803 | 0.4221 | 0.5373 | 15.9498 |
| 50.0065 | 3.4010 | 1200 | 190.9622 | -85.7436 | -1.2851 | 185.7565 | 0.6795 | 0.4218 | 0.5311 | 15.8572 |
| 47.4703 | 3.5427 | 1250 | 191.3775 | -86.1116 | -1.2775 | 186.8072 | 0.6817 | 0.4239 | 0.5305 | 15.9621 |
| 44.9179 | 3.6845 | 1300 | 191.8354 | -86.2878 | -1.2826 | 186.7091 | 0.6804 | 0.4241 | 0.5305 | 15.8192 |
| 40.9292 | 3.8262 | 1350 | 192.5214 | -85.5316 | -1.2757 | 187.4250 | 0.6820 | 0.4260 | 0.5321 | 15.8804 |
| 42.9136 | 3.9679 | 1400 | 192.7924 | -85.8583 | -1.2427 | 187.9270 | 0.6815 | 0.4268 | 0.5342 | 15.8520 |
| 38.8325 | 4.1096 | 1450 | 192.5806 | -85.5114 | -1.2569 | 187.5089 | 0.6820 | 0.4269 | 0.5342 | 15.8565 |
| 38.0409 | 4.2513 | 1500 | 192.5007 | -85.3251 | -1.2588 | 187.1571 | 0.6813 | 0.4255 | 0.5362 | 15.8393 |
| 34.4862 | 4.3930 | 1550 | 191.5790 | -86.5480 | -1.2534 | 186.4161 | 0.6811 | 0.4236 | 0.5362 | 15.9020 |
| 34.5799 | 4.5347 | 1600 | 191.4073 | -86.2764 | -1.2706 | 186.0825 | 0.6796 | 0.4229 | 0.5336 | 15.9068 |
| 27.3454 | 4.6764 | 1650 | 191.3007 | -85.9348 | -1.2432 | 186.2914 | 0.6801 | 0.4233 | 0.5342 | 15.9007 |
| 26.7167 | 4.8181 | 1700 | 191.8703 | -86.1611 | -1.2529 | 187.0981 | 0.6810 | 0.4254 | 0.5326 | 15.8993 |
| 27.1152 | 4.9598 | 1750 | 191.9133 | -85.8044 | -1.2599 | 187.1985 | 0.6809 | 0.4253 | 0.5336 | 15.8728 |
| 22.8305 | 5.1016 | 1800 | 192.4291 | -86.0359 | -1.2645 | 187.6874 | 0.6808 | 0.4263 | 0.5336 | 15.8275 |
| 21.1772 | 5.2433 | 1850 | 192.0774 | -85.9744 | -1.2599 | 187.1845 | 0.6808 | 0.4254 | 0.5321 | 15.8873 |
| 18.5995 | 5.3850 | 1900 | 191.8287 | -86.0679 | -1.2538 | 186.9649 | 0.6807 | 0.4248 | 0.5326 | 15.8683 |
| 17.8136 | 5.5267 | 1950 | 191.8575 | -86.0837 | -1.2633 | 186.7980 | 0.6805 | 0.4244 | 0.5331 | 15.8704 |
| 16.8259 | 5.6684 | 2000 | 191.8466 | -86.1388 | -1.2609 | 186.7259 | 0.6807 | 0.4245 | 0.5331 | 15.8647 |
| 15.5852 | 5.8101 | 2050 | 191.9476 | -86.2758 | -1.2583 | 186.8974 | 0.6808 | 0.4247 | 0.5336 | 15.8729 |
| 14.5477 | 5.9518 | 2100 | 191.9842 | -86.0437 | -1.2603 | 186.9000 | 0.6807 | 0.4246 | 0.5326 | 15.8685 |
| 13.7824 | 6.0935 | 2150 | 191.8207 | -86.0032 | -1.2604 | 186.7842 | 0.6806 | 0.4243 | 0.5326 | 15.8747 |
| 11.3504 | 6.2352 | 2200 | 191.8495 | -86.0359 | -1.2598 | 186.7322 | 0.6807 | 0.4243 | 0.5326 | 15.8700 |
| 11.1693 | 6.3769 | 2250 | 191.8128 | -86.1265 | -1.2585 | 186.7514 | 0.6807 | 0.4243 | 0.5336 | 15.8747 |
| 11.6161 | 6.5187 | 2300 | 191.8225 | -86.1693 | -1.2558 | 186.8004 | 0.6807 | 0.4244 | 0.5326 | 15.8736 |
| 10.8866 | 6.6604 | 2350 | 191.8734 | -86.1390 | -1.2576 | 186.8597 | 0.6807 | 0.4245 | 0.5336 | 15.8719 |
| 10.3699 | 6.8021 | 2400 | 191.8644 | -86.1292 | -1.2577 | 186.8542 | 0.6807 | 0.4245 | 0.5336 | 15.8721 |
| 10.8668 | 6.9438 | 2450 | 191.8617 | -86.1321 | -1.2576 | 186.8551 | 0.6807 | 0.4245 | 0.5336 | 15.8722 |
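
Overall, the logged training loss falls from about 183 to about 11 across the seven epochs, while the validation loss settles near 192 and Ranking Simple stays around 0.53. If the run directory is available, the per-evaluation metrics above can be read back programmatically; the sketch below assumes the default Hugging Face Trainer layout (a trainer_state.json containing a log_history list), and the exact metric key names for this run are assumptions:

```python
# Sketch only: assumes the default Trainer checkpoint layout; metric key names
# (e.g. "eval_loss") may differ for this particular run.
import json

with open("trainer_state.json") as f:
    state = json.load(f)

eval_logs = [entry for entry in state["log_history"] if "eval_loss" in entry]
for entry in eval_logs:
    print(entry["step"], entry["epoch"], entry["eval_loss"])
```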

Framework versions

  • Transformers 4.42.0
  • Pytorch 2.3.0+cu121
  • Datasets 3.2.0
  • Tokenizers 0.19.1