
qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-5-5e6

This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise dataset. It achieves the following results on the evaluation set:

  • Loss: 22.2823
  • Logps: -81.7800
  • Logits: -0.6395
  • Objective: 22.4840
  • DPO Loss: 11.4246
  • Regularize: 22.4840
  • Ranking Simple: 0.5072
  • Ranking Idealized: 0.5093
  • Ranking Idealized Expo: 0.5093
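
The card does not include usage code. Below is a minimal loading sketch, assuming the checkpoint follows the standard Transformers causal-LM layout; the prompt string is purely illustrative.

```python
# Minimal usage sketch (not part of the original card): load the checkpoint
# with the standard Hugging Face causal-LM classes and generate a completion.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hZzy/qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-5-5e6"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt only; the model was tuned on pairwise news data.
inputs = tokenizer("Write a short news headline about renewable energy.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```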

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 6
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 288
  • total_eval_batch_size: 24
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
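
The training script itself is not part of this card. As an illustration only, the hyperparameters above map onto a standard Transformers TrainingArguments configuration roughly as sketched below; the argument names are the library's, not necessarily those used by the original run.

```python
# Illustrative mapping of the listed hyperparameters onto Transformers
# TrainingArguments; the original training script is not included in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-5-5e6",
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=12,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# Effective train batch size = 4 (per device) * 6 (GPUs) * 12 (accumulation) = 288,
# which matches the reported total_train_batch_size.
```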

Training results

| Training Loss | Epoch  | Step | Validation Loss | Logps    | Logits  | Objective | DPO Loss | Regularize | Ranking Simple | Ranking Idealized | Ranking Idealized Expo |
|---------------|--------|------|-----------------|----------|---------|-----------|----------|------------|----------------|-------------------|------------------------|
| 7.7199        | 0.2834 | 50   | 5.7454          | -88.5676 | -1.3592 | 5.8006    | 2.9352   | 5.8006     | 0.5093         | 0.5093            | 0.5093                 |
| 14.7825       | 0.5668 | 100  | 14.5118         | -82.6872 | -1.0744 | 14.7179   | 7.5504   | 14.7179    | 0.5052         | 0.5093            | 0.5093                 |
| 15.364        | 0.8503 | 150  | 17.9996         | -82.8069 | -0.9215 | 17.9487   | 9.0677   | 17.9487    | 0.5052         | 0.5093            | 0.5093                 |
| 13.5405       | 1.1337 | 200  | 20.4472         | -81.7112 | -0.8710 | 20.8042   | 10.4069  | 20.8042    | 0.5155         | 0.5093            | 0.5093                 |
| 12.3187       | 1.4171 | 250  | 20.4460         | -80.3281 | -0.9020 | 20.6868   | 10.5730  | 20.6868    | 0.5083         | 0.5093            | 0.5093                 |
| 11.1496       | 1.7005 | 300  | 21.5408         | -81.0334 | -0.6058 | 21.7594   | 10.9942  | 21.7594    | 0.5021         | 0.5093            | 0.5093                 |
| 9.8756        | 1.9839 | 350  | 21.6497         | -82.6833 | -0.6455 | 21.7908   | 11.0636  | 21.7908    | 0.5103         | 0.5093            | 0.5093                 |
| 8.7383        | 2.2674 | 400  | 22.0188         | -82.5924 | -0.6389 | 22.2506   | 11.2378  | 22.2506    | 0.5083         | 0.5093            | 0.5093                 |
| 7.8659        | 2.5508 | 450  | 22.1530         | -81.1508 | -0.6986 | 22.4826   | 11.2333  | 22.4826    | 0.5165         | 0.5093            | 0.5093                 |
| 6.4451        | 2.8342 | 500  | 22.1806         | -80.7941 | -0.7415 | 22.4462   | 11.3734  | 22.4462    | 0.5114         | 0.5093            | 0.5093                 |
| 5.3913        | 3.1176 | 550  | 22.5555         | -81.1559 | -0.6593 | 22.7930   | 11.5514  | 22.7930    | 0.5114         | 0.5093            | 0.5093                 |
| 4.4825        | 3.4010 | 600  | 22.5560         | -81.6865 | -0.6143 | 22.7375   | 11.5064  | 22.7375    | 0.5103         | 0.5093            | 0.5093                 |
| 3.8178        | 3.6845 | 650  | 22.4465         | -82.0276 | -0.6491 | 22.6879   | 11.5084  | 22.6879    | 0.5093         | 0.5093            | 0.5093                 |
| 3.084         | 3.9679 | 700  | 22.2750         | -82.0004 | -0.6309 | 22.4606   | 11.4155  | 22.4606    | 0.5083         | 0.5093            | 0.5093                 |
| 2.2691        | 4.2513 | 750  | 22.2876         | -81.7782 | -0.6324 | 22.4857   | 11.4308  | 22.4857    | 0.5072         | 0.5093            | 0.5093                 |
| 1.9909        | 4.5347 | 800  | 22.2976         | -81.7023 | -0.6413 | 22.5029   | 11.4381  | 22.5029    | 0.5072         | 0.5093            | 0.5093                 |
| 1.8048        | 4.8181 | 850  | 22.2810         | -81.7745 | -0.6391 | 22.4824   | 11.4237  | 22.4824    | 0.5072         | 0.5093            | 0.5093                 |
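
For a quick visual check of the trend in the table, the validation loss can be plotted against epoch. The short script below only re-types the table values; nothing is recomputed.

```python
# Plot the validation loss from the table above against training epoch
# (values copied from the table; not computed here).
import matplotlib.pyplot as plt

epochs = [0.2834, 0.5668, 0.8503, 1.1337, 1.4171, 1.7005, 1.9839, 2.2674,
          2.5508, 2.8342, 3.1176, 3.4010, 3.6845, 3.9679, 4.2513, 4.5347, 4.8181]
val_loss = [5.7454, 14.5118, 17.9996, 20.4472, 20.4460, 21.5408, 21.6497, 22.0188,
            22.1530, 22.1806, 22.5555, 22.5560, 22.4465, 22.2750, 22.2876, 22.2976, 22.2810]

plt.plot(epochs, val_loss, marker="o")
plt.xlabel("Epoch")
plt.ylabel("Validation loss")
plt.title("Validation loss per evaluation step")
plt.show()
```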

Framework versions

  • Transformers 4.42.0
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1