qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-0.1-5e7

This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4061
  • Logps: -93.6123
  • Logits: -1.5421
  • Objective: 0.4082
  • Dpo Loss: 0.6835
  • Regularize: 0.4082
  • Ranking Simple: 0.5217
  • Ranking Idealized: 0.5888
  • Ranking Idealized Expo: 0.5103
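As a reading aid for the Dpo Loss figure, the sketch below assumes the metric is the standard DPO objective, -log sigmoid(beta * margin), where the margin is the difference between the policy and reference log-ratios of the chosen and rejected responses. The card does not confirm this definition, and the `beta` value used here is a hypothetical placeholder. Under that assumption, a model that does not separate chosen from rejected responses at all (margin 0) scores ln 2, about 0.6931, regardless of beta, which gives a reference point for the reported 0.6835.

```python
import math


def dpo_loss(margin: float, beta: float = 0.1) -> float:
    """Standard DPO loss for a single pair: -log(sigmoid(beta * margin)).

    `margin` is the policy-vs-reference log-ratio gap between the chosen and
    rejected responses. `beta` is a hypothetical value, not taken from the card.
    """
    # -log(sigmoid(x)) == log(1 + exp(-x))
    return math.log1p(math.exp(-beta * margin))


if __name__ == "__main__":
    baseline = dpo_loss(0.0)  # zero margin: no preference learned
    reported = 0.6835         # Dpo Loss from the evaluation results above
    print(f"zero-margin baseline: {baseline:.4f}")
    print(f"reported minus baseline: {reported - baseline:+.4f}")
```

A positive margin lowers the loss below the baseline and a negative margin raises it, so under this assumption the reported value sits only slightly on the favorable side of the zero-margin point.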

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-07
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 6
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 288
  • total_eval_batch_size: 24
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
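
The two total batch sizes above follow from the per-device values, and the learning rate follows a warmup-then-cosine curve. The sketch below reproduces that arithmetic in plain Python. The schedule shape (linear warmup to the peak rate, then cosine decay to zero) is an assumption about how `lr_scheduler_type: cosine` behaves, and `total_steps` is a hypothetical value because the card does not state the total number of optimizer steps.

```python
import math

# Values copied from the hyperparameter list above.
per_device_train_batch_size = 4
per_device_eval_batch_size = 4
num_devices = 6
gradient_accumulation_steps = 12
learning_rate = 5e-07
warmup_ratio = 0.1

# Examples consumed per optimizer step and per evaluation step.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices


def cosine_lr(step, total_steps, peak_lr=learning_rate, warmup=warmup_ratio):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    warmup_steps = math.ceil(total_steps * warmup)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; the card does not state the total
    print(total_train_batch_size, total_eval_batch_size)
    for step in (0, 50, 100, 550, 1000):
        print(step, cosine_lr(step, total_steps))
```

The products come out to 288 and 24, matching the reported `total_train_batch_size` and `total_eval_batch_size`.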

Training results

| Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | Dpo Loss | Regularize | Ranking Simple | Ranking Idealized | Ranking Idealized Expo |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.4009 | 0.2834 | 50 | 0.4075 | -91.1598 | -1.4550 | 0.4075 | 0.6900 | 0.4075 | 0.5145 | 0.5888 | 0.5103 |
| 0.3473 | 0.5668 | 100 | 0.4035 | -92.0013 | -1.4828 | 0.4074 | 0.6864 | 0.4074 | 0.5145 | 0.5888 | 0.5103 |
| 0.2868 | 0.8503 | 150 | 0.4047 | -93.0470 | -1.4826 | 0.4111 | 0.6868 | 0.4111 | 0.5207 | 0.5888 | 0.5103 |
| 0.2165 | 1.1337 | 200 | 0.4073 | -93.5155 | -1.5000 | 0.4126 | 0.6870 | 0.4126 | 0.5165 | 0.5888 | 0.5103 |
| 0.2005 | 1.4171 | 250 | 0.4063 | -93.3534 | -1.5170 | 0.4069 | 0.6846 | 0.4069 | 0.5196 | 0.5888 | 0.5103 |
| 0.1818 | 1.7005 | 300 | 0.4041 | -92.4353 | -1.5152 | 0.4060 | 0.6846 | 0.4060 | 0.5217 | 0.5888 | 0.5103 |
| 0.165 | 1.9839 | 350 | 0.4057 | -93.2243 | -1.5221 | 0.4057 | 0.6835 | 0.4057 | 0.5207 | 0.5888 | 0.5103 |
| 0.1389 | 2.2674 | 400 | 0.4073 | -93.7396 | -1.5308 | 0.4090 | 0.6834 | 0.4090 | 0.5176 | 0.5888 | 0.5103 |
| 0.1152 | 2.5508 | 450 | 0.4049 | -93.8421 | -1.5304 | 0.4060 | 0.6827 | 0.4060 | 0.5217 | 0.5888 | 0.5103 |
| 0.11 | 2.8342 | 500 | 0.4061 | -93.4569 | -1.5372 | 0.4082 | 0.6844 | 0.4082 | 0.5165 | 0.5888 | 0.5103 |
| 0.0961 | 3.1176 | 550 | 0.4054 | -93.4277 | -1.5393 | 0.4072 | 0.6832 | 0.4072 | 0.5186 | 0.5888 | 0.5103 |
| 0.0793 | 3.4010 | 600 | 0.4061 | -93.4066 | -1.5455 | 0.4071 | 0.6832 | 0.4071 | 0.5217 | 0.5888 | 0.5103 |
| 0.0715 | 3.6845 | 650 | 0.4060 | -93.5656 | -1.5409 | 0.4077 | 0.6834 | 0.4077 | 0.5227 | 0.5888 | 0.5103 |
| 0.0763 | 3.9679 | 700 | 0.4063 | -93.5022 | -1.5427 | 0.4085 | 0.6837 | 0.4085 | 0.5186 | 0.5888 | 0.5103 |
| 0.0623 | 4.2513 | 750 | 0.4061 | -93.5738 | -1.5419 | 0.4081 | 0.6836 | 0.4081 | 0.5227 | 0.5888 | 0.5103 |
| 0.0655 | 4.5347 | 800 | 0.4060 | -93.5684 | -1.5425 | 0.4082 | 0.6835 | 0.4082 | 0.5227 | 0.5888 | 0.5103 |
| 0.0601 | 4.8181 | 850 | 0.4061 | -93.6119 | -1.5421 | 0.4082 | 0.6835 | 0.4082 | 0.5217 | 0.5888 | 0.5103 |
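
The Epoch and Step columns allow a rough estimate of the run's scale. The sketch below assumes that Epoch is the fraction of full passes over the training set completed at each logged step, and that every optimizer step consumes the reported total train batch size of 288 examples. Neither assumption is stated in the card, so the outputs are estimates rather than reported figures.

```python
# (step, epoch) pair copied from the last logged row of the results table.
last_step, last_epoch = 850, 4.8181

# Values from the hyperparameter list.
total_train_batch_size = 288
num_epochs = 5

# Optimizer steps needed for one pass over the training set.
steps_per_epoch = last_step / last_epoch

# Estimated training examples per epoch and total steps for the full run.
approx_examples_per_epoch = steps_per_epoch * total_train_batch_size
approx_total_steps = steps_per_epoch * num_epochs

if __name__ == "__main__":
    print(f"steps per epoch:    {steps_per_epoch:.1f}")
    print(f"examples per epoch: {approx_examples_per_epoch:.0f}")
    print(f"total steps:        {approx_total_steps:.0f}")
```

Under these assumptions the run covers roughly 176 optimizer steps per epoch, which suggests the last logged row (step 850) falls shortly before the end of the fifth epoch.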

Framework versions

  • Transformers 4.42.0
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size: 494M params (Safetensors, tensor type F32)