---
license: apache-2.0
library_name: peft
tags:
  - trl
  - dpo
  - generated_from_trainer
base_model: mistralai/Mistral-7B-v0.1
model-index:
  - name: zephyr-7b
    results: []
---

# zephyr-7b

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unspecified dataset. It achieves the following results on the evaluation set (the metric names are explained after the list):

- Loss: 0.3625
- Rewards/chosen: -150.6127
- Rewards/rejected: -146.3050
- Rewards/accuracies: 0.2421
- Rewards/margins: -4.3077
- Logps/rejected: -14705.8975
- Logps/chosen: -15130.1680
- Logits/rejected: 13.5362
- Logits/chosen: 13.4716
- Use Label: 11165.9844
- Pred Label: 7522.0161
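
For readers unfamiliar with these metric names: they follow the usual DPO conventions logged by TRL's `DPOTrainer` (stated here as general background; the card itself does not define them). The training objective is

$$
\mathcal{L}_{\mathrm{DPO}} = -\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right),
$$

where $y_w$ and $y_l$ are the chosen and rejected completions. `Rewards/chosen` is the mean of $\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}$ over the evaluation set, `Rewards/rejected` is the analogous quantity for $y_l$, `Rewards/margins` is their difference, and `Rewards/accuracies` is the fraction of pairs whose chosen reward exceeds the rejected one. `Use Label` and `Pred Label` appear to be counters from a customized trainer and are not part of the standard DPO objective.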

## Model description

More information needed

## Intended uses & limitations

More information needed
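
Although no intended uses are documented, the metadata (`library_name: peft`, `base_model: mistralai/Mistral-7B-v0.1`) implies this repository holds a LoRA-style adapter. A minimal loading sketch follows, assuming the adapter lives in this repository under the id `jikaixuan/zephyr-7b` (the repo id, dtype, and generation settings are illustrative assumptions, not documented facts):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base model named in the card metadata.
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    torch_dtype=torch.bfloat16,  # assumption: precision is not recorded in the card
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# Attach the DPO-trained adapter; the repo id is an assumption.
model = PeftModel.from_pretrained(base, "jikaixuan/zephyr-7b")
model.eval()

prompt = "Explain DPO fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```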

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of wiring them into a trainer follows the list):

- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
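
As referenced above, here is a minimal sketch of how these values map onto a TRL training setup contemporary with the framework versions below. Only the listed hyperparameters are taken from the card; the dataset, LoRA configuration, and `beta` are placeholders:

```python
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "mistralai/Mistral-7B-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Values copied from the list above. With 4 devices, per-device batch 4 and
# 4 accumulation steps multiply out to the stated total train batch size of 64.
args = TrainingArguments(
    output_dir="zephyr-7b",
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
    seed=42,
    remove_unused_columns=False,  # DPOTrainer needs the raw preference columns
)

# Stand-in preference data; the real dataset is not documented in the card.
train_ds = Dataset.from_dict({
    "prompt": ["Say hello."],
    "chosen": ["Hello!"],
    "rejected": ["Go away."],
})

# LoRA settings are assumptions; the card only records library_name: peft.
peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"
)

trainer = DPOTrainer(
    model,
    ref_model=None,   # with peft_config set, TRL uses the frozen base as reference
    args=args,
    beta=0.1,         # assumption: beta is not recorded in the card
    train_dataset=train_ds,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```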

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Use Label | Pred Label |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:----------:|:----------:|
| 0.6637 | 0.1 | 100 | 0.6642 | -0.0947 | -0.1635 | 0.3254 | 0.0687 | -91.7446 | -78.3734 | -2.0927 | -2.1253 | 1838.9207 | 17.0794 |
| 0.3902 | 0.21 | 200 | 0.3930 | -14.4219 | -14.0352 | 0.2560 | -0.3866 | -1478.9202 | -1511.0870 | 2.8471 | 2.7727 | 3444.6985 | 515.3016 |
| 0.3845 | 0.31 | 300 | 0.3786 | -23.0869 | -24.5685 | 0.2520 | 1.4817 | -2532.2498 | -2377.5872 | 5.4283 | 5.3070 | 4579.4922 | 1484.5079 |
| 0.3477 | 0.42 | 400 | 0.3622 | -111.3259 | -109.5294 | 0.25 | -1.7965 | -11028.3408 | -11201.4893 | 11.6816 | 11.5716 | 5682.4922 | 2485.5081 |
| 0.3468 | 0.52 | 500 | 0.3613 | -144.7782 | -140.7408 | 0.2421 | -4.0373 | -14149.4824 | -14546.7158 | 13.8885 | 13.8347 | 6784.2383 | 3487.7620 |
| 0.33 | 0.63 | 600 | 0.3605 | -143.0167 | -138.8336 | 0.2401 | -4.1831 | -13958.7627 | -14370.5693 | 12.5943 | 12.5399 | 7857.4287 | 4518.5713 |
| 0.3665 | 0.73 | 700 | 0.3614 | -150.1877 | -145.8865 | 0.2421 | -4.3011 | -14664.0518 | -15087.6680 | 13.4024 | 13.3367 | 8936.4287 | 5543.5713 |
| 0.3731 | 0.84 | 800 | 0.3623 | -150.4385 | -146.1303 | 0.2401 | -4.3082 | -14688.4258 | -15112.7539 | 13.5339 | 13.4696 | 10050.3330 | 6533.6665 |
| 0.3696 | 0.94 | 900 | 0.3625 | -150.6127 | -146.3050 | 0.2421 | -4.3077 | -14705.8975 | -15130.1680 | 13.5362 | 13.4716 | 11165.9844 | 7522.0161 |

### Framework versions

- PEFT 0.7.1
- Transformers 4.38.2
- Pytorch 2.1.1+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2
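
A quick way to check that a local environment matches these pins (a convenience snippet, not part of the original card):

```python
import datasets, peft, tokenizers, torch, transformers

# Expected: 0.7.1, 4.38.2, 2.1.1+cu121, 2.14.6, 0.15.2 (per the list above).
for mod in (peft, transformers, torch, datasets, tokenizers):
    print(f"{mod.__name__}: {mod.__version__}")
```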