Model save

Files changed (6)

README.md +114 -0
adapter_model.safetensors +1 -1
all_results.json +8 -0
runs/Jul15_07-16-08_notebook-deployment-48-7d9b6c99-p5kv4/events.out.tfevents.1721028151.notebook-deployment-48-7d9b6c99-p5kv4.43131.0 +2 -2
train_results.json +8 -0
trainer_state.json +0 -0

README.md ADDED

	@@ -0,0 +1,114 @@

+---
+base_model: alignment-handbook/zephyr-7b-sft-full
+library_name: peft
+license: apache-2.0
+tags:
+- trl
+- dpo
+- generated_from_trainer
+model-index:
+- name: zephyr-dpo-qlora-uf-5e-6
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# zephyr-dpo-qlora-uf-5e-6
+This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unspecified dataset (recorded as `None` by the Trainer).
+It achieves the following results on the evaluation set:
+- Loss: 0.4889
+- Rewards/chosen: -2.8993
+- Rewards/rejected: -4.0739
+- Rewards/accuracies: 0.7778
+- Rewards/margins: 1.1746
+- Rewards/margins Max: 3.6856
+- Rewards/margins Min: -0.9285
+- Rewards/margins Std: 1.5334
+- Logps/rejected: -669.5308
+- Logps/chosen: -574.4193
+- Logits/rejected: -1.7408
+- Logits/chosen: -1.7997
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-06
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- total_train_batch_size: 16
+- total_eval_batch_size: 32
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Rewards/margins Max | Rewards/margins Min | Rewards/margins Std | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:-------------------:|:-------------------:|:-------------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6893        | 0.03  | 100  | 0.6897          | 0.0026         | -0.0055          | 0.7202             | 0.0082          | 0.0362              | -0.0170             | 0.0176              | -262.6957      | -284.2244    | -2.7822         | -2.8200       |
+| 0.6681        | 0.05  | 200  | 0.6689          | 0.0162         | -0.0429          | 0.7222             | 0.0591          | 0.2404              | -0.1128             | 0.1163              | -266.4325      | -282.8687    | -2.7520         | -2.7906       |
+| 0.64          | 0.08  | 300  | 0.6293          | -0.3380        | -0.5276          | 0.7044             | 0.1896          | 0.7935              | -0.3661             | 0.3880              | -314.9071      | -318.2889    | -2.7294         | -2.7644       |
+| 0.6335        | 0.1   | 400  | 0.6076          | -0.3780        | -0.6803          | 0.7143             | 0.3023          | 1.2436              | -0.5587             | 0.5973              | -330.1778      | -322.2904    | -2.7035         | -2.7413       |
+| 0.5664        | 0.13  | 500  | 0.5693          | -1.0517        | -1.6202          | 0.7222             | 0.5685          | 2.1499              | -0.8056             | 0.9738              | -424.1662      | -389.6617    | -2.3570         | -2.3930       |
+| 0.5428        | 0.16  | 600  | 0.5504          | -1.1351        | -1.8251          | 0.7460             | 0.6900          | 2.5221              | -0.8419             | 1.1085              | -444.6526      | -397.9947    | -2.3087         | -2.3340       |
+| 0.5696        | 0.18  | 700  | 0.5407          | -1.6072        | -2.2945          | 0.7302             | 0.6873          | 2.3968              | -0.8008             | 1.0591              | -491.5914      | -445.2077    | -2.0233         | -2.0544       |
+| 0.4864        | 0.21  | 800  | 0.5377          | -1.4823        | -2.3816          | 0.7381             | 0.8993          | 2.9869              | -0.9704             | 1.3291              | -500.2979      | -432.7151    | -2.1126         | -2.1435       |
+| 0.542         | 0.24  | 900  | 0.5399          | -1.9887        | -2.8948          | 0.7302             | 0.9061          | 3.1667              | -0.9490             | 1.3690              | -551.6262      | -483.3614    | -2.1744         | -2.2024       |
+| 0.5518        | 0.26  | 1000 | 0.5300          | -1.9427        | -2.8559          | 0.7540             | 0.9131          | 3.1137              | -0.9029             | 1.3265              | -547.7310      | -478.7619    | -2.1380         | -2.1708       |
+| 0.5538        | 0.29  | 1100 | 0.5361          | -1.1129        | -1.9809          | 0.7520             | 0.8681          | 3.0506              | -0.8555             | 1.2919              | -460.2347      | -395.7733    | -2.1859         | -2.2234       |
+| 0.5482        | 0.31  | 1200 | 0.5345          | -1.2650        | -2.1623          | 0.7798             | 0.8973          | 3.0598              | -0.8739             | 1.2932              | -478.3762      | -410.9884    | -2.0283         | -2.0696       |
+| 0.5325        | 0.34  | 1300 | 0.5237          | -1.3489        | -2.2549          | 0.7540             | 0.9060          | 2.9285              | -0.9000             | 1.2688              | -487.6328      | -419.3813    | -2.0319         | -2.0646       |
+| 0.5647        | 0.37  | 1400 | 0.5171          | -1.8056        | -2.7729          | 0.7738             | 0.9673          | 3.0310              | -0.9191             | 1.3055              | -539.4321      | -465.0507    | -2.0499         | -2.0808       |
+| 0.5458        | 0.39  | 1500 | 0.5139          | -1.4005        | -2.3080          | 0.7659             | 0.9074          | 2.8815              | -0.9358             | 1.2687              | -492.9399      | -424.5414    | -2.1490         | -2.1788       |
+| 0.4935        | 0.42  | 1600 | 0.5159          | -1.4135        | -2.4191          | 0.7619             | 1.0056          | 3.1947              | -0.8547             | 1.3594              | -504.0516      | -425.8337    | -2.0721         | -2.1058       |
+| 0.4832        | 0.44  | 1700 | 0.5182          | -1.5594        | -2.6076          | 0.7579             | 1.0482          | 3.3861              | -0.8998             | 1.4429              | -522.9042      | -440.4306    | -2.1434         | -2.1797       |
+| 0.5158        | 0.47  | 1800 | 0.5181          | -1.7427        | -2.8825          | 0.7639             | 1.1398          | 3.5508              | -0.9741             | 1.5177              | -550.3890      | -458.7530    | -1.9600         | -2.0015       |
+| 0.451         | 0.5   | 1900 | 0.5090          | -1.5156        | -2.5725          | 0.7579             | 1.0569          | 3.3790              | -0.8482             | 1.4174              | -519.3948      | -436.0498    | -1.8888         | -1.9342       |
+| 0.4879        | 0.52  | 2000 | 0.5003          | -1.8435        | -2.8625          | 0.7718             | 1.0190          | 3.2173              | -0.9040             | 1.3683              | -548.3914      | -468.8387    | -1.8468         | -1.8969       |
+| 0.4879        | 0.55  | 2100 | 0.5044          | -1.6709        | -2.7719          | 0.7579             | 1.1010          | 3.5672              | -0.8763             | 1.4852              | -539.3310      | -451.5732    | -1.9027         | -1.9476       |
+| 0.4949        | 0.58  | 2200 | 0.4964          | -3.2082        | -4.4391          | 0.7778             | 1.2309          | 3.8910              | -1.0365             | 1.6390              | -706.0513      | -605.3098    | -1.7221         | -1.7794       |
+| 0.5796        | 0.6   | 2300 | 0.4990          | -2.6972        | -3.7097          | 0.7897             | 1.0125          | 3.2200              | -0.8781             | 1.3552              | -633.1115      | -554.2051    | -1.7896         | -1.8422       |
+| 0.5492        | 0.63  | 2400 | 0.4969          | -3.4670        | -4.5017          | 0.7778             | 1.0347          | 3.3130              | -0.9050             | 1.3962              | -712.3122      | -631.1838    | -1.6170         | -1.6768       |
+| 0.4667        | 0.65  | 2500 | 0.5004          | -3.5869        | -4.8937          | 0.7817             | 1.3068          | 4.1402              | -1.0666             | 1.7418              | -751.5126      | -643.1785    | -1.5865         | -1.6490       |
+| 0.5777        | 0.68  | 2600 | 0.4974          | -2.4014        | -3.5339          | 0.7619             | 1.1325          | 3.5063              | -0.9035             | 1.4860              | -615.5330      | -524.6262    | -1.7399         | -1.7949       |
+| 0.5021        | 0.71  | 2700 | 0.4927          | -2.6594        | -3.8176          | 0.7798             | 1.1583          | 3.6119              | -0.9273             | 1.5118              | -643.9045      | -550.4240    | -1.7427         | -1.7988       |
+| 0.5332        | 0.73  | 2800 | 0.4905          | -3.2417        | -4.4343          | 0.7817             | 1.1926          | 3.7159              | -0.9639             | 1.5556              | -705.5735      | -608.6549    | -1.6555         | -1.7144       |
+| 0.5514        | 0.76  | 2900 | 0.4934          | -3.7499        | -5.0405          | 0.7798             | 1.2906          | 3.9723              | -1.0907             | 1.6887              | -766.1927      | -659.4749    | -1.6687         | -1.7302       |
+| 0.4162        | 0.79  | 3000 | 0.4917          | -3.2815        | -4.4510          | 0.7698             | 1.1694          | 3.6486              | -0.9447             | 1.5323              | -707.2395      | -612.6413    | -1.6605         | -1.7208       |
+| 0.5252        | 0.81  | 3100 | 0.4897          | -3.1223        | -4.3214          | 0.7857             | 1.1991          | 3.7431              | -0.9577             | 1.5632              | -694.2787      | -596.7130    | -1.6937         | -1.7536       |
+| 0.4626        | 0.84  | 3200 | 0.4892          | -3.0544        | -4.1957          | 0.7798             | 1.1413          | 3.5819              | -0.9046             | 1.4895              | -681.7123      | -589.9283    | -1.7159         | -1.7744       |
+| 0.5186        | 0.86  | 3300 | 0.4896          | -2.9688        | -4.1127          | 0.7738             | 1.1440          | 3.5867              | -0.9061             | 1.4963              | -673.4175      | -581.3629    | -1.7207         | -1.7796       |
+| 0.4699        | 0.89  | 3400 | 0.4892          | -2.8679        | -4.0085          | 0.7758             | 1.1406          | 3.5840              | -0.8920             | 1.4895              | -662.9918      | -571.2766    | -1.7332         | -1.7916       |
+| 0.4332        | 0.92  | 3500 | 0.4890          | -2.8539        | -4.0222          | 0.7817             | 1.1684          | 3.6683              | -0.9166             | 1.5238              | -664.3640      | -569.8725    | -1.7403         | -1.7991       |
+| 0.5292        | 0.94  | 3600 | 0.4888          | -2.9244        | -4.1012          | 0.7758             | 1.1768          | 3.6946              | -0.9285             | 1.5356              | -672.2607      | -576.9283    | -1.7327         | -1.7920       |
+| 0.5462        | 0.97  | 3700 | 0.4889          | -2.8929        | -4.0659          | 0.7758             | 1.1730          | 3.6816              | -0.9250             | 1.5309              | -668.7320      | -573.7759    | -1.7393         | -1.7981       |
+| 0.4859        | 0.99  | 3800 | 0.4889          | -2.8993        | -4.0739          | 0.7778             | 1.1746          | 3.6856              | -0.9285             | 1.5334              | -669.5308      | -574.4193    | -1.7408         | -1.7997       |
+### Framework versions
+- PEFT 0.7.1
+- Transformers 4.39.0.dev0
+- Pytorch 2.1.2+cu121
+- Datasets 2.14.6
+- Tokenizers 0.15.2
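
The final evaluation figures and the batch-size settings in the model card above can be cross-checked with simple arithmetic. The sketch below assumes that the reported reward margin is the mean of `chosen - rejected` over the evaluation set (so it equals the difference of the two reported means), and that the total batch size is the per-device batch size times the number of devices times the gradient-accumulation steps. The card does not list gradient accumulation, so the value derived here is an inference, not a recorded setting.

```python
# Values copied from the final evaluation row of the model card.
rewards_chosen = -2.8993
rewards_rejected = -4.0739
reported_margin = 1.1746

# Assumption: margin = mean(chosen - rejected) = mean(chosen) - mean(rejected).
margin = rewards_chosen - rewards_rejected
margin_matches = abs(margin - reported_margin) < 1e-3

# Hyperparameters copied from the model card.
train_batch_size = 4          # per device
num_devices = 4
total_train_batch_size = 16

# Inferred, not stated in the card: accumulation steps implied by the totals.
implied_grad_accum = total_train_batch_size // (train_batch_size * num_devices)

print(f"margin = {margin:.4f}, matches reported value: {margin_matches}")
print(f"implied gradient accumulation steps: {implied_grad_accum}")
```

Both checks are consistent with the card: the margin agrees to four decimal places, and the totals imply no gradient accumulation beyond a single step.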

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f8221a573c504ee4df155792996bc2950f5ae5f82837e26f68adcbb4bf3413fc
+oid sha256:27b8855be0a17ae02f70f9657638a1a2146920cebb489abf701ed08858155670
 size 671150064
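
The adapter weights are stored through Git LFS, so the diff above changes only a three-line pointer file: the `oid` differs while `size` stays the same. As a minimal sketch, a pointer of this shape can be parsed as space-separated key/value lines. The helper name `parse_lfs_pointer` is hypothetical and is not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# The new pointer, copied from the "+" side of the diff.
new_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:27b8855be0a17ae02f70f9657638a1a2146920cebb489abf701ed08858155670
size 671150064
"""

info = parse_lfs_pointer(new_pointer)
print(info["hash_algo"], f"{info['size'] / 2**20:.1f} MiB")
```

An unchanged size with a new digest is what one would expect when an adapter of the same shape is retrained and overwritten.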

all_results.json ADDED

	@@ -0,0 +1,8 @@

+{
+    "epoch": 1.0,
+    "train_loss": 0.5262645053336271,
+    "train_runtime": 24722.998,
+    "train_samples": 61134,
+    "train_samples_per_second": 2.473,
+    "train_steps_per_second": 0.155
+}
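
The throughput fields in this file follow from the sample count, the runtime, and the effective batch size. The sketch below recomputes them, assuming an effective batch size of 16 (the `total_train_batch_size` reported in README.md) and a final partial batch that is rounded up into its own step. Both are assumptions about how the Trainer counted steps, not values recorded in this file.

```python
import math

# Values copied from all_results.json.
results = {
    "train_runtime": 24722.998,   # seconds
    "train_samples": 61134,
    "train_samples_per_second": 2.473,
    "train_steps_per_second": 0.155,
}

# Assumption: effective batch size taken from the model card.
effective_batch_size = 16

samples_per_second = results["train_samples"] / results["train_runtime"]

# Assumption: the last partial batch counts as one optimizer step.
total_steps = math.ceil(results["train_samples"] / effective_batch_size)
steps_per_second = total_steps / results["train_runtime"]

runtime_hours = results["train_runtime"] / 3600

print(f"samples/s: {samples_per_second:.3f}")
print(f"steps:     {total_steps}")
print(f"steps/s:   {steps_per_second:.3f}")
print(f"runtime:   {runtime_hours:.2f} h")
```

Under these assumptions the recomputed rates round to the reported 2.473 samples/s and 0.155 steps/s, and the implied step count sits just past the last logged evaluation at step 3800.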

runs/Jul15_07-16-08_notebook-deployment-48-7d9b6c99-p5kv4/events.out.tfevents.1721028151.notebook-deployment-48-7d9b6c99-p5kv4.43131.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:656b93427ab83d1564f7b8edd32522570567357c794fa3a24f7bb7d3863663d3
-size 375706
+oid sha256:9784ad4316aad12add5b6a63fae167ca9937bc431b6f484db3e2ee7545617696
+size 377820

train_results.json ADDED

	@@ -0,0 +1,8 @@

+{
+    "epoch": 1.0,
+    "train_loss": 0.5262645053336271,
+    "train_runtime": 24722.998,
+    "train_samples": 61134,
+    "train_samples_per_second": 2.473,
+    "train_steps_per_second": 0.155
+}

trainer_state.json ADDED

The diff for this file is too large to render. See raw diff