Junxiong Wang committed
Commit 1f9986e
Parent: 24a7edb

add models

Files changed (2)
  1. README.md +4 -4
  2. configs.yaml +3 -3
README.md CHANGED
@@ -1,21 +1,21 @@
 ---
-base_model: /data/junxiong/sft/zephyr_0_5_sft_open_not_openhermes_progressive_train_largest_dataset/
+base_model: JunxiongWang/mamba_0_5_sft
 tags:
 - alignment-handbook
 - generated_from_trainer
 datasets:
 - HuggingFaceH4/ultrafeedback_binarized
 model-index:
-- name: zephyr_0_5_dpo_open_not_openhermes_progressive_train_largest_dataset_ep3
+- name: mamba_0_5_dpo_ep3
   results: []
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# zephyr_0_5_dpo_open_not_openhermes_progressive_train_largest_dataset_ep3
+# mamba_0_5_dpo_ep3
 
-This model is a fine-tuned version of [/data/junxiong/sft/zephyr_0_5_sft_open_not_openhermes_progressive_train_largest_dataset/](https://huggingface.co//data/junxiong/sft/zephyr_0_5_sft_open_not_openhermes_progressive_train_largest_dataset/) on the HuggingFaceH4/ultrafeedback_binarized dataset.
+This model is a fine-tuned version of [JunxiongWang/mamba_0_5_dpo_ep3](https://huggingface.co/JunxiongWang/mamba_0_5_dpo_ep3) on the HuggingFaceH4/ultrafeedback_binarized dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.7141
 - Rewards/chosen: -5.3346
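
For reference, a minimal usage sketch for the renamed checkpoint. This is an assumption-laden sketch, not the repo's documented loading path: it assumes JunxiongWang/mamba_0_5_dpo_ep3 loads through the standard transformers Auto classes, whereas a Mamba-based checkpoint may ship its own loading utilities.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is compatible with the transformers Auto classes.
model_id = "JunxiongWang/mamba_0_5_dpo_ep3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches torch_dtype: 'bfloat16' in configs.yaml
    device_map="auto",
)

# Zephyr-style chat usage; the prompt content here is illustrative.
messages = [{"role": "user", "content": "Explain DPO in one paragraph."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=256, do_sample=True, top_p=1.0)
print(tokenizer.decode(output[0], skip_special_tokens=True))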
configs.yaml CHANGED
@@ -1,8 +1,8 @@
-mamba_0_5:
+mamba_0_5_dpo_ep3:
   prompt_template: "zephyr-7b-alpha/prompt.txt"
   fn_completions: "huggingface_local_completions"
   completions_kwargs:
-    model_name: "/data/junxiong/sft/zephyr_0_5_dpo_open_not_openhermes_progressive_train_largest_dataset_ep3/"
+    model_name: "JunxiongWang/mamba_0_5_dpo_ep3"
     model_kwargs:
       torch_dtype: 'bfloat16'
     max_new_tokens: 2048
@@ -10,4 +10,4 @@ mamba_0_5:
     top_p: 1.0
     do_sample: True
   pretty_name: "Mamba 0 5 From Zephyr 7B Beta"
-  link: "https://huggingface.co/HuggingFaceH4/zephyr-7b-beta"
+  link: "https://huggingface.co/JunxiongWang/mamba_0_5_dpo_ep3"
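
This entry follows the AlpacaEval model-config layout, where fn_completions: "huggingface_local_completions" runs the model locally with the listed completions_kwargs. Below is a small sketch of how the updated entry maps onto generation settings; the file path and the printout are illustrative, and the field names are taken from the diff above.

import torch
import yaml

# Hypothetical path; point this at the repo's configs.yaml.
with open("configs.yaml") as f:
    configs = yaml.safe_load(f)

cfg = configs["mamba_0_5_dpo_ep3"]  # key renamed in this commit
ck = cfg["completions_kwargs"]

model_name = ck["model_name"]  # "JunxiongWang/mamba_0_5_dpo_ep3"
dtype = getattr(torch, ck["model_kwargs"]["torch_dtype"])  # torch.bfloat16

generation_kwargs = {
    "max_new_tokens": ck["max_new_tokens"],  # 2048
    "top_p": ck["top_p"],                    # 1.0
    "do_sample": ck["do_sample"],            # True
}
print(model_name, dtype, generation_kwargs)

With AlpacaEval installed, the same entry can be exercised end to end with something like alpaca_eval evaluate_from_model --model_configs configs.yaml.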