---
license: cc-by-nc-4.0
language:
- en
tags:
- merge
---

# Model Description

This is an update to [EmbeddedLLM/Mistral-7B-Merge-14-v0.2](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.2) that removes potentially TruthfulQA-contaminated models and non-commercially licensed models:
1. [berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha)
2. [Q-bert/MetaMath-Cybertron-Starling](https://huggingface.co/Q-bert/MetaMath-Cybertron-Starling)
3. [v1olet/v1olet_marcoroni-go-bruins-merge-7B](https://huggingface.co/v1olet/v1olet_marcoroni-go-bruins-merge-7B)

This is an experiment to test merging 14 models using DARE TIES 🦙
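
For intuition, here is a toy sketch of what DARE TIES does per weight tensor: each fine-tuned model contributes a task vector (its delta from the base weights), DARE randomly drops roughly `1 - density` of each delta's entries and rescales the survivors by `1 / density`, and TIES elects a per-parameter sign and keeps only contributions that agree with it. This is an illustrative simplification with made-up names, not mergekit's actual implementation:

```python
# Toy per-tensor sketch of DARE-TIES; mergekit's real implementation
# differs in details. All function and variable names are illustrative.
import torch

def dare_ties_merge(base, finetuned, weights, densities):
    deltas = []
    for ft, w, d in zip(finetuned, weights, densities):
        delta = ft - base                    # task vector vs. the base model
        keep = torch.rand_like(delta) < d    # DARE: keep ~density of entries
        deltas.append(w * delta * keep / d)  # rescale survivors by 1/density
    stacked = torch.stack(deltas)
    sign = torch.sign(stacked.sum(dim=0))    # TIES: elect a sign per parameter
    agree = torch.sign(stacked) == sign      # drop contributions that disagree
    return base + (stacked * agree).sum(dim=0)

# Tiny demo: merge two fake 4-parameter "layers" into a zero base.
base = torch.zeros(4)
fts = [torch.tensor([0.2, -0.1, 0.3, 0.0]), torch.tensor([0.1, 0.2, -0.3, 0.4])]
print(dare_ties_merge(base, fts, weights=[0.08, 0.08], densities=[0.4, 0.4]))
```

In the actual config further below, each model's delta gets weight 0.08 and density 0.4 (0.5 for Mistral-7B-Instruct-v0.2).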

The result is a base model that performs quite well but may need some further chat fine-tuning.

The 14 models are as follows:
1. [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
2. [ehartford/dolphin-2.2.1-mistral-7b](https://huggingface.co/ehartford/dolphin-2.2.1-mistral-7b)
3. [SciPhi/SciPhi-Mistral-7B-32k](https://huggingface.co/SciPhi/SciPhi-Mistral-7B-32k)
4. [ehartford/samantha-1.2-mistral-7b](https://huggingface.co/ehartford/samantha-1.2-mistral-7b)
5. [Arc53/docsgpt-7b-mistral](https://huggingface.co/Arc53/docsgpt-7b-mistral)
6. [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)
7. [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B)
8. [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca)
9. [openchat/openchat-3.5-1210](https://huggingface.co/openchat/openchat-3.5-1210)
10. [beowolx/MistralHermes-CodePro-7B-v1](https://huggingface.co/beowolx/MistralHermes-CodePro-7B-v1)
11. [TIGER-Lab/MAmmoTH-7B-Mistral](https://huggingface.co/TIGER-Lab/MAmmoTH-7B-Mistral)
12. [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
13. [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp)
14. [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B)

The base model is [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1).

The YAML config used for this merge is as follows:

```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: ehartford/dolphin-2.2.1-mistral-7b
    parameters:
      weight: 0.08
      density: 0.4
  - model: SciPhi/SciPhi-Mistral-7B-32k
    parameters:
      weight: 0.08
      density: 0.4
  - model: ehartford/samantha-1.2-mistral-7b
    parameters:
      weight: 0.08
      density: 0.4
  - model: Arc53/docsgpt-7b-mistral
    parameters:
      weight: 0.08
      density: 0.4
  - model: HuggingFaceH4/zephyr-7b-beta
    parameters:
      weight: 0.08
      density: 0.4
  - model: meta-math/MetaMath-Mistral-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: Open-Orca/Mistral-7B-OpenOrca
    parameters:
      weight: 0.08
      density: 0.4
  - model: openchat/openchat-3.5-1210
    parameters:
      weight: 0.08
      density: 0.4
  - model: beowolx/MistralHermes-CodePro-7B-v1
    parameters:
      weight: 0.08
      density: 0.4
  - model: TIGER-Lab/MAmmoTH-7B-Mistral
    parameters:
      weight: 0.08
      density: 0.4
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
    parameters:
      weight: 0.08
      density: 0.4
  - model: mlabonne/NeuralHermes-2.5-Mistral-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: mistralai/Mistral-7B-Instruct-v0.2
    parameters:
      weight: 0.08
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```
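
With [mergekit](https://github.com/arcee-ai/mergekit) installed, a config like the one above can be run with its `mergekit-yaml` command to reproduce the merge. To try the merged model with 🤗 Transformers, something along these lines should work; the repo id below is a placeholder (substitute this model's actual Hub id), and since the merge may still need chat fine-tuning, plain text completion is shown rather than a chat template:

```python
# Minimal sketch for loading and sampling from the merged model.
# "EmbeddedLLM/Mistral-7B-Merge-14-v0.3" is a placeholder repo id --
# substitute this model's actual Hugging Face Hub id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "EmbeddedLLM/Mistral-7B-Merge-14-v0.3"  # placeholder, see note above
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # the merge was produced in bfloat16
    device_map="auto",           # requires the `accelerate` package
)

inputs = tokenizer("The key idea behind model merging is", return_tensors="pt")
inputs = inputs.to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```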