KaraKaraWitch committed 0d2ef98 (verified, parent 6cf6814): Update README.md
---
base_model:
- Sao10K/70B-L3.3-Cirrus-x1
- nitky/Llama-3.3-SuperSwallowX-70B-Instruct-v0.1
- EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
- Doctor-Shotgun/L3.3-70B-Magnum-v4-SE
- TheDrummer/Anubis-70B-v1
- Undi95/Sushi-v1.4
- pankajmathur/orca_mini_v9_3_70B
- SicariusSicariiStuff/Negative_LLAMA_70B
- Sao10K/L3.3-70B-Euryale-v2.3
- nbeerbower/Llama-3.1-Nemotron-lorablated-70B
- Blackroot/Mirai-3.0-70B
library_name: transformers
tags:
- mergekit
- merge
---
 
# ...?

![image/png](https://cdn-uploads.huggingface.co/production/uploads/633e85093a17ab61de8d9073/OsfG_W1DavvURfreTJ52l.png)
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

Doomer, but probably gooder. The rest of the numbers are guessed.

This will probably be the last model I mix for a while. Going to touch grass in another country.
## Merge Details
### Merge Method

This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with [SicariusSicariiStuff/Negative_LLAMA_70B](https://huggingface.co/SicariusSicariiStuff/Negative_LLAMA_70B) as the base model.
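Roughly, TIES works per tensor by trimming each model's task vector (its delta from the base) to the highest-magnitude entries, electing a majority sign per parameter, and averaging only the contributions that agree with that sign. A minimal single-tensor NumPy sketch, for intuition only (`ties_merge` is a hypothetical helper; mergekit's actual trimming, tie-breaking, and normalization differ in detail):

```python
import numpy as np

def ties_merge(base, finetuned, densities, weights):
    """Toy TIES merge for one weight tensor: trim, elect sign, disjoint merge."""
    deltas = []
    for theta, density, weight in zip(finetuned, densities, weights):
        delta = theta - base                          # task vector
        k = max(1, int(round(density * delta.size)))  # entries to keep
        thresh = np.sort(np.abs(delta).ravel())[-k]   # magnitude cutoff
        trimmed = np.where(np.abs(delta) >= thresh, delta, 0.0)
        deltas.append(weight * trimmed)
    stacked = np.stack(deltas)
    elected = np.sign(stacked.sum(axis=0))            # elected sign per parameter
    agree = np.where(np.sign(stacked) == elected, stacked, 0.0)
    count = (agree != 0).sum(axis=0)                  # models that agree
    return base + agree.sum(axis=0) / np.maximum(count, 1)
```

The `density` values in the config below map onto the fraction kept in the trim step, and `weight` scales each model's trimmed task vector before sign election.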
### Models Merged

The following models were included in the merge:
* [Sao10K/70B-L3.3-Cirrus-x1](https://huggingface.co/Sao10K/70B-L3.3-Cirrus-x1)
* [nitky/Llama-3.3-SuperSwallowX-70B-Instruct-v0.1](https://huggingface.co/nitky/Llama-3.3-SuperSwallowX-70B-Instruct-v0.1)
* [EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1](https://huggingface.co/EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1)
* [Doctor-Shotgun/L3.3-70B-Magnum-v4-SE](https://huggingface.co/Doctor-Shotgun/L3.3-70B-Magnum-v4-SE)
* [TheDrummer/Anubis-70B-v1](https://huggingface.co/TheDrummer/Anubis-70B-v1)
* [Undi95/Sushi-v1.4](https://huggingface.co/Undi95/Sushi-v1.4)
* [pankajmathur/orca_mini_v9_3_70B](https://huggingface.co/pankajmathur/orca_mini_v9_3_70B)
* [Sao10K/L3.3-70B-Euryale-v2.3](https://huggingface.co/Sao10K/L3.3-70B-Euryale-v2.3)
* [nbeerbower/Llama-3.1-Nemotron-lorablated-70B](https://huggingface.co/nbeerbower/Llama-3.1-Nemotron-lorablated-70B)
* [Blackroot/Mirai-3.0-70B](https://huggingface.co/Blackroot/Mirai-3.0-70B)
50
+ ### Configuration
51
+
52
+ The following YAML configuration was used to produce this model:
53
+
54
+ ```yaml
55
+ models:
56
+ - model: Blackroot/Mirai-3.0-70B
57
+ parameters:
58
+ density: 0.2
59
+ weight: 0.5
60
+ - model: nbeerbower/Llama-3.1-Nemotron-lorablated-70B
61
+ parameters:
62
+ density: 1
63
+ weight: 0.25
64
+ - model: Doctor-Shotgun/L3.3-70B-Magnum-v4-SE
65
+ parameters:
66
+ density: 0.3
67
+ weight: 0.5
68
+ - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
69
+ parameters:
70
+ density: 0.75
71
+ weight: 0.5
72
+ - model: TheDrummer/Anubis-70B-v1
73
+ parameters:
74
+ density: 0.351
75
+ weight: 0.751
76
+ - model: Sao10K/L3.3-70B-Euryale-v2.3
77
+ parameters:
78
+ density: 0.420
79
+ weight: 0.679
80
+ - model: Sao10K/70B-L3.3-Cirrus-x1
81
+ parameters:
82
+ density: 0.43
83
+ weight: 0.3
84
+ - model: nitky/Llama-3.3-SuperSwallowX-70B-Instruct-v0.1
85
+ parameters:
86
+ density: 0.25
87
+ weight: 0.2
88
+ - model: Undi95/Sushi-v1.4
89
+ parameters:
90
+ density: 0.1457
91
+ weight: 0.69
92
+ - model: pankajmathur/orca_mini_v9_3_70B
93
+ parameters:
94
+ density: 0.2
95
+ weight: 0.2
96
+
97
+ merge_method: ties
98
+ base_model: SicariusSicariiStuff/Negative_LLAMA_70B
99
+ parameters:
100
+ normalize: true
101
+ dtype: bfloat16
102
+ ```
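With `normalize: true`, mergekit divides the combined delta by the sum of the contributing weights, so it is the weights' relative proportions, more than their absolute scale, that shape the result. The global shares below are a simplification (the real normalization happens per parameter over sign-agreeing models), but they give a rough feel for each model's pull in this mix:

```python
# Per-model weights copied from the config above.
weights = {
    "Blackroot/Mirai-3.0-70B": 0.5,
    "nbeerbower/Llama-3.1-Nemotron-lorablated-70B": 0.25,
    "Doctor-Shotgun/L3.3-70B-Magnum-v4-SE": 0.5,
    "EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1": 0.5,
    "TheDrummer/Anubis-70B-v1": 0.751,
    "Sao10K/L3.3-70B-Euryale-v2.3": 0.679,
    "Sao10K/70B-L3.3-Cirrus-x1": 0.3,
    "nitky/Llama-3.3-SuperSwallowX-70B-Instruct-v0.1": 0.2,
    "Undi95/Sushi-v1.4": 0.69,
    "pankajmathur/orca_mini_v9_3_70B": 0.2,
}
total = sum(weights.values())
# Relative share of each model's (trimmed) task vector after normalization.
share = {name: w / total for name, w in weights.items()}
for name, s in sorted(share.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {s:.1%}")
```

Anubis and Sushi end up with the largest nominal pulls, though the per-model `density` trimming means a high-weight, low-density model still only touches a small slice of parameters.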