matchaaaaa committed
Commit dc6cf89 · verified · 1 parent: 8d06f30

Update README.md

Files changed (1)
  1. README.md +58 -58
README.md CHANGED
@@ -1,58 +1,58 @@
- ---
- base_model: []
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # Tiramisu-12B-v0.1k
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the linear [DARE](https://arxiv.org/abs/2311.03099) merge method using D:/MLnonsense/models/flammenai_Mahou-1.3-mistral-nemo-12B as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * D:/MLnonsense/models/nbeerbower_mistral-nemo-gutenberg-12B-v4
- * D:/MLnonsense/models/Sao10K_MN-12B-Lyra-v1
- * D:/MLnonsense/models/Gryphe_Pantheon-RP-1.5-12b-Nemo
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- base_model: D:/MLnonsense/models/flammenai_Mahou-1.3-mistral-nemo-12B
- dtype: bfloat16
- merge_method: dare_linear
- slices:
- - sources:
-   - layer_range: [0, 40]
-     model: D:/MLnonsense/models/Gryphe_Pantheon-RP-1.5-12b-Nemo
-     parameters:
-       weight: [0.45, 0.35, 0.35, 0.2, 0.2]
-   - layer_range: [0, 40]
-     model: D:/MLnonsense/models/Sao10K_MN-12B-Lyra-v1
-     parameters:
-       weight: [0.25, 0.3, 0.35, 0.3, 0.2]
-   - layer_range: [0, 40]
-     model: D:/MLnonsense/models/nbeerbower_mistral-nemo-gutenberg-12B-v4
-     parameters:
-       weight:
-       - filter: mlp
-         value: [0.1, 0.2, 0.1, 0.4, 0.5]
-       - value: [0.1, 0.2, 0.1, 0.2, 0.2]
-   - layer_range: [0, 40]
-     model: D:/MLnonsense/models/flammenai_Mahou-1.3-mistral-nemo-12B
-     parameters:
-       weight:
-       - filter: mlp
-         value: [0.2, 0.15, 0.2, 0.1, 0.1]
-       - value: [0.2, 0.15, 0.2, 0.3, 0.4]
- tokenizer_source: union
- ```
+ ---
+ base_model: [flammenai/Mahou-1.3-mistral-nemo-12B]
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+
+ ---
+ # Tiramisu-12B-v0.1
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the linear [DARE](https://arxiv.org/abs/2311.03099) merge method using flammenai/Mahou-1.3-mistral-nemo-12B as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * nbeerbower/mistral-nemo-gutenberg-12B-v4
+ * Sao10K/MN-12B-Lyra-v1
+ * Gryphe/Pantheon-RP-1.5-12b-Nemo
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ base_model: flammenai/Mahou-1.3-mistral-nemo-12B
+ dtype: bfloat16
+ merge_method: dare_linear
+ slices:
+ - sources:
+   - layer_range: [0, 40]
+     model: Gryphe/Pantheon-RP-1.5-12b-Nemo
+     parameters:
+       weight: [0.45, 0.35, 0.35, 0.2, 0.2]
+   - layer_range: [0, 40]
+     model: Sao10K/MN-12B-Lyra-v1
+     parameters:
+       weight: [0.25, 0.3, 0.35, 0.3, 0.2]
+   - layer_range: [0, 40]
+     model: nbeerbower/mistral-nemo-gutenberg-12B-v4
+     parameters:
+       weight:
+       - filter: mlp
+         value: [0.1, 0.2, 0.1, 0.4, 0.5]
+       - value: [0.1, 0.2, 0.1, 0.2, 0.2]
+   - layer_range: [0, 40]
+     model: flammenai/Mahou-1.3-mistral-nemo-12B
+     parameters:
+       weight:
+       - filter: mlp
+         value: [0.2, 0.15, 0.2, 0.1, 0.1]
+       - value: [0.2, 0.15, 0.2, 0.3, 0.4]
+ tokenizer_source: union
+ ```
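
The point of this change is that the updated configuration now references public Hugging Face model IDs, so the merge can be re-run outside the author's local `D:/MLnonsense` setup. Below is a minimal sketch of doing so, assuming mergekit is installed (`pip install mergekit`) and the YAML from the new README has been saved locally as `tiramisu-config.yaml` (a placeholder filename, not part of the commit); `mergekit-yaml` is mergekit's documented command-line entry point.

```python
# Minimal sketch: re-run the dare_linear merge from the updated README config.
# Assumes `pip install mergekit` and that the YAML above was saved as
# tiramisu-config.yaml (placeholder name). The --cuda flag is optional and
# assumes a GPU is available.
import subprocess

subprocess.run(
    [
        "mergekit-yaml",          # mergekit's CLI entry point
        "tiramisu-config.yaml",   # the dare_linear config shown in the README
        "./Tiramisu-12B-v0.1",    # output directory for the merged weights
        "--cuda",                 # optional: perform the merge on GPU
    ],
    check=True,
)
```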
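Since the card keeps `library_name: transformers`, the merged model should load like any other causal LM. The sketch below is a hedged usage example: the repo ID `matchaaaaa/Tiramisu-12B-v0.1` is inferred from the committer name and the new title rather than stated in the diff, and `device_map="auto"` assumes the `accelerate` package is installed.

```python
# Hedged example: load and sample from the merged model with transformers.
# The repo ID is an assumption based on the committer and card title.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "matchaaaaa/Tiramisu-12B-v0.1"  # assumed repo ID, adjust if different

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches `dtype: bfloat16` in the merge config
    device_map="auto",           # requires `accelerate`
)

prompt = "Write a short scene set in a quiet mountain teahouse."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```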