giraffe176
committed on
Update README.md
README.md
CHANGED
@@ -143,7 +143,7 @@ This is a merge of pre-trained language models created using [mergekit](https://
 ### Merge Method
 
 This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) as a base.
-Density was chosen deterministically between the models chosen for this merge. After testing many densities, I settled on 0.58 for each of the chosen models as it returned the highest EQ-Bench score. Not much testing was done with the weights, but I thought that I'd try gradients. Conceptually, Westlake and a Distilled version of Open Heremes are heavier in the initial layers (guiding understanding, and thoughts), before Noromaid and AlphaMonarch come in to guide its wants reasoning and conversation.
+Density was chosen deterministically between the models chosen for this merge. After testing many densities, I settled on 0.58 for each of the chosen models as it returned the highest EQ-Bench score. Not much testing was done with the weights, but I thought that I'd try gradients. Conceptually, Westlake and a Distilled version of Open Heremes are heavier in the initial layers (guiding understanding, and thoughts), before Noromaid and AlphaMonarch come in to guide its wants, reasoning, and conversation.
 
 
 
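
For readers unfamiliar with the density setting mentioned in the changed line, here is a minimal sketch of the drop-and-rescale step described in the DARE paper, applied to a toy task vector with the density of 0.58 quoted above. The function name `dare_sparsify` and the toy values are hypothetical, chosen for illustration only. This is not mergekit's implementation, and it leaves out the TIES sign-election step entirely.

```python
import random


def dare_sparsify(delta, density, rng):
    """Keep each delta entry with probability `density`, rescaling kept entries by 1/density.

    `delta` is a flat list of (fine-tuned - base) weight differences.
    Hypothetical helper for illustration; not taken from mergekit.
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    # Dropped entries become 0.0; survivors are scaled up so the
    # expected value of every entry is unchanged.
    return [d / density if rng.random() < density else 0.0 for d in delta]


# Toy task vector: the values are made up, only the density comes from the README.
rng = random.Random(0)
delta = [0.1] * 10000
sparse = dare_sparsify(delta, density=0.58, rng=rng)

kept_fraction = sum(1 for x in sparse if x != 0.0) / len(sparse)
mean_after = sum(sparse) / len(sparse)
print(f"kept fraction: {kept_fraction:.3f}, mean after rescale: {mean_after:.4f}")
```

The kept fraction should land near the density, and the mean of the rescaled vector should stay close to the original 0.1, which is the property the rescaling is meant to preserve.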