giraffe176 committed on
Commit
5c2d264
·
verified ·
1 Parent(s): ec3d032

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -143,7 +143,7 @@ This is a merge of pre-trained language models created using [mergekit](https://
 ### Merge Method
 
 This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) as a base.
-Density was chosen deterministically between the models chosen for this merge. After testing many densities, I settled on 0.58 for each of the chosen models as it returned the highest EQ-Bench score. Not much testing was done with the weights, but I thought that I'd try gradients. Conceptually, Westlake and a Distilled version of Open Heremes are heavier in the initial layers (guiding understanding, and thoughts), before Noromaid and AlphaMonarch come in to guide its wants reasoning and conversation.
+Density was chosen deterministically between the models chosen for this merge. After testing many densities, I settled on 0.58 for each of the chosen models as it returned the highest EQ-Bench score. Not much testing was done with the weights, but I thought that I'd try gradients. Conceptually, Westlake and a Distilled version of Open Heremes are heavier in the initial layers (guiding understanding, and thoughts), before Noromaid and AlphaMonarch come in to guide its wants, reasoning, and conversation.
 
 
 
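
For readers unfamiliar with the terms in the changed paragraph, here is a rough Python sketch of the two ideas it mentions: a density of 0.58 in the DARE sense (keep a fraction of each model's delta from the base and rescale the survivors) and a per-layer weight gradient (a weight that ramps across layers). This is not the mergekit implementation or the author's configuration. The function names `dare_sparsify` and `layer_gradient`, the toy tensors, the anchor weights `[0.8, 0.2]`, and the layer count are all hypothetical, and the TIES sign-election step is omitted.

```python
import random


def dare_sparsify(delta, density, rng):
    """Keep each delta entry with probability `density` and rescale
    survivors by 1/density so the expected value is unchanged."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def layer_gradient(anchors, num_layers):
    """Linearly interpolate a short list of anchor weights across layers
    (an assumed reading of what a weight 'gradient' means here)."""
    if num_layers == 1 or len(anchors) == 1:
        return [anchors[0]] * num_layers
    weights = []
    for i in range(num_layers):
        pos = i / (num_layers - 1) * (len(anchors) - 1)
        lo = min(int(pos), len(anchors) - 2)
        frac = pos - lo
        weights.append(anchors[lo] * (1 - frac) + anchors[lo + 1] * frac)
    return weights


rng = random.Random(0)                    # fixed seed for repeatability
base = [0.10, -0.20, 0.30, 0.05]          # toy base-model parameters
tuned = [0.15, -0.10, 0.25, 0.05]         # toy fine-tuned parameters
delta = [t - b for t, b in zip(tuned, base)]

sparse = dare_sparsify(delta, density=0.58, rng=rng)

# A model that is "heavier in the initial layers": weight falls from 0.8 to 0.2.
weights = layer_gradient([0.8, 0.2], num_layers=5)
```

In this sketch each entry of `sparse` is either zero or the original delta divided by 0.58, and `weights` falls linearly from the first anchor to the last, which mirrors the description of some models dominating early layers and others coming in later.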