---
library_name: transformers
license: apache-2.0
base_model:
- google/gemma-7b
- google/gemma-7b-it
tags:
- merge
- mergekit
- google/gemma-7b-it
- google/gemma-7b
---
![image/webp](https://plus.unsplash.com/premium_photo-1664526284199-e36d32a3941d?w=800&auto=format&fit=crop&q=60&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxzZWFyY2h8MTN8fHNtYWxsZXJ8ZW58MHx8MHx8fDA%3D)
# Gemma-7B-slerp
Gemma-7B-slerp is a SLERP merge of the Gemma 7B base and instruction-tuned models, built with [mergekit](https://github.com/cg123/mergekit):
* [google/gemma-7b-it](https://huggingface.co/google/gemma-7b-it)
* [google/gemma-7b](https://huggingface.co/google/gemma-7b)
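For intuition, SLERP (spherical linear interpolation) blends each pair of weight tensors along the arc between them on a hypersphere rather than taking a straight weighted average, which better preserves the geometry of the two parameter sets. The sketch below is a minimal, self-contained illustration in plain PyTorch with toy tensors; it is not mergekit's exact implementation.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors of the same shape."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    # Angle between the two weight vectors on the unit hypersphere.
    omega = torch.arccos(torch.clamp(a_unit @ b_unit, -1.0, 1.0))
    sin_omega = torch.sin(omega)
    if sin_omega.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        out = (1.0 - t) * a_flat + t * b_flat
    else:
        out = (torch.sin((1.0 - t) * omega) / sin_omega) * a_flat + \
              (torch.sin(t * omega) / sin_omega) * b_flat
    return out.reshape(a.shape)

# Toy example: blend two stand-in weight matrices with t = 0.5 (equal contribution).
w_base = torch.randn(4, 4)      # stand-in for a google/gemma-7b weight
w_instruct = torch.randn(4, 4)  # stand-in for a google/gemma-7b-it weight
merged = slerp(0.5, w_base, w_instruct)
```

In the configuration further below, the `t` schedule controls this blend per layer and per module (self-attention vs. MLP), so different parts of the network lean more toward the base or the instruct model.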
## πŸ† Evaluation
### Nous
Gemma-7B-slerp's results on Nous' benchmark suite (evaluation performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval)).
| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---:|---:|---:|---:|---:|
| [arcee-ai/Gemma-7B-slerp](https://huggingface.co/arcee-ai/gemma-7b-slerp) [πŸ“„](https://gist.github.com/shamanez/4c18f8d79747d4019ecf6d5ce098cf72) | 34.14 | 23.86 | 36.55 | 46.22 | 29.94 |
## 🧩 Configuration
Slerp YAML Config
```yaml
slices:
  - sources:
      - model: google/gemma-7b-it
        layer_range: [0, 28]
      - model: google/gemma-7b
        layer_range: [0, 28]
merge_method: slerp
base_model: google/gemma-7b
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
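## πŸ’» Usage

To reproduce the merge, save the configuration above to a file (e.g. `config.yaml`) and run mergekit's CLI, e.g. `mergekit-yaml config.yaml ./gemma-7b-slerp`; the exact flags available depend on your mergekit version. The snippet below is a minimal inference sketch for the merged model using `transformers` (it assumes `accelerate` is installed and that the repo id matches the one in the evaluation table above).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/Gemma-7B-slerp"  # or the local output directory of your own merge

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the bfloat16 weights produced by the merge
    device_map="auto",    # requires accelerate; places layers on available devices
)

prompt = "Explain spherical linear interpolation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```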