Adding Evaluation Results

#1
Files changed (1)
  1. README.md +19 -6
README.md CHANGED
@@ -1,15 +1,15 @@
 ---
+license: cc-by-nc-4.0
+library_name: transformers
+tags:
+- mergekit
+- merge
 base_model:
 - mistralai/Mistral-7B-v0.1
 - argilla/distilabeled-OpenHermes-2.5-Mistral-7B
 - NeverSleep/Noromaid-7B-0.4-DPO
 - senseable/WestLake-7B-v2
 - mlabonne/AlphaMonarch-7B
-library_name: transformers
-tags:
-- mergekit
-- merge
-license: cc-by-nc-4.0
 model-index:
 - name: WestLake_Noromaid_OpenHermes_neural-chatv0.1
   results:
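The hunk above only reorders the YAML front matter, moving `license`, `library_name`, and `tags` ahead of `base_model`. For reference, a minimal sketch of applying an equivalent metadata edit through the Hub API with `huggingface_hub.metadata_update`; the repo id is an assumption inferred from the details-dataset name in the next hunk, not stated in this diff:

```python
# Sketch: apply the same front-matter keys via the Hub API.
# Assumption: the target repo id (inferred from the leaderboard
# details-dataset name, not confirmed by this diff).
from huggingface_hub import metadata_update

metadata_update(
    repo_id="giraffe176/WestMaid_HermesMonarchv0.1",  # assumed repo id
    metadata={
        "license": "cc-by-nc-4.0",
        "library_name": "transformers",
        "tags": ["mergekit", "merge"],
    },
    overwrite=True,   # replace existing values for these keys
    create_pr=True,   # open a PR like this one instead of committing directly
)
```

Note that `metadata_update` merges keys rather than reordering them, so the resulting front matter may differ in key order from the hunk above.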
@@ -205,4 +205,17 @@ dtype: bfloat16
 | NeverSleep/Noromaid-7B-0.4-DPO | | | 59.08 | 62.29 | 84.32 | 63.2 | 42.28 | 76.95 | 25.47 |
 | claude-v1 | 7.900000 | 76.83 | | | | | | | |
 | gpt-3.5-turbo | 7.943750 | 71.74 | | | | | | | |
-| | [(Paper)](https://arxiv.org/abs/2306.05685) | [(Paper)](https://arxiv.org/abs/2312.06281) [Leaderboard](https://eqbench.com/) | | | | | | | |
+| | [(Paper)](https://arxiv.org/abs/2306.05685) | [(Paper)](https://arxiv.org/abs/2312.06281) [Leaderboard](https://eqbench.com/) | | | | | | | |
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_giraffe176__WestMaid_HermesMonarchv0.1)
+
+| Metric                           | Value |
+|----------------------------------|------:|
+| Avg.                             | 72.62 |
+| AI2 Reasoning Challenge (25-shot)| 70.22 |
+| HellaSwag (10-shot)              | 87.42 |
+| MMLU (5-shot)                    | 64.31 |
+| TruthfulQA (0-shot)              | 61.99 |
+| Winogrande (5-shot)              | 82.16 |
+| GSM8k (5-shot)                   | 69.60 |
+
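As a quick sanity check on the added table, the reported Avg. is the unweighted mean of the six benchmark scores:

```python
# Sanity check: the leaderboard "Avg." is the plain mean of the six scores.
scores = {
    "ARC (25-shot)": 70.22,
    "HellaSwag (10-shot)": 87.42,
    "MMLU (5-shot)": 64.31,
    "TruthfulQA (0-shot)": 61.99,
    "Winogrande (5-shot)": 82.16,
    "GSM8k (5-shot)": 69.60,
}
avg = sum(scores.values()) / len(scores)
print(f"{avg:.2f}")  # 72.62, matching the Avg. row
```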
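Since the front matter now declares `library_name: transformers`, the merged model should load with the standard `transformers` API. A minimal usage sketch, assuming the repo id `giraffe176/WestMaid_HermesMonarchv0.1` (inferred from the details-dataset name, not stated in this diff) and the `bfloat16` dtype shown in the merge config context above:

```python
# Usage sketch under the assumptions stated above; the repo id is inferred,
# not confirmed by this diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "giraffe176/WestMaid_HermesMonarchv0.1"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge config's dtype: bfloat16
)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```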