Added evaluations
README.md
CHANGED
@@ -149,13 +149,14 @@ Why is the sky blue?<|im_end|>

 ```

+# Evaluations

-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_rhysjones__phi-2-orange-v2)

+[Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_rhysjones__phi-2-orange-v2)
 | Metric |Value|
 |---------------------------------|----:|
-
+|Average |63.67|
 |AI2 Reasoning Challenge (25-Shot)|61.86|
 |HellaSwag (10-Shot) |76.32|
 |MMLU (5-Shot) |55.72|
@@ -163,3 +164,13 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
 |Winogrande (5-shot) |75.69|
 |GSM8k (5-shot) |57.62|

+[YALL - Yet Another LLM Leaderboard](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard)
+Evaluation from [mlabonne](https://huggingface.co/mlabonne)'s alternative LLM leaderboard:
+| Metric |Value|
+|---------------------------------|----:|
+|Average |49.64|
+|AGIEval |34.55|
+|GPT4All |70.96|
+|TruthfulQA |54.87|
+|Bigbench |38.17|
+
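
As a quick sanity check on the added YALL table, the sketch below recomputes its `Average` row as the plain arithmetic mean of the four benchmark scores listed in the diff. That the leaderboard averages this way is an assumption, not something the commit states, and the names `yall_scores` and `mean_score` are illustrative only. The Open LLM Leaderboard average is not recomputed, because one row of that table falls between the two hunks and is not shown.

```python
# Scores copied from the YALL table added in this commit.
yall_scores = {
    "AGIEval": 34.55,
    "GPT4All": 70.96,
    "TruthfulQA": 54.87,
    "Bigbench": 38.17,
}


def mean_score(scores):
    """Return the unweighted arithmetic mean of a {benchmark: score} mapping.

    Assumption: the leaderboard's "Average" is a simple unweighted mean.
    """
    return sum(scores.values()) / len(scores)


# Round to two decimals to match the precision used in the README table.
yall_average = round(mean_score(yall_scores), 2)
print(yall_average)
```

Under that assumption the rounded mean agrees with the 49.64 reported in the `Average` row.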