JingweiZuo
committed on
Update README.md
README.md
CHANGED
@@ -1,11 +1,10 @@
----
-datasets:
-- tiiuae/falcon-refinedweb
-- HuggingFaceFW/fineweb-edu
-language:
-- en
-
----
+---
+datasets:
+- tiiuae/falcon-refinedweb
+- HuggingFaceFW/fineweb-edu
+language:
+- en
+---

 <img src="https://huggingface.co/datasets/tiiuae/documentation-images/resolve/main/falcon_mamba/thumbnail.png" alt="drawing" width="800"/>

@@ -205,9 +204,9 @@ We evaluate our model on all benchmarks of the leaderboard's version 2 using the
 | `Falcon2-11B` | 32.61 | 21.94 | 2.34 | 2.80 | 7.53 | 15.44 | 13.78 |
 | `Meta-Llama-3-8B` | 14.55 | 24.50 | 3.25 | 7.38 | 6.24 | 24.55 | 13.41 |
 | `Meta-Llama-3.1-8B` | 12.70 | 25.29 | 4.61 | 6.15 | 8.98 | 24.95 | 13.78 |
-| `gemma-7B` | 26.59 | 21.12 | 6.42 | 4.92 | 10.98 | 21.64 |**15.28**|
 | `Mistral-7B-v0.1` | 23.86 | 22.02 | 2.49 | 5.59 | 10.68 | 22.36 | 14.50 |
-| `Mistral-Nemo-Base` | 16.83 | 29.37 | 4.98 | 5.82 | 6.52 | 27.46 | 15.08 |
+| `Mistral-Nemo-Base-2407 (12B)` | 16.83 | 29.37 | 4.98 | 5.82 | 6.52 | 27.46 | 15.08 |
+| `gemma-7B` | 26.59 | 21.12 | 6.42 | 4.92 | 10.98 | 21.64 |**15.28**|



@@ -222,8 +221,8 @@ We evaluate our model on all benchmarks of the leaderboard's version 2 using the
 |***Transformer models*** | | | | | | | |
 | `Falcon2-11B` | 59.73 | 82.91 | 58.37 | 78.30 | 52.56 | 53.83 | **64.28** |
 | `Meta-Llama-3-8B` | 60.24 | 82.23 | 66.70 | 78.45 | 42.93 | 45.19 | 62.62 |
-| `gemma-7B` | 61.09 | 82.20 | 64.56 | 79.01 | 44.79 | 50.87 | 63.75 |
 | `Mistral-7B-v0.1` | 59.98 | 83.31 | 64.16 | 78.37 | 42.15 | 37.83 | 60.97 |
+| `gemma-7B` | 61.09 | 82.20 | 64.56 | 79.01 | 44.79 | 50.87 | 63.75 |

 ## Throughput

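The column headers are outside the hunks shown here, so the meaning of the last column in each table row is not visible in this diff. A minimal sketch, assuming that the last column is the unweighted mean of the six scores before it, can be used to sanity-check the rows that were moved or renamed. The names `ROWS`, `check_average`, and `mismatched_rows` are hypothetical helpers for illustration, not part of the README.

```python
# Rows copied from the diff: six per-benchmark scores, then the listed last column.
# ASSUMPTION: the last column is the unweighted mean of the six scores;
# the column headers are not visible in this diff.
ROWS = {
    "Falcon2-11B": ([32.61, 21.94, 2.34, 2.80, 7.53, 15.44], 13.78),
    "Meta-Llama-3-8B": ([14.55, 24.50, 3.25, 7.38, 6.24, 24.55], 13.41),
    "Meta-Llama-3.1-8B": ([12.70, 25.29, 4.61, 6.15, 8.98, 24.95], 13.78),
    "Mistral-7B-v0.1": ([23.86, 22.02, 2.49, 5.59, 10.68, 22.36], 14.50),
    "Mistral-Nemo-Base-2407 (12B)": ([16.83, 29.37, 4.98, 5.82, 6.52, 27.46], 15.08),
    "gemma-7B": ([26.59, 21.12, 6.42, 4.92, 10.98, 21.64], 15.28),
}


def check_average(scores, listed, tol=0.005):
    """Return (matches, mean): whether the mean agrees with the listed value
    to within two-decimal rounding, plus the computed mean itself."""
    mean = sum(scores) / len(scores)
    return abs(mean - listed) <= tol + 1e-9, mean


def mismatched_rows(rows):
    """Names of rows whose listed last column is not the mean of their scores."""
    return [
        name
        for name, (scores, listed) in rows.items()
        if not check_average(scores, listed)[0]
    ]


if __name__ == "__main__":
    for name, (scores, listed) in ROWS.items():
        ok, mean = check_average(scores, listed)
        status = "ok" if ok else "differs"
        print(f"{name}: mean={mean:.2f} listed={listed:.2f} {status}")
```

Under that assumption, every row of the first table reproduces except `Mistral-Nemo-Base-2407 (12B)`, whose six scores average to about 15.16 against a listed 15.08. That may mean the column is computed differently for that row or that one of its values differs from the source, so treat the check as a sanity test and not as a correction.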