Text Generation
Transformers
Safetensors
English
falcon_mamba
Eval Results
Inference Endpoints
JingweiZuo commited on
Commit
d2cd831
·
verified ·
1 Parent(s): 02688fc

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +10 -11
README.md CHANGED
@@ -1,11 +1,10 @@
1
- ---
2
- datasets:
3
- - tiiuae/falcon-refinedweb
4
- - HuggingFaceFW/fineweb-edu
5
- language:
6
- - en
7
- license: apache-2.0
8
- ---
9
 
10
  <img src="https://huggingface.co/datasets/tiiuae/documentation-images/resolve/main/falcon_mamba/thumbnail.png" alt="drawing" width="800"/>
11
 
@@ -205,9 +204,9 @@ We evaluate our model on all benchmarks of the leaderboard's version 2 using the
205
  | `Falcon2-11B` | 32.61 | 21.94 | 2.34 | 2.80 | 7.53 | 15.44 | 13.78 |
206
  | `Meta-Llama-3-8B` | 14.55 | 24.50 | 3.25 | 7.38 | 6.24 | 24.55 | 13.41 |
207
  | `Meta-Llama-3.1-8B` | 12.70 | 25.29 | 4.61 | 6.15 | 8.98 | 24.95 | 13.78 |
208
- | `gemma-7B` | 26.59 | 21.12 | 6.42 | 4.92 | 10.98 | 21.64 |**15.28**|
209
  | `Mistral-7B-v0.1` | 23.86 | 22.02 | 2.49 | 5.59 | 10.68 | 22.36 | 14.50 |
210
- | `Mistral-Nemo-Base` | 16.83 | 29.37 | 4.98 | 5.82 | 6.52 | 27.46 | 15.08 |
 
211
 
212
 
213
 
@@ -222,8 +221,8 @@ We evaluate our model on all benchmarks of the leaderboard's version 2 using the
222
  |***Transformer models*** | | | | | | | |
223
  | `Falcon2-11B` | 59.73 | 82.91 | 58.37 | 78.30 | 52.56 | 53.83 | **64.28** |
224
  | `Meta-Llama-3-8B` | 60.24 | 82.23 | 66.70 | 78.45 | 42.93 | 45.19 | 62.62 |
225
- | `gemma-7B` | 61.09 | 82.20 | 64.56 | 79.01 | 44.79 | 50.87 | 63.75 |
226
  | `Mistral-7B-v0.1` | 59.98 | 83.31 | 64.16 | 78.37 | 42.15 | 37.83 | 60.97 |
 
227
 
228
  ## Throughput
229
 
 
1
+ ---
2
+ datasets:
3
+ - tiiuae/falcon-refinedweb
4
+ - HuggingFaceFW/fineweb-edu
5
+ language:
6
+ - en
7
+ ---
 
8
 
9
  <img src="https://huggingface.co/datasets/tiiuae/documentation-images/resolve/main/falcon_mamba/thumbnail.png" alt="drawing" width="800"/>
10
 
 
204
  | `Falcon2-11B` | 32.61 | 21.94 | 2.34 | 2.80 | 7.53 | 15.44 | 13.78 |
205
  | `Meta-Llama-3-8B` | 14.55 | 24.50 | 3.25 | 7.38 | 6.24 | 24.55 | 13.41 |
206
  | `Meta-Llama-3.1-8B` | 12.70 | 25.29 | 4.61 | 6.15 | 8.98 | 24.95 | 13.78 |
 
207
  | `Mistral-7B-v0.1` | 23.86 | 22.02 | 2.49 | 5.59 | 10.68 | 22.36 | 14.50 |
208
+ | `Mistral-Nemo-Base-2407 (12B)` | 16.83 | 29.37 | 4.98 | 5.82 | 6.52 | 27.46 | 15.08 |
209
+ | `gemma-7B` | 26.59 | 21.12 | 6.42 | 4.92 | 10.98 | 21.64 |**15.28**|
210
 
211
 
212
 
 
221
  |***Transformer models*** | | | | | | | |
222
  | `Falcon2-11B` | 59.73 | 82.91 | 58.37 | 78.30 | 52.56 | 53.83 | **64.28** |
223
  | `Meta-Llama-3-8B` | 60.24 | 82.23 | 66.70 | 78.45 | 42.93 | 45.19 | 62.62 |
 
224
  | `Mistral-7B-v0.1` | 59.98 | 83.31 | 64.16 | 78.37 | 42.15 | 37.83 | 60.97 |
225
+ | `gemma-7B` | 61.09 | 82.20 | 64.56 | 79.01 | 44.79 | 50.87 | 63.75 |
226
 
227
  ## Throughput
228