SivilTaram committed · commit 1606665 · verified · 1 parent: 7393b95

Update README.md

Update the first 16 models' performance

Files changed (1): README.md (+38 −3)
README.md CHANGED
@@ -1,3 +1,38 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+
+ | **Task / Model** | **model-index-1** | **model-index-2** | **model-index-3** | **model-index-4** | **model-index-5** | **model-index-6** | **model-index-7** | **model-index-8** |
+ |--------------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|
+ | **Social IQA** | 33.27 | 33.33 | 33.62 | 33.53 | 33.49 | 33.56 | 33.62 | 33.55 |
+ | **HellaSwag** | 40.58 | 36.86 | 40.58 | 36.06 | 40.07 | 37.85 | 37.93 | 39.59 |
+ | **PiQA** | 67.29 | 65.14 | 67.97 | 64.66 | 67.03 | 65.36 | 66.0 | 66.55 |
+ | **OpenBookQA** | 28.63 | 27.87 | 29.33 | 29.1 | 29.23 | 28.33 | 29.13 | 28.73 |
+ | **Lambada** | 29.17 | 26.86 | 31.55 | 27.11 | 29.16 | 28.92 | 31.53 | 30.92 |
+ | **SciQ** | 80.68 | 79.98 | 81.05 | 80.8 | 82.4 | 79.88 | 78.67 | 79.7 |
+ | **COPA** | 70.5 | 63.83 | 69.17 | 65.0 | 67.5 | 66.0 | 66.67 | 68.67 |
+ | **RACE** | 29.47 | 30.0 | 32.11 | 28.82 | 31.13 | 30.06 | 29.9 | 30.75 |
+ | **ARC Easy** | 50.03 | 48.72 | 50.01 | 46.64 | 51.06 | 47.46 | 46.75 | 48.39 |
+ | **LogiQA** | 23.76 | 24.17 | 25.29 | 25.29 | 24.55 | 25.96 | 25.45 | 26.32 |
+ | **QQP** | 55.71 | 55.9 | 54.84 | 56.52 | 54.01 | 56.34 | 52.35 | 54.2 |
+ | **WinoGrande** | 51.54 | 51.59 | 51.39 | 50.91 | 53.13 | 52.26 | 51.26 | 51.45 |
+ | **MultiRC** | 52.65 | 53.39 | 51.89 | 50.92 | 49.03 | 53.09 | 53.64 | 50.23 |
+ | **Avg** | 47.18 | 45.97 | 47.60 | 45.80 | 47.06 | 46.54 | 46.38 | 46.85 |
+
+ | **Task / Model** | **model-index-9** | **model-index-10** | **model-index-11** | **model-index-12** | **model-index-13** | **model-index-14** | **model-index-15** | **model-index-16** |
+ |--------------------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
+ | **Social IQA** | 33.43 | 33.21 | 33.31 | 33.17 | 33.28 | 32.43 | 33.57 | 33.7 |
+ | **HellaSwag** | 40.05 | 35.89 | 39.55 | 39.89 | 38.63 | 36.18 | 39.52 | 35.94 |
+ | **PiQA** | 66.6 | 64.74 | 66.29 | 66.27 | 66.9 | 64.05 | 66.7 | 64.51 |
+ | **OpenBookQA** | 28.87 | 26.6 | 29.33 | 28.73 | 29.4 | 27.87 | 29.67 | 27.83 |
+ | **Lambada** | 31.39 | 27.37 | 30.32 | 30.31 | 31.38 | 26.25 | 29.86 | 26.95 |
+ | **SciQ** | 81.1 | 79.12 | 79.97 | 82.85 | 79.42 | 81.4 | 81.38 | 81.23 |
+ | **COPA** | 67.0 | 64.5 | 66.83 | 69.5 | 67.33 | 65.83 | 69.5 | 66.33 |
+ | **RACE** | 30.57 | 29.63 | 30.49 | 30.85 | 30.35 | 28.66 | 31.21 | 29.57 |
+ | **ARC Easy** | 50.66 | 47.74 | 47.47 | 50.18 | 49.92 | 49.52 | 50.73 | 48.65 |
+ | **LogiQA** | 23.6 | 25.65 | 26.37 | 23.81 | 25.58 | 26.29 | 25.86 | 25.12 |
+ | **QQP** | 54.89 | 54.79 | 54.2 | 55.23 | 53.69 | 57.09 | 53.95 | 54.24 |
+ | **WinoGrande** | 50.83 | 51.84 | 51.05 | 51.83 | 52.12 | 52.0 | 51.01 | 51.82 |
+ | **MultiRC** | 54.18 | 54.48 | 50.17 | 52.12 | 51.42 | 52.69 | 51.87 | 53.48 |
+ | **Avg** | 47.17 | 45.81 | 46.57 | 47.29 | 46.88 | 46.17 | 47.30 | 46.11 |