Update README.md
README.md
@@ -48,23 +48,23 @@ This training run is monolingual and uses c4en and english wikipedia datasets.
 
 ## Test results
 
-These are the results from [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) at
+These are the results from [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) at 50B (tokens trained) checkpoint.
 
 |     Task     |Version| Metric | Value |   |Stderr|
 |--------------|------:|--------|------:|---|-----:|
-|anli_r1       |      0|acc     | 0.
-|anli_r2       |      0|acc     | 0.
-|anli_r3       |      0|acc     | 0.
-|hellaswag     |      0|acc     | 0.
-|              |       |acc_norm| 0.
-|lambada_openai|      0|ppl     |
-|              |       |acc     | 0.
-|mathqa        |      0|acc     | 0.
-|              |       |acc_norm| 0.
-|piqa          |      0|acc     | 0.
-|              |       |acc_norm| 0.
-|winogrande    |      0|acc     | 0.
-|wsc           |      0|acc     | 0.
+|anli_r1       |      0|acc     | 0.3480|±  |0.0151|
+|anli_r2       |      0|acc     | 0.3340|±  |0.0149|
+|anli_r3       |      0|acc     | 0.3375|±  |0.0137|
+|hellaswag     |      0|acc     | 0.4476|±  |0.0050|
+|              |       |acc_norm| 0.5904|±  |0.0049|
+|lambada_openai|      0|ppl     |11.0912|±  |0.3672|
+|              |       |acc     | 0.5257|±  |0.0070|
+|mathqa        |      0|acc     | 0.2315|±  |0.0077|
+|              |       |acc_norm| 0.2318|±  |0.0077|
+|piqa          |      0|acc     | 0.7546|±  |0.0100|
+|              |       |acc_norm| 0.7481|±  |0.0101|
+|winogrande    |      0|acc     | 0.5754|±  |0.0139|
+|wsc           |      0|acc     | 0.5000|±  |0.0493|
 
 
 ## Installation
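
Reviewer note on the `Stderr` column in the added rows: for an accuracy metric, the reported value looks consistent with the usual standard error of a mean of 0/1 outcomes. The sketch below is only a sanity check under stated assumptions, not the harness's own code. The function name `binomial_stderr` is hypothetical, and the evaluation-set sizes (1000, 1000, 1200, 104) are assumed from the commonly cited split sizes for these tasks; they do not appear in this diff.

```python
import math


def binomial_stderr(acc: float, n: int) -> float:
    """Standard error of the mean of n 0/1 outcomes with mean `acc`,
    using the sample (n - 1) variance."""
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


# task: (accuracy from the table, Stderr from the table, ASSUMED sample size)
rows = {
    "anli_r1": (0.3480, 0.0151, 1000),
    "anli_r2": (0.3340, 0.0149, 1000),
    "anli_r3": (0.3375, 0.0137, 1200),
    "wsc": (0.5000, 0.0493, 104),
}

for task, (acc, reported, n) in rows.items():
    estimate = binomial_stderr(acc, n)
    print(f"{task}: estimated {estimate:.4f}, reported {reported:.4f}")
```

If the assumed sizes are right, the estimates agree with the table to four decimal places, which suggests the added values are internally consistent. The same check does not apply to `ppl` or to `acc_norm` without knowing how those metrics are aggregated.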