slimfrikha-tii
committed on
fix benchs
README.md
CHANGED
@@ -130,9 +130,9 @@ We report in the following table our internal pipeline benchmarks.
 </tr>
 <tr>
 <td>IFEval</td>
-<td><b>
-<td>64.
-<td>66.
+<td><b>74.7</b></td>
+<td>64.1</td>
+<td>66.3</td>
 <td>68.3</td>
 </tr>
 <tr>
@@ -167,7 +167,7 @@ We report in the following table our internal pipeline benchmarks.
 </tr>
 <tr>
 <td>GPQA (0-shot)</td>
-<td>
+<td>32.2</td>
 <td>29.2</td>
 <td>27.0</td>
 <td><b>29.6</b></td>