allenai/llama-3-tulu-v2.5-8b-uf-mean-8b-uf-rm


hamishivi committed on Oct 14, 2024

Commit 6d52b05 · verified · 1 Parent(s): 618fad1

Update README.md

Files changed (1)

README.md +1 -1


@@ -51,7 +51,7 @@ For details on training and evaluation, read [our paper](https://arxiv.org/abs/2
 | Model | Size | Alignment | GSM8k 8-shot CoT Acc. | AlpacaEval 2 Winrate (LC) |
-|-|-|-|-|-|-|
+|-|-|-|-|-|
 | **Tulu V2.5 PPO Llama 3 8B (this model)** | 8B | PPO with 8B RM | 61.5 | 22.7 |
 | **Tulu V2.5 PPO 13B** | 13B | PPO with 70B RM | 58.0 | **26.7** |
 | **Tulu V2 DPO 13B** | 13B | DPO | 50.5 | 16.0 |
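
Reviewer note: the change replaces a delimiter row of seven cells with one of five, matching the five-column header. GitHub Flavored Markdown only recognizes a table when the header and delimiter rows have the same number of cells. A minimal sketch of a check for this kind of mismatch follows; the helper names `count_cells` and `delimiter_matches_header` are hypothetical and not part of this repository, and the split is naive, so escaped pipes (`\|`) inside a cell are not handled.

```python
def count_cells(row: str) -> int:
    """Count the cells in a pipe-delimited Markdown table row.

    Hypothetical helper for illustration; it splits on every pipe,
    so it does not handle escaped pipes inside a cell.
    """
    stripped = row.strip()
    # Drop the optional leading and trailing pipes before splitting.
    if stripped.startswith("|"):
        stripped = stripped[1:]
    if stripped.endswith("|"):
        stripped = stripped[:-1]
    return len(stripped.split("|"))


def delimiter_matches_header(header: str, delimiter: str) -> bool:
    """Return True when both rows have the same number of cells."""
    return count_cells(header) == count_cells(delimiter)


# Rows copied from the diff above.
header = "| Model | Size | Alignment | GSM8k 8-shot CoT Acc. | AlpacaEval 2 Winrate (LC) |"
old_delimiter = "|-|-|-|-|-|-|-|"
new_delimiter = "|-|-|-|-|-|"

if __name__ == "__main__":
    print("old delimiter matches header:", delimiter_matches_header(header, old_delimiter))
    print("new delimiter matches header:", delimiter_matches_header(header, new_delimiter))
```

Run against the rows in this commit, the old delimiter row fails the check and the new one passes.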