Ray2333/GRM-Gemma2-2B-sftreg

Text Classification

Ray2333 committed on Oct 23, 2024

Commit beab710 · verified · 1 Parent(s): 99d5a45

Update README.md

Files changed (1)

README.md +2 -0

@@ -19,6 +19,8 @@ The framework is shown above. The introduced text generation regularization mark
 This reward model is finetuned from [gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it) using the [weqweasdas/preference_dataset_mixture2_and_safe_pku](https://huggingface.co/datasets/weqweasdas/preference_dataset_mixture2_and_safe_pku) dataset.
+Check our GRM series at 🤗[hugging face](https://huggingface.co/collections/Ray2333/grm-66882bdf7152951779506c7b) and github repo at [Github](https://github.com/YangRui2015/Generalizable-Reward-Model).
 ## Evaluation
 We evaluate GRM-Gemma2-2B-sftreg on the [reward model benchmark](https://huggingface.co/spaces/allenai/reward-bench), where it achieves a score of 84.7.