Ray2333 committed on
Commit beab710 · verified · 1 Parent(s): 99d5a45

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -19,6 +19,8 @@ The framework is shown above. The introduced text generation regularization mark
  
  This reward model is finetuned from [gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it) using the [weqweasdas/preference_dataset_mixture2_and_safe_pku](https://huggingface.co/datasets/weqweasdas/preference_dataset_mixture2_and_safe_pku) dataset.
  
+ Check our GRM series at 🤗[hugging face](https://huggingface.co/collections/Ray2333/grm-66882bdf7152951779506c7b) and github repo at [Github](https://github.com/YangRui2015/Generalizable-Reward-Model).
+
  
  ## Evaluation
  We evaluate GRM-Gemma2-2B-sftreg on the [reward model benchmark](https://huggingface.co/spaces/allenai/reward-bench), where it achieves a score of 84.7.