Update README.md
README.md
## Training details

This model is a fine-tuned version of [starchat2-15b-sft-v0.1](https://huggingface.co/HuggingFaceH4/starchat2-15b-sft-v0.1) on the HuggingFaceH4/ultrafeedback_binarized and the HuggingFaceH4/orca_dpo_pairs datasets. Check out the recipe in the [Alignment Handbook](https://github.com/huggingface/alignment-handbook) for more details.

It achieves the following results on the evaluation set:
- Loss: 0.4347
- Rewards/chosen: -0.9461
- Logits/rejected: -2.3817
- Logits/chosen: -2.3005
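The Rewards/* metrics reported above are the implicit rewards tracked during DPO training. As an illustrative sketch only (this is not the card's training code, which follows the Alignment Handbook recipe; the helper name `dpo_loss` and the value `beta=0.1` are assumptions for the example), the per-pair DPO loss and implicit rewards can be computed like this:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sketch of the DPO objective for a single preference pair.

    The implicit reward of a completion is beta times the gap between
    the policy and reference log-probabilities; the loss is
    -log(sigmoid(reward margin)) between chosen and rejected.
    Note: beta=0.1 is an assumed hyperparameter, not taken from the card.
    """
    reward_chosen = beta * (policy_chosen_logp - ref_chosen_logp)
    reward_rejected = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = reward_chosen - reward_rejected
    loss = -math.log(1.0 / (1.0 + math.exp(-margin)))
    return loss, reward_chosen, reward_rejected

# Toy log-probabilities, purely for illustration.
loss, rc, rr = dpo_loss(-20.0, -30.0, -22.0, -25.0)
```

A negative Rewards/chosen (as in the table above) simply means the policy assigns the chosen responses lower log-probability than the reference model does; what DPO optimizes is the margin between chosen and rejected, not the sign of either reward.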
## Training procedure

### Training hyperparameters