hamishivi committed · Commit 51944c8 · verified · 1 Parent(s): e83c2fb

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -7,7 +7,7 @@ datasets:
 - allenai/tulu-v2-sft-mixture
 language:
 - en
-base_model: allenai/tulu-2-dpo-13b
+base_model: allenai/tulu-2-13b
 license: apache-2.0
 ---
 <center>
@@ -18,7 +18,7 @@ license: apache-2.0
 
 Tulu is a series of language models that are trained to act as helpful assistants.
 Tulu V2.5 is a series of models trained using DPO and PPO starting from the [Tulu 2 suite](https://huggingface.co/collections/allenai/tulu-v2-suite-6551b56e743e6349aab45101).
-This model is trained on the UltraFeedback dataset (using the per-aspect/fine-grained scores for deciding chosen and rejected) using PPO.
+This model is trained using PPO.
 We used a 70B RM trained on our preference data mix, and then used the UltraFeedback prompts during PPO training.
 
 For more details, read the paper:
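For context, the card above describes a causal language model in the Tulu family served through the `transformers` library. The snippet below is a minimal sketch of loading and prompting such a model; the repo id is a placeholder (this commit does not name the model's own id), and the `<|user|>`/`<|assistant|>` prompt format is the one documented for the Tulu 2 suite, used here as an assumption.

```python
# Minimal sketch: load a Tulu-family model with Hugging Face transformers and
# generate a reply. The model id below is a placeholder, not this repo's id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/tulu-2-13b"  # placeholder: substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Assumed Tulu chat format: user turn followed by an assistant turn marker.
prompt = "<|user|>\nWhat is the capital of France?\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```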