Commit 5280d35 by Muennighoff (verified) · 1 parent: 8da7e16

Update README.md

  1. README.md +3 -3
README.md CHANGED
@@ -21,7 +21,7 @@ co2_eq_emissions: 1
 
 # Use
 
-Install the `transformers` & `torch` libraries and run:
+Install the `pip install git+https://github.com/Muennighoff/transformers.git` & `torch` and run:
 
 ```python
 from transformers import OlmoeForCausalLM, AutoTokenizer
@@ -48,8 +48,8 @@ Here's how it works: imagine you have a bunch of toys, and you want to
 
 Branches:
 - `main`: Preference tuned via DPO model of https://hf.co/OLMoE/OLMoE-1B-7B-0924-SFT (`main` branch)
-- `load-balancing`: Ablation with load balancing loss during DPO starting from the `load-balancing` branch of https://hf.co/OLMoE/OLMoE-1B-7B-0924-SFT
-- `non-annealed`: Ablation starting from the `non-annealed` branch of https://hf.co/OLMoE/OLMoE-1B-7B-0924-SFT which is an SFT of the pretraining checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/OLMoE/OLMoE-1B-7B-0924)
+- `load-balancing`: Ablation with load balancing loss during DPO starting from the `load-balancing` branch of https://hf.co/allenai/OLMoE-1B-7B-0924-SFT
+- `non-annealed`: Ablation starting from the `non-annealed` branch of https://hf.co/allenai/OLMoE-1B-7B-0924-SFT which is an SFT of the pretraining checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/allenai/OLMoE-1B-7B-0924)
 - `kto`: Ablation using KTO instead of DPO. This branch is the checkpoint after 5,000 steps with the RMS optimizer. The other `kto*` branches correspond to the other checkpoints mentioned in the paper.
 
 # Citation
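
Reviewer note: the "Branches" list touched by this commit describes several checkpoints, and a short sketch may help show how one of them could be selected. This is a minimal sketch, not the README's own snippet. It assumes the repository id is `allenai/OLMoE-1B-7B-0924-Instruct` (the diff does not name the repo) and relies on the `revision` argument of `from_pretrained` to pick a branch. The names `REPO_ID`, `BRANCHES`, `load_kwargs`, and `load_branch` are hypothetical helpers introduced only for illustration.

```python
# Assumption: the repo id is not stated in this diff; adjust it if it differs.
REPO_ID = "allenai/OLMoE-1B-7B-0924-Instruct"

# Branch names copied from the README's "Branches" list.
BRANCHES = ("main", "load-balancing", "non-annealed", "kto")


def load_kwargs(branch: str = "main") -> dict:
    """Build the keyword arguments that select a branch in `from_pretrained`."""
    # The README mentions further `kto*` branches without naming them,
    # so any name starting with "kto" is accepted here.
    if branch not in BRANCHES and not branch.startswith("kto"):
        raise ValueError(f"unknown branch: {branch!r}")
    return {"revision": branch}


def load_branch(branch: str = "main"):
    """Download and return (model, tokenizer) for one branch of the repo."""
    # Imported lazily so `load_kwargs` can be used without transformers installed.
    from transformers import AutoTokenizer, OlmoeForCausalLM

    kwargs = load_kwargs(branch)
    model = OlmoeForCausalLM.from_pretrained(REPO_ID, **kwargs)
    tokenizer = AutoTokenizer.from_pretrained(REPO_ID, **kwargs)
    return model, tokenizer
```

Calling `load_branch("non-annealed")` would need network access and a `transformers` build that includes `OlmoeForCausalLM`, such as the fork named in the added install line.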