juliehunter
committed on
Update README.md
README.md CHANGED
@@ -39,6 +39,7 @@ Lucie-7B-Instruct-human-data is a fine-tuned version of [Lucie-7B](https://huggingface.co/OpenLLM-France/Lucie-7B), an open-so
Lucie-7B-Instruct-human-data is fine-tuned on human-produced instructions collected either from open annotation campaigns or by applying templates to extant datasets. The performance of Lucie-7B-Instruct-human-data falls below that of [Lucie-7B-Instruct](https://huggingface.co/OpenLLM-France/Lucie-7B-Instruct); its interest lies in showing what can be done to fine-tune LLMs to follow instructions without relying on third-party LLMs.

+While Lucie-7B-Instruct-human-data is trained on sequences of 4096 tokens, its base model, Lucie-7B, has a context size of 32K tokens. Based on needle-in-a-haystack evaluations, Lucie-7B-Instruct-human-data maintains the capacity of the base model to handle 32K-token context windows.
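For concreteness, here is a minimal sketch of how the inherited context size can be checked with Hugging Face `transformers`. The repo id follows this model card; the expected `32768` value and the generation settings are assumptions based on the 32K claim above, not code from the README:

```python
# Minimal sketch: load the model and confirm that the configured context
# window matches the 32K tokens inherited from Lucie-7B, even though
# fine-tuning used 4096-token sequences. Settings here are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "OpenLLM-France/Lucie-7B-Instruct-human-data"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Expected to report the base model's context size (assumption: 32768).
print(model.config.max_position_embeddings)

# Quick generation smoke test.
prompt = "Quelle est la capitale de la France ?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```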

## Training details

### Training data
@@ -66,11 +67,12 @@ And the following datasets developed for the Lucie instruct models:
### Training procedure

The model architecture and hyperparameters are the same as for [Lucie-7B](https://huggingface.co/OpenLLM-France/Lucie-7B) during the annealing phase, with the following exceptions:
-* context length: 4096
+* context length: 4096<sup>*</sup>
* batch size: 1024
* max learning rate: 3e-5
* min learning rate: 3e-6

+<sup>*</sup>As noted above, while Lucie-7B-Instruct-human-data is trained on sequences of 4096 tokens, it maintains the capacity of the base model, Lucie-7B, to handle context sizes of up to 32K tokens.
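As an aside on the learning rates listed above, the sketch below illustrates one plausible annealing schedule between them. The cosine shape and step count are assumptions for illustration; the scheduler actually used is not specified in this excerpt:

```python
import math

# Hypothetical schedule: cosine decay from the listed max learning rate
# (3e-5) to the listed min learning rate (3e-6). Only the two endpoint
# values come from the model card; shape and step count are assumptions.
MAX_LR, MIN_LR = 3e-5, 3e-6

def lr_at(step: int, total_steps: int) -> float:
    """Cosine interpolation from MAX_LR down to MIN_LR over total_steps."""
    progress = min(step / total_steps, 1.0)
    return MIN_LR + 0.5 * (MAX_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))

for step in (0, 250, 500, 750, 1000):
    print(f"step {step:>4}: lr = {lr_at(step, 1000):.2e}")
```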

## Testing the model
@@ -153,7 +155,7 @@ Lucie-7B LLM and its training dataset

## Acknowledgements

-This work was performed using HPC resources from GENCI–IDRIS (Grant 2024-GC011015444).
+This work was performed using HPC resources from GENCI–IDRIS (Grant 2024-GC011015444). We gratefully acknowledge support from GENCI and IDRIS and from Pierre-François Lavallée (IDRIS) and Stephane Requena (GENCI) in particular.

Lucie-7B was created by members of [LINAGORA](https://labs.linagora.com/) and the [OpenLLM-France](https://www.openllm-france.fr/) community, including in alphabetical order:
Olivier Gouvert (LINAGORA),
@@ -176,6 +178,8 @@ and
Olivier Ferret (CEA)
for their helpful input.

+Finally, we thank the entire OpenLLM-France community, whose members have helped in diverse ways.
+
## Contact