juliehunter committed
Commit d1bad59 · verified · 1 Parent(s): 4842e3b

Update README.md

Files changed (1): README.md +6 -2
README.md CHANGED
@@ -39,6 +39,7 @@ Lucie-7B-Instruct-human-data is a fine-tuned version of [Lucie-7B](), an open-so
 
 Lucie-7B-Instruct-human-data is fine-tuned on human-produced instructions collected either from open annotation campaigns or by applying templates to extant datasets. The performance of Lucie-7B-Instruct-human-data falls below that of [Lucie-7B-Instruct](https://huggingface.co/OpenLLM-France/Lucie-7B-Instruct); the interest of the model lies in showing what can be done to fine-tune LLMs to follow instructions without appealing to third-party LLMs.
 
+While Lucie-7B-Instruct-human-data is trained on sequences of 4096 tokens, its base model, Lucie-7B, has a context size of 32K tokens. Based on needle-in-a-haystack evaluations, Lucie-7B-Instruct-human-data maintains the capacity of the base model to handle 32K-token context windows.
 
 ## Training details
 ### Training data
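
The needle-in-a-haystack claim added in this hunk lends itself to a quick sanity check. Below is a minimal probe sketch, not the authors' evaluation setup: the repo ID is inferred from this model card, and the needle, filler text, and prompt wording are illustrative assumptions.

```python
# Hedged sketch of a needle-in-a-haystack probe: bury one distinctive
# sentence in ~20K tokens of filler and ask the model to retrieve it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenLLM-France/Lucie-7B-Instruct-human-data"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

needle = "The secret passphrase is 'heliotrope'."
filler = ["The sky was clear and the grass was green that day."] * 2000
haystack = filler[:1000] + [needle] + filler[1000:]

prompt = " ".join(haystack) + "\n\nQuestion: What is the secret passphrase?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(f"prompt length: {inputs.input_ids.shape[1]} tokens")  # well under 32K

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=16, do_sample=False)
print(tokenizer.decode(output[0, inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

If the 32K-context claim holds, the model should echo the passphrase even when the needle sits tens of thousands of tokens into the prompt.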
@@ -66,11 +67,12 @@ And the following datasets developed for the Lucie instruct models:
 ### Training procedure
 
 The model architecture and hyperparameters are the same as for [Lucie-7B](https://huggingface.co/OpenLLM-France/Lucie-7B) during the annealing phase, with the following exceptions:
-* context length: 4096
+* context length: 4096<sup>*</sup>
 * batch size: 1024
 * max learning rate: 3e-5
 * min learning rate: 3e-6
 
+<sup>*</sup>As noted above, while Lucie-7B-Instruct-human-data is trained on sequences of 4096 tokens, it maintains the capacity of the base model, Lucie-7B, to handle context sizes of up to 32K tokens.
 
 ## Testing the model
 
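
The hyperparameter list in this hunk gives only the endpoints of the learning-rate range. As a hedged illustration of how such a max/min pair is typically consumed, the sketch below assumes a cosine decay; only the four listed values come from the card, and the scheduler shape and step count are assumptions.

```python
# Hedged sketch: the listed fine-tuning hyperparameters, with an assumed
# cosine decay from the max to the min learning rate over the run.
import math

config = {
    "context_length": 4096,  # training sequence length (base model handles up to 32K)
    "batch_size": 1024,
    "max_lr": 3e-5,
    "min_lr": 3e-6,
}

def lr_at(step: int, total_steps: int) -> float:
    """Cosine decay from max_lr to min_lr (assumed shape, not from the card)."""
    progress = step / total_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return config["min_lr"] + (config["max_lr"] - config["min_lr"]) * cosine

for step in (0, 250, 500, 750, 1000):  # hypothetical 1000-step run
    print(f"step {step:4d}: lr = {lr_at(step, 1000):.2e}")
```

At step 0 this yields the listed max of 3e-5 and at the final step the listed min of 3e-6, with a smooth anneal in between.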
@@ -153,7 +155,7 @@ Lucie-7B LLM and its training dataset
 
 ## Acknowledgements
 
-This work was performed using HPC resources from GENCI–IDRIS (Grant 2024-GC011015444).
+This work was performed using HPC resources from GENCI–IDRIS (Grant 2024-GC011015444). We gratefully acknowledge support from GENCI and IDRIS, and from Pierre-François Lavallée (IDRIS) and Stephane Requena (GENCI) in particular.
 
 Lucie-7B was created by members of [LINAGORA](https://labs.linagora.com/) and the [OpenLLM-France](https://www.openllm-france.fr/) community, including in alphabetical order:
 Olivier Gouvert (LINAGORA),
@@ -176,6 +178,8 @@ and
 Olivier Ferret (CEA)
 for their helpful input.
 
+Finally, we thank the entire OpenLLM-France community, whose members have helped in diverse ways.
+
 ## Contact
 
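
For completeness, given the "## Testing the model" section that appears as context in the second hunk: a minimal sketch of how one might query the fine-tuned model. The repo ID is again inferred from the card, and the presence of a chat template is an assumption (instruct models on the Hub usually define one).

```python
# Hedged sketch: load the instruct model and run one greedy generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenLLM-France/Lucie-7B-Instruct-human-data"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# "What is the capital of France?" -- a French prompt, since Lucie targets French.
messages = [{"role": "user", "content": "Quelle est la capitale de la France ?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0, inputs.shape[1]:], skip_special_tokens=True))
```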
 