---
license: apache-2.0
datasets:
- yahma/alpaca-cleaned
- cmh/alpaca_data_cleaned_fr_52k
- Magpie-Align/Magpie-Gemma2-Pro-200K-Filtered
- allenai/WildChat-1M
language:
- fr
- en
base_model:
- OpenLLM-France/Lucie-7B
pipeline_tag: text-generation
---

## Model Description

Lucie-7B-Instruct is a fine-tuned version of [Lucie-7B](https://huggingface.co/OpenLLM-France/Lucie-7B), an open-source, multilingual causal language model created by OpenLLM-France.

Lucie-7B-Instruct is fine-tuned on synthetic instructions produced by ChatGPT and Gemma, together with a small set of customized prompts about OpenLLM and Lucie.

## Training details

### Training data

Lucie-7B-Instruct is trained on the following datasets:
* [Alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) (English; 51604 samples)
* [Alpaca-cleaned-fr](https://huggingface.co/datasets/cmh/alpaca_data_cleaned_fr_52k) (French; 51655 samples)
* [Magpie-Gemma](https://huggingface.co/datasets/Magpie-Align/Magpie-Gemma2-Pro-200K-Filtered) (English; 195167 samples)
* [Wildchat](https://huggingface.co/datasets/allenai/WildChat-1M) (French subset; 26436 samples)
* Hard-coded prompts concerning OpenLLM and Lucie (based on [allenai/tulu-3-hard-coded-10x](https://huggingface.co/datasets/allenai/tulu-3-hard-coded-10x)):
  * French: openllm_french.jsonl (24x10 samples)
  * English: openllm_english.jsonl (24x10 samples)

### Preprocessing

* Filtering by language: Magpie-Gemma and Wildchat were filtered to keep only English and French samples, respectively.
* Filtering by keyword: Examples from the four synthetic datasets were removed if the assistant response contained a keyword from the list [filter_strings](https://github.com/OpenLLM-France/Lucie-Training/blob/98792a1a9015dcf613ff951b1ce6145ca8ecb174/tokenization/data.py#L2012). This filter is designed to remove examples in which the assistant is presented as a model other than Lucie (e.g., ChatGPT, Gemma, Llama, ...). A sketch of this filter appears below.

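For illustration only, here is a minimal sketch of such a keyword filter; it is not the actual training code. The message schema and the abbreviated `FILTER_STRINGS` list are assumptions; the real keyword list is the one linked above.

```python
# Illustrative sketch of the keyword filter described above (not the actual
# training code). FILTER_STRINGS abbreviates the full list linked above;
# the message schema ("messages", "role", "content") is assumed.
FILTER_STRINGS = ["ChatGPT", "Gemma", "Llama"]  # abbreviated for illustration

def keep_example(example: dict) -> bool:
    """Return False if any assistant turn mentions a filtered model name."""
    for turn in example["messages"]:
        if turn["role"] != "assistant":
            continue
        if any(s.lower() in turn["content"].lower() for s in FILTER_STRINGS):
            return False
    return True

examples = [
    {"messages": [{"role": "assistant", "content": "I am ChatGPT."}]},
    {"messages": [{"role": "assistant", "content": "Je suis Lucie."}]},
]
print([keep_example(ex) for ex in examples])  # [False, True]
```
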
### Training procedure

The model architecture and hyperparameters are the same as for [Lucie-7B](https://huggingface.co/OpenLLM-France/Lucie-7B) during the annealing phase, with the following exceptions:
* context length: 4096
* batch size: 1024
* max learning rate: 3e-5
* min learning rate: 3e-6

## Testing the model

### Test in Python

* [test_transformers_gguf.py](test_transformers_gguf.py): Test the GGUF model with the `transformers` package (warning: loading the model takes a long time). A minimal sketch of the approach is shown below.

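As a rough, hypothetical sketch of what such a test looks like (the script above remains the reference), `transformers` can load a GGUF file through its `gguf_file` argument, which requires the `gguf` package; the file name here is the quantized model from this repository:

```python
# Hypothetical sketch of testing the GGUF model with transformers (requires
# the `gguf` package); see test_transformers_gguf.py for the actual script.
# Loading and dequantizing the model this way is slow.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenLLM-France/Lucie-7B-Instruct-v1"
gguf_file = "Lucie-7B-q4_k_m.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)

inputs = tokenizer("Hello Lucie", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
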
### Test with Ollama

* Download and install [Ollama](https://ollama.com/download)
* Download the [GGUF model](https://huggingface.co/OpenLLM-France/Lucie-7B-Instruct-v1/resolve/main/Lucie-7B-q4_k_m.gguf)
* Copy the [`Modelfile`](Modelfile), adapting the path to the GGUF file if necessary (the line starting with `FROM`).
* Run in a shell:
  * `ollama create -f Modelfile Lucie`
  * `ollama run Lucie`
* Once the `>>>` prompt appears, type your prompt(s) and press Enter.
* Optionally, restart a conversation by typing `/clear`.
* End the session by typing `/bye`.

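Once the model has been created, you can also query the local Ollama server programmatically. Below is a minimal sketch using Ollama's REST API on its default port, assuming the model was named `Lucie` as above:

```python
# Minimal sketch: query the local Ollama server through its REST API.
# Assumes Ollama is running on its default port (11434) and the model
# was created as "Lucie" with the `ollama create` command above.
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "Lucie", "prompt": "Hello Lucie", "stream": False},
)
print(response.json()["response"])
```
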
Useful for debugging:
* [How to print input requests and output responses in Ollama server?](https://stackoverflow.com/a/78831840)
* [Documentation on Modelfile](https://github.com/ollama/ollama/blob/main/docs/modelfile.md#parameter)
* Examples: [Ollama model library](https://github.com/ollama/ollama#model-library)
* Llama 3 example: https://ollama.com/library/llama3.1
* Add a GUI: https://docs.openwebui.com/

### Test with vLLM

#### 1. Run vLLM Docker Container

Use the following command to deploy the model, replacing `INSERT_YOUR_HF_TOKEN` with your Hugging Face Hub token.

```bash
docker run --runtime nvidia --gpus=all \
    --env "HUGGING_FACE_HUB_TOKEN=INSERT_YOUR_HF_TOKEN" \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:latest \
    --model OpenLLM-France/Lucie-7B-Instruct-v1
```

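Optionally, you can verify that the server is up before moving on. A quick, hypothetical check against the container's OpenAI-compatible endpoint:

```python
# Optional sanity check (assumes the container from step 1 is serving on
# localhost:8000): list the models exposed by the vLLM OpenAI endpoint.
import requests

print(requests.get("http://localhost:8000/v1/models").json())
```
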
#### 2. Test using OpenAI Client in Python

To test the deployed model, use the OpenAI Python client as follows:

```python
from openai import OpenAI

# Initialize the client
client = OpenAI(base_url='http://localhost:8000/v1', api_key='empty')

# Define the input content
content = "Hello Lucie"

# Generate a response
chat_response = client.chat.completions.create(
    model="OpenLLM-France/Lucie-7B-Instruct-v1",
    messages=[
        {"role": "user", "content": content}
    ],
)
print(chat_response.choices[0].message.content)
```