Sandiago21 commited on
Commit
047dd6d
·
1 Parent(s): d7ef239

Create initial README.md

Browse files
Files changed (1) hide show
  1. README.md +255 -0
README.md ADDED
@@ -0,0 +1,255 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: other
3
+ language:
4
+ - en
5
+ library_name: transformers
6
+ pipeline_tag: text-generation
7
+ tags:
8
+ - llama
9
+ - decapoda-research-7b-hf
10
+ - prompt answering
11
+ - peft
12
+ ---
13
+
14
+ ## Model Card for Model ID
15
+
16
+ This repository contains a LLaMA-7B further fine-tuned model on conversations and question answering prompts.
17
+
18
+ ⚠️ **I used falcon-7b (https://huggingface.co/tiiuae/falcon-7b) as a base model, so this model is for Research purpose only (See the [license](https://huggingface.co/tiiuae/falcon-7b/blob/main/LICENSE))**
19
+
20
+
21
+ ## Model Details
22
+
23
+ Anyone can use (ask prompts) and play with the model using the pre-existing Jupyter Notebook in the **noteboooks** folder. The Jupyter Notebook contains example code to load the model and ask prompts to it as well as example prompts to get you started.
24
+
25
+ ### Model Description
26
+
27
+ The tiiuae/falcon-7b model was finetuned on conversations and question answering prompts.
28
+
29
+ **Developed by:** [More Information Needed]
30
+
31
+ **Shared by:** [More Information Needed]
32
+
33
+ **Model type:** Causal LM
34
+
35
+ **Language(s) (NLP):** English, multilingual
36
+
37
+ **License:** Research
38
+
39
+ **Finetuned from model:** tiiuae/falcon-7b
40
+
41
+
42
+ ## Model Sources [optional]
43
+
44
+ **Repository:** [More Information Needed]
45
+ **Paper:** [More Information Needed]
46
+ **Demo:** [More Information Needed]
47
+
48
+ ## Uses
49
+
50
+ The model can be used for prompt answering
51
+
52
+
53
+ ### Direct Use
54
+
55
+ The model can be used for prompt answering
56
+
57
+
58
+ ### Downstream Use
59
+
60
+ Generating text and prompt answering
61
+
62
+
63
+ ## Recommendations
64
+
65
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
66
+
67
+
68
+ # Usage
69
+
70
+ ## Creating prompt
71
+
72
+ The model was trained on the following kind of prompt:
73
+
74
+ ```python
75
+ def generate_prompt(instruction: str, input_ctxt: str = None) -> str:
76
+ if input_ctxt:
77
+ return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
78
+
79
+ ### Instruction:
80
+ {instruction}
81
+
82
+ ### Input:
83
+ {input_ctxt}
84
+
85
+ ### Response:"""
86
+ else:
87
+ return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.
88
+
89
+ ### Instruction:
90
+ {instruction}
91
+
92
+ ### Response:"""
93
+ ```
94
+
95
+ ## How to Get Started with the Model
96
+
97
+ Use the code below to get started with the model.
98
+
99
+ 1. You can git clone the repo, which contains also the artifacts for the base model for simplicity and completeness, and run the following code snippet to load the mode:
100
+
101
+ ```python
102
+ import torch
103
+ from peft import PeftConfig, PeftModel
104
+ from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM
105
+
106
+ MODEL_NAME = "Sandiago21/llama-7b-hf-prompt-answering"
107
+
108
+ config = PeftConfig.from_pretrained(MODEL_NAME)
109
+
110
+ model = LlamaForCausalLM.from_pretrained(
111
+ config.base_model_name_or_path,
112
+ load_in_8bit=True,
113
+ torch_dtype=torch.float16,
114
+ device_map="auto",
115
+ )
116
+
117
+ tokenizer = LlamaTokenizer.from_pretrained(MODEL_NAME)
118
+
119
+ model = PeftModel.from_pretrained(model, MODEL_NAME)
120
+
121
+ generation_config = GenerationConfig(
122
+ temperature=0.2,
123
+ top_p=0.75,
124
+ top_k=40,
125
+ num_beams=4,
126
+ max_new_tokens=32,
127
+ )
128
+
129
+ model.eval()
130
+ if torch.__version__ >= "2":
131
+ model = torch.compile(model)
132
+ ```
133
+
134
+ ### Example of Usage
135
+ ```python
136
+ instruction = "What is the capital city of Greece and with which countries does Greece border?"
137
+ input_ctxt = None # For some tasks, you can provide an input context to help the model generate a better response.
138
+
139
+ prompt = generate_prompt(instruction, input_ctxt)
140
+ input_ids = tokenizer(prompt, return_tensors="pt").input_ids
141
+ input_ids = input_ids.to(model.device)
142
+
143
+ with torch.no_grad():
144
+ outputs = model.generate(
145
+ input_ids=input_ids,
146
+ generation_config=generation_config,
147
+ return_dict_in_generate=True,
148
+ output_scores=True,
149
+ )
150
+
151
+ response = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True)
152
+ print(response)
153
+
154
+ >>> The capital city of Greece is Athens and it borders Turkey, Bulgaria, Macedonia, Albania, and the Aegean Sea.
155
+ ```
156
+
157
+ 2. You can also directly call the model from HuggingFace using the following code snippet:
158
+
159
+ ```python
160
+ import torch
161
+ from peft import PeftConfig, PeftModel
162
+ from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM
163
+
164
+ MODEL_NAME = "Sandiago21/llama-7b-hf-prompt-answering"
165
+ BASE_MODEL = "tiiuae/falcon-7b"
166
+
167
+ config = PeftConfig.from_pretrained(MODEL_NAME)
168
+
169
+ model = LlamaForCausalLM.from_pretrained(
170
+ BASE_MODEL,
171
+ load_in_8bit=True,
172
+ torch_dtype=torch.float16,
173
+ device_map="auto",
174
+ )
175
+
176
+ tokenizer = LlamaTokenizer.from_pretrained(MODEL_NAME)
177
+
178
+ model = PeftModel.from_pretrained(model, MODEL_NAME)
179
+
180
+ generation_config = GenerationConfig(
181
+ temperature=0.2,
182
+ top_p=0.75,
183
+ top_k=40,
184
+ num_beams=4,
185
+ max_new_tokens=32,
186
+ )
187
+
188
+ model.eval()
189
+ if torch.__version__ >= "2":
190
+ model = torch.compile(model)
191
+ ```
192
+
193
+ ### Example of Usage
194
+
195
+ ```python
196
+ instruction = "What is the capital city of Greece and with which countries does Greece border?"
197
+ input_ctxt = None # For some tasks, you can provide an input context to help the model generate a better response.
198
+
199
+ prompt = generate_prompt(instruction, input_ctxt)
200
+ input_ids = tokenizer(prompt, return_tensors="pt").input_ids
201
+ input_ids = input_ids.to(model.device)
202
+
203
+ with torch.no_grad():
204
+ outputs = model.generate(
205
+ input_ids=input_ids,
206
+ generation_config=generation_config,
207
+ return_dict_in_generate=True,
208
+ output_scores=True,
209
+ )
210
+
211
+ response = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True)
212
+ print(response)
213
+
214
+ >>> The capital city of Greece is Athens and it borders Turkey, Bulgaria, Macedonia, Albania, and the Aegean Sea.
215
+ ```
216
+
217
+ ## Training Details
218
+
219
+ ## Training procedure
220
+
221
+ ### Training hyperparameters
222
+
223
+ The following hyperparameters were used during training:
224
+ - learning_rate: 2e-05
225
+ - train_batch_size: 4
226
+ - eval_batch_size: 8
227
+ - seed: 42
228
+ - gradient_accumulation_steps: 2
229
+ - total_train_batch_size: 8
230
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
231
+ - lr_scheduler_type: linear
232
+ - lr_scheduler_warmup_steps: 50
233
+ - num_epochs: 2
234
+ - mixed_precision_training: Native AMP
235
+
236
+ ### Framework versions
237
+
238
+ - Transformers 4.28.1
239
+ - Pytorch 2.0.0+cu117
240
+ - Datasets 2.12.0
241
+ - Tokenizers 0.12.1
242
+
243
+ ### Training Data
244
+
245
+ The tiiuae/falcon-7b was finetuned on conversations and question answering data
246
+
247
+
248
+ ### Training Procedure
249
+
250
+ The tiiuae/falcon-7b model was further trained and finetuned on question answering and prompts data for 1 epoch (approximately 10 hours of training on a single GPU)
251
+
252
+
253
+ ## Model Architecture and Objective
254
+
255
+ The model is based on tiiuae/falcon-7b model and finetuned adapters on top of the main model on conversations and question answering data.