---
datasets:
- dominguesm/alpaca-data-pt-br
library_name: adapter-transformers
pipeline_tag: text-generation
language:
- pt
- en
thumbnail: https://huggingface.co/Bruno/Harpia-7b-guanacoLora/blob/main/har.png
---

<div style="text-align:center;width:250px;height:250px;">
<img src="https://huggingface.co/Bruno/Harpia-7b-guanacoLora/blob/main/har.png" alt="Harpia logo">
</div>

# Harpia

## Adapter Description

This adapter was created with the [PEFT](https://github.com/huggingface/peft) library. It fine-tunes the **Falcon-7b** base model on the [dominguesm/alpaca-data-pt-br](https://huggingface.co/datasets/dominguesm/alpaca-data-pt-br) dataset using the **QLoRA** method.
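
For reference, the sketch below shows how an adapter of this kind can be trained: the base model is loaded in 4-bit NF4, LoRA weights are attached, and only the adapter parameters are updated. The LoRA hyperparameters, target modules, dataset field names, prompt format, and training arguments here are illustrative assumptions, not the exact configuration used for Harpia.

```py
# Hypothetical QLoRA fine-tuning sketch; hyperparameters and field names are assumptions.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model = "tiiuae/falcon-7b"

# Load the base model quantized to 4-bit (NF4), as in QLoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, trust_remote_code=True, device_map={"": 0}
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters; "query_key_value" is the usual target module for Falcon (assumption)
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["query_key_value"], task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Tokenize the Portuguese Alpaca dataset (assumed instruction/output columns)
dataset = load_dataset("dominguesm/alpaca-data-pt-br", split="train")

def tokenize(example):
    # Simplified Alpaca-style prompt; the actual training prompt may differ
    text = (
        f"### Instrução:\n{example['instruction']}\n\n"
        f"### Resposta:\n{example['output']}{tokenizer.eos_token}"
    )
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    train_dataset=tokenized,
    args=TrainingArguments(
        output_dir="harpia-qlora", per_device_train_batch_size=4,
        gradient_accumulation_steps=4, num_train_epochs=1,
        learning_rate=2e-4, fp16=True, logging_steps=50,
    ),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("harpia-qlora-adapter")  # saves only the adapter weights
```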

## Model description

The base model is [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b).

## Intended uses & limitations

TBA

## Training and evaluation data

TBA

### Training results

### How to use

```py
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

peft_model_id = "Bruno/Harpia-7b-guanacoLora"

# Load the adapter configuration to find the base model it was trained on
config = PeftConfig.from_pretrained(peft_model_id)

# 4-bit NF4 quantization, matching the QLoRA setup used for fine-tuning
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map={"": 0},
)

# Attach the LoRA adapter weights on top of the quantized base model
model = PeftModel.from_pretrained(model, peft_model_id)

# Alpaca-style prompt templates in Portuguese, with and without an extra input field
prompt_input = "Abaixo está uma declaração que descreve uma tarefa, juntamente com uma entrada que fornece mais contexto. Escreva uma resposta que conclua corretamente a solicitação.\n\n### Instrução:\n{instruction}\n\n### Entrada:\n{input}\n\n### Resposta:\n"
prompt_no_input = "Abaixo está uma instrução que descreve uma tarefa. Escreva uma resposta que conclua corretamente a solicitação.\n\n### Instrução:\n{instruction}\n\n### Resposta:\n"

def create_prompt(instruction, input=None):
    if input:
        return prompt_input.format(instruction=instruction, input=input)
    return prompt_no_input.format(instruction=instruction)

def generate(
    instruction,
    input=None,
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
    repetition_penalty=1.5,
    max_length=512,
):
    prompt = create_prompt(instruction, input)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=max_length, padding="longest")
    input_ids = inputs["input_ids"].to("cuda")
    attention_mask = inputs["attention_mask"].to("cuda")

    generation_output = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_length=max_length,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        num_beams=num_beams,
        repetition_penalty=repetition_penalty,
        length_penalty=0.8,
        early_stopping=True,
        output_scores=True,
        return_dict_in_generate=True,
    )

    output = tokenizer.decode(generation_output.sequences[0], skip_special_tokens=True)
    return output.split("### Resposta:")[1]

instruction = "como faço um bolo de cenoura?"
print("instruction:", instruction)
print("Resposta:", generate(instruction))
```

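The `generate` helper also accepts an optional `input` argument, which routes the request through the `prompt_input` template above. A small sketch of that path follows; the example texts are purely illustrative:

```py
# Example with additional context passed via the "input" field (illustrative text)
instruction = "Resuma o texto a seguir em uma frase."
contexto = (
    "A harpia é uma das maiores aves de rapina do mundo e habita "
    "florestas tropicais da América do Sul."
)
print("Resposta:", generate(instruction, input=contexto))
```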

### Framework versions

- Transformers 4.30.0.dev0
- PyTorch 2.0.1+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3