afrideva committed
Commit 556664d · 1 Parent(s): 732ec04

Upload README.md with huggingface_hub

  1. README.md +193 -0
README.md ADDED
@@ -0,0 +1,193 @@
+ ---
+ base_model: freecs/phine-2-v0
+ datasets:
+ - vicgalle/alpaca-gpt4
+ inference: false
+ license: unknown
+ model_creator: freecs
+ model_name: phine-2-v0
+ pipeline_tag: text-generation
+ quantized_by: afrideva
+ tags:
+ - gguf
+ - ggml
+ - quantized
+ - q2_k
+ - q3_k_m
+ - q4_k_m
+ - q5_k_m
+ - q6_k
+ - q8_0
+ ---
+ # freecs/phine-2-v0-GGUF
+
+ Quantized GGUF model files for [phine-2-v0](https://huggingface.co/freecs/phine-2-v0) from [freecs](https://huggingface.co/freecs).
+
+
+ | Name | Quant method | Size |
+ | ---- | ---- | ---- |
+ | [phine-2-v0.fp16.gguf](https://huggingface.co/afrideva/phine-2-v0-GGUF/resolve/main/phine-2-v0.fp16.gguf) | fp16 | 5.56 GB |
+ | [phine-2-v0.q2_k.gguf](https://huggingface.co/afrideva/phine-2-v0-GGUF/resolve/main/phine-2-v0.q2_k.gguf) | q2_k | 1.17 GB |
+ | [phine-2-v0.q3_k_m.gguf](https://huggingface.co/afrideva/phine-2-v0-GGUF/resolve/main/phine-2-v0.q3_k_m.gguf) | q3_k_m | 1.48 GB |
+ | [phine-2-v0.q4_k_m.gguf](https://huggingface.co/afrideva/phine-2-v0-GGUF/resolve/main/phine-2-v0.q4_k_m.gguf) | q4_k_m | 1.79 GB |
+ | [phine-2-v0.q5_k_m.gguf](https://huggingface.co/afrideva/phine-2-v0-GGUF/resolve/main/phine-2-v0.q5_k_m.gguf) | q5_k_m | 2.07 GB |
+ | [phine-2-v0.q6_k.gguf](https://huggingface.co/afrideva/phine-2-v0-GGUF/resolve/main/phine-2-v0.q6_k.gguf) | q6_k | 2.29 GB |
+ | [phine-2-v0.q8_0.gguf](https://huggingface.co/afrideva/phine-2-v0-GGUF/resolve/main/phine-2-v0.q8_0.gguf) | q8_0 | 2.96 GB |
+
+
+
+ ## Original Model Card:
+ ---
+ # Model Card: Phine-2-v0
+
+ ## Overview
+
+ - **Model Name:** Phine-2
+ - **Base Model:** Phi-2 (Microsoft model)
+ - **Created By:** [GR](https://twitter.com/gr_username)
+ - **Donations Link:** [Click Me](https://www.buymeacoffee.com/gr.0)
+
+ ## Code Usage
+
+ To try Phine, use the following Python code snippet:
+
+ ```python
+ #######################
+ '''
+ Name: Phine Inference
+ License: MIT
+ '''
+ #######################
+
+ ##### Dependencies
+
+ """ IMPORTANT: Uncomment the following line if you are in a Colab/Notebook environment """
+
+ #!pip install gradio einops accelerate bitsandbytes transformers
+
+ #####
+
+ import gradio as gr
+ import torch
+ import transformers
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+
+ def cut_text_after_last_token(text, token):
+     """Return the text after the last occurrence of `token` (the whole text if it is absent)."""
+     last_occurrence = text.rfind(token)
+     if last_occurrence == -1:
+         # Fall back to the full text so the caller never receives None.
+         return text.strip()
+     return text[last_occurrence + len(token):].strip()
+
+
+ class _SentinelTokenStoppingCriteria(transformers.StoppingCriteria):
+     """Stop generation once the sentinel token sequence appears in the newly generated tokens."""
+
+     def __init__(self, sentinel_token_ids: torch.LongTensor, starting_idx: int):
+         super().__init__()
+         self.sentinel_token_ids = sentinel_token_ids
+         self.starting_idx = starting_idx
+
+     def __call__(self, input_ids: torch.LongTensor, _scores: torch.FloatTensor, **kwargs) -> bool:
+         for sample in input_ids:
+             # Only inspect tokens generated after the prompt.
+             trimmed_sample = sample[self.starting_idx:]
+
+             if trimmed_sample.shape[-1] < self.sentinel_token_ids.shape[-1]:
+                 continue
+
+             # Slide a window of the sentinel's length over the generated tokens.
+             for window in trimmed_sample.unfold(0, self.sentinel_token_ids.shape[-1], 1):
+                 if torch.all(torch.eq(self.sentinel_token_ids, window)):
+                     return True
+         return False
+
+
+ model_path = 'freecs/phine-2-v0'
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+
+ model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True, load_in_4bit=False, torch_dtype=torch.float16).to(device) #remove .to() if load_in_4/8bit = True
+
+ sys_message = "You are an AI assistant named Phine developed by FreeCS.org. You are polite and smart." #System Message
+
+
+ def phine(message, history, temperature, top_p, top_k, repetition_penalty):
+     # Rebuild the conversation so far; each history entry is a (user, assistant) pair.
+     context = ""
+     for user_msg, bot_msg in history or []:
+         context += f"\n<|prompt|>{user_msg}\n<|response|>{bot_msg}"
+
+     prompt = f"\n<|system|>{sys_message}{context}\n<|prompt|>{message}<|endoftext|>\n<|response|>"
+     tokenized = tokenizer(prompt, return_tensors="pt").to(device)
+
+     # Stop as soon as the model emits <|endoftext|> after the prompt.
+     stopping_criteria_list = transformers.StoppingCriteriaList([
+         _SentinelTokenStoppingCriteria(
+             sentinel_token_ids=tokenizer(
+                 "<|endoftext|>",
+                 add_special_tokens=False,
+                 return_tensors="pt",
+             ).input_ids.to(device),
+             starting_idx=tokenized.input_ids.shape[-1])
+     ])
+
+     output_ids = model.generate(**tokenized,
+                                 stopping_criteria=stopping_criteria_list,
+                                 do_sample=True,
+                                 max_length=2048,
+                                 temperature=temperature,
+                                 top_p=top_p,
+                                 top_k=int(top_k),  # the slider may return a float
+                                 repetition_penalty=repetition_penalty)
+
+     completion = tokenizer.decode(output_ids[0], skip_special_tokens=False)
+     res = cut_text_after_last_token(completion, "<|response|>")
+     return res.replace('<|endoftext|>', '')
+
+
+ demo = gr.ChatInterface(phine,
+     additional_inputs=[
+         gr.Slider(0.1, 2.0, label="temperature", value=0.5),
+         gr.Slider(0.1, 1.0, label="Top P", value=0.9),  # top_p must not exceed 1.0
+         gr.Slider(1, 500, step=1, label="Top K", value=50),
+         gr.Slider(0.1, 2.0, label="Repetition Penalty", value=1.15)
+     ]
+ )
+
+ if __name__ == "__main__":
+     demo.queue().launch(share=True, debug=True) #If debug=True causes problems you can set it to False
+ ```
+
+ ## Contact
+
+ For inquiries, collaboration opportunities, or additional information, reach out to me on Twitter: [gr](https://twitter.com/gr_username).
+
+ ## Disclaimer
+
+ As of now, I have not applied Reinforcement Learning from Human Feedback (RLHF). Due to this, the model may generate unexpected or potentially unethical outputs.
+
+ ---
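
The snippet in the original model card loads the unquantized `freecs/phine-2-v0` checkpoint through `transformers`; the GGUF files listed in the table need a GGUF runtime instead. The following is a minimal sketch, not the uploader's documented method, of how one might pick a file from the table and prompt it with the same `<|system|>` / `<|prompt|>` / `<|response|>` layout the card's `phine()` function builds. The helper names (`pick_quant`, `gguf_filename`, `build_prompt`, `generate`) are hypothetical, using `llama-cpp-python` as the runtime is an assumption, and file size is only a rough lower bound on memory use. Whether a given `llama-cpp-python` build can load this architecture is also an assumption to verify locally.

```python
"""Sketch: choose a quantized GGUF file and build a Phine-style prompt for it."""

# File sizes in GB, copied from the table in the README.
QUANT_SIZES_GB = {
    "fp16": 5.56,
    "q2_k": 1.17,
    "q3_k_m": 1.48,
    "q4_k_m": 1.79,
    "q5_k_m": 2.07,
    "q6_k": 2.29,
    "q8_0": 2.96,
}

REPO_ID = "afrideva/phine-2-v0-GGUF"  # repo named in the table's download links
DEFAULT_SYSTEM = ("You are an AI assistant named Phine developed by FreeCS.org. "
                  "You are polite and smart.")


def pick_quant(budget_gb):
    """Return the quant with the largest file that fits in budget_gb, or None."""
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= budget_gb}
    return max(fitting, key=fitting.get) if fitting else None


def gguf_filename(quant):
    """File naming pattern used in the table: phine-2-v0.<quant>.gguf."""
    return f"phine-2-v0.{quant}.gguf"


def build_prompt(message, history=(), system=DEFAULT_SYSTEM):
    """Mirror the prompt layout of phine() in the original card.

    history is assumed to be an iterable of (user, assistant) pairs.
    """
    context = ""
    for user_msg, bot_msg in history:
        context += f"\n<|prompt|>{user_msg}\n<|response|>{bot_msg}"
    return f"\n<|system|>{system}{context}\n<|prompt|>{message}<|endoftext|>\n<|response|>"


def generate(message, budget_gb=2.0, max_tokens=256):
    """Hypothetical helper: download one quant and run it with llama-cpp-python."""
    # Imported lazily so the pure helpers above work without these packages.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    quant = pick_quant(budget_gb)
    if quant is None:
        raise ValueError(f"no quantized file fits in {budget_gb} GB")
    path = hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))
    llm = Llama(model_path=path, n_ctx=2048)
    out = llm(build_prompt(message), max_tokens=max_tokens, stop=["<|endoftext|>"])
    return out["choices"][0]["text"].strip()
```

With a 2 GB budget, `pick_quant(2.0)` selects `q4_k_m` (1.79 GB), since `q5_k_m` at 2.07 GB no longer fits. Calling `generate("Hello")` would then download that file and run it, provided `huggingface_hub` and `llama-cpp-python` are installed.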