Commit 763a9f3 · Parent: d9d60db
wenhuach committed: first commit

Signed-off-by: wenhuach <[email protected]>
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,237 @@
- ---
- license: apache-2.0
- ---
+ ## Model Details
+
+ This AWQ-format model is an INT4 model (group_size 128, symmetric quantization) of [Qwen/QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview), generated by [intel/auto-round](https://github.com/intel/auto-round). We excluded 3 layers from quantization due to overflow issues on some INT4 backends.
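To make the scheme concrete, here is a minimal editorial sketch (not part of the original card) of symmetric per-group INT4 round-to-nearest quantization, matching the `group_size 128`, `"sym": true`, `"zero_point": false` settings in quantization_config.json. auto-round additionally tunes rounding and min/max via signed gradient descent, and the scale convention (`max/7` here) may differ from the shipped kernels; plain round-to-nearest is shown only for illustration.

```python
import torch

def quantize_sym_int4(w: torch.Tensor, group_size: int = 128):
    """Symmetric per-group INT4 quantization (round-to-nearest baseline)."""
    rows, cols = w.shape
    g = w.reshape(rows, cols // group_size, group_size)
    # One scale per group of 128 input channels, no zero point.
    # max/7 is one common symmetric convention; kernels may use max/8
    # with the full [-8, 7] range instead.
    scale = g.abs().amax(dim=-1, keepdim=True).clamp(min=1e-5) / 7.0
    q = torch.clamp(torch.round(g / scale), -8, 7)  # INT4 integer range
    return q.to(torch.int8), scale

w = torch.randn(4, 256, dtype=torch.float16)        # toy weight matrix
q, scale = quantize_sym_int4(w)
w_hat = (q.to(torch.float16) * scale).reshape(4, 256)  # dequantize
print((w - w_hat).abs().max())                      # worst-case quantization error
```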
+
+ ## How To Use
+
+ ### INT4 Inference (CPU/HPU/CUDA)
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "OPEA/QwQ-32B-Preview-int4-sym-mixed-awq-inc"
+
+ # Load the quantized model; device_map="auto" places it on the available device(s).
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ prompt = "How many r in strawberry."
+ messages = [
+     {"role": "system", "content": "You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step."},
+     {"role": "user", "content": prompt}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=512,
+     do_sample=False  ## greedy decoding for reproducibility; set do_sample=True to follow the official sampling settings
+ )
+ # Strip the prompt tokens so only the newly generated text is decoded.
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ print(response)
+
+
+ prompt = "9.11和9.8哪个数字大"
+ ##INT4:
+ """9.11和9.8,哪个数字大呢?让我想想。首先,这两个数字都是小数,对吧?9.11和9.8。我需要比较它们的大小。
+
+ 首先,我看看整数部分。两个数字的整数部分都是9,所以整数部分相等。那我就需要看小数部分。
+
+ 小数部分,9.11是0.11,而9.8是0.8。现在比较0.11和0.8,哪个更大。
+
+ 0.8看起来比0.11大,因为8比1大。但是,为了确信,我可以把它们看成分数。
+
+ 0.8是8/10,而0.11是11/100。为了比较它们,我可以把它们转换成相同的分母。
+
+ 10和100的最小公分母是100。所以,8/10等于80/100,而11/100 remains 11/100。
+
+ 现在,80/100大于11/100,所以0.8大于0.11。
+
+ 因此,9.8大于9.11。
+
+ 不过,再想想,也许我应该直接比较小数。9.11是9加上0.11,9.8是9加上0.8。
+
+ 很明显,0.8大于0.11,所以9.8大于9.11。
+
+ 或者,我可以把它们看成货币,比如美元。9.11美元和9.8美元,哪个更多?
+
+ 9.8美元显然比9.11美元多。
+
+ 再或者,想想它们在数轴上的位置。9.11在9和10之间,靠近9.1,而9.8在9和10之间,靠近9.8。
+
+ 显然,9.8在数轴上更靠右,所以更大。
+
+ 另外,我也可以把它们转换成分数来比较。
+
+ 9.11是9又11/100,9.8是9又8/10,which is 9又4/5.
+
+ 现在,比较11/100和4/5.
+
+ 11/100 is 0.11, and 4/5 is 0.8.
+
+ Again, 0.8 is larger than 0.1"""
+
+ prompt = "How many r in strawberry."
+ ##INT4:
+ """Let's see. The word is "strawberry." I need to find out how many times the letter "r" appears in it.
+
+ First, I'll spell out the word to make sure I don't miss any letters. S-T-R-A-W-B-E-R-R-Y. Okay, that's all the letters in "strawberry."
+
+ Now, I need to count how many "r"s are there. Let's go through the word one letter at a time.
+
+ Starting with the first letter: S - not an "r".
+
+ Second letter: T - not an "r".
+
+ Third letter: R - that's one "r".
+
+ Fourth letter: A - not an "r".
+
+ Fifth letter: W - not an "r".
+
+ Sixth letter: B - not an "r".
+
+ Seventh letter: E - not an "r".
+
+ Eighth letter: R - that's another "r".
+
+ Ninth letter: R - that's another "r".
+
+ Tenth letter: Y - not an "r".
+
+ So, I've found three "r"s in "strawberry."
+
+ Wait a minute, let me double-check. Sometimes I might miscount, especially if there are multiple "r"s close together.
+
+ Let's spell it again: S-T-R-A-W-B-E-R-R-Y.
+
+ First "r" is the third letter.
+
+ Second "r" is the eighth letter.
+
+ Third "r" is the ninth letter.
+
+ Yes, that's three "r"s in total.
+
+ I think that's correct.
+
+ **Final Answer**
+
+ \[ \boxed{3} \]"""
+
+ ##BF16:
+ """Let's see. The word is "strawberry." I need to find out how many times the letter "r" appears in it. Okay, so I'll look at each letter in the word one by one.
+
+ First letter: s - that's not r.
+
+ Second letter: t - no, not r.
+
+ Third letter: r - okay, that's one r.
+
+ Fourth letter: a - not r.
+
+ Fifth letter: w - not r.
+
+ Sixth letter: b - no.
+
+ Seventh letter: e - not r.
+
+ Eighth letter: r - another r, so that's two rs.
+
+ Ninth letter: r - wait, is there a ninth letter? Let me check. S-t-r-a-w-b-e-r-r-y. Yes, there are two rs, but I think there might be more.
+
+ Wait, let's count again. S-t-r-a-w-b-e-r-r-y. That's 10 letters. So, positions:
+
+ 1: s
+
+ 2: t
+
+ 3: r
+
+ 4: a
+
+ 5: w
+
+ 6: b
+
+ 7: e
+
+ 8: r
+
+ 9: r
+
+ 10: y
+
+ So, positions 3, 8, and 9 are rs. That means there are three rs in "strawberry."
+
+ But earlier I thought there were only two. Maybe I missed one. Let's double-check.
+
+ S-t-r-a-w-b-e-r-r-y.
+
+ r is the third letter, then the eighth, and the ninth. So, three rs.
+
+ Wait, but sometimes people might pronounce it differently, but in the spelling, it's three rs.
+
+ I think the answer is three.
+
+ **Final Answer**
+
+ \[ \boxed{3} \]
+ """
+
+ ```
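The samples above are long chains of thought, so streaming tokens as they are produced can be more convenient than waiting for the full sequence. A minimal editorial sketch using transformers' `TextStreamer` (not from the original card; it assumes `model`, `tokenizer`, and `model_inputs` from the snippet above are already in scope):

```python
from transformers import TextStreamer

# Print each token to stdout as it is generated; skip_prompt hides the input.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**model_inputs, max_new_tokens=512, do_sample=False, streamer=streamer)
```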
+
+ ### Generate the model
+
+ Here is a sample command to generate the model. For symmetric quantization, we found that overflow/NaN can occur on some backends, so it is better to fall back some layers to higher precision. auto-round version > 0.4.1 is required.
+
+ ```bash
+ auto-round \
+     --model Qwen/QwQ-32B-Preview \
+     --device 0 \
+     --group_size 128 \
+     --bits 4 \
+     --disable_eval \
+     --model_dtype "fp16" \
+     --fp_layers "model.layers.5.mlp.down_proj,model.layers.5.mlp.up_proj,model.layers.5.mlp.gate_proj" \
+     --format 'auto_round' \
+     --output_dir "./tmp_autoround"
+ ```
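For scripted workflows, roughly the same quantization can be expressed through auto-round's Python API. This is an editorial sketch, not the authors' recipe: `AutoRound`, `quantize()`, and `save_quantized()` exist in the project, but argument names may differ across versions, and the `layer_config` mapping shown here for excluding layers (mirroring `--fp_layers` above) is an assumption.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "Qwen/QwQ-32B-Preview"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="float16")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Keep the three overflow-prone layers unquantized (bits=16 = "do not quantize").
# Assumption: this mirrors the CLI's --fp_layers; check your auto-round version.
layer_config = {
    "model.layers.5.mlp.down_proj": {"bits": 16},
    "model.layers.5.mlp.up_proj": {"bits": 16},
    "model.layers.5.mlp.gate_proj": {"bits": 16},
}

autoround = AutoRound(
    model, tokenizer,
    bits=4, group_size=128, sym=True,
    iters=200, nsamples=128, seqlen=2048,
    layer_config=layer_config,
)
autoround.quantize()
autoround.save_quantized("./tmp_autoround", format="auto_round")
```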
+
+ ## Ethical Considerations and Limitations
+
+ The model can produce factually incorrect output, and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased or otherwise offensive outputs.
+
+ Therefore, before deploying any applications of the model, developers should perform safety testing.
+
+ ## Caveats and Recommendations
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
+
+ Here is a useful link to learn more about Intel's AI software:
+
+ - Intel Neural Compressor [link](https://github.com/intel/neural-compressor)
+
+ ## Disclaimer
+
+ The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
+
+ ## Cite
+
+ @article{cheng2023optimize,
+   title={Optimize weight rounding via signed gradient descent for the quantization of LLMs},
+   author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
+   journal={arXiv preprint arXiv:2309.05516},
+   year={2023}
+ }
+
+ [arxiv](https://arxiv.org/abs/2309.05516) [github](https://github.com/intel/auto-round)
added_tokens.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "</tool_call>": 151658,
+   "<tool_call>": 151657,
+   "<|box_end|>": 151649,
+   "<|box_start|>": 151648,
+   "<|endoftext|>": 151643,
+   "<|file_sep|>": 151664,
+   "<|fim_middle|>": 151660,
+   "<|fim_pad|>": 151662,
+   "<|fim_prefix|>": 151659,
+   "<|fim_suffix|>": 151661,
+   "<|im_end|>": 151645,
+   "<|im_start|>": 151644,
+   "<|image_pad|>": 151655,
+   "<|object_ref_end|>": 151647,
+   "<|object_ref_start|>": 151646,
+   "<|quad_end|>": 151651,
+   "<|quad_start|>": 151650,
+   "<|repo_name|>": 151663,
+   "<|video_pad|>": 151656,
+   "<|vision_end|>": 151653,
+   "<|vision_pad|>": 151654,
+   "<|vision_start|>": 151652
+ }
config.json ADDED
@@ -0,0 +1,127 @@
+ {
+   "_name_or_path": "/data5/models/QwQ-32B-Preview",
+   "architectures": [
+     "Qwen2ForCausalLM"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 151643,
+   "eos_token_id": 151645,
+   "hidden_act": "silu",
+   "hidden_size": 5120,
+   "initializer_range": 0.02,
+   "intermediate_size": 27648,
+   "max_position_embeddings": 32768,
+   "max_window_layers": 64,
+   "model_type": "qwen2",
+   "num_attention_heads": 40,
+   "num_hidden_layers": 64,
+   "num_key_value_heads": 8,
+   "quantization_config": {
+     "amp": true,
+     "autoround_version": "0.4.2.dev",
+     "batch_size": 8,
+     "bits": 4,
+     "data_type": "int",
+     "dataset": "NeelNanda/pile-10k",
+     "enable_minmax_tuning": true,
+     "enable_norm_bias_tuning": false,
+     "enable_quanted_input": true,
+     "gradient_accumulate_steps": 1,
+     "group_size": 128,
+     "iters": 200,
+     "low_gpu_mem_usage": false,
+     "lr": 0.005,
+     "minmax_lr": 0.005,
+     "modules_to_not_convert": [
+       "model.layers.5.mlp.down_proj",
+       "model.layers.5.mlp.up_proj",
+       "model.layers.5.mlp.gate_proj",
+       "lm_head"
+     ],
+     "nsamples": 128,
+     "quant_method": "awq",
+     "scale_dtype": "torch.float16",
+     "seqlen": 2048,
+     "sym": true,
+     "to_quant_block_names": [
+       [
+         "model.layers.0",
+         "model.layers.1",
+         "model.layers.2",
+         "model.layers.3",
+         "model.layers.4",
+         "model.layers.5",
+         "model.layers.6",
+         "model.layers.7",
+         "model.layers.8",
+         "model.layers.9",
+         "model.layers.10",
+         "model.layers.11",
+         "model.layers.12",
+         "model.layers.13",
+         "model.layers.14",
+         "model.layers.15",
+         "model.layers.16",
+         "model.layers.17",
+         "model.layers.18",
+         "model.layers.19",
+         "model.layers.20",
+         "model.layers.21",
+         "model.layers.22",
+         "model.layers.23",
+         "model.layers.24",
+         "model.layers.25",
+         "model.layers.26",
+         "model.layers.27",
+         "model.layers.28",
+         "model.layers.29",
+         "model.layers.30",
+         "model.layers.31",
+         "model.layers.32",
+         "model.layers.33",
+         "model.layers.34",
+         "model.layers.35",
+         "model.layers.36",
+         "model.layers.37",
+         "model.layers.38",
+         "model.layers.39",
+         "model.layers.40",
+         "model.layers.41",
+         "model.layers.42",
+         "model.layers.43",
+         "model.layers.44",
+         "model.layers.45",
+         "model.layers.46",
+         "model.layers.47",
+         "model.layers.48",
+         "model.layers.49",
+         "model.layers.50",
+         "model.layers.51",
+         "model.layers.52",
+         "model.layers.53",
+         "model.layers.54",
+         "model.layers.55",
+         "model.layers.56",
+         "model.layers.57",
+         "model.layers.58",
+         "model.layers.59",
+         "model.layers.60",
+         "model.layers.61",
+         "model.layers.62",
+         "model.layers.63"
+       ]
+     ],
+     "version": "gemm",
+     "zero_point": false
+   },
+   "rms_norm_eps": 1e-05,
+   "rope_scaling": null,
+   "rope_theta": 1000000.0,
+   "sliding_window": null,
+   "tie_word_embeddings": false,
+   "torch_dtype": "float16",
+   "transformers_version": "4.46.3",
+   "use_cache": true,
+   "use_sliding_window": false,
+   "vocab_size": 152064
+ }
generation_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "bos_token_id": 151643,
+   "do_sample": true,
+   "eos_token_id": [
+     151645,
+     151643
+   ],
+   "pad_token_id": 151643,
+   "temperature": 0.7,
+   "top_k": 20,
+   "top_p": 0.8,
+   "transformers_version": "4.46.3"
+ }
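These are the official sampling defaults shipped with the model; transformers' `generate()` picks them up from generation_config.json automatically when the corresponding arguments are not overridden. The README snippet above forces `do_sample=False` for reproducibility; a one-line editorial sketch of following the defaults instead (assuming `model` and `model_inputs` from that snippet are in scope):

```python
# Omitting do_sample lets generate() fall back to generation_config.json:
# do_sample=True, temperature=0.7, top_p=0.8, top_k=20.
generated_ids = model.generate(**model_inputs, max_new_tokens=512)
```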
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model-00001-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e7d241049eab4e7fbe2dbed923e714ea3e8d60fee4dcabd0235541335371bf8a
+ size 4991836400
model-00002-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c7d40fc90b9109e3951dbb45ebae55ff3ceb0b976daec563b38bdce246c63f7d
+ size 4974450512
model-00003-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dd608c7bc8c3ec78f605783008cdcb1ed65f52494797e40e9a8176639ba00e22
+ size 4993553824
model-00004-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:29a8ec19af306e01c6d1f7e93c41b605422e14708438f31c22c8ea967a2141af
+ size 4997868096
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
quantization_config.json ADDED
@@ -0,0 +1,98 @@
+ {
+   "bits": 4,
+   "group_size": 128,
+   "sym": true,
+   "data_type": "int",
+   "enable_quanted_input": true,
+   "enable_minmax_tuning": true,
+   "seqlen": 2048,
+   "batch_size": 8,
+   "scale_dtype": "torch.float16",
+   "lr": 0.005,
+   "minmax_lr": 0.005,
+   "gradient_accumulate_steps": 1,
+   "iters": 200,
+   "amp": true,
+   "nsamples": 128,
+   "low_gpu_mem_usage": false,
+   "to_quant_block_names": [
+     [
+       "model.layers.0",
+       "model.layers.1",
+       "model.layers.2",
+       "model.layers.3",
+       "model.layers.4",
+       "model.layers.5",
+       "model.layers.6",
+       "model.layers.7",
+       "model.layers.8",
+       "model.layers.9",
+       "model.layers.10",
+       "model.layers.11",
+       "model.layers.12",
+       "model.layers.13",
+       "model.layers.14",
+       "model.layers.15",
+       "model.layers.16",
+       "model.layers.17",
+       "model.layers.18",
+       "model.layers.19",
+       "model.layers.20",
+       "model.layers.21",
+       "model.layers.22",
+       "model.layers.23",
+       "model.layers.24",
+       "model.layers.25",
+       "model.layers.26",
+       "model.layers.27",
+       "model.layers.28",
+       "model.layers.29",
+       "model.layers.30",
+       "model.layers.31",
+       "model.layers.32",
+       "model.layers.33",
+       "model.layers.34",
+       "model.layers.35",
+       "model.layers.36",
+       "model.layers.37",
+       "model.layers.38",
+       "model.layers.39",
+       "model.layers.40",
+       "model.layers.41",
+       "model.layers.42",
+       "model.layers.43",
+       "model.layers.44",
+       "model.layers.45",
+       "model.layers.46",
+       "model.layers.47",
+       "model.layers.48",
+       "model.layers.49",
+       "model.layers.50",
+       "model.layers.51",
+       "model.layers.52",
+       "model.layers.53",
+       "model.layers.54",
+       "model.layers.55",
+       "model.layers.56",
+       "model.layers.57",
+       "model.layers.58",
+       "model.layers.59",
+       "model.layers.60",
+       "model.layers.61",
+       "model.layers.62",
+       "model.layers.63"
+     ]
+   ],
+   "enable_norm_bias_tuning": false,
+   "dataset": "NeelNanda/pile-10k",
+   "autoround_version": "0.4.2.dev",
+   "quant_method": "awq",
+   "zero_point": false,
+   "version": "gemm",
+   "modules_to_not_convert": [
+     "model.layers.5.mlp.down_proj",
+     "model.layers.5.mlp.up_proj",
+     "model.layers.5.mlp.gate_proj",
+     "lm_head"
+   ]
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>",
+     "<|object_ref_start|>",
+     "<|object_ref_end|>",
+     "<|box_start|>",
+     "<|box_end|>",
+     "<|quad_start|>",
+     "<|quad_end|>",
+     "<|vision_start|>",
+     "<|vision_end|>",
+     "<|vision_pad|>",
+     "<|image_pad|>",
+     "<|video_pad|>"
+   ],
+   "eos_token": {
+     "content": "<|im_end|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9c5ae00e602b8860cbd784ba82a8aa14e8feecec692e7076590d014d7b7fdafa
+ size 11421896
tokenizer_config.json ADDED
@@ -0,0 +1,207 @@
+ {
+   "add_bos_token": false,
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "151643": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151644": {
+       "content": "<|im_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151645": {
+       "content": "<|im_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151646": {
+       "content": "<|object_ref_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151647": {
+       "content": "<|object_ref_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151648": {
+       "content": "<|box_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151649": {
+       "content": "<|box_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151650": {
+       "content": "<|quad_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151651": {
+       "content": "<|quad_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151652": {
+       "content": "<|vision_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151653": {
+       "content": "<|vision_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151654": {
+       "content": "<|vision_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151655": {
+       "content": "<|image_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151656": {
+       "content": "<|video_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151657": {
+       "content": "<tool_call>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151658": {
+       "content": "</tool_call>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151659": {
+       "content": "<|fim_prefix|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151660": {
+       "content": "<|fim_middle|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151661": {
+       "content": "<|fim_suffix|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151662": {
+       "content": "<|fim_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151663": {
+       "content": "<|repo_name|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151664": {
+       "content": "<|file_sep|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     }
+   },
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>",
+     "<|object_ref_start|>",
+     "<|object_ref_end|>",
+     "<|box_start|>",
+     "<|box_end|>",
+     "<|quad_start|>",
+     "<|quad_end|>",
+     "<|vision_start|>",
+     "<|vision_end|>",
+     "<|vision_pad|>",
+     "<|image_pad|>",
+     "<|video_pad|>"
+   ],
+   "bos_token": null,
+   "chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n {%- else %}\n {{- 'You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step.' }}\n {%- endif %}\n {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n {%- else %}\n {{- '<|im_start|>system\\nYou are a helpful assistant.<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\\n' + message.content }}\n {%- endif %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '\\n<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {{- tool_call.arguments | tojson }}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|im_end|>",
+   "errors": "replace",
+   "model_max_length": 32768,
+   "pad_token": "<|endoftext|>",
+   "split_special_tokens": false,
+   "tokenizer_class": "Qwen2Tokenizer",
+   "unk_token": null
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff