DavidAU committed
Commit 215990b · verified · 1 parent: 89eddcb

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,462 @@
+ ---
+ base_model: []
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+
+ ---
+ # MN-Wordstorm-I-Brainstorm-exp40-3x
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the passthrough merge method.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ # Six splits plus "end game"
+ # "D" starts at plus .1 VS D/O proj.
+ # 40 plus.
+
+ slices:
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [0, 62]
+
+ # conc layers
+ # split 1
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.01
+       - filter: down_proj
+         value: 0.01
+       - value: 0.11
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.02
+       - filter: down_proj
+         value: 0.02
+       - value: 0.12
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.03
+       - filter: down_proj
+         value: 0.03
+       - value: 0.13
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.04
+       - filter: down_proj
+         value: 0.04
+       - value: 0.61
+
+ # split 2, SURGE D THEN D drop .46, continues @ D .15 (from .13)
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.05
+       - filter: down_proj
+         value: 0.05
+       - value: 0.15
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.06
+       - filter: down_proj
+         value: 0.06
+       - value: 0.16
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.07
+       - filter: down_proj
+         value: 0.07
+       - value: 0.17
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.08
+       - filter: down_proj
+         value: 0.08
+       - value: 0.41
+
+ # split 3, SURGE D to .41, D drop .21 ... follows .17 previous
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.09
+       - filter: down_proj
+         value: 0.09
+       - value: 0.19
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.10
+       - filter: down_proj
+         value: 0.10
+       - value: 0.20
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.11
+       - filter: down_proj
+         value: 0.11
+       - value: 0.22
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.12
+       - filter: down_proj
+         value: 0.12
+       - value: 0.24
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.13
+       - filter: down_proj
+         value: 0.13
+       - value: 0.26
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.14
+       - filter: down_proj
+         value: 0.14
+       - value: 0.28
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.15
+       - filter: down_proj
+         value: 0.15
+       - value: 0.30
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.16
+       - filter: down_proj
+         value: 0.16
+       - value: 0.31
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.20
+       - filter: down_proj
+         value: 0.20
+       - value: 0.32
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.21
+       - filter: down_proj
+         value: 0.21
+       - value: 0.33
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.22
+       - filter: down_proj
+         value: 0.22
+       - value: 0.34
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.23
+       - filter: down_proj
+         value: 0.23
+       - value: 0.35
+
+ # split 4, NO SURGE D, "D" down drop of .24; reverts to .11 (the very first "D" setting)
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.24
+       - filter: down_proj
+         value: 0.24
+       - value: 0.11
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.241
+       - filter: down_proj
+         value: 0.241
+       - value: 0.12
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.242
+       - filter: down_proj
+         value: 0.243
+       - value: 0.13
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.244
+       - filter: down_proj
+         value: 0.244
+       - value: 0.61
+
+ # split 5, D Surge to .61, drop to .15 (following .13)
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.245
+       - filter: down_proj
+         value: 0.245
+       - value: 0.15
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.246
+       - filter: down_proj
+         value: 0.246
+       - value: 0.16
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.247
+       - filter: down_proj
+         value: 0.247
+       - value: 0.17
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.248
+       - filter: down_proj
+         value: 0.248
+       - value: 0.41
+
+ # split 6, D surge to .41, then follows .17
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.249
+       - filter: down_proj
+         value: 0.249
+       - value: 0.19
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.250
+       - filter: down_proj
+         value: 0.250
+       - value: 0.20
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.251
+       - filter: down_proj
+         value: 0.251
+       - value: 0.22
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.252
+       - filter: down_proj
+         value: 0.252
+       - value: 0.24
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.253
+       - filter: down_proj
+         value: 0.254
+       - value: 0.26
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.255
+       - filter: down_proj
+         value: 0.255
+       - value: 0.28
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.256
+       - filter: down_proj
+         value: 0.256
+       - value: 0.60
+
+ # O PROJ, DPROJ to .3333
+ # end game
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.3333333333333
+       - filter: down_proj
+         value: 0.3333333333333
+       - value: 0.3333333333333
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.4444444444444
+       - filter: down_proj
+         value: 0.4444444444444
+       - value: 0.4444444444444
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.5555555555555
+       - filter: down_proj
+         value: 0.5555555555555
+       - value: 0.5555555555555
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.6666666666666
+       - filter: down_proj
+         value: 0.6666666666666
+       - value: 0.6666666666666
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.85
+       - filter: down_proj
+         value: 0.90
+       - value: 0.92
+ merge_method: passthrough
+ dtype: bfloat16
+ ```
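Net effect of the configuration: layers 0-61 pass through once, then layer 62 is appended forty more times with graduated `scale` values on `o_proj` and `down_proj`, which yields the 102 hidden layers recorded in config.json below. A minimal reproduction sketch, assuming a recent mergekit release and its documented Python entry points; `CONFIG_YML` and `OUT_PATH` are placeholder names, and the `H:/...` path in the YAML is local to the author's machine, so an accessible copy of the source checkpoint must be substituted:

```python
# Reproduction sketch, assuming a recent mergekit release (pip install mergekit).
import torch
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "wordstorm-exp40-3x.yml"              # the YAML above, saved to disk (placeholder)
OUT_PATH = "./MN-Wordstorm-I-Brainstorm-exp40-3x"  # where to write the merged model (placeholder)

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    OUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # do the copy/scale work on GPU if present
        copy_tokenizer=True,             # carry the source tokenizer into the output
    ),
)
```

The `mergekit-yaml <config> <output-dir>` CLI should be the equivalent one-liner.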
config.json ADDED
@@ -0,0 +1,27 @@
+ {
+   "_name_or_path": "H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct",
+   "architectures": [
+     "MistralForCausalLM"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "head_dim": 128,
+   "hidden_act": "silu",
+   "hidden_size": 5120,
+   "initializer_range": 0.02,
+   "intermediate_size": 14336,
+   "max_position_embeddings": 1024000,
+   "model_type": "mistral",
+   "num_attention_heads": 32,
+   "num_hidden_layers": 102,
+   "num_key_value_heads": 8,
+   "rms_norm_eps": 1e-05,
+   "rope_theta": 1000000.0,
+   "sliding_window": null,
+   "tie_word_embeddings": false,
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.46.0",
+   "use_cache": true,
+   "vocab_size": 131074
+ }
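Per this config the merge is a 102-layer Mistral-architecture model stored in bfloat16 (the safetensors index later in this commit puts the total weight size at roughly 58 GB). A minimal loading sketch with transformers; `MODEL_PATH` is a placeholder for wherever the checkpoint lives:

```python
# Loading sketch; MODEL_PATH is a placeholder (local directory or a downloaded repo).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "./MN-Wordstorm-I-Brainstorm-exp40-3x"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,  # matches "torch_dtype": "bfloat16" above
    device_map="auto",           # shard the weights across available devices
)

prompt = "Describe a thunderstorm in one vivid sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```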
mergekit_config.yml ADDED
@@ -0,0 +1,434 @@
+ # Six splits plus "end game"
+ # "D" starts at plus .1 VS D/O proj.
+ # 40 plus.
+
+ slices:
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [0, 62]
+
+ # conc layers
+ # split 1
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.01
+       - filter: down_proj
+         value: 0.01
+       - value: 0.11
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.02
+       - filter: down_proj
+         value: 0.02
+       - value: 0.12
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.03
+       - filter: down_proj
+         value: 0.03
+       - value: 0.13
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.04
+       - filter: down_proj
+         value: 0.04
+       - value: 0.61
+
+ # split 2, SURGE D THEN D drop .46, continues @ D .15 (from .13)
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.05
+       - filter: down_proj
+         value: 0.05
+       - value: 0.15
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.06
+       - filter: down_proj
+         value: 0.06
+       - value: 0.16
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.07
+       - filter: down_proj
+         value: 0.07
+       - value: 0.17
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.08
+       - filter: down_proj
+         value: 0.08
+       - value: 0.41
+
+ # split 3, SURGE D to .41, D drop .21 ... follows .17 previous
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.09
+       - filter: down_proj
+         value: 0.09
+       - value: 0.19
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.10
+       - filter: down_proj
+         value: 0.10
+       - value: 0.20
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.11
+       - filter: down_proj
+         value: 0.11
+       - value: 0.22
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.12
+       - filter: down_proj
+         value: 0.12
+       - value: 0.24
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.13
+       - filter: down_proj
+         value: 0.13
+       - value: 0.26
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.14
+       - filter: down_proj
+         value: 0.14
+       - value: 0.28
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.15
+       - filter: down_proj
+         value: 0.15
+       - value: 0.30
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.16
+       - filter: down_proj
+         value: 0.16
+       - value: 0.31
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.20
+       - filter: down_proj
+         value: 0.20
+       - value: 0.32
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.21
+       - filter: down_proj
+         value: 0.21
+       - value: 0.33
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.22
+       - filter: down_proj
+         value: 0.22
+       - value: 0.34
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.23
+       - filter: down_proj
+         value: 0.23
+       - value: 0.35
+
+ # split 4, NO SURGE D, "D" down drop of .24; reverts to .11 (the very first "D" setting)
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.24
+       - filter: down_proj
+         value: 0.24
+       - value: 0.11
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.241
+       - filter: down_proj
+         value: 0.241
+       - value: 0.12
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.242
+       - filter: down_proj
+         value: 0.243
+       - value: 0.13
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.244
+       - filter: down_proj
+         value: 0.244
+       - value: 0.61
+
+ # split 5, D Surge to .61, drop to .15 (following .13)
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.245
+       - filter: down_proj
+         value: 0.245
+       - value: 0.15
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.246
+       - filter: down_proj
+         value: 0.246
+       - value: 0.16
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.247
+       - filter: down_proj
+         value: 0.247
+       - value: 0.17
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.248
+       - filter: down_proj
+         value: 0.248
+       - value: 0.41
+
+ # split 6, D surge to .41, then follows .17
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.249
+       - filter: down_proj
+         value: 0.249
+       - value: 0.19
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.250
+       - filter: down_proj
+         value: 0.250
+       - value: 0.20
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.251
+       - filter: down_proj
+         value: 0.251
+       - value: 0.22
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.252
+       - filter: down_proj
+         value: 0.252
+       - value: 0.24
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.253
+       - filter: down_proj
+         value: 0.254
+       - value: 0.26
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.255
+       - filter: down_proj
+         value: 0.255
+       - value: 0.28
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.256
+       - filter: down_proj
+         value: 0.256
+       - value: 0.60
+
+ # O PROJ, DPROJ to .3333
+ # end game
+
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.3333333333333
+       - filter: down_proj
+         value: 0.3333333333333
+       - value: 0.3333333333333
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.4444444444444
+       - filter: down_proj
+         value: 0.4444444444444
+       - value: 0.4444444444444
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.5555555555555
+       - filter: down_proj
+         value: 0.5555555555555
+       - value: 0.5555555555555
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.6666666666666
+       - filter: down_proj
+         value: 0.6666666666666
+       - value: 0.6666666666666
+ - sources:
+   - model: H:/David_au-RCM-11-models/MN-WORDSTORM-pt8-RCM-Emotion-Action-18.5B-Instruct
+     layer_range: [62, 63]
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.85
+       - filter: down_proj
+         value: 0.90
+       - value: 0.92
+ merge_method: passthrough
+ dtype: bfloat16
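A quick arithmetic check on this slice list: the first slice contributes layers 0-61 (62 layers) and each of the remaining 40 slices re-emits layer 62 once, so the merged stack should come out to 102 layers, matching `num_hidden_layers` in config.json. A short verification sketch against the shipped file (assumes PyYAML is installed):

```python
# Sanity check: count the layers the slice list produces.
import yaml

with open("mergekit_config.yml", "r", encoding="utf-8") as fp:
    cfg = yaml.safe_load(fp)

total_layers = 0
for s in cfg["slices"]:
    start, end = s["sources"][0]["layer_range"]
    total_layers += end - start  # each slice contributes (end - start) layers

print(f"{len(cfg['slices'])} slices -> {total_layers} hidden layers")
# Expected output: 41 slices -> 102 hidden layers
```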
model.safetensors.index.json ADDED
@@ -0,0 +1 @@
+ {"metadata": {"mergekit_version": "0.0.4.4", "total_size": 58302965760}, "weight_map": {"lm_head.weight": "model-00001-of-00062.safetensors", "model.embed_tokens.weight": "model-00002-of-00062.safetensors", "model.layers.0.input_layernorm.weight": "model-00003-of-00062.safetensors", "model.layers.0.mlp.down_proj.weight": "model-00003-of-00062.safetensors", "model.layers.0.mlp.gate_proj.weight": "model-00003-of-00062.safetensors", "model.layers.0.mlp.up_proj.weight": "model-00003-of-00062.safetensors", "model.layers.0.post_attention_layernorm.weight": "model-00003-of-00062.safetensors", "model.layers.0.self_attn.k_proj.weight": "model-00003-of-00062.safetensors", "model.layers.0.self_attn.o_proj.weight": "model-00003-of-00062.safetensors", "model.layers.0.self_attn.q_proj.weight": "model-00003-of-00062.safetensors", "model.layers.0.self_attn.v_proj.weight": "model-00003-of-00062.safetensors", "model.layers.1.input_layernorm.weight": "model-00003-of-00062.safetensors", "model.layers.1.mlp.down_proj.weight": "model-00003-of-00062.safetensors", "model.layers.1.mlp.gate_proj.weight": "model-00003-of-00062.safetensors", "model.layers.1.mlp.up_proj.weight": "model-00003-of-00062.safetensors", "model.layers.1.post_attention_layernorm.weight": "model-00003-of-00062.safetensors", "model.layers.1.self_attn.k_proj.weight": "model-00003-of-00062.safetensors", "model.layers.1.self_attn.o_proj.weight": "model-00004-of-00062.safetensors", "model.layers.1.self_attn.q_proj.weight": "model-00004-of-00062.safetensors", "model.layers.1.self_attn.v_proj.weight": "model-00004-of-00062.safetensors", "model.layers.10.input_layernorm.weight": "model-00004-of-00062.safetensors", "model.layers.10.mlp.down_proj.weight": "model-00004-of-00062.safetensors", "model.layers.10.mlp.gate_proj.weight": "model-00004-of-00062.safetensors", "model.layers.10.mlp.up_proj.weight": "model-00004-of-00062.safetensors", "model.layers.10.post_attention_layernorm.weight": "model-00004-of-00062.safetensors", "model.layers.10.self_attn.k_proj.weight": "model-00004-of-00062.safetensors", "model.layers.10.self_attn.o_proj.weight": "model-00004-of-00062.safetensors", "model.layers.10.self_attn.q_proj.weight": "model-00004-of-00062.safetensors", "model.layers.10.self_attn.v_proj.weight": "model-00004-of-00062.safetensors", "model.layers.11.input_layernorm.weight": "model-00004-of-00062.safetensors", "model.layers.11.mlp.down_proj.weight": "model-00004-of-00062.safetensors", "model.layers.11.mlp.gate_proj.weight": "model-00004-of-00062.safetensors", "model.layers.11.mlp.up_proj.weight": "model-00005-of-00062.safetensors", "model.layers.11.post_attention_layernorm.weight": "model-00005-of-00062.safetensors", "model.layers.11.self_attn.k_proj.weight": "model-00005-of-00062.safetensors", "model.layers.11.self_attn.o_proj.weight": "model-00005-of-00062.safetensors", "model.layers.11.self_attn.q_proj.weight": "model-00005-of-00062.safetensors", "model.layers.11.self_attn.v_proj.weight": "model-00005-of-00062.safetensors", "model.layers.12.input_layernorm.weight": "model-00005-of-00062.safetensors", "model.layers.12.mlp.down_proj.weight": "model-00005-of-00062.safetensors", "model.layers.12.mlp.gate_proj.weight": "model-00005-of-00062.safetensors", "model.layers.12.mlp.up_proj.weight": "model-00005-of-00062.safetensors", "model.layers.12.post_attention_layernorm.weight": "model-00005-of-00062.safetensors", "model.layers.12.self_attn.k_proj.weight": "model-00005-of-00062.safetensors", "model.layers.12.self_attn.o_proj.weight": 
"model-00005-of-00062.safetensors", "model.layers.12.self_attn.q_proj.weight": "model-00005-of-00062.safetensors", "model.layers.12.self_attn.v_proj.weight": "model-00005-of-00062.safetensors", "model.layers.13.input_layernorm.weight": "model-00005-of-00062.safetensors", "model.layers.13.mlp.down_proj.weight": "model-00005-of-00062.safetensors", "model.layers.13.mlp.gate_proj.weight": "model-00006-of-00062.safetensors", "model.layers.13.mlp.up_proj.weight": "model-00006-of-00062.safetensors", "model.layers.13.post_attention_layernorm.weight": "model-00006-of-00062.safetensors", "model.layers.13.self_attn.k_proj.weight": "model-00006-of-00062.safetensors", "model.layers.13.self_attn.o_proj.weight": "model-00006-of-00062.safetensors", "model.layers.13.self_attn.q_proj.weight": "model-00006-of-00062.safetensors", "model.layers.13.self_attn.v_proj.weight": "model-00006-of-00062.safetensors", "model.layers.14.input_layernorm.weight": "model-00006-of-00062.safetensors", "model.layers.14.mlp.down_proj.weight": "model-00006-of-00062.safetensors", "model.layers.14.mlp.gate_proj.weight": "model-00006-of-00062.safetensors", "model.layers.14.mlp.up_proj.weight": "model-00006-of-00062.safetensors", "model.layers.14.post_attention_layernorm.weight": "model-00006-of-00062.safetensors", "model.layers.14.self_attn.k_proj.weight": "model-00006-of-00062.safetensors", "model.layers.14.self_attn.o_proj.weight": "model-00006-of-00062.safetensors", "model.layers.14.self_attn.q_proj.weight": "model-00006-of-00062.safetensors", "model.layers.14.self_attn.v_proj.weight": "model-00006-of-00062.safetensors", "model.layers.15.input_layernorm.weight": "model-00006-of-00062.safetensors", "model.layers.15.mlp.down_proj.weight": "model-00007-of-00062.safetensors", "model.layers.15.mlp.gate_proj.weight": "model-00007-of-00062.safetensors", "model.layers.15.mlp.up_proj.weight": "model-00007-of-00062.safetensors", "model.layers.15.post_attention_layernorm.weight": "model-00007-of-00062.safetensors", "model.layers.15.self_attn.k_proj.weight": "model-00007-of-00062.safetensors", "model.layers.15.self_attn.o_proj.weight": "model-00007-of-00062.safetensors", "model.layers.15.self_attn.q_proj.weight": "model-00007-of-00062.safetensors", "model.layers.15.self_attn.v_proj.weight": "model-00007-of-00062.safetensors", "model.layers.16.input_layernorm.weight": "model-00007-of-00062.safetensors", "model.layers.16.mlp.down_proj.weight": "model-00007-of-00062.safetensors", "model.layers.16.mlp.gate_proj.weight": "model-00007-of-00062.safetensors", "model.layers.16.mlp.up_proj.weight": "model-00007-of-00062.safetensors", "model.layers.16.post_attention_layernorm.weight": "model-00007-of-00062.safetensors", "model.layers.16.self_attn.k_proj.weight": "model-00007-of-00062.safetensors", "model.layers.16.self_attn.o_proj.weight": "model-00008-of-00062.safetensors", "model.layers.16.self_attn.q_proj.weight": "model-00008-of-00062.safetensors", "model.layers.16.self_attn.v_proj.weight": "model-00008-of-00062.safetensors", "model.layers.17.input_layernorm.weight": "model-00008-of-00062.safetensors", "model.layers.17.mlp.down_proj.weight": "model-00008-of-00062.safetensors", "model.layers.17.mlp.gate_proj.weight": "model-00008-of-00062.safetensors", "model.layers.17.mlp.up_proj.weight": "model-00008-of-00062.safetensors", "model.layers.17.post_attention_layernorm.weight": "model-00008-of-00062.safetensors", "model.layers.17.self_attn.k_proj.weight": "model-00008-of-00062.safetensors", "model.layers.17.self_attn.o_proj.weight": 
"model-00008-of-00062.safetensors", "model.layers.17.self_attn.q_proj.weight": "model-00008-of-00062.safetensors", "model.layers.17.self_attn.v_proj.weight": "model-00008-of-00062.safetensors", "model.layers.18.input_layernorm.weight": "model-00008-of-00062.safetensors", "model.layers.18.mlp.down_proj.weight": "model-00008-of-00062.safetensors", "model.layers.18.mlp.gate_proj.weight": "model-00008-of-00062.safetensors", "model.layers.18.mlp.up_proj.weight": "model-00009-of-00062.safetensors", "model.layers.18.post_attention_layernorm.weight": "model-00009-of-00062.safetensors", "model.layers.18.self_attn.k_proj.weight": "model-00009-of-00062.safetensors", "model.layers.18.self_attn.o_proj.weight": "model-00009-of-00062.safetensors", "model.layers.18.self_attn.q_proj.weight": "model-00009-of-00062.safetensors", "model.layers.18.self_attn.v_proj.weight": "model-00009-of-00062.safetensors", "model.layers.19.input_layernorm.weight": "model-00009-of-00062.safetensors", "model.layers.19.mlp.down_proj.weight": "model-00009-of-00062.safetensors", "model.layers.19.mlp.gate_proj.weight": "model-00009-of-00062.safetensors", "model.layers.19.mlp.up_proj.weight": "model-00009-of-00062.safetensors", "model.layers.19.post_attention_layernorm.weight": "model-00009-of-00062.safetensors", "model.layers.19.self_attn.k_proj.weight": "model-00009-of-00062.safetensors", "model.layers.19.self_attn.o_proj.weight": "model-00009-of-00062.safetensors", "model.layers.19.self_attn.q_proj.weight": "model-00009-of-00062.safetensors", "model.layers.19.self_attn.v_proj.weight": "model-00009-of-00062.safetensors", "model.layers.2.input_layernorm.weight": "model-00009-of-00062.safetensors", "model.layers.2.mlp.down_proj.weight": "model-00009-of-00062.safetensors", "model.layers.2.mlp.gate_proj.weight": "model-00010-of-00062.safetensors", "model.layers.2.mlp.up_proj.weight": "model-00010-of-00062.safetensors", "model.layers.2.post_attention_layernorm.weight": "model-00010-of-00062.safetensors", "model.layers.2.self_attn.k_proj.weight": "model-00010-of-00062.safetensors", "model.layers.2.self_attn.o_proj.weight": "model-00010-of-00062.safetensors", "model.layers.2.self_attn.q_proj.weight": "model-00010-of-00062.safetensors", "model.layers.2.self_attn.v_proj.weight": "model-00010-of-00062.safetensors", "model.layers.20.input_layernorm.weight": "model-00010-of-00062.safetensors", "model.layers.20.mlp.down_proj.weight": "model-00010-of-00062.safetensors", "model.layers.20.mlp.gate_proj.weight": "model-00010-of-00062.safetensors", "model.layers.20.mlp.up_proj.weight": "model-00010-of-00062.safetensors", "model.layers.20.post_attention_layernorm.weight": "model-00010-of-00062.safetensors", "model.layers.20.self_attn.k_proj.weight": "model-00010-of-00062.safetensors", "model.layers.20.self_attn.o_proj.weight": "model-00010-of-00062.safetensors", "model.layers.20.self_attn.q_proj.weight": "model-00010-of-00062.safetensors", "model.layers.20.self_attn.v_proj.weight": "model-00010-of-00062.safetensors", "model.layers.21.input_layernorm.weight": "model-00010-of-00062.safetensors", "model.layers.21.mlp.down_proj.weight": "model-00011-of-00062.safetensors", "model.layers.21.mlp.gate_proj.weight": "model-00011-of-00062.safetensors", "model.layers.21.mlp.up_proj.weight": "model-00011-of-00062.safetensors", "model.layers.21.post_attention_layernorm.weight": "model-00011-of-00062.safetensors", "model.layers.21.self_attn.k_proj.weight": "model-00011-of-00062.safetensors", "model.layers.21.self_attn.o_proj.weight": 
"model-00011-of-00062.safetensors", "model.layers.21.self_attn.q_proj.weight": "model-00011-of-00062.safetensors", "model.layers.21.self_attn.v_proj.weight": "model-00011-of-00062.safetensors", "model.layers.22.input_layernorm.weight": "model-00011-of-00062.safetensors", "model.layers.22.mlp.down_proj.weight": "model-00011-of-00062.safetensors", "model.layers.22.mlp.gate_proj.weight": "model-00011-of-00062.safetensors", "model.layers.22.mlp.up_proj.weight": "model-00011-of-00062.safetensors", "model.layers.22.post_attention_layernorm.weight": "model-00011-of-00062.safetensors", "model.layers.22.self_attn.k_proj.weight": "model-00011-of-00062.safetensors", "model.layers.22.self_attn.o_proj.weight": "model-00012-of-00062.safetensors", "model.layers.22.self_attn.q_proj.weight": "model-00012-of-00062.safetensors", "model.layers.22.self_attn.v_proj.weight": "model-00012-of-00062.safetensors", "model.layers.23.input_layernorm.weight": "model-00012-of-00062.safetensors", "model.layers.23.mlp.down_proj.weight": "model-00012-of-00062.safetensors", "model.layers.23.mlp.gate_proj.weight": "model-00012-of-00062.safetensors", "model.layers.23.mlp.up_proj.weight": "model-00012-of-00062.safetensors", "model.layers.23.post_attention_layernorm.weight": "model-00012-of-00062.safetensors", "model.layers.23.self_attn.k_proj.weight": "model-00012-of-00062.safetensors", "model.layers.23.self_attn.o_proj.weight": "model-00012-of-00062.safetensors", "model.layers.23.self_attn.q_proj.weight": "model-00012-of-00062.safetensors", "model.layers.23.self_attn.v_proj.weight": "model-00012-of-00062.safetensors", "model.layers.24.input_layernorm.weight": "model-00012-of-00062.safetensors", "model.layers.24.mlp.down_proj.weight": "model-00012-of-00062.safetensors", "model.layers.24.mlp.gate_proj.weight": "model-00012-of-00062.safetensors", "model.layers.24.mlp.up_proj.weight": "model-00013-of-00062.safetensors", "model.layers.24.post_attention_layernorm.weight": "model-00013-of-00062.safetensors", "model.layers.24.self_attn.k_proj.weight": "model-00013-of-00062.safetensors", "model.layers.24.self_attn.o_proj.weight": "model-00013-of-00062.safetensors", "model.layers.24.self_attn.q_proj.weight": "model-00013-of-00062.safetensors", "model.layers.24.self_attn.v_proj.weight": "model-00013-of-00062.safetensors", "model.layers.25.input_layernorm.weight": "model-00013-of-00062.safetensors", "model.layers.25.mlp.down_proj.weight": "model-00013-of-00062.safetensors", "model.layers.25.mlp.gate_proj.weight": "model-00013-of-00062.safetensors", "model.layers.25.mlp.up_proj.weight": "model-00013-of-00062.safetensors", "model.layers.25.post_attention_layernorm.weight": "model-00013-of-00062.safetensors", "model.layers.25.self_attn.k_proj.weight": "model-00013-of-00062.safetensors", "model.layers.25.self_attn.o_proj.weight": "model-00013-of-00062.safetensors", "model.layers.25.self_attn.q_proj.weight": "model-00013-of-00062.safetensors", "model.layers.25.self_attn.v_proj.weight": "model-00013-of-00062.safetensors", "model.layers.26.input_layernorm.weight": "model-00013-of-00062.safetensors", "model.layers.26.mlp.down_proj.weight": "model-00013-of-00062.safetensors", "model.layers.26.mlp.gate_proj.weight": "model-00014-of-00062.safetensors", "model.layers.26.mlp.up_proj.weight": "model-00014-of-00062.safetensors", "model.layers.26.post_attention_layernorm.weight": "model-00014-of-00062.safetensors", "model.layers.26.self_attn.k_proj.weight": "model-00014-of-00062.safetensors", "model.layers.26.self_attn.o_proj.weight": 
"model-00014-of-00062.safetensors", "model.layers.26.self_attn.q_proj.weight": "model-00014-of-00062.safetensors", "model.layers.26.self_attn.v_proj.weight": "model-00014-of-00062.safetensors", "model.layers.27.input_layernorm.weight": "model-00014-of-00062.safetensors", "model.layers.27.mlp.down_proj.weight": "model-00014-of-00062.safetensors", "model.layers.27.mlp.gate_proj.weight": "model-00014-of-00062.safetensors", "model.layers.27.mlp.up_proj.weight": "model-00014-of-00062.safetensors", "model.layers.27.post_attention_layernorm.weight": "model-00014-of-00062.safetensors", "model.layers.27.self_attn.k_proj.weight": "model-00014-of-00062.safetensors", "model.layers.27.self_attn.o_proj.weight": "model-00014-of-00062.safetensors", "model.layers.27.self_attn.q_proj.weight": "model-00014-of-00062.safetensors", "model.layers.27.self_attn.v_proj.weight": "model-00014-of-00062.safetensors", "model.layers.28.input_layernorm.weight": "model-00014-of-00062.safetensors", "model.layers.28.mlp.down_proj.weight": "model-00015-of-00062.safetensors", "model.layers.28.mlp.gate_proj.weight": "model-00015-of-00062.safetensors", "model.layers.28.mlp.up_proj.weight": "model-00015-of-00062.safetensors", "model.layers.28.post_attention_layernorm.weight": "model-00015-of-00062.safetensors", "model.layers.28.self_attn.k_proj.weight": "model-00015-of-00062.safetensors", "model.layers.28.self_attn.o_proj.weight": "model-00015-of-00062.safetensors", "model.layers.28.self_attn.q_proj.weight": "model-00015-of-00062.safetensors", "model.layers.28.self_attn.v_proj.weight": "model-00015-of-00062.safetensors", "model.layers.29.input_layernorm.weight": "model-00015-of-00062.safetensors", "model.layers.29.mlp.down_proj.weight": "model-00015-of-00062.safetensors", "model.layers.29.mlp.gate_proj.weight": "model-00015-of-00062.safetensors", "model.layers.29.mlp.up_proj.weight": "model-00015-of-00062.safetensors", "model.layers.29.post_attention_layernorm.weight": "model-00015-of-00062.safetensors", "model.layers.29.self_attn.k_proj.weight": "model-00015-of-00062.safetensors", "model.layers.29.self_attn.o_proj.weight": "model-00016-of-00062.safetensors", "model.layers.29.self_attn.q_proj.weight": "model-00016-of-00062.safetensors", "model.layers.29.self_attn.v_proj.weight": "model-00016-of-00062.safetensors", "model.layers.3.input_layernorm.weight": "model-00016-of-00062.safetensors", "model.layers.3.mlp.down_proj.weight": "model-00016-of-00062.safetensors", "model.layers.3.mlp.gate_proj.weight": "model-00016-of-00062.safetensors", "model.layers.3.mlp.up_proj.weight": "model-00016-of-00062.safetensors", "model.layers.3.post_attention_layernorm.weight": "model-00016-of-00062.safetensors", "model.layers.3.self_attn.k_proj.weight": "model-00016-of-00062.safetensors", "model.layers.3.self_attn.o_proj.weight": "model-00016-of-00062.safetensors", "model.layers.3.self_attn.q_proj.weight": "model-00016-of-00062.safetensors", "model.layers.3.self_attn.v_proj.weight": "model-00016-of-00062.safetensors", "model.layers.30.input_layernorm.weight": "model-00016-of-00062.safetensors", "model.layers.30.mlp.down_proj.weight": "model-00016-of-00062.safetensors", "model.layers.30.mlp.gate_proj.weight": "model-00016-of-00062.safetensors", "model.layers.30.mlp.up_proj.weight": "model-00017-of-00062.safetensors", "model.layers.30.post_attention_layernorm.weight": "model-00017-of-00062.safetensors", "model.layers.30.self_attn.k_proj.weight": "model-00017-of-00062.safetensors", "model.layers.30.self_attn.o_proj.weight": 
"model-00017-of-00062.safetensors", "model.layers.30.self_attn.q_proj.weight": "model-00017-of-00062.safetensors", "model.layers.30.self_attn.v_proj.weight": "model-00017-of-00062.safetensors", "model.layers.31.input_layernorm.weight": "model-00017-of-00062.safetensors", "model.layers.31.mlp.down_proj.weight": "model-00017-of-00062.safetensors", "model.layers.31.mlp.gate_proj.weight": "model-00017-of-00062.safetensors", "model.layers.31.mlp.up_proj.weight": "model-00017-of-00062.safetensors", "model.layers.31.post_attention_layernorm.weight": "model-00017-of-00062.safetensors", "model.layers.31.self_attn.k_proj.weight": "model-00017-of-00062.safetensors", "model.layers.31.self_attn.o_proj.weight": "model-00017-of-00062.safetensors", "model.layers.31.self_attn.q_proj.weight": "model-00017-of-00062.safetensors", "model.layers.31.self_attn.v_proj.weight": "model-00017-of-00062.safetensors", "model.layers.32.input_layernorm.weight": "model-00017-of-00062.safetensors", "model.layers.32.mlp.down_proj.weight": "model-00017-of-00062.safetensors", "model.layers.32.mlp.gate_proj.weight": "model-00018-of-00062.safetensors", "model.layers.32.mlp.up_proj.weight": "model-00018-of-00062.safetensors", "model.layers.32.post_attention_layernorm.weight": "model-00018-of-00062.safetensors", "model.layers.32.self_attn.k_proj.weight": "model-00018-of-00062.safetensors", "model.layers.32.self_attn.o_proj.weight": "model-00018-of-00062.safetensors", "model.layers.32.self_attn.q_proj.weight": "model-00018-of-00062.safetensors", "model.layers.32.self_attn.v_proj.weight": "model-00018-of-00062.safetensors", "model.layers.33.input_layernorm.weight": "model-00018-of-00062.safetensors", "model.layers.33.mlp.down_proj.weight": "model-00018-of-00062.safetensors", "model.layers.33.mlp.gate_proj.weight": "model-00018-of-00062.safetensors", "model.layers.33.mlp.up_proj.weight": "model-00018-of-00062.safetensors", "model.layers.33.post_attention_layernorm.weight": "model-00018-of-00062.safetensors", "model.layers.33.self_attn.k_proj.weight": "model-00018-of-00062.safetensors", "model.layers.33.self_attn.o_proj.weight": "model-00018-of-00062.safetensors", "model.layers.33.self_attn.q_proj.weight": "model-00018-of-00062.safetensors", "model.layers.33.self_attn.v_proj.weight": "model-00018-of-00062.safetensors", "model.layers.34.input_layernorm.weight": "model-00018-of-00062.safetensors", "model.layers.34.mlp.down_proj.weight": "model-00019-of-00062.safetensors", "model.layers.34.mlp.gate_proj.weight": "model-00019-of-00062.safetensors", "model.layers.34.mlp.up_proj.weight": "model-00019-of-00062.safetensors", "model.layers.34.post_attention_layernorm.weight": "model-00019-of-00062.safetensors", "model.layers.34.self_attn.k_proj.weight": "model-00019-of-00062.safetensors", "model.layers.34.self_attn.o_proj.weight": "model-00019-of-00062.safetensors", "model.layers.34.self_attn.q_proj.weight": "model-00019-of-00062.safetensors", "model.layers.34.self_attn.v_proj.weight": "model-00019-of-00062.safetensors", "model.layers.35.input_layernorm.weight": "model-00019-of-00062.safetensors", "model.layers.35.mlp.down_proj.weight": "model-00019-of-00062.safetensors", "model.layers.35.mlp.gate_proj.weight": "model-00019-of-00062.safetensors", "model.layers.35.mlp.up_proj.weight": "model-00019-of-00062.safetensors", "model.layers.35.post_attention_layernorm.weight": "model-00019-of-00062.safetensors", "model.layers.35.self_attn.k_proj.weight": "model-00019-of-00062.safetensors", "model.layers.35.self_attn.o_proj.weight": 
"model-00020-of-00062.safetensors", "model.layers.35.self_attn.q_proj.weight": "model-00020-of-00062.safetensors", "model.layers.35.self_attn.v_proj.weight": "model-00020-of-00062.safetensors", "model.layers.36.input_layernorm.weight": "model-00020-of-00062.safetensors", "model.layers.36.mlp.down_proj.weight": "model-00020-of-00062.safetensors", "model.layers.36.mlp.gate_proj.weight": "model-00020-of-00062.safetensors", "model.layers.36.mlp.up_proj.weight": "model-00020-of-00062.safetensors", "model.layers.36.post_attention_layernorm.weight": "model-00020-of-00062.safetensors", "model.layers.36.self_attn.k_proj.weight": "model-00020-of-00062.safetensors", "model.layers.36.self_attn.o_proj.weight": "model-00020-of-00062.safetensors", "model.layers.36.self_attn.q_proj.weight": "model-00020-of-00062.safetensors", "model.layers.36.self_attn.v_proj.weight": "model-00020-of-00062.safetensors", "model.layers.37.input_layernorm.weight": "model-00020-of-00062.safetensors", "model.layers.37.mlp.down_proj.weight": "model-00020-of-00062.safetensors", "model.layers.37.mlp.gate_proj.weight": "model-00020-of-00062.safetensors", "model.layers.37.mlp.up_proj.weight": "model-00021-of-00062.safetensors", "model.layers.37.post_attention_layernorm.weight": "model-00021-of-00062.safetensors", "model.layers.37.self_attn.k_proj.weight": "model-00021-of-00062.safetensors", "model.layers.37.self_attn.o_proj.weight": "model-00021-of-00062.safetensors", "model.layers.37.self_attn.q_proj.weight": "model-00021-of-00062.safetensors", "model.layers.37.self_attn.v_proj.weight": "model-00021-of-00062.safetensors", "model.layers.38.input_layernorm.weight": "model-00021-of-00062.safetensors", "model.layers.38.mlp.down_proj.weight": "model-00021-of-00062.safetensors", "model.layers.38.mlp.gate_proj.weight": "model-00021-of-00062.safetensors", "model.layers.38.mlp.up_proj.weight": "model-00021-of-00062.safetensors", "model.layers.38.post_attention_layernorm.weight": "model-00021-of-00062.safetensors", "model.layers.38.self_attn.k_proj.weight": "model-00021-of-00062.safetensors", "model.layers.38.self_attn.o_proj.weight": "model-00021-of-00062.safetensors", "model.layers.38.self_attn.q_proj.weight": "model-00021-of-00062.safetensors", "model.layers.38.self_attn.v_proj.weight": "model-00021-of-00062.safetensors", "model.layers.39.input_layernorm.weight": "model-00021-of-00062.safetensors", "model.layers.39.mlp.down_proj.weight": "model-00021-of-00062.safetensors", "model.layers.39.mlp.gate_proj.weight": "model-00022-of-00062.safetensors", "model.layers.39.mlp.up_proj.weight": "model-00022-of-00062.safetensors", "model.layers.39.post_attention_layernorm.weight": "model-00022-of-00062.safetensors", "model.layers.39.self_attn.k_proj.weight": "model-00022-of-00062.safetensors", "model.layers.39.self_attn.o_proj.weight": "model-00022-of-00062.safetensors", "model.layers.39.self_attn.q_proj.weight": "model-00022-of-00062.safetensors", "model.layers.39.self_attn.v_proj.weight": "model-00022-of-00062.safetensors", "model.layers.4.input_layernorm.weight": "model-00022-of-00062.safetensors", "model.layers.4.mlp.down_proj.weight": "model-00022-of-00062.safetensors", "model.layers.4.mlp.gate_proj.weight": "model-00022-of-00062.safetensors", "model.layers.4.mlp.up_proj.weight": "model-00022-of-00062.safetensors", "model.layers.4.post_attention_layernorm.weight": "model-00022-of-00062.safetensors", "model.layers.4.self_attn.k_proj.weight": "model-00022-of-00062.safetensors", "model.layers.4.self_attn.o_proj.weight": 
"model-00022-of-00062.safetensors", "model.layers.4.self_attn.q_proj.weight": "model-00022-of-00062.safetensors", "model.layers.4.self_attn.v_proj.weight": "model-00022-of-00062.safetensors", "model.layers.40.input_layernorm.weight": "model-00022-of-00062.safetensors", "model.layers.40.mlp.down_proj.weight": "model-00023-of-00062.safetensors", "model.layers.40.mlp.gate_proj.weight": "model-00023-of-00062.safetensors", "model.layers.40.mlp.up_proj.weight": "model-00023-of-00062.safetensors", "model.layers.40.post_attention_layernorm.weight": "model-00023-of-00062.safetensors", "model.layers.40.self_attn.k_proj.weight": "model-00023-of-00062.safetensors", "model.layers.40.self_attn.o_proj.weight": "model-00023-of-00062.safetensors", "model.layers.40.self_attn.q_proj.weight": "model-00023-of-00062.safetensors", "model.layers.40.self_attn.v_proj.weight": "model-00023-of-00062.safetensors", "model.layers.41.input_layernorm.weight": "model-00023-of-00062.safetensors", "model.layers.41.mlp.down_proj.weight": "model-00023-of-00062.safetensors", "model.layers.41.mlp.gate_proj.weight": "model-00023-of-00062.safetensors", "model.layers.41.mlp.up_proj.weight": "model-00023-of-00062.safetensors", "model.layers.41.post_attention_layernorm.weight": "model-00023-of-00062.safetensors", "model.layers.41.self_attn.k_proj.weight": "model-00023-of-00062.safetensors", "model.layers.41.self_attn.o_proj.weight": "model-00024-of-00062.safetensors", "model.layers.41.self_attn.q_proj.weight": "model-00024-of-00062.safetensors", "model.layers.41.self_attn.v_proj.weight": "model-00024-of-00062.safetensors", "model.layers.42.input_layernorm.weight": "model-00024-of-00062.safetensors", "model.layers.42.mlp.down_proj.weight": "model-00024-of-00062.safetensors", "model.layers.42.mlp.gate_proj.weight": "model-00024-of-00062.safetensors", "model.layers.42.mlp.up_proj.weight": "model-00024-of-00062.safetensors", "model.layers.42.post_attention_layernorm.weight": "model-00024-of-00062.safetensors", "model.layers.42.self_attn.k_proj.weight": "model-00024-of-00062.safetensors", "model.layers.42.self_attn.o_proj.weight": "model-00024-of-00062.safetensors", "model.layers.42.self_attn.q_proj.weight": "model-00024-of-00062.safetensors", "model.layers.42.self_attn.v_proj.weight": "model-00024-of-00062.safetensors", "model.layers.43.input_layernorm.weight": "model-00024-of-00062.safetensors", "model.layers.43.mlp.down_proj.weight": "model-00024-of-00062.safetensors", "model.layers.43.mlp.gate_proj.weight": "model-00024-of-00062.safetensors", "model.layers.43.mlp.up_proj.weight": "model-00025-of-00062.safetensors", "model.layers.43.post_attention_layernorm.weight": "model-00025-of-00062.safetensors", "model.layers.43.self_attn.k_proj.weight": "model-00025-of-00062.safetensors", "model.layers.43.self_attn.o_proj.weight": "model-00025-of-00062.safetensors", "model.layers.43.self_attn.q_proj.weight": "model-00025-of-00062.safetensors", "model.layers.43.self_attn.v_proj.weight": "model-00025-of-00062.safetensors", "model.layers.44.input_layernorm.weight": "model-00025-of-00062.safetensors", "model.layers.44.mlp.down_proj.weight": "model-00025-of-00062.safetensors", "model.layers.44.mlp.gate_proj.weight": "model-00025-of-00062.safetensors", "model.layers.44.mlp.up_proj.weight": "model-00025-of-00062.safetensors", "model.layers.44.post_attention_layernorm.weight": "model-00025-of-00062.safetensors", "model.layers.44.self_attn.k_proj.weight": "model-00025-of-00062.safetensors", "model.layers.44.self_attn.o_proj.weight": 
"model-00025-of-00062.safetensors", "model.layers.44.self_attn.q_proj.weight": "model-00025-of-00062.safetensors", "model.layers.44.self_attn.v_proj.weight": "model-00025-of-00062.safetensors", "model.layers.45.input_layernorm.weight": "model-00025-of-00062.safetensors", "model.layers.45.mlp.down_proj.weight": "model-00025-of-00062.safetensors", "model.layers.45.mlp.gate_proj.weight": "model-00026-of-00062.safetensors", "model.layers.45.mlp.up_proj.weight": "model-00026-of-00062.safetensors", "model.layers.45.post_attention_layernorm.weight": "model-00026-of-00062.safetensors", "model.layers.45.self_attn.k_proj.weight": "model-00026-of-00062.safetensors", "model.layers.45.self_attn.o_proj.weight": "model-00026-of-00062.safetensors", "model.layers.45.self_attn.q_proj.weight": "model-00026-of-00062.safetensors", "model.layers.45.self_attn.v_proj.weight": "model-00026-of-00062.safetensors", "model.layers.46.input_layernorm.weight": "model-00026-of-00062.safetensors", "model.layers.46.mlp.down_proj.weight": "model-00026-of-00062.safetensors", "model.layers.46.mlp.gate_proj.weight": "model-00026-of-00062.safetensors", "model.layers.46.mlp.up_proj.weight": "model-00026-of-00062.safetensors", "model.layers.46.post_attention_layernorm.weight": "model-00026-of-00062.safetensors", "model.layers.46.self_attn.k_proj.weight": "model-00026-of-00062.safetensors", "model.layers.46.self_attn.o_proj.weight": "model-00026-of-00062.safetensors", "model.layers.46.self_attn.q_proj.weight": "model-00026-of-00062.safetensors", "model.layers.46.self_attn.v_proj.weight": "model-00026-of-00062.safetensors", "model.layers.47.input_layernorm.weight": "model-00026-of-00062.safetensors", "model.layers.47.mlp.down_proj.weight": "model-00027-of-00062.safetensors", "model.layers.47.mlp.gate_proj.weight": "model-00027-of-00062.safetensors", "model.layers.47.mlp.up_proj.weight": "model-00027-of-00062.safetensors", "model.layers.47.post_attention_layernorm.weight": "model-00027-of-00062.safetensors", "model.layers.47.self_attn.k_proj.weight": "model-00027-of-00062.safetensors", "model.layers.47.self_attn.o_proj.weight": "model-00027-of-00062.safetensors", "model.layers.47.self_attn.q_proj.weight": "model-00027-of-00062.safetensors", "model.layers.47.self_attn.v_proj.weight": "model-00027-of-00062.safetensors", "model.layers.48.input_layernorm.weight": "model-00027-of-00062.safetensors", "model.layers.48.mlp.down_proj.weight": "model-00027-of-00062.safetensors", "model.layers.48.mlp.gate_proj.weight": "model-00027-of-00062.safetensors", "model.layers.48.mlp.up_proj.weight": "model-00027-of-00062.safetensors", "model.layers.48.post_attention_layernorm.weight": "model-00027-of-00062.safetensors", "model.layers.48.self_attn.k_proj.weight": "model-00027-of-00062.safetensors", "model.layers.48.self_attn.o_proj.weight": "model-00028-of-00062.safetensors", "model.layers.48.self_attn.q_proj.weight": "model-00028-of-00062.safetensors", "model.layers.48.self_attn.v_proj.weight": "model-00028-of-00062.safetensors", "model.layers.49.input_layernorm.weight": "model-00028-of-00062.safetensors", "model.layers.49.mlp.down_proj.weight": "model-00028-of-00062.safetensors", "model.layers.49.mlp.gate_proj.weight": "model-00028-of-00062.safetensors", "model.layers.49.mlp.up_proj.weight": "model-00028-of-00062.safetensors", "model.layers.49.post_attention_layernorm.weight": "model-00028-of-00062.safetensors", "model.layers.49.self_attn.k_proj.weight": "model-00028-of-00062.safetensors", "model.layers.49.self_attn.o_proj.weight": 
"model-00028-of-00062.safetensors", "model.layers.49.self_attn.q_proj.weight": "model-00028-of-00062.safetensors", "model.layers.49.self_attn.v_proj.weight": "model-00028-of-00062.safetensors", "model.layers.5.input_layernorm.weight": "model-00028-of-00062.safetensors", "model.layers.5.mlp.down_proj.weight": "model-00028-of-00062.safetensors", "model.layers.5.mlp.gate_proj.weight": "model-00028-of-00062.safetensors", "model.layers.5.mlp.up_proj.weight": "model-00029-of-00062.safetensors", "model.layers.5.post_attention_layernorm.weight": "model-00029-of-00062.safetensors", "model.layers.5.self_attn.k_proj.weight": "model-00029-of-00062.safetensors", "model.layers.5.self_attn.o_proj.weight": "model-00029-of-00062.safetensors", "model.layers.5.self_attn.q_proj.weight": "model-00029-of-00062.safetensors", "model.layers.5.self_attn.v_proj.weight": "model-00029-of-00062.safetensors", "model.layers.50.input_layernorm.weight": "model-00029-of-00062.safetensors", "model.layers.50.mlp.down_proj.weight": "model-00029-of-00062.safetensors", "model.layers.50.mlp.gate_proj.weight": "model-00029-of-00062.safetensors", "model.layers.50.mlp.up_proj.weight": "model-00029-of-00062.safetensors", "model.layers.50.post_attention_layernorm.weight": "model-00029-of-00062.safetensors", "model.layers.50.self_attn.k_proj.weight": "model-00029-of-00062.safetensors", "model.layers.50.self_attn.o_proj.weight": "model-00029-of-00062.safetensors", "model.layers.50.self_attn.q_proj.weight": "model-00029-of-00062.safetensors", "model.layers.50.self_attn.v_proj.weight": "model-00029-of-00062.safetensors", "model.layers.51.input_layernorm.weight": "model-00029-of-00062.safetensors", "model.layers.51.mlp.down_proj.weight": "model-00029-of-00062.safetensors", "model.layers.51.mlp.gate_proj.weight": "model-00030-of-00062.safetensors", "model.layers.51.mlp.up_proj.weight": "model-00030-of-00062.safetensors", "model.layers.51.post_attention_layernorm.weight": "model-00030-of-00062.safetensors", "model.layers.51.self_attn.k_proj.weight": "model-00030-of-00062.safetensors", "model.layers.51.self_attn.o_proj.weight": "model-00030-of-00062.safetensors", "model.layers.51.self_attn.q_proj.weight": "model-00030-of-00062.safetensors", "model.layers.51.self_attn.v_proj.weight": "model-00030-of-00062.safetensors", "model.layers.52.input_layernorm.weight": "model-00030-of-00062.safetensors", "model.layers.52.mlp.down_proj.weight": "model-00030-of-00062.safetensors", "model.layers.52.mlp.gate_proj.weight": "model-00030-of-00062.safetensors", "model.layers.52.mlp.up_proj.weight": "model-00030-of-00062.safetensors", "model.layers.52.post_attention_layernorm.weight": "model-00030-of-00062.safetensors", "model.layers.52.self_attn.k_proj.weight": "model-00030-of-00062.safetensors", "model.layers.52.self_attn.o_proj.weight": "model-00030-of-00062.safetensors", "model.layers.52.self_attn.q_proj.weight": "model-00030-of-00062.safetensors", "model.layers.52.self_attn.v_proj.weight": "model-00030-of-00062.safetensors", "model.layers.53.input_layernorm.weight": "model-00030-of-00062.safetensors", "model.layers.53.mlp.down_proj.weight": "model-00031-of-00062.safetensors", "model.layers.53.mlp.gate_proj.weight": "model-00031-of-00062.safetensors", "model.layers.53.mlp.up_proj.weight": "model-00031-of-00062.safetensors", "model.layers.53.post_attention_layernorm.weight": "model-00031-of-00062.safetensors", "model.layers.53.self_attn.k_proj.weight": "model-00031-of-00062.safetensors", "model.layers.53.self_attn.o_proj.weight": 
"model-00031-of-00062.safetensors", "model.layers.53.self_attn.q_proj.weight": "model-00031-of-00062.safetensors", "model.layers.53.self_attn.v_proj.weight": "model-00031-of-00062.safetensors", "model.layers.54.input_layernorm.weight": "model-00031-of-00062.safetensors", "model.layers.54.mlp.down_proj.weight": "model-00031-of-00062.safetensors", "model.layers.54.mlp.gate_proj.weight": "model-00031-of-00062.safetensors", "model.layers.54.mlp.up_proj.weight": "model-00031-of-00062.safetensors", "model.layers.54.post_attention_layernorm.weight": "model-00031-of-00062.safetensors", "model.layers.54.self_attn.k_proj.weight": "model-00031-of-00062.safetensors", "model.layers.54.self_attn.o_proj.weight": "model-00032-of-00062.safetensors", "model.layers.54.self_attn.q_proj.weight": "model-00032-of-00062.safetensors", "model.layers.54.self_attn.v_proj.weight": "model-00032-of-00062.safetensors", "model.layers.55.input_layernorm.weight": "model-00032-of-00062.safetensors", "model.layers.55.mlp.down_proj.weight": "model-00032-of-00062.safetensors", "model.layers.55.mlp.gate_proj.weight": "model-00032-of-00062.safetensors", "model.layers.55.mlp.up_proj.weight": "model-00032-of-00062.safetensors", "model.layers.55.post_attention_layernorm.weight": "model-00032-of-00062.safetensors", "model.layers.55.self_attn.k_proj.weight": "model-00032-of-00062.safetensors", "model.layers.55.self_attn.o_proj.weight": "model-00032-of-00062.safetensors", "model.layers.55.self_attn.q_proj.weight": "model-00032-of-00062.safetensors", "model.layers.55.self_attn.v_proj.weight": "model-00032-of-00062.safetensors", "model.layers.56.input_layernorm.weight": "model-00032-of-00062.safetensors", "model.layers.56.mlp.down_proj.weight": "model-00032-of-00062.safetensors", "model.layers.56.mlp.gate_proj.weight": "model-00032-of-00062.safetensors", "model.layers.56.mlp.up_proj.weight": "model-00033-of-00062.safetensors", "model.layers.56.post_attention_layernorm.weight": "model-00033-of-00062.safetensors", "model.layers.56.self_attn.k_proj.weight": "model-00033-of-00062.safetensors", "model.layers.56.self_attn.o_proj.weight": "model-00033-of-00062.safetensors", "model.layers.56.self_attn.q_proj.weight": "model-00033-of-00062.safetensors", "model.layers.56.self_attn.v_proj.weight": "model-00033-of-00062.safetensors", "model.layers.57.input_layernorm.weight": "model-00033-of-00062.safetensors", "model.layers.57.mlp.down_proj.weight": "model-00033-of-00062.safetensors", "model.layers.57.mlp.gate_proj.weight": "model-00033-of-00062.safetensors", "model.layers.57.mlp.up_proj.weight": "model-00033-of-00062.safetensors", "model.layers.57.post_attention_layernorm.weight": "model-00033-of-00062.safetensors", "model.layers.57.self_attn.k_proj.weight": "model-00033-of-00062.safetensors", "model.layers.57.self_attn.o_proj.weight": "model-00033-of-00062.safetensors", "model.layers.57.self_attn.q_proj.weight": "model-00033-of-00062.safetensors", "model.layers.57.self_attn.v_proj.weight": "model-00033-of-00062.safetensors", "model.layers.58.input_layernorm.weight": "model-00033-of-00062.safetensors", "model.layers.58.mlp.down_proj.weight": "model-00033-of-00062.safetensors", "model.layers.58.mlp.gate_proj.weight": "model-00034-of-00062.safetensors", "model.layers.58.mlp.up_proj.weight": "model-00034-of-00062.safetensors", "model.layers.58.post_attention_layernorm.weight": "model-00034-of-00062.safetensors", "model.layers.58.self_attn.k_proj.weight": "model-00034-of-00062.safetensors", "model.layers.58.self_attn.o_proj.weight": 
"model-00034-of-00062.safetensors", "model.layers.58.self_attn.q_proj.weight": "model-00034-of-00062.safetensors", "model.layers.58.self_attn.v_proj.weight": "model-00034-of-00062.safetensors", "model.layers.59.input_layernorm.weight": "model-00034-of-00062.safetensors", "model.layers.59.mlp.down_proj.weight": "model-00034-of-00062.safetensors", "model.layers.59.mlp.gate_proj.weight": "model-00034-of-00062.safetensors", "model.layers.59.mlp.up_proj.weight": "model-00034-of-00062.safetensors", "model.layers.59.post_attention_layernorm.weight": "model-00034-of-00062.safetensors", "model.layers.59.self_attn.k_proj.weight": "model-00034-of-00062.safetensors", "model.layers.59.self_attn.o_proj.weight": "model-00034-of-00062.safetensors", "model.layers.59.self_attn.q_proj.weight": "model-00034-of-00062.safetensors", "model.layers.59.self_attn.v_proj.weight": "model-00034-of-00062.safetensors", "model.layers.6.input_layernorm.weight": "model-00034-of-00062.safetensors", "model.layers.6.mlp.down_proj.weight": "model-00035-of-00062.safetensors", "model.layers.6.mlp.gate_proj.weight": "model-00035-of-00062.safetensors", "model.layers.6.mlp.up_proj.weight": "model-00035-of-00062.safetensors", "model.layers.6.post_attention_layernorm.weight": "model-00035-of-00062.safetensors", "model.layers.6.self_attn.k_proj.weight": "model-00035-of-00062.safetensors", "model.layers.6.self_attn.o_proj.weight": "model-00035-of-00062.safetensors", "model.layers.6.self_attn.q_proj.weight": "model-00035-of-00062.safetensors", "model.layers.6.self_attn.v_proj.weight": "model-00035-of-00062.safetensors", "model.layers.60.input_layernorm.weight": "model-00035-of-00062.safetensors", "model.layers.60.mlp.down_proj.weight": "model-00035-of-00062.safetensors", "model.layers.60.mlp.gate_proj.weight": "model-00035-of-00062.safetensors", "model.layers.60.mlp.up_proj.weight": "model-00035-of-00062.safetensors", "model.layers.60.post_attention_layernorm.weight": "model-00035-of-00062.safetensors", "model.layers.60.self_attn.k_proj.weight": "model-00035-of-00062.safetensors", "model.layers.60.self_attn.o_proj.weight": "model-00036-of-00062.safetensors", "model.layers.60.self_attn.q_proj.weight": "model-00036-of-00062.safetensors", "model.layers.60.self_attn.v_proj.weight": "model-00036-of-00062.safetensors", "model.layers.61.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.61.mlp.down_proj.weight": "model-00036-of-00062.safetensors", "model.layers.61.mlp.gate_proj.weight": "model-00036-of-00062.safetensors", "model.layers.61.mlp.up_proj.weight": "model-00036-of-00062.safetensors", "model.layers.61.post_attention_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.61.self_attn.k_proj.weight": "model-00036-of-00062.safetensors", "model.layers.61.self_attn.o_proj.weight": "model-00036-of-00062.safetensors", "model.layers.61.self_attn.q_proj.weight": "model-00036-of-00062.safetensors", "model.layers.61.self_attn.v_proj.weight": "model-00036-of-00062.safetensors", "model.layers.101.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.100.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.99.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.98.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.97.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.96.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.95.input_layernorm.weight": 
"model-00036-of-00062.safetensors", "model.layers.94.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.93.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.92.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.91.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.90.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.89.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.88.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.87.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.86.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.85.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.84.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.83.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.82.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.81.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.80.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.79.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.78.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.77.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.76.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.75.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.74.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.73.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.72.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.71.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.70.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.69.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.68.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.67.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.66.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.65.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.64.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.63.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.62.input_layernorm.weight": "model-00036-of-00062.safetensors", "model.layers.101.mlp.down_proj.weight": "model-00036-of-00062.safetensors", "model.layers.100.mlp.down_proj.weight": "model-00036-of-00062.safetensors", "model.layers.99.mlp.down_proj.weight": "model-00037-of-00062.safetensors", "model.layers.98.mlp.down_proj.weight": "model-00037-of-00062.safetensors", "model.layers.97.mlp.down_proj.weight": "model-00037-of-00062.safetensors", "model.layers.96.mlp.down_proj.weight": "model-00037-of-00062.safetensors", "model.layers.95.mlp.down_proj.weight": "model-00037-of-00062.safetensors", "model.layers.94.mlp.down_proj.weight": "model-00037-of-00062.safetensors", "model.layers.93.mlp.down_proj.weight": "model-00038-of-00062.safetensors", "model.layers.92.mlp.down_proj.weight": "model-00038-of-00062.safetensors", "model.layers.91.mlp.down_proj.weight": "model-00038-of-00062.safetensors", "model.layers.90.mlp.down_proj.weight": "model-00038-of-00062.safetensors", 
"model.layers.89.mlp.down_proj.weight": "model-00038-of-00062.safetensors", "model.layers.88.mlp.down_proj.weight": "model-00038-of-00062.safetensors", "model.layers.87.mlp.down_proj.weight": "model-00039-of-00062.safetensors", "model.layers.86.mlp.down_proj.weight": "model-00039-of-00062.safetensors", "model.layers.85.mlp.down_proj.weight": "model-00039-of-00062.safetensors", "model.layers.84.mlp.down_proj.weight": "model-00039-of-00062.safetensors", "model.layers.83.mlp.down_proj.weight": "model-00039-of-00062.safetensors", "model.layers.82.mlp.down_proj.weight": "model-00039-of-00062.safetensors", "model.layers.81.mlp.down_proj.weight": "model-00040-of-00062.safetensors", "model.layers.80.mlp.down_proj.weight": "model-00040-of-00062.safetensors", "model.layers.79.mlp.down_proj.weight": "model-00040-of-00062.safetensors", "model.layers.78.mlp.down_proj.weight": "model-00040-of-00062.safetensors", "model.layers.77.mlp.down_proj.weight": "model-00040-of-00062.safetensors", "model.layers.76.mlp.down_proj.weight": "model-00040-of-00062.safetensors", "model.layers.75.mlp.down_proj.weight": "model-00041-of-00062.safetensors", "model.layers.74.mlp.down_proj.weight": "model-00041-of-00062.safetensors", "model.layers.73.mlp.down_proj.weight": "model-00041-of-00062.safetensors", "model.layers.72.mlp.down_proj.weight": "model-00041-of-00062.safetensors", "model.layers.71.mlp.down_proj.weight": "model-00041-of-00062.safetensors", "model.layers.70.mlp.down_proj.weight": "model-00041-of-00062.safetensors", "model.layers.69.mlp.down_proj.weight": "model-00042-of-00062.safetensors", "model.layers.68.mlp.down_proj.weight": "model-00042-of-00062.safetensors", "model.layers.67.mlp.down_proj.weight": "model-00042-of-00062.safetensors", "model.layers.66.mlp.down_proj.weight": "model-00042-of-00062.safetensors", "model.layers.65.mlp.down_proj.weight": "model-00042-of-00062.safetensors", "model.layers.64.mlp.down_proj.weight": "model-00042-of-00062.safetensors", "model.layers.63.mlp.down_proj.weight": "model-00043-of-00062.safetensors", "model.layers.62.mlp.down_proj.weight": "model-00043-of-00062.safetensors", "model.layers.101.mlp.gate_proj.weight": "model-00043-of-00062.safetensors", "model.layers.100.mlp.gate_proj.weight": "model-00043-of-00062.safetensors", "model.layers.99.mlp.gate_proj.weight": "model-00043-of-00062.safetensors", "model.layers.98.mlp.gate_proj.weight": "model-00043-of-00062.safetensors", "model.layers.97.mlp.gate_proj.weight": "model-00044-of-00062.safetensors", "model.layers.96.mlp.gate_proj.weight": "model-00044-of-00062.safetensors", "model.layers.95.mlp.gate_proj.weight": "model-00044-of-00062.safetensors", "model.layers.94.mlp.gate_proj.weight": "model-00044-of-00062.safetensors", "model.layers.93.mlp.gate_proj.weight": "model-00044-of-00062.safetensors", "model.layers.92.mlp.gate_proj.weight": "model-00044-of-00062.safetensors", "model.layers.91.mlp.gate_proj.weight": "model-00045-of-00062.safetensors", "model.layers.90.mlp.gate_proj.weight": "model-00045-of-00062.safetensors", "model.layers.89.mlp.gate_proj.weight": "model-00045-of-00062.safetensors", "model.layers.88.mlp.gate_proj.weight": "model-00045-of-00062.safetensors", "model.layers.87.mlp.gate_proj.weight": "model-00045-of-00062.safetensors", "model.layers.86.mlp.gate_proj.weight": "model-00045-of-00062.safetensors", "model.layers.85.mlp.gate_proj.weight": "model-00046-of-00062.safetensors", "model.layers.84.mlp.gate_proj.weight": "model-00046-of-00062.safetensors", "model.layers.83.mlp.gate_proj.weight": 
"model-00046-of-00062.safetensors", "model.layers.82.mlp.gate_proj.weight": "model-00046-of-00062.safetensors", "model.layers.81.mlp.gate_proj.weight": "model-00046-of-00062.safetensors", "model.layers.80.mlp.gate_proj.weight": "model-00046-of-00062.safetensors", "model.layers.79.mlp.gate_proj.weight": "model-00047-of-00062.safetensors", "model.layers.78.mlp.gate_proj.weight": "model-00047-of-00062.safetensors", "model.layers.77.mlp.gate_proj.weight": "model-00047-of-00062.safetensors", "model.layers.76.mlp.gate_proj.weight": "model-00047-of-00062.safetensors", "model.layers.75.mlp.gate_proj.weight": "model-00047-of-00062.safetensors", "model.layers.74.mlp.gate_proj.weight": "model-00047-of-00062.safetensors", "model.layers.73.mlp.gate_proj.weight": "model-00048-of-00062.safetensors", "model.layers.72.mlp.gate_proj.weight": "model-00048-of-00062.safetensors", "model.layers.71.mlp.gate_proj.weight": "model-00048-of-00062.safetensors", "model.layers.70.mlp.gate_proj.weight": "model-00048-of-00062.safetensors", "model.layers.69.mlp.gate_proj.weight": "model-00048-of-00062.safetensors", "model.layers.68.mlp.gate_proj.weight": "model-00048-of-00062.safetensors", "model.layers.67.mlp.gate_proj.weight": "model-00049-of-00062.safetensors", "model.layers.66.mlp.gate_proj.weight": "model-00049-of-00062.safetensors", "model.layers.65.mlp.gate_proj.weight": "model-00049-of-00062.safetensors", "model.layers.64.mlp.gate_proj.weight": "model-00049-of-00062.safetensors", "model.layers.63.mlp.gate_proj.weight": "model-00049-of-00062.safetensors", "model.layers.62.mlp.gate_proj.weight": "model-00049-of-00062.safetensors", "model.layers.101.mlp.up_proj.weight": "model-00050-of-00062.safetensors", "model.layers.100.mlp.up_proj.weight": "model-00050-of-00062.safetensors", "model.layers.99.mlp.up_proj.weight": "model-00050-of-00062.safetensors", "model.layers.98.mlp.up_proj.weight": "model-00050-of-00062.safetensors", "model.layers.97.mlp.up_proj.weight": "model-00050-of-00062.safetensors", "model.layers.96.mlp.up_proj.weight": "model-00050-of-00062.safetensors", "model.layers.95.mlp.up_proj.weight": "model-00051-of-00062.safetensors", "model.layers.94.mlp.up_proj.weight": "model-00051-of-00062.safetensors", "model.layers.93.mlp.up_proj.weight": "model-00051-of-00062.safetensors", "model.layers.92.mlp.up_proj.weight": "model-00051-of-00062.safetensors", "model.layers.91.mlp.up_proj.weight": "model-00051-of-00062.safetensors", "model.layers.90.mlp.up_proj.weight": "model-00051-of-00062.safetensors", "model.layers.89.mlp.up_proj.weight": "model-00052-of-00062.safetensors", "model.layers.88.mlp.up_proj.weight": "model-00052-of-00062.safetensors", "model.layers.87.mlp.up_proj.weight": "model-00052-of-00062.safetensors", "model.layers.86.mlp.up_proj.weight": "model-00052-of-00062.safetensors", "model.layers.85.mlp.up_proj.weight": "model-00052-of-00062.safetensors", "model.layers.84.mlp.up_proj.weight": "model-00052-of-00062.safetensors", "model.layers.83.mlp.up_proj.weight": "model-00053-of-00062.safetensors", "model.layers.82.mlp.up_proj.weight": "model-00053-of-00062.safetensors", "model.layers.81.mlp.up_proj.weight": "model-00053-of-00062.safetensors", "model.layers.80.mlp.up_proj.weight": "model-00053-of-00062.safetensors", "model.layers.79.mlp.up_proj.weight": "model-00053-of-00062.safetensors", "model.layers.78.mlp.up_proj.weight": "model-00053-of-00062.safetensors", "model.layers.77.mlp.up_proj.weight": "model-00054-of-00062.safetensors", "model.layers.76.mlp.up_proj.weight": 
"model-00054-of-00062.safetensors", "model.layers.75.mlp.up_proj.weight": "model-00054-of-00062.safetensors", "model.layers.74.mlp.up_proj.weight": "model-00054-of-00062.safetensors", "model.layers.73.mlp.up_proj.weight": "model-00054-of-00062.safetensors", "model.layers.72.mlp.up_proj.weight": "model-00054-of-00062.safetensors", "model.layers.71.mlp.up_proj.weight": "model-00055-of-00062.safetensors", "model.layers.70.mlp.up_proj.weight": "model-00055-of-00062.safetensors", "model.layers.69.mlp.up_proj.weight": "model-00055-of-00062.safetensors", "model.layers.68.mlp.up_proj.weight": "model-00055-of-00062.safetensors", "model.layers.67.mlp.up_proj.weight": "model-00055-of-00062.safetensors", "model.layers.66.mlp.up_proj.weight": "model-00055-of-00062.safetensors", "model.layers.65.mlp.up_proj.weight": "model-00056-of-00062.safetensors", "model.layers.64.mlp.up_proj.weight": "model-00056-of-00062.safetensors", "model.layers.63.mlp.up_proj.weight": "model-00056-of-00062.safetensors", "model.layers.62.mlp.up_proj.weight": "model-00056-of-00062.safetensors", "model.layers.101.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.100.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.99.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.98.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.97.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.96.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.95.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.94.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.93.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.92.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.91.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.90.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.89.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.88.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.87.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.86.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.85.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.84.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.83.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.82.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.81.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.80.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.79.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.78.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.77.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.76.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.75.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.74.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", 
"model.layers.73.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.72.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.71.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.70.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.69.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.68.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.67.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.66.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.65.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.64.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.63.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.62.post_attention_layernorm.weight": "model-00056-of-00062.safetensors", "model.layers.101.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.100.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.99.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.98.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.97.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.96.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.95.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.94.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.93.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.92.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.91.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.90.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.89.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.88.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.87.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.86.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.85.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.84.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.83.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.82.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.81.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.80.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.79.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.78.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.77.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.76.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.75.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.74.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.73.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.72.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.71.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.70.self_attn.k_proj.weight": 
"model-00056-of-00062.safetensors", "model.layers.69.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.68.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.67.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.66.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.65.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.64.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.63.self_attn.k_proj.weight": "model-00056-of-00062.safetensors", "model.layers.62.self_attn.k_proj.weight": "model-00057-of-00062.safetensors", "model.layers.101.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.100.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.99.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.98.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.97.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.96.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.95.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.94.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.93.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.92.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.91.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.90.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.89.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.88.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.87.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.86.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.85.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.84.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.83.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.82.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.81.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.80.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.79.self_attn.o_proj.weight": "model-00057-of-00062.safetensors", "model.layers.78.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.77.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.76.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.75.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.74.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.73.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.72.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.71.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.70.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.69.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.68.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.67.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.66.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", 
"model.layers.65.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.64.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.63.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.62.self_attn.o_proj.weight": "model-00058-of-00062.safetensors", "model.layers.101.self_attn.q_proj.weight": "model-00058-of-00062.safetensors", "model.layers.100.self_attn.q_proj.weight": "model-00058-of-00062.safetensors", "model.layers.99.self_attn.q_proj.weight": "model-00058-of-00062.safetensors", "model.layers.98.self_attn.q_proj.weight": "model-00058-of-00062.safetensors", "model.layers.97.self_attn.q_proj.weight": "model-00058-of-00062.safetensors", "model.layers.96.self_attn.q_proj.weight": "model-00058-of-00062.safetensors", "model.layers.95.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.94.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.93.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.92.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.91.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.90.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.89.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.88.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.87.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.86.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.85.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.84.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.83.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.82.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.81.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.80.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.79.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.78.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.77.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.76.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.75.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.74.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.73.self_attn.q_proj.weight": "model-00059-of-00062.safetensors", "model.layers.72.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.71.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.70.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.69.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.68.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.67.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.66.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.65.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.64.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.63.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.62.self_attn.q_proj.weight": "model-00060-of-00062.safetensors", "model.layers.101.self_attn.v_proj.weight": 
"model-00060-of-00062.safetensors", "model.layers.100.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.99.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.98.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.97.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.96.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.95.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.94.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.93.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.92.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.91.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.90.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.89.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.88.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.87.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.86.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.85.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.84.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.83.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.82.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.81.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.80.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.79.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.78.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.77.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.76.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.75.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.74.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.73.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.72.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.71.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.70.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.69.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.68.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.67.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.66.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.65.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.64.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.63.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.62.self_attn.v_proj.weight": "model-00060-of-00062.safetensors", "model.layers.7.input_layernorm.weight": "model-00060-of-00062.safetensors", "model.layers.7.mlp.down_proj.weight": "model-00061-of-00062.safetensors", "model.layers.7.mlp.gate_proj.weight": "model-00061-of-00062.safetensors", "model.layers.7.mlp.up_proj.weight": "model-00061-of-00062.safetensors", "model.layers.7.post_attention_layernorm.weight": "model-00061-of-00062.safetensors", "model.layers.7.self_attn.k_proj.weight": 
"model-00061-of-00062.safetensors", "model.layers.7.self_attn.o_proj.weight": "model-00061-of-00062.safetensors", "model.layers.7.self_attn.q_proj.weight": "model-00061-of-00062.safetensors", "model.layers.7.self_attn.v_proj.weight": "model-00061-of-00062.safetensors", "model.layers.8.input_layernorm.weight": "model-00061-of-00062.safetensors", "model.layers.8.mlp.down_proj.weight": "model-00061-of-00062.safetensors", "model.layers.8.mlp.gate_proj.weight": "model-00061-of-00062.safetensors", "model.layers.8.mlp.up_proj.weight": "model-00061-of-00062.safetensors", "model.layers.8.post_attention_layernorm.weight": "model-00061-of-00062.safetensors", "model.layers.8.self_attn.k_proj.weight": "model-00061-of-00062.safetensors", "model.layers.8.self_attn.o_proj.weight": "model-00062-of-00062.safetensors", "model.layers.8.self_attn.q_proj.weight": "model-00062-of-00062.safetensors", "model.layers.8.self_attn.v_proj.weight": "model-00062-of-00062.safetensors", "model.layers.9.input_layernorm.weight": "model-00062-of-00062.safetensors", "model.layers.9.mlp.down_proj.weight": "model-00062-of-00062.safetensors", "model.layers.9.mlp.gate_proj.weight": "model-00062-of-00062.safetensors", "model.layers.9.mlp.up_proj.weight": "model-00062-of-00062.safetensors", "model.layers.9.post_attention_layernorm.weight": "model-00062-of-00062.safetensors", "model.layers.9.self_attn.k_proj.weight": "model-00062-of-00062.safetensors", "model.layers.9.self_attn.o_proj.weight": "model-00062-of-00062.safetensors", "model.layers.9.self_attn.q_proj.weight": "model-00062-of-00062.safetensors", "model.layers.9.self_attn.v_proj.weight": "model-00062-of-00062.safetensors", "model.norm.weight": "model-00062-of-00062.safetensors"}}
special_tokens_map.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
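The three entries above define the BOS, EOS, and UNK tokens that `transformers` attaches when the tokenizer is loaded. A minimal check (the repo id `DavidAU/MN-Wordstorm-I-Brainstorm-exp40-3x` is an assumption based on the model name in this commit):

```python
# Minimal sketch: confirm the special tokens defined in special_tokens_map.json.
# The repo id is an assumption inferred from the model name, not from this commit.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("DavidAU/MN-Wordstorm-I-Brainstorm-exp40-3x")

print(tok.bos_token)  # "<s>"
print(tok.eos_token)  # "</s>"
print(tok.unk_token)  # "<unk>"
```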
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff
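Since neither diff renders, the raw files can be fetched and inspected directly. A minimal sketch with `huggingface_hub` (again, the repo id is an assumption):

```python
# Minimal sketch: download and inspect the raw tokenizer.json the diff
# viewer cannot display. The repo id is an assumption, not from this commit.
import json

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="DavidAU/MN-Wordstorm-I-Brainstorm-exp40-3x",
    filename="tokenizer.json",
)

with open(path, encoding="utf-8") as f:
    data = json.load(f)

# Top-level sections of a Hugging Face `tokenizers` file: e.g. "model",
# "normalizer", "pre_tokenizer", "post_processor", "added_tokens".
print(list(data.keys()))
```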