CharlieFRuan committed on
Commit 5cc249e · verified · 1 Parent(s): 7386ec3

Initial commit

added_tokens.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "<|im_end|>": 32002,
+   "<|im_start|>": 32001,
+   "[PAD]": 32000
+ }
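The three added ids appear to extend a 32000-entry base vocabulary, which would account for the `"vocab_size": 32003` recorded in mlc-chat-config.json. The following is a minimal sketch of that consistency check, not part of the commit. The helper name `check_added_tokens` is hypothetical, and the base size of 32000 is an assumption inferred from the lowest added id.

```python
import json

# Contents of added_tokens.json as shown in the hunk above.
ADDED_TOKENS_JSON = """
{
  "<|im_end|>": 32002,
  "<|im_start|>": 32001,
  "[PAD]": 32000
}
"""


def check_added_tokens(added: dict, base_vocab_size: int) -> int:
    """Return the total vocab size if the added ids directly extend the base vocab.

    Hypothetical helper for illustration; raises ValueError on gaps or overlaps.
    """
    ids = sorted(added.values())
    expected = list(range(base_vocab_size, base_vocab_size + len(ids)))
    if ids != expected:
        raise ValueError(
            f"added ids {ids} do not extend a base vocab of {base_vocab_size}"
        )
    return base_vocab_size + len(ids)


added_tokens = json.loads(ADDED_TOKENS_JSON)
# Assumption: the base tokenizer has 32000 entries (the lowest added id).
total_vocab_size = check_added_tokens(added_tokens, base_vocab_size=32000)
print(total_vocab_size)
```

A check like this could run before conversion to catch a tokenizer whose added ids leave a gap or collide with existing entries.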
logs.txt ADDED
The diff for this file is too large to render. See raw diff
 
mlc-chat-config.json ADDED
@@ -0,0 +1,43 @@
+ {
+   "model_type": "llama",
+   "quantization": "q0f16",
+   "model_config": {
+     "hidden_size": 2048,
+     "intermediate_size": 5632,
+     "num_attention_heads": 32,
+     "num_hidden_layers": 22,
+     "rms_norm_eps": 1e-05,
+     "vocab_size": 32003,
+     "position_embedding_base": 10000.0,
+     "context_window_size": 2048,
+     "prefill_chunk_size": 2048,
+     "num_key_value_heads": 4,
+     "head_dim": 64,
+     "tensor_parallel_shards": 1,
+     "max_batch_size": 1
+   },
+   "vocab_size": 32003,
+   "context_window_size": 2048,
+   "sliding_window_size": -1,
+   "prefill_chunk_size": 2048,
+   "attention_sink_size": -1,
+   "tensor_parallel_shards": 1,
+   "max_batch_size": 80,
+   "mean_gen_len": 128,
+   "max_gen_len": 512,
+   "shift_fill_factor": 0.3,
+   "temperature": 0.7,
+   "repetition_penalty": 1.0,
+   "top_p": 0.95,
+   "conv_template": "chatml",
+   "pad_token_id": 0,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "tokenizer_files": [
+     "tokenizer.model",
+     "tokenizer.json",
+     "added_tokens.json",
+     "tokenizer_config.json"
+   ],
+   "version": "0.1.0"
+ }
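The `model_config` values above can be cross-checked against the tensor shapes and byte counts that ndarray-cache.json records in this commit. The sketch below is illustrative only and rests on two assumptions: the `qkv_proj` and `gate_up_proj` weights are fused along the output dimension (inferred from the tensor names), and `q0f16` stores each weight in 2 bytes. The helper `nbytes` is hypothetical.

```python
# Values copied from "model_config" in mlc-chat-config.json above.
hidden_size = 2048
intermediate_size = 5632
num_attention_heads = 32
num_key_value_heads = 4
head_dim = 64
vocab_size = 32003

# Assumption: q0f16 means unquantized 16-bit weights, i.e. 2 bytes per parameter.
BYTES_PER_PARAM = 2


def nbytes(shape):
    """Byte size of a dense tensor with the given shape (hypothetical helper)."""
    total = BYTES_PER_PARAM
    for dim in shape:
        total *= dim
    return total


# Assumption: Q, K and V are fused, so the output rows are
# (query heads + key heads + value heads) * head_dim.
qkv_shape = ((num_attention_heads + 2 * num_key_value_heads) * head_dim, hidden_size)
# Assumption: gate and up projections are fused along the output dimension.
gate_up_shape = (2 * intermediate_size, hidden_size)
down_shape = (hidden_size, intermediate_size)
embed_shape = (vocab_size, hidden_size)

for name, shape in [
    ("self_attn.qkv_proj", qkv_shape),
    ("mlp.gate_up_proj", gate_up_shape),
    ("mlp.down_proj", down_shape),
    ("embed_tokens", embed_shape),
]:
    print(f"{name}: shape={shape} nbytes={nbytes(shape)}")
```

Under these assumptions the computed shapes and sizes agree with the `shape` and `nbytes` fields listed for the corresponding tensors in ndarray-cache.json.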
ndarray-cache.json ADDED
@@ -0,0 +1,1905 @@
1
+ {
2
+ "metadata": {
3
+ "ParamSize": 135,
4
+ "ParamBytes": 2200121344.0,
5
+ "BitsPerParam": 16.0
6
+ },
7
+ "records": [
8
+ {
9
+ "dataPath": "params_shard_0.bin",
10
+ "format": "raw-shard",
11
+ "nbytes": 131084288,
12
+ "records": [
13
+ {
14
+ "name": "model.embed_tokens.weight",
15
+ "shape": [
16
+ 32003,
17
+ 2048
18
+ ],
19
+ "dtype": "float16",
20
+ "format": "f32-to-bf16",
21
+ "nbytes": 131084288,
22
+ "byteOffset": 0
23
+ }
24
+ ],
25
+ "md5sum": "c20cd1e65e8b9b8bcd6df7258ad692a5"
26
+ },
27
+ {
28
+ "dataPath": "params_shard_1.bin",
29
+ "format": "raw-shard",
30
+ "nbytes": 46137344,
31
+ "records": [
32
+ {
33
+ "name": "model.layers.0.mlp.gate_up_proj.weight",
34
+ "shape": [
35
+ 11264,
36
+ 2048
37
+ ],
38
+ "dtype": "float16",
39
+ "format": "f32-to-bf16",
40
+ "nbytes": 46137344,
41
+ "byteOffset": 0
42
+ }
43
+ ],
44
+ "md5sum": "67117d26e3683d3fb992ed64fb28a3ec"
45
+ },
46
+ {
47
+ "dataPath": "params_shard_2.bin",
48
+ "format": "raw-shard",
49
+ "nbytes": 23068672,
50
+ "records": [
51
+ {
52
+ "name": "model.layers.0.mlp.down_proj.weight",
53
+ "shape": [
54
+ 2048,
55
+ 5632
56
+ ],
57
+ "dtype": "float16",
58
+ "format": "f32-to-bf16",
59
+ "nbytes": 23068672,
60
+ "byteOffset": 0
61
+ }
62
+ ],
63
+ "md5sum": "2b4cf50efe7b11e8d583d2cd7ec3f327"
64
+ },
65
+ {
66
+ "dataPath": "params_shard_3.bin",
67
+ "format": "raw-shard",
68
+ "nbytes": 29368320,
69
+ "records": [
70
+ {
71
+ "name": "model.layers.0.self_attn.qkv_proj.weight",
72
+ "shape": [
73
+ 2560,
74
+ 2048
75
+ ],
76
+ "dtype": "float16",
77
+ "format": "f32-to-bf16",
78
+ "nbytes": 10485760,
79
+ "byteOffset": 0
80
+ },
81
+ {
82
+ "name": "model.layers.0.self_attn.o_proj.weight",
83
+ "shape": [
84
+ 2048,
85
+ 2048
86
+ ],
87
+ "dtype": "float16",
88
+ "format": "f32-to-bf16",
89
+ "nbytes": 8388608,
90
+ "byteOffset": 10485760
91
+ },
92
+ {
93
+ "name": "model.layers.0.input_layernorm.weight",
94
+ "shape": [
95
+ 2048
96
+ ],
97
+ "dtype": "float16",
98
+ "format": "f32-to-bf16",
99
+ "nbytes": 4096,
100
+ "byteOffset": 18874368
101
+ },
102
+ {
103
+ "name": "model.layers.0.post_attention_layernorm.weight",
104
+ "shape": [
105
+ 2048
106
+ ],
107
+ "dtype": "float16",
108
+ "format": "f32-to-bf16",
109
+ "nbytes": 4096,
110
+ "byteOffset": 18878464
111
+ },
112
+ {
113
+ "name": "model.layers.1.self_attn.qkv_proj.weight",
114
+ "shape": [
115
+ 2560,
116
+ 2048
117
+ ],
118
+ "dtype": "float16",
119
+ "format": "f32-to-bf16",
120
+ "nbytes": 10485760,
121
+ "byteOffset": 18882560
122
+ }
123
+ ],
124
+ "md5sum": "60d3e2175a0afeeea347dc304b764f28"
125
+ },
126
+ {
127
+ "dataPath": "params_shard_4.bin",
128
+ "format": "raw-shard",
129
+ "nbytes": 46137344,
130
+ "records": [
131
+ {
132
+ "name": "model.layers.1.mlp.gate_up_proj.weight",
133
+ "shape": [
134
+ 11264,
135
+ 2048
136
+ ],
137
+ "dtype": "float16",
138
+ "format": "f32-to-bf16",
139
+ "nbytes": 46137344,
140
+ "byteOffset": 0
141
+ }
142
+ ],
143
+ "md5sum": "e1ef9544115afa205802364e0809362c"
144
+ },
145
+ {
146
+ "dataPath": "params_shard_5.bin",
147
+ "format": "raw-shard",
148
+ "nbytes": 31465472,
149
+ "records": [
150
+ {
151
+ "name": "model.layers.1.self_attn.o_proj.weight",
152
+ "shape": [
153
+ 2048,
154
+ 2048
155
+ ],
156
+ "dtype": "float16",
157
+ "format": "f32-to-bf16",
158
+ "nbytes": 8388608,
159
+ "byteOffset": 0
160
+ },
161
+ {
162
+ "name": "model.layers.1.mlp.down_proj.weight",
163
+ "shape": [
164
+ 2048,
165
+ 5632
166
+ ],
167
+ "dtype": "float16",
168
+ "format": "f32-to-bf16",
169
+ "nbytes": 23068672,
170
+ "byteOffset": 8388608
171
+ },
172
+ {
173
+ "name": "model.layers.1.input_layernorm.weight",
174
+ "shape": [
175
+ 2048
176
+ ],
177
+ "dtype": "float16",
178
+ "format": "f32-to-bf16",
179
+ "nbytes": 4096,
180
+ "byteOffset": 31457280
181
+ },
182
+ {
183
+ "name": "model.layers.1.post_attention_layernorm.weight",
184
+ "shape": [
185
+ 2048
186
+ ],
187
+ "dtype": "float16",
188
+ "format": "f32-to-bf16",
189
+ "nbytes": 4096,
190
+ "byteOffset": 31461376
191
+ }
192
+ ],
193
+ "md5sum": "365482b49516f8138fc472cf0a025055"
194
+ },
195
+ {
196
+ "dataPath": "params_shard_6.bin",
197
+ "format": "raw-shard",
198
+ "nbytes": 46137344,
199
+ "records": [
200
+ {
201
+ "name": "model.layers.2.mlp.gate_up_proj.weight",
202
+ "shape": [
203
+ 11264,
204
+ 2048
205
+ ],
206
+ "dtype": "float16",
207
+ "format": "f32-to-bf16",
208
+ "nbytes": 46137344,
209
+ "byteOffset": 0
210
+ }
211
+ ],
212
+ "md5sum": "1c833bec4281e38c4b29f06cce87c206"
213
+ },
214
+ {
215
+ "dataPath": "params_shard_7.bin",
216
+ "format": "raw-shard",
217
+ "nbytes": 23068672,
218
+ "records": [
219
+ {
220
+ "name": "model.layers.2.mlp.down_proj.weight",
221
+ "shape": [
222
+ 2048,
223
+ 5632
224
+ ],
225
+ "dtype": "float16",
226
+ "format": "f32-to-bf16",
227
+ "nbytes": 23068672,
228
+ "byteOffset": 0
229
+ }
230
+ ],
231
+ "md5sum": "a869232a80be393b4d1081950705645a"
232
+ },
233
+ {
234
+ "dataPath": "params_shard_8.bin",
235
+ "format": "raw-shard",
236
+ "nbytes": 29368320,
237
+ "records": [
238
+ {
239
+ "name": "model.layers.2.self_attn.qkv_proj.weight",
240
+ "shape": [
241
+ 2560,
242
+ 2048
243
+ ],
244
+ "dtype": "float16",
245
+ "format": "f32-to-bf16",
246
+ "nbytes": 10485760,
247
+ "byteOffset": 0
248
+ },
249
+ {
250
+ "name": "model.layers.2.self_attn.o_proj.weight",
251
+ "shape": [
252
+ 2048,
253
+ 2048
254
+ ],
255
+ "dtype": "float16",
256
+ "format": "f32-to-bf16",
257
+ "nbytes": 8388608,
258
+ "byteOffset": 10485760
259
+ },
260
+ {
261
+ "name": "model.layers.2.input_layernorm.weight",
262
+ "shape": [
263
+ 2048
264
+ ],
265
+ "dtype": "float16",
266
+ "format": "f32-to-bf16",
267
+ "nbytes": 4096,
268
+ "byteOffset": 18874368
269
+ },
270
+ {
271
+ "name": "model.layers.2.post_attention_layernorm.weight",
272
+ "shape": [
273
+ 2048
274
+ ],
275
+ "dtype": "float16",
276
+ "format": "f32-to-bf16",
277
+ "nbytes": 4096,
278
+ "byteOffset": 18878464
279
+ },
280
+ {
281
+ "name": "model.layers.3.self_attn.qkv_proj.weight",
282
+ "shape": [
283
+ 2560,
284
+ 2048
285
+ ],
286
+ "dtype": "float16",
287
+ "format": "f32-to-bf16",
288
+ "nbytes": 10485760,
289
+ "byteOffset": 18882560
290
+ }
291
+ ],
292
+ "md5sum": "a33240d89c7a60afeb67f2aaa85576bb"
293
+ },
294
+ {
295
+ "dataPath": "params_shard_9.bin",
296
+ "format": "raw-shard",
297
+ "nbytes": 46137344,
298
+ "records": [
299
+ {
300
+ "name": "model.layers.3.mlp.gate_up_proj.weight",
301
+ "shape": [
302
+ 11264,
303
+ 2048
304
+ ],
305
+ "dtype": "float16",
306
+ "format": "f32-to-bf16",
307
+ "nbytes": 46137344,
308
+ "byteOffset": 0
309
+ }
310
+ ],
311
+ "md5sum": "eadf1b8d358535035b5b8bd27aa6c718"
312
+ },
313
+ {
314
+ "dataPath": "params_shard_10.bin",
315
+ "format": "raw-shard",
316
+ "nbytes": 31465472,
317
+ "records": [
318
+ {
319
+ "name": "model.layers.3.self_attn.o_proj.weight",
320
+ "shape": [
321
+ 2048,
322
+ 2048
323
+ ],
324
+ "dtype": "float16",
325
+ "format": "f32-to-bf16",
326
+ "nbytes": 8388608,
327
+ "byteOffset": 0
328
+ },
329
+ {
330
+ "name": "model.layers.3.mlp.down_proj.weight",
331
+ "shape": [
332
+ 2048,
333
+ 5632
334
+ ],
335
+ "dtype": "float16",
336
+ "format": "f32-to-bf16",
337
+ "nbytes": 23068672,
338
+ "byteOffset": 8388608
339
+ },
340
+ {
341
+ "name": "model.layers.3.input_layernorm.weight",
342
+ "shape": [
343
+ 2048
344
+ ],
345
+ "dtype": "float16",
346
+ "format": "f32-to-bf16",
347
+ "nbytes": 4096,
348
+ "byteOffset": 31457280
349
+ },
350
+ {
351
+ "name": "model.layers.3.post_attention_layernorm.weight",
352
+ "shape": [
353
+ 2048
354
+ ],
355
+ "dtype": "float16",
356
+ "format": "f32-to-bf16",
357
+ "nbytes": 4096,
358
+ "byteOffset": 31461376
359
+ }
360
+ ],
361
+ "md5sum": "d3170067741ce82bc338d9a567d562a5"
362
+ },
363
+ {
364
+ "dataPath": "params_shard_11.bin",
365
+ "format": "raw-shard",
366
+ "nbytes": 46137344,
367
+ "records": [
368
+ {
369
+ "name": "model.layers.4.mlp.gate_up_proj.weight",
370
+ "shape": [
371
+ 11264,
372
+ 2048
373
+ ],
374
+ "dtype": "float16",
375
+ "format": "f32-to-bf16",
376
+ "nbytes": 46137344,
377
+ "byteOffset": 0
378
+ }
379
+ ],
380
+ "md5sum": "8f7fbadbece4df84afc813140583d14a"
381
+ },
382
+ {
383
+ "dataPath": "params_shard_12.bin",
384
+ "format": "raw-shard",
385
+ "nbytes": 23068672,
386
+ "records": [
387
+ {
388
+ "name": "model.layers.4.mlp.down_proj.weight",
389
+ "shape": [
390
+ 2048,
391
+ 5632
392
+ ],
393
+ "dtype": "float16",
394
+ "format": "f32-to-bf16",
395
+ "nbytes": 23068672,
396
+ "byteOffset": 0
397
+ }
398
+ ],
399
+ "md5sum": "d0e26dab25866fffe72002a9fd8721fb"
400
+ },
401
+ {
402
+ "dataPath": "params_shard_13.bin",
403
+ "format": "raw-shard",
404
+ "nbytes": 29368320,
405
+ "records": [
406
+ {
407
+ "name": "model.layers.4.self_attn.qkv_proj.weight",
408
+ "shape": [
409
+ 2560,
410
+ 2048
411
+ ],
412
+ "dtype": "float16",
413
+ "format": "f32-to-bf16",
414
+ "nbytes": 10485760,
415
+ "byteOffset": 0
416
+ },
417
+ {
418
+ "name": "model.layers.4.self_attn.o_proj.weight",
419
+ "shape": [
420
+ 2048,
421
+ 2048
422
+ ],
423
+ "dtype": "float16",
424
+ "format": "f32-to-bf16",
425
+ "nbytes": 8388608,
426
+ "byteOffset": 10485760
427
+ },
428
+ {
429
+ "name": "model.layers.4.input_layernorm.weight",
430
+ "shape": [
431
+ 2048
432
+ ],
433
+ "dtype": "float16",
434
+ "format": "f32-to-bf16",
435
+ "nbytes": 4096,
436
+ "byteOffset": 18874368
437
+ },
438
+ {
439
+ "name": "model.layers.4.post_attention_layernorm.weight",
440
+ "shape": [
441
+ 2048
442
+ ],
443
+ "dtype": "float16",
444
+ "format": "f32-to-bf16",
445
+ "nbytes": 4096,
446
+ "byteOffset": 18878464
447
+ },
448
+ {
449
+ "name": "model.layers.5.self_attn.qkv_proj.weight",
450
+ "shape": [
451
+ 2560,
452
+ 2048
453
+ ],
454
+ "dtype": "float16",
455
+ "format": "f32-to-bf16",
456
+ "nbytes": 10485760,
457
+ "byteOffset": 18882560
458
+ }
459
+ ],
460
+ "md5sum": "8852fb1dad9f07ee115dcb2f1561f7cf"
461
+ },
462
+ {
463
+ "dataPath": "params_shard_14.bin",
464
+ "format": "raw-shard",
465
+ "nbytes": 46137344,
466
+ "records": [
467
+ {
468
+ "name": "model.layers.5.mlp.gate_up_proj.weight",
469
+ "shape": [
470
+ 11264,
471
+ 2048
472
+ ],
473
+ "dtype": "float16",
474
+ "format": "f32-to-bf16",
475
+ "nbytes": 46137344,
476
+ "byteOffset": 0
477
+ }
478
+ ],
479
+ "md5sum": "a277afa5a9784549dc94d508746f200f"
480
+ },
481
+ {
482
+ "dataPath": "params_shard_15.bin",
483
+ "format": "raw-shard",
484
+ "nbytes": 31465472,
485
+ "records": [
486
+ {
487
+ "name": "model.layers.5.self_attn.o_proj.weight",
488
+ "shape": [
489
+ 2048,
490
+ 2048
491
+ ],
492
+ "dtype": "float16",
493
+ "format": "f32-to-bf16",
494
+ "nbytes": 8388608,
495
+ "byteOffset": 0
496
+ },
497
+ {
498
+ "name": "model.layers.5.mlp.down_proj.weight",
499
+ "shape": [
500
+ 2048,
501
+ 5632
502
+ ],
503
+ "dtype": "float16",
504
+ "format": "f32-to-bf16",
505
+ "nbytes": 23068672,
506
+ "byteOffset": 8388608
507
+ },
508
+ {
509
+ "name": "model.layers.5.input_layernorm.weight",
510
+ "shape": [
511
+ 2048
512
+ ],
513
+ "dtype": "float16",
514
+ "format": "f32-to-bf16",
515
+ "nbytes": 4096,
516
+ "byteOffset": 31457280
517
+ },
518
+ {
519
+ "name": "model.layers.5.post_attention_layernorm.weight",
520
+ "shape": [
521
+ 2048
522
+ ],
523
+ "dtype": "float16",
524
+ "format": "f32-to-bf16",
525
+ "nbytes": 4096,
526
+ "byteOffset": 31461376
527
+ }
528
+ ],
529
+ "md5sum": "bd00ed5dedbdb94ea8a4955138fbe1ef"
530
+ },
531
+ {
532
+ "dataPath": "params_shard_16.bin",
533
+ "format": "raw-shard",
534
+ "nbytes": 46137344,
535
+ "records": [
536
+ {
537
+ "name": "model.layers.6.mlp.gate_up_proj.weight",
538
+ "shape": [
539
+ 11264,
540
+ 2048
541
+ ],
542
+ "dtype": "float16",
543
+ "format": "f32-to-bf16",
544
+ "nbytes": 46137344,
545
+ "byteOffset": 0
546
+ }
547
+ ],
548
+ "md5sum": "722c27ca67ce6a5fab8b68cff677fcf3"
549
+ },
550
+ {
551
+ "dataPath": "params_shard_17.bin",
552
+ "format": "raw-shard",
553
+ "nbytes": 23068672,
554
+ "records": [
555
+ {
556
+ "name": "model.layers.6.mlp.down_proj.weight",
557
+ "shape": [
558
+ 2048,
559
+ 5632
560
+ ],
561
+ "dtype": "float16",
562
+ "format": "f32-to-bf16",
563
+ "nbytes": 23068672,
564
+ "byteOffset": 0
565
+ }
566
+ ],
567
+ "md5sum": "82ae0fddd173d4292ab9bcd66c9aad09"
568
+ },
569
+ {
570
+ "dataPath": "params_shard_18.bin",
571
+ "format": "raw-shard",
572
+ "nbytes": 29368320,
573
+ "records": [
574
+ {
575
+ "name": "model.layers.6.self_attn.qkv_proj.weight",
576
+ "shape": [
577
+ 2560,
578
+ 2048
579
+ ],
580
+ "dtype": "float16",
581
+ "format": "f32-to-bf16",
582
+ "nbytes": 10485760,
583
+ "byteOffset": 0
584
+ },
585
+ {
586
+ "name": "model.layers.6.self_attn.o_proj.weight",
587
+ "shape": [
588
+ 2048,
589
+ 2048
590
+ ],
591
+ "dtype": "float16",
592
+ "format": "f32-to-bf16",
593
+ "nbytes": 8388608,
594
+ "byteOffset": 10485760
595
+ },
596
+ {
597
+ "name": "model.layers.6.input_layernorm.weight",
598
+ "shape": [
599
+ 2048
600
+ ],
601
+ "dtype": "float16",
602
+ "format": "f32-to-bf16",
603
+ "nbytes": 4096,
604
+ "byteOffset": 18874368
605
+ },
606
+ {
607
+ "name": "model.layers.6.post_attention_layernorm.weight",
608
+ "shape": [
609
+ 2048
610
+ ],
611
+ "dtype": "float16",
612
+ "format": "f32-to-bf16",
613
+ "nbytes": 4096,
614
+ "byteOffset": 18878464
615
+ },
616
+ {
617
+ "name": "model.layers.7.self_attn.qkv_proj.weight",
618
+ "shape": [
619
+ 2560,
620
+ 2048
621
+ ],
622
+ "dtype": "float16",
623
+ "format": "f32-to-bf16",
624
+ "nbytes": 10485760,
625
+ "byteOffset": 18882560
626
+ }
627
+ ],
628
+ "md5sum": "a095fafc2739735f56a29e082277429e"
629
+ },
630
+ {
631
+ "dataPath": "params_shard_19.bin",
632
+ "format": "raw-shard",
633
+ "nbytes": 46137344,
634
+ "records": [
635
+ {
636
+ "name": "model.layers.7.mlp.gate_up_proj.weight",
637
+ "shape": [
638
+ 11264,
639
+ 2048
640
+ ],
641
+ "dtype": "float16",
642
+ "format": "f32-to-bf16",
643
+ "nbytes": 46137344,
644
+ "byteOffset": 0
645
+ }
646
+ ],
647
+ "md5sum": "670d905e29bd865b33b8c0f2a3b9f5d8"
648
+ },
649
+ {
650
+ "dataPath": "params_shard_20.bin",
651
+ "format": "raw-shard",
652
+ "nbytes": 31465472,
653
+ "records": [
654
+ {
655
+ "name": "model.layers.7.self_attn.o_proj.weight",
656
+ "shape": [
657
+ 2048,
658
+ 2048
659
+ ],
660
+ "dtype": "float16",
661
+ "format": "f32-to-bf16",
662
+ "nbytes": 8388608,
663
+ "byteOffset": 0
664
+ },
665
+ {
666
+ "name": "model.layers.7.mlp.down_proj.weight",
667
+ "shape": [
668
+ 2048,
669
+ 5632
670
+ ],
671
+ "dtype": "float16",
672
+ "format": "f32-to-bf16",
673
+ "nbytes": 23068672,
674
+ "byteOffset": 8388608
675
+ },
676
+ {
677
+ "name": "model.layers.7.input_layernorm.weight",
678
+ "shape": [
679
+ 2048
680
+ ],
681
+ "dtype": "float16",
682
+ "format": "f32-to-bf16",
683
+ "nbytes": 4096,
684
+ "byteOffset": 31457280
685
+ },
686
+ {
687
+ "name": "model.layers.7.post_attention_layernorm.weight",
688
+ "shape": [
689
+ 2048
690
+ ],
691
+ "dtype": "float16",
692
+ "format": "f32-to-bf16",
693
+ "nbytes": 4096,
694
+ "byteOffset": 31461376
695
+ }
696
+ ],
697
+ "md5sum": "b7dc6a8421fc8110f95b9b5a8f384f6b"
698
+ },
699
+ {
700
+ "dataPath": "params_shard_21.bin",
701
+ "format": "raw-shard",
702
+ "nbytes": 46137344,
703
+ "records": [
704
+ {
705
+ "name": "model.layers.8.mlp.gate_up_proj.weight",
706
+ "shape": [
707
+ 11264,
708
+ 2048
709
+ ],
710
+ "dtype": "float16",
711
+ "format": "f32-to-bf16",
712
+ "nbytes": 46137344,
713
+ "byteOffset": 0
714
+ }
715
+ ],
716
+ "md5sum": "27fbd620e49f15b501fa1f5e80ba09bb"
717
+ },
718
+ {
719
+ "dataPath": "params_shard_22.bin",
720
+ "format": "raw-shard",
721
+ "nbytes": 23068672,
722
+ "records": [
723
+ {
724
+ "name": "model.layers.8.mlp.down_proj.weight",
725
+ "shape": [
726
+ 2048,
727
+ 5632
728
+ ],
729
+ "dtype": "float16",
730
+ "format": "f32-to-bf16",
731
+ "nbytes": 23068672,
732
+ "byteOffset": 0
733
+ }
734
+ ],
735
+ "md5sum": "cf33ca730c2cce84b5a5fbb1e39b9774"
736
+ },
737
+ {
738
+ "dataPath": "params_shard_23.bin",
739
+ "format": "raw-shard",
740
+ "nbytes": 29368320,
741
+ "records": [
742
+ {
743
+ "name": "model.layers.8.self_attn.qkv_proj.weight",
744
+ "shape": [
745
+ 2560,
746
+ 2048
747
+ ],
748
+ "dtype": "float16",
749
+ "format": "f32-to-bf16",
750
+ "nbytes": 10485760,
751
+ "byteOffset": 0
752
+ },
753
+ {
754
+ "name": "model.layers.8.self_attn.o_proj.weight",
755
+ "shape": [
756
+ 2048,
757
+ 2048
758
+ ],
759
+ "dtype": "float16",
760
+ "format": "f32-to-bf16",
761
+ "nbytes": 8388608,
762
+ "byteOffset": 10485760
763
+ },
764
+ {
765
+ "name": "model.layers.8.input_layernorm.weight",
766
+ "shape": [
767
+ 2048
768
+ ],
769
+ "dtype": "float16",
770
+ "format": "f32-to-bf16",
771
+ "nbytes": 4096,
772
+ "byteOffset": 18874368
773
+ },
774
+ {
775
+ "name": "model.layers.8.post_attention_layernorm.weight",
776
+ "shape": [
777
+ 2048
778
+ ],
779
+ "dtype": "float16",
780
+ "format": "f32-to-bf16",
781
+ "nbytes": 4096,
782
+ "byteOffset": 18878464
783
+ },
784
+ {
785
+ "name": "model.layers.9.self_attn.qkv_proj.weight",
786
+ "shape": [
787
+ 2560,
788
+ 2048
789
+ ],
790
+ "dtype": "float16",
791
+ "format": "f32-to-bf16",
792
+ "nbytes": 10485760,
793
+ "byteOffset": 18882560
794
+ }
795
+ ],
796
+ "md5sum": "92d1e9fe434432c5267791029ade35aa"
797
+ },
798
+ {
799
+ "dataPath": "params_shard_24.bin",
800
+ "format": "raw-shard",
801
+ "nbytes": 46137344,
802
+ "records": [
803
+ {
804
+ "name": "model.layers.9.mlp.gate_up_proj.weight",
805
+ "shape": [
806
+ 11264,
807
+ 2048
808
+ ],
809
+ "dtype": "float16",
810
+ "format": "f32-to-bf16",
811
+ "nbytes": 46137344,
812
+ "byteOffset": 0
813
+ }
814
+ ],
815
+ "md5sum": "bd6069b8a4686f97514179aec1dfe399"
816
+ },
817
+ {
818
+ "dataPath": "params_shard_25.bin",
819
+ "format": "raw-shard",
820
+ "nbytes": 31465472,
821
+ "records": [
822
+ {
823
+ "name": "model.layers.9.self_attn.o_proj.weight",
824
+ "shape": [
825
+ 2048,
826
+ 2048
827
+ ],
828
+ "dtype": "float16",
829
+ "format": "f32-to-bf16",
830
+ "nbytes": 8388608,
831
+ "byteOffset": 0
832
+ },
833
+ {
834
+ "name": "model.layers.9.mlp.down_proj.weight",
835
+ "shape": [
836
+ 2048,
837
+ 5632
838
+ ],
839
+ "dtype": "float16",
840
+ "format": "f32-to-bf16",
841
+ "nbytes": 23068672,
842
+ "byteOffset": 8388608
843
+ },
844
+ {
845
+ "name": "model.layers.9.input_layernorm.weight",
846
+ "shape": [
847
+ 2048
848
+ ],
849
+ "dtype": "float16",
850
+ "format": "f32-to-bf16",
851
+ "nbytes": 4096,
852
+ "byteOffset": 31457280
853
+ },
854
+ {
855
+ "name": "model.layers.9.post_attention_layernorm.weight",
856
+ "shape": [
857
+ 2048
858
+ ],
859
+ "dtype": "float16",
860
+ "format": "f32-to-bf16",
861
+ "nbytes": 4096,
862
+ "byteOffset": 31461376
863
+ }
864
+ ],
865
+ "md5sum": "2e50dfd07b18ad6feb80f4f9dce231bb"
866
+ },
867
+ {
868
+ "dataPath": "params_shard_26.bin",
869
+ "format": "raw-shard",
870
+ "nbytes": 46137344,
871
+ "records": [
872
+ {
873
+ "name": "model.layers.10.mlp.gate_up_proj.weight",
874
+ "shape": [
875
+ 11264,
876
+ 2048
877
+ ],
878
+ "dtype": "float16",
879
+ "format": "f32-to-bf16",
880
+ "nbytes": 46137344,
881
+ "byteOffset": 0
882
+ }
883
+ ],
884
+ "md5sum": "2a0423481b60d2ada523fba34a617515"
885
+ },
886
+ {
887
+ "dataPath": "params_shard_27.bin",
888
+ "format": "raw-shard",
889
+ "nbytes": 23068672,
890
+ "records": [
891
+ {
892
+ "name": "model.layers.10.mlp.down_proj.weight",
893
+ "shape": [
894
+ 2048,
895
+ 5632
896
+ ],
897
+ "dtype": "float16",
898
+ "format": "f32-to-bf16",
899
+ "nbytes": 23068672,
900
+ "byteOffset": 0
901
+ }
902
+ ],
903
+ "md5sum": "e1cb04d558fd26722b1ec82fe1512b75"
904
+ },
905
+ {
906
+ "dataPath": "params_shard_28.bin",
907
+ "format": "raw-shard",
908
+ "nbytes": 29368320,
909
+ "records": [
910
+ {
911
+ "name": "model.layers.10.self_attn.qkv_proj.weight",
912
+ "shape": [
913
+ 2560,
914
+ 2048
915
+ ],
916
+ "dtype": "float16",
917
+ "format": "f32-to-bf16",
918
+ "nbytes": 10485760,
919
+ "byteOffset": 0
920
+ },
921
+ {
922
+ "name": "model.layers.10.self_attn.o_proj.weight",
923
+ "shape": [
924
+ 2048,
925
+ 2048
926
+ ],
927
+ "dtype": "float16",
928
+ "format": "f32-to-bf16",
929
+ "nbytes": 8388608,
930
+ "byteOffset": 10485760
931
+ },
932
+ {
933
+ "name": "model.layers.10.input_layernorm.weight",
934
+ "shape": [
935
+ 2048
936
+ ],
937
+ "dtype": "float16",
938
+ "format": "f32-to-bf16",
939
+ "nbytes": 4096,
940
+ "byteOffset": 18874368
941
+ },
942
+ {
943
+ "name": "model.layers.10.post_attention_layernorm.weight",
944
+ "shape": [
945
+ 2048
946
+ ],
947
+ "dtype": "float16",
948
+ "format": "f32-to-bf16",
949
+ "nbytes": 4096,
950
+ "byteOffset": 18878464
951
+ },
952
+ {
953
+ "name": "model.layers.11.self_attn.qkv_proj.weight",
954
+ "shape": [
955
+ 2560,
956
+ 2048
957
+ ],
958
+ "dtype": "float16",
959
+ "format": "f32-to-bf16",
960
+ "nbytes": 10485760,
961
+ "byteOffset": 18882560
962
+ }
963
+ ],
964
+ "md5sum": "c8b49ae0ab414d6e4db03a1b61c85c60"
965
+ },
966
+ {
967
+ "dataPath": "params_shard_29.bin",
968
+ "format": "raw-shard",
969
+ "nbytes": 46137344,
970
+ "records": [
971
+ {
972
+ "name": "model.layers.11.mlp.gate_up_proj.weight",
973
+ "shape": [
974
+ 11264,
975
+ 2048
976
+ ],
977
+ "dtype": "float16",
978
+ "format": "f32-to-bf16",
979
+ "nbytes": 46137344,
980
+ "byteOffset": 0
981
+ }
982
+ ],
983
+ "md5sum": "bcae72462b2002c4dbc81b77b179bc91"
984
+ },
985
+ {
986
+ "dataPath": "params_shard_30.bin",
987
+ "format": "raw-shard",
988
+ "nbytes": 31465472,
989
+ "records": [
990
+ {
991
+ "name": "model.layers.11.self_attn.o_proj.weight",
992
+ "shape": [
993
+ 2048,
994
+ 2048
995
+ ],
996
+ "dtype": "float16",
997
+ "format": "f32-to-bf16",
998
+ "nbytes": 8388608,
999
+ "byteOffset": 0
1000
+ },
1001
+ {
1002
+ "name": "model.layers.11.mlp.down_proj.weight",
1003
+ "shape": [
1004
+ 2048,
1005
+ 5632
1006
+ ],
1007
+ "dtype": "float16",
1008
+ "format": "f32-to-bf16",
1009
+ "nbytes": 23068672,
1010
+ "byteOffset": 8388608
1011
+ },
1012
+ {
1013
+ "name": "model.layers.11.input_layernorm.weight",
1014
+ "shape": [
1015
+ 2048
1016
+ ],
1017
+ "dtype": "float16",
1018
+ "format": "f32-to-bf16",
1019
+ "nbytes": 4096,
1020
+ "byteOffset": 31457280
1021
+ },
1022
+ {
1023
+ "name": "model.layers.11.post_attention_layernorm.weight",
1024
+ "shape": [
1025
+ 2048
1026
+ ],
1027
+ "dtype": "float16",
1028
+ "format": "f32-to-bf16",
1029
+ "nbytes": 4096,
1030
+ "byteOffset": 31461376
1031
+ }
1032
+ ],
1033
+ "md5sum": "3230e6836eccb4fb3ca80088e0149561"
1034
+ },
1035
+ {
1036
+ "dataPath": "params_shard_31.bin",
1037
+ "format": "raw-shard",
1038
+ "nbytes": 46137344,
1039
+ "records": [
1040
+ {
1041
+ "name": "model.layers.12.mlp.gate_up_proj.weight",
1042
+ "shape": [
1043
+ 11264,
1044
+ 2048
1045
+ ],
1046
+ "dtype": "float16",
1047
+ "format": "f32-to-bf16",
1048
+ "nbytes": 46137344,
1049
+ "byteOffset": 0
1050
+ }
1051
+ ],
1052
+ "md5sum": "6928b3404faeb5cb387b31dfc38a8f29"
1053
+ },
1054
+ {
1055
+ "dataPath": "params_shard_32.bin",
1056
+ "format": "raw-shard",
1057
+ "nbytes": 23068672,
1058
+ "records": [
1059
+ {
1060
+ "name": "model.layers.12.mlp.down_proj.weight",
1061
+ "shape": [
1062
+ 2048,
1063
+ 5632
1064
+ ],
1065
+ "dtype": "float16",
1066
+ "format": "f32-to-bf16",
1067
+ "nbytes": 23068672,
1068
+ "byteOffset": 0
1069
+ }
1070
+ ],
1071
+ "md5sum": "b4b49e428a50c53212c6fb9347338aeb"
1072
+ },
1073
+ {
1074
+ "dataPath": "params_shard_33.bin",
1075
+ "format": "raw-shard",
1076
+ "nbytes": 29368320,
1077
+ "records": [
1078
+ {
1079
+ "name": "model.layers.12.self_attn.qkv_proj.weight",
1080
+ "shape": [
1081
+ 2560,
1082
+ 2048
1083
+ ],
1084
+ "dtype": "float16",
1085
+ "format": "f32-to-bf16",
1086
+ "nbytes": 10485760,
1087
+ "byteOffset": 0
1088
+ },
1089
+ {
1090
+ "name": "model.layers.12.self_attn.o_proj.weight",
1091
+ "shape": [
1092
+ 2048,
1093
+ 2048
1094
+ ],
1095
+ "dtype": "float16",
1096
+ "format": "f32-to-bf16",
1097
+ "nbytes": 8388608,
1098
+ "byteOffset": 10485760
1099
+ },
1100
+ {
1101
+ "name": "model.layers.12.input_layernorm.weight",
1102
+ "shape": [
1103
+ 2048
1104
+ ],
1105
+ "dtype": "float16",
1106
+ "format": "f32-to-bf16",
1107
+ "nbytes": 4096,
1108
+ "byteOffset": 18874368
1109
+ },
1110
+ {
1111
+ "name": "model.layers.12.post_attention_layernorm.weight",
1112
+ "shape": [
1113
+ 2048
1114
+ ],
1115
+ "dtype": "float16",
1116
+ "format": "f32-to-bf16",
1117
+ "nbytes": 4096,
1118
+ "byteOffset": 18878464
1119
+ },
1120
+ {
1121
+ "name": "model.layers.13.self_attn.qkv_proj.weight",
1122
+ "shape": [
1123
+ 2560,
1124
+ 2048
1125
+ ],
1126
+ "dtype": "float16",
1127
+ "format": "f32-to-bf16",
1128
+ "nbytes": 10485760,
1129
+ "byteOffset": 18882560
1130
+ }
1131
+ ],
1132
+ "md5sum": "426ca978f11307c8655c9a2d5cafd686"
1133
+ },
1134
+ {
1135
+ "dataPath": "params_shard_34.bin",
1136
+ "format": "raw-shard",
1137
+ "nbytes": 46137344,
1138
+ "records": [
1139
+ {
1140
+ "name": "model.layers.13.mlp.gate_up_proj.weight",
1141
+ "shape": [
1142
+ 11264,
1143
+ 2048
1144
+ ],
1145
+ "dtype": "float16",
1146
+ "format": "f32-to-bf16",
1147
+ "nbytes": 46137344,
1148
+ "byteOffset": 0
1149
+ }
1150
+ ],
1151
+ "md5sum": "0cbc3fe15f3386c29f02ebf34fdb6354"
1152
+ },
1153
+ {
1154
+ "dataPath": "params_shard_35.bin",
1155
+ "format": "raw-shard",
1156
+ "nbytes": 31465472,
1157
+ "records": [
1158
+ {
1159
+ "name": "model.layers.13.self_attn.o_proj.weight",
1160
+ "shape": [
1161
+ 2048,
1162
+ 2048
1163
+ ],
1164
+ "dtype": "float16",
1165
+ "format": "f32-to-bf16",
1166
+ "nbytes": 8388608,
1167
+ "byteOffset": 0
1168
+ },
1169
+ {
1170
+ "name": "model.layers.13.mlp.down_proj.weight",
1171
+ "shape": [
1172
+ 2048,
1173
+ 5632
1174
+ ],
1175
+ "dtype": "float16",
1176
+ "format": "f32-to-bf16",
1177
+ "nbytes": 23068672,
1178
+ "byteOffset": 8388608
1179
+ },
1180
+ {
1181
+ "name": "model.layers.13.input_layernorm.weight",
1182
+ "shape": [
1183
+ 2048
1184
+ ],
1185
+ "dtype": "float16",
1186
+ "format": "f32-to-bf16",
1187
+ "nbytes": 4096,
1188
+ "byteOffset": 31457280
1189
+ },
1190
+ {
1191
+ "name": "model.layers.13.post_attention_layernorm.weight",
1192
+ "shape": [
1193
+ 2048
1194
+ ],
1195
+ "dtype": "float16",
1196
+ "format": "f32-to-bf16",
1197
+ "nbytes": 4096,
1198
+ "byteOffset": 31461376
1199
+ }
1200
+ ],
1201
+ "md5sum": "3c8f312537af7fa69e14aa491ef22f1f"
1202
+ },
1203
+ {
1204
+ "dataPath": "params_shard_36.bin",
1205
+ "format": "raw-shard",
1206
+ "nbytes": 46137344,
1207
+ "records": [
1208
+ {
1209
+ "name": "model.layers.14.mlp.gate_up_proj.weight",
1210
+ "shape": [
1211
+ 11264,
1212
+ 2048
1213
+ ],
1214
+ "dtype": "float16",
1215
+ "format": "f32-to-bf16",
1216
+ "nbytes": 46137344,
1217
+ "byteOffset": 0
1218
+ }
1219
+ ],
1220
+ "md5sum": "deb610c78f144668041f767ea8c75d1c"
1221
+ },
1222
+ {
1223
+ "dataPath": "params_shard_37.bin",
1224
+ "format": "raw-shard",
1225
+ "nbytes": 23068672,
1226
+ "records": [
1227
+ {
1228
+ "name": "model.layers.14.mlp.down_proj.weight",
1229
+ "shape": [
1230
+ 2048,
1231
+ 5632
1232
+ ],
1233
+ "dtype": "float16",
1234
+ "format": "f32-to-bf16",
1235
+ "nbytes": 23068672,
1236
+ "byteOffset": 0
1237
+ }
1238
+ ],
1239
+ "md5sum": "e12c044268cd33825b56fc6da812c951"
1240
+ },
1241
+ {
1242
+ "dataPath": "params_shard_38.bin",
1243
+ "format": "raw-shard",
1244
+ "nbytes": 29368320,
1245
+ "records": [
1246
+ {
1247
+ "name": "model.layers.14.self_attn.qkv_proj.weight",
1248
+ "shape": [
1249
+ 2560,
1250
+ 2048
1251
+ ],
1252
+ "dtype": "float16",
1253
+ "format": "f32-to-bf16",
1254
+ "nbytes": 10485760,
1255
+ "byteOffset": 0
1256
+ },
1257
+ {
1258
+ "name": "model.layers.14.self_attn.o_proj.weight",
1259
+ "shape": [
1260
+ 2048,
1261
+ 2048
1262
+ ],
1263
+ "dtype": "float16",
1264
+ "format": "f32-to-bf16",
1265
+ "nbytes": 8388608,
1266
+ "byteOffset": 10485760
1267
+ },
1268
+ {
1269
+ "name": "model.layers.14.input_layernorm.weight",
1270
+ "shape": [
1271
+ 2048
1272
+ ],
1273
+ "dtype": "float16",
1274
+ "format": "f32-to-bf16",
1275
+ "nbytes": 4096,
1276
+ "byteOffset": 18874368
1277
+ },
1278
+ {
1279
+ "name": "model.layers.14.post_attention_layernorm.weight",
1280
+ "shape": [
1281
+ 2048
1282
+ ],
1283
+ "dtype": "float16",
1284
+ "format": "f32-to-bf16",
1285
+ "nbytes": 4096,
1286
+ "byteOffset": 18878464
1287
+ },
1288
+ {
1289
+ "name": "model.layers.15.self_attn.qkv_proj.weight",
1290
+ "shape": [
1291
+ 2560,
1292
+ 2048
1293
+ ],
1294
+ "dtype": "float16",
1295
+ "format": "f32-to-bf16",
1296
+ "nbytes": 10485760,
1297
+ "byteOffset": 18882560
1298
+ }
1299
+ ],
1300
+ "md5sum": "c278ac2bd7da8a24810792fbdeb699fb"
1301
+ },
1302
+ {
1303
+ "dataPath": "params_shard_39.bin",
1304
+ "format": "raw-shard",
1305
+ "nbytes": 46137344,
1306
+ "records": [
1307
+ {
1308
+ "name": "model.layers.15.mlp.gate_up_proj.weight",
1309
+ "shape": [
1310
+ 11264,
1311
+ 2048
1312
+ ],
1313
+ "dtype": "float16",
1314
+ "format": "f32-to-bf16",
1315
+ "nbytes": 46137344,
1316
+ "byteOffset": 0
1317
+ }
1318
+ ],
1319
+ "md5sum": "bb2c8c0957896ce1d3dd1e15cb6e7528"
1320
+ },
1321
+ {
1322
+ "dataPath": "params_shard_40.bin",
1323
+ "format": "raw-shard",
1324
+ "nbytes": 31465472,
1325
+ "records": [
1326
+ {
1327
+ "name": "model.layers.15.self_attn.o_proj.weight",
1328
+ "shape": [
1329
+ 2048,
1330
+ 2048
1331
+ ],
1332
+ "dtype": "float16",
1333
+ "format": "f32-to-bf16",
1334
+ "nbytes": 8388608,
1335
+ "byteOffset": 0
1336
+ },
1337
+ {
1338
+ "name": "model.layers.15.mlp.down_proj.weight",
1339
+ "shape": [
1340
+ 2048,
1341
+ 5632
1342
+ ],
1343
+ "dtype": "float16",
1344
+ "format": "f32-to-bf16",
1345
+ "nbytes": 23068672,
1346
+ "byteOffset": 8388608
1347
+ },
1348
+ {
1349
+ "name": "model.layers.15.input_layernorm.weight",
1350
+ "shape": [
1351
+ 2048
1352
+ ],
1353
+ "dtype": "float16",
1354
+ "format": "f32-to-bf16",
1355
+ "nbytes": 4096,
1356
+ "byteOffset": 31457280
1357
+ },
1358
+ {
1359
+ "name": "model.layers.15.post_attention_layernorm.weight",
1360
+ "shape": [
1361
+ 2048
1362
+ ],
1363
+ "dtype": "float16",
1364
+ "format": "f32-to-bf16",
1365
+ "nbytes": 4096,
1366
+ "byteOffset": 31461376
1367
+ }
1368
+ ],
1369
+ "md5sum": "2f1facdd4dd8f3ed19cfb95c55f8b638"
1370
+ },
1371
+ {
1372
+ "dataPath": "params_shard_41.bin",
1373
+ "format": "raw-shard",
1374
+ "nbytes": 46137344,
1375
+ "records": [
1376
+ {
1377
+ "name": "model.layers.16.mlp.gate_up_proj.weight",
1378
+ "shape": [
1379
+ 11264,
1380
+ 2048
1381
+ ],
1382
+ "dtype": "float16",
1383
+ "format": "f32-to-bf16",
1384
+ "nbytes": 46137344,
1385
+ "byteOffset": 0
1386
+ }
1387
+ ],
1388
+ "md5sum": "69ff403c5eb2b8d9d0a8295806eb556a"
1389
+ },
1390
+ {
1391
+ "dataPath": "params_shard_42.bin",
1392
+ "format": "raw-shard",
1393
+ "nbytes": 23068672,
1394
+ "records": [
1395
+ {
1396
+ "name": "model.layers.16.mlp.down_proj.weight",
1397
+ "shape": [
1398
+ 2048,
1399
+ 5632
1400
+ ],
1401
+ "dtype": "float16",
1402
+ "format": "f32-to-bf16",
1403
+ "nbytes": 23068672,
1404
+ "byteOffset": 0
1405
+ }
1406
+ ],
1407
+ "md5sum": "4a5757e5079bbe75f783be287cb47bd3"
1408
+ },
1409
+ {
1410
+ "dataPath": "params_shard_43.bin",
1411
+ "format": "raw-shard",
1412
+ "nbytes": 29368320,
1413
+ "records": [
1414
+ {
1415
+ "name": "model.layers.16.self_attn.qkv_proj.weight",
1416
+ "shape": [
1417
+ 2560,
1418
+ 2048
1419
+ ],
1420
+ "dtype": "float16",
1421
+ "format": "f32-to-bf16",
1422
+ "nbytes": 10485760,
1423
+ "byteOffset": 0
1424
+ },
1425
+ {
1426
+ "name": "model.layers.16.self_attn.o_proj.weight",
1427
+ "shape": [
1428
+ 2048,
1429
+ 2048
1430
+ ],
1431
+ "dtype": "float16",
1432
+ "format": "f32-to-bf16",
1433
+ "nbytes": 8388608,
1434
+ "byteOffset": 10485760
1435
+ },
1436
+ {
1437
+ "name": "model.layers.16.input_layernorm.weight",
1438
+ "shape": [
1439
+ 2048
1440
+ ],
1441
+ "dtype": "float16",
1442
+ "format": "f32-to-bf16",
1443
+ "nbytes": 4096,
1444
+ "byteOffset": 18874368
1445
+ },
1446
+ {
1447
+ "name": "model.layers.16.post_attention_layernorm.weight",
1448
+ "shape": [
1449
+ 2048
1450
+ ],
1451
+ "dtype": "float16",
1452
+ "format": "f32-to-bf16",
1453
+ "nbytes": 4096,
1454
+ "byteOffset": 18878464
1455
+ },
1456
+ {
1457
+ "name": "model.layers.17.self_attn.qkv_proj.weight",
1458
+ "shape": [
1459
+ 2560,
1460
+ 2048
1461
+ ],
1462
+ "dtype": "float16",
1463
+ "format": "f32-to-bf16",
1464
+ "nbytes": 10485760,
1465
+ "byteOffset": 18882560
1466
+ }
1467
+ ],
1468
+ "md5sum": "cc24c6b2cdcdf5c26a903d900568b52a"
1469
+ },
1470
+ {
1471
+ "dataPath": "params_shard_44.bin",
1472
+ "format": "raw-shard",
1473
+ "nbytes": 46137344,
1474
+ "records": [
1475
+ {
1476
+ "name": "model.layers.17.mlp.gate_up_proj.weight",
1477
+ "shape": [
1478
+ 11264,
1479
+ 2048
1480
+ ],
1481
+ "dtype": "float16",
1482
+ "format": "f32-to-bf16",
1483
+ "nbytes": 46137344,
1484
+ "byteOffset": 0
1485
+ }
1486
+ ],
1487
+ "md5sum": "9277df931046d3d4d6c45eca5931b8a4"
1488
+ },
1489
+ {
1490
+ "dataPath": "params_shard_45.bin",
1491
+ "format": "raw-shard",
1492
+ "nbytes": 31465472,
1493
+ "records": [
1494
+ {
1495
+ "name": "model.layers.17.self_attn.o_proj.weight",
1496
+ "shape": [
1497
+ 2048,
1498
+ 2048
1499
+ ],
1500
+ "dtype": "float16",
1501
+ "format": "f32-to-bf16",
1502
+ "nbytes": 8388608,
1503
+ "byteOffset": 0
1504
+ },
1505
+ {
1506
+ "name": "model.layers.17.mlp.down_proj.weight",
1507
+ "shape": [
1508
+ 2048,
1509
+ 5632
1510
+ ],
1511
+ "dtype": "float16",
1512
+ "format": "f32-to-bf16",
1513
+ "nbytes": 23068672,
1514
+ "byteOffset": 8388608
1515
+ },
1516
+ {
1517
+ "name": "model.layers.17.input_layernorm.weight",
1518
+ "shape": [
1519
+ 2048
1520
+ ],
1521
+ "dtype": "float16",
1522
+ "format": "f32-to-bf16",
1523
+ "nbytes": 4096,
1524
+ "byteOffset": 31457280
1525
+ },
1526
+ {
1527
+ "name": "model.layers.17.post_attention_layernorm.weight",
1528
+ "shape": [
1529
+ 2048
1530
+ ],
1531
+ "dtype": "float16",
1532
+ "format": "f32-to-bf16",
1533
+ "nbytes": 4096,
1534
+ "byteOffset": 31461376
1535
+ }
1536
+ ],
1537
+ "md5sum": "19c27da9cbe222113814f105b68b6a8d"
1538
+ },
1539
+ {
1540
+ "dataPath": "params_shard_46.bin",
1541
+ "format": "raw-shard",
1542
+ "nbytes": 46137344,
1543
+ "records": [
1544
+ {
1545
+ "name": "model.layers.18.mlp.gate_up_proj.weight",
1546
+ "shape": [
1547
+ 11264,
1548
+ 2048
1549
+ ],
1550
+ "dtype": "float16",
1551
+ "format": "f32-to-bf16",
1552
+ "nbytes": 46137344,
1553
+ "byteOffset": 0
1554
+ }
1555
+ ],
1556
+ "md5sum": "612904acb3f0c114b19ae2b544dad7a6"
1557
+ },
1558
+ {
1559
+ "dataPath": "params_shard_47.bin",
1560
+ "format": "raw-shard",
1561
+ "nbytes": 23068672,
1562
+ "records": [
1563
+ {
1564
+ "name": "model.layers.18.mlp.down_proj.weight",
1565
+ "shape": [
1566
+ 2048,
1567
+ 5632
1568
+ ],
1569
+ "dtype": "float16",
1570
+ "format": "f32-to-bf16",
1571
+ "nbytes": 23068672,
1572
+ "byteOffset": 0
1573
+ }
1574
+ ],
1575
+ "md5sum": "771730159052b4f05ff7a95c6bef476e"
1576
+ },
1577
+ {
1578
+ "dataPath": "params_shard_48.bin",
1579
+ "format": "raw-shard",
1580
+ "nbytes": 29368320,
1581
+ "records": [
1582
+ {
1583
+ "name": "model.layers.18.self_attn.qkv_proj.weight",
1584
+ "shape": [
1585
+ 2560,
1586
+ 2048
1587
+ ],
1588
+ "dtype": "float16",
1589
+ "format": "f32-to-bf16",
1590
+ "nbytes": 10485760,
1591
+ "byteOffset": 0
1592
+ },
1593
+ {
1594
+ "name": "model.layers.18.self_attn.o_proj.weight",
1595
+ "shape": [
1596
+ 2048,
1597
+ 2048
1598
+ ],
1599
+ "dtype": "float16",
1600
+ "format": "f32-to-bf16",
1601
+ "nbytes": 8388608,
1602
+ "byteOffset": 10485760
1603
+ },
1604
+ {
1605
+ "name": "model.layers.18.input_layernorm.weight",
1606
+ "shape": [
1607
+ 2048
1608
+ ],
1609
+ "dtype": "float16",
1610
+ "format": "f32-to-bf16",
1611
+ "nbytes": 4096,
1612
+ "byteOffset": 18874368
1613
+ },
1614
+ {
1615
+ "name": "model.layers.18.post_attention_layernorm.weight",
1616
+ "shape": [
1617
+ 2048
1618
+ ],
1619
+ "dtype": "float16",
1620
+ "format": "f32-to-bf16",
1621
+ "nbytes": 4096,
1622
+ "byteOffset": 18878464
1623
+ },
1624
+ {
1625
+ "name": "model.layers.19.self_attn.qkv_proj.weight",
1626
+ "shape": [
1627
+ 2560,
1628
+ 2048
1629
+ ],
1630
+ "dtype": "float16",
1631
+ "format": "f32-to-bf16",
1632
+ "nbytes": 10485760,
1633
+ "byteOffset": 18882560
1634
+ }
1635
+ ],
1636
+ "md5sum": "321458dc293a90f188ce10b86983fef0"
1637
+ },
1638
+ {
1639
+ "dataPath": "params_shard_49.bin",
1640
+ "format": "raw-shard",
1641
+ "nbytes": 46137344,
1642
+ "records": [
1643
+ {
1644
+ "name": "model.layers.19.mlp.gate_up_proj.weight",
1645
+ "shape": [
1646
+ 11264,
1647
+ 2048
1648
+ ],
1649
+ "dtype": "float16",
1650
+ "format": "f32-to-bf16",
1651
+ "nbytes": 46137344,
1652
+ "byteOffset": 0
1653
+ }
1654
+ ],
1655
+ "md5sum": "2551863673051555d98d5062241c4e45"
1656
+ },
1657
+ {
1658
+ "dataPath": "params_shard_50.bin",
1659
+ "format": "raw-shard",
1660
+ "nbytes": 31465472,
1661
+ "records": [
1662
+ {
1663
+ "name": "model.layers.19.self_attn.o_proj.weight",
1664
+ "shape": [
1665
+ 2048,
1666
+ 2048
1667
+ ],
1668
+ "dtype": "float16",
1669
+ "format": "f32-to-bf16",
1670
+ "nbytes": 8388608,
1671
+ "byteOffset": 0
1672
+ },
1673
+ {
1674
+ "name": "model.layers.19.mlp.down_proj.weight",
1675
+ "shape": [
1676
+ 2048,
1677
+ 5632
1678
+ ],
1679
+ "dtype": "float16",
1680
+ "format": "f32-to-bf16",
1681
+ "nbytes": 23068672,
1682
+ "byteOffset": 8388608
1683
+ },
1684
+ {
1685
+ "name": "model.layers.19.input_layernorm.weight",
1686
+ "shape": [
1687
+ 2048
1688
+ ],
1689
+ "dtype": "float16",
1690
+ "format": "f32-to-bf16",
1691
+ "nbytes": 4096,
1692
+ "byteOffset": 31457280
1693
+ },
1694
+ {
1695
+ "name": "model.layers.19.post_attention_layernorm.weight",
1696
+ "shape": [
1697
+ 2048
1698
+ ],
1699
+ "dtype": "float16",
1700
+ "format": "f32-to-bf16",
1701
+ "nbytes": 4096,
1702
+ "byteOffset": 31461376
1703
+ }
1704
+ ],
1705
+ "md5sum": "ef5a2322b4cf204aaef2611d1ae7d3a6"
1706
+ },
1707
+ {
1708
+ "dataPath": "params_shard_51.bin",
1709
+ "format": "raw-shard",
1710
+ "nbytes": 46137344,
1711
+ "records": [
1712
+ {
1713
+ "name": "model.layers.20.mlp.gate_up_proj.weight",
1714
+ "shape": [
1715
+ 11264,
1716
+ 2048
1717
+ ],
1718
+ "dtype": "float16",
1719
+ "format": "f32-to-bf16",
1720
+ "nbytes": 46137344,
1721
+ "byteOffset": 0
1722
+ }
1723
+ ],
1724
+ "md5sum": "ca686a2ddf2fb1e74a2968a0668966aa"
1725
+ },
1726
+ {
1727
+ "dataPath": "params_shard_52.bin",
1728
+ "format": "raw-shard",
1729
+ "nbytes": 23068672,
1730
+ "records": [
1731
+ {
1732
+ "name": "model.layers.20.mlp.down_proj.weight",
1733
+ "shape": [
1734
+ 2048,
1735
+ 5632
1736
+ ],
1737
+ "dtype": "float16",
1738
+ "format": "f32-to-bf16",
1739
+ "nbytes": 23068672,
1740
+ "byteOffset": 0
1741
+ }
1742
+ ],
1743
+ "md5sum": "66a1fe90dd5a3dcb8310478d5f9f52e3"
1744
+ },
1745
+ {
1746
+ "dataPath": "params_shard_53.bin",
1747
+ "format": "raw-shard",
1748
+ "nbytes": 29368320,
1749
+ "records": [
1750
+ {
1751
+ "name": "model.layers.20.self_attn.qkv_proj.weight",
1752
+ "shape": [
1753
+ 2560,
1754
+ 2048
1755
+ ],
1756
+ "dtype": "float16",
1757
+ "format": "f32-to-bf16",
1758
+ "nbytes": 10485760,
1759
+ "byteOffset": 0
1760
+ },
1761
+ {
1762
+ "name": "model.layers.20.self_attn.o_proj.weight",
1763
+ "shape": [
1764
+ 2048,
1765
+ 2048
1766
+ ],
1767
+ "dtype": "float16",
1768
+ "format": "f32-to-bf16",
1769
+ "nbytes": 8388608,
1770
+ "byteOffset": 10485760
1771
+ },
1772
+ {
1773
+ "name": "model.layers.20.input_layernorm.weight",
1774
+ "shape": [
1775
+ 2048
1776
+ ],
1777
+ "dtype": "float16",
1778
+ "format": "f32-to-bf16",
1779
+ "nbytes": 4096,
1780
+ "byteOffset": 18874368
1781
+ },
1782
+ {
1783
+ "name": "model.layers.20.post_attention_layernorm.weight",
1784
+ "shape": [
1785
+ 2048
1786
+ ],
1787
+ "dtype": "float16",
1788
+ "format": "f32-to-bf16",
1789
+ "nbytes": 4096,
1790
+ "byteOffset": 18878464
1791
+ },
1792
+ {
1793
+ "name": "model.layers.21.self_attn.qkv_proj.weight",
1794
+ "shape": [
1795
+ 2560,
1796
+ 2048
1797
+ ],
1798
+ "dtype": "float16",
1799
+ "format": "f32-to-bf16",
1800
+ "nbytes": 10485760,
1801
+ "byteOffset": 18882560
1802
+ }
1803
+ ],
1804
+ "md5sum": "04d186cafe36c4ff46a123b74adf64ed"
1805
+ },
1806
+ {
1807
+ "dataPath": "params_shard_54.bin",
1808
+ "format": "raw-shard",
1809
+ "nbytes": 46137344,
1810
+ "records": [
1811
+ {
1812
+ "name": "model.layers.21.mlp.gate_up_proj.weight",
1813
+ "shape": [
1814
+ 11264,
1815
+ 2048
1816
+ ],
1817
+ "dtype": "float16",
1818
+ "format": "f32-to-bf16",
1819
+ "nbytes": 46137344,
1820
+ "byteOffset": 0
1821
+ }
1822
+ ],
1823
+ "md5sum": "d70e0c24a80a281c60a590a4c51c4d20"
1824
+ },
1825
+ {
1826
+ "dataPath": "params_shard_55.bin",
1827
+ "format": "raw-shard",
1828
+ "nbytes": 131084288,
1829
+ "records": [
1830
+ {
1831
+ "name": "lm_head.weight",
1832
+ "shape": [
1833
+ 32003,
1834
+ 2048
1835
+ ],
1836
+ "dtype": "float16",
1837
+ "format": "f32-to-bf16",
1838
+ "nbytes": 131084288,
1839
+ "byteOffset": 0
1840
+ }
1841
+ ],
1842
+ "md5sum": "2627d16359c1ba4ca975476b395114de"
1843
+ },
1844
+ {
1845
+ "dataPath": "params_shard_56.bin",
1846
+ "format": "raw-shard",
1847
+ "nbytes": 31469568,
1848
+ "records": [
1849
+ {
1850
+ "name": "model.layers.21.self_attn.o_proj.weight",
1851
+ "shape": [
1852
+ 2048,
1853
+ 2048
1854
+ ],
1855
+ "dtype": "float16",
1856
+ "format": "f32-to-bf16",
1857
+ "nbytes": 8388608,
1858
+ "byteOffset": 0
1859
+ },
1860
+ {
1861
+ "name": "model.layers.21.mlp.down_proj.weight",
1862
+ "shape": [
1863
+ 2048,
1864
+ 5632
1865
+ ],
1866
+ "dtype": "float16",
1867
+ "format": "f32-to-bf16",
1868
+ "nbytes": 23068672,
1869
+ "byteOffset": 8388608
1870
+ },
1871
+ {
1872
+ "name": "model.layers.21.input_layernorm.weight",
1873
+ "shape": [
1874
+ 2048
1875
+ ],
1876
+ "dtype": "float16",
1877
+ "format": "f32-to-bf16",
1878
+ "nbytes": 4096,
1879
+ "byteOffset": 31457280
1880
+ },
1881
+ {
1882
+ "name": "model.layers.21.post_attention_layernorm.weight",
1883
+ "shape": [
1884
+ 2048
1885
+ ],
1886
+ "dtype": "float16",
1887
+ "format": "f32-to-bf16",
1888
+ "nbytes": 4096,
1889
+ "byteOffset": 31461376
1890
+ },
1891
+ {
1892
+ "name": "model.norm.weight",
1893
+ "shape": [
1894
+ 2048
1895
+ ],
1896
+ "dtype": "float16",
1897
+ "format": "f32-to-bf16",
1898
+ "nbytes": 4096,
1899
+ "byteOffset": 31465472
1900
+ }
1901
+ ],
1902
+ "md5sum": "ee8a5b28db683d67207b538efa571f54"
1903
+ }
1904
+ ]
1905
+ }
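Note on the manifest layout: in the entries shown above, each record's "nbytes" equals the product of its "shape" times 2 bytes (float16), the records sit back to back starting at "byteOffset" 0, and their sizes add up to the shard-level "nbytes". The sketch below checks those three properties for one entry. The helper names (check_shard_entry, md5_matches) and the DTYPE_BYTES table are hypothetical and not part of any MLC tooling. md5_matches further assumes that "md5sum" is the MD5 digest of the shard file's raw bytes, which the diff itself does not confirm.

```python
import hashlib
from math import prod
from pathlib import Path

# Bytes per element; only float16 appears in the records above (assumption of this sketch).
DTYPE_BYTES = {"float16": 2}


def check_shard_entry(entry):
    """Return a list of layout problems for one shard entry; an empty list means consistent."""
    problems = []
    offset = 0
    for rec in entry["records"]:
        expected = prod(rec["shape"]) * DTYPE_BYTES[rec["dtype"]]
        if rec["nbytes"] != expected:
            problems.append(f"{rec['name']}: nbytes {rec['nbytes']} != {expected}")
        if rec["byteOffset"] != offset:
            problems.append(f"{rec['name']}: byteOffset {rec['byteOffset']} != {offset}")
        offset += rec["nbytes"]
    if offset != entry["nbytes"]:
        problems.append(f"{entry['dataPath']}: records sum to {offset}, not {entry['nbytes']}")
    return problems


def md5_matches(entry, shard_dir):
    """Hypothetical helper: compare a downloaded shard's MD5 with the manifest value."""
    data = (Path(shard_dir) / entry["dataPath"]).read_bytes()
    return hashlib.md5(data).hexdigest() == entry["md5sum"]


def _rec(name, shape, nbytes, offset):
    # Small constructor so the example entry stays compact.
    return {"name": name, "shape": shape, "dtype": "float16",
            "format": "f32-to-bf16", "nbytes": nbytes, "byteOffset": offset}


# Entry for params_shard_56.bin, values copied from the manifest above.
SHARD_56 = {
    "dataPath": "params_shard_56.bin",
    "format": "raw-shard",
    "nbytes": 31469568,
    "records": [
        _rec("model.layers.21.self_attn.o_proj.weight", [2048, 2048], 8388608, 0),
        _rec("model.layers.21.mlp.down_proj.weight", [2048, 5632], 23068672, 8388608),
        _rec("model.layers.21.input_layernorm.weight", [2048], 4096, 31457280),
        _rec("model.layers.21.post_attention_layernorm.weight", [2048], 4096, 31461376),
        _rec("model.norm.weight", [2048], 4096, 31465472),
    ],
    "md5sum": "ee8a5b28db683d67207b538efa571f54",
}

if __name__ == "__main__":
    print(check_shard_entry(SHARD_56))
```

To check a whole download, one would load ndarray-cache.json with json.load, run check_shard_entry over each shard entry in it, and call md5_matches only for shards whose .bin files are present locally.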
params_shard_0.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7ceee9c7f08be82de164ef1cd96d12da626fdaca5f5ef6598c6838ceefa523da
3
+ size 131084288
params_shard_1.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f9ff64754cbd2716e9ba8db08630e8c03ce47cfc0e7f51ba5994d8bf2191f60f
3
+ size 46137344
params_shard_10.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a6eabe0c48ecbac20ec0f31e842ec1fd57ff6ec51cc19f8f6b3d8c0240cdc261
3
+ size 31465472
params_shard_11.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:30861a474f52e12fd5dc133bd43b2bd98d32da5a287482cc534b1787bd919d18
3
+ size 46137344
params_shard_12.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d9c6a4fb77ceb69a38ac46eff2fecbb8f92a7c409552b2ccfeb1cecb41d63347
3
+ size 23068672
params_shard_13.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:195d4e5e2091b7b1a92aec56fd3be9caf14e5cdda243f167550b77d1bea37fcf
3
+ size 29368320
params_shard_14.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ebd7e841388ebe677100f5a52c4f2953295dbfea39b1a5026109421ccefe4745
3
+ size 46137344
params_shard_15.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:305efac10b7db39714b8fd13af764b2331e2b0b0a8e10b354b8f790e1c6622d1
3
+ size 31465472
params_shard_16.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:81a221497eae980213688d61986db6575df76a43c361c60c9f09d2020062b083
3
+ size 46137344
params_shard_17.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7fdfa9bc6948f7c4723affa09fdfa2fd52bca128f5809abb2db329212c9fc9d4
3
+ size 23068672
params_shard_18.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cb2baf65ce82c57380358effe6a5e0ecebd64602ccb86326756dc82ac5b62325
3
+ size 29368320
params_shard_19.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:706205fa323b2549043412e164739533d69eef95ea3050f0a50b9b4ffbd8aae0
3
+ size 46137344
params_shard_2.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a677782162c003cca5d3b4bbc69d3911b3436328e9054cd4c9ca46cdc20fa157
3
+ size 23068672
params_shard_20.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6c23dfbd5afc199bf0646650f0b82c42441424b6450f6a0c408071c9e9196ed5
3
+ size 31465472
params_shard_21.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a61a93418c20ffb9fba2f225c615bdde7d46eef678bf37dad1dfed525961964a
3
+ size 46137344
params_shard_22.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:48e6b55318dbcf41135415d289601b5ab463a4d723b30c290f4e3b11d511f594
3
+ size 23068672
params_shard_23.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ba37f1d9cb400e24009b051e7bd1ff545ef89a353be9bdb55cf3722fa41ede41
3
+ size 29368320
params_shard_24.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e52c31269293b51c8402caa6064f5dfdc45e1029799d84aa3184ba040cd3e632
3
+ size 46137344
params_shard_25.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ecb48c13442559d185928f85f5773aeaa6860309db8be3b42f26bfca047c85a8
3
+ size 31465472
params_shard_26.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3d2f651bb1cb253144f9ad5d078454ad4d31c8c6a91bbca97aa8f2584b198bf7
3
+ size 46137344
params_shard_27.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ddbe890e13ea68e18494e101f1aba948b487ec50f22a49f22c9ec5ecb14865f2
3
+ size 23068672
params_shard_28.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bb5992e1be334eea0eed0ac5f9ff788270a85bd796c6ee43ca88b26fb24762eb
3
+ size 29368320
params_shard_29.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:86567db741bfc9b6405f75d85646d57ef14cdef2063d6cb76fefba7fb9db504f
3
+ size 46137344
params_shard_3.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dfd8278533983ba20c9b9a899f135f125fc4a9c8a3bfe7e6d79e6f9322c22b16
3
+ size 29368320
params_shard_30.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1bdfc2cd426d01163362a9a857f79947556dce7bafefc8fa4d981cd6ca518dc6
3
+ size 31465472
params_shard_31.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dc2e4a91312b34f26f899d476ac0a92b9f2a7cddaf71dacaa2e81c72ec4ce8d6
3
+ size 46137344
params_shard_32.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f6ebd1de2c227f2d4bb2b5da82259f99d7429cc5e1d027e0ad7218f073bd7560
3
+ size 23068672
params_shard_33.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bc67d875effcf11eeae42e830b1f7e13305eeef6322df8fbdd9b9792d76df30c
3
+ size 29368320
params_shard_34.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b490ee1a2915317b4ea47da8885475d3b655216ec2d164f81821630721b91946
3
+ size 46137344
params_shard_35.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:554108a0a029ee246e8f49863903b5a471851613b81fcdfdabf8f773ecde751b
3
+ size 31465472
params_shard_36.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:496fa88be3d7471bfafa50486f8be5c3438a6bc7c14ea3d17415e3fc9c8a7e84
3
+ size 46137344
params_shard_37.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:046ee56fea11d5a658063fdfefedc147f9aba46a09531d2b49c3a61b52c9eeb1
3
+ size 23068672
params_shard_38.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:058b8bb29653f02b53feadcfae0396fb79758894d1816d676e8560e1819f0f5c
3
+ size 29368320
params_shard_39.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:361aa60f6aa4c078c77e5acb4e2d5d505603dfa947c5a7b8d7ad1173812750b7
3
+ size 46137344
params_shard_4.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2731e2352041e4db4d466c38d56e7b18db91ce4f867abf877a0f845428723f1b
3
+ size 46137344
params_shard_40.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4ba53b1729fda14c9e3a920e2bd8202590058c1dc21979b32f4bdad447099a49
3
+ size 31465472
params_shard_41.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b2e4f42d9c74adaf0c099a6f233d4bac4efdb666d52d391caca7702fe6215960
3
+ size 46137344
params_shard_42.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:97e317e02344b47de817a95f71812819ea074e51230cb2d8a36c4bc89ecb15d5
3
+ size 23068672
params_shard_43.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d6326e675a985b33705f748a3e722e32ae8721a17e2c828b2157b199ef648378
3
+ size 29368320
params_shard_44.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:09b6aabdb0c1715c0bce63fa110c29c5295a43f5feef944d4728cc906937e308
3
+ size 46137344
params_shard_45.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b698303e81ebb234253d60e9e32f7fb021d6dafb334f099c97e35fc2542b6ce5
3
+ size 31465472
params_shard_46.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9fdbb5e4922c986ceb00a02a95c56399331cf4d098c2fe0bbf816ad10dd5edee
3
+ size 46137344
params_shard_47.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1c813970c363a1e2b5218e0555663b3b7deb0d2a4aab8c2e95013b434045517f
3
+ size 23068672
params_shard_48.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9086a6c5699424c8986c3da785e15f5bea12e91a9c913079c76d2a17afa75d85
3
+ size 29368320
params_shard_49.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:73f148a16a02a91f0a23d550a2da54a3ecdd2c5eca97fe9f03640de81428eeb4
3
+ size 46137344
params_shard_5.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3437251c0d49f8a3544bf7118acdcc692a34b847b8b774d9ef763d3801e9ce6f
3
+ size 31465472
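Note on the pointer files: each params_shard_N.bin entry in this commit is a three-line Git LFS pointer (version, oid, size) rather than the binary itself. A minimal sketch of reading one follows; parse_lfs_pointer is a hypothetical helper, and the sample text is the params_shard_37.bin pointer from this diff with the leading "+" diff markers removed. The pointer's oid is a SHA-256 digest while the manifest records an MD5, so without downloading the shard only the size can be compared against the manifest's "nbytes".

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer for params_shard_37.bin as it appears in this commit.
POINTER_37 = """\
version https://git-lfs.github.com/spec/v1
oid sha256:046ee56fea11d5a658063fdfefedc147f9aba46a09531d2b49c3a61b52c9eeb1
size 23068672
"""

if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER_37)
    # The manifest lists nbytes 23068672 for params_shard_37.bin.
    print(info["hash_algo"], info["size"] == 23068672)
```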