Additional proofreading

#3
by kiliangoto - opened
.gitattributes CHANGED
@@ -33,4 +33,3 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
- tokenizer.json filter=lfs diff=lfs merge=lfs -text
 
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
 
README.md CHANGED
@@ -8,14 +8,16 @@ language:
8
  - su
9
  license: gemma
10
  ---
11
- # Gemma2 9B CPT Sahabat-AI v1 Instruct
12
 
13
- **Sahabat-AI** (Indonesian language for “close friends”) is a collection of Large Language Models (LLMs) which has been pretrained and instruct-tuned for Indonesian language and its various dialects. Sahabat-AI ecosystem is co-initiated by Indonesian tech and telecommunication companies: GoTo Group and Indosat Ooredoo Hutchison.
14
 
15
- Gemma2 9B CPT Sahabat-AI v1 Instruct is an Indonesian-focused model which has been fine-tuned with around **448,000 Indonesian instruction-completion pairs** alongside an Indonesian-dialect pool consisting of **96,000 instruction-completion pairs in Javanese** and **98,000 instruction-completion pairs in Sundanese**. Additionally, we added a pool of **129,000 instruction-completion pairs in English**.
 
 
16
 
17
- - **Co-initiated by:** PT GoTo Gojek Tokopedia Tbk, Indosat Ooredoo Hutchison
18
  - **Developed by:** PT GoTo Gojek Tokopedia Tbk, AI Singapore
 
19
  - **Model type:** Decoder
20
  - **Languages:** English, Indonesian, Javanese, Sundanese
21
  - **License:** [Gemma Community License](https://ai.google.dev/gemma/terms)
@@ -23,12 +25,12 @@ Gemma2 9B CPT Sahabat-AI v1 Instruct is an Indonesian-focused model which has be
23
  ## Model Details
24
 
25
  ### Model Description
26
- We performed instruction tuning in Indonesian, Javanese, Sundanese as well as English on our [continued pre-trained Gemma2 9B CPT Sahabat-AI v1](https://huggingface.co/GoToCompany/gemma2-9b-cpt-sahabatai-v1-base), a decoder model using the Gemma2 architecture, to create Gemma2 9B CPT Sahabat-AI v1 Instruct.
27
 
28
  For tokenisation, the model employs the default tokenizer used in Gemma-2-9B. The model has a context length of 8192.
29
 
30
  ### Benchmark Performance
31
- We evaluated Gemma2 9B CPT Sahabat-AI V1 Instruct on both general language capabilities and instruction-following capabilities.
32
 
33
  #### General Language Capabilities
34
  For the evaluation of general language capabilities, we employed the
@@ -37,149 +39,22 @@ For the evaluation of general language capabilities, we employed the
37
  - We also added support for Javanese and Sundanese for the BHASA tasks whenever applicable
38
  - [IndoMMLU](https://arxiv.org/pdf/2310.04928)
39
  - These tasks include examination questions on Humanities, Indonesian language, Local languages and cultures, Social science and STEM across primary, middle, and high school levels.
40
- - and the common English tasks from the [HuggingFace LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard).
41
- - These tasks consist of [IFEval, BBH, Math Lvl 5, GPQA, MuSR, and MMLU-PRO.](https://huggingface.co/docs/leaderboards/open_llm_leaderboard/about)
42
- - **Caveat**: Our results differ from the HuggingFace LLM Leaderboard because we have used [VLLM](https://docs.vllm.ai/en/latest/) as our inference platform. VLLM caps the context size at **4096 tokens** while HuggingFace was set to **8192 tokens**.
43
 
44
  Note: SEA HELM is implemented using prompts to elicit answers in a strict format. For all tasks, the model is expected to provide an answer tag from which the answer is automatically extracted. For tasks where options are provided, the answer should comprise one of the pre-defined options. The scores for each task is normalised to account for baseline performance due to random chance.
45
 
46
  The evaluation was done **zero-shot** with native prompts on a sample of 100-1000 instances for each dataset.
47
 
48
 
49
- #### Instruction-following Capabilities
50
- Since Gemma2 9B CPT Sahabat-AI v1 Instruct is an instruction-following model, we also evaluated it on instruction-following capabilities with the [IFEval](https://arxiv.org/abs/2311.07911) dataset.
51
-
52
- As this dataset was in English, the linguists and native speakers in the team worked together to filter, localize and translate the dataset into the respective target languages to ensure that the examples remained reasonable, meaningful and natural.
53
-
54
- **IFEval**
55
-
56
- IFEval evaluates a model's ability to adhere to constraints provided in the prompt, for example beginning a response with a specific word/phrase or answering with a certain number of sections. Additionally, accuracy is normalized by the proportion of responses in the correct language (if the model performs the task correctly but responds in the wrong language, it is judged to have failed the task).
57
-
58
- *Note*: IFEval was only used on Bahasa Indonesia. We are currently working on adding it for Javanese and Sundanese for our upcoming releases.
59
-
60
- #### Results
61
-
62
- #### Indonesian Results
63
- #### SEA HELM (also known as BHASA)
64
- <table style="border-collapse: collapse; width: 100%; font-size: 10px">
65
- <tr>
66
- <th style="border: 2px solid black; padding: 8px; font-weight: bold;">Language / Model Name [Instruct]</th>
67
- <th style="border: 1px solid gray; padding: 8px;">Qwen2-7B</th>
68
- <th style="border: 1px solid gray; padding: 8px;">Qwen2.5-7B</th>
69
- <th style="border: 1px solid gray; padding: 8px;">Llama-3-8B</th>
70
- <th style="border: 1px solid gray; padding: 8px;">Llama-3.1-8B</th>
71
- <th style="border: 1px solid gray; padding: 8px;">sea-lionv2.1-8B</th>
72
- <th style="border: 1px solid gray; padding: 8px;">gemma-2-9B</th>
73
- <th style="border: 1px solid gray; padding: 8px;">sahabatai-v1-8B</th>
74
- <th style="border: 2px solid black; padding: 8px;">sahabatai-v1-9B</th>
75
- </tr>
76
- <tr>
77
- <td style="border: 2px solid black; padding: 8px; font-weight: bold;">Overall (Bahasa Indonesia + Javanese + Sundanese)</td>
78
- <td style="border: 1px solid gray; padding: 8px;">36.963</td>
79
- <td style="border: 1px solid gray; padding: 8px;">42.988</td>
80
- <td style="border: 1px solid gray; padding: 8px;">37.805</td>
81
- <td style="border: 1px solid gray; padding: 8px;">45.866</td>
82
- <td style="border: 1px solid gray; padding: 8px;">46.880</td>
83
- <td style="border: 1px solid gray; padding: 8px;">56.359</td>
84
- <td style="border: 1px solid gray; padding: 8px;">53.725</td>
85
- <td style="border: 2px solid black; padding: 8px; background-color: lightgreen;">61.169</td>
86
- </tr>
87
- <tr>
88
- <td style="border: 2px solid black; padding: 8px; font-weight: bold;">Bahasa Indonesia</td>
89
- <td style="border: 1px solid gray; padding: 8px;">46.760</td>
90
- <td style="border: 1px solid gray; padding: 8px;">60.372</td>
91
- <td style="border: 1px solid gray; padding: 8px;">42.022</td>
92
- <td style="border: 1px solid gray; padding: 8px;">51.944</td>
93
- <td style="border: 1px solid gray; padding: 8px;">54.579</td>
94
- <td style="border: 1px solid gray; padding: 8px;">63.394</td>
95
- <td style="border: 1px solid gray; padding: 8px;">57.221</td>
96
- <td style="border: 2px solid black; padding: 8px; background-color: lightgreen;">64.154</td>
97
- </tr>
98
- <tr>
99
- <td style="border: 2px solid black; padding: 8px; font-weight: bold;">Javanese</td>
100
- <td style="border: 1px solid gray; padding: 8px;">33.956</td>
101
- <td style="border: 1px solid gray; padding: 8px;">40.625</td>
102
- <td style="border: 1px solid gray; padding: 8px;">41.739</td>
103
- <td style="border: 1px solid gray; padding: 8px;">47.587</td>
104
- <td style="border: 1px solid gray; padding: 8px;">48.012</td>
105
- <td style="border: 1px solid gray; padding: 8px;">56.468</td>
106
- <td style="border: 1px solid gray; padding: 8px;">56.460</td>
107
- <td style="border: 2px solid black; padding: 8px; background-color: lightgreen;">64.439</td>
108
- </tr>
109
- <tr>
110
- <td style="border: 2px solid black; padding: 8px; font-weight: bold;">Sundanese</td>
111
- <td style="border: 1px solid gray; padding: 8px;">30.173</td>
112
- <td style="border: 1px solid gray; padding: 8px;">27.969</td>
113
- <td style="border: 1px solid gray; padding: 8px;">29.654</td>
114
- <td style="border: 1px solid gray; padding: 8px;">38.068</td>
115
- <td style="border: 1px solid gray; padding: 8px;">38.050</td>
116
- <td style="border: 1px solid gray; padding: 8px;">49.216</td>
117
- <td style="border: 1px solid gray; padding: 8px;">47.495</td>
118
- <td style="border: 2px solid black; padding: 8px; background-color: lightgreen;">54.913</td>
119
- </tr>
120
- </table>
121
-
122
- #### IndoMMLU
123
- <table style="border-collapse: collapse; width: 100%; font-size: 10px">
124
- <tr>
125
- <th style="border: 2px solid black; padding: 8px; font-weight: bold;">Model Name [Instruct]</th>
126
- <th style="border: 1px solid gray; padding: 8px;">Qwen2-7B</th>
127
- <th style="border: 1px solid gray; padding: 8px;">Qwen2.5-7B</th>
128
- <th style="border: 1px solid gray; padding: 8px;">Meta-Llama-3-8B</th>
129
- <th style="border: 1px solid gray; padding: 8px;">Llama-3.1-8B</th>
130
- <th style="border: 1px solid gray; padding: 8px;">sea-lionv2.1-8B</th>
131
- <th style="border: 1px solid gray; padding: 8px;">gemma-2-9B</th>
132
- <th style="border: 1px solid gray; padding: 8px;">sahabatai-v1-8B</th>
133
- <th style="border: 2px solid black; padding: 8px;">sahabatai-v1-9B</th>
134
- </tr>
135
- <tr>
136
- <td style="border: 2px solid black; padding: 8px; font-weight: bold;">Overall Results</td>
137
- <td style="border: 1px solid gray; padding: 8px;">53.0%</td>
138
- <td style="border: 1px solid gray; padding: 8px;">56.0%</td>
139
- <td style="border: 1px solid gray; padding: 8px;">51.9%</td>
140
- <td style="border: 1px solid gray; padding: 8px;">53.8%</td>
141
- <td style="border: 1px solid gray; padding: 8px;">54.4%</td>
142
- <td style="border: 1px solid gray; padding: 8px;">61.4%</td>
143
- <td style="border: 1px solid gray; padding: 8px;">55.6%</td>
144
- <td style="border: 2px solid black; padding: 8px; background-color: lightgreen;">62.6%</td>
145
- </tr>
146
- </table>
147
-
148
-
149
- #### English Results
150
- <table style="border-collapse: collapse; width: 100%; font-size: 10px">
151
- <tr>
152
- <th style="border: 2px solid black; padding: 8px;">Model Name [Instruct]</th>
153
- <th style="border: 1px solid gray; padding: 8px;">Qwen2-7B</th>
154
- <th style="border: 1px solid gray; padding: 8px;">Qwen2.5-7B</th>
155
- <th style="border: 1px solid gray; padding: 8px;">Llama-3-8B</th>
156
- <th style="border: 1px solid gray; padding: 8px;">Llama-3.1-8B</th>
157
- <th style="border: 1px solid gray; padding: 8px;">sea-lionv2.1-8B</th>
158
- <th style="border: 1px solid gray; padding: 8px;">gemma-2-9B</th>
159
- <th style="border: 1px solid gray; padding: 8px;">sahabatai-v1-8B</th>
160
- <th style="border: 2px solid black; padding: 8px;">sahabatai-v1-9B</th>
161
- </tr>
162
- <tr>
163
- <td style="border: 2px solid black; padding: 8px; font-weight: bold;">Average</td>
164
- <td style="border: 1px solid gray; padding: 8px;">24.48</td>
165
- <td style="border: 1px solid gray; padding: 8px;">27.75</td>
166
- <td style="border: 1px solid gray; padding: 8px;">23.91</td>
167
- <td style="border: 1px solid gray; padding: 8px;">27.98</td>
168
- <td style="border: 1px solid gray; padding: 8px;">24.52</td>
169
- <td style="border: 1px solid gray; padding: 8px;">26.44</td>
170
- <td style="border: 1px solid gray; padding: 8px;">24.43</td>
171
- <td style="border: 1px solid black; padding: 8px; background-color: lightgreen;">33.67</td>
172
- </tr>
173
- </table>
174
-
175
- Gemma2 9B CPT Sahabat-AI v1 Instruct can be run using the 🤗 Transformers library
176
  ```python
177
- # Please use transformers==4.45.0
178
 
179
- import torch
180
  import transformers
 
181
 
182
- model_id = "GoToCompany/gemma2-9b-cpt-sahabatai-v1-instruct"
183
 
184
  pipeline = transformers.pipeline(
185
  "text-generation",
@@ -187,34 +62,13 @@ pipeline = transformers.pipeline(
187
  model_kwargs={"torch_dtype": torch.bfloat16},
188
  device_map="auto",
189
  )
190
-
191
- terminators = [
192
- pipeline.tokenizer.eos_token_id,
193
- pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
194
- ]
195
-
196
- # Javanese
197
- messages = [
198
- {"role": "user", "content": "Sopo wae sing ana ing Punakawan?"}
199
- ]
200
-
201
- outputs = pipeline(
202
- messages,
203
- max_new_tokens=256,
204
- eos_token_id=terminators,
205
- )
206
- print(outputs[0]["generated_text"][-1])
207
-
208
-
209
- # Sundanese
210
  messages = [
211
- {"role": "user", "content": "Kumaha caritana si Kabayan?"},
212
  ]
213
 
214
  outputs = pipeline(
215
  messages,
216
  max_new_tokens=256,
217
- eos_token_id=terminators,
218
  )
219
  print(outputs[0]["generated_text"][-1])
220
  ```
@@ -225,39 +79,23 @@ It is important for users to be aware that our model exhibits certain limitation
225
  ## Limitations
226
  ### Safety
227
 
228
- Current Sahabat-AI models, including this commercially permissive release, have not been aligned for safety. Developers and users should perform their own safety fine-tuning and related security measures. In no event shall the authors be held liable for any claim, damages, or other liability arising from the use of the released weights and codes.
229
 
230
  ## Technical Specifications
231
  ### Fine-Tuning Details
232
- Gemma2 9B CPT Sahabat-AI v1 Instruct was built using a combination of a full parameter fine-tune, on-policy alignment, and model merges of the best performing checkpoints. The training process for fine-tuning was approximately 4 hours, with alignment taking 2 hours, both on 8x H100-80GB GPUs.
233
 
234
  ## Data
235
- Gemma2 9B CPT Sahabat-AI v1 Instruct was trained on a wide range of synthetic instructions, alongside publicly available instructions hand-curated by the team with the assistance of native speakers. In addition, special care was taken to ensure that the datasets used had commercially permissive licenses through verification with the original data source.
236
-
237
- ## Call for Collaboration
238
-
239
- Sahabat-AI (Indonesian language for “close friends”) a **local open source Large Language Model (LLM) ecosystem in Indonesian language**, co-initiated by Indonesian tech and telecommunication companies: GoTo Group and Indosat Ooredoo Hutchison.
240
- Sahabat-AI ecosystem aims to empower Indonesians who want to develop AI-based services and applications using Bahasa Indonesia and its various local dialects.
241
 
242
- We are supported by research centers and global tech experts such as AI Singapore and Tech Mahendra to train the model to gain general language understanding.
 
243
 
244
- We also collaborate with key top Indonesia universities such as University of Indonesia, Gadjah Mada University, Bogor Institute of Agriculture, Bandung Institute of Technology, including top Indonesia media groups, such as Kompas Gramedia Group and Republika to train and enrich the model in Bahasa Indonesia, ensuring optimum provision of local context and cultural relevance.
245
-
246
- We would like to invite **researchers, developers, and language enthusiasts** to actively contribute to the enhancement and expansion of Sahabat-AI.
247
- Your collaborations can involve:
248
- - Identifying and reporting technical issues
249
- - Sharing pre-training, instruction, and preference data
250
- - Improving documentation usability
251
- - Proposing and implementing new model evaluation tasks and metrics
252
-
253
- Join us in shaping the future of Sahabat-AI by sharing your expertise and insights to make these models more accessible, accurate, and versatile.
254
-
255
- You can contribute your ideas through [this form.](https://docs.google.com/forms/d/1_us969eQtEooYOn4XkvGkdP5VHOyCbO6L_sd9kTMnaA/edit)
256
-
257
- ## The Development Team (in ascending alphabetical order)
258
 
259
  ### AI Singapore
260
  Chan Adwin<br>
 
261
  Cheng Nicholas<br>
262
  Choa Esther<br>
263
  Huang Yuli<br>
@@ -288,7 +126,6 @@ Yong Xianbin<br>
288
 
289
  ### PT GoTo Gojek Tokopedia Tbk
290
  Anissa Dininta<br>
291
- Chau Shiau Ching<br>
292
  Choiri Hendra Hadhil<br>
293
  Goel Priyank<br>
294
  Saini Ajay Kumar<br>
@@ -296,18 +133,19 @@ Shalev Ofir<br>
296
  Tan Daryl<br>
297
  Tep Kilian Rithi<br>
298
  Tiwari Anupam<br>
299
- Widjojo Daniel<br>
300
 
 
301
  ## Acknowledgements
302
 
303
  [AI Singapore](https://aisingapore.org/) is a national programme supported by the National Research Foundation, Singapore and hosted by the National University of Singapore.
304
 
305
- Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of the National Research Foundation or the National University of Singapore.
306
 
307
 
308
  ## Contact
309
 
310
- For more info, please contact us using this [Sahabat-AI Inquiry Form.](https://docs.google.com/forms/d/1_us969eQtEooYOn4XkvGkdP5VHOyCbO6L_sd9kTMnaA/edit)
311
 
312
  ## Disclaimer
313
 
 
8
  - su
9
  license: gemma
10
  ---
11
+ # Gemma2 9B CPT Sahabat AI v1 Instruct
12
 
13
+ Sahabat AI is a collection of Large Language Models (LLMs) which has been pretrained and instruct-tuned for Indonesian languages.
14
 
15
+ Gemma2 9B CPT Sahabat AI v1 Instruct is an Indonesian-focused model which has been fine-tuned with around **448,000 Indonesian instruction-completion pairs** alongside an Indonesian-dialect pool consisting of **96,000 instruction-completion pairs in Javanese** and **98,000 instruction-completion pairs in Sundanese**. Additionally, we included **129,000 instruction-completion pairs in English**.
16
+
17
+ Sahabat is Indonesian for "Close Friends."
18
 
 
19
  - **Developed by:** PT GoTo Gojek Tokopedia Tbk, AI Singapore
20
+ - **Funded by:** PT GoTo Gojek Tokopedia Tbk, AI Singapore
21
  - **Model type:** Decoder
22
  - **Languages:** English, Indonesian, Javanese, Sundanese
23
  - **License:** [Gemma Community License](https://ai.google.dev/gemma/terms)
 
25
  ## Model Details
26
 
27
  ### Model Description
28
+ We performed instruction tuning in Indonesian, Javanese, Sundanese as well as English on our [continued pre-trained Gemma2 9B CPT Sahabat AI v1](https://huggingface.co/GoToCompany/gemma2-9b-cpt-sahabatai-v1-base), a decoder model using the Gemma2 architecture, to create Gemma2 9B CPT Sahabat AI v1 Instruct.
29
 
30
  For tokenisation, the model employs the default tokenizer used in Gemma-2-9B. The model has a context length of 8192.
31
 
32
  ### Benchmark Performance
33
+ We evaluated Gemma2 9B CPT Sahabat AI v1 Instruct on general language capabilities.
34
 
35
  #### General Language Capabilities
36
  For the evaluation of general language capabilities, we employed the
 
39
  - We also added support for Javanese and Sundanese for the BHASA tasks whenever applicable
40
  - [IndoMMLU](https://arxiv.org/pdf/2310.04928)
41
  - These tasks include examination questions on Humanities, Indonesian language, Local languages and cultures, Social science and STEM across primary, middle, and high school levels.
42
+ - and the well known [English MMLU](https://arxiv.org/pdf/2009.03300)
 
 
43
 
44
  Note: SEA HELM is implemented using prompts to elicit answers in a strict format. For all tasks, the model is expected to provide an answer tag from which the answer is automatically extracted. For tasks where options are provided, the answer should comprise one of the pre-defined options. The scores for each task is normalised to account for baseline performance due to random chance.
45
 
46
  The evaluation was done **zero-shot** with native prompts on a sample of 100-1000 instances for each dataset.
47
 
48
 
49
+ ### Usage
50
+ Gemma2 9B CPT Sahabat AI v1 Instruct can be run using the 🤗 Transformers library
51
  ```python
52
+ # Please use transformers==4.45.2
53
 
 
54
  import transformers
55
+ import torch
56
 
57
+ model_id = # PLACEHOLDER
58
 
59
  pipeline = transformers.pipeline(
60
  "text-generation",
 
62
  model_kwargs={"torch_dtype": torch.bfloat16},
63
  device_map="auto",
64
  )
65
  messages = [
66
+ {"role": "user", "content": "Apa sentimen dari kalimat berikut ini?\nKalimat: Buku ini sangat membosankan.\nJawaban: "},
67
  ]
68
 
69
  outputs = pipeline(
70
  messages,
71
  max_new_tokens=256,
 
72
  )
73
  print(outputs[0]["generated_text"][-1])
74
  ```
 
79
  ## Limitations
80
  ### Safety
81
 
82
+ Current Sahabat AI models, including this commercially permissive release, have not been aligned for safety. Developers and users should perform their own safety fine-tuning and related security measures. In no event shall the authors be held liable for any claim, damages, or other liability arising from the use of the released weights and codes.
83
 
84
  ## Technical Specifications
85
  ### Fine-Tuning Details
86
+ Gemma2 9B CPT Sahabat AI v1 Instruct was built using a combination of a full parameter fine-tune, on-policy alignment, and model merges of the best performing checkpoints. The training process for fine-tuning was approximately 4 hours, with alignment taking 2 hours, both on 8x H100-80GB GPUs.
87
 
88
  ## Data
89
+ Gemma2 9B CPT Sahabat AI v1 Instruct was trained on a wide range of synthetic instructions, alongside publicly available instructions hand-curated by the team with the assistance of native speakers. In addition, special care was taken to ensure that the datasets used had commercially permissive licenses through verification with the original data source.
90
 
91
+ ## Call for Contributions
92
+ We encourage researchers, developers, and language enthusiasts to actively contribute to the enhancement and expansion of Sahabat. Contributions can involve identifying and reporting bugs, sharing pre-training, instruction, and preference data, improving documentation usability, proposing and implementing new model evaluation tasks and metrics, or training versions of the model in additional Indonesian languages. Join us in shaping the future of Sahabat by sharing your expertise and insights to make these models more accessible, accurate, and versatile.
93
 
94
+ ## The Team (by ascending alphabetical order)
95
 
96
  ### AI Singapore
97
  Chan Adwin<br>
98
+ Chau Shiau Ching<br>
99
  Cheng Nicholas<br>
100
  Choa Esther<br>
101
  Huang Yuli<br>
 
126
 
127
  ### PT GoTo Gojek Tokopedia Tbk
128
  Anissa Dininta<br>
 
129
  Choiri Hendra Hadhil<br>
130
  Goel Priyank<br>
131
  Saini Ajay Kumar<br>
 
133
  Tan Daryl<br>
134
  Tep Kilian Rithi<br>
135
  Tiwari Anupam<br>
136
+ Widjojo Daniel<br>
137
 
138
+ <!--
139
  ## Acknowledgements
140
 
141
  [AI Singapore](https://aisingapore.org/) is a national programme supported by the National Research Foundation, Singapore and hosted by the National University of Singapore.
142
 
143
+ Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of the National Research Foundation or the National University of Singapore. -->
144
 
145
 
146
  ## Contact
147
 
148
+ For more info, please contact us using this [Sahabat Inquiry Form.](https://docs.google.com/forms/d/1_us969eQtEooYOn4XkvGkdP5VHOyCbO6L_sd9kTMnaA/edit)
149
 
150
  ## Disclaimer
151
 
config.json DELETED
@@ -1,34 +0,0 @@
1
- {
2
- "_name_or_path": "/shared/gojek_sft_example/models/sahabat9ftbigmerge_gemma_merge_it_della_linear",
3
- "architectures": [
4
- "Gemma2ForCausalLM"
5
- ],
6
- "attention_bias": false,
7
- "attention_dropout": 0.0,
8
- "attn_logit_softcapping": 50.0,
9
- "bos_token_id": 2,
10
- "cache_implementation": "hybrid",
11
- "eos_token_id": 1,
12
- "final_logit_softcapping": 30.0,
13
- "head_dim": 256,
14
- "hidden_act": "gelu_pytorch_tanh",
15
- "hidden_activation": "gelu_pytorch_tanh",
16
- "hidden_size": 3584,
17
- "initializer_range": 0.02,
18
- "intermediate_size": 14336,
19
- "max_position_embeddings": 8192,
20
- "model_type": "gemma2",
21
- "num_attention_heads": 16,
22
- "num_hidden_layers": 42,
23
- "num_key_value_heads": 8,
24
- "pad_token_id": 0,
25
- "query_pre_attn_scalar": 256,
26
- "rms_norm_eps": 1e-06,
27
- "rope_theta": 10000.0,
28
- "sliding_window": 4096,
29
- "sliding_window_size": 4096,
30
- "torch_dtype": "bfloat16",
31
- "transformers_version": "4.45.2",
32
- "use_cache": true,
33
- "vocab_size": 256000
34
- }
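
Reviewer note: the deleted `config.json` is enough to estimate how large the checkpoint removed by this PR is. The sketch below is an illustration only and is not part of the PR. It counts parameters from the config values above, assuming the usual Gemma 2 layout: bias-free attention and MLP projections, four norm weights per layer (the four `*_layernorm` entries listed in `model.safetensors.index.json`), one final norm weight, and an output head tied to `embed_tokens`. The final norm and the tied head are assumptions, since neither is visible in this diff. The function name `gemma2_param_count` is hypothetical.

```python
# Values copied from the deleted config.json shown in this diff.
config = {
    "hidden_size": 3584,
    "intermediate_size": 14336,
    "num_hidden_layers": 42,
    "num_attention_heads": 16,
    "num_key_value_heads": 8,
    "head_dim": 256,
    "vocab_size": 256000,
}


def gemma2_param_count(cfg: dict) -> int:
    """Estimate the parameter count under the assumptions stated above."""
    hidden = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]

    # q_proj and o_proj map between hidden and q_dim; k_proj and v_proj use kv_dim.
    attention = 2 * hidden * q_dim + 2 * hidden * kv_dim
    # gate_proj, up_proj and down_proj each hold hidden * intermediate weights.
    mlp = 3 * hidden * cfg["intermediate_size"]
    # input, post_attention, pre_feedforward and post_feedforward norms.
    norms = 4 * hidden

    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * hidden  # assumed shared with the output head
    final_norm = hidden                      # assumed model.norm.weight

    return cfg["num_hidden_layers"] * per_layer + embeddings + final_norm


params = gemma2_param_count(config)
bytes_bf16 = params * 2  # torch_dtype is bfloat16, i.e. 2 bytes per weight

print(f"parameters: {params:,}")
print(f"bfloat16 bytes: {bytes_bf16:,}")
```

Under these assumptions the count is 9,241,705,984 parameters, or 18,483,411,968 bytes in bfloat16, which equals the `total_size` recorded in the deleted `model.safetensors.index.json`.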
 
generation_config.json DELETED
@@ -1,9 +0,0 @@
1
- {
2
- "_from_model_config": true,
3
- "bos_token_id": 2,
4
- "cache_implementation": "hybrid",
5
- "eos_token_id": 1,
6
- "pad_token_id": 0,
7
- "transformers_version": "4.45.2",
8
- "use_cache": false
9
- }
 
model-00001-of-00004.safetensors DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:f049e12b64490cd381d6266c22fd518947aec2d0b07288968c7e2448d86639af
3
- size 4903351912
 
 
 
 
model-00002-of-00004.safetensors DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:bb64a550c196dd94523d7bdb821e920a76f1821e3668fa351e5d782034e6022f
3
- size 4947570872
 
 
 
 
model-00003-of-00004.safetensors DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:7cd7b4b84de41f02d4cdfad19c4e6dfb0c0b3c3a30d4838d362f32c5e8bcae35
3
- size 4962221464
 
 
 
 
model-00004-of-00004.safetensors DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:1fb28a1ee59339014752cf330932a474d794d2201ec0a14b19b5046b8c3c997a
3
- size 3670322200
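
Reviewer note: the four deleted Git LFS pointers record the on-disk size of each shard, so the total amount of weight data this PR removes can be checked against the index metadata. The sketch below is illustrative and not part of the PR. It parses the three-line pointer format shown above and compares the summed shard sizes with the `total_size` value from the deleted `model.safetensors.index.json`. The helper name `parse_lfs_pointer` is hypothetical.

```python
# Pointer contents copied from the four deleted shard entries in this diff.
POINTERS = [
    """version https://git-lfs.github.com/spec/v1
oid sha256:f049e12b64490cd381d6266c22fd518947aec2d0b07288968c7e2448d86639af
size 4903351912""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:bb64a550c196dd94523d7bdb821e920a76f1821e3668fa351e5d782034e6022f
size 4947570872""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:7cd7b4b84de41f02d4cdfad19c4e6dfb0c0b3c3a30d4838d362f32c5e8bcae35
size 4962221464""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:1fb28a1ee59339014752cf330932a474d794d2201ec0a14b19b5046b8c3c997a
size 3670322200""",
]

# total_size from the metadata block of the deleted index file.
INDEX_TOTAL_SIZE = 18483411968


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer file into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


shard_sizes = [int(parse_lfs_pointer(p)["size"]) for p in POINTERS]
total_on_disk = sum(shard_sizes)
overhead = total_on_disk - INDEX_TOTAL_SIZE

print(f"shards on disk: {total_on_disk:,} bytes")
print(f"tensor data per index: {INDEX_TOTAL_SIZE:,} bytes")
print(f"difference: {overhead:,} bytes")
```

The shards sum to 18,483,466,448 bytes, which is 54,480 bytes more than the index's `total_size`. That gap is plausibly the per-file safetensors header, though this is not verified here.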
 
 
 
 
model.safetensors.index.json DELETED
@@ -1,471 +0,0 @@
1
- {
2
- "metadata": {
3
- "total_size": 18483411968
4
- },
5
- "weight_map": {
6
- "model.embed_tokens.weight": "model-00001-of-00004.safetensors",
7
- "model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
8
- "model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
9
- "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
10
- "model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
11
- "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
12
- "model.layers.0.post_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
13
- "model.layers.0.pre_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
14
- "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
15
- "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
16
- "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
17
- "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
18
- "model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
19
- "model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
20
- "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
21
- "model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
22
- "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
23
- "model.layers.1.post_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
24
- "model.layers.1.pre_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
25
- "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
26
- "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
27
- "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
28
- "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
29
- "model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
30
- "model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
31
- "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
32
- "model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
33
- "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
34
- "model.layers.10.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
35
- "model.layers.10.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
36
- "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
37
- "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
38
- "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
39
- "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
40
- "model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
41
- "model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
42
- "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
43
- "model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
44
- "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
45
- "model.layers.11.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
46
- "model.layers.11.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
47
- "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
48
- "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
49
- "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
50
- "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
51
- "model.layers.12.input_layernorm.weight": "model-00002-of-00004.safetensors",
52
- "model.layers.12.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
53
- "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
54
- "model.layers.12.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
55
- "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
56
- "model.layers.12.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
57
- "model.layers.12.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
58
- "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
59
- "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
60
- "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
61
- "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
62
- "model.layers.13.input_layernorm.weight": "model-00002-of-00004.safetensors",
63
- "model.layers.13.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
64
- "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
65
- "model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
66
- "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
67
- "model.layers.13.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
68
- "model.layers.13.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
69
- "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
70
- "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
71
- "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
72
- "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
73
- "model.layers.14.input_layernorm.weight": "model-00002-of-00004.safetensors",
74
- "model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
75
- "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
76
- "model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
77
- "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
78
- "model.layers.14.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
79
- "model.layers.14.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
80
- "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
81
- "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
82
- "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
83
- "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
84
- "model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
85
- "model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
86
- "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
87
- "model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
88
- "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
89
- "model.layers.15.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
90
- "model.layers.15.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
91
- "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
92
- "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
93
- "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
94
- "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
95
- "model.layers.16.input_layernorm.weight": "model-00002-of-00004.safetensors",
96
- "model.layers.16.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
97
- "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
98
- "model.layers.16.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
99
- "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
100
- "model.layers.16.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
101
- "model.layers.16.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
102
- "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
103
- "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
104
- "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
105
- "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
106
- "model.layers.17.input_layernorm.weight": "model-00002-of-00004.safetensors",
107
- "model.layers.17.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
108
- "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
109
- "model.layers.17.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
110
- "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
111
- "model.layers.17.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
112
- "model.layers.17.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
113
- "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
114
- "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
115
- "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
116
- "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
117
- "model.layers.18.input_layernorm.weight": "model-00002-of-00004.safetensors",
118
- "model.layers.18.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
119
- "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
120
- "model.layers.18.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
121
- "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
122
- "model.layers.18.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
123
- "model.layers.18.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
124
- "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
125
- "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
126
- "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
127
- "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
128
- "model.layers.19.input_layernorm.weight": "model-00002-of-00004.safetensors",
129
- "model.layers.19.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
130
- "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
131
- "model.layers.19.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
132
- "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
133
- "model.layers.19.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
134
- "model.layers.19.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
135
- "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
136
- "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
137
- "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
138
- "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
139
- "model.layers.2.input_layernorm.weight": "model-00001-of-00004.safetensors",
140
- "model.layers.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
141
- "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
142
- "model.layers.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
143
- "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
144
- "model.layers.2.post_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
145
- "model.layers.2.pre_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
146
- "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
147
- "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
148
- "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
149
- "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
150
- "model.layers.20.input_layernorm.weight": "model-00003-of-00004.safetensors",
151
- "model.layers.20.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
152
- "model.layers.20.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
153
- "model.layers.20.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
154
- "model.layers.20.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
155
- "model.layers.20.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
156
- "model.layers.20.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
157
- "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
158
- "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
159
- "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
160
- "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
161
- "model.layers.21.input_layernorm.weight": "model-00003-of-00004.safetensors",
162
- "model.layers.21.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
163
- "model.layers.21.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
164
- "model.layers.21.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
165
- "model.layers.21.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
166
- "model.layers.21.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
167
- "model.layers.21.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
168
- "model.layers.21.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
169
- "model.layers.21.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
170
- "model.layers.21.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
171
- "model.layers.21.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
172
- "model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
173
- "model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
174
- "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
175
- "model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
176
- "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
177
- "model.layers.22.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
178
- "model.layers.22.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
179
- "model.layers.22.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
180
- "model.layers.22.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
181
- "model.layers.22.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
182
- "model.layers.22.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
183
- "model.layers.23.input_layernorm.weight": "model-00003-of-00004.safetensors",
184
- "model.layers.23.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
185
- "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
186
- "model.layers.23.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
187
- "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
188
- "model.layers.23.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
189
- "model.layers.23.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
190
- "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
191
- "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
192
- "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
193
- "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
194
- "model.layers.24.input_layernorm.weight": "model-00003-of-00004.safetensors",
195
- "model.layers.24.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
196
- "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
197
- "model.layers.24.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
198
- "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
199
- "model.layers.24.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
200
- "model.layers.24.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
201
- "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
202
- "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
203
- "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
204
- "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
205
- "model.layers.25.input_layernorm.weight": "model-00003-of-00004.safetensors",
206
- "model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
207
- "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
208
- "model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
209
- "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
210
- "model.layers.25.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
211
- "model.layers.25.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
212
- "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
213
- "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
214
- "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
215
- "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
216
- "model.layers.26.input_layernorm.weight": "model-00003-of-00004.safetensors",
217
- "model.layers.26.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
218
- "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
219
- "model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
220
- "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
221
- "model.layers.26.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
222
- "model.layers.26.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
223
- "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
224
- "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
225
- "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
226
- "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
227
- "model.layers.27.input_layernorm.weight": "model-00003-of-00004.safetensors",
228
- "model.layers.27.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
229
- "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
230
- "model.layers.27.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
231
- "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
232
- "model.layers.27.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
233
- "model.layers.27.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
234
- "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
235
- "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
236
- "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
237
- "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
238
- "model.layers.28.input_layernorm.weight": "model-00003-of-00004.safetensors",
239
- "model.layers.28.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
240
- "model.layers.28.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
241
- "model.layers.28.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
242
- "model.layers.28.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
243
- "model.layers.28.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
244
- "model.layers.28.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
245
- "model.layers.28.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
246
- "model.layers.28.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
247
- "model.layers.28.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
248
- "model.layers.28.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
249
- "model.layers.29.input_layernorm.weight": "model-00003-of-00004.safetensors",
250
- "model.layers.29.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
251
- "model.layers.29.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
252
- "model.layers.29.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
253
- "model.layers.29.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
254
- "model.layers.29.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
255
- "model.layers.29.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
256
- "model.layers.29.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
257
- "model.layers.29.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
258
- "model.layers.29.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
259
- "model.layers.29.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
260
- "model.layers.3.input_layernorm.weight": "model-00001-of-00004.safetensors",
261
- "model.layers.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
262
- "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
263
- "model.layers.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
264
- "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
265
- "model.layers.3.post_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
266
- "model.layers.3.pre_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
267
- "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
268
- "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
269
- "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
270
- "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
271
- "model.layers.30.input_layernorm.weight": "model-00003-of-00004.safetensors",
272
- "model.layers.30.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
273
- "model.layers.30.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
274
- "model.layers.30.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
275
- "model.layers.30.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
276
- "model.layers.30.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
277
- "model.layers.30.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
278
- "model.layers.30.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
279
- "model.layers.30.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
280
- "model.layers.30.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
281
- "model.layers.30.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
282
- "model.layers.31.input_layernorm.weight": "model-00003-of-00004.safetensors",
283
- "model.layers.31.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
284
- "model.layers.31.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
285
- "model.layers.31.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
286
- "model.layers.31.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
287
- "model.layers.31.post_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
288
- "model.layers.31.pre_feedforward_layernorm.weight": "model-00003-of-00004.safetensors",
289
- "model.layers.31.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
290
- "model.layers.31.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
291
- "model.layers.31.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
292
- "model.layers.31.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
293
- "model.layers.32.input_layernorm.weight": "model-00004-of-00004.safetensors",
294
- "model.layers.32.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
295
- "model.layers.32.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
296
- "model.layers.32.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
297
- "model.layers.32.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
298
- "model.layers.32.post_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
299
- "model.layers.32.pre_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
300
- "model.layers.32.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
301
- "model.layers.32.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
302
- "model.layers.32.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
303
- "model.layers.32.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
304
- "model.layers.33.input_layernorm.weight": "model-00004-of-00004.safetensors",
305
- "model.layers.33.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
306
- "model.layers.33.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
307
- "model.layers.33.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
308
- "model.layers.33.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
309
- "model.layers.33.post_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
310
- "model.layers.33.pre_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
311
- "model.layers.33.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
312
- "model.layers.33.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
313
- "model.layers.33.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
314
- "model.layers.33.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
315
- "model.layers.34.input_layernorm.weight": "model-00004-of-00004.safetensors",
316
- "model.layers.34.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
317
- "model.layers.34.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
318
- "model.layers.34.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
319
- "model.layers.34.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
320
- "model.layers.34.post_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
321
- "model.layers.34.pre_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
322
- "model.layers.34.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
323
- "model.layers.34.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
324
- "model.layers.34.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
325
- "model.layers.34.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
326
- "model.layers.35.input_layernorm.weight": "model-00004-of-00004.safetensors",
327
- "model.layers.35.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
328
- "model.layers.35.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
329
- "model.layers.35.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
330
- "model.layers.35.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
331
- "model.layers.35.post_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
332
- "model.layers.35.pre_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
333
- "model.layers.35.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
334
- "model.layers.35.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
335
- "model.layers.35.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
336
- "model.layers.35.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
337
- "model.layers.36.input_layernorm.weight": "model-00004-of-00004.safetensors",
338
- "model.layers.36.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
339
- "model.layers.36.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
340
- "model.layers.36.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
341
- "model.layers.36.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
342
- "model.layers.36.post_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
343
- "model.layers.36.pre_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
344
- "model.layers.36.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
345
- "model.layers.36.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
346
- "model.layers.36.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
347
- "model.layers.36.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
348
- "model.layers.37.input_layernorm.weight": "model-00004-of-00004.safetensors",
349
- "model.layers.37.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
350
- "model.layers.37.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
351
- "model.layers.37.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
352
- "model.layers.37.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
353
- "model.layers.37.post_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
354
- "model.layers.37.pre_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
355
- "model.layers.37.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
356
- "model.layers.37.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
357
- "model.layers.37.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
358
- "model.layers.37.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
359
- "model.layers.38.input_layernorm.weight": "model-00004-of-00004.safetensors",
360
- "model.layers.38.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
361
- "model.layers.38.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
362
- "model.layers.38.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
363
- "model.layers.38.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
364
- "model.layers.38.post_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
365
- "model.layers.38.pre_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
366
- "model.layers.38.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
367
- "model.layers.38.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
368
- "model.layers.38.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
369
- "model.layers.38.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
370
- "model.layers.39.input_layernorm.weight": "model-00004-of-00004.safetensors",
371
- "model.layers.39.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
372
- "model.layers.39.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
373
- "model.layers.39.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
374
- "model.layers.39.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
375
- "model.layers.39.post_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
376
- "model.layers.39.pre_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
377
- "model.layers.39.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
378
- "model.layers.39.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
379
- "model.layers.39.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
380
- "model.layers.39.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
381
- "model.layers.4.input_layernorm.weight": "model-00001-of-00004.safetensors",
382
- "model.layers.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
383
- "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
384
- "model.layers.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
385
- "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
386
- "model.layers.4.post_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
387
- "model.layers.4.pre_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
388
- "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
389
- "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
390
- "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
391
- "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
392
- "model.layers.40.input_layernorm.weight": "model-00004-of-00004.safetensors",
393
- "model.layers.40.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
394
- "model.layers.40.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
395
- "model.layers.40.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
396
- "model.layers.40.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
397
- "model.layers.40.post_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
398
- "model.layers.40.pre_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
399
- "model.layers.40.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
400
- "model.layers.40.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
401
- "model.layers.40.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
402
- "model.layers.40.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
403
- "model.layers.41.input_layernorm.weight": "model-00004-of-00004.safetensors",
404
- "model.layers.41.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
405
- "model.layers.41.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
406
- "model.layers.41.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
407
- "model.layers.41.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
408
- "model.layers.41.post_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
409
- "model.layers.41.pre_feedforward_layernorm.weight": "model-00004-of-00004.safetensors",
410
- "model.layers.41.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
411
- "model.layers.41.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
412
- "model.layers.41.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
413
- "model.layers.41.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
414
- "model.layers.5.input_layernorm.weight": "model-00001-of-00004.safetensors",
415
- "model.layers.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
416
- "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
417
- "model.layers.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
418
- "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
419
- "model.layers.5.post_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
420
- "model.layers.5.pre_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
421
- "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
422
- "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
423
- "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
424
- "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
425
- "model.layers.6.input_layernorm.weight": "model-00001-of-00004.safetensors",
426
- "model.layers.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
427
- "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
428
- "model.layers.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
429
- "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
430
- "model.layers.6.post_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
431
- "model.layers.6.pre_feedforward_layernorm.weight": "model-00001-of-00004.safetensors",
432
- "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
433
- "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
434
- "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
435
- "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
436
- "model.layers.7.input_layernorm.weight": "model-00002-of-00004.safetensors",
437
- "model.layers.7.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
438
- "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
439
- "model.layers.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
440
- "model.layers.7.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
441
- "model.layers.7.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
442
- "model.layers.7.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
443
- "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
444
- "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
445
- "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
446
- "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
447
- "model.layers.8.input_layernorm.weight": "model-00002-of-00004.safetensors",
448
- "model.layers.8.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
449
- "model.layers.8.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
450
- "model.layers.8.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
451
- "model.layers.8.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
452
- "model.layers.8.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
453
- "model.layers.8.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
454
- "model.layers.8.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
455
- "model.layers.8.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
456
- "model.layers.8.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
457
- "model.layers.8.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
458
- "model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
459
- "model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
460
- "model.layers.9.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
461
- "model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
462
- "model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
463
- "model.layers.9.post_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
464
- "model.layers.9.pre_feedforward_layernorm.weight": "model-00002-of-00004.safetensors",
465
- "model.layers.9.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
466
- "model.layers.9.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
467
- "model.layers.9.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
468
- "model.layers.9.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
469
- "model.norm.weight": "model-00004-of-00004.safetensors"
470
- }
471
- }
 
special_tokens_map.json DELETED
@@ -1,34 +0,0 @@
1
- {
2
- "additional_special_tokens": [
3
- "<start_of_turn>",
4
- "<end_of_turn>"
5
- ],
6
- "bos_token": {
7
- "content": "<bos>",
8
- "lstrip": false,
9
- "normalized": false,
10
- "rstrip": false,
11
- "single_word": false
12
- },
13
- "eos_token": {
14
- "content": "<eos>",
15
- "lstrip": false,
16
- "normalized": false,
17
- "rstrip": false,
18
- "single_word": false
19
- },
20
- "pad_token": {
21
- "content": "<pad>",
22
- "lstrip": false,
23
- "normalized": false,
24
- "rstrip": false,
25
- "single_word": false
26
- },
27
- "unk_token": {
28
- "content": "<unk>",
29
- "lstrip": false,
30
- "normalized": false,
31
- "rstrip": false,
32
- "single_word": false
33
- }
34
- }
 
tokenizer.json DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:5f7eee611703c5ce5d1eee32d9cdcfe465647b8aff0c1dfb3bed7ad7dbb05060
3
- size 34362873
 
tokenizer.model DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:61a7b147390c64585d6c3543dd6fc636906c9af3865a5548f27f31aee1d4c8e2
3
- size 4241003
 
tokenizer_config.json DELETED
@@ -1,2013 +0,0 @@
1
- {
2
- "add_bos_token": true,
3
- "add_eos_token": false,
4
- "added_tokens_decoder": {
5
- "0": {
6
- "content": "<pad>",
7
- "lstrip": false,
8
- "normalized": false,
9
- "rstrip": false,
10
- "single_word": false,
11
- "special": true
12
- },
13
- "1": {
14
- "content": "<eos>",
15
- "lstrip": false,
16
- "normalized": false,
17
- "rstrip": false,
18
- "single_word": false,
19
- "special": true
20
- },
21
- "2": {
22
- "content": "<bos>",
23
- "lstrip": false,
24
- "normalized": false,
25
- "rstrip": false,
26
- "single_word": false,
27
- "special": true
28
- },
29
- "3": {
30
- "content": "<unk>",
31
- "lstrip": false,
32
- "normalized": false,
33
- "rstrip": false,
34
- "single_word": false,
35
- "special": true
36
- },
37
- "4": {
38
- "content": "<mask>",
39
- "lstrip": false,
40
- "normalized": false,
41
- "rstrip": false,
42
- "single_word": false,
43
- "special": false
44
- },
45
- "5": {
46
- "content": "<2mass>",
47
- "lstrip": false,
48
- "normalized": false,
49
- "rstrip": false,
50
- "single_word": false,
51
- "special": false
52
- },
53
- "6": {
54
- "content": "[@BOS@]",
55
- "lstrip": false,
56
- "normalized": false,
57
- "rstrip": false,
58
- "single_word": false,
59
- "special": false
60
- },
61
- "7": {
62
- "content": "<unused0>",
63
- "lstrip": false,
64
- "normalized": false,
65
- "rstrip": false,
66
- "single_word": false,
67
- "special": false
68
- },
69
- "8": {
70
- "content": "<unused1>",
71
- "lstrip": false,
72
- "normalized": false,
73
- "rstrip": false,
74
- "single_word": false,
75
- "special": false
76
- },
77
- "9": {
78
- "content": "<unused2>",
79
- "lstrip": false,
80
- "normalized": false,
81
- "rstrip": false,
82
- "single_word": false,
83
- "special": false
84
- },
85
- "10": {
86
- "content": "<unused3>",
87
- "lstrip": false,
88
- "normalized": false,
89
- "rstrip": false,
90
- "single_word": false,
91
- "special": false
92
- },
93
- "11": {
94
- "content": "<unused4>",
95
- "lstrip": false,
96
- "normalized": false,
97
- "rstrip": false,
98
- "single_word": false,
99
- "special": false
100
- },
101
- "12": {
102
- "content": "<unused5>",
103
- "lstrip": false,
104
- "normalized": false,
105
- "rstrip": false,
106
- "single_word": false,
107
- "special": false
108
- },
109
- "13": {
110
- "content": "<unused6>",
111
- "lstrip": false,
112
- "normalized": false,
113
- "rstrip": false,
114
- "single_word": false,
115
- "special": false
116
- },
117
- "14": {
118
- "content": "<unused7>",
119
- "lstrip": false,
120
- "normalized": false,
121
- "rstrip": false,
122
- "single_word": false,
123
- "special": false
124
- },
125
- "15": {
126
- "content": "<unused8>",
127
- "lstrip": false,
128
- "normalized": false,
129
- "rstrip": false,
130
- "single_word": false,
131
- "special": false
132
- },
133
- "16": {
134
- "content": "<unused9>",
135
- "lstrip": false,
136
- "normalized": false,
137
- "rstrip": false,
138
- "single_word": false,
139
- "special": false
140
- },
141
- "17": {
142
- "content": "<unused10>",
143
- "lstrip": false,
144
- "normalized": false,
145
- "rstrip": false,
146
- "single_word": false,
147
- "special": false
148
- },
149
- "18": {
150
- "content": "<unused11>",
151
- "lstrip": false,
152
- "normalized": false,
153
- "rstrip": false,
154
- "single_word": false,
155
- "special": false
156
- },
157
- "19": {
158
- "content": "<unused12>",
159
- "lstrip": false,
160
- "normalized": false,
161
- "rstrip": false,
162
- "single_word": false,
163
- "special": false
164
- },
165
- "20": {
166
- "content": "<unused13>",
167
- "lstrip": false,
168
- "normalized": false,
169
- "rstrip": false,
170
- "single_word": false,
171
- "special": false
172
- },
173
- "21": {
174
- "content": "<unused14>",
175
- "lstrip": false,
176
- "normalized": false,
177
- "rstrip": false,
178
- "single_word": false,
179
- "special": false
180
- },
181
- "22": {
182
- "content": "<unused15>",
183
- "lstrip": false,
184
- "normalized": false,
185
- "rstrip": false,
186
- "single_word": false,
187
- "special": false
188
- },
189
- "23": {
190
- "content": "<unused16>",
191
- "lstrip": false,
192
- "normalized": false,
193
- "rstrip": false,
194
- "single_word": false,
195
- "special": false
196
- },
197
- "24": {
198
- "content": "<unused17>",
199
- "lstrip": false,
200
- "normalized": false,
201
- "rstrip": false,
202
- "single_word": false,
203
- "special": false
204
- },
205
- "25": {
206
- "content": "<unused18>",
207
- "lstrip": false,
208
- "normalized": false,
209
- "rstrip": false,
210
- "single_word": false,
211
- "special": false
212
- },
213
- "26": {
214
- "content": "<unused19>",
215
- "lstrip": false,
216
- "normalized": false,
217
- "rstrip": false,
218
- "single_word": false,
219
- "special": false
220
- },
221
- "27": {
222
- "content": "<unused20>",
223
- "lstrip": false,
224
- "normalized": false,
225
- "rstrip": false,
226
- "single_word": false,
227
- "special": false
228
- },
229
- "28": {
230
- "content": "<unused21>",
231
- "lstrip": false,
232
- "normalized": false,
233
- "rstrip": false,
234
- "single_word": false,
235
- "special": false
236
- },
237
- "29": {
238
- "content": "<unused22>",
239
- "lstrip": false,
240
- "normalized": false,
241
- "rstrip": false,
242
- "single_word": false,
243
- "special": false
244
- },
245
- "30": {
246
- "content": "<unused23>",
247
- "lstrip": false,
248
- "normalized": false,
249
- "rstrip": false,
250
- "single_word": false,
251
- "special": false
252
- },
253
- "31": {
254
- "content": "<unused24>",
255
- "lstrip": false,
256
- "normalized": false,
257
- "rstrip": false,
258
- "single_word": false,
259
- "special": false
260
- },
261
- "32": {
262
- "content": "<unused25>",
263
- "lstrip": false,
264
- "normalized": false,
265
- "rstrip": false,
266
- "single_word": false,
267
- "special": false
268
- },
269
- "33": {
270
- "content": "<unused26>",
271
- "lstrip": false,
272
- "normalized": false,
273
- "rstrip": false,
274
- "single_word": false,
275
- "special": false
276
- },
277
- "34": {
278
- "content": "<unused27>",
279
- "lstrip": false,
280
- "normalized": false,
281
- "rstrip": false,
282
- "single_word": false,
283
- "special": false
284
- },
285
- "35": {
286
- "content": "<unused28>",
287
- "lstrip": false,
288
- "normalized": false,
289
- "rstrip": false,
290
- "single_word": false,
291
- "special": false
292
- },
293
- "36": {
294
- "content": "<unused29>",
295
- "lstrip": false,
296
- "normalized": false,
297
- "rstrip": false,
298
- "single_word": false,
299
- "special": false
300
- },
301
- "37": {
302
- "content": "<unused30>",
303
- "lstrip": false,
304
- "normalized": false,
305
- "rstrip": false,
306
- "single_word": false,
307
- "special": false
308
- },
309
- "38": {
310
- "content": "<unused31>",
311
- "lstrip": false,
312
- "normalized": false,
313
- "rstrip": false,
314
- "single_word": false,
315
- "special": false
316
- },
317
- "39": {
318
- "content": "<unused32>",
319
- "lstrip": false,
320
- "normalized": false,
321
- "rstrip": false,
322
- "single_word": false,
323
- "special": false
324
- },
325
- "40": {
326
- "content": "<unused33>",
327
- "lstrip": false,
328
- "normalized": false,
329
- "rstrip": false,
330
- "single_word": false,
331
- "special": false
332
- },
333
- "41": {
334
- "content": "<unused34>",
335
- "lstrip": false,
336
- "normalized": false,
337
- "rstrip": false,
338
- "single_word": false,
339
- "special": false
340
- },
341
- "42": {
342
- "content": "<unused35>",
343
- "lstrip": false,
344
- "normalized": false,
345
- "rstrip": false,
346
- "single_word": false,
347
- "special": false
348
- },
349
- "43": {
350
- "content": "<unused36>",
351
- "lstrip": false,
352
- "normalized": false,
353
- "rstrip": false,
354
- "single_word": false,
355
- "special": false
356
- },
357
- "44": {
358
- "content": "<unused37>",
359
- "lstrip": false,
360
- "normalized": false,
361
- "rstrip": false,
362
- "single_word": false,
363
- "special": false
364
- },
365
- "45": {
366
- "content": "<unused38>",
367
- "lstrip": false,
368
- "normalized": false,
369
- "rstrip": false,
370
- "single_word": false,
371
- "special": false
372
- },
373
- "46": {
374
- "content": "<unused39>",
375
- "lstrip": false,
376
- "normalized": false,
377
- "rstrip": false,
378
- "single_word": false,
379
- "special": false
380
- },
381
- "47": {
382
- "content": "<unused40>",
383
- "lstrip": false,
384
- "normalized": false,
385
- "rstrip": false,
386
- "single_word": false,
387
- "special": false
388
- },
389
- "48": {
390
- "content": "<unused41>",
391
- "lstrip": false,
392
- "normalized": false,
393
- "rstrip": false,
394
- "single_word": false,
395
- "special": false
396
- },
397
- "49": {
398
- "content": "<unused42>",
399
- "lstrip": false,
400
- "normalized": false,
401
- "rstrip": false,
402
- "single_word": false,
403
- "special": false
404
- },
405
- "50": {
406
- "content": "<unused43>",
407
- "lstrip": false,
408
- "normalized": false,
409
- "rstrip": false,
410
- "single_word": false,
411
- "special": false
412
- },
413
- "51": {
414
- "content": "<unused44>",
415
- "lstrip": false,
416
- "normalized": false,
417
- "rstrip": false,
418
- "single_word": false,
419
- "special": false
420
- },
421
- "52": {
422
- "content": "<unused45>",
423
- "lstrip": false,
424
- "normalized": false,
425
- "rstrip": false,
426
- "single_word": false,
427
- "special": false
428
- },
429
- "53": {
430
- "content": "<unused46>",
431
- "lstrip": false,
432
- "normalized": false,
433
- "rstrip": false,
434
- "single_word": false,
435
- "special": false
436
- },
437
- "54": {
438
- "content": "<unused47>",
439
- "lstrip": false,
440
- "normalized": false,
441
- "rstrip": false,
442
- "single_word": false,
443
- "special": false
444
- },
445
- "55": {
446
- "content": "<unused48>",
447
- "lstrip": false,
448
- "normalized": false,
449
- "rstrip": false,
450
- "single_word": false,
451
- "special": false
452
- },
453
- "56": {
454
- "content": "<unused49>",
455
- "lstrip": false,
456
- "normalized": false,
457
- "rstrip": false,
458
- "single_word": false,
459
- "special": false
460
- },
461
- "57": {
462
- "content": "<unused50>",
463
- "lstrip": false,
464
- "normalized": false,
465
- "rstrip": false,
466
- "single_word": false,
467
- "special": false
468
- },
469
- "58": {
470
- "content": "<unused51>",
471
- "lstrip": false,
472
- "normalized": false,
473
- "rstrip": false,
474
- "single_word": false,
475
- "special": false
476
- },
477
- "59": {
478
- "content": "<unused52>",
479
- "lstrip": false,
480
- "normalized": false,
481
- "rstrip": false,
482
- "single_word": false,
483
- "special": false
484
- },
485
- "60": {
486
- "content": "<unused53>",
487
- "lstrip": false,
488
- "normalized": false,
489
- "rstrip": false,
490
- "single_word": false,
491
- "special": false
492
- },
493
- "61": {
494
- "content": "<unused54>",
495
- "lstrip": false,
496
- "normalized": false,
497
- "rstrip": false,
498
- "single_word": false,
499
- "special": false
500
- },
501
- "62": {
502
- "content": "<unused55>",
503
- "lstrip": false,
504
- "normalized": false,
505
- "rstrip": false,
506
- "single_word": false,
507
- "special": false
508
- },
509
- "63": {
510
- "content": "<unused56>",
511
- "lstrip": false,
512
- "normalized": false,
513
- "rstrip": false,
514
- "single_word": false,
515
- "special": false
516
- },
517
- "64": {
518
- "content": "<unused57>",
519
- "lstrip": false,
520
- "normalized": false,
521
- "rstrip": false,
522
- "single_word": false,
523
- "special": false
524
- },
525
- "65": {
526
- "content": "<unused58>",
527
- "lstrip": false,
528
- "normalized": false,
529
- "rstrip": false,
530
- "single_word": false,
531
- "special": false
532
- },
533
- "66": {
534
- "content": "<unused59>",
535
- "lstrip": false,
536
- "normalized": false,
537
- "rstrip": false,
538
- "single_word": false,
539
- "special": false
540
- },
541
- "67": {
542
- "content": "<unused60>",
543
- "lstrip": false,
544
- "normalized": false,
545
- "rstrip": false,
546
- "single_word": false,
547
- "special": false
548
- },
549
- "68": {
550
- "content": "<unused61>",
551
- "lstrip": false,
552
- "normalized": false,
553
- "rstrip": false,
554
- "single_word": false,
555
- "special": false
556
- },
557
- "69": {
558
- "content": "<unused62>",
559
- "lstrip": false,
560
- "normalized": false,
561
- "rstrip": false,
562
- "single_word": false,
563
- "special": false
564
- },
565
- "70": {
566
- "content": "<unused63>",
567
- "lstrip": false,
568
- "normalized": false,
569
- "rstrip": false,
570
- "single_word": false,
571
- "special": false
572
- },
573
- "71": {
574
- "content": "<unused64>",
575
- "lstrip": false,
576
- "normalized": false,
577
- "rstrip": false,
578
- "single_word": false,
579
- "special": false
580
- },
581
- "72": {
582
- "content": "<unused65>",
583
- "lstrip": false,
584
- "normalized": false,
585
- "rstrip": false,
586
- "single_word": false,
587
- "special": false
588
- },
589
- "73": {
590
- "content": "<unused66>",
591
- "lstrip": false,
592
- "normalized": false,
593
- "rstrip": false,
594
- "single_word": false,
595
- "special": false
596
- },
597
- "74": {
598
- "content": "<unused67>",
599
- "lstrip": false,
600
- "normalized": false,
601
- "rstrip": false,
602
- "single_word": false,
603
- "special": false
604
- },
605
- "75": {
606
- "content": "<unused68>",
607
- "lstrip": false,
608
- "normalized": false,
609
- "rstrip": false,
610
- "single_word": false,
611
- "special": false
612
- },
613
- "76": {
614
- "content": "<unused69>",
615
- "lstrip": false,
616
- "normalized": false,
617
- "rstrip": false,
618
- "single_word": false,
619
- "special": false
620
- },
621
- "77": {
622
- "content": "<unused70>",
623
- "lstrip": false,
624
- "normalized": false,
625
- "rstrip": false,
626
- "single_word": false,
627
- "special": false
628
- },
629
- "78": {
630
- "content": "<unused71>",
631
- "lstrip": false,
632
- "normalized": false,
633
- "rstrip": false,
634
- "single_word": false,
635
- "special": false
636
- },
637
- "79": {
638
- "content": "<unused72>",
639
- "lstrip": false,
640
- "normalized": false,
641
- "rstrip": false,
642
- "single_word": false,
643
- "special": false
644
- },
645
- "80": {
646
- "content": "<unused73>",
647
- "lstrip": false,
648
- "normalized": false,
649
- "rstrip": false,
650
- "single_word": false,
651
- "special": false
652
- },
653
- "81": {
654
- "content": "<unused74>",
655
- "lstrip": false,
656
- "normalized": false,
657
- "rstrip": false,
658
- "single_word": false,
659
- "special": false
660
- },
661
- "82": {
662
- "content": "<unused75>",
663
- "lstrip": false,
664
- "normalized": false,
665
- "rstrip": false,
666
- "single_word": false,
667
- "special": false
668
- },
669
- "83": {
670
- "content": "<unused76>",
671
- "lstrip": false,
672
- "normalized": false,
673
- "rstrip": false,
674
- "single_word": false,
675
- "special": false
676
- },
677
- "84": {
678
- "content": "<unused77>",
679
- "lstrip": false,
680
- "normalized": false,
681
- "rstrip": false,
682
- "single_word": false,
683
- "special": false
684
- },
685
- "85": {
686
- "content": "<unused78>",
687
- "lstrip": false,
688
- "normalized": false,
689
- "rstrip": false,
690
- "single_word": false,
691
- "special": false
692
- },
693
- "86": {
694
- "content": "<unused79>",
695
- "lstrip": false,
696
- "normalized": false,
697
- "rstrip": false,
698
- "single_word": false,
699
- "special": false
700
- },
701
- "87": {
702
- "content": "<unused80>",
703
- "lstrip": false,
704
- "normalized": false,
705
- "rstrip": false,
706
- "single_word": false,
707
- "special": false
708
- },
709
- "88": {
710
- "content": "<unused81>",
711
- "lstrip": false,
712
- "normalized": false,
713
- "rstrip": false,
714
- "single_word": false,
715
- "special": false
716
- },
717
- "89": {
718
- "content": "<unused82>",
719
- "lstrip": false,
720
- "normalized": false,
721
- "rstrip": false,
722
- "single_word": false,
723
- "special": false
724
- },
725
- "90": {
726
- "content": "<unused83>",
727
- "lstrip": false,
728
- "normalized": false,
729
- "rstrip": false,
730
- "single_word": false,
731
- "special": false
732
- },
733
- "91": {
734
- "content": "<unused84>",
735
- "lstrip": false,
736
- "normalized": false,
737
- "rstrip": false,
738
- "single_word": false,
739
- "special": false
740
- },
741
- "92": {
742
- "content": "<unused85>",
743
- "lstrip": false,
744
- "normalized": false,
745
- "rstrip": false,
746
- "single_word": false,
747
- "special": false
748
- },
749
- "93": {
750
- "content": "<unused86>",
751
- "lstrip": false,
752
- "normalized": false,
753
- "rstrip": false,
754
- "single_word": false,
755
- "special": false
756
- },
757
- "94": {
758
- "content": "<unused87>",
759
- "lstrip": false,
760
- "normalized": false,
761
- "rstrip": false,
762
- "single_word": false,
763
- "special": false
764
- },
765
- "95": {
766
- "content": "<unused88>",
767
- "lstrip": false,
768
- "normalized": false,
769
- "rstrip": false,
770
- "single_word": false,
771
- "special": false
772
- },
773
- "96": {
774
- "content": "<unused89>",
775
- "lstrip": false,
776
- "normalized": false,
777
- "rstrip": false,
778
- "single_word": false,
779
- "special": false
780
- },
781
- "97": {
782
- "content": "<unused90>",
783
- "lstrip": false,
784
- "normalized": false,
785
- "rstrip": false,
786
- "single_word": false,
787
- "special": false
788
- },
789
- "98": {
790
- "content": "<unused91>",
791
- "lstrip": false,
792
- "normalized": false,
793
- "rstrip": false,
794
- "single_word": false,
795
- "special": false
796
- },
797
- "99": {
798
- "content": "<unused92>",
799
- "lstrip": false,
800
- "normalized": false,
801
- "rstrip": false,
802
- "single_word": false,
803
- "special": false
804
- },
805
- "100": {
806
- "content": "<unused93>",
807
- "lstrip": false,
808
- "normalized": false,
809
- "rstrip": false,
810
- "single_word": false,
811
- "special": false
812
- },
813
- "101": {
814
- "content": "<unused94>",
815
- "lstrip": false,
816
- "normalized": false,
817
- "rstrip": false,
818
- "single_word": false,
819
- "special": false
820
- },
821
- "102": {
822
- "content": "<unused95>",
823
- "lstrip": false,
824
- "normalized": false,
825
- "rstrip": false,
826
- "single_word": false,
827
- "special": false
828
- },
829
- "103": {
830
- "content": "<unused96>",
831
- "lstrip": false,
832
- "normalized": false,
833
- "rstrip": false,
834
- "single_word": false,
835
- "special": false
836
- },
837
- "104": {
838
- "content": "<unused97>",
839
- "lstrip": false,
840
- "normalized": false,
841
- "rstrip": false,
842
- "single_word": false,
843
- "special": false
844
- },
845
- "105": {
846
- "content": "<unused98>",
847
- "lstrip": false,
848
- "normalized": false,
849
- "rstrip": false,
850
- "single_word": false,
851
- "special": false
852
- },
853
- "106": {
854
- "content": "<start_of_turn>",
855
- "lstrip": false,
856
- "normalized": false,
857
- "rstrip": false,
858
- "single_word": false,
859
- "special": true
860
- },
861
- "107": {
862
- "content": "<end_of_turn>",
863
- "lstrip": false,
864
- "normalized": false,
865
- "rstrip": false,
866
- "single_word": false,
867
- "special": true
868
- },
869
- "108": {
870
- "content": "\n",
871
- "lstrip": false,
872
- "normalized": false,
873
- "rstrip": false,
874
- "single_word": false,
875
- "special": false
876
- },
877
- "109": {
878
- "content": "\n\n",
879
- "lstrip": false,
880
- "normalized": false,
881
- "rstrip": false,
882
- "single_word": false,
883
- "special": false
884
- },
885
- "110": {
886
- "content": "\n\n\n",
887
- "lstrip": false,
888
- "normalized": false,
889
- "rstrip": false,
890
- "single_word": false,
891
- "special": false
892
- },
893
- "111": {
894
- "content": "\n\n\n\n",
895
- "lstrip": false,
896
- "normalized": false,
897
- "rstrip": false,
898
- "single_word": false,
899
- "special": false
900
- },
901
- "112": {
902
- "content": "\n\n\n\n\n",
903
- "lstrip": false,
904
- "normalized": false,
905
- "rstrip": false,
906
- "single_word": false,
907
- "special": false
908
- },
909
- "113": {
910
- "content": "\n\n\n\n\n\n",
911
- "lstrip": false,
912
- "normalized": false,
913
- "rstrip": false,
914
- "single_word": false,
915
- "special": false
916
- },
917
- "114": {
918
- "content": "\n\n\n\n\n\n\n",
919
- "lstrip": false,
920
- "normalized": false,
921
- "rstrip": false,
922
- "single_word": false,
923
- "special": false
924
- },
925
- "115": {
926
- "content": "\n\n\n\n\n\n\n\n",
927
- "lstrip": false,
928
- "normalized": false,
929
- "rstrip": false,
930
- "single_word": false,
931
- "special": false
932
- },
933
- "116": {
934
- "content": "\n\n\n\n\n\n\n\n\n",
935
- "lstrip": false,
936
- "normalized": false,
937
- "rstrip": false,
938
- "single_word": false,
939
- "special": false
940
- },
941
- "117": {
942
- "content": "\n\n\n\n\n\n\n\n\n\n",
943
- "lstrip": false,
944
- "normalized": false,
945
- "rstrip": false,
946
- "single_word": false,
947
- "special": false
948
- },
949
- "118": {
950
- "content": "\n\n\n\n\n\n\n\n\n\n\n",
951
- "lstrip": false,
952
- "normalized": false,
953
- "rstrip": false,
954
- "single_word": false,
955
- "special": false
956
- },
957
- "119": {
958
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n",
959
- "lstrip": false,
960
- "normalized": false,
961
- "rstrip": false,
962
- "single_word": false,
963
- "special": false
964
- },
965
- "120": {
966
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n",
967
- "lstrip": false,
968
- "normalized": false,
969
- "rstrip": false,
970
- "single_word": false,
971
- "special": false
972
- },
973
- "121": {
974
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
975
- "lstrip": false,
976
- "normalized": false,
977
- "rstrip": false,
978
- "single_word": false,
979
- "special": false
980
- },
981
- "122": {
982
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
983
- "lstrip": false,
984
- "normalized": false,
985
- "rstrip": false,
986
- "single_word": false,
987
- "special": false
988
- },
989
- "123": {
990
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
991
- "lstrip": false,
992
- "normalized": false,
993
- "rstrip": false,
994
- "single_word": false,
995
- "special": false
996
- },
997
- "124": {
998
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
999
- "lstrip": false,
1000
- "normalized": false,
1001
- "rstrip": false,
1002
- "single_word": false,
1003
- "special": false
1004
- },
1005
- "125": {
1006
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1007
- "lstrip": false,
1008
- "normalized": false,
1009
- "rstrip": false,
1010
- "single_word": false,
1011
- "special": false
1012
- },
1013
- "126": {
1014
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1015
- "lstrip": false,
1016
- "normalized": false,
1017
- "rstrip": false,
1018
- "single_word": false,
1019
- "special": false
1020
- },
1021
- "127": {
1022
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1023
- "lstrip": false,
1024
- "normalized": false,
1025
- "rstrip": false,
1026
- "single_word": false,
1027
- "special": false
1028
- },
1029
- "128": {
1030
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1031
- "lstrip": false,
1032
- "normalized": false,
1033
- "rstrip": false,
1034
- "single_word": false,
1035
- "special": false
1036
- },
1037
- "129": {
1038
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1039
- "lstrip": false,
1040
- "normalized": false,
1041
- "rstrip": false,
1042
- "single_word": false,
1043
- "special": false
1044
- },
1045
- "130": {
1046
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1047
- "lstrip": false,
1048
- "normalized": false,
1049
- "rstrip": false,
1050
- "single_word": false,
1051
- "special": false
1052
- },
1053
- "131": {
1054
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1055
- "lstrip": false,
1056
- "normalized": false,
1057
- "rstrip": false,
1058
- "single_word": false,
1059
- "special": false
1060
- },
1061
- "132": {
1062
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1063
- "lstrip": false,
1064
- "normalized": false,
1065
- "rstrip": false,
1066
- "single_word": false,
1067
- "special": false
1068
- },
1069
- "133": {
1070
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1071
- "lstrip": false,
1072
- "normalized": false,
1073
- "rstrip": false,
1074
- "single_word": false,
1075
- "special": false
1076
- },
1077
- "134": {
1078
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1079
- "lstrip": false,
1080
- "normalized": false,
1081
- "rstrip": false,
1082
- "single_word": false,
1083
- "special": false
1084
- },
1085
- "135": {
1086
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1087
- "lstrip": false,
1088
- "normalized": false,
1089
- "rstrip": false,
1090
- "single_word": false,
1091
- "special": false
1092
- },
1093
- "136": {
1094
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1095
- "lstrip": false,
1096
- "normalized": false,
1097
- "rstrip": false,
1098
- "single_word": false,
1099
- "special": false
1100
- },
1101
- "137": {
1102
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1103
- "lstrip": false,
1104
- "normalized": false,
1105
- "rstrip": false,
1106
- "single_word": false,
1107
- "special": false
1108
- },
1109
- "138": {
1110
- "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
1111
- "lstrip": false,
1112
- "normalized": false,
1113
- "rstrip": false,
1114
- "single_word": false,
1115
- "special": false
1116
- },
1117
- "139": {
1118
- "content": "▁▁",
1119
- "lstrip": false,
1120
- "normalized": false,
1121
- "rstrip": false,
1122
- "single_word": false,
1123
- "special": false
1124
- },
1125
- "140": {
1126
- "content": "▁▁▁",
1127
- "lstrip": false,
1128
- "normalized": false,
1129
- "rstrip": false,
1130
- "single_word": false,
1131
- "special": false
1132
- },
1133
- "141": {
1134
- "content": "▁▁▁▁",
1135
- "lstrip": false,
1136
- "normalized": false,
1137
- "rstrip": false,
1138
- "single_word": false,
1139
- "special": false
1140
- },
1141
- "142": {
1142
- "content": "▁▁▁▁▁",
1143
- "lstrip": false,
1144
- "normalized": false,
1145
- "rstrip": false,
1146
- "single_word": false,
1147
- "special": false
1148
- },
1149
- "143": {
1150
- "content": "▁▁▁▁▁▁",
1151
- "lstrip": false,
1152
- "normalized": false,
1153
- "rstrip": false,
1154
- "single_word": false,
1155
- "special": false
1156
- },
1157
- "144": {
1158
- "content": "▁▁▁▁▁▁▁",
1159
- "lstrip": false,
1160
- "normalized": false,
1161
- "rstrip": false,
1162
- "single_word": false,
1163
- "special": false
1164
- },
1165
- "145": {
1166
- "content": "▁▁▁▁▁▁▁▁",
1167
- "lstrip": false,
1168
- "normalized": false,
1169
- "rstrip": false,
1170
- "single_word": false,
1171
- "special": false
1172
- },
1173
- "146": {
1174
- "content": "▁▁▁▁▁▁▁▁▁",
1175
- "lstrip": false,
1176
- "normalized": false,
1177
- "rstrip": false,
1178
- "single_word": false,
1179
- "special": false
1180
- },
1181
- "147": {
1182
- "content": "▁▁▁▁▁▁▁▁▁▁",
1183
- "lstrip": false,
1184
- "normalized": false,
1185
- "rstrip": false,
1186
- "single_word": false,
1187
- "special": false
1188
- },
1189
- "148": {
1190
- "content": "▁▁▁▁▁▁▁▁▁▁▁",
1191
- "lstrip": false,
1192
- "normalized": false,
1193
- "rstrip": false,
1194
- "single_word": false,
1195
- "special": false
1196
- },
1197
- "149": {
1198
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁",
1199
- "lstrip": false,
1200
- "normalized": false,
1201
- "rstrip": false,
1202
- "single_word": false,
1203
- "special": false
1204
- },
1205
- "150": {
1206
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁",
1207
- "lstrip": false,
1208
- "normalized": false,
1209
- "rstrip": false,
1210
- "single_word": false,
1211
- "special": false
1212
- },
1213
- "151": {
1214
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1215
- "lstrip": false,
1216
- "normalized": false,
1217
- "rstrip": false,
1218
- "single_word": false,
1219
- "special": false
1220
- },
1221
- "152": {
1222
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1223
- "lstrip": false,
1224
- "normalized": false,
1225
- "rstrip": false,
1226
- "single_word": false,
1227
- "special": false
1228
- },
1229
- "153": {
1230
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1231
- "lstrip": false,
1232
- "normalized": false,
1233
- "rstrip": false,
1234
- "single_word": false,
1235
- "special": false
1236
- },
1237
- "154": {
1238
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1239
- "lstrip": false,
1240
- "normalized": false,
1241
- "rstrip": false,
1242
- "single_word": false,
1243
- "special": false
1244
- },
1245
- "155": {
1246
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1247
- "lstrip": false,
1248
- "normalized": false,
1249
- "rstrip": false,
1250
- "single_word": false,
1251
- "special": false
1252
- },
1253
- "156": {
1254
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1255
- "lstrip": false,
1256
- "normalized": false,
1257
- "rstrip": false,
1258
- "single_word": false,
1259
- "special": false
1260
- },
1261
- "157": {
1262
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1263
- "lstrip": false,
1264
- "normalized": false,
1265
- "rstrip": false,
1266
- "single_word": false,
1267
- "special": false
1268
- },
1269
- "158": {
1270
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1271
- "lstrip": false,
1272
- "normalized": false,
1273
- "rstrip": false,
1274
- "single_word": false,
1275
- "special": false
1276
- },
1277
- "159": {
1278
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1279
- "lstrip": false,
1280
- "normalized": false,
1281
- "rstrip": false,
1282
- "single_word": false,
1283
- "special": false
1284
- },
1285
- "160": {
1286
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1287
- "lstrip": false,
1288
- "normalized": false,
1289
- "rstrip": false,
1290
- "single_word": false,
1291
- "special": false
1292
- },
1293
- "161": {
1294
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1295
- "lstrip": false,
1296
- "normalized": false,
1297
- "rstrip": false,
1298
- "single_word": false,
1299
- "special": false
1300
- },
1301
- "162": {
1302
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1303
- "lstrip": false,
1304
- "normalized": false,
1305
- "rstrip": false,
1306
- "single_word": false,
1307
- "special": false
1308
- },
1309
- "163": {
1310
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1311
- "lstrip": false,
1312
- "normalized": false,
1313
- "rstrip": false,
1314
- "single_word": false,
1315
- "special": false
1316
- },
1317
- "164": {
1318
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1319
- "lstrip": false,
1320
- "normalized": false,
1321
- "rstrip": false,
1322
- "single_word": false,
1323
- "special": false
1324
- },
1325
- "165": {
1326
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1327
- "lstrip": false,
1328
- "normalized": false,
1329
- "rstrip": false,
1330
- "single_word": false,
1331
- "special": false
1332
- },
1333
- "166": {
1334
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1335
- "lstrip": false,
1336
- "normalized": false,
1337
- "rstrip": false,
1338
- "single_word": false,
1339
- "special": false
1340
- },
1341
- "167": {
1342
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1343
- "lstrip": false,
1344
- "normalized": false,
1345
- "rstrip": false,
1346
- "single_word": false,
1347
- "special": false
1348
- },
1349
- "168": {
1350
- "content": "▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁",
1351
- "lstrip": false,
1352
- "normalized": false,
1353
- "rstrip": false,
1354
- "single_word": false,
1355
- "special": false
1356
- },
1357
- "169": {
1358
- "content": "<table>",
1359
- "lstrip": false,
1360
- "normalized": false,
1361
- "rstrip": false,
1362
- "single_word": false,
1363
- "special": false
1364
- },
1365
- "170": {
1366
- "content": "<caption>",
1367
- "lstrip": false,
1368
- "normalized": false,
1369
- "rstrip": false,
1370
- "single_word": false,
1371
- "special": false
1372
- },
1373
- "171": {
1374
- "content": "<thead>",
1375
- "lstrip": false,
1376
- "normalized": false,
1377
- "rstrip": false,
1378
- "single_word": false,
1379
- "special": false
1380
- },
1381
- "172": {
1382
- "content": "<tbody>",
1383
- "lstrip": false,
1384
- "normalized": false,
1385
- "rstrip": false,
1386
- "single_word": false,
1387
- "special": false
1388
- },
1389
- "173": {
1390
- "content": "<tfoot>",
1391
- "lstrip": false,
1392
- "normalized": false,
1393
- "rstrip": false,
1394
- "single_word": false,
1395
- "special": false
1396
- },
1397
- "174": {
1398
- "content": "<tr>",
1399
- "lstrip": false,
1400
- "normalized": false,
1401
- "rstrip": false,
1402
- "single_word": false,
1403
- "special": false
1404
- },
1405
- "175": {
1406
- "content": "<th>",
1407
- "lstrip": false,
1408
- "normalized": false,
1409
- "rstrip": false,
1410
- "single_word": false,
1411
- "special": false
1412
- },
1413
- "176": {
1414
- "content": "<td>",
1415
- "lstrip": false,
1416
- "normalized": false,
1417
- "rstrip": false,
1418
- "single_word": false,
1419
- "special": false
1420
- },
1421
- "177": {
1422
- "content": "</table>",
1423
- "lstrip": false,
1424
- "normalized": false,
1425
- "rstrip": false,
1426
- "single_word": false,
1427
- "special": false
1428
- },
1429
- "178": {
1430
- "content": "</caption>",
1431
- "lstrip": false,
1432
- "normalized": false,
1433
- "rstrip": false,
1434
- "single_word": false,
1435
- "special": false
1436
- },
1437
- "179": {
1438
- "content": "</thead>",
1439
- "lstrip": false,
1440
- "normalized": false,
1441
- "rstrip": false,
1442
- "single_word": false,
1443
- "special": false
1444
- },
1445
- "180": {
1446
- "content": "</tbody>",
1447
- "lstrip": false,
1448
- "normalized": false,
1449
- "rstrip": false,
1450
- "single_word": false,
1451
- "special": false
1452
- },
1453
- "181": {
1454
- "content": "</tfoot>",
1455
- "lstrip": false,
1456
- "normalized": false,
1457
- "rstrip": false,
1458
- "single_word": false,
1459
- "special": false
1460
- },
1461
- "182": {
1462
- "content": "</tr>",
1463
- "lstrip": false,
1464
- "normalized": false,
1465
- "rstrip": false,
1466
- "single_word": false,
1467
- "special": false
1468
- },
1469
- "183": {
1470
- "content": "</th>",
1471
- "lstrip": false,
1472
- "normalized": false,
1473
- "rstrip": false,
1474
- "single_word": false,
1475
- "special": false
1476
- },
1477
- "184": {
1478
- "content": "</td>",
1479
- "lstrip": false,
1480
- "normalized": false,
1481
- "rstrip": false,
1482
- "single_word": false,
1483
- "special": false
1484
- },
1485
- "185": {
1486
- "content": "<h1>",
1487
- "lstrip": false,
1488
- "normalized": false,
1489
- "rstrip": false,
1490
- "single_word": false,
1491
- "special": false
1492
- },
1493
- "186": {
1494
- "content": "<h2>",
1495
- "lstrip": false,
1496
- "normalized": false,
1497
- "rstrip": false,
1498
- "single_word": false,
1499
- "special": false
1500
- },
1501
- "187": {
1502
- "content": "<h3>",
1503
- "lstrip": false,
1504
- "normalized": false,
1505
- "rstrip": false,
1506
- "single_word": false,
1507
- "special": false
1508
- },
1509
- "188": {
1510
- "content": "<h4>",
1511
- "lstrip": false,
1512
- "normalized": false,
1513
- "rstrip": false,
1514
- "single_word": false,
1515
- "special": false
1516
- },
1517
- "189": {
1518
- "content": "<h5>",
1519
- "lstrip": false,
1520
- "normalized": false,
1521
- "rstrip": false,
1522
- "single_word": false,
1523
- "special": false
1524
- },
1525
- "190": {
1526
- "content": "<h6>",
1527
- "lstrip": false,
1528
- "normalized": false,
1529
- "rstrip": false,
1530
- "single_word": false,
1531
- "special": false
1532
- },
1533
- "191": {
1534
- "content": "<blockquote>",
1535
- "lstrip": false,
1536
- "normalized": false,
1537
- "rstrip": false,
1538
- "single_word": false,
1539
- "special": false
1540
- },
1541
- "192": {
1542
- "content": "</h1>",
1543
- "lstrip": false,
1544
- "normalized": false,
1545
- "rstrip": false,
1546
- "single_word": false,
1547
- "special": false
1548
- },
1549
- "193": {
1550
- "content": "</h2>",
1551
- "lstrip": false,
1552
- "normalized": false,
1553
- "rstrip": false,
1554
- "single_word": false,
1555
- "special": false
1556
- },
1557
- "194": {
1558
- "content": "</h3>",
1559
- "lstrip": false,
1560
- "normalized": false,
1561
- "rstrip": false,
1562
- "single_word": false,
1563
- "special": false
1564
- },
1565
- "195": {
1566
- "content": "</h4>",
1567
- "lstrip": false,
1568
- "normalized": false,
1569
- "rstrip": false,
1570
- "single_word": false,
1571
- "special": false
1572
- },
1573
- "196": {
1574
- "content": "</h5>",
1575
- "lstrip": false,
1576
- "normalized": false,
1577
- "rstrip": false,
1578
- "single_word": false,
1579
- "special": false
1580
- },
1581
- "197": {
1582
- "content": "</h6>",
1583
- "lstrip": false,
1584
- "normalized": false,
1585
- "rstrip": false,
1586
- "single_word": false,
1587
- "special": false
1588
- },
1589
- "198": {
1590
- "content": "</blockquote>",
1591
- "lstrip": false,
1592
- "normalized": false,
1593
- "rstrip": false,
1594
- "single_word": false,
1595
- "special": false
1596
- },
1597
- "199": {
1598
- "content": "<strong>",
1599
- "lstrip": false,
1600
- "normalized": false,
1601
- "rstrip": false,
1602
- "single_word": false,
1603
- "special": false
1604
- },
1605
- "200": {
1606
- "content": "<em>",
1607
- "lstrip": false,
1608
- "normalized": false,
1609
- "rstrip": false,
1610
- "single_word": false,
1611
- "special": false
1612
- },
1613
- "201": {
1614
- "content": "<b>",
1615
- "lstrip": false,
1616
- "normalized": false,
1617
- "rstrip": false,
1618
- "single_word": false,
1619
- "special": false
1620
- },
1621
- "202": {
1622
- "content": "<i>",
1623
- "lstrip": false,
1624
- "normalized": false,
1625
- "rstrip": false,
1626
- "single_word": false,
1627
- "special": false
1628
- },
1629
- "203": {
1630
- "content": "<u>",
1631
- "lstrip": false,
1632
- "normalized": false,
1633
- "rstrip": false,
1634
- "single_word": false,
1635
- "special": false
1636
- },
1637
- "204": {
1638
- "content": "<s>",
1639
- "lstrip": false,
1640
- "normalized": false,
1641
- "rstrip": false,
1642
- "single_word": false,
1643
- "special": false
1644
- },
1645
- "205": {
1646
- "content": "<sub>",
1647
- "lstrip": false,
1648
- "normalized": false,
1649
- "rstrip": false,
1650
- "single_word": false,
1651
- "special": false
1652
- },
1653
- "206": {
1654
- "content": "<sup>",
1655
- "lstrip": false,
1656
- "normalized": false,
1657
- "rstrip": false,
1658
- "single_word": false,
1659
- "special": false
1660
- },
1661
- "207": {
1662
- "content": "<code>",
1663
- "lstrip": false,
1664
- "normalized": false,
1665
- "rstrip": false,
1666
- "single_word": false,
1667
- "special": false
1668
- },
1669
- "208": {
1670
- "content": "</strong>",
1671
- "lstrip": false,
1672
- "normalized": false,
1673
- "rstrip": false,
1674
- "single_word": false,
1675
- "special": false
1676
- },
1677
- "209": {
1678
- "content": "</em>",
1679
- "lstrip": false,
1680
- "normalized": false,
1681
- "rstrip": false,
1682
- "single_word": false,
1683
- "special": false
1684
- },
1685
- "210": {
1686
- "content": "</b>",
1687
- "lstrip": false,
1688
- "normalized": false,
1689
- "rstrip": false,
1690
- "single_word": false,
1691
- "special": false
1692
- },
1693
- "211": {
1694
- "content": "</i>",
1695
- "lstrip": false,
1696
- "normalized": false,
1697
- "rstrip": false,
1698
- "single_word": false,
1699
- "special": false
1700
- },
1701
- "212": {
1702
- "content": "</u>",
1703
- "lstrip": false,
1704
- "normalized": false,
1705
- "rstrip": false,
1706
- "single_word": false,
1707
- "special": false
1708
- },
1709
- "213": {
1710
- "content": "</s>",
1711
- "lstrip": false,
1712
- "normalized": false,
1713
- "rstrip": false,
1714
- "single_word": false,
1715
- "special": false
1716
- },
1717
- "214": {
1718
- "content": "</sub>",
1719
- "lstrip": false,
1720
- "normalized": false,
1721
- "rstrip": false,
1722
- "single_word": false,
1723
- "special": false
1724
- },
1725
- "215": {
1726
- "content": "</sup>",
1727
- "lstrip": false,
1728
- "normalized": false,
1729
- "rstrip": false,
1730
- "single_word": false,
1731
- "special": false
1732
- },
1733
- "216": {
1734
- "content": "</code>",
1735
- "lstrip": false,
1736
- "normalized": false,
1737
- "rstrip": false,
1738
- "single_word": false,
1739
- "special": false
1740
- },
1741
- "255968": {
1742
- "content": "[toxicity=0]",
1743
- "lstrip": false,
1744
- "normalized": false,
1745
- "rstrip": false,
1746
- "single_word": false,
1747
- "special": false
1748
- },
1749
- "255969": {
1750
- "content": "\t\t",
1751
- "lstrip": false,
1752
- "normalized": false,
1753
- "rstrip": false,
1754
- "single_word": false,
1755
- "special": false
1756
- },
1757
- "255970": {
1758
- "content": "\t\t\t",
1759
- "lstrip": false,
1760
- "normalized": false,
1761
- "rstrip": false,
1762
- "single_word": false,
1763
- "special": false
1764
- },
1765
- "255971": {
1766
- "content": "\t\t\t\t",
1767
- "lstrip": false,
1768
- "normalized": false,
1769
- "rstrip": false,
1770
- "single_word": false,
1771
- "special": false
1772
- },
1773
- "255972": {
1774
- "content": "\t\t\t\t\t",
1775
- "lstrip": false,
1776
- "normalized": false,
1777
- "rstrip": false,
1778
- "single_word": false,
1779
- "special": false
1780
- },
1781
- "255973": {
1782
- "content": "\t\t\t\t\t\t",
1783
- "lstrip": false,
1784
- "normalized": false,
1785
- "rstrip": false,
1786
- "single_word": false,
1787
- "special": false
1788
- },
1789
- "255974": {
1790
- "content": "\t\t\t\t\t\t\t",
1791
- "lstrip": false,
1792
- "normalized": false,
1793
- "rstrip": false,
1794
- "single_word": false,
1795
- "special": false
1796
- },
1797
- "255975": {
1798
- "content": "\t\t\t\t\t\t\t\t",
1799
- "lstrip": false,
1800
- "normalized": false,
1801
- "rstrip": false,
1802
- "single_word": false,
1803
- "special": false
1804
- },
1805
- "255976": {
1806
- "content": "\t\t\t\t\t\t\t\t\t",
1807
- "lstrip": false,
1808
- "normalized": false,
1809
- "rstrip": false,
1810
- "single_word": false,
1811
- "special": false
1812
- },
1813
- "255977": {
1814
- "content": "\t\t\t\t\t\t\t\t\t\t",
1815
- "lstrip": false,
1816
- "normalized": false,
1817
- "rstrip": false,
1818
- "single_word": false,
1819
- "special": false
1820
- },
1821
- "255978": {
1822
- "content": "\t\t\t\t\t\t\t\t\t\t\t",
1823
- "lstrip": false,
1824
- "normalized": false,
1825
- "rstrip": false,
1826
- "single_word": false,
1827
- "special": false
1828
- },
1829
- "255979": {
1830
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t",
1831
- "lstrip": false,
1832
- "normalized": false,
1833
- "rstrip": false,
1834
- "single_word": false,
1835
- "special": false
1836
- },
1837
- "255980": {
1838
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t",
1839
- "lstrip": false,
1840
- "normalized": false,
1841
- "rstrip": false,
1842
- "single_word": false,
1843
- "special": false
1844
- },
1845
- "255981": {
1846
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1847
- "lstrip": false,
1848
- "normalized": false,
1849
- "rstrip": false,
1850
- "single_word": false,
1851
- "special": false
1852
- },
1853
- "255982": {
1854
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1855
- "lstrip": false,
1856
- "normalized": false,
1857
- "rstrip": false,
1858
- "single_word": false,
1859
- "special": false
1860
- },
1861
- "255983": {
1862
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1863
- "lstrip": false,
1864
- "normalized": false,
1865
- "rstrip": false,
1866
- "single_word": false,
1867
- "special": false
1868
- },
1869
- "255984": {
1870
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1871
- "lstrip": false,
1872
- "normalized": false,
1873
- "rstrip": false,
1874
- "single_word": false,
1875
- "special": false
1876
- },
1877
- "255985": {
1878
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1879
- "lstrip": false,
1880
- "normalized": false,
1881
- "rstrip": false,
1882
- "single_word": false,
1883
- "special": false
1884
- },
1885
- "255986": {
1886
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1887
- "lstrip": false,
1888
- "normalized": false,
1889
- "rstrip": false,
1890
- "single_word": false,
1891
- "special": false
1892
- },
1893
- "255987": {
1894
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1895
- "lstrip": false,
1896
- "normalized": false,
1897
- "rstrip": false,
1898
- "single_word": false,
1899
- "special": false
1900
- },
1901
- "255988": {
1902
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1903
- "lstrip": false,
1904
- "normalized": false,
1905
- "rstrip": false,
1906
- "single_word": false,
1907
- "special": false
1908
- },
1909
- "255989": {
1910
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1911
- "lstrip": false,
1912
- "normalized": false,
1913
- "rstrip": false,
1914
- "single_word": false,
1915
- "special": false
1916
- },
1917
- "255990": {
1918
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1919
- "lstrip": false,
1920
- "normalized": false,
1921
- "rstrip": false,
1922
- "single_word": false,
1923
- "special": false
1924
- },
1925
- "255991": {
1926
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1927
- "lstrip": false,
1928
- "normalized": false,
1929
- "rstrip": false,
1930
- "single_word": false,
1931
- "special": false
1932
- },
1933
- "255992": {
1934
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1935
- "lstrip": false,
1936
- "normalized": false,
1937
- "rstrip": false,
1938
- "single_word": false,
1939
- "special": false
1940
- },
1941
- "255993": {
1942
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1943
- "lstrip": false,
1944
- "normalized": false,
1945
- "rstrip": false,
1946
- "single_word": false,
1947
- "special": false
1948
- },
1949
- "255994": {
1950
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1951
- "lstrip": false,
1952
- "normalized": false,
1953
- "rstrip": false,
1954
- "single_word": false,
1955
- "special": false
1956
- },
1957
- "255995": {
1958
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1959
- "lstrip": false,
1960
- "normalized": false,
1961
- "rstrip": false,
1962
- "single_word": false,
1963
- "special": false
1964
- },
1965
- "255996": {
1966
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1967
- "lstrip": false,
1968
- "normalized": false,
1969
- "rstrip": false,
1970
- "single_word": false,
1971
- "special": false
1972
- },
1973
- "255997": {
1974
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1975
- "lstrip": false,
1976
- "normalized": false,
1977
- "rstrip": false,
1978
- "single_word": false,
1979
- "special": false
1980
- },
1981
- "255998": {
1982
- "content": "\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t",
1983
- "lstrip": false,
1984
- "normalized": false,
1985
- "rstrip": false,
1986
- "single_word": false,
1987
- "special": false
1988
- },
1989
- "255999": {
1990
- "content": "<unused99>",
1991
- "lstrip": false,
1992
- "normalized": false,
1993
- "rstrip": false,
1994
- "single_word": false,
1995
- "special": false
1996
- }
1997
- },
1998
- "additional_special_tokens": [
1999
- "<start_of_turn>",
2000
- "<end_of_turn>"
2001
- ],
2002
- "bos_token": "<bos>",
2003
- "chat_template": "{{ bos_token }}{% if messages[0]['role'] == 'system' %}{% set system_message = messages[0]['content'] | trim + '\n\n' %}{% set messages = messages[1:] %}{% else %}{% set system_message = '' %}{% endif %}{% for message in messages %}{% if loop.index0 == 0 %}{% set content = system_message + message['content'] %}{% else %}{% set content = message['content'] %}{% endif %}{% if (message['role'] == 'assistant') %}{% set role = 'model' %}{% else %}{% set role = message['role'] %}{% endif %}{{ '<start_of_turn>' + role + '\n' + content | trim + '<end_of_turn>\n' }}{% endfor %}{% if add_generation_prompt %}{{'<start_of_turn>model\n'}}{% endif %}",
2004
- "clean_up_tokenization_spaces": false,
2005
- "eos_token": "<eos>",
2006
- "model_max_length": 2048,
2007
- "pad_token": "<pad>",
2008
- "sp_model_kwargs": {},
2009
- "spaces_between_special_tokens": false,
2010
- "tokenizer_class": "GemmaTokenizer",
2011
- "unk_token": "<unk>",
2012
- "use_default_system_prompt": false
2013
- }
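
Note on the removed `chat_template`: the Jinja string above is dense, so here is a minimal pure-Python sketch of what it appears to render, for anyone checking that prompts are built the same way after this change. The function name `render_chat` and its parameters are hypothetical and not part of this repository or of `transformers`. It assumes Jinja's `trim` filter behaves like `str.strip()` and that the filter binds tighter than `+`, so only the message content is trimmed. Unlike the template, the sketch tolerates an empty message list.

```python
def render_chat(messages, bos_token="<bos>", add_generation_prompt=False):
    """Hypothetical re-implementation of the removed Jinja chat template."""
    # A leading system message is not given its own turn; it is trimmed,
    # followed by a blank line, and prepended to the first remaining message.
    if messages and messages[0]["role"] == "system":
        system_message = messages[0]["content"].strip() + "\n\n"
        messages = messages[1:]
    else:
        system_message = ""

    parts = [bos_token]
    for index, message in enumerate(messages):
        content = message["content"]
        if index == 0:
            content = system_message + content
        # The template renames the 'assistant' role to 'model'.
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n{content.strip()}<end_of_turn>\n")

    # Optionally open a model turn so generation continues from there.
    if add_generation_prompt:
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": " Be brief. "},
        {"role": "user", "content": "Halo"},
    ]
    print(render_chat(demo, add_generation_prompt=True))
```

To compare against the real template, the same message list can be passed to the tokenizer's own chat-template rendering and the two strings diffed; the sketch is only a reading aid, not a replacement.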