Quantization made by Richard Erkhov.

[GitHub](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

# Poro-34B-chat - GGUF
- Model creator: https://huggingface.co/LumiOpen/
- Original model: https://huggingface.co/LumiOpen/Poro-34B-chat/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [Poro-34B-chat.Q2_K.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q2_K.gguf) | Q2_K | 12.49GB |
| [Poro-34B-chat.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.IQ3_XS.gguf) | IQ3_XS | 14.05GB |
| [Poro-34B-chat.IQ3_S.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.IQ3_S.gguf) | IQ3_S | 14.42GB |
| [Poro-34B-chat.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q3_K_S.gguf) | Q3_K_S | 14.42GB |
| [Poro-34B-chat.IQ3_M.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.IQ3_M.gguf) | IQ3_M | 15.94GB |
| [Poro-34B-chat.Q3_K.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q3_K.gguf) | Q3_K | 17.23GB |
| [Poro-34B-chat.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q3_K_M.gguf) | Q3_K_M | 17.23GB |
| [Poro-34B-chat.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q3_K_L.gguf) | Q3_K_L | 18.78GB |
| [Poro-34B-chat.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.IQ4_XS.gguf) | IQ4_XS | 17.83GB |
| [Poro-34B-chat.Q4_0.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q4_0.gguf) | Q4_0 | 18.65GB |
| [Poro-34B-chat.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.IQ4_NL.gguf) | IQ4_NL | 18.79GB |
| [Poro-34B-chat.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q4_K_S.gguf) | Q4_K_S | 18.79GB |
| [Poro-34B-chat.Q4_K.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q4_K.gguf) | Q4_K | 20.9GB |
| [Poro-34B-chat.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q4_K_M.gguf) | Q4_K_M | 20.9GB |
| [Poro-34B-chat.Q4_1.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q4_1.gguf) | Q4_1 | 20.64GB |
| [Poro-34B-chat.Q5_0.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q5_0.gguf) | Q5_0 | 22.63GB |
| [Poro-34B-chat.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q5_K_S.gguf) | Q5_K_S | 22.63GB |
| [Poro-34B-chat.Q5_K.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q5_K.gguf) | Q5_K | 24.32GB |
| [Poro-34B-chat.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q5_K_M.gguf) | Q5_K_M | 24.32GB |
| [Poro-34B-chat.Q5_1.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q5_1.gguf) | Q5_1 | 24.62GB |
| [Poro-34B-chat.Q6_K.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q6_K.gguf) | Q6_K | 26.86GB |
| [Poro-34B-chat.Q8_0.gguf](https://huggingface.co/RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf/blob/main/Poro-34B-chat.Q8_0.gguf) | Q8_0 | 34.78GB |
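
Each file in the table above is a standalone model file from this repository. As a minimal sketch of fetching one quant and chatting with it, here is one way to do it with `huggingface_hub` and `llama-cpp-python`; the choice of Q4_K_M and the context size are illustrative assumptions, not recommendations from this card:

```python
# A minimal sketch: download one quant from this repo and run it with
# llama-cpp-python. The file choice (Q4_K_M) and n_ctx are assumptions.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="RichardErkhov/LumiOpen_-_Poro-34B-chat-gguf",
    filename="Poro-34B-chat.Q4_K_M.gguf",
)

llm = Llama(model_path=model_path, n_ctx=2048)

# Assumes the GGUF metadata embeds the ChatML template described below.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Miten rakennan tietokoneen?"}]
)
print(response["choices"][0]["message"]["content"])
```

Pick a larger quant (Q5/Q6/Q8) if you have the memory to spare; smaller quants trade quality for footprint.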

Original model description:
---
license: apache-2.0
datasets:
- LumiOpen/instruction-collection-fin
language:
- fi
- en
---
<div align="center">
<img src="./poro-logo.png" width="200px">
</div>

# Poro 34B Chat

Poro 34B Chat is a chat-tuned version of [Poro 34B](https://huggingface.co/LumiOpen/Poro-34B) trained to follow instructions in both Finnish and English. Quantized versions are available at [Poro 34B-chat-GGUF](https://huggingface.co/LumiOpen/Poro-34B-chat-GGUF).

Because of the limited amount of instruction-tuning data available for Finnish, documents from the English datasets were machine-translated into Finnish by the Poro 34B base model and then used to train this chat version. We selected only datasets that are available for commercial use and that contain synthetic data only if it was gathered in a ToS-compliant fashion.

More information about the data selection and translation process for our Finnish dataset is available on the [LumiOpen/instruction-collection-fin](https://huggingface.co/datasets/LumiOpen/instruction-collection-fin) page.

Poro was created in a collaboration between [SiloGen](https://www.silo.ai/silogen) from [Silo AI](https://www.silo.ai/), the [TurkuNLP group](https://turkunlp.org/) of the University of Turku, and [High Performance Language Technologies](https://hplt-project.org/) (HPLT). Training was conducted on the [LUMI supercomputer](https://www.lumi-supercomputer.eu/), using compute resources generously provided by [CSC](https://csc.fi/) - IT Center for Science, Finland.

This project is part of an ongoing effort to create open source large language models for non-English and especially low-resource languages like Finnish. Through the combination of English and Finnish training data, we get a model that outperforms previous Finnish-only models, while also being fluent in English and code, and capable of basic translation between English and Finnish.


## Fine Tuning

Poro 34B Chat is an SFT fine-tune of Poro 34B on a collection of Finnish and English instruction datasets. The collection is made up of roughly 40% English, 40% Finnish, and 20% cross-lingual entries.

We fine-tuned the base model for 3 epochs with a learning rate of 2e-05, a warmup ratio of 0.1, and a global batch size of 48. For full-parameter fine-tuning, we used 3 nodes (8 GPUs per node). We used the [Alignment Handbook](https://github.com/huggingface/alignment-handbook/) code for fine-tuning; a rough sketch of an equivalent setup follows.
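
The sketch below reproduces the reported hyperparameters with TRL's `SFTTrainer` rather than the actual Alignment Handbook recipe; the dataset choice, split name, and per-device/accumulation breakdown are illustrative assumptions. Note the arithmetic: 3 nodes x 8 GPUs = 24 workers, so a per-device batch of 2 with no gradient accumulation yields the global batch size of 48.

```python
# A minimal sketch, NOT the actual Alignment Handbook recipe.
# Hyperparameters marked "reported" come from the card; the rest are assumptions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

model = AutoModelForCausalLM.from_pretrained("LumiOpen/Poro-34B")

# Hypothetical: assumes the instruction collection exposes a "train" split
# in a chat format SFTTrainer can consume.
dataset = load_dataset("LumiOpen/instruction-collection-fin", split="train")

config = SFTConfig(
    output_dir="poro-34b-chat-sft",
    num_train_epochs=3,              # reported: 3 epochs
    learning_rate=2e-5,              # reported: 2e-05
    warmup_ratio=0.1,                # reported: 0.1
    per_device_train_batch_size=2,   # 2 x 24 GPUs = global batch size 48
    gradient_accumulation_steps=1,
)

trainer = SFTTrainer(model=model, args=config, train_dataset=dataset)
trainer.train()
```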

## Datasets

#### Finnish and Cross-lingual
- [LumiOpen/instruction-collection-fin](https://huggingface.co/datasets/LumiOpen/instruction-collection-fin)

#### English
- [databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k)
- [Curated OASST2](https://huggingface.co/datasets/sablo/oasst2_curated)
- [Argilla/10k_prompts_ranked_mistral_large_responses](https://huggingface.co/datasets/argilla/10k_prompts_ranked_mistral_large_responses)

## Chat template

We use the ChatML chat template. For example (the user turn below asks, in Finnish, "How do I build a computer?"):

```
<|im_start|>system
You can add an optional system prompt here.<|im_end|>
<|im_start|>user
Miten rakennan tietokoneen?<|im_end|>
<|im_start|>assistant
```
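
Rather than assembling this string by hand, you can let the tokenizer render it. A minimal sketch, assuming the hosted tokenizer ships this ChatML template in its `chat_template` field (the message content is illustrative):

```python
# Build the ChatML prompt via the tokenizer's chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LumiOpen/Poro-34B-chat")

messages = [
    {"role": "system", "content": "You can add an optional system prompt here."},
    {"role": "user", "content": "Miten rakennan tietokoneen?"},
]

# add_generation_prompt=True appends the trailing "<|im_start|>assistant" turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```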

## Evaluations

We relied on the popular MTBench benchmark to evaluate multi-turn performance.

Since MTBench is an English-only benchmark, we also release this fork of [MTBench Finnish](https://github.com/LumiOpen/FastChat/tree/main/fastchat/llm_judge) with multilingual support and machine-translated Finnish prompts. Our scores for both benchmarks follow.

Note: Updated on 18 June 2024

| Eval | Overall | Coding | Extraction | Humanities | Math | Reasoning | Roleplay | STEM | Writing |
| :---- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | ----: |
| MTBench English | 6.13 | 4.25 | 6.65 | 9.60 | 2.30 | 4.30 | 7.05 | 7.55 | 7.35 |
| MTBench Finnish | 6.06 | 3.70 | 6.37 | 9.25 | 1.20 | 4.35 | 7.35 | 7.80 | 8.50 |


## License

Poro 34B Chat is released under the Apache 2.0 license.

## Citation

```
@misc{luukkonen2024poro,
  title={Poro 34B and the Blessing of Multilinguality},
  author={Risto Luukkonen and Jonathan Burdge and Elaine Zosa and Aarne Talman and Ville Komulainen and Väinö Hatanpää and Peter Sarlin and Sampo Pyysalo},
  year={2024},
  eprint={2404.01856},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```