![Mamba-ko-2.8B](./Seagull-mamba.png)
**Mamba-ko-2.8B** is a state space model, further pretrained (continually trained) on [**korean_textbooks**](https://huggingface.co/datasets/maywell/korean_textbooks), a synthetically generated dataset.
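If you want to inspect the training data yourself, here is a minimal sketch for pulling it from the Hub with 🤗 `datasets`. The dataset's subset names aren't listed here, so the snippet discovers them at runtime rather than assuming one:

```python
from datasets import get_dataset_config_names, load_dataset

# korean_textbooks is split into several subsets; list them first.
configs = get_dataset_config_names("maywell/korean_textbooks")
print(configs)

# Load one subset (the first one, purely as an example) and peek at a record.
ds = load_dataset("maywell/korean_textbooks", configs[0], split="train")
print(ds[0])
```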
> If you're interested in building large-scale language models that solve a wide variety of problems across many domains, consider joining [Allganize](https://allganize.career.greetinghr.com/o/65146).
For a coffee chat or if you have any questions, please do not hesitate to contact me as well! - [email protected]
I would like to thank Allganize Korea for their generosity in providing resources for this personal project. This project is not directly related to the company's goals or research.
## TODO
- Complete training with korean_textbooks - 6B tokens down, 2B to go.
- More training with publicly available Korean corpora
### KoBEST
| Model | boolq | copa | hellaswag | sentineg |
| --- | --- | --- | --- | --- |
| kuotient/mamba-ko-2.8b* | 0.5825 | 0.6166 | 0.4051 | 0.3383 |
| state_spaces/mamba-2.8b-slimpj | 0.3343 | 0.4867 | 0.3452 | 0.3547 |
| kuotient/mamba-ko-2.8b-old (2B trained only) | 0.4236 | 0.5896 | 0.4012 | 0.4348 |
| kuotient/mamba-ko-2.8b-old-instruct | 0.4041 | 0.6505 | 0.4906 | 0.3348 |
| maywell/TinyWand-SFT | 0.3455 | 0.6142 | 0.3944 | N/A |
| microsoft/phi-2 | 0.3343 | 0.4792 | 0.3235 | N/A |
| TinyLlama/TinyLlama-1.1B | 0.3343 | 0.4784 | 0.3396 | N/A |
\* Trained on >6B tokens so far; training will continue up to 8B tokens.
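For reference, a minimal sketch of how KoBEST scores like these could be reproduced with EleutherAI's lm-evaluation-harness. The harness version (>= 0.4), the `kobest_*` task names, and the ability of `transformers` to load this checkpoint as a causal LM are all assumptions; the actual evaluation setup isn't stated above:

```python
# Sketch only: assumes lm-eval >= 0.4 with the kobest_* tasks available,
# and a transformers build that can load this checkpoint as a causal LM.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=kuotient/mamba-ko-2.8b,dtype=bfloat16",
    tasks=["kobest_boolq", "kobest_copa", "kobest_hellaswag", "kobest_sentineg"],
)
for task, metrics in results["results"].items():
    print(task, metrics)
```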
### Thanks
Many thanks to [maywell](https://huggingface.co/maywell) for his many contributions to, and the motivation he brings to, the Korean LLM community.
## Usage
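A minimal sketch of loading the model, assuming the checkpoint follows the standard `mamba-ssm` layout used by the original `state-spaces/mamba-2.8b` releases; the tokenizer choice (EleutherAI/gpt-neox-20b) is likewise an assumption carried over from those releases, since this model continues pretraining from them:

```python
# Sketch only: assumes the checkpoint loads with the mamba-ssm package
# (pip install mamba-ssm causal-conv1d) on a CUDA device.
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# Assumed tokenizer: the base state-spaces Mamba models use gpt-neox-20b's.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = MambaLMHeadModel.from_pretrained(
    "kuotient/mamba-ko-2.8b", device="cuda", dtype=torch.bfloat16
)

input_ids = tokenizer("한국의 수도는", return_tensors="pt").input_ids.to("cuda")
out = model.generate(input_ids=input_ids, max_length=100, temperature=0.7, top_p=0.9)
print(tokenizer.decode(out[0]))
```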