**Mamba-ko-2.8B** is a state space model, further pretrained (or continually trained) on the synthetically generated dataset [**korean_textbooks**](https://huggingface.co/datasets/maywell/korean_textbooks).

> If you're interested in building large-scale language models to solve a wide variety of problems in a wide variety of domains, you should consider joining [Allganize](https://allganize.career.greetinghr.com/o/65146).
For a coffee chat or if you have any questions, please do not hesitate to contact me as well! - kuotient.dev@gmail.com

I would like to thank Allganize Korea for their generosity in providing resources for this personal project. This project is not directly related to the company's goals or research.

## TODO
Jisoo Kim(kuotient)

I would also like to thank [maywell](https://huggingface.co/maywell), who has contributed a great deal to, and motivated, the Korean LLM community.

## Usage
```sh
pip install "causal_conv1d>=1.1.0" mamba-ssm==1.1.1
```
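Both packages compile custom CUDA kernels at install time, which is the most common point of failure. A quick import check (a minimal sketch, not part of the original card) confirms the install succeeded:

```py
# If either install failed to build its CUDA extension,
# these imports raise immediately.
import causal_conv1d
import mamba_ssm

print(mamba_ssm.__version__)
```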
```py
import torch
from transformers import AutoTokenizer, TextStreamer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "kuotient/mamba-2.8b-ko"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Load the checkpoint with mamba_ssm's own LM head class
# (not transformers' AutoModelForCausalLM).
model = MambaLMHeadModel.from_pretrained(
    model_name, device=device, dtype=torch.float16)

# "Five examples of nutritious foods to serve to children are as follows."
prompt = "아이들한테 제공할 영양가 있는 음식 5가지의 예시는 다음과 같다."

tokens = tokenizer(prompt, return_tensors='pt')
input_ids = tokens.input_ids.to(device)
streamer = TextStreamer(tokenizer)

out = model.generate(
    input_ids=input_ids,
    streamer=streamer,
    max_length=2000,                      # assumption: generation budget
    eos_token_id=tokenizer.eos_token_id,  # assumption: stop at end-of-sequence
)
```
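Once streaming finishes, the full completion can be recovered from the returned token IDs. A minimal sketch, assuming `generate` returns the sequences tensor (the default in `mamba_ssm`'s generation utility):

```py
# `out` holds the prompt plus the generated tokens, shape (1, seq_len).
print(tokenizer.decode(out[0], skip_special_tokens=True))
```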