**Mamba-ko-2.8B** is a state space model, further pretrained (or continually trained) on the synthetically generated dataset [**korean_textbooks**](https://huggingface.co/datasets/maywell/korean_textbooks).

> If you're interested in building large-scale language models to solve a wide variety of problems in a wide variety of domains, you should consider joining [Allganize](https://allganize.career.greetinghr.com/o/65146).
For a coffee chat or if you have any questions, please do not hesitate to contact me as well! - kuotient.dev@gmail.com

I would like to thank Allganize Korea for their generosity in providing resources for this personal project. This project is not directly related to the company's goals or research.

## TODO
Jisoo Kim(kuotient)

I would also like to thank [maywell](https://huggingface.co/maywell), who has contributed a great deal to, and motivated, the Korean LLM community.

## Usage
```sh
pip install "causal_conv1d>=1.1.0" mamba-ssm==1.1.1
```
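Both packages compile custom CUDA kernels at install time, which is the most common point of failure. A quick import check (a minimal sketch, not part of the original card) confirms the install succeeded:

```py
# If either install failed to build its CUDA extension,
# these imports raise immediately.
import causal_conv1d
import mamba_ssm

print(mamba_ssm.__version__)
```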
```py
import torch
from transformers import AutoTokenizer, TextStreamer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "kuotient/mamba-2.8b-ko"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Load the checkpoint with mamba_ssm's own LM head class
# (not transformers' AutoModelForCausalLM).
model = MambaLMHeadModel.from_pretrained(
    model_name, device=device, dtype=torch.float16)

# "Five examples of nutritious foods to serve to children are as follows."
prompt = "아이들한테 제공할 영양가 있는 음식 5가지의 예시는 다음과 같다."

tokens = tokenizer(prompt, return_tensors='pt')
input_ids = tokens.input_ids.to(device)
streamer = TextStreamer(tokenizer)

out = model.generate(
    input_ids=input_ids,
    streamer=streamer,
    max_length=2000,                      # assumption: generation budget
    eos_token_id=tokenizer.eos_token_id,  # assumption: stop at end-of-sequence
)
```
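Once streaming finishes, the full completion can be recovered from the returned token IDs. A minimal sketch, assuming `generate` returns the sequences tensor (the default in `mamba_ssm`'s generation utility):

```py
# `out` holds the prompt plus the generated tokens, shape (1, seq_len).
print(tokenizer.decode(out[0], skip_special_tokens=True))
```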