---
language:
- en
- ko
datasets:
- kyujinpy/KOpen-platypus
library_name: transformers
pipeline_tag: text-generation
license: cc-by-nc-sa-4.0
---
**This model was developed by the LLM research consortium of (주)미디어그룹사람과숲 and (주)마커.**  
**The license is `cc-by-nc-sa-4.0`.**  

# **Ko-Platypus2-13B**  
![KO-Platypus2-13B](./KO_platypus.png)


## Model Details

**More detail repo(Github): [KO-Platypus](https://github.com/Marker-Inc-Korea/KO-Platypus)**

**Model Developers** Kyujin Han (kyujinpy)

**Input** Models input text only.

**Output** Models generate text only.

**Model Architecture** 
KO-Platypus2-13B is an auto-regressive language model based on the LLaMA2 transformer architecture.

**Base Model** [hyunseoki/ko-en-llama2-13b](https://huggingface.co/hyunseoki/ko-en-llama2-13b)  

**Training Dataset**
I used [KOpen-platypus](https://huggingface.co/datasets/kyujinpy/KOpen-platypus),  
a high-quality Korean translation of [Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus).  

I trained on Colab with a single A100 40GB GPU.  
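If you want to inspect the training data, it can be loaded with the 🤗 `datasets` library. This is a minimal sketch; the `train` split name is an assumption about the dataset layout:

```python
# Minimal sketch: loading KOpen-platypus with the datasets library.
# The "train" split name is an assumption; adjust it to the splits the dataset exposes.
from datasets import load_dataset

kopen_platypus = load_dataset("kyujinpy/KOpen-platypus", split="train")
print(kopen_platypus[0])  # inspect one translated instruction example
```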

# **Model Benchmark**

## KO-LLM leaderboard
- Results are reported on the [Open KO-LLM LeaderBoard](https://huggingface.co/spaces/upstage/open-ko-llm-leaderboard).  
  
![img](./leaderboard.png)
| Model | Average |Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
| --- | --- | --- | --- | --- | --- | --- |
| KO-Platypus2-13B(ours) | 47.90 | 44.20 | 54.31 | 42.47 | 44.41 | 54.11 | 
| [hyunseoki/ko-en-llama2-13b](https://huggingface.co/hyunseoki/ko-en-llama2-13b) | 46.68 | 42.15 | 54.23 | 38.90 | 40.74 | 57.39 |
| [MarkrAI/kyujin-CoTy-platypus-ko-12.8b](https://huggingface.co/MarkrAI/kyujin-CoTy-platypus-ko-12.8b) | 46.44 | 34.98 | 49.11 | 25.68 | 37.59 | 84.86 | 
| [momo/polyglot-ko-12.8b-Chat-QLoRA-Merge](https://huggingface.co/momo/polyglot-ko-12.8b-Chat-QLoRA-Merge) | 45.71 | 35.49 | 49.93 | 25.97 | 39.43 | 77.70 |   
| [KoT-platypus2-7B](https://huggingface.co/kyujinpy/KoT-platypus2-7B) | 45.62 | 38.05 | 49.63 | 34.68 | 37.69 | 68.08 |  
> Comparison with the top 4 SOTA models. (updated: 10/06)

---  
# Implementation Code
```python
### KO-Platypus
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "kyujinpy/KO-Platypus2-13B"
# Load the model in fp16 and shard it automatically across available devices
ko_platypus = AutoModelForCausalLM.from_pretrained(
        repo,
        return_dict=True,
        torch_dtype=torch.float16,
        device_map='auto'
)
ko_platypus_tokenizer = AutoTokenizer.from_pretrained(repo)
```
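Once loaded, the model can be used for generation as in the minimal sketch below; the Korean prompt and the sampling parameters are illustrative assumptions, not an official prompt template:

```python
# Minimal sketch of generation with the model loaded above.
# The prompt and sampling settings are illustrative only, not an official template.
prompt = "한국의 수도는 어디인가요?"  # "What is the capital of Korea?"

inputs = ko_platypus_tokenizer(prompt, return_tensors="pt").to(ko_platypus.device)
with torch.no_grad():
    outputs = ko_platypus.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )
print(ko_platypus_tokenizer.decode(outputs[0], skip_special_tokens=True))
```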

> Readme format: [kyujinpy/KoT-platypus2-7B](https://huggingface.co/kyujinpy/KoT-platypus2-7B)

---