	---
	language:
	- en
	- ko
	datasets:
	- kyujinpy/KOpen-platypus
	library_name: transformers
	pipeline_tag: text-generation
	license: cc-by-nc-4.0
	---

	# Ko-Platypus2-13B
	For more details, see the GitHub repo: [KO-Platypus](https://github.com/Marker-Inc-Korea/KO-Platypus)
	![KO-Platypus2-13B](./KO_platypus.png)


	## Model Details

	**Model Developers** Kyujin Han (kyujinpy)

	**Input** Models input text only.

	**Output** Models generate text only.

	**Model Architecture**
	KO-Platypus2-13B is an auto-regressive language model based on the LLaMA2 transformer architecture.

	**Base Model** [hyunseoki/ko-en-llama2-13b](https://huggingface.co/hyunseoki/ko-en-llama2-13b)

	**Training Dataset**
	I used [KOpen-platypus](https://huggingface.co/datasets/kyujinpy/KOpen-platypus), a high-quality Korean translation of the [Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) dataset.

	I trained the model on an A100 GPU (40GB) using Colab.

	# Model Benchmark

	## KO-LLM leaderboard
	- Evaluation follows the [Open KO-LLM LeaderBoard](https://huggingface.co/spaces/upstage/open-ko-llm-leaderboard).

	![img](./leaderboard.png)

	| Model | Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
	| --- | --- | --- | --- | --- | --- | --- |
	| KO-Platypus2-13B(ours) | NaN | NaN | NaN | NaN | NaN | NaN |
	| [hyunseoki/ko-en-llama2-13b](https://huggingface.co/hyunseoki/ko-en-llama2-13b) | 46.68 | 42.15 | 54.23 | 38.90 | 40.74 | 57.39 |
	| [momo/polyglot-ko-12.8b-Chat-QLoRA-Merge](https://huggingface.co/momo/polyglot-ko-12.8b-Chat-QLoRA-Merge) | 45.71 | 35.49 | 49.93 | 25.97 | 39.43 | 77.70 |
	| [KoT-platypus2-7B](https://huggingface.co/kyujinpy/KoT-platypus2-7B) | 45.62 | 38.05 | 49.63 | 34.68 | 37.69 | 68.08 |
	| [DopeorNope/COLA3-7B](https://huggingface.co/DopeorNope/COLA3-7B) | 45.61 | 39.16 | 50.98 | 35.21 | 37.81 | 64.91 |
	> Compared with the top 4 SOTA models. (update: 10/03)
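
This card does not say how the Average column is derived. The sketch below assumes it is the unweighted mean of the five task scores and recomputes it for the rows that have values. The `rows` dictionary and both helper functions are illustrative names, not part of any leaderboard tooling.

```python
# Scores copied from the table above, in the order:
# Ko-ARC, Ko-HellaSwag, Ko-MMLU, Ko-TruthfulQA, Ko-CommonGen V2
# Each entry is (reported average, [five task scores]).
rows = {
    "hyunseoki/ko-en-llama2-13b": (46.68, [42.15, 54.23, 38.90, 40.74, 57.39]),
    "momo/polyglot-ko-12.8b-Chat-QLoRA-Merge": (45.71, [35.49, 49.93, 25.97, 39.43, 77.70]),
    "KoT-platypus2-7B": (45.62, [38.05, 49.63, 34.68, 37.69, 68.08]),
    "DopeorNope/COLA3-7B": (45.61, [39.16, 50.98, 35.21, 37.81, 64.91]),
}


def mean_score(scores):
    """Unweighted mean of the task scores (assumed definition of Average)."""
    return sum(scores) / len(scores)


def max_deviation(table):
    """Largest gap between a reported average and the recomputed mean."""
    return max(abs(mean_score(scores) - reported) for reported, scores in table.values())


for name, (reported, scores) in rows.items():
    print(f"{name}: reported {reported:.2f}, recomputed {mean_score(scores):.2f}")
```

Under this assumption the recomputed means agree with the reported averages to within 0.01, which is consistent with the table values having been rounded to two decimals.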

	---
	# Implementation Code
	```python
	# Load KO-Platypus2-13B and its tokenizer
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	repo = "kyujinpy/KO-Platypus2-13B"

	# float16 weights; device_map="auto" places layers on the available devices (requires `accelerate`)
	model = AutoModelForCausalLM.from_pretrained(
	    repo,
	    return_dict=True,
	    torch_dtype=torch.float16,
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(repo)
	```
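
Once the model and tokenizer are loaded, text can be generated with the standard `generate` method. The helper below is a minimal sketch, not the author's inference script: `generate_text` is a hypothetical name, greedy decoding and the `max_new_tokens` default are assumptions, and because this card does not specify a prompt template, the prompt string is passed through unchanged.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=128):
    """Return only the newly generated text for `prompt`.

    `model` and `tokenizer` are expected to be the objects loaded from
    "kyujinpy/KO-Platypus2-13B" with AutoModelForCausalLM / AutoTokenizer.
    """
    # Tokenize and move the tensors to the device holding the model's first layer.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_len = len(inputs["input_ids"][0])

    # Greedy decoding (assumption); enable sampling for more varied output.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # The output contains the prompt followed by the continuation; keep the latter.
    new_tokens = output_ids[0][prompt_len:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call might look like `generate_text(model, tokenizer, "한국의 수도는 어디인가요?")`, where the first two arguments are the model and tokenizer objects created in the loading snippet.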

	> Readme format: [kyujinpy/KoT-platypus2-7B](https://huggingface.co/kyujinpy/KoT-platypus2-7B)

	---