QuantFactory/KULLM3-GGUF

This is quantized version of nlpai-lab/KULLM3 created using llama.cpp

Original Model Card

KULLM3

Introducing KULLM3, a model with advanced instruction-following and fluent chat abilities. It has shown remarkable performance in instruction-following, speficially by closely following gpt-3.5-turbo.
To our knowledge, It is one of the best publicly opened Korean-speaking language models.

For details, visit the KULLM repository

Model Description

This is the model card of a πŸ€— transformers model that has been pushed on the Hub.

Example code

Install Dependencies

pip install torch transformers==4.38.2 accelerate
  • In transformers>=4.39.0, generate() does not work well. (as of 2024.4.4.)

Python code

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

MODEL_DIR = "nlpai-lab/KULLM3"
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, torch_dtype=torch.float16).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

s = "κ³ λ €λŒ€ν•™κ΅μ— λŒ€ν•΄μ„œ μ•Œκ³  μžˆλ‹ˆ?"
conversation = [{'role': 'user', 'content': s}]
inputs = tokenizer.apply_chat_template(
    conversation,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors='pt').to("cuda")
_ = model.generate(inputs, streamer=streamer, max_new_tokens=1024)

# λ„€, κ³ λ €λŒ€ν•™κ΅μ— λŒ€ν•΄ μ•Œκ³  μžˆμŠ΅λ‹ˆλ‹€. κ³ λ €λŒ€ν•™κ΅λŠ” λŒ€ν•œλ―Όκ΅­ μ„œμšΈμ— μœ„μΉ˜ν•œ 사립 λŒ€ν•™κ΅λ‘œ, 1905년에 μ„€λ¦½λ˜μ—ˆμŠ΅λ‹ˆλ‹€. 이 λŒ€ν•™κ΅λŠ” ν•œκ΅­μ—μ„œ κ°€μž₯ 였래된 λŒ€ν•™ 쀑 ν•˜λ‚˜λ‘œ, λ‹€μ–‘ν•œ ν•™λΆ€ 및 λŒ€ν•™μ› ν”„λ‘œκ·Έλž¨μ„ μ œκ³΅ν•©λ‹ˆλ‹€. κ³ λ €λŒ€ν•™κ΅λŠ” 특히 법학, κ²½μ œν•™, μ •μΉ˜ν•™, μ‚¬νšŒν•™, λ¬Έν•™, κ³Όν•™ λΆ„μ•Όμ—μ„œ 높은 λͺ…성을 가지고 μžˆμŠ΅λ‹ˆλ‹€. λ˜ν•œ, 슀포츠 λΆ„μ•Όμ—μ„œλ„ ν™œλ°œν•œ ν™œλ™μ„ 보이며, λŒ€ν•œλ―Όκ΅­ λŒ€ν•™ μŠ€ν¬μΈ μ—μ„œ μ€‘μš”ν•œ 역할을 ν•˜κ³  μžˆμŠ΅λ‹ˆλ‹€. κ³ λ €λŒ€ν•™κ΅λŠ” ꡭ제적인 ꡐλ₯˜μ™€ ν˜‘λ ₯에도 적극적이며, μ „ 세계 λ‹€μ–‘ν•œ λŒ€ν•™κ³Όμ˜ ν˜‘λ ₯을 톡해 κΈ€λ‘œλ²Œ 경쟁λ ₯을 κ°•ν™”ν•˜κ³  μžˆμŠ΅λ‹ˆλ‹€.

Training Details

Training Data

  • vicgalle/alpaca-gpt4
  • Mixed Korean instruction data (gpt-generated, hand-crafted, etc)
  • About 66000+ examples used totally

Training Procedure

  • Trained with fixed system prompt below.
당신은 κ³ λ €λŒ€ν•™κ΅ NLP&AI μ—°κ΅¬μ‹€μ—μ„œ λ§Œλ“  AI μ±—λ΄‡μž…λ‹ˆλ‹€.
λ‹Ήμ‹ μ˜ 이름은 'KULLM'으둜, ν•œκ΅­μ–΄λ‘œλŠ” 'ꡬ름'을 λœ»ν•©λ‹ˆλ‹€.
당신은 λΉ„λ„λ•μ μ΄κ±°λ‚˜, μ„±μ μ΄κ±°λ‚˜, λΆˆλ²•μ μ΄κ±°λ‚˜ λ˜λŠ” μ‚¬νšŒ ν†΅λ…μ μœΌλ‘œ ν—ˆμš©λ˜μ§€ μ•ŠλŠ” λ°œμ–Έμ€ ν•˜μ§€ μ•ŠμŠ΅λ‹ˆλ‹€.
μ‚¬μš©μžμ™€ 즐겁게 λŒ€ν™”ν•˜λ©°, μ‚¬μš©μžμ˜ 응닡에 κ°€λŠ₯ν•œ μ •ν™•ν•˜κ³  μΉœμ ˆν•˜κ²Œ μ‘λ‹΅ν•¨μœΌλ‘œμ¨ μ΅œλŒ€ν•œ 도와주렀고 λ…Έλ ₯ν•©λ‹ˆλ‹€.
질문이 μ΄μƒν•˜λ‹€λ©΄, μ–΄λ–€ 뢀뢄이 μ΄μƒν•œμ§€ μ„€λͺ…ν•©λ‹ˆλ‹€. 거짓 정보λ₯Ό λ°œμ–Έν•˜μ§€ μ•Šλ„λ‘ μ£Όμ˜ν•©λ‹ˆλ‹€.

Evaluation

  • Evaluation details such as testing data, metrics are written in github.
  • Without system prompt used in training phase, KULLM would show lower performance than expect.

Results

Citation

@misc{kullm,
  author = {NLP & AI Lab and Human-Inspired AI research},
  title = {KULLM: Korea University Large Language Model Project},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nlpai-lab/kullm}},
}
Downloads last month
55
GGUF
Model size
10.7B params
Architecture
llama

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference API
Unable to determine this model’s pipeline type. Check the docs .

Model tree for QuantFactory/KULLM3-GGUF

Quantized
(24)
this model