Model Card: vi-gemma-2b-RAG

Model Description:

vi-gemma-2b-RAG is a large language model fine-tuned from the base model google/gemma-1.1-2b-it using LoRA. The model is trained on a Vietnamese dataset to improve its Vietnamese language processing capabilities and enhance its performance for Retrieval Augmented Generation (RAG) tasks.

Intended Use:

The vi-gemma-2b-RAG model is suitable for tasks such as:

  • Question answering over a provided Vietnamese context.
  • Vietnamese text summarization.
  • Vietnamese machine translation.
  • Other Vietnamese text generation tasks.

Limitations:

While fine-tuned for Vietnamese, vi-gemma-2b-RAG may still have some limitations:

  • It may generate incorrect or misleading information.
  • It may exhibit biases or inappropriate opinions.
  • Its performance may be affected by the quality of the input data.

How to Use:

Below are some code snippets to help you get started quickly with the model. First run pip install -U transformers, then copy the snippet from the section relevant to your use case.

We recommend torch.bfloat16 as the default dtype.
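
bfloat16 is not available on every device, so it can help to check for support before loading the model. The sketch below shows one possible policy, which is an assumption and not a recommendation from the model authors: use bfloat16 when a CUDA GPU reports support for it, and fall back to float32 otherwise. `choose_dtype_name` is a hypothetical helper, and the commented lines show how its inputs could come from PyTorch.

```python
def choose_dtype_name(cuda_available: bool, bf16_supported: bool) -> str:
    """Return the name of a torch dtype to load the model with.

    Assumed policy (not from the model card): prefer bfloat16 when the GPU
    supports it, otherwise fall back to float32.
    """
    if cuda_available and bf16_supported:
        return "bfloat16"
    return "float32"


# With PyTorch installed, the flags could be obtained like this:
#   import torch
#   cuda = torch.cuda.is_available()
#   name = choose_dtype_name(cuda, cuda and torch.cuda.is_bf16_supported())
#   dtype = getattr(torch, name)  # pass as torch_dtype=dtype
print(choose_dtype_name(True, True))
print(choose_dtype_name(False, False))
```

float32 uses twice the memory per parameter of bfloat16, so the fallback trades memory for compatibility.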

# pip install transformers torch accelerate

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Initialize the tokenizer and model from the saved checkpoint
tokenizer = AutoTokenizer.from_pretrained("himmeow/vi-gemma-2b-RAG")
model = AutoModelForCausalLM.from_pretrained(
    "himmeow/vi-gemma-2b-RAG",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# No explicit model.to("cuda") is needed: device_map="auto" already places
# the model on a GPU when one is available

# Define the prompt format for the model
prompt = """
### Instruction and Input:
Based on the following context/document:
{}
Please answer the question: {}

### Response:
{}
"""

# Prepare the input data
input_data = """
Short Tandem Repeats (STRs) are short (2-6 nucleotides) repeating DNA sequences that are widespread in the human genome. These sequences are highly polymorphic in nature, which makes STRs very important genetic markers in human gene mapping and the diagnosis of hereditary diseases, as well as in identification in the field of forensics.
STRs have become popular in forensic laboratories because the replication and analysis of STRs require only very small amounts of DNA; even when the DNA is degraded, identification can still be performed successfully. Furthermore, the detection and assessment of sample DNA contamination in specimens can be quickly resolved with STR analysis results. In the United States today, the set of 13 markers has been increased to 20 main markers, which are used to create a nationwide DNA database called The FBI Combined DNA Index System (Expanded CODIS).
CODIS and similar DNA databases are being used very successfully in linking DNA records from criminals and crime scene evidence. STR identification results are also used to support hundreds of thousands of paternity test cases each year.
"""
query = "What are some of the properties of STRs used for?"

# Format the input text (the response slot is left blank for the model to fill)
input_text = prompt.format(input_data, query, " ")

# Tokenize the input text
inputs = tokenizer(input_text, return_tensors="pt")

# Move the input tensors to the same device as the model
inputs = inputs.to(model.device)

# Generate text using the model
outputs = model.generate(
    **inputs,
    max_new_tokens=500,  # Limit the number of tokens generated
    no_repeat_ngram_size=5,  # Prevent any 5-gram from being repeated
    # do_sample=True,  # Sample the next token from the probability distribution instead of decoding greedily
    # temperature=0.7,  # Adjust the randomness; only used when do_sample=True
    # early_stopping=True,  # Only affects beam search (num_beams > 1)
)
# Decode and print the results
print(tokenizer.decode(outputs[0]))

To query the model in Vietnamese, use the Vietnamese prompt template, context, and query below in place of the English ones:

prompt = """
### Instruction and Input:
Dựa vào ngữ cảnh/tài liệu sau:
{}
Hãy trả lời câu hỏi: {}

### Response:
{}
"""

input_data = """
Short Tandem Repeats (STRs) là các trình tự DNA lặp lại ngắn (2- 6 nucleotides) xuất hiện phổ biến trong hệ gen của con người. Các trình tự này có tính đa hình rất cao trong tự nhiên, điều này khiến các STRs trở thành những markers di truyền rất quan trọng trong nghiên cứu bản đồ gen người và chuẩn đoán bệnh lý di truyền cũng như xác định danh tính trong lĩnh vực pháp y.
Các STRs trở nên phổ biến tại các phòng xét nghiệm pháp y bởi vì việc nhân bản và phân tích STRs chỉ cần lượng DNA rất thấp ngay cả khi ở dạng bị phân hủy việc định danh vẫn có thể được thực hiện thành công. Hơn nữa việc phát hiện và đánh giá sự nhiễm DNA mẫu trong các mẫu vật có thể được giải quyết nhanh với kết quả phân tích STRs. Ở Hoa Kỳ hiện nay, từ bộ 13 markers nay đã tăng lên 20 markers chính đang được sử dụng để tạo ra một cơ sở dữ liệu DNA trên toàn đất nước được gọi là The FBI Combined DNA Index System (Expanded CODIS).
CODIS và các cơ sở dữ liệu DNA tương tự đang được sử dụng thực sự thành công trong việc liên kết các hồ sơ DNA từ các tội phạm và các bằng chứng hiện trường vụ án. Kết quả định danh STRs cũng được sử dụng để hỗ trợ hàng trăm nghìn trường hợp xét nghiệm huyết thống cha con mỗi năm
"""
query = "Hãy cho tôi biết một số tính chất của STRs được dùng để làm gì?"

With these Vietnamese inputs, the script prints the following (the echoed context is omitted here and shown as [...]):

'''
<bos>
### Instruction and Input:
Dựa vào ngữ cảnh/tài liệu sau:
[...]
Hãy trả lời câu hỏi: Hãy cho tôi biết một số tính chất của STRs được dùng để làm gì?

### Response:
 
STRs được sử dụng để xác định danh tính, chuẩn đoán bệnh lý và xác định bệnh lý di truyền.
<eos>
'''

The response translates to: "STRs are used for identification, disease diagnosis, and the identification of hereditary diseases."
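
For a causal language model, `generate` returns the prompt tokens followed by the newly generated tokens, so the decoded string contains the whole prompt before the answer. The sketch below pulls out only the answer by splitting on the `### Response:` marker from the prompt template. `extract_answer` and the `sample` string are hypothetical and made up for illustration; they are not part of the model's API.

```python
def extract_answer(decoded: str, marker: str = "### Response:") -> str:
    """Return the text that follows the last response marker."""
    # Everything before the final marker is the echoed prompt.
    # If the marker is missing, rpartition leaves the whole string in `tail`.
    _, _, tail = decoded.rpartition(marker)
    # Drop special tokens that were decoded into the string.
    for token in ("<bos>", "<eos>"):
        tail = tail.replace(token, "")
    return tail.strip()


# Hypothetical decoded output, shaped like the prompt template in this card
sample = (
    "<bos>\n### Instruction and Input:\n"
    "Based on the following context/document:\nsome context\n"
    "Please answer the question: What are STRs used for?\n\n"
    "### Response:\n \nSTRs are used for identification.\n<eos>"
)
print(extract_answer(sample))
```

Two alternatives avoid string handling: passing `skip_special_tokens=True` to `tokenizer.decode` removes the special tokens at decode time, and slicing the prompt tokens off `outputs[0]` before decoding leaves only the generated part.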


Training:

  • Base Model: google/gemma-1.1-2b-it
  • Dataset: lamhieu/mabrycodes_dialogue_vi
  • Fine-tuning Method: LoRA, PEFT and Unsloth

Example usage repository: https://github.com/Martincrux/Vietnamese-RAG-system-building-with-vi-gemma-2b-RAG-and-halong_embedding
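
In a RAG setup, the context slot of the prompt is filled with passages retrieved for the question. The sketch below illustrates that step with the English prompt template from this card. It is not taken from the linked repository: `rank_by_overlap` is a toy word-overlap ranker standing in for a real embedding-based retriever, and both helper names and the `docs` list are hypothetical.

```python
import re

PROMPT = """
### Instruction and Input:
Based on the following context/document:
{}
Please answer the question: {}

### Response:
{}
"""


def words(text: str) -> set[str]:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def rank_by_overlap(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Toy retriever: rank passages by how many question words they share."""
    q_words = words(question)
    ranked = sorted(passages, key=lambda p: len(q_words & words(p)), reverse=True)
    return ranked[:top_k]


def build_rag_prompt(question: str, passages: list[str], top_k: int = 2) -> str:
    """Fill the prompt template with the top-ranked passages and the question."""
    context = "\n".join(rank_by_overlap(question, passages, top_k))
    # The response slot is left blank for the model to complete.
    return PROMPT.format(context, question, " ")


docs = [
    "STRs are short repeating DNA sequences.",
    "Hanoi is the capital of Vietnam.",
    "CODIS is a DNA database used in forensics.",
]
prompt_text = build_rag_prompt("What are STRs?", docs, top_k=1)
print(prompt_text)
```

The resulting string can be tokenized and passed to `model.generate` in the same way as `input_text` in the usage example.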

Uploaded model

This Gemma model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Model size: 2.51B params (Safetensors). Tensor type: BF16.