---
license: mit
tags:
  - llama
  - text-generation
  - instruction-following
  - llama-2
  - lora
  - peft
  - trl
  - sft
---

# Llama-2-7b-chat-finetune

This model is a fine-tuned version of [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf), trained on the [mlabonne/guanaco-llama2-1k](https://huggingface.co/datasets/mlabonne/guanaco-llama2-1k) dataset using LoRA (Low-Rank Adaptation) with the PEFT library and the `SFTTrainer` from TRL.

## Model Description

This model is intended for text generation and instruction-following tasks. It was fine-tuned on a dataset of 1,000 instruction-following examples.

## Intended Uses & Limitations

This model can be used for a variety of text generation tasks, including:

- Generating creative text formats such as poems, code, scripts, musical pieces, emails, and letters.
- Answering open-ended, challenging, or unusual questions in an informative way.
- Following instructions and completing requests thoughtfully.

Limitations:

- The model may generate biased or harmful content.
- The model may not follow all instructions perfectly.
- The model may produce text that is not factually accurate.

## Training and Fine-tuning

This model was fine-tuned with the following parameters (a reproduction sketch follows the list):

- LoRA attention dimension (`lora_r`): 64
- Alpha parameter for LoRA scaling (`lora_alpha`): 16
- Dropout probability for LoRA layers (`lora_dropout`): 0.1
- 4-bit precision base model loading (`use_4bit`): True
- Number of training epochs (`num_train_epochs`): 1
- Batch size per GPU for training (`per_device_train_batch_size`): 4
- Learning rate (`learning_rate`): 2e-4

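Below is a minimal sketch of how a comparable run could be set up with `peft` and `trl`, based on the parameters above. The exact `SFTTrainer` keyword arguments vary across TRL versions, and anything not listed above (the 4-bit quantization type, the output directory, and the dataset's `text` field) is an assumption rather than a confirmed setting.

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer

base_model = "NousResearch/Llama-2-7b-chat-hf"
dataset = load_dataset("mlabonne/guanaco-llama2-1k", split="train")

# 4-bit quantized loading of the base model (use_4bit=True).
# The nf4 quant type and fp16 compute dtype are assumptions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# LoRA configuration matching the parameters listed above.
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./results",  # assumed; not stated in the card
    num_train_epochs=1,
    per_device_train_batch_size=4,
    learning_rate=2e-4,
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # guanaco-llama2-1k stores prompts in "text"
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()
```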
## How to Use

You can use this model with the following code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "chaitanya42/Llama-2-7b-chat-finetune"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Llama-2-chat expects prompts wrapped in [INST] ... [/INST] tags.
prompt = "What is a large language model?"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f"[INST] {prompt} [/INST]")
print(result[0]["generated_text"])
```
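If this repository hosts only the LoRA adapter weights rather than a merged model, the adapter can instead be attached to the base model with PEFT. This is an assumption about the repository layout, not a confirmed detail:

```python
# Assumption: the repo contains only LoRA adapter weights. If the weights
# are already merged, the pipeline example above is sufficient.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("NousResearch/Llama-2-7b-chat-hf")
model = PeftModel.from_pretrained(base, "chaitanya42/Llama-2-7b-chat-finetune")
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf")
```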