---
library_name: transformers
tags: [Text Generation, Question-Answering]
inference: false
---
# Uploaded model
- **Developed by:** YuvrajSingh9886
- **License:** apache-2.0
- **Finetuned from model:** unsloth/phi-3-mini-4k-instruct-bnb-4bit
It is a fine-tuned version of Microsoft's Phi-2 model, trained on the [Alpaca-Cleaned-52k](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset.
## Uses
With appropriate changes to the generation config passed to the `model.generate()` function, the model can produce higher-quality output, making it a possible starting point for developing a health-counseling chatbot.
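For example, a minimal sketch of overriding the decoding settings via `transformers.GenerationConfig`; the specific values here are illustrative assumptions, not the card's tuned configuration, and `model` and `tokens` are built as shown in the How to Get Started section below:

```python
from transformers import GenerationConfig

# Illustrative decoding settings (assumptions), not the author's tuned values.
gen_config = GenerationConfig(
    max_new_tokens=512,
    num_beams=5,
    no_repeat_ngram_size=2,
    early_stopping=True,
)

# `model` and `tokens` are created as shown in "How to Get Started" below.
outputs = model.generate(**tokens, generation_config=gen_config)
```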
## Bias, Risks, and Limitations
The model was developed as a proof-of-concept type hobby project and is not intended to be used without careful consideration of its implications.
## How to Get Started with the Model
Use the code below to get started with the model.
### Load the model with the bitsandbytes library
Install the required packages first:
```bash
pip install transformers accelerate bitsandbytes
```
#### Load model from Hugging Face Hub with model name and bitsandbytes configuration
```python
from typing import Tuple

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig


def load_model_tokenizer(model_name: str, bnb_config: BitsAndBytesConfig) -> Tuple[AutoModelForCausalLM, AutoTokenizer]:
    """
    Load the model and tokenizer from the Hugging Face Hub using quantization.

    Args:
        model_name (str): The name of the model on the Hub.
        bnb_config (BitsAndBytesConfig): The bitsandbytes quantization configuration.

    Returns:
        Tuple[AutoModelForCausalLM, AutoTokenizer]: The model and tokenizer.
    """
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        # device_map="auto",
        torch_dtype="auto",
        trust_remote_code=True,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    tokenizer.pad_token = tokenizer.eos_token
    return model, tokenizer


# Quantization settings (see "Training Hyperparameters" below for details)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "YuvrajSingh9886/medicinal-QnA-phi2-custom"
model, tokenizer = load_model_tokenizer(model_name, bnb_config)
```
```python
prompt = ("I have been feeling more and more down for over a month. "
          "I have started having trouble sleeping due to panic attacks, "
          "but they are almost never triggered by something that I know of.")

tokens = tokenizer(f"### Question: {prompt}", return_tensors="pt").to("cuda")

# Beam search with repetition control
outputs = model.generate(
    **tokens,
    max_new_tokens=1024,
    num_beams=5,
    no_repeat_ngram_size=2,
    early_stopping=True,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
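Since the model continues the `### Question:` prompt, the answer portion can be split out of the decoded text. A small sketch, assuming the model reproduces the `### Answer:` marker it was trained with (see Preprocessing below):

```python
decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

# Assumes the model emits the "### Answer:" marker from the training format.
answer = decoded.split("### Answer:")[-1].strip()
print(answer)
```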
## Training Details
### Training Data
The model was fine-tuned on the [Alpaca-Cleaned-52k](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset.
#### Hardware
- Epochs: 10
- GPU: 1x RTX 4090 (24 GB VRAM)
- CPU/RAM: 8 vCPUs, 48 GB RAM
- Disk: 40 GB
### Training Procedure
The model was fine-tuned with QLoRA: the Phi-2 base model was loaded from the Hugging Face Hub in 4-bit precision via bitsandbytes, and LoRA adapters were trained on top of the quantized weights.
#### Preprocessing
```python
def format_phi2(row):
    """Build a single training string from a dataset row."""
    question = row['Context']
    answer = row['Response']
    # text = f"[INST] {question} [/INST] {answer}".replace('\xa0', ' ')
    text = f"### Question: {question}\n ### Answer: {answer}"
    return text
```
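As a hedged illustration of how this formatter plugs into the `datasets` library; the tiny in-memory rows below are made up for demonstration, not taken from the training data:

```python
from datasets import Dataset

# Toy rows with the columns format_phi2 expects (demonstration only).
dataset = Dataset.from_dict({
    "Context": ["I have been feeling more and more down for over a month."],
    "Response": ["It sounds like you may be dealing with a persistent low mood."],
})

# Add a single 'text' column for supervised fine-tuning.
dataset = dataset.map(lambda row: {"text": format_phi2(row)})
print(dataset[0]["text"])
```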
#### Training Hyperparameters
LoRA config-
```bash
# LoRA attention dimension (int)
lora_r = 64
# Alpha parameter for LoRA scaling (int)
lora_alpha = 16
# Dropout probability for LoRA layers (float)
lora_dropout = 0.05
# Bias (string)
bias = "none"
# Task type (string)
task_type = "CAUSAL_LM"
# Random seed (int)
seed = 33
```
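A minimal sketch mapping these values onto `peft`'s `LoraConfig`; the card does not list which layers the adapters were attached to, so `target_modules` is left to `peft`'s defaults here:

```python
from peft import LoraConfig
from transformers import set_seed

set_seed(33)  # Random seed from the config above

# target_modules is omitted (assumption): the card does not list the
# layers the LoRA adapters were attached to.
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```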
Training config-
```bash
# Batch size per GPU for training (int)
per_device_train_batch_size = 6
# Number of update steps to accumulate the gradients for (int)
gradient_accumulation_steps = 2
# Initial learning rate (AdamW optimizer) (float)
learning_rate = 2e-4
# Optimizer to use (string)
optim = "paged_adamw_8bit"
# Number of training epochs (int)
num_train_epochs = 4
# Linear warmup steps from 0 to learning_rate (int)
warmup_steps = 10
# Enable fp16/bf16 training (set bf16 to True with an A100) (bool)
fp16 = True
# Log every X updates steps (int)
logging_steps = 100
# L2 regularization, prevents overfitting (float)
weight_decay = 0.0
# Checkpoint save strategy (string)
save_strategy = "epoch"
```
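These map directly onto `transformers.TrainingArguments`; a sketch, with `output_dir` as a placeholder since the card does not specify one:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./phi2-qlora-checkpoints",  # placeholder path (assumption)
    per_device_train_batch_size=6,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    num_train_epochs=4,
    warmup_steps=10,
    fp16=True,
    logging_steps=100,
    weight_decay=0.0,
    save_strategy="epoch",
)
```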
BnB config-
```bash
# Activate 4-bit precision base model loading (bool)
load_in_4bit = True
# Activate nested quantization for 4-bit base models (double quantization) (bool)
bnb_4bit_use_double_quant = True
# Quantization type (fp4 or nf4) (string)
bnb_4bit_quant_type = "nf4"
# Compute data type for 4-bit base models
bnb_4bit_compute_dtype = torch.bfloat16
```
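A hedged sketch of how the pieces above might be assembled with TRL's `SFTTrainer` (the card mentions TRL but does not show the exact call, and argument names differ across TRL versions; the signature below matches older releases where `dataset_text_field` and `tokenizer` were passed directly). `model`, `tokenizer`, `dataset`, `peft_config`, and `training_args` come from the earlier sketches:

```python
from trl import SFTTrainer

# Assumption: an older TRL signature; newer versions move
# dataset_text_field into SFTConfig and rename tokenizer to
# processing_class.
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # the column built in Preprocessing above
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()
```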
### Results
- Training loss: 2.229
- Validation loss: 2.223
## More Information
[Phi-2](https://huggingface.co/microsoft/phi-2)
## Model Card Authors
[YuvrajSingh9886](https://huggingface.co/YuvrajSingh9886)
This Phi model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)