---
base_model: unsloth/phi-4-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
---
# Uploaded model
- **Developed by:** Omarrran
- **License:** apache-2.0
- **Finetuned from model:** unsloth/phi-4-unsloth-bnb-4bit
# Fine-tuned Phi-4 Model Documentation
## 📌 Introduction
This documentation provides an in-depth overview of the **fine-tuned Phi-4 conversational AI model**, detailing its **training methodology, parameters, dataset, model deployment, and usage instructions**.
## 🔹 Model Overview
**Phi-4** is a transformer-based language model optimized for **natural language understanding and text generation**. We have fine-tuned it using **LoRA (Low-Rank Adaptation)** with the **Unsloth framework**, making it lightweight and efficient while preserving the base model's capabilities.
## 🔹 Training Details
### **🛠 Fine-tuning Methodology**
We employed **LoRA (Low-Rank Adaptation)** for fine-tuning, which significantly reduces the number of trainable parameters while retaining the model’s expressive power.
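The core idea can be sketched in a few lines of plain Python: the frozen weight matrix `W` is augmented with a low-rank update `B @ A`, and only `A` and `B` are trained. This is a toy illustration of the mechanism, not the actual training code; the matrix sizes and values below are made up.

```python
# Minimal LoRA sketch: W stays frozen; only A (r x d_in) and B (d_out x r)
# are trainable, so the layer gains just r * (d_in + d_out) new parameters.
r, d_in, d_out = 2, 4, 4  # toy sizes; the real run uses r=16 on much larger layers
W = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]  # frozen
A = [[0.1] * d_in for _ in range(r)]    # trainable down-projection
B = [[0.1] * r for _ in range(d_out)]   # trainable up-projection
alpha = 2.0                             # LoRA alpha; the update is scaled by alpha / r

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def lora_forward(x):
    # y = W x + (alpha / r) * B (A x)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + (alpha / r) * u for b, u in zip(base, update)]

print(lora_forward([1.0, 2.0, 3.0, 4.0]))
```

During fine-tuning, gradients flow only through `A` and `B`; at inference the update can be merged back into `W` at no extra cost.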
### **📑 Dataset Used**
- **Dataset Name**: `mlabonne/FineTome-100k`
- **Dataset Size**: 100,000 examples
- **Data Format**: Conversational AI dataset with structured prompts and responses.
- **Preprocessing**: The dataset was standardized using `unsloth.chat_templates.standardize_sharegpt()`
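Conceptually, the standardization step maps ShareGPT-style records (`"from"`/`"value"` keys) onto the role/content message format that chat templates expect. The snippet below is a hypothetical re-implementation of that core mapping for illustration only; the actual `standardize_sharegpt()` helper handles more cases.

```python
# Hypothetical sketch of what standardize_sharegpt() does at its core:
# ShareGPT turn keys -> OpenAI-style role/content messages.
ROLE_MAP = {"human": "user", "gpt": "assistant", "system": "system"}

def standardize(example):
    return {
        "messages": [
            {"role": ROLE_MAP[turn["from"]], "content": turn["value"]}
            for turn in example["conversations"]
        ]
    }

sample = {"conversations": [
    {"from": "human", "value": "Hello!"},
    {"from": "gpt", "value": "Hi, how can I help?"},
]}
print(standardize(sample))
```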
### **🔢 Training Parameters**
| Parameter | Value |
|----------------------|-------|
| LoRA Rank (`r`) | 16 |
| LoRA Alpha | 16 |
| LoRA Dropout | 0 |
| Target Modules | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Max Sequence Length | 2048 |
| Load in 4-bit | True |
| Gradient Checkpointing | `unsloth` |
| Fine-tuning Duration | **10 epochs** |
| Optimizer Used | AdamW |
| Learning Rate | 2e-5 |
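Under these settings the trainable-parameter budget is easy to estimate: each adapted weight of shape `(d_out, d_in)` gains `r * (d_in + d_out)` LoRA parameters. A back-of-the-envelope calculation, using an illustrative hidden size rather than Phi-4's actual per-module shapes:

```python
# LoRA parameter count per adapted layer: A is (r, d_in), B is (d_out, r).
def lora_params(d_in, d_out, r=16):
    return r * (d_in + d_out)

d = 4096                            # illustrative hidden size, not Phi-4's real shapes
full = d * d                        # dense weight parameters (frozen)
lora = lora_params(d, d, r=16)      # trainable LoRA parameters
print(full, lora, lora / full)      # LoRA trains well under 1% of the layer
```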
## 🔹 How to Load the Model
To load the fine-tuned model, use the **Unsloth framework**:
```python
from unsloth import FastLanguageModel

model_name = "unsloth/Phi-4"
max_seq_length = 2048
load_in_4bit = True

# Load the 4-bit base model and its tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    load_in_4bit=load_in_4bit,
)

# Attach the LoRA adapter (same configuration as used during fine-tuning)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
)
```
## 🔹 Deploying the Model
### **🚀 Using Google Colab**
1. Install dependencies:
```bash
pip install gradio transformers torch unsloth peft
```
2. Load the model using the script above.
3. Run inference using the chatbot interface.
### **🚀 Deploy on Hugging Face Spaces**
1. Save the script as `app.py`.
2. Create a `requirements.txt` file with:
```
gradio
transformers
torch
unsloth
peft
```
3. Upload the files to a new **Hugging Face Space**.
4. Select **Python environment** and click **Deploy**.
## 🔹 Using the Model
### **🗨 Chatbot Interface (Gradio UI)**
To interact with the fine-tuned model using **Gradio**, use:
```python
import gradio as gr
from unsloth import FastLanguageModel

# Switch the model into Unsloth's faster inference mode
FastLanguageModel.for_inference(model)

def chat_with_model(user_input):
    inputs = tokenizer(user_input, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=200)
    return tokenizer.decode(output[0], skip_special_tokens=True)

demo = gr.Interface(
    fn=chat_with_model,
    inputs=gr.Textbox(label="Your Message"),
    outputs=gr.Textbox(label="Chatbot's Response"),
    title="LoRA-Enhanced Phi-4 Chatbot",
)

demo.launch()
```
## 📌 Conclusion
This **fine-tuned Phi-4 model** delivers **optimized conversational AI capabilities** using **LoRA fine-tuning and Unsloth’s 4-bit quantization**. The model is **lightweight, memory-efficient**, and suitable for chatbot applications in both **research and production environments**.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)