---
base_model: unsloth/phi-4-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
---

# Uploaded model

- **Developed by:** Omarrran
- **License:** apache-2.0
- **Finetuned from model:** unsloth/phi-4-unsloth-bnb-4bit

# Fine-tuned Phi-4 Model Documentation

## 📌 Introduction
This documentation provides an in-depth overview of the **fine-tuned Phi-4 conversational AI model**, covering its **training methodology, training parameters, dataset, deployment, and usage**.

## 🔹 Model Overview
**Phi-4** is a transformer-based language model optimized for **natural language understanding and text generation**. We have fine-tuned it using **LoRA (Low-Rank Adaptation)** with the **Unsloth framework**, making it lightweight and efficient while preserving the base model's capabilities.

## 🔹 Training Details
### **🛠 Fine-tuning Methodology**
We employed **LoRA (Low-Rank Adaptation)** for fine-tuning, which significantly reduces the number of trainable parameters while retaining the model’s expressive power.
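
For intuition: LoRA freezes each pretrained weight matrix and learns a low-rank update, so only the two small factor matrices are trained:

$$
W' = W + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k}
$$

With rank $r = 16$ (see the table below), each adapted projection trains only $r(d + k)$ parameters instead of the full $dk$.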

### **📑 Dataset Used**
- **Dataset Name**: `mlabonne/FineTome-100k`
- **Dataset Size**: 100,000 examples
- **Data Format**: Conversational AI dataset with structured prompts and responses.
- **Preprocessing**: The dataset was standardized using `unsloth.chat_templates.standardize_sharegpt()`
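
For reference, the preprocessing step looks like this (a minimal sketch; it assumes the `datasets` library and the ShareGPT-style `conversations` schema used by FineTome):

```python
from datasets import load_dataset
from unsloth.chat_templates import standardize_sharegpt

dataset = load_dataset("mlabonne/FineTome-100k", split="train")
# Normalizes ShareGPT-style {"from", "value"} turns into {"role", "content"}
dataset = standardize_sharegpt(dataset)
```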

### **🔢 Training Parameters**
| Parameter             | Value |
|----------------------|-------|
| LoRA Rank (`r`)     | 16    |
| LoRA Alpha          | 16    |
| LoRA Dropout        | 0     |
| Target Modules      | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Max Sequence Length | 2048  |
| Load in 4-bit       | True  |
| Gradient Checkpointing | `unsloth` |
| Fine-tuning Duration | **10 epochs** |
| Optimizer Used      | AdamW |
| Learning Rate       | 2e-5  |
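
The full training script is not included in this card. The sketch below shows how these hyperparameters might be wired together with TRL's `SFTTrainer`, following the usual Unsloth notebook pattern; the batch-size settings and the `phi-4` chat-template name are assumptions, not values from the card.

```python
# Minimal training sketch: `dataset` comes from the preprocessing snippet above;
# `model` and `tokenizer` come from the loading script in the next section.
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth.chat_templates import get_chat_template

# Assumption: Unsloth's "phi-4" chat template matches the one used in training
tokenizer = get_chat_template(tokenizer, chat_template="phi-4")

def to_text(batch):
    # Render each conversation into a single training string
    return {"text": [tokenizer.apply_chat_template(conv, tokenize=False)
                     for conv in batch["conversations"]]}

dataset = dataset.map(to_text, batched=True)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        num_train_epochs=10,
        learning_rate=2e-5,
        optim="adamw_torch",
        per_device_train_batch_size=2,   # assumption: not stated in the card
        gradient_accumulation_steps=4,   # assumption: not stated in the card
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```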

## 🔹 How to Load the Model
To load the fine-tuned model, use the **Unsloth framework**:

```python
from unsloth import FastLanguageModel

model_name = "unsloth/Phi-4"
max_seq_length = 2048
load_in_4bit = True

# Load the 4-bit base model and its tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    load_in_4bit=load_in_4bit,
)

# Attach LoRA adapters with the same configuration used during fine-tuning
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
)
```
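
Note that `get_peft_model` attaches freshly initialized adapters, which reproduces the training configuration but not the trained weights. To run the fine-tuned model itself, skip that step and load the saved adapter on top of the base model. A minimal sketch, where the repo id is a placeholder for wherever the adapter was saved:

```python
from peft import PeftModel

# `model` is the base model from FastLanguageModel.from_pretrained above
# (without the get_peft_model call); the repo id below is a placeholder.
model = PeftModel.from_pretrained(model, "Omarrran/<adapter-repo>")
```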

## 🔹 Deploying the Model
### **🚀 Using Google Colab**
1. Install dependencies:
    ```bash
    pip install gradio transformers torch unsloth peft
    ```
2. Load the model using the script above.
3. Run inference using the chatbot interface.

### **🚀 Deploy on Hugging Face Spaces**
1. Save the script as `app.py`.
2. Create a `requirements.txt` file with:
    ```
    gradio
    transformers
    torch
    unsloth
    peft
    ```
3. Upload the files to a new **Hugging Face Space**.
4. Choose the **Gradio** SDK when creating the Space; Hugging Face will build and launch the app automatically.
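
Note: Phi-4 has roughly 14B parameters, so even in 4-bit it needs on the order of 10 GB of VRAM. Choose GPU hardware for the Space; the free CPU tier will not run it at a usable speed.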

## 🔹 Using the Model
### **🗨 Chatbot Interface (Gradio UI)**
To interact with the fine-tuned model using **Gradio**, use:

```python
import gradio as gr

# Enable Unsloth's fast inference mode
# (`model`, `tokenizer`, and FastLanguageModel come from the loading script above)
FastLanguageModel.for_inference(model)

def chat_with_model(user_input):
    # Format the message with the model's chat template
    messages = [{"role": "user", "content": user_input}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=200)
    # Decode only the newly generated tokens, not the prompt
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

demo = gr.Interface(
    fn=chat_with_model,
    inputs=gr.Textbox(label="Your Message"),
    outputs=gr.Textbox(label="Chatbot's Response"),
    title="LoRA-Enhanced Phi-4 Chatbot",
)

demo.launch()
```
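
When running in Colab, `demo.launch(share=True)` generates a temporary public URL so the chatbot can be tested from any browser.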

## 📌 Conclusion
This **fine-tuned Phi-4 model** delivers **optimized conversational AI capabilities** using **LoRA fine-tuning and Unsloth’s 4-bit quantization**. The model is **lightweight, memory-efficient**, and suitable for chatbot applications in both **research and production environments**.



[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)