---
base_model: unsloth/phi-4-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
---

# Uploaded model

- **Developed by:** Omarrran
- **License:** apache-2.0
- **Fine-tuned from model:** unsloth/phi-4-unsloth-bnb-4bit

# Fine-tuned Phi-4 Model Documentation

## 📌 Introduction

This documentation provides an in-depth overview of the **fine-tuned Phi-4 conversational AI model**, detailing its **training methodology, parameters, dataset, deployment, and usage**.

## 🔹 Model Overview

**Phi-4** is a transformer-based language model optimized for **natural language understanding and text generation**. We fine-tuned it using **LoRA (Low-Rank Adaptation)** with the **Unsloth framework**, keeping it lightweight and memory-efficient while preserving the base model's capabilities.

## 🔹 Training Details

### **🛠 Fine-tuning Methodology**

We employed **LoRA (Low-Rank Adaptation)** for fine-tuning: the base weights stay frozen and only small low-rank adapter matrices are trained, which significantly reduces the number of trainable parameters while retaining the model's expressive power.

### **📑 Dataset Used**

- **Dataset Name**: `mlabonne/FineTome-100k`
- **Dataset Size**: 100,000 examples
- **Data Format**: Conversational dataset with structured prompts and responses.
- **Preprocessing**: The dataset was standardized using `unsloth.chat_templates.standardize_sharegpt()` (see the preprocessing sketch after the deployment section below).

### **🔢 Training Parameters**

| Parameter              | Value |
|------------------------|-------|
| LoRA Rank (`r`)        | 16 |
| LoRA Alpha             | 16 |
| LoRA Dropout           | 0 |
| Target Modules         | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Max Sequence Length    | 2048 |
| Load in 4-bit          | True |
| Gradient Checkpointing | `unsloth` |
| Training Epochs        | 10 |
| Optimizer              | AdamW |
| Learning Rate          | 2e-5 |

## 🔹 How to Load the Model

To load the model with the **Unsloth framework**:

```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

model_name = "unsloth/Phi-4"
max_seq_length = 2048
load_in_4bit = True

# Load the base model and tokenizer in 4-bit
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    load_in_4bit=load_in_4bit,
)

# Attach a LoRA adapter with the same configuration used for training.
# Note: this creates a *fresh* adapter; to use the fine-tuned weights,
# pass this repository's checkpoint to from_pretrained instead.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
)

# Apply the Phi-4 chat template so prompts match the training format
tokenizer = get_chat_template(tokenizer, chat_template="phi-4")
```

## 🔹 Deploying the Model

### **🚀 Using Google Colab**

1. Install the dependencies:

   ```bash
   pip install gradio transformers torch unsloth peft
   ```

2. Load the model using the script above.
3. Run inference through the chatbot interface below.

### **🚀 Deploy on Hugging Face Spaces**

1. Save the inference script as `app.py`.
2. Create a `requirements.txt` file containing:

   ```
   gradio
   transformers
   torch
   unsloth
   peft
   ```

3. Upload both files to a new **Hugging Face Space**.
4. Select the **Gradio** SDK; the Space builds and deploys automatically.
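### **🧹 Preprocessing Sketch**

As referenced in the dataset section above, this is a minimal, hedged sketch of how `standardize_sharegpt()` is typically used to normalize a ShareGPT-style dataset such as `mlabonne/FineTome-100k` into plain `role`/`content` messages; the exact preprocessing used for this model may differ, and `format_example` is an illustrative helper, not part of Unsloth.

```python
from datasets import load_dataset
from unsloth.chat_templates import standardize_sharegpt

# Load the conversational dataset (100k ShareGPT-style examples)
dataset = load_dataset("mlabonne/FineTome-100k", split="train")

# Convert ShareGPT fields ({"from": ..., "value": ...}) into the
# standard {"role": ..., "content": ...} message format
dataset = standardize_sharegpt(dataset)

# Render each conversation to text with the tokenizer's chat template
# (assumes `tokenizer` from the loading script, with the Phi-4 template applied)
def format_example(example):
    text = tokenizer.apply_chat_template(
        example["conversations"], tokenize=False, add_generation_prompt=False
    )
    return {"text": text}

dataset = dataset.map(format_example)
```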
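### **💬 Quick Inference Check**

Before wiring up a UI, you can sanity-check generation directly. This is a minimal sketch that assumes `model` and `tokenizer` come from the loading script above:

```python
from unsloth import FastLanguageModel

# Switch Unsloth into its optimized inference mode
FastLanguageModel.for_inference(model)

messages = [{"role": "user", "content": "Explain LoRA in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True))
```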
## πŸ”Ή Using the Model ### **πŸ—¨ Chatbot Interface (Gradio UI)** To interact with the fine-tuned model using **Gradio**, use: ```python import gradio as gr def chat_with_model(user_input): inputs = tokenizer(user_input, return_tensors="pt") output = model.generate(**inputs, max_length=200) response = tokenizer.decode(output[0], skip_special_tokens=True) return response demo = gr.Interface( fn=chat_with_model, inputs=gr.Textbox(label="Your Message"), outputs=gr.Textbox(label="Chatbot's Response"), title="LoRA-Enhanced Phi-4 Chatbot" ) demo.launch() ``` ## πŸ“Œ Conclusion This **fine-tuned Phi-4 model** delivers **optimized conversational AI capabilities** using **LoRA fine-tuning and Unsloth’s 4-bit quantization**. The model is **lightweight, memory-efficient**, and suitable for chatbot applications in both **research and production environments**. [](https://github.com/unslothai/unsloth)