---
datasets:
- vicgalle/alpaca-gpt4
base_model:
- mistralai/Mistral-7B-v0.3
---

## Model Fine-Tuning

This model was fine-tuned using supervised fine-tuning (SFT) with the following key configuration details:

### Pre-trained Model
- **Model**: `mistralai/Mistral-7B-v0.3`
- **Tokenizer**: The tokenizer that ships with `mistralai/Mistral-7B-v0.3`.

### Dataset
- **Dataset**: `vicgalle/alpaca-gpt4`
- **Subset Size**: First 1,000 examples.

### Training Parameters
- **Epochs**: 1
- **Batch Size**: 8 (with gradient accumulation of 2)
- **Learning Rate**: 2e-4
- **Optimizer**: `paged_adamw_8bit`
- **Weight Decay**: 0.001
- **Max Grad Norm**: 0.3
- **Warm-up Ratio**: 0.3
- **Gradient Accumulation**: 2 steps
- **FP16/BF16**: Disabled
- **Max Steps**: -1 (training stops once the dataset is exhausted)
- **Scheduler**: Linear learning rate scheduler
- **Monitoring**: Weights & Biases (`wandb`)

### PEFT Configuration
The model uses **LoRA (Low-Rank Adaptation)** for efficient fine-tuning with the following configuration:
- **lora_alpha**: 8
- **lora_dropout**: 0.1
- **r**: 16
- **Bias**: "none"
- **Task Type**: `CAUSAL_LM`
- **Target Modules**: `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj"]`

### Quantization Configuration
The base model is quantized to **4-bit** using `BitsAndBytes` to reduce memory usage during training and inference:
- **Load in 4-bit**: Yes
- **Quantization Type**: `nf4`
- **Compute Data Type**: `float16`

A sketch showing how the quantization, LoRA, and training settings fit together is given in the Configuration Sketch section at the end of this card.

### Training Environment
- **Platform**: Google Colab (GPU enabled)
- **Monitoring**: Weights & Biases (W&B)

### Model Saving and Upload
- **Fine-tuned Model**: Saved to the Hugging Face repository `nicksnlp/shrimp`.
- **Tokenizer**: Pushed to the same Hugging Face repository.

### Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the fine-tuned LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.3")
model = PeftModel.from_pretrained(base_model, "nicksnlp/shrimp")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.3")

# Example inference
prompt = "What is Newton's 3rd Law and its formula?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

A variant that loads the base model in 4-bit for inference is sketched at the end of this card.

### Code references:
- [Aarne Talman, aarnetalman: supervised_finetuning.ipynb](https://github.com/Helsinki-NLP/LLM-course-2024/tree/main/week-4/supervised_finetuning.ipynb)
- [Brett Young, byyoung3: Fine-Tuning-Mistral-7B-on-Python-Code-With-A-Single-GPU](https://wandb.ai/byyoung3/ml-news/reports/Fine-Tuning-Mistral-7B-on-Python-Code-With-A-Single-GPU---Vmlldzo1NTg0NzY5)
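
### Configuration Sketch

The original training notebook is not reproduced in this card. The snippet below is a minimal sketch of how the settings listed above (4-bit NF4 quantization, the LoRA configuration, and the training hyperparameters) could be wired together with `transformers`, `peft`, `trl`, and `bitsandbytes`. It assumes a `trl` version whose `SFTTrainer` still accepts `TrainingArguments`, `tokenizer`, and `dataset_text_field` directly; the variable names, `output_dir`, `device_map`, pad-token handling, and the use of the dataset's pre-formatted `text` field are assumptions, not the author's original code.

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer

# 4-bit NF4 quantization, as described in "Quantization Configuration"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.3",
    quantization_config=bnb_config,
    device_map="auto",  # assumption: single-GPU Colab placement
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.3")
tokenizer.pad_token = tokenizer.eos_token  # assumption: reuse EOS as padding token

# First 1,000 examples, as described in "Dataset"
dataset = load_dataset("vicgalle/alpaca-gpt4", split="train[:1000]")

# LoRA settings from "PEFT Configuration"
peft_config = LoraConfig(
    lora_alpha=8,
    lora_dropout=0.1,
    r=16,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj"],
)

# Hyperparameters from "Training Parameters"
training_args = TrainingArguments(
    output_dir="./results",  # assumption: output path is not stated in the card
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    weight_decay=0.001,
    max_grad_norm=0.3,
    warmup_ratio=0.3,
    fp16=False,
    bf16=False,
    max_steps=-1,
    lr_scheduler_type="linear",
    report_to="wandb",
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumption: train on the dataset's formatted prompt column
    tokenizer=tokenizer,
)
trainer.train()

# Push the adapter and tokenizer, as described in "Model Saving and Upload"
trainer.model.push_to_hub("nicksnlp/shrimp")
tokenizer.push_to_hub("nicksnlp/shrimp")
```

The combination of NF4 4-bit weights, the `paged_adamw_8bit` optimizer, and LoRA adapters on the listed projection modules is what keeps the fine-tune within a single Colab GPU: only the small adapter weights are trained and later pushed to `nicksnlp/shrimp`.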
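
### 4-bit Inference Variant

The Example Usage section loads the base model at full precision. Since the card notes that 4-bit quantization also reduces memory at inference time, the following is a minimal sketch of attaching the adapter to a 4-bit base model instead; `device_map="auto"` and `max_new_tokens` are assumptions, not part of the original card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Same 4-bit NF4 settings as in "Quantization Configuration"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load the quantized base model, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.3",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "nicksnlp/shrimp")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.3")

prompt = "What is Newton's 3rd Law and its formula?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```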