---
datasets:
- vicgalle/alpaca-gpt4
base_model:
- mistralai/Mistral-7B-v0.3
---

## Model Fine-Tuning

This model was fine-tuned using supervised fine-tuning (SFT) with the following key configuration details:

### Pre-trained Model
- **Model**: `mistralai/Mistral-7B-v0.3`
- **Tokenizer**: The tokenizer that ships with `mistralai/Mistral-7B-v0.3`.

### Dataset
- **Dataset**: `vicgalle/alpaca-gpt4`
- **Subset Size**: First 1,000 examples.

### Training Parameters
- **Epochs**: 1
- **Batch Size**: 8 (with gradient accumulation of 2)
- **Learning Rate**: 2e-4
- **Optimizer**: `paged_adamw_8bit`
- **Weight Decay**: 0.001
- **Max Grad Norm**: 0.3
- **Warm-up Ratio**: 0.3
- **Gradient Accumulation**: 2 steps
- **FP16/BF16**: Disabled
- **Max Steps**: -1 (training stops once the dataset is exhausted)
- **Scheduler**: Linear learning rate scheduler
- **Monitoring**: Weights & Biases (`wandb`)

### PEFT Configuration
The model uses **LoRA (Low-Rank Adaptation)** for efficient fine-tuning with the following configuration:
- **lora_alpha**: 8
- **lora_dropout**: 0.1
- **r**: 16
- **Bias**: "none"
- **Task Type**: `CAUSAL_LM`
- **Target Modules**: `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj"]`

### Quantization Configuration
The base model is quantized to **4-bit** using `BitsAndBytes` to reduce memory usage during training and inference:
- **Load in 4-bit**: Yes
- **Quantization Type**: `nf4`
- **Compute Data Type**: `float16`

A sketch showing how the quantization, LoRA, and training settings fit together is given in the Configuration Sketch section at the end of this card.

### Training Environment
- **Platform**: Google Colab (GPU enabled)
- **Monitoring**: Weights & Biases (W&B)

### Model Saving and Upload
- **Fine-tuned Model**: Saved to the Hugging Face repository `nicksnlp/shrimp`.
- **Tokenizer**: Pushed to the same Hugging Face repository.

### Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the fine-tuned LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.3")
model = PeftModel.from_pretrained(base_model, "nicksnlp/shrimp")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.3")

# Example inference
prompt = "What is Newton's 3rd Law and its formula?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

A variant that loads the base model in 4-bit for inference is sketched at the end of this card.

### Code references:
- [Aarne Talman, aarnetalman: supervised_finetuning.ipynb](https://github.com/Helsinki-NLP/LLM-course-2024/tree/main/week-4/supervised_finetuning.ipynb)
- [Brett Young, byyoung3: Fine-Tuning-Mistral-7B-on-Python-Code-With-A-Single-GPU](https://wandb.ai/byyoung3/ml-news/reports/Fine-Tuning-Mistral-7B-on-Python-Code-With-A-Single-GPU---Vmlldzo1NTg0NzY5)
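
### Configuration Sketch

The original training notebook is not reproduced in this card. The snippet below is a minimal sketch of how the settings listed above (4-bit NF4 quantization, the LoRA configuration, and the training hyperparameters) could be wired together with `transformers`, `peft`, `trl`, and `bitsandbytes`. It assumes a `trl` version whose `SFTTrainer` still accepts `TrainingArguments`, `tokenizer`, and `dataset_text_field` directly; the variable names, `output_dir`, `device_map`, pad-token handling, and the use of the dataset's pre-formatted `text` field are assumptions, not the author's original code.

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer

# 4-bit NF4 quantization, as described in "Quantization Configuration"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.3",
    quantization_config=bnb_config,
    device_map="auto",  # assumption: single-GPU Colab placement
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.3")
tokenizer.pad_token = tokenizer.eos_token  # assumption: reuse EOS as padding token

# First 1,000 examples, as described in "Dataset"
dataset = load_dataset("vicgalle/alpaca-gpt4", split="train[:1000]")

# LoRA settings from "PEFT Configuration"
peft_config = LoraConfig(
    lora_alpha=8,
    lora_dropout=0.1,
    r=16,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj"],
)

# Hyperparameters from "Training Parameters"
training_args = TrainingArguments(
    output_dir="./results",  # assumption: output path is not stated in the card
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    weight_decay=0.001,
    max_grad_norm=0.3,
    warmup_ratio=0.3,
    fp16=False,
    bf16=False,
    max_steps=-1,
    lr_scheduler_type="linear",
    report_to="wandb",
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumption: train on the dataset's formatted prompt column
    tokenizer=tokenizer,
)
trainer.train()

# Push the adapter and tokenizer, as described in "Model Saving and Upload"
trainer.model.push_to_hub("nicksnlp/shrimp")
tokenizer.push_to_hub("nicksnlp/shrimp")
```

The combination of NF4 4-bit weights, the `paged_adamw_8bit` optimizer, and LoRA adapters on the listed projection modules is what keeps the fine-tune within a single Colab GPU: only the small adapter weights are trained and later pushed to `nicksnlp/shrimp`.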
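
### 4-bit Inference Variant

The Example Usage section loads the base model at full precision. Since the card notes that 4-bit quantization also reduces memory at inference time, the following is a minimal sketch of attaching the adapter to a 4-bit base model instead; `device_map="auto"` and `max_new_tokens` are assumptions, not part of the original card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Same 4-bit NF4 settings as in "Quantization Configuration"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load the quantized base model, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.3",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "nicksnlp/shrimp")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.3")

prompt = "What is Newton's 3rd Law and its formula?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```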