File size: 5,548 Bytes
a1f00af
1b392a9
a1f00af
 
 
1b392a9
 
a1f00af
1b392a9
 
 
 
 
 
a1f00af
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1b392a9
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233

---
library_name: transformers
tags: [Text Generation, Question-Answering]
inference: false
---


# Uploaded  model

- **Developed by:** YuvrajSingh9886
- **License:** apache-2.0
- **Finetuned from model :** unsloth/phi-3-mini-4k-instruct-bnb-4bit



<!-- Provide a quick summary of what the model is/does. -->

It's a fine-tuned version of Phi-2 model by Microsoft on [Alpaca-Cleaned-52k](yahma/alpaca-cleaned).


## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
The above model, with applicable changes to the generation_config file, passed to model.generate() function can lead to the generation of better results which could then be used for Health Counseling Chatbot dev.



## Bias, Risks, and Limitations

The model was developed as a proof-of-concept type hobby project and is not intended to be used without careful consideration of its implications.

[More Information Needed]


## How to Get Started with the Model

Use the code below to get started with the model.

### Load in the model using the BitsandBytes library

```python
pip install bitsandbytes
```

#### Load model from Hugging Face Hub with model name and bitsandbytes configuration

```python

def load_model_tokenizer(model_name: str, bnb_config: BitsAndBytesConfig) -> Tuple[AutoModelForCausalLM, AutoTokenizer]:
    """
    Load the model and tokenizer from the HuggingFace model hub using quantization.

    Args:
        model_name (str): The name of the model.
        bnb_config (BitsAndBytesConfig): The quantization configuration of BitsAndBytes.

    Returns:
        Tuple[AutoModelForCausalLM, AutoTokenizer]: The model and tokenizer.
    """


    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config = bnb_config,
        # device_map = "auto",
        torch_dtype="auto",
        trust_remote_code=True
    )

    tokenizer = AutoTokenizer.from_pretrained(model_name, use_auth_token = True, trust_remote_code=True)

    tokenizer.pad_token = tokenizer.eos_token

    return model, tokenizer


bnb_config = BitsAndBytesConfig(
        load_in_4bit = load_in_4bit,
        bnb_4bit_use_double_quant = bnb_4bit_use_double_quant,
        bnb_4bit_quant_type = bnb_4bit_quant_type,
        bnb_4bit_compute_dtype = bnb_4bit_compute_dtype,
    )

model, tokenizer = load_model_tokenizer(model_name, bnb_config)

```

```python

new_model = "YuvrajSingh9886/medicinal-QnA-phi2-custom"

prompt = "I have been feeling more and more down for over a month. I have started having trouble sleeping due to panic attacks, but they are almost never triggered by something that I know of."

tokens = tokenizer(f"### Question: {prompt}", return_tensors='pt').to('cuda')
tokenizer.pad_token = tokenizer.eos_token
outputs = model.generate(**tokens, max_new_tokens=1024, num_beams=5,
    no_repeat_ngram_size=2,
    early_stopping=True
                         )
print(tokenizer.batch_decode(outputs,skip_special_tokens=True)[0])

```

## Training Details

### Training Data


#### Hardware

Epcohs: 10
Hardware: (1) RTX 4090 (24GB VRAM) 48GB 8vCPU (RAM)
          Hard Disk: 40GB


[More Information Needed]

### Training Procedure

QLoRA was used for quantization purposes.

Phi-2 model from Huggingface with BitsandBytes support


#### Preprocessing [optional]

```python

def format_phi2(row):
    question = row['Context']
    answer = row['Response']

#     text = f"[INST] {question} [/INST] {answer}".replace('\xa0', ' ')
    text = f"### Question: {question}\n ### Answer: {answer}"

    return text
```

#### Training Hyperparameters


LoRA config-
```bash
# LoRA attention dimension (int)
lora_r = 64

# Alpha parameter for LoRA scaling (int)
lora_alpha = 16

# Dropout probability for LoRA layers (float)
lora_dropout = 0.05

# Bias (string)
bias = "none"

# Task type (string)
task_type = "CAUSAL_LM"

# Random seed (int)
seed = 33
```

Phi-2 config-

```bash
# Batch size per GPU for training (int)
per_device_train_batch_size = 6

# Number of update steps to accumulate the gradients for (int)
gradient_accumulation_steps = 2

# Initial learning rate (AdamW optimizer) (float)
learning_rate = 2e-4

# Optimizer to use (string)
optim = "paged_adamw_8bit"

# Number of training epochs (int)
num_train_epochs = 4

# Linear warmup steps from 0 to learning_rate (int)
warmup_steps = 10

# Enable fp16/bf16 training (set bf16 to True with an A100) (bool)
fp16 = True

# Log every X updates steps (int)
logging_steps = 100

#L2 regularization(prevents overfitting)
weight_decay=0.0

#Checkpoint saves
save_strategy="epoch"
```

BnB config

```bash
# Activate 4-bit precision base model loading (bool)
load_in_4bit = True

# Activate nested quantization for 4-bit base models (double quantization) (bool)
bnb_4bit_use_double_quant = True

# Quantization type (fp4 or nf4) (string)
bnb_4bit_quant_type = "nf4"

# Compute data type for 4-bit base models
bnb_4bit_compute_dtype = torch.bfloat16

```

### Results

Training loss: 2.229
Validation loss: 2.223


## More Information [optional]

[Phi-2](https://huggingface.co/microsoft/phi-2)

## Model Card Authors [optional]

[YuvrajSingh9886](https://huggingface.co/YuvrajSingh9886)


This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)