---
library_name: transformers
tags: [Text Generation, Question-Answering]
inference: false
---
# Uploaded model
- **Developed by:** YuvrajSingh9886
- **License:** apache-2.0
- **Finetuned from model:** unsloth/phi-3-mini-4k-instruct-bnb-4bit
It is a fine-tuned version of Microsoft's Phi-2 model, trained on the [Alpaca-Cleaned-52k](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset.
## Uses
With appropriate changes to the generation config passed to the `model.generate()` function, the model can produce higher-quality output, making it a possible starting point for developing a health-counseling chatbot.
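For example, a minimal sketch of overriding the decoding settings via `transformers.GenerationConfig`; the specific values here are illustrative assumptions, not the card's tuned configuration, and `model` and `tokens` are built as shown in the How to Get Started section below:

```python
from transformers import GenerationConfig

# Illustrative decoding settings (assumptions), not the author's tuned values.
gen_config = GenerationConfig(
    max_new_tokens=512,
    num_beams=5,
    no_repeat_ngram_size=2,
    early_stopping=True,
)

# `model` and `tokens` are created as shown in "How to Get Started" below.
outputs = model.generate(**tokens, generation_config=gen_config)
```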
## Bias, Risks, and Limitations
The model was developed as a proof-of-concept type hobby project and is not intended to be used without careful consideration of its implications.
## How to Get Started with the Model
Use the code below to get started with the model.
### Load the model with the bitsandbytes library
Install the required packages first:
```bash
pip install transformers accelerate bitsandbytes
```
#### Load model from Hugging Face Hub with model name and bitsandbytes configuration
```python
from typing import Tuple

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig


def load_model_tokenizer(model_name: str, bnb_config: BitsAndBytesConfig) -> Tuple[AutoModelForCausalLM, AutoTokenizer]:
    """
    Load the model and tokenizer from the Hugging Face Hub using quantization.

    Args:
        model_name (str): The name of the model on the Hub.
        bnb_config (BitsAndBytesConfig): The bitsandbytes quantization configuration.

    Returns:
        Tuple[AutoModelForCausalLM, AutoTokenizer]: The model and tokenizer.
    """
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        # device_map="auto",
        torch_dtype="auto",
        trust_remote_code=True,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    tokenizer.pad_token = tokenizer.eos_token
    return model, tokenizer


# Quantization settings (see "Training Hyperparameters" below for details)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "YuvrajSingh9886/medicinal-QnA-phi2-custom"
model, tokenizer = load_model_tokenizer(model_name, bnb_config)
```
```python
prompt = ("I have been feeling more and more down for over a month. "
          "I have started having trouble sleeping due to panic attacks, "
          "but they are almost never triggered by something that I know of.")

tokens = tokenizer(f"### Question: {prompt}", return_tensors="pt").to("cuda")

# Beam search with repetition control
outputs = model.generate(
    **tokens,
    max_new_tokens=1024,
    num_beams=5,
    no_repeat_ngram_size=2,
    early_stopping=True,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
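Since the model continues the `### Question:` prompt, the answer portion can be split out of the decoded text. A small sketch, assuming the model reproduces the `### Answer:` marker it was trained with (see Preprocessing below):

```python
decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

# Assumes the model emits the "### Answer:" marker from the training format.
answer = decoded.split("### Answer:")[-1].strip()
print(answer)
```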
## Training Details
### Training Data
The model was fine-tuned on the [Alpaca-Cleaned-52k](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset.
#### Hardware
- Epochs: 10
- GPU: 1x RTX 4090 (24 GB VRAM)
- CPU/RAM: 8 vCPUs, 48 GB RAM
- Disk: 40 GB
### Training Procedure
The model was fine-tuned with QLoRA: the Phi-2 base model was loaded from the Hugging Face Hub in 4-bit precision via bitsandbytes, and LoRA adapters were trained on top of the quantized weights.
#### Preprocessing
```python
def format_phi2(row):
    """Build a single training string from a dataset row."""
    question = row['Context']
    answer = row['Response']
    # text = f"[INST] {question} [/INST] {answer}".replace('\xa0', ' ')
    text = f"### Question: {question}\n ### Answer: {answer}"
    return text
```
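As a hedged illustration of how this formatter plugs into the `datasets` library; the tiny in-memory rows below are made up for demonstration, not taken from the training data:

```python
from datasets import Dataset

# Toy rows with the columns format_phi2 expects (demonstration only).
dataset = Dataset.from_dict({
    "Context": ["I have been feeling more and more down for over a month."],
    "Response": ["It sounds like you may be dealing with a persistent low mood."],
})

# Add a single 'text' column for supervised fine-tuning.
dataset = dataset.map(lambda row: {"text": format_phi2(row)})
print(dataset[0]["text"])
```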
#### Training Hyperparameters
LoRA config-
```bash
# LoRA attention dimension (int)
lora_r = 64
# Alpha parameter for LoRA scaling (int)
lora_alpha = 16
# Dropout probability for LoRA layers (float)
lora_dropout = 0.05
# Bias (string)
bias = "none"
# Task type (string)
task_type = "CAUSAL_LM"
# Random seed (int)
seed = 33
```
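A minimal sketch mapping these values onto `peft`'s `LoraConfig`; the card does not list which layers the adapters were attached to, so `target_modules` is left to `peft`'s defaults here:

```python
from peft import LoraConfig
from transformers import set_seed

set_seed(33)  # Random seed from the config above

# target_modules is omitted (assumption): the card does not list the
# layers the LoRA adapters were attached to.
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```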
Training config-
```bash
# Batch size per GPU for training (int)
per_device_train_batch_size = 6
# Number of update steps to accumulate the gradients for (int)
gradient_accumulation_steps = 2
# Initial learning rate (AdamW optimizer) (float)
learning_rate = 2e-4
# Optimizer to use (string)
optim = "paged_adamw_8bit"
# Number of training epochs (int)
num_train_epochs = 4
# Linear warmup steps from 0 to learning_rate (int)
warmup_steps = 10
# Enable fp16/bf16 training (set bf16 to True with an A100) (bool)
fp16 = True
# Log every X updates steps (int)
logging_steps = 100
# L2 regularization, prevents overfitting (float)
weight_decay = 0.0
# Checkpoint save strategy (string)
save_strategy = "epoch"
```
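These map directly onto `transformers.TrainingArguments`; a sketch, with `output_dir` as a placeholder since the card does not specify one:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./phi2-qlora-checkpoints",  # placeholder path (assumption)
    per_device_train_batch_size=6,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    num_train_epochs=4,
    warmup_steps=10,
    fp16=True,
    logging_steps=100,
    weight_decay=0.0,
    save_strategy="epoch",
)
```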
BnB config-
```bash
# Activate 4-bit precision base model loading (bool)
load_in_4bit = True
# Activate nested quantization for 4-bit base models (double quantization) (bool)
bnb_4bit_use_double_quant = True
# Quantization type (fp4 or nf4) (string)
bnb_4bit_quant_type = "nf4"
# Compute data type for 4-bit base models
bnb_4bit_compute_dtype = torch.bfloat16
```
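A hedged sketch of how the pieces above might be assembled with TRL's `SFTTrainer` (the card mentions TRL but does not show the exact call, and argument names differ across TRL versions; the signature below matches older releases where `dataset_text_field` and `tokenizer` were passed directly). `model`, `tokenizer`, `dataset`, `peft_config`, and `training_args` come from the earlier sketches:

```python
from trl import SFTTrainer

# Assumption: an older TRL signature; newer versions move
# dataset_text_field into SFTConfig and rename tokenizer to
# processing_class.
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # the column built in Preprocessing above
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()
```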
### Results
- Training loss: 2.229
- Validation loss: 2.223
## More Information
[Phi-2](https://huggingface.co/microsoft/phi-2)
## Model Card Authors
[YuvrajSingh9886](https://huggingface.co/YuvrajSingh9886)
This Phi model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)