---
base_model: unsloth/phi-4-unsloth-bnb-4bit
library_name: peft
---
# Model Card for Vijayendra/Phi4-MedQA

Phi4-MedQA is a PEFT adapter fine-tuned from `unsloth/phi-4-unsloth-bnb-4bit` to answer medical questions clearly and in detail, in complete sentences.
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** Vijayendra
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Causal language model with a PEFT adapter, fine-tuned for medical question answering
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** unsloth/phi-4-unsloth-bnb-4bit
### Model Sources [optional]
<!-- Provide the basic links for the model. -->
- **Repository:** https://huggingface.co/Vijayendra/Phi4-MedQA
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## How to Use
The snippet below installs the required libraries, loads the 4-bit quantized model with Unsloth, and answers a sample medical question.

```python
# Install required libraries
!pip install unsloth peft bitsandbytes accelerate transformers

# Import the Unsloth loader
from unsloth import FastLanguageModel

# Define the MedQA prompt
medqa_prompt = """You are a medical QA system. Answer the following medical question clearly and in detail with complete sentences.
### Question:
{}
### Answer:
"""

# Load the model and tokenizer with Unsloth
model_name = "Vijayendra/Phi4-MedQA"
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=2048,
    dtype=None,          # Use default precision
    load_in_4bit=True,   # Enable 4-bit quantization
    device_map="auto",   # Automatically map the model to available devices
)

# Enable faster inference
FastLanguageModel.for_inference(model)

# Prepare the medical question
medical_question = "What are the common symptoms of diabetes?"  # Replace with your own question
inputs = tokenizer(
    [medqa_prompt.format(medical_question)],
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=1024,
).to("cuda")  # Move inputs to the GPU

# Generate the answer
outputs = model.generate(
    **inputs,
    max_new_tokens=512,  # Allow for detailed responses
    use_cache=True,      # Speeds up generation
)

# Decode the response and extract the generated answer
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
answer_text = response.split("### Answer:")[1].strip() if "### Answer:" in response else response.strip()

print(f"Question: {medical_question}")
print(f"Answer: {answer_text}")
```
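If you prefer not to use Unsloth, here is a minimal sketch that loads the adapter with PEFT and Transformers directly. It assumes the repository stores a standard PEFT adapter (with tokenizer files) referencing the `unsloth/phi-4-unsloth-bnb-4bit` base model, and it reuses `medqa_prompt` from the snippet above.

```python
# Alternative loading sketch: PEFT + Transformers, without Unsloth.
# Assumes the repo contains a standard PEFT adapter and tokenizer files.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, BitsAndBytesConfig

model_id = "Vijayendra/Phi4-MedQA"

model = AutoPeftModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)  # if tokenizer files are missing here, load from the base model instead

prompt = medqa_prompt.format("What are the common symptoms of diabetes?")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```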
### Framework versions
- PEFT 0.14.0