base_model: unsloth/phi-4-unsloth-bnb-4bit
library_name: peft
Model Card for Model ID
Model Details
Model Description
- Developed by: [More Information Needed]
- Funded by [optional]: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Model type: [More Information Needed]
- Language(s) (NLP): [More Information Needed]
- License: [More Information Needed]
- Finetuned from model [optional]: [More Information Needed]
Model Sources [optional]
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]
How to Use
Install required libraries
!pip install unsloth peft bitsandbytes accelerate transformers
Import necessary modules
# Import unsloth first so its patches apply before transformers is loaded
from unsloth import FastLanguageModel
from transformers import AutoTokenizer
Define the MedQA prompt
medqa_prompt = """You are a medical QA system. Answer the following medical question clearly and in detail with complete sentences.
### Question:
{}
### Answer:
"""
Load the model and tokenizer using unsloth
model_name = "Vijayendra/Phi4-MedQA"  # Replace with your Hugging Face model name
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=2048,
    dtype=None,          # Use default precision
    load_in_4bit=True,   # Enable 4-bit quantization
    device_map="auto"    # Automatically map model to available devices
)
Enable faster inference
FastLanguageModel.for_inference(model)
Prepare the medical question
medical_question = "What are the common symptoms of diabetes?"  # Replace with your medical question
inputs = tokenizer(
    [medqa_prompt.format(medical_question)],
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=1024
).to("cuda")  # Ensure inputs are on the GPU
Generate the output
outputs = model.generate(
    **inputs,
    max_new_tokens=512,  # Allow for detailed responses
    use_cache=True       # Speeds up generation
)
Decode and clean the response
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
Extract and print the generated answer
answer_text = response.split("### Answer:")[1].strip() if "### Answer:" in response else response.strip()
print(f"Question: {medical_question}")
print(f"Answer: {answer_text}")
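The prompt formatting and answer extraction steps above can be checked without a GPU. The following is a minimal sketch, assuming the prompt uses `### Question:` and `### Answer:` markers; `build_prompt`, `extract_answer`, and `ANSWER_MARKER` are hypothetical helper names introduced here for illustration, and the decoded string is a stand-in for real model output.

```python
# Assumed marker; it must match the marker used in the prompt template.
ANSWER_MARKER = "### Answer:"


def build_prompt(question: str) -> str:
    """Fill a MedQA-style prompt template (hypothetical helper)."""
    return (
        "You are a medical QA system. Answer the following medical question "
        "clearly and in detail with complete sentences.\n"
        "### Question:\n"
        f"{question}\n"
        f"{ANSWER_MARKER}\n"
    )


def extract_answer(decoded: str, marker: str = ANSWER_MARKER) -> str:
    """Return the text after the first marker, or the whole text if it is absent."""
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()


prompt = build_prompt("What are the common symptoms of diabetes?")
# Stand-in for tokenizer.decode(...): the decoded text echoes the prompt
# and then continues with the generated answer.
decoded = prompt + "<generated text would appear here>"
print(extract_answer(decoded))
```

Because the decoded output repeats the prompt, splitting on the marker only works when the template and the extraction step agree on the exact marker string; keeping it in one constant avoids a silent mismatch.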
Framework versions
- PEFT 0.14.0