# MedQA — Qwen3-1.7B LoRA Fine-tuned on MedMCQA

Clinical question-answering AI fine-tuned on MedMCQA. Built on an AMD Instinct MI300X via ROCm — no CUDA required.
## Model Details

- Base model: Qwen/Qwen3-1.7B
- Fine-tuning: LoRA (r=4, target modules: q_proj + v_proj)
- Dataset: openlifescienceai/medmcqa (193k clinical MCQs)
- Hardware: AMD Instinct MI300X (192 GB HBM3)
- Precision: bfloat16 (ROCm native)
- Trainable: ~2.2M of 1.7B parameters (~0.13%)
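
For reference, a LoRA configuration matching the settings above might look like this. This is a sketch using PEFT's `LoraConfig`; the dropout value is an assumption (not stated in this card), and the actual training script lives in the linked repository:

```python
from peft import LoraConfig

# LoRA settings matching the card: rank 4, alpha 16,
# adapters only on the attention query and value projections.
lora_config = LoraConfig(
    r=4,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,  # assumed; not stated in the card
    bias="none",
    task_type="CAUSAL_LM",
)
```

Restricting the adapter to q_proj and v_proj at rank 4 is what keeps the trainable parameter count in the low millions.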
## What It Does

Given a clinical multiple-choice question with 4 options, the model selects the correct answer and explains its reasoning.
Example input:

```text
### Question:
First-line treatment for hypertensive emergency?
### Options:
A) Oral amlodipine
B) IV labetalol or IV nitroprusside
C) Sublingual nifedipine
D) IM hydralazine
### Answer:
```
Example output:

```text
B) IV labetalol or IV nitroprusside

Explanation:
Hypertensive emergencies require immediate IV therapy.
Labetalol is a combined alpha and beta blocker that rapidly
reduces blood pressure safely. Nitroprusside is a vasodilator
used when faster or more precise control is needed.
```
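
The prompt format above is easy to construct programmatically. A minimal sketch (`build_prompt` is an illustrative helper, not part of the released code):

```python
def build_prompt(question: str, options: list[str]) -> str:
    """Format a question and its four options into the model's prompt template."""
    letters = "ABCD"
    option_lines = [f"{letters[i]}) {opt}" for i, opt in enumerate(options)]
    return (
        "### Question:\n"
        f"{question}\n"
        "### Options:\n"
        + "\n".join(option_lines)
        + "\n### Answer:\n"
    )

prompt = build_prompt(
    "First-line treatment for hypertensive emergency?",
    [
        "Oral amlodipine",
        "IV labetalol or IV nitroprusside",
        "Sublingual nifedipine",
        "IM hydralazine",
    ],
)
print(prompt)
```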
## How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

BASE_MODEL = "Qwen/Qwen3-1.7B"
ADAPTER_REPO = "HK2184/medqa-qwen3-lora"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Load the LoRA adapter and merge it into the base weights for faster inference.
model = PeftModel.from_pretrained(base, ADAPTER_REPO)
model = model.merge_and_unload()
model.eval()

prompt = """### Question:
First-line treatment for hypertensive emergency?
### Options:
A) Oral amlodipine
B) IV labetalol or IV nitroprusside
C) Sublingual nifedipine
D) IM hydralazine
### Answer:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.3,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, not the echoed prompt.
new = out[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new, skip_special_tokens=True))
```
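
For automated evaluation, you can pull the predicted option letter out of the generated text. A minimal sketch (`extract_choice` is a hypothetical helper, not part of the released eval script):

```python
import re

def extract_choice(generated: str) -> "str | None":
    """Return the first standalone option letter A-D (followed by ')') in the output."""
    match = re.search(r"\b([A-D])\)", generated)
    return match.group(1) if match else None

print(extract_choice("B) IV labetalol or IV nitroprusside"))  # -> B
```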
## Training Details

- Framework: PyTorch + Hugging Face Transformers + PEFT + TRL
- LoRA rank: r=4, alpha=16
- Batch size: 4
- Learning rate: 1e-4
- Epochs: 1
- Max length: 128 tokens
- Samples: 500 from the MedMCQA train split
- Training time: ~5 minutes on an AMD MI300X
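
As a quick sanity check on the numbers above: 500 samples at batch size 4 for one epoch implies only 125 optimizer steps (assuming no gradient accumulation), which is consistent with the ~5-minute training time:

```python
import math

samples = 500   # MedMCQA subset used for training
batch_size = 4
epochs = 1

# Steps per epoch, rounding up for the final partial batch.
steps = math.ceil(samples / batch_size) * epochs
print(steps)  # -> 125
```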
## AMD ROCm Notes

Trained entirely on AMD hardware using ROCm 7.2. Key insight: bfloat16 is stable on the MI300X, while fp16 caused gradient-norm explosions (NaN) during LoRA training.

Environment variables used: `ROCR_VISIBLE_DEVICES=0 HIP_VISIBLE_DEVICES=0 HSA_OVERRIDE_GFX_VERSION=9.4.2`
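
For a single-GPU run, these can be exported before launching your training script (a sketch; `train.py` stands in for whichever script you run):

```shell
# Pin the run to GPU 0 and state the MI300X ISA explicitly (gfx942 -> 9.4.2).
export ROCR_VISIBLE_DEVICES=0
export HIP_VISIBLE_DEVICES=0
export HSA_OVERRIDE_GFX_VERSION=9.4.2

python train.py
```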
## Live Demo

Try it without any setup: https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/MedQA-Medical-AI-on-AMD-ROCm
## Repository

Full training code, eval script, and Gradio app: https://github.com/HK2184/MedQA-Medical-AI-on-AMD-ROCm
## Dataset

MedMCQA — 193,000+ medical multiple-choice questions drawn from Indian medical entrance exams (AIIMS and NEET PG). https://huggingface.co/datasets/openlifescienceai/medmcqa
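
Each MedMCQA record stores its four options in separate fields (`opa`–`opd`) with the correct option index in `cop`. A minimal sketch of turning one record into a training example in this model's prompt format (`record_to_example` is illustrative, and the sample below is a made-up row, not real dataset content):

```python
def record_to_example(rec: dict) -> str:
    """Convert one MedMCQA-style row into a prompt plus gold answer string."""
    letters = "ABCD"
    options = [rec["opa"], rec["opb"], rec["opc"], rec["opd"]]
    option_lines = "\n".join(f"{letters[i]}) {o}" for i, o in enumerate(options))
    answer = f"{letters[rec['cop']]}) {options[rec['cop']]}"
    return (
        f"### Question:\n{rec['question']}\n"
        f"### Options:\n{option_lines}\n"
        f"### Answer:\n{answer}"
    )

# Illustrative record (not a real dataset row):
sample = {
    "question": "First-line treatment for hypertensive emergency?",
    "opa": "Oral amlodipine",
    "opb": "IV labetalol or IV nitroprusside",
    "opc": "Sublingual nifedipine",
    "opd": "IM hydralazine",
    "cop": 1,
}
print(record_to_example(sample))
```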
## Authors

Harikrishna Sivanand Iyer and Srijan Sivaram A. Built for the AMD Hackathon on lablab.ai.
## License

MIT — free to use, modify, and build on.