MedQA — Qwen3-1.7B LoRA Fine-tuned on MedMCQA

Clinical question-answering AI fine-tuned on MedMCQA. Built on AMD Instinct MI300X via ROCm — no CUDA required.

Model Details

  • Base Model : Qwen/Qwen3-1.7B
  • Fine-tuning : LoRA (r=4, target: q_proj + v_proj)
  • Dataset : openlifescienceai/medmcqa (193k clinical MCQs)
  • Hardware : AMD Instinct MI300X (192GB HBM3)
  • Precision : bfloat16 (ROCm native)
  • Trainable : ~2.2M of 1.7B parameters (0.15%)

What It Does

Given a clinical multiple-choice question with 4 options, the model selects the correct answer and explains its reasoning.

Example input:

### Question:
First-line treatment for hypertensive emergency?

### Options:
A) Oral amlodipine
B) IV labetalol or IV nitroprusside
C) Sublingual nifedipine
D) IM hydralazine

### Answer:

Example output:

B) IV labetalol or IV nitroprusside

Explanation:
Hypertensive emergencies require immediate IV therapy.
Labetalol is a combined alpha- and beta-blocker that rapidly
and safely reduces blood pressure. Nitroprusside is a vasodilator
used when faster or more precise control is needed.
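The prompt layout above can be produced programmatically. The helper below is a hypothetical sketch (not part of the released code) that renders a question and four options into the exact `### Question / ### Options / ### Answer` format the model was fine-tuned on:

```python
# Hypothetical helper: render a MedMCQA-style record into the
# prompt format used during fine-tuning.
def build_prompt(question: str, options: list[str]) -> str:
    letters = "ABCD"
    lines = ["### Question:", question, "", "### Options:"]
    for letter, option in zip(letters, options):
        lines.append(f"{letter}) {option}")
    lines += ["", "### Answer:", ""]
    return "\n".join(lines)

prompt = build_prompt(
    "First-line treatment for hypertensive emergency?",
    ["Oral amlodipine",
     "IV labetalol or IV nitroprusside",
     "Sublingual nifedipine",
     "IM hydralazine"],
)
print(prompt)
```

Keeping the inference-time prompt byte-identical to the training format matters for small LoRA adapters, since the model has only seen this one template.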

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

BASE_MODEL   = "Qwen/Qwen3-1.7B"
ADAPTER_REPO = "HK2184/medqa-qwen3-lora"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)
tokenizer.pad_token    = tokenizer.eos_token
tokenizer.padding_side = "left"

base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(base, ADAPTER_REPO)
model = model.merge_and_unload()
model.eval()

prompt = """### Question:
First-line treatment for hypertensive emergency?

### Options:
A) Oral amlodipine
B) IV labetalol or IV nitroprusside
C) Sublingual nifedipine
D) IM hydralazine

### Answer:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.3,
        pad_token_id=tokenizer.eos_token_id,
    )
new = out[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new, skip_special_tokens=True))

Training Details

  • Framework : PyTorch + HuggingFace Transformers + PEFT + TRL
  • LoRA rank : r=4, alpha=16
  • Batch size : 4
  • Learning rate: 1e-4
  • Epochs : 1
  • Max length : 128 tokens
  • Samples : 500 from MedMCQA train split
  • Training time: ~5 minutes on AMD MI300X
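The hyperparameters above can be wired together roughly as follows. This is a hedged sketch assuming current PEFT and TRL APIs (argument names such as `max_length` vary across TRL versions); the actual training script is in the linked GitHub repository.

```python
# Sketch of the fine-tuning setup, assuming standard PEFT + TRL APIs.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

lora_config = LoraConfig(
    r=4,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    per_device_train_batch_size=4,
    learning_rate=1e-4,
    num_train_epochs=1,
    max_length=128,
    bf16=True,  # bfloat16: stable on MI300X (fp16 produced NaN gradients)
    output_dir="medqa-qwen3-lora",
)

# 500 samples from the MedMCQA train split, as listed above.
dataset = load_dataset("openlifescienceai/medmcqa", split="train[:500]")

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B",
    args=training_args,
    train_dataset=dataset,  # assumes records are pre-rendered into prompt text
    peft_config=lora_config,
)
trainer.train()
```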

AMD ROCm Notes

Trained entirely on AMD hardware using ROCm 7.2. Key finding: bfloat16 is numerically stable on the MI300X, whereas fp16 caused gradient-norm explosion (NaN loss) during LoRA training.

Environment variables used:

ROCR_VISIBLE_DEVICES=0
HIP_VISIBLE_DEVICES=0
HSA_OVERRIDE_GFX_VERSION=9.4.2

Live Demo

Try it without any setup: https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/MedQA-Medical-AI-on-AMD-ROCm

Repository

Full training code, eval script, and Gradio app: https://github.com/HK2184/MedQA-Medical-AI-on-AMD-ROCm

Dataset

MedMCQA — 193,000 medical multiple-choice questions drawn from Indian medical entrance exams (AIIMS and NEET PG). https://huggingface.co/datasets/openlifescienceai/medmcqa

Authors

Harikrishna Sivanand Iyer and Srijan Sivaram A.
Built for the AMD Hackathon on lablab.ai.

License

MIT — free to use, modify, and build on.
