Qwen3.5-9B Medical Triage — Stage 3 DPO (v4)
Emergency department triage model fine-tuned on Qwen3.5-9B via a 3-stage pipeline: Stage 1 (general medical SFT) → Stage 2 (ED intake SOAP → ESI decision SFT) → Stage 3 (DPO alignment to reduce over-triage, this model).
Quantized to Q4_K_M GGUF for on-device inference.
Model Description
Given an ED SOAP intake note, the model outputs a structured triage decision:
- ESI level (1–5) with justification
- Key clinical findings
- Time-to-provider target
- Immediate interventions required
ESI Scale: 1 = Immediate life threat · 2 = Emergent high-risk · 3 = Urgent stable · 4 = Less urgent · 5 = Non-urgent
Training Pipeline
| Stage | Method | Objective |
|---|---|---|
| 1 | SFT (LoRA r=16) | General medical knowledge (PubMed, clinical guidelines) |
| 2 | SFT (LoRA r=16) | SOAP note → structured ESI triage decision |
| 3 | DPO (LoRA r=8) | Reduce over-triage · preserve ESI 1/2 high-risk recall |
Stage 3 DPO Details
- Base: Stage 2 LoRA checkpoint (`vadimbelsky/qwen3.5-medical-ft-stage2`)
- Dataset: `dpo_dataset_v4.jsonl` (5,413 raw pairs → 7,789 weighted pairs)
- Loss: Combined `apo_down × 0.3 + sft × 1.0` (MPO-style)
- Beta: 0.5 · LR: 5e-5 · Epochs: 0.1 (47 steps)
- Batch: 2 × 8 gradient accumulation = effective 16
- ESI label prepending: All chosen/rejected completions prefixed with explicit ESI label (e.g. `ESI 2 — Emergent (high risk)\n\n...`) to anchor preference signal at token position 0
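The combined objective above can be sketched per preference pair as follows. This is a minimal illustration, not the training code: it assumes the APO-down form as implemented in TRL's DPOTrainer (to the best of my reading), and the names `chosen_logratio`, `rejected_logratio`, and `chosen_nll` are hypothetical stand-ins for the policy-vs-reference log-ratio of each completion and the token-averaged negative log-likelihood of the chosen completion.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def apo_down_loss(chosen_logratio: float, rejected_logratio: float, beta: float = 0.5) -> float:
    """APO-down term for one pair (assumed form, see lead-in).

    chosen_logratio   = log pi(y_w | x) - log pi_ref(y_w | x)
    rejected_logratio = log pi(y_l | x) - log pi_ref(y_l | x)
    """
    margin = chosen_logratio - rejected_logratio
    return sigmoid(beta * chosen_logratio) + (1.0 - sigmoid(beta * margin))


def combined_loss(
    chosen_logratio: float,
    rejected_logratio: float,
    chosen_nll: float,
    beta: float = 0.5,
    apo_weight: float = 0.3,
    sft_weight: float = 1.0,
) -> float:
    """Weighted sum from the card: apo_down x 0.3 + sft x 1.0."""
    preference_term = apo_down_loss(chosen_logratio, rejected_logratio, beta)
    return apo_weight * preference_term + sft_weight * chosen_nll


if __name__ == "__main__":
    # At initialization the policy equals the reference, so both log-ratios are 0.
    print(combined_loss(0.0, 0.0, chosen_nll=2.0))
```

With the SFT term weighted at 1.0 and the preference term at 0.3, the likelihood of the chosen completion dominates the gradient, which is consistent with the card's stated goal of preserving ESI 1/2 recall while the preference term pushes against over-triage.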
Dataset Sources (v4)
| Source | Description | Raw pairs | Weight | Weighted |
|---|---|---|---|---|
| A | Anti-overtriage synthetic (ESI 3→1/2 rejected) | 2,388 | 1× | 2,388 |
| B | Anti-overtriage synthetic (ESI 4/5→1/2 rejected) | 1,500 | 1× | 1,500 |
| C | Edge cases (synthetic boundary scenarios) | 39 | 1× | 39 |
| D | ESI 1/2 anchor pairs (high-risk recall preservation) | 890 | 3× | 2,670 |
| E-over | ESI 3 bidirectional — anti-overtriage | 297 | 2× | 594 |
| E-under | ESI 3 bidirectional — anti-undertriage | 299 | 2× | 598 |
| Total | | 5,413 | | 7,789 |
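The weighting arithmetic in the table can be checked with a few lines of Python. The counts and weights are copied from the table; the grouping into "anti-overtriage" and "recall-preserving" directions is my own reading of the source descriptions, not something the card defines.

```python
# (raw pairs, weight) per source, copied from the table above.
SOURCES = {
    "A": (2388, 1),
    "B": (1500, 1),
    "C": (39, 1),
    "D": (890, 3),
    "E-over": (297, 2),
    "E-under": (299, 2),
}

raw_total = sum(raw for raw, _ in SOURCES.values())
weighted_total = sum(raw * weight for raw, weight in SOURCES.values())


def weighted(names):
    """Weighted pair count for a group of sources."""
    return sum(SOURCES[name][0] * SOURCES[name][1] for name in names)


# Assumed grouping by direction of the preference signal.
anti_overtriage = weighted(["A", "B", "E-over"])
recall_preserving = weighted(["D", "E-under"])

print(raw_total, weighted_total)  # 5413 7789
print(f"anti-overtriage share:   {anti_overtriage / weighted_total:.1%}")
print(f"recall-preserving share: {recall_preserving / weighted_total:.1%}")
```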
Evaluation Results
Evaluated on MIMIC-IV-Ext Triage Instruction Corpus (MIETIC) — 36 human-expert validated RETAIN cases.
v4 vs Previous Stages
| Metric | Stage 2 (SFT) | v1 DPO | v2 DPO | v3 DPO | v4 DPO | Target |
|---|---|---|---|---|---|---|
| Accuracy | ~68% | 55.6% | 50.0% | 27.8% | 75.0% | >82% |
| Over-triage rate | ~22% | 22.2% | 30.6% | 0% | 13.9% | <10% |
| Under-triage rate | ~8% | 36.1% | 41.7% | 72.2% | 11.1% | <6% |
| High-risk recall (ESI 1+2) | ~84% | 76% | 64% | 40% | 92% | 100% |
| ESI 3 accuracy | ~45% | ~40% | ~30% | ~0% | 60% | >65% |
v4 Detailed Results (MIETIC, n=36)
Samples evaluated : 36
ESI level parsed : 36 / 36
Correct : 27
Accuracy : 75.0%
Under-triage rate : 11.1% (4 cases)
Over-triage rate : 13.9% (5 cases)
High-risk recall : 92.0% (ESI 1+2, n=25)
Per-ESI Accuracy:
| ESI Level | N | Correct | Accuracy |
|---|---|---|---|
| ESI 1 | 14 | 12 | 85.7% |
| ESI 2 | 11 | 9 | 81.8% |
| ESI 3 | 5 | 3 | 60.0% |
| ESI 4 | 4 | 2 | 50.0% |
| ESI 5 | 2 | 1 | 50.0% |
Confusion Matrix (rows = ground truth, cols = predicted):
GT \ Pred ESI 1 ESI 2 ESI 3 ESI 4 ESI 5
ESI 1 12 2 0 0 0
ESI 2 0 9 2 0 0
ESI 3 0 2 3 0 0
ESI 4 0 0 2 2 0
ESI 5 0 0 0 1 1
All remaining errors are ±1 ESI boundary confusions — no catastrophic mis-triage.
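The headline metrics can be recomputed from the confusion matrix. The sketch below assumes that under-triage means predicting a higher ESI number (less urgent) than ground truth, over-triage the reverse, and that high-risk recall counts an ESI 1/2 case as recalled when it is predicted as either ESI 1 or ESI 2. These definitions are inferred from the reported numbers rather than stated in the card.

```python
# Rows = ground truth ESI 1..5, columns = predicted ESI 1..5 (from the matrix above).
CONFUSION = [
    [12, 2, 0, 0, 0],
    [0, 9, 2, 0, 0],
    [0, 2, 3, 0, 0],
    [0, 0, 2, 2, 0],
    [0, 0, 0, 1, 1],
]

levels = range(len(CONFUSION))
total = sum(sum(row) for row in CONFUSION)
correct = sum(CONFUSION[i][i] for i in levels)

# Predicted column index above the true row index means a less urgent prediction.
under_triage = sum(CONFUSION[i][j] for i in levels for j in levels if j > i)
over_triage = sum(CONFUSION[i][j] for i in levels for j in levels if j < i)

# High-risk cases are ground-truth ESI 1 and ESI 2 (row indices 0 and 1).
high_risk_total = sum(sum(CONFUSION[i]) for i in (0, 1))
high_risk_recalled = sum(CONFUSION[i][j] for i in (0, 1) for j in (0, 1))

# Largest distance between true and predicted level among non-empty cells.
max_offset = max(abs(i - j) for i in levels for j in levels if CONFUSION[i][j])

print(f"accuracy         : {correct / total:.1%} ({correct}/{total})")
print(f"under-triage     : {under_triage / total:.1%} ({under_triage} cases)")
print(f"over-triage      : {over_triage / total:.1%} ({over_triage} cases)")
print(f"high-risk recall : {high_risk_recalled / high_risk_total:.1%} (n={high_risk_total})")
print(f"max ESI offset   : {max_offset}")
```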
Key Lessons from DPO Iteration
- v1–v3 failure: IPO/sigmoid loss collapsed when dataset direction was 100% anti-overtriage → catastrophic under-triage regression (40% high-risk recall at worst)
- v4 fix: (1) ESI label prepended at token position 0 for unambiguous preference signal; (2) `apo_down + sft` combined loss preserves ESI 1/2 recall via SFT component; (3) Sources D (ESI 1/2 anchors ×3) + E (ESI 3 bidirectional ×2) balance dataset direction
Usage
# Requires llama.cpp server running with the Q4_K_M GGUF
# llama-server --model qwen3.5-medical-ft-stage3-dpo-q4km.gguf --port 8080 -c 4096
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

SYSTEM_PROMPT = (
    "You are an expert emergency medicine triage nurse. "
    "Given a SOAP intake note, provide a structured triage decision including "
    "ESI level with justification, key clinical findings, time-to-provider target, "
    "and any immediate interventions required."
)

response = client.chat.completions.create(
    model="local",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "<SOAP intake note here>"},
    ],
    temperature=0.1,
    max_tokens=512,
)
print(response.choices[0].message.content)
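To use the decision downstream, the ESI level has to be extracted from the free-text response. A minimal sketch, assuming the response mentions the level as "ESI" followed by a digit from 1 to 5 (as in the prepended training labels): the helper `parse_esi_level` and its regular expression are hypothetical and not part of the released evaluation code.

```python
import re
from typing import Optional

# Matches e.g. "ESI 2", "ESI: 3", "ESI Level 4", "esi-1" (case-insensitive).
_ESI_PATTERN = re.compile(r"\bESI\s*(?:level\s*)?[:\-]?\s*([1-5])\b", re.IGNORECASE)


def parse_esi_level(text: str) -> Optional[int]:
    """Return the first ESI level (1-5) mentioned in the text, or None if absent."""
    match = _ESI_PATTERN.search(text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    sample = "ESI 2: chest pain with diaphoresis, ECG within 10 minutes."
    print(parse_esi_level(sample))
```

Returning `None` instead of a default level keeps unparseable responses visible, so they can be routed to a clinician rather than silently assigned a triage category.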
Limitations & Safety
⚠️ This model is for research purposes only. It must NOT be used for clinical decision-making without licensed clinician oversight.
- Evaluated on 36 MIETIC validation cases — not a clinical trial
- 11.1% under-triage rate means critical patients may be down-triaged
- 92% high-risk recall means ~8% of ESI 1/2 patients may be missed
- Model has not been validated on real ED populations
- Fine-tuned on synthetic + MIMIC-IV derived data only
Training Infrastructure
- Hardware: NVIDIA GB10 (121 GB VRAM), 1 GPU
- Framework: Unsloth 2026.3.4 + TRL DPOTrainer + Transformers 5.2.0
- Training time: ~2 hours (47 steps)
- Quantization: GGUF Q4_K_M via llama.cpp
Fine-tuned with Unsloth 🦥