BERT v47: Medical Triage Decision Support (19-head)
A 19-head BERT model for emergency department triage decision support. It predicts Emergency Severity Index (ESI) levels 1-5 from free-text triage narratives, and supplemental heads cover symptoms, resources, vitals, flags, and clinical context.
Architecture: BiomedBERT encoder (109M params) + 19 task heads, trained with focal loss, label smoothing, ordinal-distance penalty, layer-wise LR decay, and effective-number-of-samples class weighting.
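As an illustration of the effective-number-of-samples class weighting mentioned above, the following is a minimal sketch of the formula from Cui et al. (2019), using the β = 0.999 value listed under Training. The function name `effective_number_weights` and the class counts are hypothetical; they are not taken from the model's training code or data.

```python
def effective_number_weights(class_counts, beta=0.999):
    """Class-balanced weights: w_c proportional to (1 - beta) / (1 - beta**n_c)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be in [0, 1)")
    if any(n <= 0 for n in class_counts):
        raise ValueError("every class needs at least one sample")
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]
    # Normalize so the weights sum to the number of classes.
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical ESI 1..5 counts, invented for illustration only
# (not the real training distribution).
counts = [2_000, 60_000, 180_000, 100_000, 5_000]
weights = effective_number_weights(counts)
for level, (n, w) in enumerate(zip(counts, weights), start=1):
    print(f"ESI {level}: n={n:>7}  weight={w:.3f}")
```

With β = 0.999 the effective number saturates near 1/(1 − β) = 1000, so classes with more than a few thousand samples end up with nearly identical weights and only the rarest classes are noticeably up-weighted.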
Intended use: clinical decision support for triage nurses. The model produces an ESI prediction with confidence, detected symptoms, suggested resources, and uncertainty signals. It is not a standalone diagnostic system.
Eval results (epoch 3, 4 clean holdouts)
| Dataset | n | Exact | Adjacent | ESI 1 recall | ESI 5 recall |
|---|---|---|---|---|---|
| MIETIC clean (narrative) | 200 | 85.0% | 94.5% | 56.7% | 92.5% |
| MIMIC-IV-ED holdout | 7,917 | 62.9% | 97.9% | 65.3% | 25.0% |
| Lukina v3 (curated narrative) | 201 | 58.2% | 86.1% | 80.0% | 25.0% |
| MC-MED Stanford clean | 1,000 | 57.2% | 96.0% | 18.0% | 6.0% |
| ER-REASON (unseen variants, 200) | 200 | 50.5% | 93.5% | n/a | n/a |
Validation metrics at best checkpoint (composite=0.791):
- esi_exact: 0.827
- esi_adjacent: 0.993
- symptom_f1_micro: 0.706
- symptom_p_micro: 0.641, symptom_r_micro: 0.786
- flag_f1_macro: 0.994
- ner_entity_f1: 1.000
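The ESI metrics and the composite score can be reproduced with a few lines of Python. This sketch assumes that "adjacent" means a prediction within one ESI level of the true level (the card does not define it), and it uses the composite weighting given under Training (0.7 × esi_exact + 0.3 × symptom_f1_micro). The helper names and toy labels are hypothetical.

```python
def esi_accuracy(y_true, y_pred):
    """Return (exact, adjacent) accuracy for ESI levels given as ints 1..5."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(y_true)
    exact = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Assumption: "adjacent" counts predictions within one ESI level.
    adjacent = sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / n
    return exact, adjacent


def composite(esi_exact, symptom_f1_micro):
    """Checkpoint-selection score: 0.7 * esi_exact + 0.3 * symptom_f1_micro."""
    return 0.7 * esi_exact + 0.3 * symptom_f1_micro


# Toy labels, invented for illustration only.
y_true = [1, 2, 3, 3, 4, 5]
y_pred = [2, 2, 3, 5, 4, 4]
exact, adjacent = esi_accuracy(y_true, y_pred)
print(f"exact={exact:.3f} adjacent={adjacent:.3f}")  # exact=0.500 adjacent=0.833

# Validation metrics reported above: 0.7 * 0.827 + 0.3 * 0.706
print(f"composite={composite(0.827, 0.706):.3f}")  # composite=0.791
```

Plugging in the reported validation values gives 0.5789 + 0.2118 = 0.7907, which matches the stated composite of 0.791.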
19 head outputs
PRIMARY (1 head): the answer
  esi_head              5    softmax over ESI levels 1-5
CRITICAL DISPLAY (3): trust drivers for nurse UI
  symptom_head          176  multi-label concepts (chest_pain, sepsis_signs, etc.)
  resource_head         12   multi-label resource types
  resource_count_head   3    bucket 0 / 1 / 2+
PERCEPTION (8): engine-input + context
  flag_head             3    severe_pain_distress, on_anticoag, altered_ms
  vitals_head           6    HR, BP_sys, BP_dia, SpO2, RR, Temp_C
  ner_head              21   BIO spans
  medrec_head           2    on_anticoag, on_antiplatelet
  pain_head             1    0-10 regression
  age_head              1    years regression
  arrival_head          5    ambulance/walk-in/EMS/etc
  gender_head           3    M/F/U
SAFETY (2): rare-positive critical
  airway_head           1    pos_weight=2500
  resus_head            1    pos_weight=200
AUXILIARY (5): regularization + outcome signals
  gestalt_head          5    outcome tier (OUT-1..5)
  disposition_head      9
  syndrome_head         15
  history_visits_head   3
  history_admits_head   3
  last_dx_head          30
Quick start
import torch
from transformers import AutoTokenizer

# 1. Get the architecture code:
#    either clone the source repo or copy train_bert_v47.py from this repo.
from train_bert_v47 import V47MultiHeadBERT

ENCODER = "microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext"

# 2. Load the tokenizer and the fine-tuned weights.
tokenizer = AutoTokenizer.from_pretrained("vadimbelsky/bert-v47-medical-triage", subfolder="tokenizer")
model = V47MultiHeadBERT(ENCODER)
# weights_only=False unpickles arbitrary objects; only load a model.pt you trust.
state = torch.load("model.pt", map_location="cpu", weights_only=False)
# strict=False does not raise on missing or unexpected keys.
model.load_state_dict(state, strict=False)
model.eval()

# 3. Run inference on a triage narrative.
text = """52-year-old female arrived by ambulance with chest pain.
Vital signs: HR 86, BP 134/78, RR 16, SpO2 99%, T 36.4°C. Pain 7/10."""
enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512, padding="max_length")
with torch.no_grad():
    out = model(enc["input_ids"], enc["attention_mask"])

esi_pred = int(out["esi_logits"].argmax(-1)) + 1  # argmax index 0..4 -> ESI level 1..5
print(f"ESI: {esi_pred}")
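The quick start prints only the arg-max ESI level. One possible way to derive a confidence and an uncertainty signal from the same logits is sketched below, using the softmax probability of the predicted level and a normalized entropy. The helper `esi_confidence` is hypothetical and not part of this repo, and it assumes `out["esi_logits"]` has shape (1, 5) so that `out["esi_logits"][0].tolist()` yields five floats. How the model's own confidence and uncertainty outputs are computed is not documented here.

```python
import math


def esi_confidence(logits):
    """Softmax over 5 ESI logits -> (level 1..5, top probability, normalized entropy)."""
    if len(logits) != 5:
        raise ValueError("expected one logit per ESI level (5 values)")
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    level = max(range(5), key=probs.__getitem__) + 1
    # Entropy normalized to [0, 1]: 0 = fully confident, 1 = uniform over 5 levels.
    entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(5)
    return level, probs[level - 1], entropy


# Hypothetical logits; in practice something like out["esi_logits"][0].tolist()
level, confidence, entropy = esi_confidence([0.1, 2.3, 1.9, -0.5, -1.2])
print(f"ESI {level}  confidence={confidence:.2f}  normalized entropy={entropy:.2f}")
```

A high normalized entropy, or a small gap between the top two probabilities, could be surfaced to the nurse as a cue to review the case more carefully. Any threshold would need to be calibrated on local data.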
Training
Encoder:       microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext
Total params:  109.7M
Loss:          focal CE γ=2 + ordinal-distance penalty (esi)
               focal BCE γ=2 + pos_weight (airway/resus)
               BCE multi-label (symptom/resource heads)
Class weights: effective-number-of-samples β=0.999 (Cui et al. 2019)
Optimization:  AdamW + cosine schedule + layer-wise LR decay (0.9/layer)
Precision:     bf16 mixed
Checkpoint:    0.7 × esi_exact + 0.3 × symptom_f1_micro (composite)
Best checkpoint composite: 0.791 (epoch 3 of 6, early stopped at epoch 3)
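To make the ESI loss above more concrete, here is a minimal single-example sketch of focal cross-entropy (γ = 2) combined with an ordinal-distance penalty. The card does not specify the penalty's form or weight, so this sketch assumes the expected absolute distance between the predicted distribution and the true level, scaled by a hypothetical coefficient `lam`. It is not the training code from `train_bert_v47.py`, and it omits class weights and label smoothing.

```python
import math


def focal_ordinal_loss(logits, target, gamma=2.0, lam=0.5):
    """Focal cross-entropy plus an expected-distance penalty for one example.

    logits: list of 5 raw scores (ESI 1..5); target: true ESI level, 1..5.
    lam and the penalty form are assumptions, not taken from the training code.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    p_t = probs[target - 1]
    # Focal term: down-weights examples the model already classifies confidently.
    focal = -((1.0 - p_t) ** gamma) * math.log(p_t)
    # Ordinal term: probability mass placed far from the true level costs more.
    distance = sum(p * abs((k + 1) - target) for k, p in enumerate(probs))
    return focal + lam * distance


# Same probability on the true class (ESI 1), but the wrong guess sits at
# different distances from the truth.
near_miss = focal_ordinal_loss([1.0, 3.0, 0.0, 0.0, 0.0], target=1)
far_miss = focal_ordinal_loss([1.0, 0.0, 0.0, 0.0, 3.0], target=1)
print(f"near miss (predicts ESI 2): {near_miss:.3f}")
print(f"far miss  (predicts ESI 5): {far_miss:.3f}")
```

Both examples have the same focal term because the true-class probability is identical. Only the ordinal term differs, so predicting ESI 5 for a true ESI 1 is penalized more than predicting ESI 2.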
Limitations
- The training corpus is dominated by the compact chief-complaint (CC) dialect: MIMIC-IV-ED accounts for 290K of 354K records, so the model over-fits to the short, telegraphic CC + vitals format
- Lukina-style structured narrative has no representation in the training set; eval shows 58% exact on this dialect, versus 85% on MIETIC, whose style is represented in training
- MC-MED Stanford: 57% exact and 18% ESI 1 recall; the Stanford triage dialect is under-represented
- ESI 5 underperformance: ESI 5 (non-urgent) recall is 25% on MIMIC-IV-ED; the class is rare (~5K records) and easily mis-routed to ESI 3-4
- Long-note format (full ED notes): max_length=512 severely truncates ER-REASON discharge summaries and H&P notes; ER-REASON unseen variants score 50.5% exact (but 93.5% adjacent)
Citation
@misc{belsky2026berttriage,
title = {BERT v47: Multi-head Decision Support for Emergency Triage},
author = {Belski, Vadzim},
year = {2026},
url = {https://huggingface.co/vadimbelsky/bert-v47-medical-triage}
}
Disclaimer
This model is research software for clinical decision support, not a standalone diagnostic system. ESI predictions are advisory only and must be reviewed by a licensed clinician. The model has known limitations on rare ESI classes (1 and 5) and out-of-distribution narrative formats. Do not deploy in production triage workflows without thorough validation on your local patient population, IRB approval, and physician oversight.