---
language:
- en
license: apache-2.0
datasets:
- ptb-xl
tags:
- ecg
- cardiology
- signal-processing
- medical
- unet
- clip
- lead-generation
- pytorch
- 1d-unet
- film-conditioning
metrics:
- rmse
pipeline_tag: other
library_name: pytorch
---

# 🫀 ECG Lead Generator: 7 → 12 Leads

A **CLIP-conditioned 1D U-Net** that reconstructs the 5 missing precordial ECG leads (V2–V6) from 7 available leads (I, II, III, aVR, aVL, aVF, V1), enabling full 12-lead ECG synthesis from reduced-lead recordings.

---

## Clinical Motivation

A standard 12-lead ECG requires 10 body-surface electrodes. In wearables, ambulatory monitoring, and pre-hospital emergency settings, it may only be feasible to acquire the limb leads plus V1. This model reconstructs the missing precordial leads, using a vision-language prior from CLIP as conditioning.

---

## Architecture

```
Input [B, 7, 2500] ──► 1D U-Net (CLIP-FiLM conditioned) ──► Output [B, 5, 2500]
                                    ▲
                          CLIP-ViT-L/14 (frozen)
                 ECG red-grid image → 1024-d embedding
               FiLM-injected at every encoder/decoder scale
```

| Component | Detail |
|---|---|
| Backbone | 1D U-Net, 4 encoder + 4 decoder scales |
| Conditioning | CLIP-ViT-L/14 → 1024-d pooler output |
| Conditioning mechanism | FiLM (Feature-wise Linear Modulation) at every scale |
| Parameters | **13,703,507** (LeadGenerator only; CLIP is frozen) |
| Base channels | 64 |
| Sequence length | 2500 samples (5 s @ 500 Hz) |
| Loss | Huber loss |
| Optimiser | AdamW + CosineAnnealingLR |

### Why CLIP conditioning?

Each ECG is rendered as a **red-grid clinical image** (standard paper layout) and passed through CLIP-ViT-L/14. The resulting 1024-d embedding captures morphological patterns visually and is injected into the U-Net via FiLM, so the generator produces lead-consistent waveforms conditioned on the global appearance of the ECG.

---

## Performance

Evaluated on a held-out 10% split of PTB-XL (200 records, 500 Hz).
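As a minimal sketch of how a per-lead RMSE like the one reported in this section can be computed, the function below averages the squared error over records and time samples separately for each output lead. The `[N, 5, 2500]` layout follows the model's output shape; the name `per_lead_rmse` and the synthetic data are illustrative assumptions, not the author's evaluation code.

```python
import numpy as np

LEADS = ["V2", "V3", "V4", "V5", "V6"]


def per_lead_rmse(pred: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Return one RMSE value per lead for arrays shaped [N, n_leads, T]."""
    if pred.shape != target.shape:
        raise ValueError(f"shape mismatch: {pred.shape} vs {target.shape}")
    sq_err = (pred - target) ** 2
    # Average over records (axis 0) and time (axis 2), keep the lead axis.
    return np.sqrt(sq_err.mean(axis=(0, 2)))


# Synthetic stand-in for real predictions and ground truth (hypothetical data).
rng = np.random.default_rng(0)
target = rng.standard_normal((8, 5, 2500))
pred = target + 0.1 * rng.standard_normal(target.shape)

rmse = per_lead_rmse(pred, target)
for name, value in zip(LEADS, rmse):
    print(f"{name}: {value:.5f}")
print(f"Mean: {rmse.mean():.5f}")
```

With real data, `pred` would hold the stacked model outputs and `target` the recorded V2–V6 leads, both preprocessed the same way.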

| Lead | RMSE ↓ | DTW (normalised) ↓ |
|------|--------|--------------------|
| V2 | 0.41751 | 0.10016 |
| V3 | 0.52274 | 0.10500 |
| V4 | 0.45217 | 0.09813 |
| V5 | 0.35278 | 0.07953 |
| V6 | 0.37252 | 0.09069 |
| **Mean** | **0.42355** | **0.09470** |

> V5 achieves the best reconstruction quality (RMSE 0.353) and V3 the worst (RMSE 0.523).
> Note that V4 and V6 are reconstruction targets, not inputs: V1 is the only precordial
> lead in the 7-lead input.

---

## How to Use

### Install dependencies

```bash
pip install torch huggingface_hub safetensors transformers numpy pillow
```

### Load the model

```python
import json
import os
import sys

import torch
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

REPO_ID = "rishsoraganvi/ecg-lead-generator"

# Load config
cfg_path = hf_hub_download(REPO_ID, "config.json")
with open(cfg_path) as f:
    cfg = json.load(f)

# LeadGenerator lives in model.py in this repo: download it and make it
# importable (or paste the class into your own code instead)
model_py = hf_hub_download(REPO_ID, "model.py")
sys.path.insert(0, os.path.dirname(model_py))
from model import LeadGenerator

model = LeadGenerator(
    ni=cfg["n_in"],      # 7
    no=cfg["n_out"],     # 5
    ch=cfg["base_ch"],   # 64
    cd=cfg["clip_dim"],  # 1024
)

weights_path = hf_hub_download(REPO_ID, "model.safetensors")
model.load_state_dict(load_file(weights_path))
model.eval()
```

### Run inference

```python
import numpy as np
import torch
from transformers import CLIPModel, CLIPProcessor

# 1. Load CLIP (frozen)
clip_model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
clip_enc = clip_model.vision_model.eval()
clip_proc = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

# 2. Prepare 7-lead ECG input: numpy array [7, 2500], 500 Hz, z-normalised
ecg_7lead = np.random.randn(7, 2500).astype(np.float32)  # replace with real data

# 3. Render ECG as red-grid image → CLIP embedding
# (render_redgrid() is not defined here: take it from the notebook
#  or your own preprocessing pipeline)
img = render_redgrid(ecg_7lead)  # PIL RGB image
inp = clip_proc(images=[img], return_tensors="pt")
with torch.no_grad():
    clip_emb = clip_enc(**inp).pooler_output  # [1, 1024]

# 4. Generate missing leads
x = torch.from_numpy(ecg_7lead).unsqueeze(0)  # [1, 7, 2500]
with torch.no_grad():
    pred = model(x, clip_emb)  # [1, 5, 2500]

# pred contains V2, V3, V4, V5, V6
```

---

## Training Details

| Setting | Value |
|---|---|
| Dataset | PTB-XL (PhysioNet, v1.0.3) |
| Sampling rate | 500 Hz |
| Records used | 2,000 |
| Train / Val / Test split | 80 / 10 / 10 % (1,600 / 200 / 200 records) |
| Preprocessing | Butterworth bandpass 0.5–40 Hz + z-score normalisation |
| Epochs | 60 |
| Batch size | 32 |
| Learning rate | 1e-3 |
| Weight decay | 1e-4 |
| Gradient clipping | 1.0 |
| Hardware | Vast.ai A100 GPU |
| CLIP model | `openai/clip-vit-large-patch14` (frozen) |

---

## Repository Structure

```
ecg-lead-generator/
├── model.safetensors   # Model weights (safetensors format)
├── config.json         # Model configuration
├── model.py            # LeadGenerator architecture
└── README.md           # This file
```

---

## Limitations

- Trained on a 2,000-record subset of PTB-XL; a larger training set is recommended for production use
- Validated on 500 Hz recordings only
- Not validated on all pathological ECG subtypes present in clinical practice
- **Not a medical device**: intended for research and educational purposes only

---

## Citation

If you use this model in your work, please cite PTB-XL:

```bibtex
@article{wagner2020ptb,
  title={PTB-XL, a large publicly available electrocardiography dataset},
  author={Wagner, Patrick and Strodthoff, Nils and Bousseljot, Ralf-Dieter and others},
  journal={Scientific Data},
  volume={7},
  number={1},
  pages={154},
  year={2020},
  publisher={Nature Publishing Group}
}
```

---

## Author

**Rishabh Soraganvi** ·
[GitHub](https://github.com/rishsoraganvi) · [Hugging Face](https://huggingface.co/rishsoraganvi)
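
---

## Appendix: Preprocessing Sketch

The inference example expects a z-normalised `[7, 2500]` array, and Training Details lists a Butterworth bandpass (0.5–40 Hz) followed by z-score normalisation. The sketch below shows one way to reproduce that preprocessing with SciPy. The filter order, the zero-phase filtering, and the per-lead normalisation are assumptions, since this README does not state them; `preprocess` is a hypothetical helper, not a function from this repo.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 500            # sampling rate in Hz (from Training Details)
BAND = (0.5, 40.0)  # passband in Hz (from Training Details)
ORDER = 4           # ASSUMPTION: the filter order is not stated in this README


def preprocess(ecg: np.ndarray) -> np.ndarray:
    """Bandpass-filter and z-score each lead of an array shaped [n_leads, T]."""
    sos = butter(ORDER, BAND, btype="bandpass", fs=FS, output="sos")
    # ASSUMPTION: zero-phase (forward-backward) filtering along the time axis.
    filtered = sosfiltfilt(sos, ecg, axis=-1)
    # ASSUMPTION: statistics are computed per lead; epsilon guards flat leads.
    mean = filtered.mean(axis=-1, keepdims=True)
    std = filtered.std(axis=-1, keepdims=True)
    return ((filtered - mean) / (std + 1e-8)).astype(np.float32)


# Synthetic stand-in for a raw 7-lead, 5-second recording (hypothetical data).
raw = np.random.default_rng(0).standard_normal((7, 2500))
ecg_7lead = preprocess(raw)
print(ecg_7lead.shape, ecg_7lead.dtype)
```

If the original training pipeline used different filter settings or dataset-level statistics, matching those would matter more than following this sketch.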