Lossless Mechanistic Compression and Surgical Correction of Medical Imaging Models

Artifacts for the paper by Yeonseong Cynn (River Lab, May 2026).

Summary

CheXNet (DenseNet121) compressed by 51.43% in parameter count (6,966,034 → 3,383,248), with mean AUROC preserved within sampling noise on n=1045 NIH ChestX-ray14 test images (Δ +0.0004, per-pathology max |Δ| = 0.0033). Outputs match the baseline to numerical precision (max |Δ logit| < 5×10⁻⁶).

The compressed model exposes classifier channels at a granularity that makes mechanistic interventions practical:

  • Surgical correction: 5-channel classifier weight zeroing softly reduces a target false-positive probability with bounded side effects.
  • Mutual exclusivity insight: 89 of 100 polarized classifier channels are not architectural conflicts but bipolar discriminative axes exploiting label mutual exclusivity (Jaccard < 0.1).
  • Cost-aware operations: threshold calibration and minimal retraining routed by a decision system per pathology.
  • Clinical report auto-generation: combining channel-level evidence, Grad-CAM region mapping, and mutual-exclusivity exclusion.
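
The Jaccard criterion in the mutual-exclusivity bullet can be illustrated with a small sketch. It computes the Jaccard index |A ∩ B| / |A ∪ B| between the sets of images that are positive for two labels. The toy label matrix and the helper name jaccard are illustrative assumptions, not part of the released analysis scripts.

```python
def jaccard(labels, a, b):
    """Jaccard index between the sets of images positive for labels a and b."""
    pos_a = {i for i, row in enumerate(labels) if row[a] == 1}
    pos_b = {i for i, row in enumerate(labels) if row[b] == 1}
    union = pos_a | pos_b
    if not union:
        return 0.0  # neither label ever occurs; treat as no overlap
    return len(pos_a & pos_b) / len(union)


# Toy multi-label matrix: rows are images, columns are three hypothetical labels.
toy_labels = [
    [1, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]

j01 = jaccard(toy_labels, 0, 1)  # labels 0 and 1 never co-occur
j02 = jaccard(toy_labels, 0, 2)
j12 = jaccard(toy_labels, 1, 2)

# Label pairs below the 0.1 cutoff used in the summary count as mutually exclusive.
exclusive_pairs = [p for p, j in {(0, 1): j01, (0, 2): j02, (1, 2): j12}.items() if j < 0.1]
```

In this toy data only the pair (0, 1) falls below the 0.1 cutoff; a channel with opposite-sign weights for such a pair would be a candidate bipolar axis in the sense described above.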

Files

Weights

| File | Size | Description |
|---|---|---|
| compressed_model.pt | 14.2 MB | Compressed CheXNet backbone + classifier (3.38M params) |
| classifier_finetuned.pt | 75 KB | Optional fine-tuned classifier head (18K params) |

Code

| File | Description |
|---|---|
| inference.py | Minimal CLI inference (load + forward) |
| requirements.txt | Pip dependencies |

Metrics (JSON)

  • metrics/baseline_vs_compressed.json: Per-pathology AUROC (baseline vs compressed, n=1045)
  • metrics/eval_nih_weights.json: All 5 torchxrayvision DenseNet121 checkpoints on NIH test
  • metrics/analyze_binary_axis.json: Pathology independence + Jaccard matrix
  • metrics/q_conflict_legitimacy.json: Polarized channel legitimacy classification
  • metrics/surgery_channel_ablation.json: Surgical correction K-sweep
  • metrics/apply_threshold_calibration.json: Per-class Youden threshold + F1/Recall
  • metrics/minimal_retrain.json, minimal_retrain_v2.json: Classifier-head fine-tune costs

Figures (paper)

  • figures/fig1_compression.{pdf,png}: Headline numbers
  • figures/fig2_sparse_layers.{pdf,png}: Per-block sparsity
  • figures/fig3_mutual_exclusivity.{pdf,png}: Jaccard ↔ channel-usage mirror
  • figures/fig4_legitimacy.{pdf,png}: Polarized-channel legitimacy
  • figures/fig5_surgery.{pdf,png}: Surgical-correction K-sweep
  • figures/fig6_treatment.{pdf,png}: Per-pathology treatment recommendations
  • figures/fig7_minimal_retrain.{pdf,png}: Fine-tuning + threshold calibration
  • figures/fig8_clinical_report.{pdf,png}: Clinical report sample

Setup

pip install -r requirements.txt

Usage

Loading and inference

python inference.py path/to/xray.png
python inference.py path/to/xray.png --classifier classifier_finetuned.pt

Loading the model in your own code

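import torch
import torch.nn as nn
import torchxrayvision as xrv

# Number of dense layers in each DenseNet121 block.
LAYERS_PER_BLOCK = {1: 6, 2: 12, 3: 24, 4: 16}

# Start from the baseline architecture; the bottlenecks are resized below.
model = xrv.models.DenseNet(weights="densenet121-res224-all").eval()

# weights_only=False unpickles arbitrary objects: only load a checkpoint you trust.
ckpt = torch.load("compressed_model.pt", map_location="cpu", weights_only=False)

# Rebuild each dense layer's bottleneck with its surviving channel count.
for block_idx in LAYERS_PER_BLOCK:
    block = getattr(model.features, f"denseblock{block_idx}")
    block_alive = ckpt["alive_per_block"][block_idx]
    for dl_key, n_alive in block_alive.items():
        i = int(dl_key[2:])  # layer index follows a two-character key prefix
        L = getattr(block, f"denselayer{i}")
        in_ch = L.conv1.in_channels
        L.conv1 = nn.Conv2d(in_ch, n_alive, 1, bias=True).eval()
        # Placeholder so the checkpoint's norm2 keys match during load_state_dict.
        L.norm2 = nn.BatchNorm2d(n_alive, eps=L.norm2.eps).eval()
        L.conv2 = nn.Conv2d(n_alive, 32, 3, padding=1, bias=False).eval()
model.load_state_dict(ckpt["state_dict"])

# After loading, norm2 is bypassed in every dense layer (conv1 now carries a bias).
for block_idx, n_layers in LAYERS_PER_BLOCK.items():
    block = getattr(model.features, f"denseblock{block_idx}")
    for i in range(1, n_layers + 1):
        getattr(block, f"denselayer{i}").norm2 = nn.Identity()

# Optional fine-tuned classifier; skip these three lines to keep the stored head.
cls_ft = nn.Linear(1024, 18)
cls_ft.load_state_dict(torch.load("classifier_finetuned.pt", map_location="cpu", weights_only=True))
model.classifier = cls_ft
model.eval()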
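
The loader above gives conv1 a bias and ends by replacing norm2 with nn.Identity. That pattern is consistent with the batch-norm statistics having been folded into the preceding 1×1 convolution, although the README does not say so, and whether the released checkpoint was produced this way is an assumption. The pure-Python sketch below only demonstrates the standard folding identity at a single pixel; all names and values are illustrative.

```python
import math


def conv1x1(x, weight, bias):
    """1x1 convolution at one pixel: a matrix-vector product plus bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def batchnorm(y, gamma, beta, mean, var, eps):
    """Inference-mode batch norm using fixed running statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + b
            for v, g, b, m, s in zip(y, gamma, beta, mean, var)]


def fold_bn(weight, gamma, beta, mean, var, eps):
    """Fold BN into a bias-free conv; returns the fused (weight, bias)."""
    scale = [g / math.sqrt(s + eps) for g, s in zip(gamma, var)]
    fused_w = [[s * w for w in row] for row, s in zip(weight, scale)]
    fused_b = [b - s * m for b, s, m in zip(beta, scale, mean)]
    return fused_w, fused_b


# Toy layer: 3 input channels, 2 output channels.
x = [0.5, -1.0, 2.0]
weight = [[0.2, -0.1, 0.4], [1.0, 0.3, -0.5]]
gamma, beta = [1.5, 0.8], [0.1, -0.2]
mean, var, eps = [0.3, -0.4], [0.9, 1.6], 1e-5

# Reference path: bias-free conv followed by batch norm.
reference = batchnorm(conv1x1(x, weight, [0.0, 0.0]), gamma, beta, mean, var, eps)

# Fused path: a single conv with bias, no batch norm afterwards.
fused_w, fused_b = fold_bn(weight, gamma, beta, mean, var, eps)
fused = conv1x1(x, fused_w, fused_b)

max_diff = max(abs(a - b) for a, b in zip(reference, fused))
```

The two paths agree up to floating-point rounding, which is why a fused layer can drop its normalization module without changing the output.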

Verification

NIH ChestX-ray14 official test split (1045-image subset)

| Configuration | Parameters | Mean AUROC | Latency (ms/image) |
|---|---|---|---|
| Baseline (densenet121-res224-all) | 6,966,034 | 0.7781 | 15.17 |
| Compressed | 3,383,248 (-51.43%) | 0.7785 (+0.0004) | 14.73 (-2.9%) |

Per-pathology max |Δ AUROC| = 0.0033 (Emphysema, positive direction); all per-pathology differences are within sampling noise.
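
One way to check a "within sampling noise" statement is a paired bootstrap over test images. The sketch below is a minimal pure-Python version, assuming a rank-based AUROC and a percentile interval for the AUROC difference between two score vectors. The helper names and the synthetic scores are hypothetical; this is not the evaluation script behind the numbers above.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_delta_ci(labels, scores_a, scores_b, n_boot=200, alpha=0.05, seed=0):
    """Percentile interval for AUROC(b) - AUROC(a), resampling images with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    deltas = []
    while len(deltas) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            a = auroc(ys, [scores_a[i] for i in idx])
            b = auroc(ys, [scores_b[i] for i in idx])
            deltas.append(b - a)
    deltas.sort()
    lo = deltas[int((alpha / 2) * n_boot)]
    hi = deltas[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic stand-in data: 150 images, one pathology, two nearly identical models.
rng = random.Random(42)
labels = [1 if i % 3 == 0 else 0 for i in range(150)]
baseline = [0.8 * y + rng.gauss(0, 0.5) for y in labels]
compressed = [s + rng.gauss(0, 1e-3) for s in baseline]  # tiny perturbation

delta = auroc(labels, compressed) - auroc(labels, baseline)
lo, hi = bootstrap_delta_ci(labels, baseline, compressed)
```

If the interval (lo, hi) contains 0, the observed difference is not distinguishable from resampling variation at the chosen level. Resampling the same image indices for both models keeps the comparison paired.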

Choice of baseline checkpoint

We compared all 5 torchxrayvision DenseNet121 checkpoints on the same NIH test subset. The multi-source densenet121-res224-all checkpoint is the strongest:

| Checkpoint | Mean AUROC |
|---|---|
| densenet121-res224-all | 0.7781 |
| densenet121-res224-nih | 0.7524 |
| densenet121-res224-chex | 0.7425 |
| densenet121-res224-mimic_ch | 0.7178 |
| densenet121-res224-mimic_nb | 0.7049 |

Higher published NIH-only DenseNet121 numbers (e.g., 0.84) come from corpus-specific hyperparameter and augmentation tuning not part of the open torchxrayvision release.

Threshold calibration (Youden-J)

The default decision threshold of 0.5 is overly conservative for this multi-label model. Choosing a per-class threshold by Youden's J on a held-out validation set shifts the cohort-average operating point:

| Setting | Mean F1 | Mean Recall |
|---|---|---|
| Default threshold 0.5 | 0.127 | 0.111 |
| Youden-J calibrated | 0.20 | 0.78 |

Caveat: this trades precision for recall sharply. The best-performing classes are degraded (Cardiomegaly: precision 1.0 → 0.11, F1 0.57 → 0.20; Mass: F1 0.25 → 0.07). The mean F1 is dominated by previously zero-recall classes (Infiltration, Atelectasis). For deployment, F1-optimal thresholds or explicit clinical precision floors are preferable.
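
The calibration step can be sketched as follows. Youden's J is J = TPR - FPR, and the per-class threshold is the score value that maximizes it. This is a minimal pure-Python illustration on toy scores for a single class; the function names and data are assumptions, and in practice the threshold would be chosen on validation data and then applied to the test set.

```python
def youden_threshold(labels, scores):
    """Return (threshold, J) maximizing J = TPR - FPR; positive when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def precision_recall_f1(labels, scores, threshold):
    """Precision, recall and F1 at a fixed threshold (0.0 where undefined)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy class with low, poorly calibrated probabilities: nothing reaches 0.5.
labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.22, 0.25, 0.30, 0.40]

default_metrics = precision_recall_f1(labels, scores, 0.5)
threshold, j_stat = youden_threshold(labels, scores)
calibrated_metrics = precision_recall_f1(labels, scores, threshold)
```

On this toy class the default threshold yields zero recall, while the Youden-J threshold (0.25) recovers both positives at the cost of one false positive. It mirrors the trade-off in the caveat above: recall rises because the threshold drops, and precision pays for it.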

Surgical correction (representative Cardiomegaly false positive)

| K (channels zeroed) | Target prob | TP loss | AUROC Δ on other 13 pathologies |
|---|---|---|---|
| 0 (baseline) | 0.89 | - | - |
| 5 | 0.76 | 0 | exactly 0 |
| 10 | 0.67 | 4 | exactly 0 |
| 20 | 0.54 | 7 | exactly 0 |

At K=5 the decision (threshold 0.5) is not flipped; the correction is a soft probability reduction, not a hard decision change. Even K=20 leaves the target probability at 0.54, just above the boundary, and already costs 7 true positives. Treat surgical correction as a confidence-shaping tool, not a binary error eraser.

The exact-zero AUROC isolation guarantee on the other 13 pathologies holds by construction (only one classifier row is modified).
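
The isolation argument can be made concrete with a toy linear classifier. In the sketch below, zeroing weights in one row changes only that row's logit, so every other output is bit-identical. The channel-selection rule (largest positive weight × activation for the false-positive sample) is an assumption chosen for illustration; the README does not state how the K channels are picked, and all names and values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logits(weights, biases, features):
    """Linear classifier head: one logit per row of the weight matrix."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def zero_top_k(weights, features, target, k):
    """Copy weights and zero the k largest contributions to the target logit."""
    contrib = [w * f for w, f in zip(weights[target], features)]
    top = sorted(range(len(contrib)), key=lambda c: contrib[c], reverse=True)[:k]
    edited = [row[:] for row in weights]  # leave the original weights untouched
    for c in top:
        edited[target][c] = 0.0
    return edited


# Toy head: 3 classes, 6 pooled feature channels for one false-positive image.
features = [1.0, 0.5, 2.0, 0.0, 1.5, 0.8]
weights = [
    [0.9, 0.4, 0.6, -0.3, 0.2, -0.5],   # target class (row 0)
    [-0.2, 0.1, 0.3, 0.7, -0.4, 0.2],
    [0.05, -0.6, 0.1, 0.2, 0.3, -0.1],
]
biases = [-0.5, 0.1, -0.2]

before = logits(weights, biases, features)
edited = zero_top_k(weights, features, target=0, k=1)
after = logits(edited, biases, features)

p_before = sigmoid(before[0])
p_after = sigmoid(after[0])
others_unchanged = after[1:] == before[1:]
```

Here the target probability drops but stays above 0.5, a soft reduction like the K=5 row, while the other logits are exactly equal because their rows were never edited. Identical logits imply identical rankings, hence an AUROC change of exactly zero for those classes.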

Method Disclosure

Compression method specifics are proprietary; the foundational procedure is covered by Korean patent applications. The released artifacts (weights, inference code, downstream analysis scripts) are sufficient for reproduction of the reported results.

Base Model

torchxrayvision densenet121-res224-all (DenseNet121 trained on NIH ChestX-ray14, CheXpert, MIMIC-CXR, PadChest).

Citation

Zenodo preprint: 10.5281/zenodo.20131680

@misc{cynn2026chexnet,
  title={Lossless Mechanistic Compression and Surgical Correction of Medical Imaging Models},
  author={Cynn, Yeonseong},
  year={2026},
  publisher={Zenodo},
  doi={10.5281/zenodo.20131680},
  url={https://doi.org/10.5281/zenodo.20131680}
}

License

MIT for the released code; the underlying compression method is proprietary (see Method Disclosure above).

Contact

For questions or commercial inquiries: whitepep@gmail.com
