Update README.md
README.md
CHANGED
|
@@ -13,8 +13,9 @@ tags:
|
|
| 13 |
- medgemma
|
| 14 |
- medsiglip
|
| 15 |
datasets:
|
| 16 |
-
-
|
| 17 |
-
-
|
|
|
|
| 18 |
base_model:
|
| 19 |
- google/medgemma-1.5-4b-it
|
| 20 |
- google/medsiglip-448
|
|
@@ -38,8 +39,8 @@ BrainGemma3D is a **multimodal vision-language model** that generates clinically
|
|
| 38 |
|
| 39 |
## 🎯 Key Features
|
| 40 |
|
| 41 |
-
- **🔬 Native 3D Processing**: Inflated 2D medical vision encoder ([MedSigLIP](https://huggingface.co/google/medsiglip-
|
| 42 |
-
- **Clinical Accuracy**: 95.1% F1 score on pathology entity recognition (BraTS
|
| 43 |
- **🧠 Spatial Awareness**: 68.9% laterality F1 (correct left/right hemisphere localization)
|
| 44 |
- **Interpretable**: LIME-based 3D attribution maps show which brain regions drive predictions
|
| 45 |
- **Efficient**: Processes full 3D volumes with 32 compressed visual tokens
|
|
@@ -52,7 +53,7 @@ BrainGemma3D is a **multimodal vision-language model** that generates clinically
|
|
| 52 |
BrainGemma3D combines:
|
| 53 |
|
| 54 |
1. **3D Vision Encoder**: MedSigLIP inflated to 3D via center-frame initialization (Conv2D → Conv3D)
|
| 55 |
-
*Base model: [google/medsiglip-
|
| 56 |
|
| 57 |
2. **Token Compressor**: 2-layer Perceiver that reduces 3D patches to 32 visual tokens
|
| 58 |
|
|
@@ -193,8 +194,8 @@ BrainGemma3D is trained in **three progressive stages** to prevent catastrophic
|
|
| 193 |
- **Epochs**: 100
|
| 194 |
|
| 195 |
**Dataset**:
|
| 196 |
-
- 369 BraTS
|
| 197 |
-
- 99 healthy control scans with synthetic reports
|
| 198 |
- Stratified group-based splits (70% train / 10% val / 20% test) to prevent patient leakage
|
| 199 |
|
| 200 |
---
|
|
@@ -256,7 +257,7 @@ weights, wvol = run_interpretability(
|
|
| 256 |
|
| 257 |
<div align="left">
|
| 258 |
<img src="https://cdn-uploads.huggingface.co/production/uploads/662a12d70951c58269b066fb/UkQwmZRwkn-rlNlFBNVkH.png" alt="LIME Interpretability" width="80%">
|
| 259 |
-
<p><i>Figure
|
| 260 |
</div>
|
| 261 |
|
| 262 |
---
|
|
@@ -302,7 +303,7 @@ weights, wvol = run_interpretability(
|
|
| 302 |
|
| 303 |
## 🏥 Clinical Validation Notes
|
| 304 |
|
| 305 |
-
BrainGemma3D achieved **95.1% pathology F1** on the BraTS
|
| 306 |
|
| 307 |
1. **Dataset Homogeneity**: BraTS contains predominantly glioblastomas; performance on other tumor types (meningiomas, metastases) is unknown
|
| 308 |
2. **Report Quality**: Ground truth reports are from a single institution and may not generalize to other radiology practices
|
|
@@ -324,8 +325,7 @@ This project was developed by:
|
|
| 324 |
|
| 325 |
### Built With
|
| 326 |
- [Google MedGemma](https://huggingface.co/google/medgemma-1.5-4b-it) - Medical domain language model
|
| 327 |
-
- [Google MedSigLIP](https://huggingface.co/google/medsiglip-
|
| 328 |
-
- [BraTS 2021](https://www.med.upenn.edu/cbica/brats2021/) - Brain tumor segmentation dataset
|
| 329 |
- [Hugging Face Transformers](https://huggingface.co/docs/transformers) - Model framework
|
| 330 |
|
| 331 |
---
|
|
@@ -333,4 +333,4 @@ This project was developed by:
|
|
| 333 |
<div align="center">
|
| 334 |
<p><i>Built with ❤️ for the <a href="https://www.kaggle.com/competitions/med-gemma-impact-challenge/overview">MedGemma Impact Challenge</a></i></p>
|
| 335 |
<p><i>Advancing Medical AI with Google's Health AI Developer Foundations</i></p>
|
| 336 |
-
</div>
|
|
|
|
| 13 |
- medgemma
|
| 14 |
- medsiglip
|
| 15 |
datasets:
|
| 16 |
+
- BraTS2020
|
| 17 |
+
- TextBraTS2021
|
| 18 |
+
- MPI-Leipzig_Mind-Brain-Body
|
| 19 |
base_model:
|
| 20 |
- google/medgemma-1.5-4b-it
|
| 21 |
- google/medsiglip-448
|
|
|
|
| 39 |
|
| 40 |
## 🎯 Key Features
|
| 41 |
|
| 42 |
+
- **🔬 Native 3D Processing**: Inflated 2D medical vision encoder ([MedSigLIP](https://huggingface.co/google/medsiglip-448)) to 3D for volumetric understanding
|
| 43 |
+
- **Clinical Accuracy**: 95.1% F1 score on pathology entity recognition (on the BraTS dataset)
|
| 44 |
- **🧠 Spatial Awareness**: 68.9% laterality F1 (correct left/right hemisphere localization)
|
| 45 |
- **Interpretable**: LIME-based 3D attribution maps show which brain regions drive predictions
|
| 46 |
- **Efficient**: Processes full 3D volumes with 32 compressed visual tokens
|
|
|
|
| 53 |
BrainGemma3D combines:
|
| 54 |
|
| 55 |
1. **3D Vision Encoder**: MedSigLIP inflated to 3D via center-frame initialization (Conv2D → Conv3D)
|
| 56 |
+
*Base model: [google/medsiglip-448](https://huggingface.co/google/medsiglip-448)*
|
| 57 |
|
| 58 |
2. **Token Compressor**: 2-layer Perceiver that reduces 3D patches to 32 visual tokens
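
The center-frame initialization named in item 1 can be illustrated with a small, framework-free sketch. It assumes one common reading of the technique: the 2D kernel is copied into the center depth slice of the 3D kernel and every other slice starts at zero. The function names and toy values here are hypothetical, not taken from the BrainGemma3D code, and a real implementation would operate on the encoder's weight tensors instead of nested lists.

```python
def inflate_center_frame(kernel_2d, depth):
    """Inflate an H x W kernel to depth x H x W.

    The 2D weights are copied into the center depth slice; all other
    slices start at zero (an assumed reading of center-frame init).
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    height, width = len(kernel_2d), len(kernel_2d[0])
    center = depth // 2
    kernel_3d = []
    for d in range(depth):
        if d == center:
            kernel_3d.append([list(row) for row in kernel_2d])
        else:
            kernel_3d.append([[0.0] * width for _ in range(height)])
    return kernel_3d


def dot_2d(kernel, patch):
    """Response of a 2D kernel on one same-sized 2D patch."""
    return sum(k * p for k_row, p_row in zip(kernel, patch)
               for k, p in zip(k_row, p_row))


def dot_3d(kernel, patch):
    """Response of a 3D kernel on one same-sized 3D patch."""
    return sum(dot_2d(k_slice, p_slice)
               for k_slice, p_slice in zip(kernel, patch))


# Toy 2x2 kernel and a 3-slice volume patch (illustrative values only).
kernel_2d = [[1.0, -1.0], [0.5, 2.0]]
volume_patch = [
    [[9.0, 9.0], [9.0, 9.0]],
    [[1.0, 2.0], [3.0, 4.0]],
    [[7.0, 7.0], [7.0, 7.0]],
]
kernel_3d = inflate_center_frame(kernel_2d, depth=3)

# Right after inflation, the 3D response equals the 2D response on the
# center slice, so the inflated layer starts from the pretrained behavior.
assert dot_3d(kernel_3d, volume_patch) == dot_2d(kernel_2d, volume_patch[1])
```

Under this assumption the inflated encoder initially ignores neighboring slices and reproduces the 2D encoder on the center slice; the off-center weights only become non-zero through training.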
|
| 59 |
|
|
|
|
| 194 |
- **Epochs**: 100
|
| 195 |
|
| 196 |
**Dataset**:
|
| 197 |
+
- 369 [BraTS 2020](https://www.kaggle.com/datasets/awsaf49/brats20-dataset-training-validation) brain tumor MRI cases with radiologist-written reports from [TextBraTS 2021](https://github.com/Jupitern52/TextBraTS)
|
| 198 |
+
- 99 healthy control scans with synthetic reports from [MPI-Leipzig Mind-Brain-Body](https://openneuro.org/datasets/ds000221/versions/00002)
|
| 199 |
- Stratified group-based splits (70% train / 10% val / 20% test) to prevent patient leakage
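
A stratified, group-based split like the one described in the last bullet can be sketched as follows. This is a minimal illustration and not the project's actual splitting code: it assumes each patient is one group with a single class label, and the patient IDs, function name, and seed are hypothetical. Only the class sizes (369 and 99) and the 70/10/20 ratios come from the README.

```python
import random
from collections import defaultdict


def stratified_group_split(patient_labels, fractions=(0.7, 0.1, 0.2), seed=0):
    """Split patient IDs into train/val/test, stratified by label.

    patient_labels maps patient ID -> class label (one label per patient).
    Assigning whole patients keeps all scans of a patient in one split.
    """
    by_label = defaultdict(list)
    for patient_id, label in patient_labels.items():
        by_label[label].append(patient_id)

    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_label):
        patients = sorted(by_label[label])
        rng.shuffle(patients)
        n_train = round(fractions[0] * len(patients))
        n_val = round(fractions[1] * len(patients))
        splits["train"] += patients[:n_train]
        splits["val"] += patients[n_train:n_train + n_val]
        # The remainder goes to test, so no patient is dropped.
        splits["test"] += patients[n_train + n_val:]
    return splits


# Hypothetical IDs; only the class sizes come from the README.
patient_labels = {f"tumor_{i:03d}": "tumor" for i in range(369)}
patient_labels.update({f"healthy_{i:03d}": "healthy" for i in range(99)})

splits = stratified_group_split(patient_labels)
print({name: len(ids) for name, ids in splits.items()})
```

Splitting within each class keeps the tumor-to-healthy ratio roughly equal across the three sets, and assigning whole patients means no patient contributes scans to more than one set.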
|
| 200 |
|
| 201 |
---
|
|
|
|
| 257 |
|
| 258 |
<div align="left">
|
| 259 |
<img src="https://cdn-uploads.huggingface.co/production/uploads/662a12d70951c58269b066fb/UkQwmZRwkn-rlNlFBNVkH.png" alt="LIME Interpretability" width="80%">
|
| 260 |
+
<p><i>Figure 1: LIME attribution maps for a BraTS sample. Red regions show supervoxels that positively contribute to pathology predictions. The model correctly focuses on tumor-affected areas in the left parietal and frontal lobes.</i></p>
|
| 261 |
</div>
|
| 262 |
|
| 263 |
---
|
|
|
|
| 303 |
|
| 304 |
## 🏥 Clinical Validation Notes
|
| 305 |
|
| 306 |
+
BrainGemma3D achieved **95.1% pathology F1** on the BraTS dataset, but this does NOT imply clinical readiness. Key considerations:
|
| 307 |
|
| 308 |
1. **Dataset Homogeneity**: BraTS contains predominantly glioblastomas; performance on other tumor types (meningiomas, metastases) is unknown
|
| 309 |
2. **Report Quality**: Ground truth reports are from a single institution and may not generalize to other radiology practices
|
|
|
|
| 325 |
|
| 326 |
### Built With
|
| 327 |
- [Google MedGemma](https://huggingface.co/google/medgemma-1.5-4b-it) - Medical domain language model
|
| 328 |
+
- [Google MedSigLIP](https://huggingface.co/google/medsiglip-448) - Medical vision encoder
|
|
|
|
| 329 |
- [Hugging Face Transformers](https://huggingface.co/docs/transformers) - Model framework
|
| 330 |
|
| 331 |
---
|
|
|
|
| 333 |
<div align="center">
|
| 334 |
<p><i>Built with ❤️ for the <a href="https://www.kaggle.com/competitions/med-gemma-impact-challenge/overview">MedGemma Impact Challenge</a></i></p>
|
| 335 |
<p><i>Advancing Medical AI with Google's Health AI Developer Foundations</i></p>
|
| 336 |
+
</div>
|