---
license: apache-2.0
library_name: pytorch
tags:
- earth-observation
- remote-sensing
- semantic-segmentation
- glacial-lakes
- GLOF
- foundation-model
- terramind
- multimodal
- sentinel-2
- sentinel-1
- copernicus-dem
- high-mountain-asia
- tien-shan
- climate-risk
datasets:
- abzal-glw/cryosentinel-glof-v3
language:
- en
metrics:
- iou
- f1
pipeline_tag: image-segmentation
base_model: ibm-esa-geospatial/TerraMind-1.0-large
model-index:
- name: cryosentinel-terramind-v3
  results:
  - task:
      type: image-segmentation
      name: Glacial-lake semantic segmentation
    dataset:
      type: abzal-glw/cryosentinel-glof-v3
      name: CryoSentinel multimodal chips v3 (Almaty corridor)
      split: validation
    metrics:
    - type: iou
      value: 0.9557
      name: IoU @ thr 0.5, TTA
    - type: iou
      value: 0.9596
      name: IoU @ best thr 0.70, TTA
  - task:
      type: image-segmentation
      name: Glacial-lake semantic segmentation
    dataset:
      type: abzal-glw/cryosentinel-glof-v3
      name: CryoSentinel multimodal chips v3 (Almaty corridor)
      split: test
    metrics:
    - type: iou
      value: 0.8918
      name: IoU @ thr 0.5, TTA (raw)
    - type: iou
      value: 0.9082
      name: IoU @ thr 0.5, TTA (label-corrected, n=658)
---

# CryoSentinel: TerraMind-v3 Almaty Corridor Soup

A foundation-model semantic-segmentation checkpoint for glacial lakes from joint Sentinel-1 SAR, Sentinel-2 optical, and Copernicus DEM imagery in High Mountain Asia. Released under Apache 2.0 as the supervision component of the [GLOFcast](https://github.com/abzalabdrash/glofcast) operational early-warning system.

This model card documents the **Stage 4b v2 production soup**: a uniform weight-space average of five SWA snapshots from the Almaty-corridor finetune. It is the recommended checkpoint for inference.

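
The uniform weight-space average mentioned above can be sketched as an element-wise mean over the snapshots' parameter dictionaries. This is a minimal illustration, not the project's own souping code: it assumes each snapshot is a plain dict mapping parameter names to NumPy arrays, whereas the released checkpoints hold PyTorch tensors. The function `uniform_soup` and the toy parameter names are hypothetical.

```python
import numpy as np


def uniform_soup(snapshots):
    """Average a list of parameter dicts key by key with equal weights."""
    if not snapshots:
        raise ValueError("need at least one snapshot")
    keys = snapshots[0].keys()
    for snap in snapshots[1:]:
        if snap.keys() != keys:
            raise ValueError("snapshots must share the same parameter names")
    # Stack each parameter across snapshots and take the mean over that axis.
    return {
        name: np.mean([snap[name] for snap in snapshots], axis=0)
        for name in keys
    }


# Toy stand-ins for five snapshots; names and values are illustrative only.
snapshots = [
    {"head.weight": np.full((2, 2), float(i)), "head.bias": np.array([float(i)])}
    for i in range(1, 6)
]
soup = uniform_soup(snapshots)
print(soup["head.bias"])  # element-wise mean of the five toy biases
```

Uniform weighting means no snapshot is favoured over another, so the result does not depend on the order in which the snapshots are listed.
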
## Key results

| Metric | Value |
|---|---:|
| Validation IoU @ thr = 0.5, TTA on (n = 666) | **0.9557** |
| Validation IoU @ best thr (0.70), TTA on | 0.9596 |
| Held-out test IoU @ thr = 0.5, TTA on (n = 665) | 0.8918 |
| Held-out test IoU, label-corrected (n = 658) | **0.9082** |

Comparison to literature, using the same train/val protocol as Adhikari & Regmi (2025), arXiv:2512.24117:

| Reference | Modality | Val IoU |
|---|---|---:|
| Adhikari & Regmi (2025) | S1 only | 0.9130 |
| **CryoSentinel (this model)** | **S1 + S2 + DEM** | **0.9557** |

## Model details

- **Architecture**: TerraMind 1.0 Large encoder (1.1 B parameters; Jakubik et al., 2025) + UperNet decoder (channel sequence 256 → 128 → 64 → 32; Xiao et al., 2018) with a `LearnedInterpolateToPyramidal` neck.
- **Inputs**: 224 × 224 chips at 10 m/pixel:
  - 12 Sentinel-2 L2A bands (B01–B12 minus B10), uint16 reflectance
  - 2 Sentinel-1 GRD polarisations (VV, VH), float32 dB
  - 1 Copernicus DEM 30 m band, int16 elevation in metres
- **Output**: per-pixel water sigmoid logit at the input resolution.
- **Training**: 30 epochs Stage 4a HMA pretrain → 30 epochs Stage 4b Almaty-corridor finetune.
- **Soup**: uniform mean of five SWA snapshots (epochs 12, 14, 17, 22, 30) per Wortsman et al. (2022). EMA decay 0.999. Flip-only TTA at validation and test time.
- **Loss**: BCE (pos_weight = 100) + flat Dice + per-image Lovász softmax + asymmetric Tversky (α = 0.25, β = 0.75) + Focal (γ = 2) + Boundary, with OHEM hard-negative mining.
- **Train compute**: ~12 hours on a Modal H100 80 GB, ~$50 in cloud credits. Robust checkpoint resume via `_sanitize_resume_ckpt`.

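
To illustrate what the asymmetric Tversky term in the loss does, here is a minimal NumPy sketch. It assumes the common convention in which α weights false positives and β weights false negatives; the model card does not state which convention the training code follows, so treat the argument order as an assumption. The function `tversky_loss` is hypothetical and is not taken from the CryoSentinel codebase.

```python
import numpy as np


def tversky_loss(probs, target, alpha=0.25, beta=0.75, eps=1e-7):
    """Return 1 - Tversky index for a binary mask.

    Assumption: alpha weights false positives and beta weights false
    negatives. The training code may use the opposite convention.
    """
    probs = np.asarray(probs, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    tp = np.sum(probs * target)            # soft true positives
    fp = np.sum(probs * (1.0 - target))    # soft false positives
    fn = np.sum((1.0 - probs) * target)    # soft false negatives
    index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - index


target = np.array([[1, 1], [0, 0]])
missed_lake = np.array([[1, 0], [0, 0]])  # one false negative
extra_water = np.array([[1, 1], [1, 0]])  # one false positive
print(tversky_loss(missed_lake, target))  # penalised more under this convention
print(tversky_loss(extra_water, target))  # penalised less under this convention
```

Under this convention, β > α makes a missed water pixel cost more than a spurious one, which pushes the model toward recall on small lakes.
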
## Files

```
terramind_v3_finetune_almaty_v2/
  checkpoints/
    soup.ckpt                  ← recommended for inference (this card)
    step001605-iou0.952.ckpt   ← single-best fallback
    last.ckpt                  ← resume-from-here
  diagnostics/
    per_chip_val.parquet
    per_chip_test.parquet
    per_region_breakdown.json
    threshold_sweep.json
  logs/
    metrics.csv
  config.yaml
```

## Intended uses

**Direct uses.** Operational glacial-lake monitoring in High Mountain Asia. Multi-year change detection by running per-year inference and differencing masks. Research on multi-modal foundation-model fine-tuning, label-noise robustness, and snapshot averaging in segmentation.

**Downstream uses.** Component of GLOF early-warning systems (CryoSentinel produces the lake-area inputs; the breach forecast lives in [GLOFcast](https://github.com/abzalabdrash/glofcast)). Per-lake hazard scoring pipelines that consume per-pixel water masks. Comparative benchmarking against new foundation models on the public test split.

**Out of scope.** General water mapping (rivers, irrigation reservoirs, coastal water). The model is intentionally trained to suppress non-glacial water signatures; see `docs/LIMITATIONS.md` § 4. Areas outside HMA: no out-of-domain validation has been performed.

## Limitations

The full list is in `docs/LIMITATIONS.md`; the headline items:

- **Not a forecaster.** Segmentation only; no breach probability or time-to-failure output.
- **HMA only.** Twelve sub-regions trained, no Andes / Alps / Caucasus / Patagonia validation.
- **Sub-hectare lakes are unreliable.** Minimum trustworthy area ~0.5 ha (≈ 50 px at 10 m/px).
- **Single-snapshot.** No temporal modelling.
- **No uncertainty quantification.** Single sigmoid logit per pixel.
- **Single-seed.** All numbers from one run with `seed = 42`. Multi-seed variance (3 seeds) shipping in v1.1, June 2026; budget allocated.
- **Not affiliated.** CryoSentinel is a research-stage prototype, not affiliated with «Казселезащита», UNESCO GLOFCA, or any other public agency.
  See `docs/LIMITATIONS.md` § 9.

## Bias, risks, and ethical considerations

The model is trained on a Kumar & Vijay (2026) inventory that integrates Landsat-8, Sentinel-1, and Sentinel-2 imagery with manual quality control. Documented bias modes:

- **Label undersizing** on a small fraction (~1 %) of chips. We have audited seven instances in the test split where the Kumar polygon undersizes a real lake; the model correctly identifies the larger water body in all seven (see `docs/LABEL_NOISE_AUDIT.md`). This bias is documented but not corrected at training time, because the inverse direction (chips where Kumar oversizes a small puddle) would require an audit we have not performed.
- **Cloud climatology bias.** The training imagery is selected from late-summer composites with cloud cover < 30 %. Performance on persistently cloudy regions (eastern Himalaya monsoon, Andes Pacific slope) will degrade; we have not quantified by how much.
- **Operational misuse.** The biggest risk we are aware of is treating CryoSentinel as a forecaster rather than a segmenter. We document this prominently in the README and `LIMITATIONS.md`. If you find the model deployed in a way that conflates it with a breach forecaster, please flag it.

## How to use

### Hugging Face Hub

```python
from huggingface_hub import hf_hub_download
import torch

# Download the recommended soup checkpoint from the Hub (cached locally).
ckpt_path = hf_hub_download(
    repo_id="abzal-glw/cryosentinel-terramind-v3",
    filename="terramind_v3_finetune_almaty_v2/checkpoints/soup.ckpt",
)
# weights_only=False unpickles arbitrary Python objects;
# only load checkpoints from a source you trust.
state = torch.load(ckpt_path, map_location="cpu", weights_only=False)
# Pass to a CryoSentinel Lightning module configured with
# configs/terramind_v3_finetune_almaty_v2.yaml
```

### One-line evaluation

```bash
git clone https://github.com/abzalabdrash/Cryosentinel.git
cd Cryosentinel && pip install -e .
bash scripts/reproduce_benchmarks.sh
```

Reproduces the benchmark table to four decimal places. ~25 min, ~$3 on Modal H100. See `docs/REPRODUCING.md` for local-GPU notes.

### Inference on released chips

```bash
modal run scripts/predict_modal.py::predict \
  --hf-data-repo abzal-glw/cryosentinel-glof-v3 \
  --hf-ckpt-repo abzal-glw/cryosentinel-terramind-v3 \
  --ckpt-uri hf://abzal-glw/cryosentinel-terramind-v3/terramind_v3_finetune_almaty_v2/checkpoints/soup.ckpt \
  --config configs/terramind_v3_finetune_almaty_v2.yaml \
  --region ile_alatau --year 2023 --n-chips 30 \
  --gpu L4
```

Input chip layout: 12 S2 L2A bands → 2 S1 GRD bands (VV, VH dB) → 1 Copernicus DEM 30 m band, all at 10 m/pixel in a 224 × 224 GeoTIFF.

## Training data

`abzal-glw/cryosentinel-glof-v3`: 42,237 chips across 12 HMA sub-regions × 4 years (2017, 2021, 2022, 2023), each 224 × 224 pixels at 10 m/pixel. 30.4 GiB on disk. Built from Kumar & Vijay (2026) labels + ESA Copernicus Sentinel-1/2 + Copernicus DEM 30 m. Publicly downloadable on Hugging Face under ODC-By 1.0; see `docs/DATA.md` for download instructions and full preprocessing details.

Stage 4b finetune split (after spatial-block-split with 0.15° blocks and 0.02° / 2.2 km buffer):

| Split | Chips | Share |
|---|---:|---:|
| Train | 4,283 | 76 % |
| Val | 666 | 12 % |
| Test | 665 | 12 % |

The split is hash-based, year-invariant, and rules out spatial leakage by construction. See `docs/METHOD.md` § 4.

## Evaluation protocol

- Production checkpoint: `soup.ckpt` (uniform mean of five SWA snapshots).
- TTA: flip-only (horizontal + vertical, four passes averaged in logit space).
- EMA: decay 0.999, applied at validation and test time.
- Threshold: 0.5 (default headline) and 0.70 (best-on-val).
- Metric: global IoU computed as (sum of intersection) / (sum of union) across all chips in the split. Per-chip mean IoU also reported as a robustness check.

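
The protocol above can be sketched in a few lines of NumPy. This is an illustrative reading of the bullet points, not the repository's evaluation code: `flip_tta_logits`, `global_iou`, and `mean_chip_iou` are hypothetical names, the toy predictor stands in for the real model, and two choices are assumptions: skipping chips whose union is empty in the per-chip mean, and applying the threshold to the sigmoid of the averaged logits.

```python
import numpy as np


def flip_tta_logits(predict_logits, chip):
    """Average logits over identity, horizontal, vertical, and double flips.

    `predict_logits` is a hypothetical callable mapping a (C, H, W) array to
    an (H, W) logit map. Each prediction is flipped back before averaging.
    """
    flips = [(), (2,), (1,), (1, 2)]  # axes of `chip` to flip in each pass
    total = np.zeros(chip.shape[1:], dtype=np.float64)
    for axes in flips:
        flipped = np.flip(chip, axis=axes) if axes else chip
        logits = predict_logits(flipped)
        # The output has no channel axis, so chip axes 1, 2 become 0, 1.
        out_axes = tuple(a - 1 for a in axes)
        total += np.flip(logits, axis=out_axes) if out_axes else logits
    return total / len(flips)


def global_iou(preds, targets):
    """(Sum of intersections) / (sum of unions) over all chips."""
    inter = sum(np.logical_and(p, t).sum() for p, t in zip(preds, targets))
    union = sum(np.logical_or(p, t).sum() for p, t in zip(preds, targets))
    return float(inter / union) if union else float("nan")


def mean_chip_iou(preds, targets):
    """Mean of per-chip IoU; chips with an empty union are skipped (assumption)."""
    scores = []
    for p, t in zip(preds, targets):
        union = np.logical_or(p, t).sum()
        if union:
            scores.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(scores)) if scores else float("nan")


# Toy predictor (hypothetical): treats channel 0 as the water logit.
chip = np.random.default_rng(0).normal(size=(15, 8, 8))
logits = flip_tta_logits(lambda x: x[0], chip)
probs = 1.0 / (1.0 + np.exp(-logits))
mask = probs >= 0.5  # headline threshold; 0.70 is the best-on-val value

preds = [np.array([[1, 1], [0, 0]], dtype=bool),
         np.array([[1, 0], [0, 0]], dtype=bool)]
targets = [np.array([[1, 0], [0, 0]], dtype=bool),
           np.array([[1, 0], [0, 0]], dtype=bool)]
print(global_iou(preds, targets), mean_chip_iou(preds, targets))
```

Global IoU weights every pixel equally, so large lakes dominate it, while the per-chip mean gives each chip the same weight. Reporting both shows whether a headline number is carried by a few large lakes.
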
## Citation

```bibtex
@software{abdrash2026cryosentinel,
  title   = {CryoSentinel: A Foundation-Model Glacial Lake Segmenter for High Mountain Asia},
  author  = {Abdrash, Abzal},
  year    = {2026},
  url     = {https://github.com/abzalabdrash/cryosentinel},
  version = {v1.0.0},
  license = {Apache-2.0},
  doi     = {10.5281/zenodo.20239229}
}

@dataset{abdrash2026cryosentinelglofv3,
  title     = {CryoSentinel-GLOF v3: Multimodal Glacial Lake Chips for High Mountain Asia},
  author    = {Abdrash, Abzal},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/datasets/abzal-glw/cryosentinel-glof-v3},
  doi       = {10.57967/hf/8823}
}
```

If you use CryoSentinel in operational early-warning, please also cite the underlying TerraMind backbone:

```bibtex
@article{jakubik2025terramind,
  title   = {TerraMind: Large-Scale Generative Multimodality for Earth Observation},
  author  = {Jakubik, Johannes and others},
  journal = {arXiv:2504.11171},
  year    = {2025}
}
```

And the Kumar & Vijay glacial-lake inventory:

```bibtex
@dataset{kumar2026hma,
  title     = {Inventory of Glacial Lakes in High Mountain Asia for the Years 2016 and 2022},
  author    = {Kumar, Ravindra and Vijay, Saurabh},
  year      = {2026},
  publisher = {PANGAEA},
  doi       = {10.1594/PANGAEA.983845}
}
```

## Acknowledgments

- **TerraMind** by IBM Research, the European Space Agency Φ-lab, and the FAST-EO project (Apache 2.0). Released April 2025.
- **TerraTorch** by IBM Research (Apache 2.0). The training framework that made the multi-modal patch-embedder dispatch and LLRD profile manageable.
- **Modal Labs** for the H100 80 GB compute.
- **Hugging Face** for hosting the model checkpoints and the training dataset.
- The **«Казселезащита»** institutional GLOF early-warning service in Almaty, Kazakhstan, and the **UNESCO GLOFCA** programme, the institutional context this work is positioned to complement, not replace. CryoSentinel is not affiliated with either organisation; this acknowledgment recognises their decades of operational work on the same problem.

## Contact

Open an issue at https://github.com/abzalabdrash/Cryosentinel/issues for bug reports, reproducibility checks, or out-of-domain validation results.