---
library_name: transformers
tags:
- resnet
- SAR
- RADAR
- EO
- backbone
- ocean
- wind
- sentinel-1
license: apache-2.0
pipeline_tag: image-feature-extraction
---

# Model Card for OceanSAR-1

## Model Details

### Model Description

OceanSAR-1 is the first foundation model in the OceanSAR family. It is designed for Synthetic Aperture Radar (SAR) imagery analysis, with a focus on ocean observation. The model is trained with a novel dynamic dataset pruning strategy that improves training efficiency and feature quality.

- **Developed by:** Thomas Kerdreux, Alexandre Tuel @ [Galeio](http://galeio.fr)
- **Deployed by:** Antoine Audras @ [Galeio](http://galeio.fr)
- **Model type:** Vision Foundation Model (ResNet50/ViT variants)
- **License:** Apache License 2.0
- **Training data:** Sentinel-1 Wave Mode (WV) SAR images (2015-2024)
- **Training regime:** DINO self-supervised learning with dynamic dataset pruning

## Uses

### Direct Use

The model is intended to be used as a feature extractor for SAR image analysis, particularly for ocean observation tasks. It can be used for:

- Feature extraction from SAR images
- Transfer learning for downstream tasks

### Downstream Use

The model has been validated on three downstream tasks:

1. **TenGeoP Classification**: Classification of 10 geophysical phenomena in SAR images
2. **Significant Wave Height Estimation**: Regression task for ocean wave height prediction
3. **Wind Speed Prediction**: Regression task for surface wind speed estimation

## How to Use

```python
import torch
from transformers import AutoModel

# Load the pretrained backbone
model = AutoModel.from_pretrained("galeio-research/OceanSAR-1")

# Prepare your SAR image (should be single-channel VV polarization)
# Here using random data as an example
dummy_image = torch.randn(1, 1, 256, 256)  # (batch, channels, height, width)

# Extract features without tracking gradients
with torch.no_grad():
    outputs = model(dummy_image)
    features = outputs.pooler_output  # Shape: (1, 2048) for ResNet50
```

## Training Details

### Training Data

- **Dataset:** Sentinel-1 Wave Mode (WV) SAR images
- **Time period:** 2015-2024
- **Size:** ~12 million images
- **Preprocessing:**
  - Spatial downsampling to 50 m resolution
  - Dynamic dataset pruning for diversity and balancedness
  - Validation images excluded from the training set

### Dynamic Dataset Pruning

The model uses a novel dynamic dataset pruning strategy that:

- Maximizes dataset diversity and balancedness
- Reduces computational costs
- Improves model performance on downstream tasks
- Works without requiring a pre-existing feature extractor

## Evaluation

### Results

The model achieves state-of-the-art performance on three downstream tasks (linear probing):

1. **TenGeoP Classification**:
   - ResNet50: 75.5% accuracy
   - ViT-S/16: 78.6% accuracy
   - ViT-S/8: 82.1% accuracy
   - ViT-B/8: 83.6% accuracy
2. **Significant Wave Height Estimation**:
   - RMSE: 0.63-0.72 m (depending on architecture)
3. **Wind Speed Prediction**:
   - RMSE: 1.37-1.43 m/s (depending on architecture)

For commercial deployments or to access optimized model variants for specific operational needs, feel free to reach out to discuss licensing and support options.

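
### Linear Probing Sketch

The results above are reported under linear probing, where only a linear layer is trained on top of frozen backbone features. The snippet below is a minimal sketch of that protocol, not the authors' evaluation code. It assumes the features have already been extracted: the random `features` tensor stands in for `pooler_output` vectors (2048-dimensional for ResNet50), the labels are synthetic, and the names `probe`, `teacher`, the optimizer choice, and all hyperparameters are illustrative assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)

NUM_SAMPLES, FEATURE_DIM, NUM_CLASSES = 512, 2048, 10

# Stand-in for frozen backbone features (assumption: in practice these
# would be pooler_output vectors computed once with torch.no_grad()).
features = torch.randn(NUM_SAMPLES, FEATURE_DIM)

# Synthetic labels from a random linear "teacher", so the toy problem
# is linearly separable. Real labels would come from the downstream task.
teacher = torch.randn(FEATURE_DIM, NUM_CLASSES)
labels = (features @ teacher).argmax(dim=1)

# The probe is a single linear layer; the backbone is never updated.
probe = nn.Linear(FEATURE_DIM, NUM_CLASSES)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

losses = []
for step in range(100):
    optimizer.zero_grad()
    loss = criterion(probe(features), labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Training-set accuracy of the probe on the toy data
with torch.no_grad():
    predictions = probe(features).argmax(dim=1)
    accuracy = (predictions == labels).float().mean().item()

print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
print(f"train accuracy: {accuracy:.3f}")
```

For the regression tasks (significant wave height, wind speed), the same structure would apply with a single output unit and a mean-squared-error loss in place of cross-entropy.
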
## Technical Specifications

### Hardware Requirements

- GPU with at least 8GB VRAM recommended

### Dependencies

- PyTorch >= 1.8.0
- Transformers >= 4.30.0
- torchvision >= 0.9.0

### Input Specifications

- Input size: 256x256 pixels
- Single channel (VV polarization)
- Normalized pixel values
- SAR images from Sentinel-1 Wave Mode

## Citation

**BibTeX:**

```bibtex
@article{kerdreux2025efficientselfsupervisedlearningearth,
  title={Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation},
  author={Kerdreux, Thomas and Tuel, Alexandre and Febvre, Quentin and Mouche, Alexis and Chapron, Bertrand},
  journal={arXiv preprint arXiv:2504.06962},
  year={2025},
  eprint={2504.06962},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2504.06962},
}
```

## Acknowledgements

This work was granted access to the HPC resources of IDRIS and TGCC under the allocation 2025-[A0171015666] made by GENCI.