---
license: apache-2.0
library_name: diffusers
pipeline_tag: feature-extraction
base_model: ibm-esa-geospatial/TerraMind-1.0-Tokenizer-S1RTC
language:
- en
tags:
- terramind
- remote-sensing
- earth-observation
- tokenizer
- sentinel-1
- vq
- fsq
---

# TerraMind-1.0-Tokenizer-S1RTC

This repository provides the S1RTC tokenizer checkpoint from TerraMind 1.0.

## Model details

- Modality: S1RTC
- Input channels: `2`
- Image size: `256x256`
- Tokenizer backbone: `vit_b_enc`
- Quantization: `FSQ` (`codebook_size=8-8-8-6-5`, `latent_dim=5`)

## Quick use (diffusers-style)

The tokenizer uses native diffusers patterns: `ModelMixin`, `ConfigMixin`, `from_pretrained`, and `from_config`.

```python
from huggingface_hub import snapshot_download
import torch
import sys

# Download model repository
model_dir = snapshot_download("BiliSakura/TerraMind-1.0-Tokenizer-S1RTC")

# Expose local module, then load (diffusers-style)
sys.path.insert(0, model_dir)
from terramind_tokenizer import TerraMindTokenizer

# Load from path or Hub ID
tokenizer = TerraMindTokenizer.from_pretrained(
    model_dir,  # or "BiliSakura/TerraMind-1.0-Tokenizer-S1RTC"
    torch_dtype=torch.float32,
    device="cpu",
)

# S1RTC input: [B, 2, 256, 256]
x = torch.randn(1, 2, 256, 256)
tokens = tokenizer.tokenize(x)
print(tokens.shape)  # [1, 16, 16]

# Encode returns (quant, code_loss, tokens)
quant, code_loss, tokens = tokenizer.encode(x)
```

## Load via AutoModel or TerraMindTokenizer (trust_remote_code)

You can load the tokenizer through diffusers `AutoModel` with `trust_remote_code=True`, or through the `TerraMindTokenizer` class directly:

```python
from diffusers import AutoModel
import torch

# Option 1: AutoModel (auto-detects from config)
tokenizer = AutoModel.from_pretrained(
    "BiliSakura/TerraMind-1.0-Tokenizer-S1RTC",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    device="cpu",
)

# Option 2: TerraMindTokenizer (explicit class)
# The terramind_tokenizer module must be importable, e.g. via sys.path as in "Quick use".
from terramind_tokenizer import TerraMindTokenizer

tokenizer = TerraMindTokenizer.from_pretrained(
    "BiliSakura/TerraMind-1.0-Tokenizer-S1RTC",
    torch_dtype=torch.float32,
    device="cpu",
)

# Same API: tokenize(), encode()
x = torch.randn(1, 2, 256, 256)
tokens = tokenizer.tokenize(x)
```

> **Security:** `trust_remote_code=True` runs code downloaded from the Hub. Only use it with repositories you trust. For production, pin a specific revision, for example `revision="abc123def456"` (a placeholder; use the commit hash that includes your changes).

## Notes

- Uses diffusers `ModelMixin` and `ConfigMixin` for standard `from_pretrained` / `from_config` / `save_pretrained`.
- Supports both `model.safetensors` and `diffusion_pytorch_model.safetensors` weights.
- Tokenizer-focused API: `tokenize()`, `encode()`, and `forward()`.
- Please follow the TerraMind and TerraTorch licenses and usage terms from the upstream projects.

## Credits

- Original TerraMind project: https://github.com/IBM/terramind
- Original TerraMind model code (TerraTorch): https://github.com/terrastackai/terratorch/tree/main/terratorch/models/backbones/terramind
- This repository adapts tokenizer checkpoints for convenient Hugging Face usage.

## Citation

If you use TerraMind in your research, please cite:

```bibtex
@article{jakubik2025terramind,
  title={TerraMind: Large-Scale Generative Multimodality for Earth Observation},
  author={Jakubik, Johannes and Yang, Felix and Blumenstiel, Benedikt and Scheurer, Erik and Sedona, Rocco and Maurogiovanni, Stefano and Bosmans, Jente and Dionelis, Nikolaos and Marsocci, Valerio and Kopp, Niklas and others},
  journal={IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}
```