---
license: apache-2.0
library_name: diffusers
pipeline_tag: feature-extraction
base_model: ibm-esa-geospatial/TerraMind-1.0-Tokenizer-S1RTC
language:
  - en
tags:
  - terramind
  - remote-sensing
  - earth-observation
  - tokenizer
  - sentinel-1
  - vq
  - fsq
---

# TerraMind-1.0-Tokenizer-S1RTC

This repository provides the tokenizer checkpoint for the S1RTC modality (Sentinel-1 radiometrically terrain-corrected SAR) from TerraMind 1.0.

## Model details

- Modality: S1RTC
- Input channels: `2`
- Image size: `256x256`
- Tokenizer backbone: `vit_b_enc`
- Quantization: `FSQ` (`codebook_size=8-8-8-6-5`, `latent_dim=5`)
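
As a rough illustration of what the quantization setting implies, the sketch below reads `8-8-8-6-5` as the number of FSQ levels per latent dimension (an assumption consistent with `latent_dim=5`) and computes the resulting codebook size. The mixed-radix ordering and the helper names `index_to_digits` and `digits_to_index` are hypothetical and only show the idea; the checkpoint's own index layout may differ.

```python
from math import prod

# Assumed reading of "8-8-8-6-5": FSQ levels per latent dimension.
LEVELS = [8, 8, 8, 6, 5]


def codebook_size(levels):
    """Number of distinct codes: the product of the per-dimension levels."""
    return prod(levels)


def index_to_digits(index, levels):
    """Split a flat code index into one level per dimension.

    Assumption: the first dimension is the least significant digit.
    """
    digits = []
    for n in levels:
        digits.append(index % n)
        index //= n
    return digits


def digits_to_index(digits, levels):
    """Inverse of index_to_digits under the same assumed ordering."""
    index, base = 0, 1
    for d, n in zip(digits, levels):
        index += d * base
        base *= n
    return index


print(codebook_size(LEVELS))           # 15360
print(index_to_digits(15359, LEVELS))  # [7, 7, 7, 5, 4]
```

Under this reading, every token index falls in the range `0..15359`.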

## Quick use (diffusers-style)

The tokenizer class follows the standard diffusers conventions: it builds on `ModelMixin` and `ConfigMixin`, so it loads through `from_pretrained` and `from_config`.

```python
from huggingface_hub import snapshot_download
import torch
import sys

# Download model repository
model_dir = snapshot_download("BiliSakura/TerraMind-1.0-Tokenizer-S1RTC")

# Expose local module, then load (diffusers-style)
sys.path.insert(0, model_dir)
from terramind_tokenizer import TerraMindTokenizer

# Load from path or Hub ID
tokenizer = TerraMindTokenizer.from_pretrained(
    model_dir,  # or "BiliSakura/TerraMind-1.0-Tokenizer-S1RTC"
    torch_dtype=torch.float32,
    device="cpu",
)

# S1RTC input: [B, 2, 256, 256]
x = torch.randn(1, 2, 256, 256)
tokens = tokenizer.tokenize(x)
print(tokens.shape)  # [1, 16, 16]

# Encode returns (quant, code_loss, tokens)
quant, code_loss, tokens = tokenizer.encode(x)
```
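
The shapes in the snippet above imply that each token summarizes a 16x16 pixel window (256 / 16 = 16). The sketch below spells out that arithmetic in plain Python. It assumes non-overlapping square patches, which is inferred from the input and output shapes rather than stated by the checkpoint, and `token_pixel_box` is a hypothetical helper, not part of the tokenizer API.

```python
IMAGE_SIZE = 256  # input height and width, from the model details
GRID_SIZE = 16    # token grid height and width, from tokens.shape

# Assumption: non-overlapping square patches tile the image exactly.
PATCH_SIZE = IMAGE_SIZE // GRID_SIZE


def token_pixel_box(row, col):
    """Return (top, left, bottom, right) pixel bounds covered by one token.

    The bottom and right bounds are exclusive.
    """
    if not (0 <= row < GRID_SIZE and 0 <= col < GRID_SIZE):
        raise IndexError("token position outside the 16x16 grid")
    top, left = row * PATCH_SIZE, col * PATCH_SIZE
    return top, left, top + PATCH_SIZE, left + PATCH_SIZE


print(PATCH_SIZE)               # 16
print(GRID_SIZE * GRID_SIZE)    # 256
print(token_pixel_box(15, 15))  # (240, 240, 256, 256)
```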

## Load via AutoModel or TerraMindTokenizer (trust_remote_code)

You can load the tokenizer through diffusers `AutoModel` with `trust_remote_code=True`, or through the `TerraMindTokenizer` class directly:

```python
from diffusers import AutoModel
import torch

# Option 1: AutoModel (auto-detects from config)
tokenizer = AutoModel.from_pretrained(
    "BiliSakura/TerraMind-1.0-Tokenizer-S1RTC",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    device="cpu",
)

# Option 2: TerraMindTokenizer (explicit class)
# Requires the downloaded repo directory on sys.path (see "Quick use" above)
from terramind_tokenizer import TerraMindTokenizer
tokenizer = TerraMindTokenizer.from_pretrained(
    "BiliSakura/TerraMind-1.0-Tokenizer-S1RTC",
    torch_dtype=torch.float32,
    device="cpu",
)

# Same API: tokenize(), encode()
x = torch.randn(1, 2, 256, 256)
tokens = tokenizer.tokenize(x)
```

> **Security:** `trust_remote_code=True` executes Python code downloaded from the Hub, so use it only with repositories you trust. For production, pin a specific commit with `revision="abc123def456"` (a placeholder; substitute the hash of the commit you have reviewed).
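
A minimal sketch of revision pinning, assuming you want to reject branch names and abbreviated hashes before anything is downloaded. `is_full_commit_sha` and `download_pinned` are hypothetical helpers written for this example; `snapshot_download` and its `revision` parameter come from `huggingface_hub`. Requiring a full 40-character hash is a choice made by this sketch, not a Hub requirement.

```python
import re

# A full Git commit hash is 40 lowercase hexadecimal characters.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def is_full_commit_sha(revision: str) -> bool:
    """True only for a complete commit hash, not a branch, tag, or short hash."""
    return _FULL_SHA.fullmatch(revision) is not None


def download_pinned(repo_id: str, revision: str) -> str:
    """Download a repo snapshot at an exact commit and return its local path."""
    if not is_full_commit_sha(revision):
        raise ValueError(f"expected a 40-character commit hash, got {revision!r}")
    # Imported lazily so the validation above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id, revision=revision)
```

The returned directory can then be added to `sys.path` and passed to `TerraMindTokenizer.from_pretrained` in the same way as in the quick-use example.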

## Notes

- Uses diffusers `ModelMixin` and `ConfigMixin` for standard `from_pretrained` / `from_config` / `save_pretrained`.
- Supports both `model.safetensors` and `diffusion_pytorch_model.safetensors` weights.
- Tokenizer-focused API: `tokenize()`, `encode()`, and `forward()`.
- Follow the license and usage terms of the upstream TerraMind and TerraTorch projects.

## Credits

- Original TerraMind project: https://github.com/IBM/terramind
- Original TerraMind model code (TerraTorch): https://github.com/terrastackai/terratorch/tree/main/terratorch/models/backbones/terramind
- This repository adapts tokenizer checkpoints for convenient Hugging Face usage.

## Citation

If you use TerraMind in your research, please cite:

```bibtex
@article{jakubik2025terramind,
  title={TerraMind: Large-Scale Generative Multimodality for Earth Observation},
  author={Jakubik, Johannes and Yang, Felix and Blumenstiel, Benedikt and Scheurer, Erik and Sedona, Rocco and Maurogiovanni, Stefano and Bosmans, Jente and Dionelis, Nikolaos and Marsocci, Valerio and Kopp, Niklas and others},
  journal={IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}
```