---
language: en
tags:
  - image-similarity
  - re-identification
  - efficientnetv2
  - miewid
  - latonia
library_name: pytorch
---

# miewid-msv3-latonia-1232

Fine-tuned checkpoint for Hula painted frog (_Latonia nigriventer_) photo-ID.  
Lineage: EfficientNetV2 -> miewid-msv3 -> miewid-msv3-latonia-1232.

## Model details
- Base model: EfficientNetV2 (via miewid-msv3)
- Task: Individual photo-identification via deep local-feature matching
- Domain: Hula painted frog, ventral pattern images
- Training data: 1,232 photos (Latonia dataset; see the preprint for details)
- Classifier head: ArcFace, used during training only. For re-ID, compare L2-normalized embeddings by cosine similarity; do not use the classifier head.

## Preprocessing
Images are preprocessed as described in the preprint:
- Rotate images so the head is oriented upwards.
- Detect a bounding box with MegaDetector and crop the image to that box.
- Apply the transforms shown in the Usage example below.
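
The cropping step can be sketched as follows. This is an illustration, not the preprint's pipeline: it assumes detections arrive as normalized `[x_min, y_min, width, height]` boxes with the origin at the top-left corner (the convention used in MegaDetector's batch output; check it against the version you run). The helper name `bbox_to_crop_box` and the example box are hypothetical.

```python
def bbox_to_crop_box(bbox, image_width, image_height):
    """Convert a normalized [x_min, y_min, width, height] box into pixel
    (left, upper, right, lower) coordinates, clamped to the image bounds."""
    x_min, y_min, box_w, box_h = bbox
    left = max(0, round(x_min * image_width))
    upper = max(0, round(y_min * image_height))
    right = min(image_width, round((x_min + box_w) * image_width))
    lower = min(image_height, round((y_min + box_h) * image_height))
    if right <= left or lower <= upper:
        raise ValueError(f"Degenerate bounding box: {bbox}")
    return left, upper, right, lower


# Hypothetical detection on a 1000x800 image (assumed values, not real output)
crop_box = bbox_to_crop_box([0.25, 0.125, 0.5, 0.75], 1000, 800)
```

The resulting tuple is in the order that PIL's `Image.crop` expects, so `img.crop(crop_box)` yields the cropped image. Following the order of the steps above, the rotation is applied before detection.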

## Training
See the preprint for full training details and evaluation protocol.

## Usage
This repository provides a single PyTorch checkpoint:
- `miewid-msv3-latonia-1232.pt`

Example (embed images and compute cosine similarity):

```python
import torch
from PIL import Image
from torchvision import transforms

from transformers import AutoModel


class ZoomCenterCrop:
    """Crop the central square whose side is min(width, height) / zoom."""

    def __init__(self, zoom=1.0):
        self.zoom = zoom

    def __call__(self, img):
        w, h = img.size
        m = int(min(h, w) / self.zoom)  # side of the square crop, in pixels
        left = (w - m) // 2
        top = (h - m) // 2
        return img.crop((left, top, left + m, top + m))


preprocess = transforms.Compose([
    ZoomCenterCrop(zoom=2.0),
    transforms.Resize((440, 440)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
])

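# Build the miewid-msv3 architecture, then load the fine-tuned weights into it.
model = AutoModel.from_pretrained("conservationxlabs/miewid-msv3", trust_remote_code=True)
ckpt = torch.load("miewid-msv3-latonia-1232.pt", map_location="cpu")
model.load_state_dict(ckpt["model"], strict=True)
model.eval()  # inference mode: disables dropout and batch-norm updates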

def embed(path):
    """Return the L2-normalized embedding of one image, shape (1, D)."""
    img = Image.open(path).convert("RGB")
    x = preprocess(img).unsqueeze(0)  # add a batch dimension
    with torch.no_grad():
        emb = model(x)
    return emb / emb.norm(dim=1, keepdim=True)

e1 = embed("img1.jpg")
e2 = embed("img2.jpg")
cosine_sim = (e1 @ e2.T).item()
print(cosine_sim)
```
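
For re-identification, a query image is usually compared against a gallery of known individuals rather than a single image. The sketch below ranks gallery entries by cosine similarity. It is dependency-free and uses toy 3-dimensional vectors so the logic is easy to follow; the function names, the gallery labels, and `top_k` are illustrative assumptions, not part of this repository.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("Cannot normalize a zero vector")
    return [v / norm for v in vec]


def rank_gallery(query, gallery, top_k=5):
    """Return up to top_k (name, cosine similarity) pairs, best match first."""
    q = l2_normalize(query)
    scored = []
    for name, emb in gallery.items():
        g = l2_normalize(emb)
        # For unit vectors, the dot product equals the cosine similarity.
        scored.append((name, sum(a * b for a, b in zip(q, g))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings with hypothetical labels; real ones would come from embed()
gallery = {
    "frog_A": [1.0, 0.0, 0.0],
    "frog_B": [0.0, 1.0, 0.0],
    "frog_C": [1.0, 1.0, 0.0],
}
ranking = rank_gallery([0.9, 0.1, 0.0], gallery, top_k=2)
```

With the tensors returned by `embed`, the same ranking can be computed as a matrix product of the query with the stacked gallery embeddings, followed by `torch.topk`.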

## Intended use
- Research on automated photo-ID of Hula painted frogs
- Evaluation and reproduction of results in the associated preprint

## Limitations
- Trained on a specific dataset (1,232 images) and may not generalize to other species or imaging setups.
- Performance depends on image quality and pose/lighting conditions.

## Citation
If you use this model, please cite:

Yesharim, M., Bina Perl, R. G., Roll, U., Gafny, S., Geffen, E., Ram, Y.  
"Near-perfect photo-ID of the Hula painted frog with zero-shot deep local-feature matching."  
arXiv:2601.08798 (2026). https://arxiv.org/abs/2601.08798