---
language: en
tags:
- image-similarity
- re-identification
- efficientnetv2
- miewid
- latonia
library_name: pytorch
---

# miewid-msv3-latonia-1232

Finetuned checkpoint for Hula painted frog (_Latonia nigriventer_) photo-ID.

Lineage: EfficientNetV2 -> miewid-msv3 -> miewid-msv3-latonia-1232.

## Model details

- Base model: EfficientNetV2 (via miewid-msv3)
- Task: Individual photo-identification via deep local-feature matching
- Domain: Hula painted frog, ventral pattern images
- Training data: 1,232 photos (Latonia dataset; more information in the manuscript)
- Classifier head: ArcFace used during training only; for re-ID use embeddings + cosine similarity (do not use the classifier head).

## Preprocessing

Images are preprocessed as described in the preprint:

- Rotate images so the head is oriented upwards.
- Detect a bounding box using MegaDetector and crop to the bbox.
- Apply the transforms shown in the Usage example below.

## Training

See the preprint for full training details and evaluation protocol.

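
The crop step listed under Preprocessing can be sketched as below. This is a minimal illustration, not the code used in the preprint. It assumes the detector reports each box as normalized `[x_min, y_min, width, height]` with the origin at the top-left corner; check this against the output of your detector version. The helper name `bbox_to_crop_box` is hypothetical.

```python
def bbox_to_crop_box(bbox, img_width, img_height):
    """Convert a normalized [x_min, y_min, width, height] box (assumed format)
    to pixel coordinates (left, top, right, bottom), clamped to the image."""
    x_min, y_min, box_w, box_h = bbox
    left = max(0, int(round(x_min * img_width)))
    top = max(0, int(round(y_min * img_height)))
    right = min(img_width, int(round((x_min + box_w) * img_width)))
    bottom = min(img_height, int(round((y_min + box_h) * img_height)))
    if right <= left or bottom <= top:
        raise ValueError("bounding box is empty after clamping")
    return left, top, right, bottom


# Hypothetical detection on a 1000x800 image.
print(bbox_to_crop_box([0.25, 0.1, 0.5, 0.75], 1000, 800))
```

The returned tuple has the `(left, upper, right, lower)` layout that PIL's `Image.crop` expects, so it can be applied to the rotated image before the transforms in the Usage section.
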
## Usage

This repository provides a single PyTorch checkpoint:

- `miewid-msv3-latonia-1232.pt`

Example (embed images and compute cosine similarity):

```python
import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModel


class ZoomCenterCrop:
    """Crop a centered square whose side is min(width, height) / zoom."""

    def __init__(self, zoom=1.0):
        self.zoom = zoom

    def __call__(self, img):
        w, h = img.size
        m = int(min(h, w) / self.zoom)
        left = (w - m) // 2
        top = (h - m) // 2
        return img.crop((left, top, left + m, top + m))


preprocess = transforms.Compose([
    ZoomCenterCrop(zoom=2.0),
    transforms.Resize((440, 440)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
])

# Load the base architecture, then overwrite its weights with the finetuned checkpoint.
model = AutoModel.from_pretrained("conservationxlabs/miewid-msv3", trust_remote_code=True)
ckpt = torch.load("miewid-msv3-latonia-1232.pt", map_location="cpu")
model.load_state_dict(ckpt["model"], strict=True)
model.eval()


def embed(path):
    """Return the L2-normalized embedding of one image as a (1, D) tensor."""
    img = Image.open(path).convert("RGB")
    x = preprocess(img).unsqueeze(0)  # add a batch dimension
    with torch.no_grad():
        emb = model(x)
    return emb / emb.norm(dim=1, keepdim=True)


e1 = embed("img1.jpg")
e2 = embed("img2.jpg")

# Both embeddings are unit length, so their dot product is the cosine similarity.
cosine_sim = (e1 @ e2.T).item()
print(cosine_sim)
```

## Intended use

- Research on automated photo-ID of Hula painted frogs
- Evaluation and reproduction of results in the associated preprint

## Limitations

- Trained on a specific dataset (1,232 images) and may not generalize to other species or imaging setups.
- Performance depends on image quality and pose/lighting conditions.

## Citation

If you use this model, please cite:

Yesharim, M., Bina Perl, R. G., Roll, U., Gafny, S., Geffen, E., Ram, Y. "Near-perfect photo-ID of the Hula painted frog with zero-shot deep local-feature matching." arXiv:2601.08798 (2026). https://arxiv.org/abs/2601.08798