Urock-AI/Eddy-vl_embedding_1.9B_v1

@@ -16,6 +16,10 @@ base_model:
 - Qwen/Qwen3-VL-Embedding-2B
 ---
 # Eddy-VL Embedding 1.9B
 [Urock-AI](https://huggingface.co/Urock-AI) · [urock.kr](https://urock.kr/) · License: Apache 2.0
@@ -60,53 +64,6 @@ Rather than train a new model from scratch, we started from one of the best open
 ---
-## Installation
-```bash
-git clone https://huggingface.co/Urock-AI/Eddy-vl_embedding_1.9B_v1
-cd Eddy-vl_embedding_1.9B_v1
-pip install "transformers>=5.0" safetensors torch pillow torchvision
-pip install decord   # video input (or `av`)
-```
-Inference code ships with the repo (not a pip package):
-| File | Role |
-|------|------|
-| `vl_embedding_v1.py` | `VLEmbedder` — load weights, encode text / image / video |
-| `processing_vl.py` | `VLProcessor` — multimodal tokenization & preprocessing |
-| `vl_utils/` | Image & video loading / resizing (bundled) |
-## How to use it
-Clone the repo first, then run from inside the repo folder:
-```python
-import torch
-from PIL import Image
-from vl_embedding_v1 import VLEmbedder
-instruction = "Represent this input for retrieval."
-# load from local repo checkout (weights in ./model.safetensors)
-embedder = VLEmbedder(".", torch_dtype=torch.bfloat16, default_instruction=instruction)
-# text / image / video → 2048-d vectors (L2-normalized)
-text_vec = embedder.process([{"text": "a photo of a cat"}])[0]
-image_vec = embedder.process([{"image": Image.open("photo.jpg")}])[0]
-video_vec = embedder.process([{"video": "clip.mp4"}])[0]
-# cosine similarity = dot product
-score = (text_vec @ image_vec.T).item()
-```
-You can also pass the Hub repo id (`"Urock-AI/Eddy-vl_embedding_1.9B_v1"`) to download weights automatically, but you still need the cloned Python files on `PYTHONPATH`.
-> `trust_remote_code=True` is used internally for `processing_vl.py` (`VLProcessor`).
----
 ## How well it does
 Eddy-VL is validated on **[MMEB-V2](https://huggingface.co/datasets/TIGER-Lab/MMEB-V2)** across image, video, and document retrieval. Selected per-task results:
@@ -165,6 +122,53 @@ The public leaderboard and our in-house pipeline differ in setup, so we compare
 ---
 ## Good to know before you rely on it
 - **It finds, it doesn't decide.** Eddy-VL surfaces candidates for a human to review; it shouldn't be the sole basis for any high-stakes decision.

 - Qwen/Qwen3-VL-Embedding-2B
 ---
+<p align="center">
+  <img src="logo.png" alt="Eddy-VL" width="480">
+</p>
 # Eddy-VL Embedding 1.9B
 [Urock-AI](https://huggingface.co/Urock-AI) · [urock.kr](https://urock.kr/) · License: Apache 2.0
 ---
 ## How well it does
 Eddy-VL is validated on **[MMEB-V2](https://huggingface.co/datasets/TIGER-Lab/MMEB-V2)** across image, video, and document retrieval. Selected per-task results:
 ---
+## Installation
+```bash
+git clone https://huggingface.co/Urock-AI/Eddy-vl_embedding_1.9B_v1
+cd Eddy-vl_embedding_1.9B_v1
+pip install "transformers>=5.0" safetensors torch pillow torchvision
+pip install decord   # video input (or `av`)
+```
+Inference code ships with the repo (not a pip package):
+| File | Role |
+|------|------|
+| `vl_embedding_v1.py` | `VLEmbedder` — load weights, encode text / image / video |
+| `processing_vl.py` | `VLProcessor` — multimodal tokenization & preprocessing |
+| `vl_utils/` | Image & video loading / resizing (bundled) |
+## How to use it
+Clone the repo first, then run from inside the repo folder:
+```python
+import torch
+from PIL import Image
+from vl_embedding_v1 import VLEmbedder
+instruction = "Represent this input for retrieval."
+# load from local repo checkout (weights in ./model.safetensors)
+embedder = VLEmbedder(".", torch_dtype=torch.bfloat16, default_instruction=instruction)
+# text / image / video → 2048-d vectors (L2-normalized)
+text_vec = embedder.process([{"text": "a photo of a cat"}])[0]
+image_vec = embedder.process([{"image": Image.open("photo.jpg")}])[0]
+video_vec = embedder.process([{"video": "clip.mp4"}])[0]
+# cosine similarity = dot product
+score = (text_vec @ image_vec.T).item()
+```
+You can also pass the Hub repo id (`"Urock-AI/Eddy-vl_embedding_1.9B_v1"`) to download weights automatically, but you still need the cloned Python files on `PYTHONPATH`.
+> `trust_remote_code=True` is used internally for `processing_vl.py` (`VLProcessor`).
+---
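The usage snippet's comment "cosine similarity = dot product" holds only because the vectors are L2-normalized. The sketch below checks that identity with plain Python lists, so it runs without the model. The helper names (`l2_normalize`, `dot`, `cosine_similarity`) and the toy 3-d values are illustrative assumptions, not part of the repo's code; real embeddings would come from `VLEmbedder.process`.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (hypothetical helper, not from the repo)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Full cosine formula: dot product divided by the product of the norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy 3-d vectors standing in for the 2048-d embeddings (made-up values).
text_vec = l2_normalize([1.0, 2.0, 2.0])
image_vec = l2_normalize([2.0, 1.0, 2.0])

# For unit-length vectors both norms are 1, so the dot product alone
# already equals the cosine similarity.
score = dot(text_vec, image_vec)
print(round(score, 4), round(cosine_similarity(text_vec, image_vec), 4))
```

If vectors are ever produced without normalization (for example, after averaging several embeddings), the shortcut no longer applies and the full cosine formula is the safer choice.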
 ## Good to know before you rely on it
 - **It finds, it doesn't decide.** Eddy-VL surfaces candidates for a human to review; it shouldn't be the sole basis for any high-stakes decision.