Urock-chy committed
Commit 627ae98 · verified · 1 Parent(s): d5bb69f

Simplify README usage section

Files changed (1)
  1. README.md +9 -27
README.md CHANGED
@@ -83,38 +83,20 @@ from PIL import Image
 from vl_embedding_v1 import VLEmbedder
 
 model_id = "Urock-AI/Eddy-vl_embedding_1.9B_v1"
+instruction = "Represent this input for retrieval."
 
-embedder = VLEmbedder(
-    model_id,
-    torch_dtype=torch.bfloat16,
-    default_instruction="Represent this input for retrieval.",
-)
+embedder = VLEmbedder(model_id, torch_dtype=torch.bfloat16, default_instruction=instruction)
 
-# weights default to ``{model_id}/model.safetensors``
-# embedder = VLEmbedder(model_id, weights_path="/path/to/model.safetensors", ...)
-```
-
-> `trust_remote_code=True` is required for `processing_vl.py` (`VLProcessor`) in this repo. Model weights load from `model.safetensors`.
+# text / image / video → 2048-d vectors (L2-normalized)
+text_vec = embedder.process([{"text": "a photo of a cat"}])[0]
+image_vec = embedder.process([{"image": Image.open("photo.jpg")}])[0]
+video_vec = embedder.process([{"video": "clip.mp4"}])[0]
 
-Embeddings are L2-normalized and compared by cosine similarity (dot product). A simple "find the closest image to this text" looks like:
-
-```python
-INSTRUCTION = "Represent this input for retrieval."
-
-query = embedder.process(
-    [{"text": "a tan toilet and sink in a small room", "instruction": INSTRUCTION}],
-)[0]
-candidates = [
-    embedder.process([{"image": Image.open(p), "instruction": INSTRUCTION}])[0]
-    for p in ["a.jpg", "b.jpg", "c.jpg"]
-]
-
-scores = [(query @ c.T).item() for c in candidates]
-ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
-print(ranking, scores)
+# cosine similarity = dot product
+score = (text_vec @ image_vec.T).item()
 ```
 
-Need cheaper storage or faster search? Truncate the 2048-d vector to a shorter prefix before normalizing — you trade a little accuracy for a lot of speed.
+> `trust_remote_code=True` is required for `processing_vl.py` (`VLProcessor`) in this repo.
 
 ---
 
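Note on the simplified example: the README text rests on two ideas, namely that the dot product of L2-normalized vectors equals their cosine similarity, and (in the removed note) that a 2048-d vector can be cut to a shorter prefix and normalized again. The sketch below illustrates both with the standard library only. It is not the repo's code: the random vectors stand in for `embedder.process(...)` output, the helper names (`l2_normalize`, `rank`, `truncate`) are hypothetical, and the prefix length 256 is an arbitrary choice for illustration.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed the long way, for comparison with dot()."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def rank(query, candidates):
    """Return candidate indices sorted best-first, plus the raw scores."""
    scores = [dot(query, c) for c in candidates]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


def truncate(vec, dim):
    """Keep the first `dim` components, then normalize the shorter vector."""
    return l2_normalize(vec[:dim])


# Stand-in data (assumption): random unit vectors instead of real embeddings.
rng = random.Random(0)
query = l2_normalize([rng.gauss(0, 1) for _ in range(2048)])
candidates = [
    l2_normalize([rng.gauss(0, 1) for _ in range(2048)]) for _ in range(3)
]

# Full-width ranking: dot product acts as cosine similarity here.
ranking, scores = rank(query, candidates)

# Cheaper variant: rank on a 256-d prefix (hypothetical length).
query_small = truncate(query, 256)
candidates_small = [truncate(c, 256) for c in candidates]
ranking_small, scores_small = rank(query_small, candidates_small)

print(ranking, scores)
print(ranking_small, scores_small)
```

With real embeddings the truncated ranking may differ from the full-width one; how much accuracy is lost depends on the model and is not something this sketch measures.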