Upload README.md with huggingface_hub
README.md
ADDED
|
@@ -0,0 +1,30 @@
| 1 |
+
---
|
| 2 |
+
language:
|
| 3 |
+
- en
|
| 4 |
+
license: apache-2.0
|
| 5 |
+
tags:
|
| 6 |
+
- glove
|
| 7 |
+
- lora
|
| 8 |
+
- distillation
|
| 9 |
+
- bpe
|
| 10 |
+
- cl100k_base
|
| 11 |
+
base_model: jsanzolac/bpe_glove_512
|
| 12 |
+
datasets:
|
| 13 |
+
- jsanzolac/qwen3_emb_512
|
| 14 |
+
- jsanzolac/qwen3_emb_512_packed
|
| 15 |
+
---
|
| 16 |
+
|
| 17 |
+
# bpe_glove_512_lora_v1
|
| 18 |
+
|
| 19 |
+
LoRA drifts applied on top of the frozen `jsanzolac/bpe_glove_512` (BPE-GloVe-512) embeddings, distilled
|
| 20 |
+
from `Qwen/Qwen3-Embedding-8B` (MRL-truncated to 512 dims).
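
As a rough illustration of what a rank-`r` drift over frozen embeddings could look like, the sketch below assumes the standard LoRA parameterization `E' = E + scale * (A @ B)`, with `A` of shape `(vocab, r)` and `B` of shape `(r, dim)`. The names `apply_lora_drift`, `A`, `B`, and `scale` are hypothetical; the parameterization actually used for these checkpoints is not stated in this README. It is written in plain Python so it runs without dependencies.

```python
from typing import List

Matrix = List[List[float]]


def apply_lora_drift(frozen: Matrix, A: Matrix, B: Matrix, scale: float = 1.0) -> Matrix:
    """Return frozen + scale * (A @ B) without modifying `frozen`.

    Assumption: standard LoRA form; `scale` is a hypothetical knob.
    """
    vocab, dim = len(frozen), len(frozen[0])
    rank = len(B)
    if len(A) != vocab or any(len(row) != rank for row in A):
        raise ValueError("A must have shape (vocab, rank)")
    if any(len(row) != dim for row in B):
        raise ValueError("B must have shape (rank, dim)")
    drifted = []
    for i in range(vocab):
        row = []
        for j in range(dim):
            # Low-rank update for entry (i, j): sum over the rank dimension.
            delta = sum(A[i][k] * B[k][j] for k in range(rank))
            row.append(frozen[i][j] + scale * delta)
        drifted.append(row)
    return drifted
```

Only `A` and `B` would be trained in such a setup, so a rank-`r` drift adds `r * (vocab + dim)` parameters while the base table stays untouched.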
|
| 21 |
+
|
| 22 |
+
**Variant 1 loss:** `λ_c·InfoNCE + λ_D·||ρ_T - ρ_S||_F²` with `λ_c=1.0`, `λ_D=0.1`.
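
The Variant 1 loss combines a contrastive InfoNCE term (weight 1.0) with a squared Frobenius-norm term (weight 0.1). A minimal sketch of how such a loss might be computed follows. It rests on assumptions not confirmed by this README: that the two matrices in the Frobenius term are the teacher's and student's in-batch pairwise cosine-similarity matrices, that the InfoNCE positives pair `student[i]` with `teacher[i]`, and that a temperature of `0.05` is used. The function names and the temperature are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def _normalize(rows: Matrix) -> Matrix:
    """L2-normalize each row (zero rows are left as zeros)."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        out.append([x / norm for x in row])
    return out


def _cosine_matrix(a: Matrix, b: Matrix) -> Matrix:
    """Pairwise cosine similarities between the rows of a and b."""
    a_n, b_n = _normalize(a), _normalize(b)
    return [[sum(x * y for x, y in zip(u, v)) for v in b_n] for u in a_n]


def info_nce(student: Matrix, teacher: Matrix, temperature: float) -> float:
    """Mean cross-entropy where student[i] should match teacher[i]."""
    sims = _cosine_matrix(student, teacher)
    total = 0.0
    for i, row in enumerate(sims):
        logits = [s / temperature for s in row]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(sims)


def variant1_loss(student: Matrix, teacher: Matrix, lambda_c: float = 1.0,
                  lambda_d: float = 0.1, temperature: float = 0.05) -> float:
    """lambda_c * InfoNCE + lambda_d * ||rho_T - rho_S||_F^2 (assumed form)."""
    rho_t = _cosine_matrix(teacher, teacher)  # assumed meaning of rho_T
    rho_s = _cosine_matrix(student, student)  # assumed meaning of rho_S
    frob_sq = sum((t - s) ** 2
                  for row_t, row_s in zip(rho_t, rho_s)
                  for t, s in zip(row_t, row_s))
    return lambda_c * info_nce(student, teacher, temperature) + lambda_d * frob_sq
```

Under these assumptions the first term pulls each student vector toward its teacher counterpart, while the second penalizes any change in the pairwise similarity structure of the batch.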
|
| 23 |
+
|
| 24 |
+
Each `rank_<r>/` folder contains:
|
| 25 |
+
- `checkpoint_final.pt`
|
| 26 |
+
- `config.json`
|
| 27 |
+
- `vectors_drifted.txt`
|
| 28 |
+
- `train_log.jsonl`
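
The README does not document the layout of `vectors_drifted.txt`. The sketch below assumes a GloVe-style text format: one token per line, followed by `dim` space-separated floats. The function name `parse_vectors` is hypothetical. Because BPE tokens can themselves contain spaces, each line is split from the right.

```python
from typing import Dict, Iterable, List


def parse_vectors(lines: Iterable[str], dim: int = 512) -> Dict[str, List[float]]:
    """Parse '<token> <v1> ... <v_dim>' lines into a token -> vector dict.

    Assumption: GloVe-style text layout; not confirmed by the README.
    """
    vectors: Dict[str, List[float]] = {}
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Split from the right so tokens that contain spaces stay intact.
        parts = line.rsplit(" ", dim)
        if len(parts) != dim + 1:
            raise ValueError(f"line {line_no}: expected a token and {dim} values")
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors
```

With a local copy of one rank folder, this could be called as `parse_vectors(open("rank_8/vectors_drifted.txt", encoding="utf-8"))`. If the file turns out to start with a word2vec-style header line or to use a different delimiter, the parser would need adjusting.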
|
| 29 |
+
|
| 30 |
+
Ranks shipped: [512, 256, 128, 64, 32, 16, 8, 4, 2]
|