sentence-transformers/msmarco-msmarco-distilbert-base-v3
Stage 2 supervised fine-tuning of jsanzolac/drifting-glove-distilled-r300 on MS MARCO triplet-50 with an
InfoNCE-only objective (no MSE term): 1 epoch, batch size 256, K=6 mined hard negatives per anchor, temperature τ=0.02.
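As a rough illustration of the objective described above, here is a minimal NumPy sketch of an InfoNCE loss with K hard negatives per anchor. It is not the training code for this model. The function name `info_nce`, the use of cosine similarity, and the choice to also treat the other positives in the batch as negatives are all assumptions made for the example; only K=6 and τ=0.02 come from the description.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def info_nce(anchors, positives, hard_negatives, tau=0.02):
    """InfoNCE over in-batch positives plus K hard negatives per anchor.

    anchors:        (B, d) query embeddings
    positives:      (B, d) matching passage embeddings
    hard_negatives: (B, K, d) mined negatives for each anchor
    tau:            softmax temperature

    Assumption: cosine similarity, and in-batch positives of other
    anchors are also used as negatives. The source does not state either.
    """
    q = l2_normalize(anchors)
    p = l2_normalize(positives)
    n = l2_normalize(hard_negatives)

    in_batch = q @ p.T                        # (B, B), diagonal = true pairs
    hard = np.einsum("bd,bkd->bk", q, n)      # (B, K), anchor vs its own negatives
    logits = np.concatenate([in_batch, hard], axis=1) / tau

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    idx = np.arange(q.shape[0])
    return float(-log_probs[idx, idx].mean())  # cross-entropy with target = diagonal


# Toy usage with the shapes implied by the card (d=300, K=6); B is kept small here.
rng = np.random.default_rng(0)
B, K, d = 8, 6, 300
loss = info_nce(
    rng.normal(size=(B, d)),
    rng.normal(size=(B, d)),
    rng.normal(size=(B, K, d)),
    tau=0.02,
)
```

With τ=0.02 the similarities are multiplied by 50 before the softmax, so even small cosine gaps between the positive and a hard negative dominate the loss.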
Trainable parameters: A.weight and B.weight only. E (the frozen 300-d cl100k GloVe embeddings from jsanzolac/drifting-glove-distilled-r300) is excluded from the saved state dict.
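One way to produce a checkpoint that omits the frozen table is to filter the state dict by key prefix before saving. The sketch below uses a plain dictionary as a stand-in for a real state dict. The helper name `strip_frozen`, the key `E.weight`, and the toy values are hypothetical; the card only names `A.weight` and `B.weight`.

```python
def strip_frozen(state_dict, frozen_prefixes=("E.",)):
    """Return a copy of state_dict without entries under the frozen prefixes.

    Assumption: the frozen embedding is stored under keys starting with
    "E." (for example "E.weight"); adjust the prefix to the real key names.
    """
    return {
        name: value
        for name, value in state_dict.items()
        if not name.startswith(frozen_prefixes)
    }


# Stand-in for a real state dict; the values here are placeholders, not weights.
full_state = {
    "E.weight": [[0.0] * 3],   # frozen embedding table (hypothetical key)
    "A.weight": [[1.0] * 3],   # trainable
    "B.weight": [[2.0] * 3],   # trainable
}

saved_state = strip_frozen(full_state)
```

When loading such a checkpoint, E has to be restored separately from the base model, since the saved dict contains only the trainable weights.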
Base model
jsanzolac/drifting-glove-distilled-r300