How to use CuongCao/oe-bge-m3-LoRA-ft-v2 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("CuongCao/oe-bge-m3-LoRA-ft-v2")
sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
This is a LoRA fine-tuned version of BAAI/bge-m3, trained on an internal Order Express (OE) support knowledge base to improve retrieval accuracy for customer support queries.
| Item | Detail |
|---|---|
| Base model | BAAI/bge-m3 |
| Method | PEFT LoRA + CachedMultipleNegativesRankingLoss |
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| Target modules | query, value |
| Training epochs | 3 |
| Batch size | 8 (gradient accumulation × 4) |
| Learning rate | 3e-5 |
| Warmup ratio | 0.1 |
| Max sequence length | 512 |
| Train queries | 3,116 (85% of golden test set) |
| Eval queries | ~550 (15% of golden test set) |
| Corpus | Full KB: 6,756 documents |
| Train/eval split | 85/15 stratified by difficulty (Easy / Medium / Hard) |
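The figures in the table can be cross-checked with a little arithmetic. The sketch below derives the implied size of the golden set, the effective batch size, and an approximate optimizer-step count. It assumes a single device, one training example per query, and that partial batches still trigger a step; none of these are stated in the card, so treat the step counts as estimates rather than the actual training log.

```python
import math

# Values copied from the table above.
train_queries = 3116
train_fraction = 0.85
per_device_batch_size = 8
grad_accum_steps = 4
epochs = 3
warmup_ratio = 0.1

# Implied size of the full golden set and of the 15% eval split.
golden_total = round(train_queries / train_fraction)
eval_queries = golden_total - train_queries

# Assumption: single device, one example per query, and the last partial
# batch / accumulation window still results in an optimizer step.
effective_batch = per_device_batch_size * grad_accum_steps
micro_batches_per_epoch = math.ceil(train_queries / per_device_batch_size)
steps_per_epoch = math.ceil(micro_batches_per_epoch / grad_accum_steps)
total_steps = steps_per_epoch * epochs

# Assumption: warmup steps are rounded up; trainers may round differently.
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(golden_total, eval_queries)
print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
```

Gradient accumulation sums gradients over separate forward passes, so with an in-batch-negatives loss each query is still contrasted only against the other items in its own batch of 8, not against all 32 examples of the effective batch.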
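import os

import numpy as np
import torch
from huggingface_hub import snapshot_download
from langchain_huggingface import HuggingFaceEmbeddings

# Use the GPU when available; normalize so dot products equal cosine similarity.
model_kwargs = {'device': 'cuda' if torch.cuda.is_available() else 'cpu'}
encode_kwargs = {'normalize_embeddings': True}

print("Downloading merged model files...")
# Fetch only the merged/ subfolder, skipping the separate LoRA adapter files.
local_model_dir = snapshot_download(
    repo_id="CuongCao/oe-bge-m3-LoRA-ft-v2",
    allow_patterns=["merged/*", "merged/**/*"]
)
merged_path = os.path.join(local_model_dir, "merged")

print("Loading model into LangChain wrapper...")
model = HuggingFaceEmbeddings(
    model_name=merged_path,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)

query = "How do I reset my password?"
documents = ["Password reset procedure", "Account settings guide"]  # ... more documents

# HuggingFaceEmbeddings exposes embed_query / embed_documents (not encode),
# and both return plain Python lists, so convert them to NumPy arrays first.
query_emb = np.array(model.embed_query(query))
doc_embs = np.array(model.embed_documents(documents))

# Cosine similarity (embeddings are already L2-normalized)
scores = query_emb @ doc_embs.T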
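The dot product above only equals cosine similarity because `normalize_embeddings` is set to `True`. The following dependency-free sketch illustrates that equivalence and how the scores turn into a ranking. The 3-dimensional vectors and the `doc_a`/`doc_b`/`doc_c` names are invented for illustration; real bge-m3 embeddings are much higher-dimensional.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]

def cosine(a, b):
    """Cosine similarity computed directly from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy vectors, hypothetical and for illustration only.
query_vec = [1.0, 2.0, 2.0]
doc_vecs = {
    "doc_a": [2.0, 4.0, 4.0],   # same direction as the query
    "doc_b": [2.0, -1.0, 0.0],  # orthogonal to the query
    "doc_c": [0.0, 3.0, 4.0],
}

# After normalization, a dot product is the cosine similarity.
q = l2_normalize(query_vec)
toy_scores = {name: dot(q, l2_normalize(v)) for name, v in doc_vecs.items()}

# Rank documents from most to least similar.
ranking = sorted(toy_scores, key=toy_scores.get, reverse=True)
print(ranking)
```

If normalization were switched off, the same `@` product would mix direction with vector length, so longer vectors could outrank more relevant ones.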
CuongCao/oe-bge-m3-LoRA-ft-v2/
├── lora-adapters/              # LoRA adapter weights only (~12 MB)
│   ├── adapter_config.json
│   └── adapter_model.safetensors
└── merged/                     # Full merged model (~2.2 GB, ready for inference)
    ├── config.json
    ├── model.safetensors
    └── ...
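The adapter size listed above is consistent with the LoRA settings in the training table. As a rough sanity check, the sketch below counts the adapter parameters. It assumes a backbone with hidden size 1024 and 24 transformer layers (the XLM-RoBERTa-large shape that bge-m3 is built on), square `query`/`value` projections, float32 storage, and the standard `alpha / r` LoRA scaling; none of these are stated in this card.

```python
# Values from the training table.
rank = 32
lora_alpha = 64
target_modules = ["query", "value"]

# Assumptions about the backbone, not taken from this card.
hidden_size = 1024   # assumed width of each query/value projection
num_layers = 24      # assumed number of transformer layers
bytes_per_param = 4  # assumed float32 storage

# Each adapted Linear(d_in, d_out) gains two low-rank matrices:
# A with shape (rank, d_in) and B with shape (d_out, rank).
params_per_module = rank * (hidden_size + hidden_size)
total_params = params_per_module * len(target_modules) * num_layers
size_mib = total_params * bytes_per_param / 2**20

# Standard LoRA scales the low-rank update by alpha / r.
scaling = lora_alpha / rank

print(total_params, size_mib, scaling)
```

Under these assumptions the adapters hold about 3.1 million parameters, roughly 12 MiB, which lines up with the size noted for `lora-adapters/`.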
If you use this model, please cite the original BGE-M3 paper:
@article{bge-m3,
title={BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation},
author={Jianlv Chen and Shitao Xiao and Peitian Zhang and Kun Luo and Defu Lian and Zheng Liu},
journal={arXiv preprint arXiv:2402.03216},
year={2024}
}