BGE-M3 LoRA: Finetuned on OE Support Knowledge Base

This is a LoRA-finetuned version of BAAI/bge-m3 trained on an internal Order Express (OE) support knowledge base to improve retrieval accuracy for customer support queries.

Training Summary

| Item | Detail |
|---|---|
| Base model | BAAI/bge-m3 |
| Method | PEFT LoRA + CachedMultipleNegativesRankingLoss |
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| Target modules | query, value |
| Training epochs | 3 |
| Batch size | 8 (gradient accumulation × 4) |
| Learning rate | 3e-5 |
| Warmup ratio | 0.1 |
| Max sequence length | 512 |
| Train queries | 3,116 (85% of golden test set) |
| Eval queries | ~550 (15% of golden test set) |
| Corpus | Full KB: 6,756 documents |
| Train/eval split | 85/15, stratified by difficulty (Easy / Medium / Hard) |
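
The 85/15 split stratified by difficulty can be illustrated with a minimal, standard-library sketch. It is not the script used to train this model: the `stratified_split` helper, the `difficulty` field name, the seed, and the toy records are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(records, key, eval_fraction=0.15, seed=42):
    """Split records into (train, eval), keeping each stratum's share roughly equal."""
    by_stratum = defaultdict(list)
    for record in records:
        by_stratum[record[key]].append(record)

    rng = random.Random(seed)
    train, eval_ = [], []
    for stratum in sorted(by_stratum):
        items = by_stratum[stratum][:]  # copy so the caller's list is not shuffled
        rng.shuffle(items)
        n_eval = round(len(items) * eval_fraction)
        eval_.extend(items[:n_eval])
        train.extend(items[n_eval:])
    return train, eval_


# Hypothetical toy data: 100 queries labeled Easy / Medium / Hard.
toy = [
    {"query": f"q{i}", "difficulty": ["Easy", "Medium", "Hard"][i % 3]}
    for i in range(100)
]
train_set, eval_set = stratified_split(toy, key="difficulty")
print(len(train_set), len(eval_set))
```

Splitting inside each difficulty bucket, rather than over the whole set at once, keeps the Easy / Medium / Hard mix of the eval queries close to that of the train queries.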

Training Process

[Figure: training process; image not included]

Usage

With sentence-transformers via the LangChain wrapper (merged model, recommended for inference)

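import os

import numpy as np
import torch
from huggingface_hub import snapshot_download
from langchain_huggingface import HuggingFaceEmbeddings

# Use the GPU when one is available; normalize so a dot product equals cosine similarity
model_kwargs = {'device': 'cuda' if torch.cuda.is_available() else 'cpu'}
encode_kwargs = {'normalize_embeddings': True}

# Download only the merged/ folder of the repo
print("Downloading merged model files...")
local_model_dir = snapshot_download(
    repo_id="CuongCao/oe-bge-m3-LoRA-ft-v2",
    allow_patterns=["merged/*", "merged/**/*"]
)
merged_path = os.path.join(local_model_dir, "merged")

print("Loading model into LangChain wrapper...")
model = HuggingFaceEmbeddings(
    model_name=merged_path,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)

query = "How do I reset my password?"
documents = ["Password reset procedure", "Account settings guide"]  # ... add more documents

# HuggingFaceEmbeddings exposes embed_query / embed_documents (it has no encode method)
# and returns plain Python lists, so convert them to NumPy arrays
query_emb = np.array(model.embed_query(query))
doc_embs = np.array(model.embed_documents(documents))

# Cosine similarity (valid as a dot product because the embeddings are normalized)
scores = query_emb @ doc_embs.T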
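
To turn similarity scores into a ranked result list, something like the following standard-library sketch may help. The `l2_normalize` and `top_k` helpers are hypothetical names, and the 3-dimensional vectors and the "Shipping rates" document are made up for illustration (real BGE-M3 dense embeddings have 1024 dimensions).

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query_emb, doc_embs, documents, k=3):
    """Return the k (document, score) pairs with the highest cosine similarity."""
    q = l2_normalize(query_emb)
    scored = []
    for doc, emb in zip(documents, doc_embs):
        d = l2_normalize(emb)
        scored.append((doc, sum(a * b for a, b in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings, for illustration only.
docs = ["Password reset procedure", "Account settings guide", "Shipping rates"]
doc_vectors = [[0.9, 0.1, 0.0], [0.5, 0.5, 0.1], [0.0, 0.2, 0.9]]
query_vector = [1.0, 0.2, 0.0]

ranked = top_k(query_vector, doc_vectors, docs, k=2)
for doc, score in ranked:
    print(f"{score:.3f}  {doc}")
```

With embeddings produced using `normalize_embeddings: True`, the normalization step is redundant but harmless; it is kept here so the helper also works on unnormalized vectors.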

Repo Structure

CuongCao/oe-bge-m3-LoRA-ft-v2/
├── lora-adapters/      # LoRA adapter weights only (~12 MB)
│   ├── adapter_config.json
│   └── adapter_model.safetensors
└── merged/             # Full merged model (~2.2 GB, ready for inference)
    ├── config.json
    ├── model.safetensors
    └── ...
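
As a rough sanity check on the adapter size, the LoRA parameter count can be estimated from the rank and target modules in the training summary. This is a back-of-the-envelope sketch, not output from the training run: the 24 layers and hidden size of 1024 are assumptions about the XLM-RoBERTa-large backbone of BGE-M3, float32 storage is assumed, and `lora_param_count` is a hypothetical helper.

```python
def lora_param_count(r, d_in, d_out, n_layers, modules_per_layer):
    """Each adapted weight gains two matrices: A (r x d_in) and B (d_out x r)."""
    per_module = r * (d_in + d_out)
    return per_module * modules_per_layer * n_layers


# From the training summary: r = 32, target modules = query, value (2 per layer).
# Assumed architecture: 24 transformer layers, hidden size 1024.
params = lora_param_count(r=32, d_in=1024, d_out=1024, n_layers=24, modules_per_layer=2)
size_mib = params * 4 / 2**20  # assuming 4-byte float32 weights

print(f"{params:,} LoRA parameters, about {size_mib:.1f} MiB in float32")
```

Under these assumptions the estimate comes to roughly 3.1 million parameters, which is in line with the ~12 MB listed for `lora-adapters/`.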

Citation

If you use this model, please cite the original BGE-M3 paper:

@article{bge-m3,
  title={BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation},
  author={Jianlv Chen and Shitao Xiao and Peitian Zhang and Kun Luo and Defu Lian and Zheng Liu},
  journal={arXiv preprint arXiv:2402.03216},
  year={2024}
}