How to use ronaldmannak/LFM2.5-ColBERT-350M-bf16 with MLX:

    # Download the model from the Hub (the quotes keep zsh from expanding the brackets)
    pip install "huggingface_hub[hf_xet]"
    huggingface-cli download --local-dir LFM2.5-ColBERT-350M-bf16 ronaldmannak/LFM2.5-ColBERT-350M-bf16
LFM2.5-ColBERT-350M - MLX (bf16)
MLX build of LiquidAI/LFM2.5-ColBERT-350M, a multilingual late-interaction retriever (128-dim vector per token, scored with MaxSim), for local inference on Apple Silicon with MLX.
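As a rough illustration of MaxSim scoring, the sketch below uses NumPy with random 128-dim arrays standing in for real model outputs. The function names are hypothetical, and the L2 normalization step is an assumption about how the token vectors are compared, not something this repository documents.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosines."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for every query token take the best-matching
    document token, then sum those maxima over the query tokens."""
    q = l2_normalize(query_vecs)   # (num_query_tokens, dim)
    d = l2_normalize(doc_vecs)     # (num_doc_tokens, dim)
    sim = q @ d.T                  # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())


# Random stand-ins for per-token embeddings (assumption: dim = 128).
rng = np.random.default_rng(0)
query = rng.standard_normal((8, 128)).astype(np.float32)
doc_unrelated = rng.standard_normal((40, 128)).astype(np.float32)
# A document that literally contains the query's token vectors.
doc_matching = np.concatenate(
    [query, rng.standard_normal((20, 128)).astype(np.float32)]
)

score_unrelated = maxsim_score(query, doc_unrelated)
score_matching = maxsim_score(query, doc_matching)
print(score_unrelated, score_matching)
```

Brute-force retrieval then amounts to computing this score for every passage in the pool and sorting by it.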
All weights, architecture, and behavior are LiquidAI's. This repository changes only the file format (PyTorch/safetensors → MLX) and keeps the original bf16 precision; it is not quantized. Quantized variants (8-bit / 4-bit) are available as sibling repos; see the table below. See the original model card for training details and intended use.
Conversion details
- Converted with `mlx`; weights unchanged apart from tensor layout (bf16 → MLX bf16). Includes the 1024→128 `Dense` projection head (`dense.weight`).
- The architecture is `Lfm2BidirectionalModel` (a bidirectional LFM2 encoder), which `mlx-lm` / `mlx-embeddings` do not support out of the box, so a small self-contained MLX implementation is included as `lfm2_bidirectional.py`.
- Verified against the original (PyTorch, float32, identical token ids): worst-case cosine of the per-token projected vectors ≈ 1.0 across short prompts and a 130-token passage.
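The verification step in the last bullet can be sketched as follows. This is a minimal illustration, not the script used for the conversion: the function name is hypothetical, and the two arrays are synthetic stand-ins for the PyTorch reference output and the MLX output on the same token ids.

```python
import numpy as np


def worst_case_token_cosine(reference, candidate, eps=1e-12):
    """Return the minimum per-token cosine similarity between two
    (num_tokens, dim) arrays of projected vectors."""
    if reference.shape != candidate.shape:
        raise ValueError("both outputs must cover the same tokens and dims")
    ref = np.asarray(reference, dtype=np.float64)
    cand = np.asarray(candidate, dtype=np.float64)
    dots = (ref * cand).sum(axis=-1)
    norms = np.linalg.norm(ref, axis=-1) * np.linalg.norm(cand, axis=-1)
    return float((dots / np.maximum(norms, eps)).min())


# Synthetic stand-ins: a 130-token passage with 128-dim vectors, and a copy
# with small noise playing the role of bf16 rounding differences.
rng = np.random.default_rng(0)
reference = rng.standard_normal((130, 128))
candidate = reference + 1e-3 * rng.standard_normal((130, 128))

worst = worst_case_token_cosine(reference, candidate)
print(f"worst-case cosine: {worst:.6f}")
```

Reporting the minimum rather than the mean means a single badly converted token position cannot hide behind many good ones.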
Evaluation
Retrieval quality of this checkpoint (and its sibling precisions), measured as NDCG@10 / Recall@10 on judged pools. Retention = metric ÷ bf16 metric, computed per dataset and then averaged over datasets.
Setup. English = the four NanoBEIR sets (full small corpora, ~2-5k passages, 50 queries each). Multilingual = MIRACL dev (the real queries and relevance judgments) for Spanish, German, Japanese, and Arabic, each scored over a reduced pool of ~6k passages (judged positives + hard-mined negatives + sampled distractors, from mteb/MIRACLRetrievalHardNegatives), 100 queries each. Reduced pools make the task easier, so absolute scores are higher than on full-corpus MIRACL and are not leaderboard-comparable, but every precision searches the identical pool, so the retention numbers (the point of this table) are sound. ColBERT uses brute-force MaxSim with no query augmentation, so its absolute scores sit a touch below a full PLAID setup.
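For readers unfamiliar with the two metrics, here is a minimal per-query sketch. It assumes linear gain and a log2 rank discount for NDCG, and recall over all judged positives. Evaluation toolkits differ in such details, and the helper names and toy judgments are illustrative only, not taken from the evaluation code behind the tables.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k gains (rank 1 has discount 1)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, relevance, k=10):
    """NDCG@k for one query; `relevance` maps judged doc ids to gain values."""
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_ids]
    ideal = sorted(relevance.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


def recall_at_k(ranked_ids, relevance, k=10):
    """Fraction of the judged-relevant documents that appear in the top k."""
    relevant = {doc_id for doc_id, rel in relevance.items() if rel > 0}
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


# Toy query: two judged positives, retrieved at ranks 2 and 4.
relevance = {"d1": 1, "d7": 1}
ranking = ["d3", "d1", "d9", "d7", "d2"]

ndcg10 = ndcg_at_k(ranking, relevance, k=10)
recall10 = recall_at_k(ranking, relevance, k=10)
recall3 = recall_at_k(ranking, relevance, k=3)
print(ndcg10, recall10, recall3)
```

A dataset-level score is then the mean of these per-query values over all queries.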
Summary (mean over 8 datasets)
| precision | NDCG@10 | NDCG retention | Recall@10 | Recall retention | size |
|---|---|---|---|---|---|
| bf16 (this repo) | 0.740 | 100.0% | 0.780 | 100.0% | 707 MB |
| 8-bit | 0.741 | 100.0% | 0.779 | 99.4% | 376 MB |
| 4-bit | 0.731 | 98.7% | 0.780 | 99.7% | 199 MB |
| mxfp4 | 0.730 | 98.5% | 0.773 | 98.8% | n/a |
NDCG@10 by dataset
| dataset | bf16 (this repo) | 8-bit | 4-bit | mxfp4 |
|---|---|---|---|---|
| NanoNQ · en | 0.757 | 0.751 | 0.716 | 0.742 |
| NanoFiQA2018 · en | 0.528 | 0.512 | 0.524 | 0.520 |
| NanoSciFact · en | 0.693 | 0.712 | 0.702 | 0.682 |
| NanoNFCorpus · en | 0.345 | 0.342 | 0.335 | 0.334 |
| MIRACL · es | 0.900 | 0.901 | 0.899 | 0.900 |
| MIRACL · de | 0.823 | 0.837 | 0.826 | 0.811 |
| MIRACL · ja | 0.934 | 0.933 | 0.923 | 0.926 |
| MIRACL · ar | 0.938 | 0.941 | 0.924 | 0.926 |
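
The NDCG columns of the summary can be recomputed from the per-dataset table above. The sketch below copies those NDCG@10 values and applies the stated definition (per-dataset ratio to bf16, then the mean). Because it starts from values already rounded to three decimals, the last digit could in principle differ from figures computed on unrounded scores.

```python
# NDCG@10 per dataset, in table order:
# NanoNQ, NanoFiQA2018, NanoSciFact, NanoNFCorpus, MIRACL es, de, ja, ar
ndcg = {
    "bf16":  [0.757, 0.528, 0.693, 0.345, 0.900, 0.823, 0.934, 0.938],
    "8-bit": [0.751, 0.512, 0.712, 0.342, 0.901, 0.837, 0.933, 0.941],
    "4-bit": [0.716, 0.524, 0.702, 0.335, 0.899, 0.826, 0.923, 0.924],
    "mxfp4": [0.742, 0.520, 0.682, 0.334, 0.900, 0.811, 0.926, 0.926],
}


def mean(values):
    return sum(values) / len(values)


def retention(metric, baseline):
    """Mean of the per-dataset ratios metric / baseline."""
    return mean([m / b for m, b in zip(metric, baseline)])


summary = {
    name: (mean(values), retention(values, ndcg["bf16"]))
    for name, values in ndcg.items()
}

for name, (avg, ret) in summary.items():
    print(f"{name:>6}  NDCG@10={avg:.3f}  retention={ret:.1%}")
```

Averaging the per-dataset ratios, rather than taking the ratio of the two means, gives every dataset equal weight regardless of how high its absolute scores are.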
License & attribution
Redistributed under the LFM Open License v1.0 (LICENSE), the same license as the original model. Per Section 4, this notice records that the files were modified (format conversion to MLX). The original work is by Liquid AI; this repository is an independent conversion, not affiliated with or endorsed by Liquid AI. The license includes a commercial-use threshold (Section 5); review it for your use case.
Base model: LiquidAI/LFM2.5-ColBERT-350M