Matrix v6 Reranker (matrix-2 v6)
Stage-2 cross-encoder reranker of the Hyperspace Matrix v6 capability-routing
cascade (retrieve → rerank). It scores a (query, capability) pair jointly and
re-orders a retriever's top-k candidates for precision.
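The rerank step of the cascade can be sketched as follows. This is a minimal illustration, not the canonical serving code: `rerank`, `toy_score`, and the `top_n` parameter are hypothetical names, and `toy_score` is a word-overlap stand-in for the real cross-encoder so that the example runs without model weights.

```python
from typing import Callable, List, Tuple

def rerank(query: str,
           candidates: List[str],
           score_fn: Callable[[str, List[str]], List[float]],
           top_n: int = 5) -> List[Tuple[str, float]]:
    """Re-order retriever candidates by score, highest first, and keep top_n."""
    scores = score_fn(query, candidates)
    if len(scores) != len(candidates):
        raise ValueError("score_fn must return one score per candidate")
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]

# Hypothetical stand-in scorer: counts query words found in the candidate text.
# The real reranker scores each (query, capability) pair jointly with a model.
def toy_score(query: str, candidates: List[str]) -> List[float]:
    words = set(query.lower().split())
    return [float(sum(w in c.lower() for w in words)) for c in candidates]

# Pretend these came back from the stage-1 retriever, in retriever order.
retrieved = [
    "weather-now: Get the current weather for a city.",
    "pdf-to-markdown: Convert PDF documents into clean Markdown.",
]
top = rerank("convert a pdf to markdown", retrieved, toy_score, top_n=1)
```

In a real deployment `score_fn` would wrap the reranker's own scoring call, and `candidates` would be the retriever's top-k list.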
- Base: Qwen/Qwen3-0.6B (backbone trained, not frozen) + a Linear(1024 → 1) scoring head on the last-token hidden state.
- Params: 596M · fp: bf16 backbone / fp32 head · max_len: 256.
- Trained on: the 446,897-capability corpus (387K skills.sh SKILL.md across 7,029 repos + the Hyperspace tool/agent catalog), 1.16M mined (query, capability) pairs with top-k hard negatives and cluster-graded labels (190,085 intent clusters).
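To make "scoring head on the last-token hidden state" concrete, here is a small pure-Python sketch of last-token pooling followed by a linear projection to one score. It is an illustration under assumptions: the function names are hypothetical, the hidden size is shrunk from 1024 to 3, and whether the real head has a bias term is not stated in this card. The exact implementation is in modeling_matrix_reranker.py.

```python
from typing import List

def last_token_index(attention_mask: List[int]) -> int:
    """Index of the last real (non-padding) token, given a 1/0 attention mask."""
    for i in range(len(attention_mask) - 1, -1, -1):
        if attention_mask[i] == 1:
            return i
    raise ValueError("attention mask has no real tokens")

def score_from_hidden(hidden_states: List[List[float]],
                      attention_mask: List[int],
                      weight: List[float],
                      bias: float = 0.0) -> float:
    """Apply Linear(hidden_dim -> 1) to the last real token's hidden vector."""
    h = hidden_states[last_token_index(attention_mask)]
    if len(h) != len(weight):
        raise ValueError("hidden size and weight size differ")
    return sum(x * w for x, w in zip(h, weight)) + bias

# Toy sequence of 3 positions with hidden_dim = 3; the third position is padding,
# so the score must come from the second position only.
hidden = [[0.1, 0.2, 0.3], [1.0, 0.0, -1.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
score = score_from_hidden(hidden, mask, weight=[0.5, 0.5, 0.5], bias=0.25)
```

Scanning the mask from the end means the same helper picks the right position for both left-padded and right-padded batches.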
Lineage — note on size
Matrix-2 v5 was a frozen Qwen2.5-1.5B decoder + 7 small heads. v6 deliberately went smaller and purpose-built rather than bigger: the routing budget (4 GB / P2P node) is spent on a good embedding (a 0.6B retriever) and a 0.6B reranker, not on a larger backbone, because model size has steep diminishing returns on the representation task that is the bottleneck (see MATRIX_V6_ARCHITECTURE.md §3). So this v6 reranker (0.6B) is smaller than the v5 model it supersedes, yet it out-scores a 4B baseline (see Eval below).
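A rough back-of-the-envelope check of the budget argument, counting weights only. The assumptions are labeled in the comments: the retriever's parameter count is taken as a flat 0.6B, both models are assumed to be held in bf16 (2 bytes per parameter), and "4 GB" is read as 4 GiB. Activations, the tiny fp32 head, and the capability index are ignored, so actual memory use will be higher.

```python
BYTES_PER_PARAM_BF16 = 2

reranker_params = 596_000_000    # "Params: 596M" from this card
retriever_params = 600_000_000   # assumption: "0.6B" taken at face value
budget_bytes = 4 * 1024**3       # assumption: "4 GB" read as 4 GiB

def weight_gib(n_params: int, bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Approximate size of the weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3

reranker_gib = weight_gib(reranker_params)
retriever_gib = weight_gib(retriever_params)
total_gib = reranker_gib + retriever_gib

# True if both sets of weights fit inside the per-node budget.
fits = total_gib * 1024**3 <= budget_bytes
print(f"reranker ~{reranker_gib:.2f} GiB, retriever ~{retriever_gib:.2f} GiB, "
      f"total ~{total_gib:.2f} GiB, fits: {fits}")
```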
Eval (5,000 held-out test queries, retrieve top-50 → rerank)
| metric | retriever only | + this reranker | lift |
|---|---|---|---|
| cluster@1 | 0.517 | 0.696 | +0.179 |
| ndcg@10 (cluster) | 0.489 | 0.600 | +0.111 |
| ret@5 | 0.472 | 0.601 | +0.129 |
Three-way (1,000 queries): this 0.6B reranker (ours_0.6B, cluster@1 0.696) beats the 4B zeroentropy/zerank-2 baseline (0.599) and the retriever-only floor (0.519).
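This card does not spell out the metric definitions, so the sketch below is one plausible reading rather than the evaluation code that produced the numbers above. It assumes cluster@1 is the fraction of queries whose top-ranked capability belongs to the query's gold intent cluster, and that ndcg@10 uses the cluster-graded labels as gains, with the ideal ordering computed over the ranked candidate list only. All function names are hypothetical.

```python
import math
from typing import List, Sequence

def cluster_at_1(ranked_clusters: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Fraction of queries whose rank-1 result is in the gold cluster."""
    hits = sum(1 for ranked, g in zip(ranked_clusters, gold) if ranked and ranked[0] == g)
    return hits / len(gold)

def ndcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """nDCG@k for one query; gains are graded labels listed in ranked order."""
    def dcg(values: List[float]) -> float:
        return sum(g / math.log2(i + 2) for i, g in enumerate(values[:k]))
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(list(gains)) / ideal if ideal > 0 else 0.0

# Two toy queries, both with gold cluster "pdf"; cluster ids of results in ranked order.
ranked = [["pdf", "weather"], ["weather", "pdf"]]
gold = ["pdf", "pdf"]
c_at_1 = cluster_at_1(ranked, gold)    # one hit out of two queries

perfect = ndcg_at_k([1.0, 0.0])        # relevant item ranked first
swapped = ndcg_at_k([0.0, 1.0])        # relevant item ranked second
```

The lift column in the table is the plain difference between the two system columns, for example 0.696 - 0.517 = 0.179 for cluster@1.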
Files
| file | what |
|---|---|
| model.safetensors | the reranker weights (backbone.* + head) |
| final.pt | original training checkpoint ({model_state_dict, config}), for provenance |
| modeling_matrix_reranker.py | exact architecture + load + score() |
| config.json | arch / pooling / input-format metadata |
| tokenizer*, vocab.json | Qwen3-0.6B tokenizer |
Usage
# Assumes modeling_matrix_reranker.py and model.safetensors from this repo
# are in the working directory, and that a CUDA device is available.
from modeling_matrix_reranker import MatrixV6Reranker, format_capability
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("hyperspaceai/matrix-v6-reranker")
m = MatrixV6Reranker.from_checkpoint("model.safetensors").to("cuda")

q = "convert a pdf to markdown"
caps = [
    format_capability("pdf-to-markdown", "Convert PDF documents into clean Markdown."),
    format_capability("weather-now", "Get the current weather for a city."),
]
print(m.score(tok, q, caps, device="cuda"))  # higher = more relevant
The retriever and the capability index that this model reranks live at hyperspaceai/matrix-v6 (base Qwen/Qwen3-Embedding-0.6B, 446,897 capabilities). Canonical serving code: agentic-os-prod serve_v6.py.