Matrix v6 Reranker (matrix-2 v6)

The stage-2 cross-encoder reranker in the Hyperspace Matrix v6 capability-routing cascade (retrieve → rerank). It scores each (query, capability) pair jointly and re-orders a retriever's top-k candidates to improve precision.
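The re-ordering step can be sketched as below. This is a minimal illustration, not the serving code: `rerank_top_k` and `toy_score` are hypothetical names, and `toy_score` is a word-overlap stand-in for a real cross-encoder so the example runs without model weights.

```python
from typing import Callable, List, Sequence, Tuple


def rerank_top_k(
    query: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, Sequence[str]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Re-order retriever candidates by score, highest first."""
    scores = score_fn(query, candidates)
    if len(scores) != len(candidates):
        raise ValueError("score_fn must return one score per candidate")
    # sorted() is stable, so tied candidates keep the retriever's order.
    ranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
    return [(c, float(s)) for c, s in ranked]


def toy_score(query: str, candidates: Sequence[str]) -> List[float]:
    """Hypothetical stand-in scorer: counts words shared with the query."""
    q = set(query.lower().split())
    return [
        float(len(q & set(c.lower().replace("-", " ").split())))
        for c in candidates
    ]


# Pretend these came back from the stage-1 retriever, in retriever order.
top_k = ["weather-now", "pdf-to-markdown", "markdown-lint"]
ranked = rerank_top_k("convert a pdf to markdown", top_k, toy_score)
```

In a real pipeline the scorer would wrap the reranker's own scoring call, and `top_k` would be the retriever's candidate list.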

  • Base: Qwen/Qwen3-0.6B (backbone trained, not frozen) + a Linear(1024 → 1) scoring head on the last-token hidden state.
  • Params: 596M · precision: bf16 backbone / fp32 head · max_len: 256.
  • Trained on: the 446,897-capability corpus (387K skills.sh SKILL.md files across 7,029 repos, plus the Hyperspace tool/agent catalog), using 1.16M mined (query, capability) pairs with top-k hard negatives and cluster-graded labels (190,085 intent clusters).
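To make "a scoring head on the last-token hidden state" concrete, here is a framework-free sketch of that pooling step. It is an illustration under assumptions, not the code in modeling_matrix_reranker.py: `last_token_score` is a hypothetical helper, the head is assumed to have a bias term, and the toy hidden size is 4 rather than 1024.

```python
from typing import Sequence


def last_token_score(
    hidden: Sequence[Sequence[float]],  # [seq_len][hidden_dim] final-layer states
    attention_mask: Sequence[int],      # 1 = real token, 0 = padding
    weight: Sequence[float],            # [hidden_dim] weights of the linear head
    bias: float,                        # assumed bias of the linear head
) -> float:
    """Pool the last attended token and project it to a single score."""
    # Last position with mask == 1; this works for left or right padding.
    # An all-zero mask raises ValueError (max of an empty sequence).
    last = max(i for i, m in enumerate(attention_mask) if m)
    pooled = hidden[last]
    return sum(h * w for h, w in zip(pooled, weight)) + bias


# Toy example: 3 positions, hidden_dim 4, last position is padding.
hidden = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
score = last_token_score(hidden, [1, 1, 0], [0.5, 0.25, 0.0, 0.0], 0.1)
```

Because the backbone is a decoder with causal attention, the last real token is the only position that has attended to the whole (query, capability) input, which is why it is the natural place to attach the head.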

Lineage — note on size

Matrix-2 v5 was a frozen Qwen2.5-1.5B decoder with 7 small heads. v6 deliberately went smaller and purpose-built rather than bigger: the routing budget (4 GB per P2P node) is spent on a good embedding model (a 0.6B retriever) plus a 0.6B reranker instead of one larger backbone, because model size has steep diminishing returns on the representation task that is the bottleneck (see MATRIX_V6_ARCHITECTURE.md §3). As a result, this 0.6B reranker is smaller than the v5 model it supersedes, yet it out-scores a 4B baseline (see Eval below).
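A back-of-the-envelope check of how the two models fit the budget, counting weights only. This is a rough sketch with several assumptions: the 4 GB figure is read as a memory budget in decimal gigabytes, the retriever is taken as exactly 0.6B parameters stored in bf16, and activations, tokenizer, index, and runtime overhead are ignored. `weight_footprint_gb` is a hypothetical helper.

```python
def weight_footprint_gb(n_params: int, bytes_per_param: int) -> float:
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


BF16_BYTES = 2

# 596M is the stated reranker size; the fp32 head is negligible next to it.
reranker_gb = weight_footprint_gb(596_000_000, BF16_BYTES)
# Assumption: the "0.6B" retriever is ~600M parameters, also held in bf16.
retriever_gb = weight_footprint_gb(600_000_000, BF16_BYTES)

total_gb = reranker_gb + retriever_gb
budget_share = total_gb / 4.0  # share of the assumed 4 GB budget

print(f"reranker ~{reranker_gb:.2f} GB, retriever ~{retriever_gb:.2f} GB, "
      f"total ~{total_gb:.2f} GB ({budget_share:.0%} of budget)")
```

Under these assumptions the two sets of weights take roughly 2.4 GB, which leaves headroom for everything the estimate leaves out.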

Eval (5,000 held-out test queries, retrieve top-50 → rerank)

| metric | retriever only | + this reranker | lift |
| --- | --- | --- | --- |
| cluster@1 | 0.517 | 0.696 | +0.179 |
| ndcg@10 (cluster) | 0.489 | 0.600 | +0.111 |
| ret@5 | 0.472 | 0.601 | +0.129 |
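The exact metric definitions are not spelled out here, so the following is one plausible reading rather than the evaluation script. It assumes cluster@k counts a hit when any of the top-k results belongs to the query's gold intent cluster, and that ndcg@10 uses linear graded gains with a log2 discount, normalised by the ideal ordering of the same candidate list. `hit_at_k` and `ndcg_at_k` are hypothetical helpers.

```python
import math
from typing import Sequence


def hit_at_k(ranked_clusters: Sequence[str], gold_cluster: str, k: int = 1) -> float:
    """1.0 if any of the top-k results is in the gold cluster, else 0.0."""
    return 1.0 if gold_cluster in ranked_clusters[:k] else 0.0


def ndcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """nDCG@k over graded gains listed in ranked order (linear gain)."""
    def dcg(gs: Sequence[float]) -> float:
        # Rank i (0-based) is discounted by log2(i + 2).
        return sum(g / math.log2(i + 2) for i, g in enumerate(gs[:k]))

    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Toy data: per-query cluster ids of the ranked results and the gold cluster.
runs = [(["c7", "c2", "c9"], "c7"), (["c4", "c1", "c3"], "c1")]
cluster_at_1 = sum(hit_at_k(r, gold, k=1) for r, gold in runs) / len(runs)

# Graded gains of one ranked list: the best item sits at rank 2.
toy_ndcg = ndcg_at_k([0.0, 2.0, 1.0], k=10)
```

The lift column is the difference between the reranked and retriever-only values for each metric, for example 0.696 − 0.517 = 0.179 for cluster@1.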

Three-way comparison on cluster@1 (1,000 queries): this 0.6B reranker (ours_0.6B, 0.696) beats the 4B zeroentropy/zerank-2 baseline (0.599) and the retriever-only floor (0.519).

Files

| file | what |
| --- | --- |
| model.safetensors | the reranker weights (backbone.* + head) |
| final.pt | original training checkpoint ({model_state_dict, config}), for provenance |
| modeling_matrix_reranker.py | exact architecture + load + score() |
| config.json | arch / pooling / input-format metadata |
| tokenizer*, vocab.json | Qwen3-0.6B tokenizer |

Usage

```python
from modeling_matrix_reranker import MatrixV6Reranker, format_capability
from transformers import AutoTokenizer

# Tokenizer is pulled from the model repo; weights load from the local safetensors file.
tok = AutoTokenizer.from_pretrained("hyperspaceai/matrix-v6-reranker")
m = MatrixV6Reranker.from_checkpoint("model.safetensors").to("cuda")

q = "convert a pdf to markdown"
# format_capability is called here with a capability name and its description.
caps = [format_capability("pdf-to-markdown", "Convert PDF documents into clean Markdown."),
        format_capability("weather-now", "Get the current weather for a city.")]

# One score per capability, in input order.
print(m.score(tok, q, caps, device="cuda"))   # higher = more relevant
```

The retriever and the capability index that this model reranks live at hyperspaceai/matrix-v6 (base Qwen/Qwen3-Embedding-0.6B, 446,897 capabilities). Canonical serving code: agentic-os-prod serve_v6.py.
