Matrix v6 Reranker (matrix-2 v6)

Stage-2 cross-encoder reranker of the Hyperspace Matrix v6 capability-routing cascade (retrieve → rerank). It scores a (query, capability) pair jointly and re-orders a retriever's top-k candidates for precision.
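
The rerank step of the cascade can be sketched in a few lines. This is a minimal illustration, not the serving code: `rerank` and `toy_overlap_score` are hypothetical names, and the word-overlap scorer is only a stand-in for the cross-encoder so that the example runs without the model weights.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 10,
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair and return the top_n by descending score."""
    scored = [(cand, score_fn(query, cand)) for cand in candidates]
    # list.sort is stable, so tied candidates keep the retriever's original order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def toy_overlap_score(query: str, capability: str) -> float:
    """Hypothetical stand-in scorer: fraction of query words present in the capability text."""
    q_words = set(query.lower().split())
    c_words = set(capability.lower().replace("-", " ").split())
    return len(q_words & c_words) / max(len(q_words), 1)


# Pretend this is the retriever's candidate list, in retriever order.
retrieved = [
    "weather-now: get the current weather for a city",
    "pdf-to-markdown: convert pdf documents into clean markdown",
    "image-resize: resize an image to a target width",
]
ranked = rerank("convert a pdf to markdown", retrieved, toy_overlap_score, top_n=2)
for capability, score in ranked:
    print(f"{score:.2f}  {capability}")
```

In real use, `score_fn` would wrap the reranker's own scoring call, and the candidate list would be the retriever's top-k.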

Base: Qwen/Qwen3-0.6B (backbone trained, not frozen) + a Linear(1024 → 1) scoring head on the last-token hidden state.
Params: 596M · precision: bf16 backbone / fp32 head · max_len: 256.
Trained on: the 446,897-capability corpus (387K skills.sh SKILL.md files across 7,029 repos, plus the Hyperspace tool/agent catalog) and 1.16M mined (query, capability) pairs with top-k hard negatives and cluster-graded labels (190,085 intent clusters).
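
To make the scoring head concrete, here is a toy, pure-Python sketch of last-token pooling followed by a single linear unit. It is an illustration under assumptions, not the shipped implementation (that is in `modeling_matrix_reranker.py`): the function names are hypothetical, the hidden size is 4 instead of 1024, and the padding side is not stated in this card, so the sketch locates the last real token from the attention mask.

```python
from typing import Sequence


def last_token_index(attention_mask: Sequence[int]) -> int:
    """Index of the last real (non-padding) token; works for left or right padding."""
    for i in range(len(attention_mask) - 1, -1, -1):
        if attention_mask[i] == 1:
            return i
    raise ValueError("attention mask contains no real tokens")


def score_from_hidden(
    hidden_states: Sequence[Sequence[float]],  # one vector per token position
    attention_mask: Sequence[int],             # 1 = real token, 0 = padding
    weight: Sequence[float],                   # head weights, one per hidden dimension
    bias: float,
) -> float:
    """Pool the last real token's hidden state and apply a Linear(hidden -> 1) head."""
    pooled = hidden_states[last_token_index(attention_mask)]
    return sum(w * h for w, h in zip(weight, pooled)) + bias


# Toy input: 4 positions, the last one is padding and must be ignored.
hidden = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],  # padding position
]
mask = [1, 1, 1, 0]
score = score_from_hidden(hidden, mask, weight=[0.5, -1.0, 2.0, 0.0], bias=0.1)
print(score)
```

Because the backbone is a causal decoder, the last real token is the only position that has attended to the whole (query, capability) input, which is the usual reason for pooling there.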

Lineage — note on size

Matrix-2 v5 was a frozen Qwen2.5-1.5B decoder + ~7 small heads. v6 deliberately went smaller and purpose-built rather than bigger: the routing budget (~4 GB per P2P node) is spent on a good embedding model (a 0.6B retriever) and a 0.6B reranker instead of a larger backbone, because model size has steeply diminishing returns on the representation task that is the bottleneck (see MATRIX_V6_ARCHITECTURE.md §3). As a result, this v6 reranker (0.6B) is smaller than the v5 model it supersedes, yet it out-scores a 4B baseline (below).
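
A back-of-the-envelope check of the size argument, as a rough sketch only. It counts weight bytes and nothing else (no activations, capability index, or runtime overhead), and it assumes bf16 storage throughout; the retriever's exact parameter count and the v5 backbone's precision are not given in this card, so the nominal 0.6B and 1.5B figures are used.

```python
BYTES_PER_PARAM_BF16 = 2  # bf16 stores each parameter in 2 bytes


def weight_gb(n_params: float, bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Approximate weight footprint in GB (10^9 bytes)."""
    return n_params * bytes_per_param / 1e9


reranker_gb = weight_gb(596e6)      # this model: 596M params (the fp32 head is negligible)
retriever_gb = weight_gb(0.6e9)     # assumption: nominal 0.6B retriever
v5_backbone_gb = weight_gb(1.5e9)   # assumption: v5's 1.5B backbone held in bf16
budget_gb = 4.0                     # approximate per-node routing budget from the text

print(f"v6 retriever + reranker: {reranker_gb + retriever_gb:.2f} GB of ~{budget_gb:.0f} GB")
print(f"v5 backbone alone:       {v5_backbone_gb:.2f} GB of ~{budget_gb:.0f} GB")
```

Under these assumptions the two v6 models together take less weight memory than the v5 backbone alone, leaving headroom in the budget for everything that is not weights.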

Eval (5,000 held-out test queries, retrieve top-50 → rerank)

metric	retriever only	+ this reranker	lift
cluster@1	0.517	0.696	+0.179
ndcg@10 (cluster)	0.489	0.600	+0.111
ret@5	0.472	0.601	+0.129
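
The card does not spell out how the metrics are defined, so the following is a sketch under stated assumptions: cluster@1 is taken to be the fraction of queries whose top-ranked capability falls in the query's gold intent cluster, ndcg@k uses a binary gain (1 if a result is in the gold cluster, else 0), and lift is the plain difference between the two columns. The function names and toy data are hypothetical.

```python
import math
from typing import Sequence


def cluster_at_1(ranked_clusters: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Fraction of queries whose rank-1 result is in the gold cluster."""
    hits = sum(1 for ranks, g in zip(ranked_clusters, gold) if ranks and ranks[0] == g)
    return hits / len(gold)


def ndcg_at_k(ranked: Sequence[str], gold: str, k: int = 10) -> float:
    """Binary-gain nDCG@k for one query; the ideal ranking puts all gold-cluster hits first."""
    dcg = sum(1.0 / math.log2(i + 2) for i, c in enumerate(ranked[:k]) if c == gold)
    n_relevant = sum(1 for c in ranked if c == gold)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(min(n_relevant, k)))
    return dcg / idcg if idcg > 0 else 0.0


# Toy data: the cluster id of each ranked result, per query.
gold_clusters = ["a", "d", "e", "g"]
retriever_only = [["a", "b"], ["c", "d"], ["e", "f"], ["h", "g"]]
reranked = [["a", "b"], ["d", "c"], ["e", "f"], ["h", "g"]]

before = cluster_at_1(retriever_only, gold_clusters)
after = cluster_at_1(reranked, gold_clusters)
lift = after - before
print(f"cluster@1: {before:.3f} -> {after:.3f} (lift {lift:+.3f})")
```

The same subtraction reproduces the lift column above, for example 0.696 − 0.517 = 0.179 for cluster@1.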

Three-way comparison (1,000 queries, cluster@1): this 0.6B reranker (ours_0.6B, 0.696) beats the 4B zeroentropy/zerank-2 baseline (0.599) and the retriever-only floor (0.519).

Files

file	what
`model.safetensors`	the reranker weights (backbone.* + head)
`final.pt`	original training checkpoint (`{model_state_dict, config}`), for provenance
`modeling_matrix_reranker.py`	exact architecture + load + `score()`
`config.json`	arch / pooling / input-format metadata
`tokenizer*`, `vocab.json`	Qwen3-0.6B tokenizer

Usage

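import torch
from transformers import AutoTokenizer
# modeling_matrix_reranker.py (from this repo) must be importable, e.g. in the working directory.
from modeling_matrix_reranker import MatrixV6Reranker, format_capability

# Use the GPU when one is available; otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained("hyperspaceai/matrix-v6-reranker")
# Path to a local copy of model.safetensors downloaded from this repo.
m = MatrixV6Reranker.from_checkpoint("model.safetensors").to(device)

q = "convert a pdf to markdown"
caps = [format_capability("pdf-to-markdown", "Convert PDF documents into clean Markdown."),
        format_capability("weather-now", "Get the current weather for a city.")]
print(m.score(tok, q, caps, device=device))   # higher = more relevant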

The retriever and the capability index that this model reranks live at hyperspaceai/matrix-v6 (base Qwen/Qwen3-Embedding-0.6B, 446,897 capabilities). Canonical serving code: agentic-os-prod serve_v6.py.
