---
language:
- en
license: cc-by-nc-4.0
tags:
- mteb
- sentence-transformers
- embedding
- text-embedding
- ogma
- axiotic
- matryoshka
- small-model
model-index:
- name: ogma-large
  results:
  - task:
      type: Classification
    dataset:
      type: mteb/AmazonCounterfactualClassification
      name: MTEB AmazonCounterfactualClassification
      config: default
      split: test
      revision: 1f7e6a9d6fa6e64c53d146e428565640410c0df1
    metrics:
    - type: accuracy
      value: 72.85
  - task:
      type: Classification
    dataset:
      type: mteb/AmazonPolarityClassification
      name: MTEB AmazonPolarityClassification
      config: default
      split: test
      revision: e2d317d38cd51312af73b3d32a06d1a08b442046
    metrics:
    - type: accuracy
      value: 83.51
  - task:
      type: Classification
    dataset:
      type: mteb/AmazonReviewsClassification
      name: MTEB AmazonReviewsClassification
      config: default
      split: test
      revision: 6b5d328eaae8ef408dd7d775040245cf86f92e9d
    metrics:
    - type: accuracy
      value: 39.85
  - task:
      type: Clustering
    dataset:
      type: mteb/BiorxivClusteringP2P
      name: MTEB BiorxivClusteringP2P
      config: default
      split: test
      revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
    metrics:
    - type: v_measure
      value: 34.84
  - task:
      type: Clustering
    dataset:
      type: mteb/BiorxivClusteringS2S
      name: MTEB BiorxivClusteringS2S
      config: default
      split: test
      revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
    metrics:
    - type: v_measure
      value: 27.02
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackAndroidRetrieval
      name: MTEB CQADupstackAndroidRetrieval
      config: default
      split: test
      revision: 9be4c0e46342e8e3aff577a89b9a1ec9bc6b4af3
    metrics:
    - type: ndcg_at_10
      value: 38.98
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackEnglishRetrieval
      name: MTEB CQADupstackEnglishRetrieval
      config: default
      split: test
      revision: ad9991cb51e31e31e430383c75ffb2885547b5f0
    metrics:
    - type: ndcg_at_10
      value: 39.78
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackGamingRetrieval
      name: MTEB CQADupstackGamingRetrieval
      config: default
      split: test
      revision: 4885aa143210c98657558c04aaf3dc47cfb54340
    metrics:
    - type: ndcg_at_10
      value: 48.24
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackGisRetrieval
      name: MTEB CQADupstackGisRetrieval
      config: default
      split: test
      revision: 5003b3064772da1887988e05400cf3806fe491f2
    metrics:
    - type: ndcg_at_10
      value: 33.09
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackMathematicaRetrieval
      name: MTEB CQADupstackMathematicaRetrieval
      config: default
      split: test
      revision: 90fceea13679c63fe563ded68f3b6f06e50061de
    metrics:
    - type: ndcg_at_10
      value: 25.36
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackPhysicsRetrieval
      name: MTEB CQADupstackPhysicsRetrieval
      config: default
      split: test
      revision: 79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4
    metrics:
    - type: ndcg_at_10
      value: 38.02
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackProgrammersRetrieval
      name: MTEB CQADupstackProgrammersRetrieval
      config: default
      split: test
      revision: 6184bc1440d2dbc7612be22b50686b8826d22b32
    metrics:
    - type: ndcg_at_10
      value: 36.42
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackRetrieval
      name: MTEB CQADupstackRetrieval
      config: default
      split: test
      revision: '1'
    metrics:
    - type: ndcg_at_10
      value: 33.61
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackStatsRetrieval
      name: MTEB CQADupstackStatsRetrieval
      config: default
      split: test
      revision: 65ac3a16b8e91f9cee4c9828cc7c335575432a2a
    metrics:
    - type: ndcg_at_10
      value: 28.07
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackTexRetrieval
      name: MTEB CQADupstackTexRetrieval
      config: default
      split: test
      revision: 46989137a86843e03a6195de44b09deda022eec7
    metrics:
    - type: ndcg_at_10
      value: 23.29
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackUnixRetrieval
      name: MTEB CQADupstackUnixRetrieval
      config: default
      split: test
      revision: 6c6430d3a6d36f8d2a829195bc5dc94d7e063e53
    metrics:
    - type: ndcg_at_10
      value: 32.78
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackWebmastersRetrieval
      name: MTEB CQADupstackWebmastersRetrieval
      config: default
      split: test
      revision: 160c094312a0e1facb97e55eeddb698c0abe3571
    metrics:
    - type: ndcg_at_10
      value: 32.9
  - task:
      type: Retrieval
    dataset:
      type: mteb/CQADupstackWordpressRetrieval
      name: MTEB CQADupstackWordpressRetrieval
      config: default
      split: test
      revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4
    metrics:
    - type: ndcg_at_10
      value: 26.42
  - task:
      type: Retrieval
    dataset:
      type: mteb/ClimateFEVER
      name: MTEB ClimateFEVER
      config: default
      split: test
      revision: 47f2ac6acb640fc46020b02a5b59fdda04d39380
    metrics:
    - type: ndcg_at_10
      value: 24.91
  - task:
      type: Retrieval
    dataset:
      type: mteb/DBPedia
      name: MTEB DBPedia
      config: default
      split: test
      revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659
    metrics:
    - type: ndcg_at_10
      value: 37.55
  - task:
      type: Classification
    dataset:
      type: mteb/EmotionClassification
      name: MTEB EmotionClassification
      config: default
      split: test
      revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
    metrics:
    - type: accuracy
      value: 48.29
  - task:
      type: Retrieval
    dataset:
      type: mteb/FEVER
      name: MTEB FEVER
      config: default
      split: test
      revision: bea83ef9e8fb933d90a2f1d5515737465d613e12
    metrics:
    - type: ndcg_at_10
      value: 59.78
  - task:
      type: Retrieval
    dataset:
      type: mteb/HotpotQA
      name: MTEB HotpotQA
      config: default
      split: test
      revision: ab518f4d6fcca38d87c25209f94beba119d02014
    metrics:
    - type: ndcg_at_10
      value: 55.46
  - task:
      type: Retrieval
    dataset:
      type: mteb/MSMARCO
      name: MTEB MSMARCO
      config: default
      split: test
      revision: c5a29a104738b98a9e76336939199e264163d4a0
    metrics:
    - type: ndcg_at_10
      value: 0
  - task:
      type: Classification
    dataset:
      type: mteb/MTOPIntentClassification
      name: MTEB MTOPIntentClassification
      config: default
      split: test
      revision: 2992d820f31312593c49a4890430aadadb0f0039
    metrics:
    - type: accuracy
      value: 64.35
  - task:
      type: Clustering
    dataset:
      type: mteb/MedrxivClusteringP2P
      name: MTEB MedrxivClusteringP2P
      config: default
      split: test
      revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
    metrics:
    - type: v_measure
      value: 32.32
  - task:
      type: Clustering
    dataset:
      type: mteb/MedrxivClusteringS2S
      name: MTEB MedrxivClusteringS2S
      config: default
      split: test
      revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
    metrics:
    - type: v_measure
      value: 29.07
  - task:
      type: Reranking
    dataset:
      type: mteb/MindSmallReranking
      name: MTEB MindSmallReranking
      config: default
      split: test
      revision: 227478e3235572039f4f7661840e059f31ef6eb1
    metrics:
    - type: map
      value: 30.61
  - task:
      type: Retrieval
    dataset:
      type: mteb/NFCorpus
      name: MTEB NFCorpus
      config: default
      split: test
      revision: ec0fa4fe99da2ff19ca1214b7966684033a58814
    metrics:
    - type: ndcg_at_10
      value: 31.98
  - task:
      type: Retrieval
    dataset:
      type: mteb/NQ
      name: MTEB NQ
      config: default
      split: test
      revision: b774495ed302d8c44a3a7ea25c90dbce03968f31
    metrics:
    - type: ndcg_at_10
      value: 54.65
  - task:
      type: Retrieval
    dataset:
      type: mteb/QuoraRetrieval
      name: MTEB QuoraRetrieval
      config: default
      split: test
      revision: e4e08e0b7dbe3c8700f0daef558ff32256715259
    metrics:
    - type: ndcg_at_10
      value: 61.89
  - task:
      type: Clustering
    dataset:
      type: mteb/RedditClustering
      name: MTEB RedditClustering
      config: default
      split: test
      revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
    metrics:
    - type: v_measure
      value: 44.56
  - task:
      type: Clustering
    dataset:
      type: mteb/RedditClusteringP2P
      name: MTEB RedditClusteringP2P
      config: default
      split: test
      revision: 385e3cb46b4cfa89021f56c4380204149d0efe33
    metrics:
    - type: v_measure
      value: 54.14
  - task:
      type: Retrieval
    dataset:
      type: mteb/SCIDOCS
      name: MTEB SCIDOCS
      config: default
      split: test
      revision: f8c2fcf00f625baaa80f62ec5bd9e1fff3b8ae88
    metrics:
    - type: ndcg_at_10
      value: 17.07
  - task:
      type: STS
    dataset:
      type: mteb/SICK-R
      name: MTEB SICK-R
      config: default
      split: test
      revision: 20a6d6f312dd54037fe07a32d58e5e168867909d
    metrics:
    - type: cosine_spearman
      value: 82.07
  - task:
      type: STS
    dataset:
      type: mteb/STS12
      name: MTEB STS12
      config: default
      split: test
      revision: a0d554a64d88156834ff5ae9920b964011b16384
    metrics:
    - type: cosine_spearman
      value: 78.29
  - task:
      type: STS
    dataset:
      type: mteb/STS13
      name: MTEB STS13
      config: default
      split: test
      revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
    metrics:
    - type: cosine_spearman
      value: 85.41
  - task:
      type: STS
    dataset:
      type: mteb/STS14
      name: MTEB STS14
      config: default
      split: test
      revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
    metrics:
    - type: cosine_spearman
      value: 82.62
  - task:
      type: STS
    dataset:
      type: mteb/STS15
      name: MTEB STS15
      config: default
      split: test
      revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
    metrics:
    - type: cosine_spearman
      value: 86.73
  - task:
      type: STS
    dataset:
      type: mteb/STS16
      name: MTEB STS16
      config: default
      split: test
      revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
    metrics:
    - type: cosine_spearman
      value: 83.84
  - task:
      type: STS
    dataset:
      type: mteb/STSBenchmark
      name: MTEB STSBenchmark
      config: default
      split: test
      revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
    metrics:
    - type: cosine_spearman
      value: 87.32
  - task:
      type: Reranking
    dataset:
      type: mteb/SciDocsRR
      name: MTEB SciDocsRR
      config: default
      split: test
      revision: 39b8377811871075eed9de3b8a7e21aaa6acb3d8
    metrics:
    - type: map
      value: 75.52
  - task:
      type: Retrieval
    dataset:
      type: mteb/SciFact
      name: MTEB SciFact
      config: default
      split: test
      revision: d56462d0e63a25450459c4f213e49ffdb866f7f9
    metrics:
    - type: ndcg_at_10
      value: 63.03
  - task:
      type: PairClassification
    dataset:
      type: mteb/SprintDuplicateQuestions
      name: MTEB SprintDuplicateQuestions
      config: default
      split: test
      revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
    metrics:
    - type: cosine_ap
      value: 94.59
  - task:
      type: Clustering
    dataset:
      type: mteb/StackExchangeClustering
      name: MTEB StackExchangeClustering
      config: default
      split: test
      revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
    metrics:
    - type: v_measure
      value: 51.77
  - task:
      type: Clustering
    dataset:
      type: mteb/StackExchangeClusteringP2P
      name: MTEB StackExchangeClusteringP2P
      config: default
      split: test
      revision: 815ca46b2622cec33ccafc3735d572c266efdb44
    metrics:
    - type: v_measure
      value: 34.23
  - task:
      type: Reranking
    dataset:
      type: mteb/StackOverflowDupQuestions
      name: MTEB StackOverflowDupQuestions
      config: default
      split: test
      revision: 5debda000fe8e27ebb5c123d38081f92e1847a59
    metrics:
    - type: map
      value: 45.15
  - task:
      type: Summarization
    dataset:
      type: mteb/SummEval
      name: MTEB SummEval
      config: default
      split: test
      revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
    metrics:
    - type: cosine_spearman
      value: 30.93
  - task:
      type: Retrieval
    dataset:
      type: mteb/TRECCOVID
      name: MTEB TRECCOVID
      config: default
      split: test
      revision: bb9466bac8153a0349341eb1b22e06409e78ef4e
    metrics:
    - type: ndcg_at_10
      value: 72.99
  - task:
      type: Retrieval
    dataset:
      type: mteb/Touche2020
      name: MTEB Touche2020
      config: default
      split: test
      revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f
    metrics:
    - type: ndcg_at_10
      value: 28.12
  - task:
      type: Classification
    dataset:
      type: mteb/ToxicConversationsClassification
      name: MTEB ToxicConversationsClassification
      config: default
      split: test
      revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de
    metrics:
    - type: accuracy
      value: 65.79
  - task:
      type: Classification
    dataset:
      type: mteb/TweetSentimentExtractionClassification
      name: MTEB TweetSentimentExtractionClassification
      config: default
      split: test
      revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
    metrics:
    - type: accuracy
      value: 62.34
  - task:
      type: Clustering
    dataset:
      type: mteb/TwentyNewsgroupsClustering
      name: MTEB TwentyNewsgroupsClustering
      config: default
      split: test
      revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
    metrics:
    - type: v_measure
      value: 41.53
  - task:
      type: PairClassification
    dataset:
      type: mteb/TwitterSemEval2015
      name: MTEB TwitterSemEval2015
      config: default
      split: test
      revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
    metrics:
    - type: cosine_ap
      value: 71.88
  - task:
      type: PairClassification
    dataset:
      type: mteb/TwitterURLCorpus
      name: MTEB TwitterURLCorpus
      config: default
      split: test
      revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
    metrics:
    - type: cosine_ap
      value: 85.53
---

# ogma-large

A **32.37M-parameter text embedding model** by [Axiotic AI](https://axiotic.ai) that averages **57.38** on MTEB English (66/66 tasks).

A 9-layer transformer with a 512-dimensional hidden state and mean pooling; the highest-scoring model in the Ogma family.

## Highlights

- **57.38 MTEB average** on the standard 66-task MTEB English benchmark
- **Matryoshka embeddings**: output dimensions [32, 64, 128, 256] for flexible storage/compute tradeoffs
- **Symmetric routing**: task tokens `[QRY]`, `[DOC]`, and `[SYM]`. The recommended route is `[QRY]` on both sides (highest MTEB score), with `[SYM]` on both sides as the next-best alternative. `[DOC]` is exposed for downstream fine-tuning and is **not recommended at inference**.
- **1024-token context**: handles longer passages than typical small models
- **HuggingFace Hub**: the model code ships with the Hub snapshot, so there is no separate Ogma package to install (the Quick Start still needs `torch`, `huggingface_hub`, and `pyyaml`)

## Quick Start

```python
import torch
from huggingface_hub import snapshot_download
import sys, yaml

# Download model from HuggingFace
model_path = snapshot_download("axiotic/ogma-large")
sys.path.insert(0, model_path)

from ogma_model import OgmaModel
from config import OgmaConfig, TaskToken
from tokenizer import OgmaTokenizer

# Load model
with open(f"{model_path}/config.yaml") as f:
    cfg = yaml.safe_load(f)
config = OgmaConfig.from_dict(cfg)
model = OgmaModel(config)
state = torch.load(f"{model_path}/model.pt", map_location="cpu", weights_only=True)
model.load_state_dict(state)
model.eval()

# Load tokenizer
tokenizer = OgmaTokenizer(f"{model_path}/tokenizer.json")

# Encode text
sentences = ["The quick brown fox", "A fast auburn canine"]
enc = tokenizer.batch_encode(sentences, max_length=1024)
ids = torch.tensor(enc["input_ids"])
mask = torch.tensor(enc["attention_mask"])

with torch.no_grad():
    embs = model.encode(ids, mask, task=TaskToken.SYM)

# Cosine similarity
sim = torch.nn.functional.cosine_similarity(embs[0], embs[1], dim=0)
print(f"Similarity: {sim.item():.4f}")
print(f"Shape: {embs.shape}")  # torch.Size([2, 256])
```
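
For more than a handful of texts, encoding in fixed-size batches keeps memory use bounded. The sketch below shows only the batching logic and is dependency-free: `iter_batches`, `encode_in_batches`, and `fake_encode` are hypothetical names, not part of the Ogma code, and `fake_encode` is a stub standing in for a function that would wrap `tokenizer.batch_encode` and `model.encode`.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def encode_in_batches(texts, encode_fn, batch_size=32):
    """Apply `encode_fn` to each batch and concatenate the per-text results."""
    outputs = []
    for batch in iter_batches(texts, batch_size):
        outputs.extend(encode_fn(batch))
    return outputs


# Stub encoder (assumption, for illustration only): one fake 1-d "embedding"
# per text. A real version would tokenize the batch and call model.encode.
def fake_encode(batch):
    return [[float(len(text))] for text in batch]


texts = [f"sentence {i}" for i in range(70)]
embs = encode_in_batches(texts, fake_encode, batch_size=32)
print(len(embs))  # one result per input text
```

The batch size of 32 is an arbitrary starting point; the right value depends on sequence length and available memory.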

## Retrieval (Symmetric Routing)

Ogma is trained for **symmetric routing**: queries and documents are encoded with the **same** task token. **The recommended route is `[QRY]`/`[QRY]`** (both sides use `TaskToken.QRY`), which scored highest on MTEB. Using `[SYM]` on both sides is the next-best alternative and is worth comparing on your own data. **`[DOC]` is not recommended at inference**; it is exposed for downstream fine-tuning, not as an asymmetric query/document route.

```python
queries = ["What is machine learning?"]
documents = ["ML is a subset of AI...", "The weather is sunny today"]

q_enc = tokenizer.batch_encode(queries, max_length=1024)
d_enc = tokenizer.batch_encode(documents, max_length=1024)

with torch.no_grad():
    # Symmetric: both queries and documents use TaskToken.QRY (not a typo).
    # Swap TaskToken.QRY → TaskToken.SYM on both sides to try the SYM route instead.
    q_embs = model.encode(torch.tensor(q_enc["input_ids"]),
                           torch.tensor(q_enc["attention_mask"]), task=TaskToken.QRY)
    d_embs = model.encode(torch.tensor(d_enc["input_ids"]),
                           torch.tensor(d_enc["attention_mask"]), task=TaskToken.QRY)

# Dot-product relevance scores, shape (num_queries, num_documents).
scores = q_embs @ d_embs.T
print(f"Relevance scores: {scores}")
```
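
Turning the `scores` matrix above into a ranking is a sort over each row. The sketch below is dependency-free and uses hand-written toy vectors in place of real Ogma embeddings; `cosine` and `rank_documents` are hypothetical helper names, not part of the Ogma code. It normalizes explicitly because a raw dot product equals cosine similarity only for unit-length vectors, and whether `model.encode` returns unit-length vectors is not assumed here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_emb, doc_embs, top_k=None):
    """Return (doc_index, score) pairs sorted by descending similarity."""
    scored = [(i, cosine(query_emb, d)) for i, d in enumerate(doc_embs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k] if top_k is not None else scored


# Toy vectors standing in for embeddings (illustrative values only).
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # in between
]

ranking = rank_documents(query, docs, top_k=2)
print(ranking)  # document 1 first, then document 2
```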

## Matryoshka Dimensionality Reduction

```python
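# Reuses `ids` and `mask` from Quick Start; the first 32 dims are re-normalized to unit length.
with torch.no_grad():
    full = model.encode(ids, mask, task=TaskToken.SYM)            # (batch, 256)
small = torch.nn.functional.normalize(full[:, :32], p=2, dim=-1)  # (batch, 32)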
```
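
Truncating an embedding keeps only its leading components, so the result generally no longer has unit length; that is why the snippet above re-normalizes after slicing. A minimal dependency-free sketch of the same step across all four listed sizes follows. It uses a toy vector in place of a real embedding, and `truncate_and_normalize` is a hypothetical helper name, not part of the Ogma code.

```python
import math

MATRYOSHKA_DIMS = (32, 64, 128, 256)  # sizes listed in this model card


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # avoid division by zero for an all-zero prefix
    return [x / norm for x in head]


# Toy 256-d vector standing in for a real embedding (illustrative values only).
full = [math.sin(i + 1) for i in range(256)]

views = {d: truncate_and_normalize(full, d) for d in MATRYOSHKA_DIMS}
for d, v in views.items():
    # Each view has length d and unit norm (up to floating-point error).
    print(d, len(v), round(math.sqrt(sum(x * x for x in v)), 6))
```

Queries and documents should be truncated to the same size before they are compared.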

## Architecture

| Component | Details |
|-----------|---------|
| Parameters | 32.37M |
| Layers | 9 |
| Hidden dim | 512 |
| Output dim | 256 |
| Heads | 8 |
| Max seq len | 1024 |
| Matryoshka | [32, 64, 128, 256] |
| Pooling | Mean |
| Positional | RoPE |
| FFN | SwiGLU |
| Tokenizer | SentencePiece Unigram (30K) |
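
The Matryoshka sizes in the table translate directly into index size. The arithmetic below is a rough sketch that assumes embeddings are stored as dense float32 values (4 bytes each) and ignores any index overhead; the corpus size and the `index_size_bytes` helper are hypothetical.

```python
MATRYOSHKA_DIMS = (32, 64, 128, 256)  # from the architecture table
BYTES_PER_FLOAT32 = 4                 # assumption: float32 storage


def index_size_bytes(num_vectors, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw size of a dense embedding matrix, ignoring index overhead."""
    return num_vectors * dim * bytes_per_value


num_vectors = 1_000_000  # hypothetical corpus size
for dim in MATRYOSHKA_DIMS:
    size_mb = index_size_bytes(num_vectors, dim) / 1e6
    print(f"{dim:>3} dims: {size_mb:.0f} MB")
```

For one million vectors this prints 128, 256, 512, and 1024 MB, so each halving of the dimension halves the raw storage.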

## MTEB Results (66/66 tasks)

| Category | ogma-large |
|----------|------------|
| Classification | 68.4 |
| Clustering | 41.6 |
| PairClassification | 84.0 |
| Reranking | 53.1 |
| Retrieval | 43.7 |
| STS | 83.7 |
| Summarization | 30.9 |
| **Overall** | **57.38** |

Benchmarked with MTEB v2.10.7 on the standard 66-task English benchmark using category averaging (same methodology as the MTEB leaderboard).

## Ogma Model Family

| Model | Params | MTEB-66 | Best For |
|-------|--------|---------|----------|
| [ogma-large](https://huggingface.co/axiotic/ogma-large) | 32.37M | 57.38 | Maximum quality |
| [ogma-base](https://huggingface.co/axiotic/ogma-base) | 13.32M | 56.54 | General purpose |
| [ogma-small](https://huggingface.co/axiotic/ogma-small) | 8.60M | 55.79 | Best sub-10M |
| [ogma-mini](https://huggingface.co/axiotic/ogma-mini) | 3.51M | 51.42 | Edge deployment |
| [ogma-micro](https://huggingface.co/axiotic/ogma-micro) | 2.32M | 49.77 | Extreme edge |

## License

This model is licensed under [CC-BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/). Commercial use requires a separate license from Axiotic AI.
