--- license: apache-2.0 library_name: transformers base_model: answerdotai/ModernBERT-large tags: - modernbert - episodic-ingestion-compiler - structured-extraction - ingestion-model-spec-v1 - multi-head-classifier language: - en --- # episodic-ingestion-compiler ModernBERT-large multi-head (2000 steps) Fine-tuned from `answerdotai/ModernBERT-large` for the **ingestion-model-spec-v1** task in the `episodic-rs` project: given a blurb of text, emit a list of structured claim candidates conforming to the spec-v1 wire format (closed 21-predicate allowlist, 13-subject-type allowlist, predicate-specific object_value, character offset source_spans, calibrated confidence, first-class abstention). This checkpoint is the **production-selected backbone** of the bakeoff conducted 2026-05-09. See the project's `docs/bakeoff-decision-2026-05-09.md` for full rationale. ## Architecture The encoder pools [CLS] into four heads: | Head | Type | Output dim | Purpose | |---|---|---:|---| | `claim_present` | Linear → BCE | 1 | P(blurb has ≥1 claim); drives abstention | | `predicate` | Linear → CE | 21 | Argmax over spec-v1 predicate allowlist | | `subject_type` | Linear → CE | 13 | Argmax over spec-v1 subject_type allowlist | | `confidence` | Linear → sigmoid → MSE | 1 | Calibrated probability | Not directly loadable via `AutoModel.from_pretrained`. Use the `IngestionEncoder` wrapper in `scripts/train_ingestion_encoder.py` of the source repo. ## Training | | | |---|---| | Training data | 94,894 labeled blurbs from 6,099 deduplicated agent sessions | | Splits | session-stratified 70/10/10/10 train/cal/val/test | | Steps | 2000 @ batch 16 × lr 3e-5 (linear schedule, 80 step warmup implicit) | | Precision | bf16 autocast | | Elapsed | 832s on RTX 5090 (24 GiB) | | Peak VRAM | 16.61 GiB | ## Eval metrics (val split, 13,179 blurbs) | Metric | Value | |---|---:| | claim_present F1 | **0.930** | | claim_present precision | 0.955 | | claim_present recall | 0.906 | | predicate accuracy | 0.954 | | subject_type accuracy | 0.994 | | false_emission_rate | 0.012 | | confidence MAE | 0.028 | ## Loading The checkpoint is a `torch.save` of `{state_dict, backbone, args, step, eval}`. Load via: ```python import torch from transformers import AutoModel, AutoTokenizer from huggingface_hub import hf_hub_download import torch.nn as nn class IngestionEncoder(nn.Module): def __init__(self, backbone_name: str): super().__init__() self.backbone = AutoModel.from_pretrained(backbone_name) h = self.backbone.config.hidden_size self.dropout = nn.Dropout(0.1) self.head_claim_present = nn.Linear(h, 1) self.head_predicate = nn.Linear(h, 21) self.head_subject_type = nn.Linear(h, 13) self.head_confidence = nn.Linear(h, 1) def forward(self, input_ids, attention_mask): out = self.backbone(input_ids=input_ids, attention_mask=attention_mask) pooled = self.dropout(out.last_hidden_state[:, 0]) return { "claim_logit": self.head_claim_present(pooled).squeeze(-1), "predicate_logits": self.head_predicate(pooled), "subject_type_logits": self.head_subject_type(pooled), "confidence_pred": torch.sigmoid(self.head_confidence(pooled).squeeze(-1)), } ckpt = torch.load( hf_hub_download("Avifenesh/episodic-ingestion-compiler-modernbert-large-mh-2000", "best.pt"), map_location="cpu", weights_only=False, ) model = IngestionEncoder(ckpt["backbone"]) model.load_state_dict(ckpt["state_dict"]) model.eval() tokenizer = AutoTokenizer.from_pretrained(ckpt["backbone"]) ``` ## Allowlists (closed vocabularies enforced by the spec) **Predicates (21)**: `validated_by`, `had_outcome`, `failed_because`, `worked_because`, `decided`, `blocked_by`, `has_next_action`, `has_status`, `has_goal`, `touched_file`, `ran_command`, `logged_event`, `has_constraint`, `has_open_question`, `has_quality_finding`, `reverted_file`, `deleted_file`, `created_file`, `committed`, `deployed`, `incident_observed`. **Subject types (13)**: `objective`, `command`, `file`, `pr`, `incident`, `policy`, `person`, `repo`, `team`, `service`, `document`, `ticket`, `thread`. ## Wrapper responsibility (not done by this model) This model emits the information-bearing subset. A deterministic wrapper fills the literal fields: - `status = "candidate"` (always) - `source_authority = "model_draft"` (always) - `extraction_method = "ingestion_model_v1"` (always) - `source_memory_ids` from the caller's scope - `class_hint` derived from predicate - `qualifiers = {}` default - `source_spans` TBD — current encoder emits whole-blurb as the single span; a span-boundary head is planned ## Bakeoff context Competed against: ModernBERT-base-MH, DeBERTa-v3-large-MH, NuExtract-2.0-2B (QLoRA SFT), NuExtract-2.0-8B (QLoRA SFT), Qwen3.5-4B (QLoRA SFT). ModernBERT-large-MH won on every gate.