ModernBiaffineParser — microsoft/deberta-v3-base
Biaffine dependency parser + joint UPOS tagger trained on Universal Dependencies English Web Treebank (EWT).
Encoder: microsoft/deberta-v3-base (frozen weights not included — loaded from HuggingFace at runtime)
Custom head: biaffine_head.safetensors — word projection, arc/rel/POS MLPs and biaffine layers
Labels: 53 DEPREL labels · 19 UPOS tags
Score convention: s_arc[dep, head], s_rel[dep, head, rel]
Metrics (EWT, decode: Eisner (projective MST))
| Split | LAS | UPOS | UCM | LCM |
|---|---|---|---|---|
| dev | 93.57% | 98.12% | 75.46% | 66.37% |
| test | 93.52% | 98.10% | 76.75% | 67.79% |
ONNX / production use
model.fp16.onnx — fp16 model for CPU inference via ONNX Runtime (~357 MB).
Inputs: subwords [B, W, 20] int64. Outputs: s_arc [B,W,W], s_rel [B,W,W,R], word_repr [B,W,768], s_pos [B,W,P].
# download only inference artifacts
hf download ghotriw/deberta-v3-base-biaffine-dep-pos-en \
model.fp16.onnx vocabs.json idiom_classifier.json tokenizer.json \
--local-dir ./model
hf download ghotriw/deberta-v3-base-biaffine-dep-pos-en \
lexicon.json phrasal-verbs.json \
--local-dir ./dic
vocabs.json — DEPREL and UPOS vocabularies (str→int dicts).
idiom_classifier.json — linear idiom head (W·mean(word_repr)+b, BCE).
lexicon.json + phrasal-verbs.json — candidate lexicons for idiom/phrasal-verb detection.
Input format
The model expects a word-level subword grid [B, W, fix_len=20] (int64),
where each word is independently tokenised with the encoder's sentencepiece tokeniser
and padded/truncated to 20 subword slots. Position 0 is a synthetic ROOT word
whose only subword is [CLS] (id 1).
Vocabularies
config.json contains rel_vocab (str→int) and pos_vocab (str→int).
Index 0 is the <pad> / ROOT slot and should be ignored in evaluation.
- Downloads last month
- 150