---
language:
- en
license: mit
base_model: microsoft/deberta-v3-xsmall
pipeline_tag: token-classification
tags:
- dependency-parsing
- pos-tagging
- universal-dependencies
datasets:
- universal_dependencies
metrics:
- accuracy
- las
- uas
model-index:
- name: deberta-v3-xsmall-biaffine-dep-pos-en-ewt-gum
  results:
  - task:
      type: token-classification
      name: Dependency Parsing & POS/Morphology Tagging
    dataset:
      type: universal_dependencies
      name: EWT + GUM
      split: test
    metrics:
    - type: las
      name: LAS
      value: 92.97
    - type: uas
      name: UAS
      value: 94.74
    - type: accuracy
      name: UPOS
      value: 98.04
    - type: ucm
      name: UCM
      value: 70.23
    - type: lcm
      name: LCM
      value: 60.91
---

# ModernBiaffineParser — microsoft/deberta-v3-xsmall

Biaffine dependency parser + joint UPOS tagger trained on [Universal Dependencies English Web Treebank (EWT)](https://universaldependencies.org/treebanks/en_ewt/index.html) and [GUM](https://universaldependencies.org/treebanks/en_gum/index.html).

**Encoder:** `microsoft/deberta-v3-xsmall` (frozen weights not included — loaded from HuggingFace at runtime)

**Custom head:** `biaffine_head.safetensors` — word projection, arc/rel/POS MLPs and biaffine layers

**Labels:** 53 DEPREL labels · 19 UPOS tags

**Score convention:** `s_arc[dep, head]`, `s_rel[dep, head, rel]`

## Metrics (EWT + GUM, decode: Eisner (projective MST))

| Split | LAS | UPOS | UCM | LCM |
|-------|-----|------|-----|-----|
| dev | 93.03% | 98.09% | 70.75% | 60.96% |
| test | 92.97% | 98.04% | 70.23% | 60.91% |

## ONNX / production use

`model.onnx` — fp32 model (recommended for CPU inference).

`model.fp16.onnx` — fp16 model (for GPU or environments with native fp16 support, ~139 MB).

Inputs: `subwords [B, W, 20]` int64. Outputs: `s_arc [B,W,W]`, `s_rel [B,W,W,R]`, `s_pos [B,W,P]`.

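
As a rough illustration of the `subwords [B, W, 20]` input, the sketch below builds the grid described under "Input format": one row per word, a synthetic ROOT row at position 0 holding only `[CLS]` (id 1), and each word's subword ids padded or truncated to 20 slots. The function name `build_subword_grid`, the `encode_word` callable, and the padding id of 0 are assumptions made for this example, not part of the released code.

```python
from typing import Callable, List, Sequence

import numpy as np

FIX_LEN = 20  # subword slots per word (from the model card)
CLS_ID = 1    # id of [CLS], used for the synthetic ROOT word (from the model card)
PAD_ID = 0    # assumption: padding id of the encoder's vocabulary


def build_subword_grid(
    sentences: Sequence[Sequence[str]],
    encode_word: Callable[[str], List[int]],
    fix_len: int = FIX_LEN,
    pad_id: int = PAD_ID,
    cls_id: int = CLS_ID,
) -> np.ndarray:
    """Return an int64 array of shape [B, W, fix_len].

    W is the longest sentence length plus one, because position 0 of
    every sentence is the synthetic ROOT word.
    """
    max_words = max(len(words) for words in sentences) + 1
    grid = np.full((len(sentences), max_words, fix_len), pad_id, dtype=np.int64)
    for b, words in enumerate(sentences):
        grid[b, 0, 0] = cls_id  # ROOT: its only subword is [CLS]
        for w, word in enumerate(words, start=1):
            # Each word is tokenised on its own, then truncated to fix_len.
            ids = list(encode_word(word))[:fix_len]
            grid[b, w, : len(ids)] = ids
    return grid
```

With the `tokenizers` library, `encode_word` could plausibly be `lambda w: tok.encode(w, add_special_tokens=False).ids` for `tok = Tokenizer.from_file("./model/tokenizer.json")`. Whether that reproduces the training-time tokenisation exactly is an assumption to verify against the original preprocessing.
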
```bash
# download only inference artifacts
hf download ghotriw/deberta-v3-xsmall-biaffine-dep-pos-en-ewt-gum \
  model.fp16.onnx model.onnx vocabs.json tokenizer.json \
  --local-dir ./model
```

`vocabs.json` — DEPREL and UPOS vocabularies (str→int dicts).

## Input format

The model expects a **word-level subword grid** `[B, W, fix_len=20]` (int64), where each word is independently tokenised with the encoder's sentencepiece tokeniser and padded/truncated to 20 subword slots. Position 0 is a synthetic ROOT word whose only subword is `[CLS]` (id 1).

## Vocabularies

`config.json` contains `rel_vocab` (str→int) and `pos_vocab` (str→int). Index 0 is the `` / ROOT slot and should be ignored in evaluation.
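
To show how the score convention and the str→int vocabularies fit together, here is a minimal sketch that turns the outputs for one sentence into (head, DEPREL, UPOS) triples. It uses a greedy per-word argmax, which is a simplification: the reported metrics use Eisner decoding, and greedy argmax does not guarantee a well-formed tree. The function name `decode_labels` and the `n_words` argument are assumptions introduced for this example.

```python
import numpy as np


def decode_labels(s_arc, s_rel, s_pos, n_words, rel_vocab, pos_vocab):
    """Greedy decode for a single sentence (not the Eisner decoder).

    s_arc: [W, W] scores indexed as [dep, head]
    s_rel: [W, W, R] scores indexed as [dep, head, rel]
    s_pos: [W, P] UPOS scores
    n_words: number of real positions, including ROOT at index 0
    rel_vocab / pos_vocab: str -> int dicts
    """
    # Invert the str -> int vocabularies to look up label strings by id.
    id2rel = {i: s for s, i in rel_vocab.items()}
    id2pos = {i: s for s, i in pos_vocab.items()}

    # Drop padded positions and forbid self-attachment.
    arc = np.array(s_arc[:n_words, :n_words], dtype=np.float64)
    np.fill_diagonal(arc, -np.inf)

    heads = arc.argmax(axis=1)                 # best head per dependent
    deps = np.arange(n_words)
    rels = s_rel[deps, heads].argmax(axis=-1)  # best relation for the chosen arc
    tags = s_pos[:n_words].argmax(axis=-1)     # best UPOS tag per word

    # Skip position 0, the synthetic ROOT word.
    return [
        (int(heads[i]), id2rel[int(rels[i])], id2pos[int(tags[i])])
        for i in range(1, n_words)
    ]
```

For a batch, the same function can be applied to `s_arc[b]`, `s_rel[b]`, and `s_pos[b]` with that sentence's own word count. A head index of 0 means the word attaches to ROOT.
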