---
language:
  - en
license: mit
base_model: microsoft/deberta-v3-xsmall
pipeline_tag: token-classification
tags:
  - dependency-parsing
  - pos-tagging
  - universal-dependencies
  - morphological-features
datasets:
  - universal_dependencies
metrics:
  - accuracy
  - las
  - uas
model-index:
  - name: deberta-v3-xsmall-biaffine-dep-pos-feats-en-ewt-gum
    results:
      - task:
          type: token-classification
          name: Dependency Parsing & POS/Morphology Tagging
        dataset:
          type: universal_dependencies
          name: EWT + GUM
          split: test
        metrics:
          - type: las
            name: LAS
            value: 93.01
          - type: uas
            name: UAS
            value: 94.73
          - type: accuracy
            name: UPOS
            value: 98.08
          - type: accuracy
            name: UFEATS
            value: 95.76
          - type: ucm
            name: UCM
            value: 70.63
          - type: lcm
            name: LCM
            value: 61.37
---

# ModernBiaffineParser — microsoft/deberta-v3-xsmall

A biaffine dependency parser with joint UPOS and morphological-feature (FEATS) tagging, trained on the
[Universal Dependencies English Web Treebank (EWT)](https://universaldependencies.org/treebanks/en_ewt/index.html) and [Universal Dependencies English GUM](https://universaldependencies.org/treebanks/en_gum/index.html) treebanks.

- **Encoder:** `microsoft/deberta-v3-xsmall` (frozen weights are not included; they are loaded from Hugging Face at runtime)
- **Custom head:** `biaffine_head.safetensors`, containing the word projection, the arc/rel/POS/FEATS MLPs and the biaffine layers
- **Labels:** 53 DEPREL labels · 19 UPOS tags · 21 FEATS categories
- **Score convention:** `s_arc[dep, head]`, `s_rel[dep, head, rel]`
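
The score convention can be illustrated with a minimal sketch. Note that this uses greedy per-word `argmax` decoding, which is a simplification and not the Eisner decoder behind the reported metrics, so it does not guarantee a well-formed tree. The function name `greedy_decode` is hypothetical and the inputs are assumed to be the scores of a single sentence (no batch axis).

```python
import numpy as np

def greedy_decode(s_arc, s_rel):
    """Greedy head/relation selection for one sentence.

    s_arc: [W, W] arc scores indexed as s_arc[dep, head].
    s_rel: [W, W, R] relation scores indexed as s_rel[dep, head, rel].
    Returns (heads, rels), both of length W. Entry 0 belongs to the
    synthetic ROOT word and should be ignored by the caller.
    """
    s_arc = np.array(s_arc, dtype=np.float64)  # copy so the caller's array is untouched
    s_rel = np.asarray(s_rel)
    num_words = s_arc.shape[0]

    # A word cannot be its own head, so mask the diagonal.
    np.fill_diagonal(s_arc, -np.inf)

    # Rows are dependents, columns are candidate heads.
    heads = s_arc.argmax(axis=1)

    # For each dependent, read the relation scores at its chosen head.
    rels = s_rel[np.arange(num_words), heads].argmax(axis=-1)
    return heads, rels
```

Because the dependent is the first axis, the best head for word `i` is found along row `i`, and the relation is then looked up at `s_rel[i, heads[i], :]`.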

## Metrics (EWT + GUM, decode: Eisner (projective MST))

| Split | LAS | UPOS | UFEATS | UCM | LCM |
|-------|-----|------|------|-----|-----|
| dev   | 92.98% | 98.09% | 95.66% | 70.61% | 60.40% |
| test  | 93.01% | 98.08% | 95.76% | 70.63% | 61.37% |


## ONNX / TorchScript / production use

- `model.onnx`: fp32 ONNX model (recommended for CPU inference).
- `model.fp16.onnx`: fp16 ONNX model (for GPU or environments with native fp16 support, ~139 MB).
- `traced_model.pt`: TorchScript model (for tch-rs or the PyTorch C++ API).

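Input: `subwords [B, W, 20]` (int64). Outputs: `s_arc [B,W,W]`, `s_rel [B,W,W,R]`, `s_pos [B,W,P]`, `s_feats [B,W,C,Vmax]`. Here `B` is the batch size, `W` the number of words including the synthetic ROOT at position 0, `R`, `P` and `C` the sizes of the DEPREL, UPOS and FEATS-category axes, and `Vmax` the largest per-category value count.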

```bash
# download only inference artifacts
hf download ghotriw/deberta-v3-xsmall-biaffine-dep-pos-feats-en-ewt-gum \
  model.fp16.onnx model.onnx traced_model.pt vocabs.json tokenizer.json \
  --local-dir ./model
```
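
Once the artifacts are downloaded, running the ONNX model might look like the sketch below. It assumes the `onnxruntime` package, that the graph input is actually named `subwords`, and that the outputs come back in the order listed above; none of this is verified against the exported file. The helper names `load_session` and `run_parser` are hypothetical.

```python
import numpy as np

def load_session(model_path="./model/model.onnx"):
    """Create a CPU inference session (path matches the download command above)."""
    # Imported lazily so the rest of this sketch works without onnxruntime installed.
    import onnxruntime as ort
    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

def run_parser(session, subwords):
    """Run one forward pass and return the four score tensors by name."""
    subwords = np.asarray(subwords, dtype=np.int64)
    if subwords.ndim != 3 or subwords.shape[-1] != 20:
        raise ValueError("expected a [B, W, 20] subword grid")
    # Assumption: the input is named "subwords" and outputs follow the documented order.
    s_arc, s_rel, s_pos, s_feats = session.run(None, {"subwords": subwords})
    return {"s_arc": s_arc, "s_rel": s_rel, "s_pos": s_pos, "s_feats": s_feats}
```

If the input name differs in the exported graph, `session.get_inputs()[0].name` reports the real one.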

`vocabs.json` — DEPREL and UPOS vocabularies (str→int dicts).
`feats_vocab` — morphological categories `{category: {value: idx}}` (idx 0 = `_`/absent). `s_feats[..., c, :]` is an independent softmax per category; non-existent value slots carry `-inf`, so `argmax` over the last dim is always valid.
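
As a rough illustration of the per-category decoding described above, the sketch below turns `s_feats` for one sentence into UD-style FEATS strings. It assumes that the order of categories in `feats_vocab` matches the `C` axis of `s_feats`, which the card does not state explicitly; `decode_feats` is a hypothetical helper name.

```python
import numpy as np

def decode_feats(s_feats, feats_vocab):
    """Decode FEATS strings for one sentence.

    s_feats: [W, C, Vmax] scores; feats_vocab: {category: {value: idx}}.
    Assumption: dict order of feats_vocab matches the C axis.
    """
    categories = list(feats_vocab)
    # Per-category idx -> value lookup tables.
    inverse = [{idx: value for value, idx in feats_vocab[cat].items()} for cat in categories]

    # Independent argmax per category; slots filled with -inf can never win.
    best = np.asarray(s_feats).argmax(axis=-1)  # [W, C]

    decoded = []
    for row in best:
        pairs = [
            f"{cat}={inverse[k][int(idx)]}"
            for k, (cat, idx) in enumerate(zip(categories, row))
            if idx != 0  # idx 0 means the category is absent for this word
        ]
        # UD sorts features alphabetically; "_" marks a word with no features.
        decoded.append("|".join(sorted(pairs, key=str.lower)) if pairs else "_")
    return decoded
```

Index 0 is skipped rather than looked up, so the sketch works whether or not the vocabulary stores an explicit `_` entry.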

## Input format

The model expects a **word-level subword grid** `[B, W, fix_len=20]` (int64),
where each word is independently tokenised with the encoder's sentencepiece tokeniser
and padded/truncated to 20 subword slots. Position 0 is a synthetic ROOT word
whose only subword is `[CLS]` (id 1).
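
A minimal sketch of building this grid for a single sentence follows. The `encode` argument is a hypothetical callable that maps one word to its subword ids without special tokens, and padding with id 0 is an assumption (0 is `[PAD]` in the DeBERTa-v3 vocabulary, but the card does not say which id fills empty slots).

```python
def build_subword_grid(words, encode, fix_len=20, cls_id=1, pad_id=0):
    """Return a [W + 1, fix_len] grid of subword ids for one sentence.

    words:  list of word strings (without ROOT).
    encode: hypothetical callable, word -> list of subword ids,
            with no special tokens added.
    pad_id: assumed padding id; not confirmed by the model card.
    """
    # Row 0 is the synthetic ROOT word, whose only subword is [CLS].
    grid = [[cls_id] + [pad_id] * (fix_len - 1)]
    for word in words:
        ids = list(encode(word))[:fix_len]           # truncate long words
        ids = ids + [pad_id] * (fix_len - len(ids))  # pad short words
        grid.append(ids)
    return grid
```

With the `tokenizers` library, `encode` could plausibly be `lambda w: tok.encode(w, add_special_tokens=False).ids`, where `tok = Tokenizer.from_file("./model/tokenizer.json")`. Wrapping the result in an outer list and converting it to an int64 array gives the `[B, W, 20]` shape for a batch of one.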

## Vocabularies

`config.json` contains `rel_vocab` (str→int) and `pos_vocab` (str→int).
Index 0 is the `<pad>` / ROOT slot and should be ignored in evaluation.
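
To make the index-0 convention concrete, here is a small sketch that inverts a str→int vocabulary and computes accuracy while skipping positions whose gold id is 0. Reading "ignored in evaluation" as "skip gold index 0" is an interpretation, and both helper names are hypothetical.

```python
def invert_vocab(vocab):
    """Turn a str->int vocabulary into an int->str lookup list."""
    labels = [None] * (max(vocab.values()) + 1)
    for label, idx in vocab.items():
        labels[idx] = label
    return labels

def accuracy_ignoring_pad(pred_ids, gold_ids, pad_idx=0):
    """Accuracy over positions whose gold id is not the <pad>/ROOT slot."""
    pairs = [(p, g) for p, g in zip(pred_ids, gold_ids) if g != pad_idx]
    if not pairs:
        return 0.0
    return sum(p == g for p, g in pairs) / len(pairs)
```

Predicted ids from `argmax` over `s_pos` or `s_rel` can be mapped back to label strings with `invert_vocab(pos_vocab)[idx]` or `invert_vocab(rel_vocab)[idx]`.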

For FEATS models, `config.json` also contains `feats_vocab` `{category: {value: idx}}` and `feats_sizes` (per-category value counts, incl. the `_`/absent slot at index 0). The 4th output `s_feats [B,W,C,Vmax]` is decoded per category via `argmax` over its last dimension.