license: apache-2.0
language:
- multilingual
tags:
- translation
- quality-estimation
- reference-free
- comet
- cometkiwi
- pruning
base_model: Unbabel/wmt22-cometkiwi-da
pipeline_tag: translation
wmt22-cometkiwi-da-int8
A compressed version of Unbabel/wmt22-cometkiwi-da — a reference-free machine-translation quality estimation model (source + MT only, no human reference required).
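Effectively lossless compression: Pearson correlation with human DA scores is unchanged on the benchmark below (0.6404 vs 0.6402 for the base model), at roughly 40% less disk space, using int8 quantization and fp16 storage with no layer pruning.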
What's different from the base model
- No layer pruning: all 24 XLM-R encoder layers are retained. Compression comes entirely from dynamic int8 quantization plus fp16 storage.
- layerwise_attention is rebuilt to mix only the surviving layers (embeddings + kept layer outputs); with nothing dropped here, that is the embeddings plus all 24 layer outputs.
- Dynamic int8 quantization is applied to the XLM-R encoder. Weights are stored in fp16 and cast back to fp32 at load time, before quantization.
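To give a feel for why the compressed model's scores differ slightly from the base model's, here is a minimal pure-Python sketch of the two numeric steps named above: a round trip through fp16 storage, then symmetric per-tensor int8 quantization. It is an illustration only, not the code in load.py. The weight values and the helper names (`to_fp16`, `quantize_int8`, `dequantize`) are made up for this example, and real dynamic quantization typically also quantizes activations at run time, which is not shown.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip a value through IEEE half precision (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor quantization onto the integer range [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: list[int], scale: float) -> list[float]:
    """Map int8 codes back to floats using the shared scale."""
    return [c * scale for c in codes]


# Illustrative values only; these are not weights from the model.
weights = [0.0312, -0.4871, 0.2506, 0.9999, -0.0007]

stored = [to_fp16(w) for w in weights]   # what fp16 storage would keep
codes, scale = quantize_int8(stored)     # int8 codes plus one float scale
restored = dequantize(codes, scale)

max_err = max(abs(a - b) for a, b in zip(weights, restored))
print("int8 codes:", codes)
print("scale:", scale)
print("max abs error:", max_err)
```

The per-weight error is bounded by about half the quantization step plus the fp16 rounding error, which is the kind of small perturbation that the MAE row in the accuracy table reflects at the score level.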
Accuracy
Benchmarked on 1200 stratified segments from RicardoRei/wmt-da-human-evaluation (reference-free, src+mt only):
| Metric | This variant | Full cometkiwi |
|---|---|---|
| Pearson r vs human DA | 0.6404 | 0.6402 |
| Spearman vs human DA | 0.6703 | 0.6698 |
| Pearson r vs full | 0.9919 | 1.0000 |
| MAE vs full | 0.0138 | 0.0000 |
| Params | 565.1M | 565.1M |
| On-disk size | ~1130 MB | ~2200 MB |
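The rows above use three standard statistics: Pearson r, Spearman rank correlation, and mean absolute error (MAE). As a rough sketch of how such numbers could be computed from lists of per-segment scores, the snippet below implements them with the standard library only. The score lists are invented for illustration and are not the benchmark data; the evaluation script used for this model card is not shown in this README.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mae(x, y):
    """Mean absolute difference between two equal-length lists."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Made-up scores for five segments, purely to exercise the functions.
human = [0.10, 0.35, 0.50, 0.80, 0.95]
full = [0.20, 0.30, 0.55, 0.70, 0.90]
int8 = [0.22, 0.29, 0.53, 0.71, 0.88]

print("Pearson vs human:", pearson(int8, human))
print("Spearman vs human:", spearman(int8, human))
print("Pearson vs full:", pearson(int8, full))
print("MAE vs full:", mae(int8, full))
```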
All variants at a glance
| Variant | Pearson(human) | Pearson(full) | Size | When to use |
|---|---|---|---|---|
| full base | 0.6402 | 1.0000 | ~2200 MB | reference quality |
| -int8 | 0.6404 | 0.9919 | ~1300 MB | lossless compression |
| -pruned-k2 | 0.6300 | 0.9784 | ~2100 MB | best-quality pruned |
| -pruned-k4 | 0.5642 | 0.8316 | ~2060 MB | aggressive prune |
| -pruned-k4-xs | 0.5544 | 0.8113 | ~1030 MB | smallest footprint |
Usage
# pip install "unbabel-comet" "setuptools<81" huggingface_hub
# export HF_TOKEN=<your_token>  # must have Unbabel/wmt22-cometkiwi-da access
import sys

from huggingface_hub import snapshot_download

# Download this repo (weights, config, and the loader script).
folder = snapshot_download(repo_id="solailabs/wmt22-cometkiwi-da-int8")

# load.py ships inside the repo, so put the folder on the import path first.
sys.path.insert(0, folder)
from load import load_model

model = load_model(folder)

# Reference-free scoring: each item needs only the source and the MT output.
out = model.predict(
    [{"src": "The meeting has been postponed until next week.",
      "mt": "La réunion a été reportée à la semaine prochaine."}],
    batch_size=8, gpus=0, progress_bar=False, num_workers=2,
)
print(out["scores"])
The loader re-downloads the base CometKiwi checkpoint, drops whichever encoder layers config.json lists as dropped (none for this variant), applies int8 dynamic quantization if the config's quant flag is set, then loads the weights shipped in this repo.
Files
- state_dict.pt: model weights, stored in fp16 (no layers are pruned in this variant)
- config.json: base model id, kept/dropped layer indices, quant flag, accuracy
- load.py: drop-in loader
- README.md: this file
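Before running the loader, it can help to look at what config.json actually contains. The sketch below lists every top-level key with its type and a shortened value, so it does not depend on any particular schema. The helper name `summarize_config` is invented for this example, and the key names in the stand-in file are hypothetical; check the real file for the actual ones.

```python
import json
import tempfile
from pathlib import Path


def summarize_config(folder):
    """Return one 'key (type): value' line per top-level entry in config.json."""
    cfg = json.loads((Path(folder) / "config.json").read_text(encoding="utf-8"))
    lines = []
    for key in sorted(cfg):
        value = json.dumps(cfg[key])
        if len(value) > 60:  # shorten long values such as layer index lists
            value = value[:57] + "..."
        lines.append(f"{key} ({type(cfg[key]).__name__}): {value}")
    return lines


# Stand-in config so the sketch runs without downloading anything.
# These key names are hypothetical, not the repo's actual schema.
with tempfile.TemporaryDirectory() as demo_folder:
    demo_cfg = {
        "base_model": "Unbabel/wmt22-cometkiwi-da",
        "dropped_layers": [],
        "quantized": True,
    }
    (Path(demo_folder) / "config.json").write_text(
        json.dumps(demo_cfg), encoding="utf-8"
    )
    summary = summarize_config(demo_folder)

print("\n".join(summary))
```

With the real repo, the same function can be pointed at the folder returned by snapshot_download in the Usage section.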
Gated base model
The base Unbabel/wmt22-cometkiwi-da is gated. You must accept its license on the Hub while logged in with the same account your HF_TOKEN belongs to — otherwise the base-model download inside load.py returns 403.
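A small preflight check can turn a confusing 403 into a clearer message. The sketch below only verifies that the HF_TOKEN environment variable from the Usage section is set; it cannot tell whether the license has been accepted, and it ignores tokens saved by other login methods. The function name `check_hf_token` is made up for this example.

```python
import os


def check_hf_token(env=None):
    """Return a problem description, or None if HF_TOKEN looks usable."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        return (
            "HF_TOKEN is not set. Export a token for an account that has "
            "accepted the Unbabel/wmt22-cometkiwi-da license."
        )
    return None


problem = check_hf_token()
print(problem or "HF_TOKEN is set; license acceptance is still checked by the Hub.")
```

Calling this before load_model gives an early, readable failure when the token is missing; a 403 after it passes points at the license step instead.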
Citation
Base model: Unbabel/wmt22-cometkiwi-da by Unbabel.
@inproceedings{rei-etal-2022-cometkiwi,
    title = "{C}omet{K}iwi: {IST}-{U}nbabel 2022 Submission for the Quality Estimation Shared Task",
    author = "Rei, Ricardo and others",
    booktitle = "WMT 2022",
}
Released under the same license as the base model (Apache 2.0).