---
library_name: transformers
license: gpl-3.0
base_model: bertin-project/bertin-roberta-base-spanish
tags:
- generated_from_trainer
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: roberta-spanish-cefr
  results: []
datasets:
- UniversalCEFR/caes_es
language:
- es
pipeline_tag: text-classification
---

# Spanish CEFR Classification with BERTIN

## Model summary

`pymlex/roberta-spanish-cefr` is a Spanish text classifier fine-tuned from `bertin-project/bertin-roberta-base-spanish` for CEFR level prediction. It is intended for Spanish learner-text classification and readability-style proficiency assessment.

## Training data

The model was trained on `UniversalCEFR/caes_es`, a Spanish dataset of learner texts with CEFR annotations. The dataset has 31.1k rows.

## Evaluation

Results for the test set:

* Accuracy: 0.9882
* Precision: 0.9896
* Recall: 0.9892
* F1: 0.9894
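
The card does not say how precision, recall, and F1 are averaged across CEFR levels. As a minimal sketch of one common convention, the snippet below computes accuracy and macro-averaged scores in plain Python. The macro averaging, the helper name `macro_metrics`, and the toy labels are all assumptions for illustration, not the evaluation code used for this model.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Hypothetical helper: macro averaging is an assumption, since the
    model card does not state which averaging scheme was used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted label gets a false positive
            fn[gold] += 1  # gold label gets a false negative

    def safe_div(num, den):
        # Return 0.0 when a class has no predictions or no gold examples.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = safe_div(tp[label], tp[label] + fp[label])
        r = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(p)
        recalls.append(r)
        f1s.append(safe_div(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels, not taken from the dataset.
gold = ["A1", "A2", "B1", "B1"]
pred = ["A1", "A2", "B1", "A2"]
scores = macro_metrics(gold, pred)
print(scores)
```

Macro averaging weights every CEFR level equally, so rare levels count as much as frequent ones. A weighted or micro average would give different numbers on an imbalanced test set.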

## Comparison with other CEFR Spanish classifiers

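On its test set this model reaches an F1 of 0.9894, while most documented Spanish CEFR classifiers report F1 scores in the 0.75–0.88 range. The rows below come from different datasets, languages, and task definitions, so they serve as a rough point of reference rather than a like-for-like benchmark: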

| Model / Source | Task / Language | Accuracy | F1-score |
|---|---|---|---|
| This model (BERTIN-RoBERTa) | Spanish CEFR (6 classes) | 0.9882 | 0.9894 |
| [Spanish CEFR Fine-tuned](https://www.researchgate.net/figure/Performance-metrics-of-the-fine-tuned-model-across-CEFR-levels_tbl4_398474670) | CEFR Spanish (general) | ~0.8500 | 0.83–0.85 |
| [BETO/mBERT Baseline](https://www.researchgate.net/figure/Performance-of-all-models-on-the-Spanish-language-dataset-for-skill-classification-ACC_tbl4_389648000) | Spanish skill classification | 0.7800 | 0.7700 |
| [CEFR-ASAG Benchmark](https://www.cambridge.org/core/journals/recall/article/predicting-cefr-levels-in-learners-of-english-the-use-of-microsystem-criterial-features-in-a-machine-learning-approach/C915A35CD69168EDFB80DE8F57A4328C) | Multi-level (cross-corpus) | 0.5100 | n/a |
| [IberLEF / Shared Tasks](https://ceur-ws.org/Vol-3202/parmex-paper3.pdf) | Related Spanish NLP tasks | 0.9373 | 0.9300 |

## Inference

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "pymlex/roberta-spanish-cefr"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

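def predict_cefr(text, top_k=3):
    """Return the top_k most probable CEFR labels for a Spanish text."""
    # Tokenize the input, truncating anything beyond 512 tokens.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    # Run the model without gradient tracking and convert logits to probabilities.
    with torch.no_grad():
        logits = model(**inputs).logits
        probs = torch.softmax(logits, dim=-1)[0]

    # Never request more labels than the model has classes.
    k = min(top_k, probs.numel())
    values, indices = torch.topk(probs, k=k)

    return [
        {
            "label": model.config.id2label[i.item()],
            "score": v.item(),
        }
        for i, v in zip(indices, values)
    ]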

text = "Estimados señores, les escribo para solicitar información sobre el curso."
print(predict_cefr(text, top_k=3))
```