Davlan/sib200
A multilingual text classification model fine-tuned on the SIB-200 dataset, capable of classifying text into 7 topics across 205 languages.
| Label | Description |
|---|---|
| geography | Geographic content |
| science/technology | Science and tech content |
| entertainment | Entertainment content |
| politics | Political content |
| health | Health and medical content |
| travel | Travel content |
| sports | Sports content |
| Metric | Score |
|---|---|
| Test Accuracy | 69.17% |
| Test F1 Macro | 67.62% |
| Languages | 205 |
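For readers unfamiliar with the two metrics in the table, here is a minimal sketch of how accuracy and macro-averaged F1 are conventionally computed. It assumes the standard definitions (macro F1 as the unweighted mean of per-class F1 scores); the model card does not state which tooling produced the reported numbers, and the labels in the example are made-up toy data, not SIB-200 results.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # gold class t was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data for illustration only (not taken from the SIB-200 test set).
y_true = ["health", "sports", "sports", "travel"]
y_pred = ["health", "sports", "travel", "travel"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

Because macro F1 weights every topic equally, a score slightly below accuracy (as in the table) typically suggests that some topics are classified less reliably than others.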
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline(
    "text-classification",
    model="Keshav0308/multilingual-topic-classifier",
)

# The same call accepts input in any language the model was trained on
classifier("The patient was diagnosed with pneumonia.")
# [{'label': 'health', 'score': 0.999}]
classifier("El equipo ganó el campeonato mundial de fútbol.")
# [{'label': 'sports', 'score': 0.999}]
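Given that the reported test accuracy is about 69%, it may be useful to discard low-confidence predictions rather than always trusting the top label. The following is a minimal sketch of that idea, not part of the model's own API: `top_label` and `min_score` are hypothetical names, the 0.5 cutoff is an arbitrary assumption, and the sample scores are hand-written rather than real model output.

```python
def top_label(predictions, min_score=0.5):
    """Return the highest-scoring label, or None if it falls below min_score.

    `predictions` is a list of {'label': str, 'score': float} dicts for a
    single input, the shape a text-classification pipeline produces.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else None


# Hand-written example scores (an assumption, not real model output).
confident = [
    {"label": "health", "score": 0.91},
    {"label": "science/technology", "score": 0.06},
    {"label": "travel", "score": 0.03},
]
uncertain = [
    {"label": "politics", "score": 0.34},
    {"label": "geography", "score": 0.33},
    {"label": "travel", "score": 0.33},
]

print(top_label(confident))
print(top_label(uncertain))
```

With a real pipeline, passing `top_k=None` to the classifier call should return scores for all labels of an input, which can then be handed to a helper like this; a sensible threshold would have to be tuned per language on held-out data.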
Fine-tuned on SIB-200, a massively multilingual dataset covering 205 languages.