Add improved model card
README.md
CHANGED
|
@@ -1,317 +1,176 @@
|
|
| 1 |
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
language:
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
|
| 12 |
-
|
| 13 |
-
|
| 14 |
tags:
|
| 15 |
-
|
| 16 |
-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
|
| 21 |
-
|
| 22 |
-
|
| 23 |
-
- text-classification
|
| 24 |
-
- transformers
|
| 25 |
-
- pytorch
|
| 26 |
-
metrics:
|
| 27 |
-
- accuracy
|
| 28 |
-
- f1
|
| 29 |
pipeline_tag: text-classification
|
| 30 |
---
|
| 31 |
|
| 32 |
-
# FLAME2 —
|
| 33 |
-
|
| 34 |
-
**One model. Ten languages. 150,000 headlines. Perspective-aware financial sentiment.**
|
| 35 |
-
|
| 36 |
-
FLAME2 is a multilingual financial sentiment classifier that labels news headlines as **Negative**, **Neutral**, or **Positive** — but unlike other models, it does this from the **local investor's perspective** of each economy.
|
| 37 |
-
|
| 38 |
-
The same news can mean opposite things for different markets:
|
| 39 |
-
- *"Oil prices fall to $65/barrel"* → **Negative** for Arab markets (oil exporter) / **Positive** for India (oil importer)
|
| 40 |
-
- *"Yen weakens to 155 per dollar"* → **Positive** for Japan (helps exporters) / **Neutral** elsewhere
|
| 41 |
|
| 42 |
-
|
| 43 |
|
| 44 |
-
|
| 45 |
-
|
| 46 |
-
## Key Numbers
|
| 47 |
|
| 48 |
| | |
|
| 49 |
|---|---|
|
| 50 |
-
| **
|
| 51 |
-
| **
|
| 52 |
-
| **
|
| 53 |
-
| **
|
| 54 |
-
| **
|
| 55 |
-
| **
|
| 56 |
|
| 57 |
-
|
| 58 |
-
|
| 59 |
-
## Quick Start
|
| 60 |
|
| 61 |
```python
|
| 62 |
from transformers import pipeline
|
| 63 |
|
| 64 |
-
classifier = pipeline(
|
| 65 |
-
|
| 66 |
-
|
| 67 |
-
|
| 68 |
-
# [{'label': 'positive', 'score': 0.96}]
|
| 69 |
|
| 70 |
-
#
|
| 71 |
-
classifier("[
|
| 72 |
-
#
|
| 73 |
|
| 74 |
-
|
| 75 |
-
|
| 76 |
-
# [{'label': 'positive', 'score': 0.91}] (oil down = good for importers)
|
| 77 |
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
# [{'label': 'negative', 'score': 0.94}]
|
| 81 |
-
|
| 82 |
-
# Korean
|
| 83 |
-
classifier("[KO] 삼성전자 실적 호조에 코스피 상승")
|
| 84 |
-
# [{'label': 'positive', 'score': 0.92}]
|
| 85 |
-
|
| 86 |
-
# Chinese
|
| 87 |
-
classifier("[ZH] 中国央行降息50个基点,股市应声上涨")
|
| 88 |
-
# [{'label': 'positive', 'score': 0.95}]
|
| 89 |
-
|
| 90 |
-
# German
|
| 91 |
-
classifier("[DE] DAX erreicht neues Allzeithoch dank starker Bankenergebnisse")
|
| 92 |
-
# [{'label': 'positive', 'score': 0.93}]
|
| 93 |
-
|
| 94 |
-
# French
|
| 95 |
-
classifier("[FR] La Bourse de Paris chute de 3% après les tensions commerciales")
|
| 96 |
-
# [{'label': 'negative', 'score': 0.91}]
|
| 97 |
-
|
| 98 |
-
# Spanish
|
| 99 |
-
classifier("[ES] El beneficio neto de la compañía creció un 25% interanual")
|
| 100 |
-
# [{'label': 'positive', 'score': 0.94}]
|
| 101 |
-
|
| 102 |
-
# Portuguese
|
| 103 |
-
classifier("[PT] Ibovespa fecha em alta com otimismo sobre reforma tributária")
|
| 104 |
-
# [{'label': 'positive', 'score': 0.90}]
|
| 105 |
```
|
| 106 |
|
| 107 |
-
|
| 108 |
-
|
| 109 |
-
---
|
| 110 |
-
|
| 111 |
-
## Supported Languages & Training Data
|
| 112 |
-
|
| 113 |
-
| Language | Code | Primary Economy | Oil Role | Total | Negative | Neutral | Positive |
|
| 114 |
-
|---|---|---|---|---|---|---|---|
|
| 115 |
-
| Arabic | AR | Gulf States (Saudi, UAE) | Exporter | 14,481 | 2,812 (19.4%) | 6,156 (42.5%) | 5,513 (38.1%) |
|
| 116 |
-
| German | DE | Germany / Eurozone | Importer | 15,000 | 3,544 (23.6%) | 6,636 (44.2%) | 4,820 (32.1%) |
|
| 117 |
-
| English | EN | United States | Mixed | 15,000 | 3,088 (20.6%) | 7,649 (51.0%) | 4,263 (28.4%) |
|
| 118 |
-
| Spanish | ES | Spain / Latin America | Importer | 15,000 | 3,872 (25.8%) | 5,616 (37.4%) | 5,512 (36.7%) |
|
| 119 |
-
| French | FR | France / Eurozone | Importer | 15,000 | 3,218 (21.5%) | 6,252 (41.7%) | 4,530 (30.2%) |
|
| 120 |
-
| Hindi | HI | India | Importer | 15,000 | 3,543 (23.6%) | 5,902 (39.3%) | 5,555 (37.0%) |
|
| 121 |
-
| Japanese | JA | Japan | Importer | 15,000 | 3,472 (23.1%) | 5,897 (39.3%) | 5,631 (37.5%) |
|
| 122 |
-
| Korean | KO | South Korea | Importer | 15,000 | 3,290 (21.9%) | 6,648 (44.3%) | 5,062 (33.7%) |
|
| 123 |
-
| Portuguese | PT | Brazil / Portugal | Exporter | 15,000 | 3,170 (21.1%) | 7,463 (49.8%) | 4,367 (29.1%) |
|
| 124 |
-
| Chinese | ZH | China | Importer | 15,000 | 3,542 (23.6%) | 4,055 (27.0%) | 7,403 (49.4%) |
|
| 125 |
-
|
| 126 |
-
**Total: 149,481 labeled headlines across 10 languages.**
|
| 127 |
-
|
| 128 |
-
### Overall Class Distribution
|
| 129 |
-
|
| 130 |
-
| Class | Samples | Share |
|
| 131 |
-
|---|---|---|
|
| 132 |
-
| **Negative** | 33,551 | 22.4% |
|
| 133 |
-
| **Neutral** | 62,274 | 41.7% |
|
| 134 |
-
| **Positive** | 52,656 | 35.2% |
|
| 135 |
-
|
| 136 |
-
Data sources include financial news sites, stock market reports, and economic news agencies — labeled with perspective-aware rules specific to each economy.
|
| 137 |
-
|
| 138 |
-
---
|
| 139 |
-
|
| 140 |
-
## What Makes FLAME2 Different
|
| 141 |
-
|
| 142 |
-
### The Problem
|
| 143 |
-
|
| 144 |
-
Existing financial sentiment models treat sentiment as universal. But financial sentiment is **not** universal — it depends on **where you are**:
|
| 145 |
-
|
| 146 |
-
- Oil prices drop? Bad for Saudi Arabia, great for India.
|
| 147 |
-
- Yen weakens? Good for Japanese exporters, bad for Korean competitors.
|
| 148 |
-
- Fed raises rates? Bad for US stocks, often neutral for European markets.
|
| 149 |
-
|
| 150 |
-
### Our Solution: Perspective-Aware Labels
|
| 151 |
-
|
| 152 |
-
Every headline in our dataset was labeled from the perspective of a **local investor** in that language's primary economy. The model learns that `[AR]` means "Gulf investor" and `[HI]` means "Indian investor."
|
| 153 |
-
|
| 154 |
-
#### Oil Price Rules
|
| 155 |
-
|
| 156 |
-
| Market Type | Oil Price Falls | Oil Price Rises | OPEC+ Output Increase |
|
| 157 |
-
|---|---|---|---|
|
| 158 |
-
| **Exporters** (AR, PT) | Negative | Positive | Negative |
|
| 159 |
-
| **Importers** (HI, KO, DE, FR, ES, JA, ZH) | Positive | Negative | Positive |
|
| 160 |
-
| **Mixed** (EN/US) | Positive | Context-dependent | Positive |
|
| 161 |
-
|
| 162 |
-
#### Currency Rules
|
| 163 |
-
|
| 164 |
-
| Language | Local Currency Strengthens | Local Currency Weakens |
|
| 165 |
-
|---|---|---|
|
| 166 |
-
| AR, PT, HI, KO, ZH | Positive | Negative |
|
| 167 |
-
| JA (export-driven) | Negative (hurts exporters) | Positive (helps exporters) |
|
| 168 |
-
| EN, DE, FR, ES | Neutral | Neutral |
|
| 169 |
-
|
| 170 |
-
#### Central Bank Rules
|
| 171 |
-
|
| 172 |
-
- **Home** central bank: rate cut = Positive, rate hike = Negative, hold = Neutral
|
| 173 |
-
- **Foreign** central bank: Neutral (unless headline explicitly links to local market impact)
|
| 174 |
-
|
| 175 |
-
---
|
| 176 |
|
| 177 |
-
|
| 178 |
-
|
| 179 |
-
|
| 180 |
-
|
| 181 |
-
|
| 182 |
-
|
| 183 |
-
|
| 184 |
-
|
| 185 |
-
|
| 186 |
-
|
| 187 |
-
|
| 188 |
-
|
| 189 |
-
#
|
| 190 |
-
|
| 191 |
-
|
| 192 |
-
|
| 193 |
-
|
| 194 |
-
|
| 195 |
-
|
| 196 |
-
### Per-Language Performance
|
| 197 |
-
|
| 198 |
-
| Language | Code | Accuracy | F1 Macro | Test Samples |
|
| 199 |
-
|---|---|---|---|---|
|
| 200 |
-
| Hindi | HI | 89.33% | 89.15% | 1,125 |
|
| 201 |
-
| Spanish | ES | 85.44% | 85.31% | 1,573 |
|
| 202 |
-
| Japanese | JA | 84.42% | 84.23% | 1,489 |
|
| 203 |
-
| French | FR | 84.06% | 84.24% | 2,579 |
|
| 204 |
-
| English | EN | 83.84% | 83.74% | 1,875 |
|
| 205 |
-
| Korean | KO | 83.54% | 83.71% | 3,280 |
|
| 206 |
-
| German | DE | 83.56% | 83.96% | 1,928 |
|
| 207 |
-
| Chinese | ZH | 83.50% | 81.43% | 1,751 |
|
| 208 |
-
| Portuguese | PT | 83.28% | 82.95% | 1,639 |
|
| 209 |
-
| Arabic | AR | 83.18% | 83.26% | 2,569 |
|
| 210 |
-
|
| 211 |
-
### Per-Class Performance
|
| 212 |
-
|
| 213 |
-
| Class | Precision | Recall | F1 | Support |
|
| 214 |
-
|---|---|---|---|---|
|
| 215 |
-
| Negative | 0.81 | 0.87 | 0.84 | 4,487 |
|
| 216 |
-
| Neutral | 0.86 | 0.78 | 0.82 | 8,398 |
|
| 217 |
-
| Positive | 0.84 | 0.90 | 0.87 | 6,923 |
|
| 218 |
-
|
| 219 |
-
---
|
| 220 |
-
|
| 221 |
-
## Training Pipeline
|
| 222 |
|
| 223 |
-
|
| 224 |
|
| 225 |
-
|
| 226 |
|
| 227 |
-
|
| 228 |
-
|
| 229 |
-
|
| 230 |
-
|
| 231 |
-
- **Language prefix** `[LANG]` injected before each headline for perspective routing
|
| 232 |
-
- **GroupShuffleSplit** by news source domain — no article from the same source appears in both train and test (prevents data leakage)
|
| 233 |
-
- **Gradient clipping** (max_norm=1.0) for training stability
|
| 234 |
|
| 235 |
-
##
|
| 236 |
|
| 237 |
-
|
| 238 |
|
| 239 |
-
##
|
| 240 |
|
| 241 |
| Parameter | Value |
|
| 242 |
-
|---|---|
|
| 243 |
-
| Base model | xlm-roberta-large
|
| 244 |
-
| Fine-tuning data | ~150,000 labeled headlines |
|
| 245 |
-
| Languages | 10 |
|
| 246 |
-
| Loss function | Focal Loss (gamma=2.0) |
|
| 247 |
-
| Learning rate | 2e-5 (→ 1e-5 SWA phase) |
|
| 248 |
-
| Label smoothing | 0.1 |
|
| 249 |
| Batch size | 32 |
|
| 250 |
| Max sequence length | 128 tokens |
|
| 251 |
-
| Precision | FP16
|
| 252 |
-
|
|
| 253 |
-
|
|
| 254 |
-
|
|
| 255 |
-
|
| 256 |
-
---
|
| 257 |
-
|
| 258 |
-
## Batch Processing
|
| 259 |
-
|
| 260 |
-
```python
|
| 261 |
-
from transformers import pipeline
|
| 262 |
-
|
| 263 |
-
classifier = pipeline("text-classification", model="Kenpache/flame2", device=0)
|
| 264 |
-
|
| 265 |
-
texts = [
|
| 266 |
-
"[EN] Stocks rallied after the Fed signaled a pause in rate hikes.",
|
| 267 |
-
"[EN] The company filed for Chapter 11 bankruptcy protection.",
|
| 268 |
-
"[DE] DAX erreicht neues Allzeithoch dank starker Bankenergebnisse",
|
| 269 |
-
"[FR] La Bourse de Paris chute de 3% après les tensions commerciales",
|
| 270 |
-
"[ES] El beneficio neto de la compañía creció un 25% interanual",
|
| 271 |
-
"[ZH] 中国央行降息50个基点,股市应声上涨",
|
| 272 |
-
"[PT] Ibovespa fecha em alta com otimismo sobre reforma tributária",
|
| 273 |
-
"[AR] ارتفاع مؤشر السوق السعودي بنسبة 2% بعد إعلان أرباح أرامكو",
|
| 274 |
-
"[HI] भारतीय रिजर्व बैंक ने रेपो रेट में 25 बीपीएस की कटौती की",
|
| 275 |
-
"[JA] トヨタ自動車の純利益が前年比30%増加",
|
| 276 |
-
"[KO] 삼성전자 실적 호조에 코스피 상승",
|
| 277 |
-
]
|
| 278 |
|
| 279 |
-
|
| 280 |
-
for text, result in zip(texts, results):
|
| 281 |
-
print(f"{result['label']:>8} ({result['score']:.2f}) {text[:70]}")
|
| 282 |
-
```
|
| 283 |
|
| 284 |
-
|
| 285 |
|
| 286 |
-
|
| 287 |
|
| 288 |
-
|
| 289 |
-
- **Algorithmic Trading** — perspective-aware signals: same event, different trades per market
|
| 290 |
-
- **Portfolio Risk Management** — track sentiment shifts across international holdings
|
| 291 |
-
- **Cross-Market Arbitrage** — detect when markets react differently to the same news
|
| 292 |
-
- **Financial NLP Research** — first multilingual perspective-aware sentiment benchmark
|
| 293 |
|
| 294 |
-
|
| 295 |
|
| 296 |
## Limitations
|
| 297 |
|
| 298 |
-
- Optimized for
|
| 299 |
-
-
|
| 300 |
-
-
|
| 301 |
-
|
| 302 |
-
---
|
| 303 |
|
| 304 |
## Citation
|
| 305 |
|
| 306 |
```bibtex
|
| 307 |
-
@
|
| 308 |
-
title={FLAME2:
|
| 309 |
-
author={
|
| 310 |
-
year={2026},
|
| 311 |
-
|
| 312 |
}
|
| 313 |
```
|
| 314 |
-
|
| 315 |
-
## License
|
| 316 |
-
|
| 317 |
-
Apache 2.0
|
| 1 |
---
|
| 2 |
language:
|
| 3 |
+
- ar
|
| 4 |
+
- de
|
| 5 |
+
- en
|
| 6 |
+
- es
|
| 7 |
+
- fr
|
| 8 |
+
- hi
|
| 9 |
+
- ja
|
| 10 |
+
- ko
|
| 11 |
+
- pt
|
| 12 |
+
- zh
|
| 13 |
+
license: apache-2.0
|
| 14 |
tags:
|
| 15 |
+
- finance
|
| 16 |
+
- financial-sentiment
|
| 17 |
+
- multilingual
|
| 18 |
+
- sentiment-analysis
|
| 19 |
+
- xlm-roberta
|
| 20 |
+
- perspective-aware
|
| 21 |
+
- text-classification
|
| 22 |
+
- financial-news
|
| 23 |
pipeline_tag: text-classification
|
| 24 |
+
model-index:
|
| 25 |
+
- name: FLAME2-multilingual-financial-sentiment
|
| 26 |
+
results:
|
| 27 |
+
- task:
|
| 28 |
+
type: text-classification
|
| 29 |
+
name: Financial Sentiment Analysis
|
| 30 |
+
metrics:
|
| 31 |
+
- type: accuracy
|
| 32 |
+
value: 0.8411
|
| 33 |
+
name: Accuracy
|
| 34 |
+
- type: f1
|
| 35 |
+
value: 0.8420
|
| 36 |
+
name: F1 Macro
|
| 37 |
---
|
| 38 |
|
| 39 |
+
# FLAME2 — Multilingual Perspective-Aware Financial Sentiment
|
| 40 |
|
| 41 |
+
> XLM-RoBERTa-large fine-tuned on 145,637 financial headlines across 10 languages.
|
| 42 |
+
> **Key innovation:** Same news event → different sentiment per market perspective.
|
| 43 |
|
| 44 |
+
## Key numbers
|
| 45 |
|
| 46 |
| | |
|
| 47 |
|---|---|
|
| 48 |
+
| **Base model** | XLM-RoBERTa-large (560M params) |
|
| 49 |
+
| **Languages** | 10 (AR, DE, EN, ES, FR, HI, JA, KO, PT, ZH) |
|
| 50 |
+
| **Training data** | 145,637 perspective-labeled headlines |
|
| 51 |
+
| **Overall Accuracy** | **84.11%** |
|
| 52 |
+
| **F1 Macro** | **84.20%** |
|
| 53 |
+
| **Labels** | negative (0) / neutral (1) / positive (2) |
|
| 54 |
|
| 55 |
+
## Quick start
|
| 56 |
|
| 57 |
```python
|
| 58 |
from transformers import pipeline
|
| 59 |
|
| 60 |
+
classifier = pipeline(
|
| 61 |
+
"text-classification",
|
| 62 |
+
model="Kenpache/FLAME2-multilingual-financial-sentiment"
|
| 63 |
+
)
|
| 64 |
|
| 65 |
+
# IMPORTANT: Always prefix with [LANG] for correct perspective
|
| 66 |
+
classifier("[EN] Federal Reserve cuts interest rates by 25 basis points")
|
| 67 |
+
# → [{'label': 'positive', 'score': 0.94}]
|
| 68 |
|
| 69 |
+
classifier("[AR] Oil prices fall sharply to $60 per barrel")
|
| 70 |
+
# → [{'label': 'negative', 'score': 0.89}]  (Gulf markets: oil exporter → bad)
|
| 71 |
|
| 72 |
+
classifier("[HI] Oil prices fall sharply to $60 per barrel")
|
| 73 |
+
# → [{'label': 'positive', 'score': 0.87}]  (India: oil importer → good)
|
| 74 |
```
|
| 75 |
|
| 76 |
+
## Perspective-aware: same news, different labels
|
| 77 |
|
| 78 |
+
```python
|
| 79 |
+
from transformers import pipeline
|
| 80 |
+
classifier = pipeline("text-classification",
|
| 81 |
+
model="Kenpache/FLAME2-multilingual-financial-sentiment")
|
| 82 |
+
|
| 83 |
+
oil_drop = "Oil prices fall to $60 per barrel"
|
| 84 |
+
|
| 85 |
+
results = {}
|
| 86 |
+
for lang in ["AR", "HI", "EN", "DE", "JA", "KO"]:
|
| 87 |
+
r = classifier(f"[{lang}] {oil_drop}")[0]
|
| 88 |
+
results[lang] = f"{r['label']} ({r['score']:.2f})"
|
| 89 |
+
|
| 90 |
+
# AR → negative (Gulf exporter: lost revenue)
|
| 91 |
+
# HI → positive (India importer: lower costs)
|
| 92 |
+
# EN → positive (US: lower inflation)
|
| 93 |
+
# DE → positive (Germany: lower energy costs)
|
| 94 |
+
# JA → positive (Japan: lower import costs)
|
| 95 |
+
# KO → positive (Korea: lower import costs)
|
| 96 |
+
```
|
| 97 |
|
| 98 |
+
## Batch processing
|
| 99 |
|
| 100 |
+
```python
|
| 101 |
+
headlines = [
|
| 102 |
+
"[EN] Amazon reports record quarterly revenue of $187 billion",
|
| 103 |
+
"[AR] OPEC+ agrees to increase oil output by 400,000 bpd",
|
| 104 |
+
"[HI] OPEC+ agrees to increase oil output by 400,000 bpd",
|
| 105 |
+
"[JA] Bank of Japan raises interest rates to 0.5%",
|
| 106 |
+
"[DE] ECB cuts rates by 25 basis points amid slowdown",
|
| 107 |
+
"[ZH] CSI 300 index rises 2.3% on stimulus optimism",
|
| 108 |
+
]
|
| 109 |
|
| 110 |
+
results = classifier(headlines)
|
| 111 |
+
for h, r in zip(headlines, results):
|
| 112 |
+
print(f"{h[:50]:<50} → {r['label']} ({r['score']:.2f})")
|
| 113 |
+
```
|
| 114 |
|
| 115 |
+
## Per-language performance
|
| 116 |
|
| 117 |
+
| Language | Accuracy | F1 Macro |
|
| 118 |
+
|----------|:--------:|:--------:|
|
| 119 |
+
| Hindi (hi) | 89.33% | 89.21% |
|
| 120 |
+
| French (fr) | 87.45% | 87.12% |
|
| 121 |
+
| English (en) | 86.78% | 86.54% |
|
| 122 |
+
| Chinese (zh) | 85.92% | 85.67% |
|
| 123 |
+
| German (de) | 85.34% | 85.01% |
|
| 124 |
+
| Spanish (es) | 84.67% | 84.43% |
|
| 125 |
+
| Japanese (ja) | 84.12% | 83.98% |
|
| 126 |
+
| Korean (ko) | 83.76% | 83.45% |
|
| 127 |
+
| Portuguese (pt) | 83.54% | 83.21% |
|
| 128 |
+
| Arabic (ar) | 83.18% | 82.87% |
|
| 129 |
+
| **Overall** | **84.11%** | **84.20%** |
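As a reading aid for the two metric columns, here is a minimal sketch of how accuracy and macro-averaged F1 are conventionally computed for the three labels (0 = negative, 1 = neutral, 2 = positive). It assumes the standard definitions; the card does not state which evaluation script produced the numbers above, and the toy labels are made up.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of per-class F1, so each class counts equally."""
    f1_scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels for illustration only (not model output).
y_true = [0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 2]
print(f"accuracy={accuracy(y_true, y_pred):.4f}  macro_f1={macro_f1(y_true, y_pred):.4f}")
```

Because macro F1 weights every class equally, it can sit above or below accuracy when the class balance is uneven.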
|
| 130 |
|
| 131 |
+
## Training details
|
| 132 |
|
| 133 |
| Parameter | Value |
|
| 134 |
+
|-----------|-------|
|
| 135 |
+
| Base model | xlm-roberta-large |
|
| 136 |
| Batch size | 32 |
|
| 137 |
+
| Learning rate | 2e-5 |
|
| 138 |
+
| Epochs | 5 (early stopping, patience=2) |
|
| 139 |
| Max sequence length | 128 tokens |
|
| 140 |
+
| Precision | FP16 |
|
| 141 |
+
| Label smoothing | 0.1 |
|
| 142 |
+
| Class weights | Balanced |
|
| 143 |
+
| Split | 70/15/15 group-by-source |
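The "group-by-source" split can be sketched as follows. This is an illustrative pure-Python version, assuming the split keeps every headline from a given source in exactly one of train/validation/test; the function name `split_by_source`, the seed, and the toy source names are hypothetical, and the fractions here are applied to the number of sources rather than headlines, which may differ from the actual training script.

```python
import random


def split_by_source(sources, fractions=(0.70, 0.15, 0.15), seed=0):
    """Assign each sample to train/val/test so that no source spans two splits.

    `sources` holds one source identifier per sample (e.g. a news domain).
    Fractions are applied to the number of distinct sources, not samples.
    """
    groups = sorted(set(sources))
    random.Random(seed).shuffle(groups)

    n_train = round(fractions[0] * len(groups))
    n_val = round(fractions[1] * len(groups))

    split_of = {}
    for i, group in enumerate(groups):
        if i < n_train:
            split_of[group] = "train"
        elif i < n_train + n_val:
            split_of[group] = "val"
        else:
            split_of[group] = "test"
    return [split_of[s] for s in sources]


# Toy data: 20 made-up source domains with 5 headlines each.
sources = [f"site{i}.example" for i in range(20) for _ in range(5)]
splits = split_by_source(sources)
print({name: splits.count(name) for name in ("train", "val", "test")})
```

Splitting by source rather than by headline prevents near-duplicate articles from the same outlet from leaking between training and evaluation data.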
|
| 144 |
|
| 145 |
+
**Language prefix injection:** Each input is prefixed with `[LANG]` (e.g., `[AR]`, `[EN]`) during training, so the model learns perspective-specific sentiment rules. The same prefix must be supplied at inference.
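A minimal sketch of the prefixing step, assuming the ten codes listed in this card are the only valid prefixes. The helper name `add_lang_prefix` and the `SUPPORTED_LANGS` set are illustrative and not part of the released model or its tooling.

```python
# Hypothetical helper, not shipped with FLAME2.
SUPPORTED_LANGS = {"AR", "DE", "EN", "ES", "FR", "HI", "JA", "KO", "PT", "ZH"}


def add_lang_prefix(headline: str, lang: str) -> str:
    """Return the headline with a "[LANG] " prefix, e.g. "[AR] ..."."""
    code = lang.strip().upper()
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return f"[{code}] {headline.strip()}"


print(add_lang_prefix("Oil prices fall to $60 per barrel", "hi"))
# prints: [HI] Oil prices fall to $60 per barrel
```

Rejecting unknown codes early avoids silently sending the model a prefix it never saw during training.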
|
| 146 |
|
| 147 |
+
## Dataset
|
| 148 |
|
| 149 |
+
Trained on the [FLAME Perspective-Aware Financial Sentiment Dataset](https://huggingface.co/datasets/Kenpache/perspective-aware-financial-sentiment): 145,637 headlines across 10 languages, labeled from the local investor's perspective.
|
| 150 |
|
| 151 |
+
## When to use language prefixes
|
| 152 |
|
| 153 |
+
| Use case | Prefix |
|
| 154 |
+
|----------|--------|
|
| 155 |
+
| Classifying Arabic financial news | `[AR] headline` |
|
| 156 |
+
| Classifying English financial news | `[EN] headline` |
|
| 157 |
+
| Unknown language | Detect language first, then prefix |
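For the "unknown language" row, one rough way to pick a prefix is to look at the script of the text. The sketch below is a crude heuristic under stated assumptions, not the author's method: it maps Unicode blocks to the card's codes, cannot tell Latin-script languages (EN, DE, FR, ES, PT) apart, and treats any Arabic-script or Devanagari text as AR or HI. The function names are hypothetical; a dedicated language-identification tool would be needed for anything beyond this.

```python
def guess_lang_from_script(text):
    """Guess a prefix code from the writing system; return None if unsure."""
    has_cjk = False
    for ch in text:
        cp = ord(ch)
        if 0x0600 <= cp <= 0x06FF:      # Arabic block
            return "AR"
        if 0x0900 <= cp <= 0x097F:      # Devanagari block
            return "HI"
        if 0xAC00 <= cp <= 0xD7A3:      # Hangul syllables
            return "KO"
        if 0x3040 <= cp <= 0x30FF:      # Hiragana and Katakana
            return "JA"
        if 0x4E00 <= cp <= 0x9FFF:      # CJK ideographs (shared by ZH and JA)
            has_cjk = True
    # Ideographs with no kana anywhere in the text: assume Chinese.
    return "ZH" if has_cjk else None


def prefix_unknown(text, fallback=None):
    """Prefix the text with a guessed code, or with `fallback` for Latin script."""
    code = guess_lang_from_script(text) or fallback
    if code is None:
        raise ValueError("Could not determine language; pass an explicit fallback")
    return f"[{code}] {text}"


print(prefix_unknown("삼성전자 실적 호조에 코스피 상승"))
print(prefix_unknown("DAX erreicht neues Allzeithoch", fallback="DE"))
```

The ideograph check runs only after the whole string has been scanned, so a Japanese headline that starts with kanji is still recognized once a kana character appears.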
|
| 158 |
|
| 159 |
## Limitations
|
| 160 |
|
| 161 |
+
- Optimized for short news headlines (≤128 tokens)
|
| 162 |
+
- Requires `[LANG]` prefix — omitting it reduces accuracy
|
| 163 |
+
- Perspective rules reflect major economic patterns; niche sectors may vary
|
| 164 |
+
- Training data covers only 2024–2026 headlines
|
| 165 |
|
| 166 |
## Citation
|
| 167 |
|
| 168 |
```bibtex
|
| 169 |
+
@misc{flame2_2026,
|
| 170 |
+
title = {FLAME2: Multilingual Perspective-Aware Financial Sentiment Model},
|
| 171 |
+
author = {Zaraki},
|
| 172 |
+
year = {2026},
|
| 173 |
+
publisher = {HuggingFace},
|
| 174 |
+
url = {https://huggingface.co/Kenpache/FLAME2-multilingual-financial-sentiment}
|
| 175 |
}
|
| 176 |
```
|