Claude committed on
Commit
a0dc23e
·
unverified ·
1 Parent(s): 0aa159b

Sprint 2 of the report plan: refactor generator.py into Jinja2 templates


Sprint 2 of phase 0. Goal: make the HTML generator maintainable before
adding anything to it. No visible functional change; snapshot tests
lock in non-regression.

Refactor:

- `picarones/report/generator.py` shrinks from 3690 to 617 lines. The
  monolithic `_HTML_TEMPLATE` (~3100 lines of HTML+CSS+JS concatenated in a
  Python f-string, with `{{` escapes everywhere) is split into 10
  external files under `picarones/report/templates/`:
  - `base.html.j2`: assembles everything via {% include %}
  - `_header.html`: nav + global exclusion banner + <main>
  - `_footer.html`: </main> + <footer>
  - `_styles.css`: 564 lines of CSS
  - `_app.js`: 2085 lines of application JS (94 functions)
  - `view_ranking.html`: view 1
  - `view_gallery.html`: view 2
  - `view_document.html`: view 3
  - `view_analyses.html`: view 4
  - `view_characters.html`: view 5

- Rendering engine: `jinja2>=3.1.0` (added to the runtime dependencies).
  `generate()` loads `base.html.j2` and renders it with the usual
  variables: `corpus_name`, `picarones_version`, `html_lang`,
  `chartjs_inline`, `report_data_json`, `i18n_json`.

- i18n externalised: the Python dict `TRANSLATIONS` (101 keys × 2 languages)
  moves from `picarones/i18n.py` to `picarones/report/i18n/{fr,en}.json`,
  loaded at first import. The public API (`get_labels`, `TRANSLATIONS`,
  `SUPPORTED_LANGS`) stays compatible.

- Packaging: `pyproject.toml` ships the templates and JSON files as
  `package-data`, `MANIFEST.in` is kept in sync, and `.gitignore` exempts
  `picarones/report/templates/*.html` from the global `*.html` rule
  (otherwise the source HTML partials would be ignored as if they were
  generated reports).
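
To illustrate the include-based assembly described above, here is a minimal sketch. It assumes nothing about the real `generate()` beyond the variable names listed in this message: the template contents are invented miniatures, and `DictLoader` stands in for whatever file-based loader the project actually uses.

```python
import json

from jinja2 import DictLoader, Environment

# Hypothetical miniature templates; the file names mirror the commit,
# the contents are invented for this illustration.
TEMPLATES = {
    "base.html.j2": (
        '<!DOCTYPE html>\n<html lang="{{ html_lang }}">\n'
        "<head><title>{{ corpus_name }}</title>"
        "<style>{% include '_styles.css' %}</style></head>\n"
        "<body>{% include '_header.html' %}"
        "{% include 'view_ranking.html' %}"
        "{% include '_footer.html' %}\n"
        "<script>const DATA = {{ report_data_json }};</script>\n"
        "</body></html>"
    ),
    "_styles.css": "body { margin: 0; }",
    "_header.html": "<main>",
    "view_ranking.html": '<section id="view-ranking"></section>',
    "_footer.html": "</main><footer>Picarones {{ picarones_version }}</footer>",
}


def render_report(context: dict) -> str:
    """Render the base template; included partials see the same context."""
    # Autoescaping is left off because the JSON payload is injected verbatim.
    env = Environment(loader=DictLoader(TEMPLATES), autoescape=False)
    return env.get_template("base.html.j2").render(**context)


html = render_report(
    {
        "corpus_name": "demo",
        "picarones_version": "0.0.0",  # placeholder value
        "html_lang": "fr",
        "report_data_json": json.dumps({"engines": []}),
    }
)
print(html)
```

Because `{% include %}` passes the current context to the partial by default, `_footer.html` can use `picarones_version` without any extra plumbing.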

Tests:

- 16 new tests in `test_sprint17_jinja2_refactor.py`:
  - All 10 expected templates are present
  - No leftover `.format()` placeholders in the templates
  - Rendering produces HTML containing the 5 views plus `const DATA`/`const I18N`
  - No duplicated or unbalanced `<script>` tags
  - Deterministic generation (identical hash for two calls on the same data)
  - English locale (`<html lang="en">`) rendered correctly
  - Every i18n JSON file parses; FR and EN have the same keys
  - Contents of the extracted templates (canonical CSS rules present,
    `'use strict'` at the top of `_app.js`, no leftover `<script>` in
    the JS, each view has a root element with an `id`)

- Full suite: 1101 passed, 2 skipped (vs 1085 before). Zero
  regressions.
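
The guard-rail checks listed above can be sketched as follows. This is not the content of `test_sprint17_jinja2_refactor.py`: `render` is a hypothetical stand-in for the real generator, and the label dicts are tiny invented samples.

```python
import hashlib
import json


def render(data: dict, labels: dict) -> str:
    """Hypothetical stand-in for the report generator (a pure function)."""
    payload = json.dumps(data, sort_keys=True, ensure_ascii=False)
    i18n = json.dumps(labels, sort_keys=True, ensure_ascii=False)
    return f"<script>const DATA = {payload}; const I18N = {i18n};</script>"


def digest(html: str) -> str:
    """Hash the rendered output so two runs can be compared cheaply."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def test_generation_is_deterministic() -> None:
    data = {"engines": ["tesseract"], "docs": 3}
    labels = {"tab_ranking": "Classement"}
    assert digest(render(data, labels)) == digest(render(data, labels))


def test_script_tags_are_balanced() -> None:
    html = render({"docs": 1}, {"tab_ranking": "Ranking"})
    assert html.count("<script") == html.count("</script>") == 1


def test_locales_share_the_same_keys() -> None:
    fr = {"tab_ranking": "Classement", "tab_gallery": "Galerie"}
    en = {"tab_ranking": "Ranking", "tab_gallery": "Gallery"}
    assert set(fr) == set(en)
```

Hashing the output keeps the determinism check independent of the report size, and comparing key sets catches a label added to one locale only.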

Tricky points handled during the extraction:
- Converting `{{` → `{` and `}}` → `}` (undoing the f-string escaping) in
  the JS/CSS content; otherwise Jinja2 would interpret each `{{` as the
  start of a variable expression.
- Unescaping `\\` → `\` in `_app.js` (4 occurrences of the pattern
  `l\'analyse` in nested JS literals).
- Removing the enclosing `<script>` tag from `_app.js`; otherwise the
  rendered output contained nested `<script>` tags.
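
The brace conversion from the first point can be sketched like this. The helper name and the sample string are invented for illustration, and the commit does not say how the conversion was actually performed.

```python
def unescape_format_braces(text: str) -> str:
    """Turn doubled braces (format-string escapes) back into literal braces.

    Naive on purpose: it assumes the text contains no real placeholders
    such as {name}, which would make runs like }}} ambiguous.
    """
    return text.replace("{{", "{").replace("}}", "}")


# Invented sample: JS as it would look inside an escaped Python template.
escaped_js = "function f() {{ return {{a: 1}}; }}"
plain_js = unescape_format_braces(escaped_js)
print(plain_js)
```

With no placeholders present, the result matches what `str.format()` would have produced from the escaped string, which is the property the extracted files need to preserve.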

https://claude.ai/code/session_0162FdNNJyNvBuYzkgtsr9VB

.gitignore CHANGED
@@ -20,3 +20,6 @@ rapports/
20
  corpus_*/
21
  corpus/
22
  uploads/
 
 
 
 
20
  corpus_*/
21
  corpus/
22
  uploads/
23
+
24
+ # Exceptions : fichiers HTML sources du package (templates Jinja2, pas rapports)
25
+ !picarones/report/templates/*.html
CLAUDE.md CHANGED
@@ -45,7 +45,10 @@ picarones/
45
  │ ├── hallucination.py # Détection hallucinations VLM (score ancrage, ratio longueur)
46
  │ ├── line_metrics.py # Distribution erreurs par ligne (Gini, percentiles)
47
  │ ├── history.py # Suivi longitudinal SQLite
48
- │ └── robustness.py # Analyse robustesse (bruit, flou, rotation, résolution)
 
 
 
49
  ├── engines/
50
  │ ├── base.py # BaseEngine avec execution_mode ("io" ou "cpu")
51
  │ ├── tesseract.py # execution_mode = "cpu"
@@ -55,12 +58,12 @@ picarones/
55
  │ └── azure_doc_intel.py
56
  ├── llm/
57
  │ ├── base.py
58
- │ ├── mistral_adapter.py # POST /v1/chat/completions — BUG ACTIF : sortie vide à corriger
59
  │ ├── openai_adapter.py
60
  │ ├── anthropic_adapter.py
61
  │ └── ollama_adapter.py
62
  ├── pipelines/
63
- │ ├── base.py # OCRLLMPipeline BUG ACTIF : résultats 0/0 documents
64
  │ └── over_normalization.py
65
  ├── prompts/ # 8 fichiers .txt FR+EN
66
  │ ├── medieval_french.txt
@@ -72,8 +75,17 @@ picarones/
72
  │ ├── medieval_latin.txt
73
  │ └── zero_shot.txt
74
  ├── report/
75
- │ ├── generator.py # Rapport HTML auto-contenu (Chart.js + diff2html)
76
- │ └── diff_utils.py
 
 
 
 
 
 
 
 
 
77
  ├── web/
78
  │ └── app.py # FastAPI, SSE, upload corpus ZIP, endpoints modèles dynamiques
79
  └── importers/
@@ -179,6 +191,7 @@ AZURE_DOC_INTEL_KEY=...
179
  | 14 | Filtrage robuste des moteurs, validation corpus |
180
  | 15 | Correction du bug pipeline OCR+LLM sortie vide (normalisation ContentChunk Mistral, logs finish_reason/tokens) |
181
  | 16 | **Sprint 1 du plan rapport** : câblage de `line_metrics` et `hallucination` dans le runner et l'agrégation `EngineReport`, fondations du moteur narratif (`core/narrative/` avec modèle `Fact` et registre de détecteurs), correctifs qualité (deprecation Pillow `getdata` → `tobytes`, deux `except Exception: pass` remplacés par warnings explicites) |
 
182
 
183
  ---
184
 
@@ -211,7 +224,7 @@ défaut se fait sprint par sprint au fur et à mesure de leur implémentation :
211
  ## Contexte développement
212
 
213
  - **Environnement** : GitHub Codespaces (`/workspaces/Picarones`), Python 3.12
214
- - **Tests** : 1072 passed, 2 skipped (Sprint 16)
215
  - **Branche active** : `claude/review-picarones-benchmarks-E3J42`
216
  - **Transcript de la conversation de développement** :
217
  `/mnt/transcripts/2026-03-11-14-01-41-picarones-ocr-bench-project.txt`
 
45
  │ ├── hallucination.py # Détection hallucinations VLM (score ancrage, ratio longueur)
46
  │ ├── line_metrics.py # Distribution erreurs par ligne (Gini, percentiles)
47
  │ ├── history.py # Suivi longitudinal SQLite
48
+ │ ├── robustness.py # Analyse robustesse (bruit, flou, rotation, résolution)
49
+ │ └── narrative/ # Moteur narratif factuel (Sprint 16) — modèle Fact + registre
50
+ │     ├── facts.py # Fact, FactType (12 types), FactImportance, DetectorRegistry
51
+ │     └── detectors.py # Stubs des 12 détecteurs, implémentations par sprint
52
  ├── engines/
53
  │ ├── base.py # BaseEngine avec execution_mode ("io" ou "cpu")
54
  │ ├── tesseract.py # execution_mode = "cpu"
 
58
  │ └── azure_doc_intel.py
59
  ├── llm/
60
  │ ├── base.py
61
+ │ ├── mistral_adapter.py
62
  │ ├── openai_adapter.py
63
  │ ├── anthropic_adapter.py
64
  │ └── ollama_adapter.py
65
  ├── pipelines/
66
+ │ ├── base.py # OCRLLMPipeline (interface BaseOCREngine)
67
  │ └── over_normalization.py
68
  ├── prompts/ # 8 fichiers .txt FR+EN
69
  │ ├── medieval_french.txt
 
75
  │ ├── medieval_latin.txt
76
  │ └── zero_shot.txt
77
  ├── report/
78
+ │ ├── generator.py # Orchestration Jinja2 (617 lignes depuis Sprint 17)
79
+ │ ├── diff_utils.py
80
+ │ ├── templates/ # Templates Jinja2 (Sprint 17)
81
+ │ │ ├── base.html.j2 # assemble tout via {% include %}
82
+ │ │ ├── _header.html, _footer.html, _styles.css, _app.js
83
+ │ │ └── view_ranking.html, view_gallery.html, view_document.html,
84
+ │ │ view_analyses.html, view_characters.html
85
+ │ ├── i18n/ # Traductions FR/EN (Sprint 17 — extraites de i18n.py)
86
+ │ │ ├── fr.json
87
+ │ │ └── en.json
88
+ │ └── vendor/ # Chart.js vendorisé
89
  ├── web/
90
  │ └── app.py # FastAPI, SSE, upload corpus ZIP, endpoints modèles dynamiques
91
  └── importers/
 
191
  | 14 | Filtrage robuste des moteurs, validation corpus |
192
  | 15 | Correction du bug pipeline OCR+LLM sortie vide (normalisation ContentChunk Mistral, logs finish_reason/tokens) |
193
  | 16 | **Sprint 1 du plan rapport** : câblage de `line_metrics` et `hallucination` dans le runner et l'agrégation `EngineReport`, fondations du moteur narratif (`core/narrative/` avec modèle `Fact` et registre de détecteurs), correctifs qualité (deprecation Pillow `getdata` → `tobytes`, deux `except Exception: pass` remplacés par warnings explicites) |
194
+ | 17 | **Sprint 2 du plan rapport** : refactor de `generator.py` (3690 → 617 lignes) via Jinja2. Le monolithe `_HTML_TEMPLATE` est découpé en 10 fichiers externes dans `picarones/report/templates/` (base + 5 vues + header/footer + CSS + JS). L'i18n `i18n.py` (dict Python 101 clés) migré vers `picarones/report/i18n/{fr,en}.json` chargés à l'import. Ajout de 16 tests de non-régression (structure, déterminisme, i18n, garde-fous contre balises dupliquées). |
195
 
196
  ---
197
 
 
224
  ## Contexte développement
225
 
226
  - **Environnement** : GitHub Codespaces (`/workspaces/Picarones`), Python 3.12
227
+ - **Tests** : 1101 passed, 2 skipped (Sprint 17)
228
  - **Branche active** : `claude/review-picarones-benchmarks-E3J42`
229
  - **Transcript de la conversation de développement** :
230
  `/mnt/transcripts/2026-03-11-14-01-41-picarones-ocr-bench-project.txt`
MANIFEST.in CHANGED
@@ -5,3 +5,5 @@ include CLAUDE.md
5
  recursive-include picarones/prompts *.txt
6
  recursive-include picarones/web/static *.css
7
  recursive-include picarones *.json *.yaml *.yml
 
 
 
5
  recursive-include picarones/prompts *.txt
6
  recursive-include picarones/web/static *.css
7
  recursive-include picarones *.json *.yaml *.yml
8
+ recursive-include picarones/report/templates *.j2 *.html *.css *.js
9
+ recursive-include picarones/report/i18n *.json
picarones/i18n.py CHANGED
@@ -4,234 +4,45 @@ Langues supportées
4
  ------------------
5
  - ``"fr"`` : français (défaut)
6
  - ``"en"`` : anglais patrimonial (heritage English)
 
 
 
 
7
  """
8
 
9
  from __future__ import annotations
10
 
11
- TRANSLATIONS: dict[str, dict[str, str]] = {
12
- "fr": {
13
- # ── HTML méta ──────────────────────────────────────────────────────
14
- "html_lang": "fr",
15
- "date_locale": "fr-FR",
16
- # ── Navigation ─────────────────────────────────────────────────────
17
- "nav_report": "rapport OCR",
18
- "tab_ranking": "Classement",
19
- "tab_gallery": "Galerie",
20
- "tab_document": "Document",
21
- "tab_characters": "Caractères",
22
- "tab_analyses": "Analyses",
23
- "btn_present": "⊞ Présentation",
24
- # ── Classement ─────────────────────────────────────────────────────
25
- "h_ranking": "Classement des moteurs",
26
- "col_rank": "#",
27
- "col_engine": "Concurrent",
28
- "col_cer": "CER exact",
29
- "col_cer_diplo": "CER diplo.",
30
- "col_cer_diplo_title": "CER après normalisation diplomatique (ſ=s, u=v, i=j…) — mesure les erreurs substantielles en ignorant les variantes graphiques codifiées",
31
- "col_wer": "WER",
32
- "col_mer": "MER",
33
- "col_wil": "WIL",
34
- "col_ligatures": "Ligatures",
35
- "col_ligatures_title": "Taux de reconnaissance des ligatures (fi, fl, œ, æ, ff…)",
36
- "col_diacritics": "Diacritiques",
37
- "col_diacritics_title": "Taux de conservation des diacritiques (accents, cédilles, trémas…)",
38
- "col_gini": "Gini",
39
- "col_gini_title": "Coefficient de Gini des erreurs CER par ligne — 0 = erreurs uniformes, 1 = erreurs concentrées. Un bon moteur a CER bas ET Gini bas.",
40
- "col_anchor": "Ancrage",
41
- "col_anchor_title": "Score d'ancrage : proportion des trigrammes de la sortie trouvant un ancrage dans le GT — faible score = hallucinations probables (LLM/VLM)",
42
- "col_cer_median": "CER médian",
43
- "col_cer_min": "CER min",
44
- "col_cer_max": "CER max",
45
- "col_overnorm": "Sur-norm.",
46
- "col_overnorm_title": "Classe 10 — Sur-normalisation LLM : taux de mots corrects dégradés par le LLM",
47
- "col_docs": "Docs",
48
- # ── Galerie ────────────────────────────────────────────────────────
49
- "h_gallery": "Galerie des documents",
50
- "gallery_sort_label": "Trier par :",
51
- "gallery_sort_id": "Identifiant",
52
- "gallery_sort_cer": "CER moyen",
53
- "gallery_sort_difficulty": "Difficulté",
54
- "gallery_sort_best": "Meilleur moteur",
55
- "gallery_filter_cer_label": "Filtrer CER >",
56
- "gallery_filter_engine_label": "Moteur :",
57
- "gallery_filter_all": "Tous",
58
- "gallery_empty": "Aucun document ne correspond aux filtres.",
59
- # ── Document ───────────────────────────────────────────────────────
60
- "doc_sidebar_header": "Documents",
61
- "doc_title_default": "Sélectionner un document",
62
- "h_image": "Image originale",
63
- "h_gt": "Vérité terrain (GT)",
64
- "h_diff": "Sorties OCR — diff par moteur",
65
- "h_line_metrics": "Distribution des erreurs par ligne",
66
- "h_hallucination": "Analyse des hallucinations",
67
- # ── Analyses ───────────────────────────────────────────────────────
68
- "h_characters": "Analyse des caractères",
69
- "char_engine_label": "Moteur :",
70
- "h_cer_dist": "Distribution du CER par moteur",
71
- "h_radar": "Profil des moteurs (radar)",
72
- "radar_note": "Axe radar : CER, WER, MER, WIL — valeurs inversées (plus c'est haut, meilleur est le moteur).",
73
- "h_cer_doc": "CER par document (tous moteurs)",
74
- "h_duration": "Temps d'exécution moyen (secondes/document)",
75
- "h_quality_cer": "Qualité image ↔ CER (scatter plot)",
76
- "quality_cer_note": "Chaque point = un document. Axe X = score qualité image [0–1]. Axe Y = CER. Corrélation négative attendue.",
77
- "h_taxonomy": "Taxonomie des erreurs par moteur",
78
- "taxonomy_note": "Distribution des classes d'erreurs (classes 1–9 de la taxonomie Picarones).",
79
- "h_reliability": "Courbes de fiabilité",
80
- "reliability_note": "Pour les X% documents les plus faciles (triés par CER croissant), quel est le CER moyen cumulé ? Une courbe basse = moteur performant même sur les documents faciles.",
81
- "h_bootstrap": "Intervalles de confiance à 95 % (bootstrap)",
82
- "bootstrap_note": "IC à 95% sur le CER moyen par moteur (1000 itérations bootstrap).",
83
- "h_venn": "Erreurs communes / exclusives (Venn)",
84
- "venn_note": "Intersection des ensembles d'erreurs entre les 2 ou 3 premiers concurrents. Erreurs communes = segments partagés.",
85
- "h_pairwise": "Tests de Wilcoxon — comparaisons par paires",
86
- "pairwise_note": "Test signé-rangé de Wilcoxon (non-paramétrique). Seuil α = 0.05.",
87
- "h_clusters": "Clustering des patterns d'erreurs",
88
- "h_gini_cer": "Gini vs CER moyen",
89
- "gini_cer_ideal": "— idéal : bas-gauche",
90
- "gini_cer_note": "Axe X = CER moyen, Axe Y = coefficient de Gini. Un moteur idéal a CER bas ET Gini bas (erreurs rares et uniformes).",
91
- "h_ratio_anchor": "Ratio longueur vs ancrage",
92
- "ratio_anchor_subtitle": "— hallucinations VLM",
93
- "ratio_anchor_note": "Axe X = score d'ancrage trigrammes [0–1]. Axe Y = ratio longueur sortie/GT. Zone ⚠️ : ancrage &lt; 0.5 ou ratio &gt; 1.2 → hallucinations probables.",
94
- "h_correlation": "Matrice de corrélation entre métriques",
95
- "corr_engine_label": "Moteur :",
96
- "corr_note": "Coefficient de Pearson entre les métriques CER, WER, qualité image, ligatures, diacritiques. Vert = corrélation positive, Rouge = corrélation négative.",
97
- # ── Footer ────────────────────────────────────────────────────────
98
- "footer_generated": "Rapport généré le",
99
- "footer_by": "par Picarones",
100
- # ── JS strings dynamiques ─────────────────────────────────────────
101
- "heatmap_start": "Début",
102
- "heatmap_mid": "Milieu",
103
- "heatmap_end": "Fin",
104
- "heatmap_title": "CARTE THERMIQUE (position)",
105
- "percentile_title": "PERCENTILES CER",
106
- "lines": "lignes",
107
- "no_line_metrics": "Aucune métrique de ligne disponible.",
108
- "no_hall_metrics": "Aucune métrique d'hallucination disponible.",
109
- "no_hall_blocks": "Aucun bloc halluciné détecté.",
110
- "hall_detected": "⚠️ Hallucinations détectées",
111
- "hall_ok": "✓ Ancrage satisfaisant",
112
- "hall_blocks_title": "Blocs sans ancrage dans le GT :",
113
- "hall_block_label": "Bloc halluciné",
114
- "hall_more_blocks": "bloc(s) supplémentaire(s)",
115
- "no_gini": "Données Gini non disponibles.",
116
- "no_scatter": "Données non disponibles.",
117
- "total_errors": "Total :",
118
- "errors_classified": "erreurs classifiées.",
119
- "class_col": "Classe",
120
- "proportion_col": "Proportion",
121
- "taxonomy_engine_label": "Moteur :",
122
- },
123
- "en": {
124
- # ── HTML méta ──────────────────────────────────────────────────────
125
- "html_lang": "en",
126
- "date_locale": "en-GB",
127
- # ── Navigation ─────────────────────────────────────────────────────
128
- "nav_report": "OCR report",
129
- "tab_ranking": "Ranking",
130
- "tab_gallery": "Gallery",
131
- "tab_document": "Document",
132
- "tab_characters": "Characters",
133
- "tab_analyses": "Analyses",
134
- "btn_present": "⊞ Presentation",
135
- # ── Ranking ────────────────────────────────────────────────────────
136
- "h_ranking": "Engine Ranking",
137
- "col_rank": "#",
138
- "col_engine": "Engine",
139
- "col_cer": "Exact CER",
140
- "col_cer_diplo": "Diplo. CER",
141
- "col_cer_diplo_title": "CER after diplomatic normalisation (ſ=s, u=v, i=j…) — measures substantial errors ignoring codified graphical variants",
142
- "col_wer": "WER",
143
- "col_mer": "MER",
144
- "col_wil": "WIL",
145
- "col_ligatures": "Ligatures",
146
- "col_ligatures_title": "Ligature recognition rate (fi, fl, œ, æ, ff…)",
147
- "col_diacritics": "Diacritics",
148
- "col_diacritics_title": "Diacritic preservation rate (accents, cedillas, umlauts…)",
149
- "col_gini": "Gini",
150
- "col_gini_title": "Gini coefficient of per-line CER errors — 0 = uniform errors, 1 = concentrated errors. A good engine has low CER AND low Gini.",
151
- "col_anchor": "Anchor",
152
- "col_anchor_title": "Anchor score: proportion of output trigrams found in the GT — low score = probable hallucinations (LLM/VLM)",
153
- "col_cer_median": "Median CER",
154
- "col_cer_min": "Min CER",
155
- "col_cer_max": "Max CER",
156
- "col_overnorm": "Over-norm.",
157
- "col_overnorm_title": "Class 10 — LLM over-normalisation: rate of correct words degraded by the LLM",
158
- "col_docs": "Docs",
159
- # ── Gallery ────────────────────────────────────────────────────────
160
- "h_gallery": "Document Gallery",
161
- "gallery_sort_label": "Sort by:",
162
- "gallery_sort_id": "Identifier",
163
- "gallery_sort_cer": "Mean CER",
164
- "gallery_sort_difficulty": "Difficulty",
165
- "gallery_sort_best": "Best engine",
166
- "gallery_filter_cer_label": "Filter CER >",
167
- "gallery_filter_engine_label": "Engine:",
168
- "gallery_filter_all": "All",
169
- "gallery_empty": "No documents match the filters.",
170
- # ── Document ───────────────────────────────────────────────────────
171
- "doc_sidebar_header": "Documents",
172
- "doc_title_default": "Select a document",
173
- "h_image": "Original Image",
174
- "h_gt": "Ground Truth (GT)",
175
- "h_diff": "OCR Output — diff by engine",
176
- "h_line_metrics": "Error Distribution by Line",
177
- "h_hallucination": "Hallucination Analysis",
178
- # ── Analyses ───────────────────────────────────────────────────────
179
- "h_characters": "Character Analysis",
180
- "char_engine_label": "Engine:",
181
- "h_cer_dist": "CER Distribution by Engine",
182
- "h_radar": "Engine Profile (radar)",
183
- "radar_note": "Radar axes: CER, WER, MER, WIL — inverted values (higher = better engine).",
184
- "h_cer_doc": "CER by Document (all engines)",
185
- "h_duration": "Average Execution Time (seconds/document)",
186
- "h_quality_cer": "Image Quality ↔ CER (scatter plot)",
187
- "quality_cer_note": "Each point = one document. X-axis = image quality score [0–1]. Y-axis = CER. Negative correlation expected.",
188
- "h_taxonomy": "Error Taxonomy by Engine",
189
- "taxonomy_note": "Distribution of error classes (classes 1–9 of the Picarones taxonomy).",
190
- "h_reliability": "Reliability Curves",
191
- "reliability_note": "For the X% easiest documents (sorted by ascending CER), what is the cumulative mean CER? A low curve = engine performing well even on easy documents.",
192
- "h_bootstrap": "95% Bootstrap Confidence Intervals",
193
- "bootstrap_note": "95% CI on mean CER per engine (1000 bootstrap iterations).",
194
- "h_venn": "Shared / Exclusive Errors (Venn)",
195
- "venn_note": "Intersection of error sets between the 2 or 3 top engines. Shared errors = overlapping segments.",
196
- "h_pairwise": "Wilcoxon Tests — pairwise comparisons",
197
- "pairwise_note": "Wilcoxon signed-rank test (non-parametric). Threshold α = 0.05.",
198
- "h_clusters": "Frequent Error Clusters",
199
- "h_gini_cer": "Gini vs Mean CER",
200
- "gini_cer_ideal": "— ideal: bottom-left",
201
- "gini_cer_note": "X-axis = mean CER, Y-axis = Gini coefficient. An ideal engine has low CER AND low Gini (rare, uniform errors).",
202
- "h_ratio_anchor": "Length Ratio vs Anchor Score",
203
- "ratio_anchor_subtitle": "— VLM hallucinations",
204
- "ratio_anchor_note": "X-axis = trigram anchor score [0–1]. Y-axis = output/GT length ratio. ⚠️ Zone: anchor &lt; 0.5 or ratio &gt; 1.2 → probable hallucinations.",
205
- "h_correlation": "Metric Correlation Matrix",
206
- "corr_engine_label": "Engine:",
207
- "corr_note": "Pearson coefficient between CER, WER, image quality, ligatures, diacritics. Green = positive correlation, Red = negative.",
208
- # ── Footer ────────────────────────────────────────────────────────
209
- "footer_generated": "Report generated on",
210
- "footer_by": "by Picarones",
211
- # ── JS strings dynamiques ─────────────────────────────────────────
212
- "heatmap_start": "Start",
213
- "heatmap_mid": "Middle",
214
- "heatmap_end": "End",
215
- "heatmap_title": "HEATMAP (position)",
216
- "percentile_title": "CER PERCENTILES",
217
- "lines": "lines",
218
- "no_line_metrics": "No line metrics available.",
219
- "no_hall_metrics": "No hallucination metrics available.",
220
- "no_hall_blocks": "No hallucinated blocks detected.",
221
- "hall_detected": "⚠️ Hallucinations detected",
222
- "hall_ok": "✓ Satisfactory anchoring",
223
- "hall_blocks_title": "Blocks with no anchor in GT:",
224
- "hall_block_label": "Hallucinated block",
225
- "hall_more_blocks": "additional block(s)",
226
- "no_gini": "Gini data not available.",
227
- "no_scatter": "Data not available.",
228
- "total_errors": "Total:",
229
- "errors_classified": "classified errors.",
230
- "class_col": "Class",
231
- "proportion_col": "Proportion",
232
- "taxonomy_engine_label": "Engine:",
233
- },
234
- }
235
 
236
 
237
  def get_labels(lang: str = "fr") -> dict[str, str]:
@@ -246,8 +57,10 @@ def get_labels(lang: str = "fr") -> dict[str, str]:
246
  -------
247
  dict
248
  Labels traduits. Toujours valide : bascule sur ``"fr"`` si lang inconnu.
 
 
249
  """
250
- return TRANSLATIONS.get(lang, TRANSLATIONS["fr"])
251
 
252
 
253
  SUPPORTED_LANGS: list[str] = list(TRANSLATIONS.keys())
 
4
  ------------------
5
  - ``"fr"`` : français (défaut)
6
  - ``"en"`` : anglais patrimonial (heritage English)
7
+
8
+ Since Sprint 17, translations are stored in
9
+ ``picarones/report/i18n/{lang}.json`` and loaded when this module is first imported.
10
+ ``TRANSLATIONS`` is still exposed as a dict for backward compatibility.
11
  """
12
 
13
  from __future__ import annotations
14
 
15
+ import json
16
+ from pathlib import Path
17
+
18
+
19
+ _I18N_DIR = Path(__file__).parent / "report" / "i18n"
20
+
21
+
22
+ def _load_translations() -> dict[str, dict[str, str]]:
23
+     """Load every JSON file in the i18n directory.
24
+
25
+     A ``{lang}.json`` file defines the labels for language ``lang``.
26
+     Always returns a dict; if the directory is missing the dict is empty
27
+     and ``get_labels`` falls back to an empty mapping.
28
+     """
29
+     translations: dict[str, dict[str, str]] = {}
30
+     if not _I18N_DIR.is_dir():
31
+         return translations
32
+     for path in sorted(_I18N_DIR.glob("*.json")):
33
+         lang = path.stem
34
+         try:
35
+             with path.open(encoding="utf-8") as fh:
36
+                 translations[lang] = json.load(fh)
37
+         except (OSError, json.JSONDecodeError) as e:
38
+             import logging
39
+             logging.getLogger(__name__).warning(
40
+                 "[i18n] fichier '%s' ignoré : %s", path, e,
41
+             )
42
+     return translations
43
+
44
+
45
+ TRANSLATIONS: dict[str, dict[str, str]] = _load_translations()
46
 
47
 
48
  def get_labels(lang: str = "fr") -> dict[str, str]:
 
57
  -------
58
  dict
59
  Labels traduits. Toujours valide : bascule sur ``"fr"`` si lang inconnu.
60
+ Si ``"fr"`` lui-même manque, retourne un dict vide (comportement dégradé
61
+ mais non bloquant).
62
  """
63
+ return TRANSLATIONS.get(lang, TRANSLATIONS.get("fr", {}))
64
 
65
 
66
  SUPPORTED_LANGS: list[str] = list(TRANSLATIONS.keys())
picarones/report/generator.py CHANGED
The diff for this file is too large to render. See raw diff
 
picarones/report/i18n/en.json ADDED
@@ -0,0 +1,103 @@
1
+ {
2
+ "bootstrap_note": "95% CI on mean CER per engine (1000 bootstrap iterations).",
3
+ "btn_present": "⊞ Presentation",
4
+ "char_engine_label": "Engine:",
5
+ "class_col": "Class",
6
+ "col_anchor": "Anchor",
7
+ "col_anchor_title": "Anchor score: proportion of output trigrams found in the GT — low score = probable hallucinations (LLM/VLM)",
8
+ "col_cer": "Exact CER",
9
+ "col_cer_diplo": "Diplo. CER",
10
+ "col_cer_diplo_title": "CER after diplomatic normalisation (ſ=s, u=v, i=j…) — measures substantial errors ignoring codified graphical variants",
11
+ "col_cer_max": "Max CER",
12
+ "col_cer_median": "Median CER",
13
+ "col_cer_min": "Min CER",
14
+ "col_diacritics": "Diacritics",
15
+ "col_diacritics_title": "Diacritic preservation rate (accents, cedillas, umlauts…)",
16
+ "col_docs": "Docs",
17
+ "col_engine": "Engine",
18
+ "col_gini": "Gini",
19
+ "col_gini_title": "Gini coefficient of per-line CER errors — 0 = uniform errors, 1 = concentrated errors. A good engine has low CER AND low Gini.",
20
+ "col_ligatures": "Ligatures",
21
+ "col_ligatures_title": "Ligature recognition rate (fi, fl, œ, æ, ff…)",
22
+ "col_mer": "MER",
23
+ "col_overnorm": "Over-norm.",
24
+ "col_overnorm_title": "Class 10 — LLM over-normalisation: rate of correct words degraded by the LLM",
25
+ "col_rank": "#",
26
+ "col_wer": "WER",
27
+ "col_wil": "WIL",
28
+ "corr_engine_label": "Engine:",
29
+ "corr_note": "Pearson coefficient between CER, WER, image quality, ligatures, diacritics. Green = positive correlation, Red = negative.",
30
+ "date_locale": "en-GB",
31
+ "doc_sidebar_header": "Documents",
32
+ "doc_title_default": "Select a document",
33
+ "errors_classified": "classified errors.",
34
+ "footer_by": "by Picarones",
35
+ "footer_generated": "Report generated on",
36
+ "gallery_empty": "No documents match the filters.",
37
+ "gallery_filter_all": "All",
38
+ "gallery_filter_cer_label": "Filter CER >",
39
+ "gallery_filter_engine_label": "Engine:",
40
+ "gallery_sort_best": "Best engine",
41
+ "gallery_sort_cer": "Mean CER",
42
+ "gallery_sort_difficulty": "Difficulty",
43
+ "gallery_sort_id": "Identifier",
44
+ "gallery_sort_label": "Sort by:",
45
+ "gini_cer_ideal": "— ideal: bottom-left",
46
+ "gini_cer_note": "X-axis = mean CER, Y-axis = Gini coefficient. An ideal engine has low CER AND low Gini (rare, uniform errors).",
47
+ "h_bootstrap": "95% Bootstrap Confidence Intervals",
48
+ "h_cer_dist": "CER Distribution by Engine",
49
+ "h_cer_doc": "CER by Document (all engines)",
50
+ "h_characters": "Character Analysis",
51
+ "h_clusters": "Frequent Error Clusters",
52
+ "h_correlation": "Metric Correlation Matrix",
53
+ "h_diff": "OCR Output — diff by engine",
54
+ "h_duration": "Average Execution Time (seconds/document)",
55
+ "h_gallery": "Document Gallery",
56
+ "h_gini_cer": "Gini vs Mean CER",
57
+ "h_gt": "Ground Truth (GT)",
58
+ "h_hallucination": "Hallucination Analysis",
59
+ "h_image": "Original Image",
60
+ "h_line_metrics": "Error Distribution by Line",
61
+ "h_pairwise": "Wilcoxon Tests — pairwise comparisons",
62
+ "h_quality_cer": "Image Quality ↔ CER (scatter plot)",
63
+ "h_radar": "Engine Profile (radar)",
64
+ "h_ranking": "Engine Ranking",
65
+ "h_ratio_anchor": "Length Ratio vs Anchor Score",
66
+ "h_reliability": "Reliability Curves",
67
+ "h_taxonomy": "Error Taxonomy by Engine",
68
+ "h_venn": "Shared / Exclusive Errors (Venn)",
69
+ "hall_block_label": "Hallucinated block",
70
+ "hall_blocks_title": "Blocks with no anchor in GT:",
71
+ "hall_detected": "⚠️ Hallucinations detected",
72
+ "hall_more_blocks": "additional block(s)",
73
+ "hall_ok": "✓ Satisfactory anchoring",
74
+ "heatmap_end": "End",
75
+ "heatmap_mid": "Middle",
76
+ "heatmap_start": "Start",
77
+ "heatmap_title": "HEATMAP (position)",
78
+ "html_lang": "en",
79
+ "lines": "lines",
80
+ "nav_report": "OCR report",
81
+ "no_gini": "Gini data not available.",
82
+ "no_hall_blocks": "No hallucinated blocks detected.",
83
+ "no_hall_metrics": "No hallucination metrics available.",
84
+ "no_line_metrics": "No line metrics available.",
85
+ "no_scatter": "Data not available.",
86
+ "pairwise_note": "Wilcoxon signed-rank test (non-parametric). Threshold α = 0.05.",
87
+ "percentile_title": "CER PERCENTILES",
88
+ "proportion_col": "Proportion",
89
+ "quality_cer_note": "Each point = one document. X-axis = image quality score [0–1]. Y-axis = CER. Negative correlation expected.",
90
+ "radar_note": "Radar axes: CER, WER, MER, WIL — inverted values (higher = better engine).",
91
+ "ratio_anchor_note": "X-axis = trigram anchor score [0–1]. Y-axis = output/GT length ratio. ⚠️ Zone: anchor &lt; 0.5 or ratio &gt; 1.2 → probable hallucinations.",
92
+ "ratio_anchor_subtitle": "— VLM hallucinations",
93
+ "reliability_note": "For the X% easiest documents (sorted by ascending CER), what is the cumulative mean CER? A low curve = engine performing well even on easy documents.",
94
+ "tab_analyses": "Analyses",
95
+ "tab_characters": "Characters",
96
+ "tab_document": "Document",
97
+ "tab_gallery": "Gallery",
98
+ "tab_ranking": "Ranking",
99
+ "taxonomy_engine_label": "Engine:",
100
+ "taxonomy_note": "Distribution of error classes (classes 1–9 of the Picarones taxonomy).",
101
+ "total_errors": "Total:",
102
+ "venn_note": "Intersection of error sets between the 2 or 3 top engines. Shared errors = overlapping segments."
103
+ }
picarones/report/i18n/fr.json ADDED
@@ -0,0 +1,103 @@
+ {
+   "bootstrap_note": "IC à 95% sur le CER moyen par moteur (1000 itérations bootstrap).",
+   "btn_present": "⊞ Présentation",
+   "char_engine_label": "Moteur :",
+   "class_col": "Classe",
+   "col_anchor": "Ancrage",
+   "col_anchor_title": "Score d'ancrage : proportion des trigrammes de la sortie trouvant un ancrage dans le GT — faible score = hallucinations probables (LLM/VLM)",
+   "col_cer": "CER exact",
+   "col_cer_diplo": "CER diplo.",
+   "col_cer_diplo_title": "CER après normalisation diplomatique (ſ=s, u=v, i=j…) — mesure les erreurs substantielles en ignorant les variantes graphiques codifiées",
+   "col_cer_max": "CER max",
+   "col_cer_median": "CER médian",
+   "col_cer_min": "CER min",
+   "col_diacritics": "Diacritiques",
+   "col_diacritics_title": "Taux de conservation des diacritiques (accents, cédilles, trémas…)",
+   "col_docs": "Docs",
+   "col_engine": "Concurrent",
+   "col_gini": "Gini",
+   "col_gini_title": "Coefficient de Gini des erreurs CER par ligne — 0 = erreurs uniformes, 1 = erreurs concentrées. Un bon moteur a CER bas ET Gini bas.",
+   "col_ligatures": "Ligatures",
+   "col_ligatures_title": "Taux de reconnaissance des ligatures (fi, fl, œ, æ, ff…)",
+   "col_mer": "MER",
+   "col_overnorm": "Sur-norm.",
+   "col_overnorm_title": "Classe 10 — Sur-normalisation LLM : taux de mots corrects dégradés par le LLM",
+   "col_rank": "#",
+   "col_wer": "WER",
+   "col_wil": "WIL",
+   "corr_engine_label": "Moteur :",
+   "corr_note": "Coefficient de Pearson entre les métriques CER, WER, qualité image, ligatures, diacritiques. Vert = corrélation positive, Rouge = corrélation négative.",
+   "date_locale": "fr-FR",
+   "doc_sidebar_header": "Documents",
+   "doc_title_default": "Sélectionner un document",
+   "errors_classified": "erreurs classifiées.",
+   "footer_by": "par Picarones",
+   "footer_generated": "Rapport généré le",
+   "gallery_empty": "Aucun document ne correspond aux filtres.",
+   "gallery_filter_all": "Tous",
+   "gallery_filter_cer_label": "Filtrer CER >",
+   "gallery_filter_engine_label": "Moteur :",
+   "gallery_sort_best": "Meilleur moteur",
+   "gallery_sort_cer": "CER moyen",
+   "gallery_sort_difficulty": "Difficulté",
+   "gallery_sort_id": "Identifiant",
+   "gallery_sort_label": "Trier par :",
+   "gini_cer_ideal": "— idéal : bas-gauche",
+   "gini_cer_note": "Axe X = CER moyen, Axe Y = coefficient de Gini. Un moteur idéal a CER bas ET Gini bas (erreurs rares et uniformes).",
+   "h_bootstrap": "Intervalles de confiance à 95 % (bootstrap)",
+   "h_cer_dist": "Distribution du CER par moteur",
+   "h_cer_doc": "CER par document (tous moteurs)",
+   "h_characters": "Analyse des caractères",
+   "h_clusters": "Clustering des patterns d'erreurs",
+   "h_correlation": "Matrice de corrélation entre métriques",
+   "h_diff": "Sorties OCR — diff par moteur",
+   "h_duration": "Temps d'exécution moyen (secondes/document)",
+   "h_gallery": "Galerie des documents",
+   "h_gini_cer": "Gini vs CER moyen",
+   "h_gt": "Vérité terrain (GT)",
+   "h_hallucination": "Analyse des hallucinations",
+   "h_image": "Image originale",
+   "h_line_metrics": "Distribution des erreurs par ligne",
+   "h_pairwise": "Tests de Wilcoxon — comparaisons par paires",
+   "h_quality_cer": "Qualité image ↔ CER (scatter plot)",
+   "h_radar": "Profil des moteurs (radar)",
+   "h_ranking": "Classement des moteurs",
+   "h_ratio_anchor": "Ratio longueur vs ancrage",
+   "h_reliability": "Courbes de fiabilité",
+   "h_taxonomy": "Taxonomie des erreurs par moteur",
+   "h_venn": "Erreurs communes / exclusives (Venn)",
+   "hall_block_label": "Bloc halluciné",
+   "hall_blocks_title": "Blocs sans ancrage dans le GT :",
+   "hall_detected": "⚠️ Hallucinations détectées",
+   "hall_more_blocks": "bloc(s) supplémentaire(s)",
+   "hall_ok": "✓ Ancrage satisfaisant",
+   "heatmap_end": "Fin",
+   "heatmap_mid": "Milieu",
+   "heatmap_start": "Début",
+   "heatmap_title": "CARTE THERMIQUE (position)",
+   "html_lang": "fr",
+   "lines": "lignes",
+   "nav_report": "rapport OCR",
+   "no_gini": "Données Gini non disponibles.",
+   "no_hall_blocks": "Aucun bloc halluciné détecté.",
+   "no_hall_metrics": "Aucune métrique d'hallucination disponible.",
+   "no_line_metrics": "Aucune métrique de ligne disponible.",
+   "no_scatter": "Données non disponibles.",
+   "pairwise_note": "Test signé-rangé de Wilcoxon (non-paramétrique). Seuil α = 0.05.",
+   "percentile_title": "PERCENTILES CER",
+   "proportion_col": "Proportion",
+   "quality_cer_note": "Chaque point = un document. Axe X = score qualité image [0–1]. Axe Y = CER. Corrélation négative attendue.",
+   "radar_note": "Axe radar : CER, WER, MER, WIL — valeurs inversées (plus c'est haut, meilleur est le moteur).",
+   "ratio_anchor_note": "Axe X = score d'ancrage trigrammes [0–1]. Axe Y = ratio longueur sortie/GT. Zone ⚠️ : ancrage &lt; 0.5 ou ratio &gt; 1.2 → hallucinations probables.",
+   "ratio_anchor_subtitle": "— hallucinations VLM",
+   "reliability_note": "Pour les X% documents les plus faciles (triés par CER croissant), quel est le CER moyen cumulé ? Une courbe basse = moteur performant même sur les documents faciles.",
+   "tab_analyses": "Analyses",
+   "tab_characters": "Caractères",
+   "tab_document": "Document",
+   "tab_gallery": "Galerie",
+   "tab_ranking": "Classement",
+   "taxonomy_engine_label": "Moteur :",
+   "taxonomy_note": "Distribution des classes d'erreurs (classes 1–9 de la taxonomie Picarones).",
+   "total_errors": "Total :",
+   "venn_note": "Intersection des ensembles d'erreurs entre les 2 ou 3 premiers concurrents. Erreurs communes = segments partagés."
+ }
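Both locale files are expected to expose the same set of keys, so a key-parity check is one way to catch drift between them. The sketch below assumes the two files have already been parsed into plain objects; `missingKeys` and the two sample objects are hypothetical and only illustrate the idea.

```javascript
// Hypothetical excerpts; a real check would parse fr.json and en.json in full.
const EN_SAMPLE = { tab_ranking: 'Ranking', tab_gallery: 'Gallery', total_errors: 'Total:' };
const FR_SAMPLE = { tab_ranking: 'Classement', total_errors: 'Total :' };

// Return the keys present in `reference` but absent from `candidate`, sorted.
function missingKeys(reference, candidate) {
  return Object.keys(reference)
    .filter(k => !Object.prototype.hasOwnProperty.call(candidate, k))
    .sort();
}

console.log(missingKeys(EN_SAMPLE, FR_SAMPLE)); // keys the French sample lacks
console.log(missingKeys(FR_SAMPLE, EN_SAMPLE)); // keys the English sample lacks
```

Running the check in both directions matters: a key added to only one file shows up in exactly one of the two results.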
picarones/report/templates/_app.js ADDED
@@ -0,0 +1,2085 @@
+ 'use strict';
+
+ // ── Per-engine color palette ─────────────────────────────────────
+ const PALETTE = [
+   '#2563eb','#dc2626','#16a34a','#ca8a04','#7c3aed',
+   '#0891b2','#c2410c','#0f766e','#9333ea','#b45309',
+ ];
+ function engineColor(idx) { return PALETTE[idx % PALETTE.length]; }
+
+ // ── Navigation ──────────────────────────────────────────────────
+ let currentView = 'ranking';
+ function _switchView(name) {
+   document.querySelectorAll('.view').forEach(v => v.classList.remove('active'));
+   document.querySelectorAll('.tab-btn').forEach(b => b.classList.remove('active'));
+   document.getElementById('view-' + name).classList.add('active');
+   // Activate the matching nav tab
+   const tabMap = {ranking:'classement',gallery:'galerie',document:'document',characters:'caract',analyses:'analyses'};
+   const prefix = tabMap[name] || name;
+   document.querySelectorAll('.tab-btn').forEach(b => {
+     if (b.textContent.toLowerCase().startsWith(prefix.toLowerCase())) b.classList.add('active');
+   });
+   currentView = name;
+   if (name === 'analyses' && !chartsBuilt) buildCharts();
+   if (name === 'characters' && !charViewBuilt) initCharView();
+ }
+ function showView(name) {
+   _switchView(name);
+   updateURL(name);
+ }
+
+ // ── Formatting ──────────────────────────────────────────────────
+ function pct(v, d=2) {
+   if (v === null || v === undefined) return '—';
+   return (v * 100).toFixed(d) + ' %';
+ }
+ function cerColor(v) {
+   if (v < 0.05) return '#16a34a';
+   if (v < 0.15) return '#ca8a04';
+   if (v < 0.30) return '#ea580c';
+   return '#dc2626';
+ }
+ function cerBg(v) {
+   if (v < 0.05) return '#dcfce7';
+   if (v < 0.15) return '#fef9c3';
+   if (v < 0.30) return '#ffedd5';
+   return '#fee2e2';
+ }
+ function esc(s) {
+   return String(s)
+     .replace(/&/g,'&amp;').replace(/</g,'&lt;')
+     .replace(/>/g,'&gt;').replace(/"/g,'&quot;');
+ }
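To make the behavior of `esc` concrete, here is a small standalone check. The function body is copied from the diff above; the sample inputs are made up for illustration. Note that `esc` does not touch single quotes, so the escaped value is only safe inside double-quoted attributes, which is how the surrounding code uses it.

```javascript
// Copy of esc() from the diff above, so the snippet runs on its own.
function esc(s) {
  return String(s)
    .replace(/&/g, '&amp;').replace(/</g, '&lt;')
    .replace(/>/g, '&gt;').replace(/"/g, '&quot;');
}

// "&" is replaced first, so the entities produced by later steps are not escaped twice.
console.log(esc('<b>"a" & b</b>'));
// Non-string input is coerced with String() before escaping.
console.log(esc(42));
// Single quotes pass through unchanged.
console.log(esc("it's"));
```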
+
+ // ── Diff renderer ──────────────────────────────────────────────
+ function renderDiff(ops) {
+   if (!ops || !ops.length) return '<em style="color:var(--text-muted)">— aucune sortie —</em>';
+   return ops.map(op => {
+     if (op.op === 'equal')
+       return '<span class="d-eq">' + esc(op.text) + '</span>';
+     if (op.op === 'insert')
+       return '<span class="d-ins" title="Insertion OCR">' + esc(op.text) + '</span>';
+     if (op.op === 'delete')
+       return '<span class="d-del" title="Suppression (présent GT)">' + esc(op.text) + '</span>';
+     if (op.op === 'replace')
+       return '<span class="d-rep-old" title="Remplacement">' + esc(op.old) + '</span>'
+         + '<span class="d-rep-new">' + esc(op.new) + '</span>';
+     return '';
+   }).join(' ');
+ }
+
+ // ── Side-by-side rendering (char-level) ─────────────────────────────
+ function renderSideBySide(docId) {
+   const doc = DATA.documents.find(d => d.doc_id === docId);
+   if (!doc) return;
+
+   const sel = document.getElementById('sbs-engine-dropdown');
+   const engineIdx = sel && sel.value !== '' ? parseInt(sel.value, 10) : 0;
+   const er = doc.engine_results[engineIdx];
+   if (!er) return;
+
+   const ops = er.diff || [];
+
+   // Build the GT (left) and OCR (right) HTML from the same ops
+   let gtHtml = '', ocrHtml = '';
+   ops.forEach(op => {
+     if (op.op === 'equal') {
+       const t = esc(op.text);
+       gtHtml += t;
+       ocrHtml += t;
+     } else if (op.op === 'delete') {
+       // Present in GT, absent from OCR → orange in GT
+       gtHtml += '<span class="d-miss" title="Manquant dans OCR">' + esc(op.text) + '</span>';
+     } else if (op.op === 'insert') {
+       // Present in OCR, absent from GT → green in OCR
+       ocrHtml += '<span class="d-ins-ocr" title="Insertion OCR">' + esc(op.text) + '</span>';
+     } else if (op.op === 'replace') {
+       // Substitution: orange in GT, red in OCR
+       gtHtml += '<span class="d-miss" title="Substitution GT">' + esc(op.old) + '</span>';
+       ocrHtml += '<span class="d-err" title="Différent du GT">' + esc(op.new) + '</span>';
+     }
+   });
+
+   document.getElementById('sbs-gt-body').innerHTML = gtHtml || '<em style="color:var(--text-muted)">—</em>';
+   document.getElementById('sbs-ocr-body').innerHTML = ocrHtml || '<em style="color:var(--text-muted)">Aucune sortie</em>';
+
+   // OCR header: engine name + CER
+   const c = cerColor(er.cer); const bg = cerBg(er.cer);
+   document.getElementById('sbs-ocr-engine-name').textContent = er.engine;
+   const cerBadgeEl = document.getElementById('sbs-ocr-cer');
+   cerBadgeEl.textContent = pct(er.cer);
+   cerBadgeEl.style.cssText = `color:${c};background:${bg};display:inline-block`;
+
+   // Pipeline triple diff (when applicable)
+   const tripleEl = document.getElementById('sbs-triple-diff');
+   if (er.ocr_intermediate) {
+     const ocrDiffHtml = renderDiff(er.ocr_diff);
+     const llmDiffHtml = renderDiff(er.llm_correction_diff);
+     const isPipeline = er.ocr_intermediate !== undefined;
+     const modeLabel = {text_only:'texte seul', text_and_image:'image+texte', zero_shot:'zero-shot'}[er.pipeline_mode] || '';
+     const pipeTag = `<span class="pipeline-tag">⛓ ${modeLabel || 'pipeline'}</span>`;
+     let onBadge = '';
+     if (er.over_normalization) {
+       const on = er.over_normalization;
+       const onPct = (on.score * 100).toFixed(2);
+       const cls = on.score > 0.05 ? 'over-norm-badge high' : 'over-norm-badge';
+       onBadge = `<span class="${cls}" title="Classe 10 — sur-normalisation LLM">Sur-norm. ${onPct}%</span>`;
+     }
+     let diplomaBadge = '';
+     if (er.cer_diplomatic !== null && er.cer_diplomatic !== undefined) {
+       const dipC = cerColor(er.cer_diplomatic); const dipB = cerBg(er.cer_diplomatic);
+       const delta = er.cer - er.cer_diplomatic;
+       const deltaHint = delta > 0.001 ? ` (−${(delta*100).toFixed(1)}% avec normalisation)` : '';
+       diplomaBadge = `<span class="cer-badge" style="color:${dipC};background:${dipB};opacity:.85"
+         title="CER diplomatique${deltaHint}">diplo. ${pct(er.cer_diplomatic)}</span>`;
+     }
+     tripleEl.style.display = '';
+     tripleEl.innerHTML = `
+       <div style="margin-top:.75rem;padding-top:.75rem;border-top:1px solid var(--border)">
+         <div style="display:flex;align-items:center;gap:.4rem;margin-bottom:.5rem;font-size:.83rem;font-weight:600">
+           ${pipeTag} ${diplomaBadge} ${onBadge}
+           <span class="badge" style="background:#f1f5f9">WER ${pct(er.wer)}</span>
+         </div>
+         <div class="triple-diff-wrap">
+           <div class="triple-diff-section">
+             <h5>GT → OCR brut</h5>
+             ${ocrDiffHtml || '<em style="color:var(--text-muted)">—</em>'}
+           </div>
+           <div class="triple-diff-section">
+             <h5>OCR brut → Correction LLM</h5>
+             ${llmDiffHtml || '<em style="color:var(--text-muted)">—</em>'}
+           </div>
+         </div>
+       </div>`;
+   } else {
+     // Show WER / diplomatic CER even outside a pipeline
+     let diplomaBadge = '';
+     if (er.cer_diplomatic !== null && er.cer_diplomatic !== undefined) {
+       const dipC = cerColor(er.cer_diplomatic); const dipB = cerBg(er.cer_diplomatic);
+       const delta = er.cer - er.cer_diplomatic;
+       const deltaHint = delta > 0.001 ? ` (−${(delta*100).toFixed(1)}% avec normalisation)` : '';
+       diplomaBadge = `<span class="cer-badge" style="color:${dipC};background:${dipB};opacity:.85"
+         title="CER diplomatique${deltaHint}">diplo. ${pct(er.cer_diplomatic)}</span>`;
+     }
+     const errBadge = er.error ? `<span class="badge" style="background:#fee2e2;color:#dc2626">Erreur</span>` : '';
+     if (diplomaBadge || errBadge) {
+       tripleEl.style.display = '';
+       tripleEl.innerHTML = `<div style="margin-top:.5rem;display:flex;gap:.4rem;flex-wrap:wrap;font-size:.82rem">
+         <span class="badge" style="background:#f1f5f9">WER ${pct(er.wer)}</span>
+         ${diplomaBadge} ${errBadge}
+       </div>`;
+     } else {
+       tripleEl.style.display = 'none';
+       tripleEl.innerHTML = '';
+     }
+   }
+ }
+
+ // ── Score badge (ligatures / diacritics) ─────────────────────────
+ function _scoreBadge(v, label) {
+   if (v === null || v === undefined) return '<span style="color:var(--text-muted)">—</span>';
+   const pctVal = (v * 100).toFixed(1);
+   const color = v >= 0.9 ? '#16a34a' : v >= 0.7 ? '#ca8a04' : '#dc2626';
+   const bg = v >= 0.9 ? '#f0fdf4' : v >= 0.7 ? '#fefce8' : '#fef2f2';
+   return `<span class="cer-badge" style="color:${color};background:${bg}" title="${label} : ${pctVal}%">${pctVal}%</span>`;
+ }
+
+ // ── Ranking view ────────────────────────────────────────────────
+ let rankingSort = { col: 'cer', dir: 'asc' };
+
+ function renderRanking() {
+   const engines = [...DATA.engines];
+   // Sort
+   engines.sort((a, b) => {
+     let va = a[rankingSort.col], vb = b[rankingSort.col];
+     if (typeof va === 'string') va = va.toLowerCase();
+     if (typeof vb === 'string') vb = vb.toLowerCase();
+     // Treat both null and undefined as missing values
+     if (va == null) va = Infinity;
+     if (vb == null) vb = Infinity;
+     // A comparator must return 0 for equal values to stay consistent
+     if (va === vb) return 0;
+     return rankingSort.dir === 'asc' ? (va > vb ? 1 : -1) : (va < vb ? 1 : -1);
+   });
+
+   const tbody = document.getElementById('ranking-tbody');
+   tbody.innerHTML = engines.map((e, i) => {
+     const rank = i + 1;
+     const badgeClass = rank === 1 ? 'rank-badge rank-1' : 'rank-badge';
+     const cerC = cerColor(e.cer); const cerB = cerBg(e.cer);
+     const barW = Math.min(100, e.cer * 100 * 3);
+
+     // Pipeline badge
+     let pipelineBadge = '';
+     let pipelineStepsHtml = '';
+     if (e.is_pipeline && e.pipeline_info) {
+       const pi = e.pipeline_info;
+       const modeLabel = {text_only:'texte', text_and_image:'image+texte', zero_shot:'zero-shot'}[pi.pipeline_mode] || pi.pipeline_mode || '';
+       pipelineBadge = `<span class="pipeline-tag" title="Pipeline OCR+LLM — mode ${modeLabel}">
+         ⛓ pipeline<span class="pipe-arrow">·${modeLabel}</span></span>`;
+       if (pi.pipeline_steps) {
+         pipelineStepsHtml = `<div class="pipeline-steps">` +
+           pi.pipeline_steps.map(s => s.type === 'ocr'
+             ? `<span class="step-chip ocr">OCR: ${esc(s.engine)}</span>`
+             : `<span class="step-chip llm">LLM: ${esc(s.model)}</span>`
+           ).join(`<span class="step-arrow">→</span>`) +
+           `</div>`;
+       }
+     }
+
+     // Over-normalization (class 10)
+     let overNormCell = '<td style="color:var(--text-muted)">—</td>';
+     if (e.is_pipeline && e.pipeline_info && e.pipeline_info.over_normalization) {
+       const on = e.pipeline_info.over_normalization;
+       const onPct = (on.score * 100).toFixed(2);
+       const cls = on.score > 0.05 ? 'over-norm-badge high' : 'over-norm-badge';
+       overNormCell = `<td><span class="${cls}" title="Classe 10 — ${on.over_normalized_count} mots corrects dégradés sur ${on.total_correct_ocr_words}">${onPct} %</span></td>`;
+     }
+
+     // Diplomatic CER
+     let diploCerCell = '<td style="color:var(--text-muted)">—</td>';
+     if (e.cer_diplomatic !== null && e.cer_diplomatic !== undefined) {
+       const dipC = cerColor(e.cer_diplomatic); const dipB = cerBg(e.cer_diplomatic);
+       const delta = e.cer - e.cer_diplomatic;
+       const deltaStr = delta > 0.001 ? ` <span style="font-size:.65rem;color:#059669">-${(delta*100).toFixed(1)}%</span>` : '';
+       const profileHint = e.cer_diplomatic_profile ? ` title="Profil : ${esc(e.cer_diplomatic_profile)}"` : '';
+       diploCerCell = `<td${profileHint}>
+         <span class="cer-badge" style="color:${dipC};background:${dipB}">${pct(e.cer_diplomatic)}</span>${deltaStr}
+       </td>`;
+     }
+
+     // ── Sprint 10: Gini + anchoring ──────────────────────────────────
+     let giniCell = '<td style="color:var(--text-muted)">—</td>';
+     if (e.gini !== null && e.gini !== undefined) {
+       const gv = e.gini;
+       const gColor = gv < 0.3 ? '#16a34a' : gv < 0.5 ? '#ca8a04' : '#dc2626';
+       const gBg = gv < 0.3 ? '#f0fdf4' : gv < 0.5 ? '#fefce8' : '#fef2f2';
+       giniCell = `<td><span class="cer-badge" style="color:${gColor};background:${gBg}"
+         title="Gini=${gv.toFixed(3)} — 0=uniforme, 1=concentré">${gv.toFixed(3)}</span></td>`;
+     }
+     let anchorCell = '<td style="color:var(--text-muted)">—</td>';
+     if (e.anchor_score !== null && e.anchor_score !== undefined) {
+       const av = e.anchor_score;
+       const hallBadge = (e.hallucinating_doc_rate && e.hallucinating_doc_rate > 0.2)
+         ? ' <span title="Hallucinations détectées">⚠️</span>' : '';
+       anchorCell = `<td>${_scoreBadge(av, 'Ancrage trigrammes')}${hallBadge}</td>`;
+     }
+
+     return `<tr>
+       <td><span class="${badgeClass}">${rank}</span></td>
+       <td>
+         <span class="engine-name">${esc(e.name)}</span>
+         ${pipelineBadge}
+         ${e.is_vlm ? '<span class="pipeline-tag" style="background:#fce7f3;color:#9d174d">👁 VLM</span>' : ''}
+         <span class="engine-version">v${esc(e.version)}</span>
+         ${pipelineStepsHtml}
+       </td>
+       <td>
+         <span class="bar" style="width:${barW}px;background:${cerC}"></span>
+         <span class="cer-badge" style="color:${cerC};background:${cerB}">${pct(e.cer)}</span>
+       </td>
+       ${diploCerCell}
+       <td>${pct(e.wer)}</td>
+       <td>${pct(e.mer)}</td>
+       <td>${pct(e.wil)}</td>
+       <td>${_scoreBadge(e.ligature_score, 'Ligatures')}</td>
+       <td>${_scoreBadge(e.diacritic_score, 'Diacritiques')}</td>
+       ${giniCell}
+       ${anchorCell}
+       <td style="color:var(--text-muted)">${pct(e.cer_median)}</td>
+       <td style="color:var(--text-muted)">${pct(e.cer_min)}</td>
+       <td style="color:var(--text-muted)">${pct(e.cer_max)}</td>
+       ${overNormCell}
+       <td><span class="pill">${e.doc_count}</span></td>
+     </tr>`;
+   }).join('');
+
+   // Global stats
+   const pipelineCount = DATA.engines.filter(e => e.is_pipeline).length;
+   const totalDocs = DATA.meta.document_count;
+   const exclCount = EXCLUDED_DOCS.size;
+   const activeDocs = totalDocs - exclCount;
+   const stats = document.getElementById('ranking-stats');
+   stats.innerHTML = `
+     <div class="stat">Corpus <b>${esc(DATA.meta.corpus_name)}</b></div>
+     <div class="stat">Documents <b>${activeDocs}</b>${exclCount > 0 ? ` <span style="font-size:.75rem;color:#dc2626">(−${exclCount} exclu${exclCount>1?'s':''})</span>` : ''}</div>
+     <div class="stat">Concurrents <b>${DATA.engines.length}</b>
+       ${pipelineCount ? `<span class="pipeline-tag" style="margin-left:.3rem">${pipelineCount} pipeline${pipelineCount>1?'s':''}</span>` : ''}
+     </div>
+   `;
+ }
+
+ // Sort on header click
+ document.querySelectorAll('#ranking-table th.sortable').forEach(th => {
+   th.addEventListener('click', () => {
+     const col = th.dataset.col;
+     if (rankingSort.col === col) {
+       rankingSort.dir = rankingSort.dir === 'asc' ? 'desc' : 'asc';
+     } else {
+       rankingSort.col = col;
+       rankingSort.dir = 'asc';
+     }
+     document.querySelectorAll('#ranking-table th').forEach(t => {
+       t.classList.remove('sorted');
+       const icon = t.querySelector('.sort-icon');
+       if (icon) icon.textContent = '↕';
+     });
+     th.classList.add('sorted');
+     const icon = th.querySelector('.sort-icon');
+     if (icon) icon.textContent = rankingSort.dir === 'asc' ? '↑' : '↓';
+     renderRanking();
+   });
+ });
+
+ // ── Global exclusion system ─────────────────────────────────────
+ // Union of all exclusion sources (manual + hallucination toggles)
+ const EXCLUDED_DOCS = new Set();
+ const _manualExclusions = new Set();
+ const _hallucinationExclusions = new Set();
+
+ // Original data saved for recalculation
+ const _originalEngines = JSON.parse(JSON.stringify(DATA.engines));
+
+ function _updateExcludedDocs() {
+   EXCLUDED_DOCS.clear();
+   _manualExclusions.forEach(id => EXCLUDED_DOCS.add(id));
+   _hallucinationExclusions.forEach(id => EXCLUDED_DOCS.add(id));
+   _updateExclusionBanner();
+ }
+
+ function _updateExclusionBanner() {
+   const banner = document.getElementById('global-exclusion-banner');
+   const text = document.getElementById('global-exclusion-text');
+   if (EXCLUDED_DOCS.size > 0) {
+     banner.style.display = '';
+     text.textContent = EXCLUDED_DOCS.size + ' document' + (EXCLUDED_DOCS.size > 1 ? 's' : '') +
+       ' exclu' + (EXCLUDED_DOCS.size > 1 ? 's' : '') + ' de l\'analyse' +
+       (_manualExclusions.size > 0 ? ' (' + _manualExclusions.size + ' manuel' + (_manualExclusions.size > 1 ? 's' : '') + ')' : '') +
+       (_hallucinationExclusions.size > 0 ? ' (' + _hallucinationExclusions.size + ' hallucination' + (_hallucinationExclusions.size > 1 ? 's' : '') + ')' : '');
+   } else {
+     banner.style.display = 'none';
+   }
+ }
+
+ function resetAllExclusions() {
+   _manualExclusions.clear();
+   _hallucinationExclusions.clear();
+   EXCLUDED_DOCS.clear();
+   _updateExclusionBanner();
+   // Reset hallucination toggles
+   ['robust-cer-toggle','robust-anchor-toggle','robust-ratio-toggle'].forEach(id => {
+     const btn = document.getElementById(id);
+     if (btn) { btn.dataset.active = 'true'; btn.textContent = '✓'; btn.closest('label').classList.remove('criterion-off'); }
+   });
+   document.getElementById('robust-cer').value = 100;
+   document.getElementById('robust-cer-val').textContent = '100%';
+   document.getElementById('robust-anchor').value = 0.5;
+   document.getElementById('robust-anchor-val').textContent = '0.50';
+   document.getElementById('robust-ratio').value = 1.5;
+   document.getElementById('robust-ratio-val').textContent = '1.5';
+   recalculateAll();
+   renderGallery();
+ }
+
+ function _recalcEngineMetrics() {
+   // Recompute each engine's aggregate metrics, leaving out EXCLUDED_DOCS
+   DATA.engines.forEach((eng, idx) => {
+     const orig = _originalEngines[idx];
+     if (EXCLUDED_DOCS.size === 0) {
+       // Restore the original values
+       eng.cer = orig.cer;
+       eng.wer = orig.wer;
+       eng.mer = orig.mer;
+       eng.wil = orig.wil;
+       eng.cer_median = orig.cer_median;
+       eng.cer_min = orig.cer_min;
+       eng.cer_max = orig.cer_max;
+       eng.cer_values = orig.cer_values.slice();
+       eng.doc_count = orig.doc_count;
+       eng.gini = orig.gini;
+       eng.anchor_score = orig.anchor_score;
+       eng.length_ratio = orig.length_ratio;
+       eng.hallucinating_doc_rate = orig.hallucinating_doc_rate;
+       return;
+     }
+     // Recompute from the documents that are not excluded
+     const cerVals = [], werVals = [], merVals = [], wilVals = [];
+     const giniVals = [], anchorVals = [];
+     DATA.documents.forEach(doc => {
+       if (EXCLUDED_DOCS.has(doc.doc_id)) return;
+       const er = doc.engine_results.find(r => r.engine === eng.name);
+       if (!er || er.error) return;
+       if (er.cer !== null) cerVals.push(er.cer);
+       if (er.wer !== null) werVals.push(er.wer);
+       if (er.mer !== null) merVals.push(er.mer);
+       if (er.wil !== null) wilVals.push(er.wil);
+       const lm = er.line_metrics;
+       if (lm && lm.gini !== null) giniVals.push(lm.gini);
+       const hm = er.hallucination_metrics;
+       if (hm && hm.anchor_score !== null) anchorVals.push(hm.anchor_score);
+     });
+     const mean = arr => arr.length ? arr.reduce((a,b) => a+b, 0) / arr.length : 0;
+     const sorted = arr => [...arr].sort((a,b) => a - b);
+     const median = arr => {
+       if (!arr.length) return 0;
+       const s = sorted(arr); const n = s.length;
+       return n % 2 === 0 ? (s[n/2-1] + s[n/2]) / 2 : s[Math.floor(n/2)];
+     };
+     eng.cer = cerVals.length ? mean(cerVals) : orig.cer;
+     eng.wer = werVals.length ? mean(werVals) : orig.wer;
+     eng.mer = merVals.length ? mean(merVals) : orig.mer;
+     eng.wil = wilVals.length ? mean(wilVals) : orig.wil;
+     eng.cer_median = cerVals.length ? median(cerVals) : orig.cer_median;
+     eng.cer_min = cerVals.length ? Math.min(...cerVals) : orig.cer_min;
+     eng.cer_max = cerVals.length ? Math.max(...cerVals) : orig.cer_max;
+     eng.cer_values = cerVals;
+     eng.doc_count = cerVals.length;
+     eng.gini = giniVals.length ? mean(giniVals) : orig.gini;
+     eng.anchor_score = anchorVals.length ? mean(anchorVals) : orig.anchor_score;
+   });
+ }
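The recalculation above boils down to filtering documents by the exclusion set and re-averaging what remains. A reduced sketch of that idea follows, limited to the mean CER of one engine. The `documents` array and the `meanCer` helper are hypothetical; the field names (`doc_id`, `engine_results`, `engine`, `cer`, `error`) follow the code above. Unlike `_recalcEngineMetrics`, this sketch returns `null` rather than the original value when nothing is left to average.

```javascript
// Hypothetical miniature of DATA.documents; field names follow the code above.
const documents = [
  { doc_id: 'd1', engine_results: [{ engine: 'A', cer: 0.25 }] },
  { doc_id: 'd2', engine_results: [{ engine: 'A', cer: 0.5 }] },
  { doc_id: 'd3', engine_results: [{ engine: 'A', cer: 0.75 }] },
];

// Mean CER of one engine over the documents that are not in `excluded` (a Set of doc ids).
function meanCer(docs, engineName, excluded) {
  const vals = [];
  docs.forEach(doc => {
    if (excluded.has(doc.doc_id)) return;
    const er = doc.engine_results.find(r => r.engine === engineName);
    if (!er || er.error || er.cer == null) return;
    vals.push(er.cer);
  });
  return vals.length ? vals.reduce((a, b) => a + b, 0) / vals.length : null;
}

console.log(meanCer(documents, 'A', new Set()));        // all three documents
console.log(meanCer(documents, 'A', new Set(['d3'])));  // d3 left out
```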
+
+ function recalculateAll() {
+   console.log('[Picarones] recalculateAll — EXCLUDED_DOCS:', [...EXCLUDED_DOCS]);
+   _recalcEngineMetrics();
+   renderRanking();
+   renderRobustMetrics();
+   // Rebuild charts if they were already built
+   if (chartsBuilt) {
+     chartsBuilt = false;
+     Object.keys(chartInstances).forEach(id => destroyChart(id));
+     buildCharts();
+   }
+ }
+
+ // ── Robust metrics ──────────────────────────────────────────────
+
+ function _computeHallucinationExclusions() {
+   // Recompute _hallucinationExclusions from the toggles/sliders
+   _hallucinationExclusions.clear();
+   const cerOn = document.getElementById('robust-cer-toggle').dataset.active === 'true';
+   const anchorOn = document.getElementById('robust-anchor-toggle').dataset.active === 'true';
+   const ratioOn = document.getElementById('robust-ratio-toggle').dataset.active === 'true';
+   const cerThreshold = parseInt(document.getElementById('robust-cer').value, 10) / 100;
+   const anchorThreshold = parseFloat(document.getElementById('robust-anchor').value);
+   const ratioThreshold = parseFloat(document.getElementById('robust-ratio').value);
+
+   DATA.documents.forEach(doc => {
+     // A doc is excluded for hallucination if AT LEAST one engine flags it as problematic
+     const dominated = doc.engine_results.some(er => {
+       if (!er || er.error) return false;
+       const hm = er.hallucination_metrics;
+       if (cerOn && cerThreshold < 1.0 && er.cer !== null && er.cer > cerThreshold) return true;
+       // Skip missing anchor scores: null would otherwise compare as 0 and always match
+       if (anchorOn && hm && hm.anchor_score != null && hm.anchor_score < anchorThreshold) return true;
+       if (ratioOn && hm && hm.length_ratio > ratioThreshold) return true;
+       return false;
+     });
+     if (dominated) _hallucinationExclusions.add(doc.doc_id);
+   });
+   console.log('[Picarones] _hallucinationExclusions:', [..._hallucinationExclusions]);
+   _updateExcludedDocs();
+ }
+
+ function _robustStat(arr) {
+   // Returns {mean, median, p90, p95}, or null if the array is empty
+   if (!arr.length) return null;
+   const sorted = [...arr].sort((a, b) => a - b);
+   const n = sorted.length;
+   const mean = sorted.reduce((a, b) => a + b, 0) / n;
+   const median = n % 2 === 0 ? (sorted[n/2-1] + sorted[n/2]) / 2 : sorted[Math.floor(n/2)];
+   const p90 = sorted[Math.min(Math.ceil(n * 0.9) - 1, n - 1)];
+   const p95 = sorted[Math.min(Math.ceil(n * 0.95) - 1, n - 1)];
+   return { mean, median, p90, p95 };
+ }
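As a worked example of the percentile rule used here (the value at index `ceil(n * p) - 1` of the sorted array, often called nearest-rank), the snippet below runs a copy of the function on the integers 1 to 10. The function is renamed `robustStat` and the input is made up, purely so the snippet runs on its own.

```javascript
// Standalone copy of _robustStat() from the diff above.
function robustStat(arr) {
  if (!arr.length) return null;
  const sorted = [...arr].sort((a, b) => a - b);
  const n = sorted.length;
  const mean = sorted.reduce((a, b) => a + b, 0) / n;
  const median = n % 2 === 0 ? (sorted[n / 2 - 1] + sorted[n / 2]) / 2 : sorted[Math.floor(n / 2)];
  // Nearest-rank percentiles: take the element at index ceil(n * p) - 1
  const p90 = sorted[Math.min(Math.ceil(n * 0.9) - 1, n - 1)];
  const p95 = sorted[Math.min(Math.ceil(n * 0.95) - 1, n - 1)];
  return { mean, median, p90, p95 };
}

// Unsorted input on purpose: the function sorts a copy and leaves the caller's array alone.
const sample = [10, 3, 7, 1, 9, 2, 8, 4, 6, 5];
console.log(robustStat(sample));
console.log(robustStat([]));
```

With n = 10, p90 picks index 8 (the value 9) and p95 picks index 9 (the value 10), so on small samples the two percentiles sit on or next to the maximum.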
+
+ function _deltaCell(globalVal, robustVal) {
+   if (robustVal === null || globalVal === null) return '—';
+   const delta = robustVal - globalVal;
+   const cls = delta < -0.001 ? 'color:#16a34a' : delta > 0.001 ? 'color:#dc2626' : 'color:var(--text-muted)';
+   const sign = delta >= 0 ? '+' : '';
+   return `<span style="${cls}">${sign}${(delta*100).toFixed(2)}%</span>`;
+ }
+
+ function toggleRobustCriterion(id, btn) {
+   const active = btn.dataset.active !== 'true';
+   btn.dataset.active = active ? 'true' : 'false';
+   btn.textContent = active ? '✓' : '✕';
+   btn.closest('label').classList.toggle('criterion-off', !active);
+   _computeHallucinationExclusions();
+   recalculateAll();
+ }
508
+
509
+ function renderRobustMetrics() {
510
+ const cerOn = document.getElementById('robust-cer-toggle').dataset.active === 'true';
511
+ const anchorOn = document.getElementById('robust-anchor-toggle').dataset.active === 'true';
512
+ const ratioOn = document.getElementById('robust-ratio-toggle').dataset.active === 'true';
513
+ const cerThreshold = parseInt(document.getElementById('robust-cer').value, 10) / 100;
514
+ const anchorThreshold = parseFloat(document.getElementById('robust-anchor').value);
515
+ const ratioThreshold = parseFloat(document.getElementById('robust-ratio').value);
516
+ const totalDocs = DATA.documents.length;
517
+
518
+ // For each engine: recompute the metrics, excluding problematic docs
519
+ const results = DATA.engines.map(eng => {
520
+ const excluded = [];
521
+ const cerVals = [], werVals = [], merVals = [], wilVals = [], giniVals = [], anchorVals = [];
522
+
523
+ DATA.documents.forEach(doc => {
524
+ const er = doc.engine_results.find(r => r.engine === eng.name);
525
+ if (!er || er.error) return;
526
+ const hm = er.hallucination_metrics;
527
+ const lm = er.line_metrics;
528
+
529
+ // Exclusion reasons
530
+ const reasons = [];
531
+ if (cerOn && cerThreshold < 1.0 && er.cer !== null && er.cer > cerThreshold)
532
+ reasons.push(`CER ${(er.cer*100).toFixed(1)}% > ${(cerThreshold*100).toFixed(0)}%`);
533
+ if (anchorOn && hm && hm.anchor_score < anchorThreshold)
534
+ reasons.push(`ancrage ${hm.anchor_score.toFixed(3)} < ${anchorThreshold.toFixed(2)}`);
535
+ if (ratioOn && hm && hm.length_ratio > ratioThreshold)
536
+ reasons.push(`ratio ${hm.length_ratio.toFixed(2)} > ${ratioThreshold.toFixed(1)}`);
537
+ if (_manualExclusions.has(doc.doc_id))
538
+ reasons.push('exclusion manuelle');
539
+
540
+ if (reasons.length > 0) {
541
+ excluded.push({
542
+ doc_id: doc.doc_id,
543
+ cer: er.cer,
544
+ anchor: hm ? hm.anchor_score : undefined,
545
+ ratio: hm ? hm.length_ratio : undefined,
546
+ reasons,
547
+ });
548
+ } else {
549
+ if (er.cer !== null) cerVals.push(er.cer);
550
+ if (er.wer !== null) werVals.push(er.wer);
551
+ if (er.mer !== null) merVals.push(er.mer);
552
+ if (er.wil !== null) wilVals.push(er.wil);
553
+ if (lm && lm.gini !== null) giniVals.push(lm.gini);
554
+ if (hm && hm.anchor_score !== null) anchorVals.push(hm.anchor_score);
555
+ }
556
+ });
557
+
558
+ const meanOf = arr => arr.length ? arr.reduce((a,b)=>a+b,0)/arr.length : null;
559
+ return {
560
+ name: eng.name,
561
+ global_cer: eng.cer,
562
+ global_wer: eng.wer,
563
+ global_mer: eng.mer,
564
+ global_wil: eng.wil,
565
+ robust_cer: _robustStat(cerVals),
566
+ robust_wer: meanOf(werVals),
567
+ robust_mer: meanOf(merVals),
568
+ robust_wil: meanOf(wilVals),
569
+ robust_gini: meanOf(giniVals),
570
+ robust_anchor: meanOf(anchorVals),
571
+ robust_docs: cerVals.length,
572
+ excluded_count: excluded.length,
573
+ excluded_docs: excluded,
574
+ };
575
+ });
576
+
577
+ // Summary: number of distinct docs excluded (by at least one engine)
578
+ const allExcludedIds = new Set(results.flatMap(r => r.excluded_docs.map(d => d.doc_id)));
579
+ const countExcl = allExcludedIds.size;
580
+ const countIncl = totalDocs - countExcl;
581
+ const summaryEl = document.getElementById('robust-summary');
582
+ summaryEl.textContent = countExcl === 0
583
+ ? `Aucun document exclu — métriques calculées sur ${totalDocs} documents.`
584
+ : `${countExcl} doc${countExcl>1?'s':''} exclu${countExcl>1?'s':''} sur ${totalDocs} — métriques robustes calculées sur ${countIncl} document${countIncl>1?'s':''}.`;
585
+
586
+ if (!results.some(r => r.robust_cer !== null)) {
587
+ document.getElementById('robust-table-wrap').innerHTML =
588
+ '<p style="color:var(--text-muted);font-size:.82rem">Aucune donnée disponible pour ce corpus.</p>';
589
+ document.getElementById('robust-excluded-docs').innerHTML = '';
590
+ return;
591
+ }
592
+
593
+ // Extended comparison table
594
+ const fmt = v => v !== null ? pct(v) : '—';
595
+ const rows = results.map(r => {
596
+ const rs = r.robust_cer;
597
+ const robCerMean = rs ? rs.mean : null;
598
+ return `<tr>
599
+ <td style="font-weight:600;white-space:nowrap">${esc(r.name)}</td>
600
+ <td style="text-align:center">${fmt(r.global_cer)}</td>
601
+ <td style="text-align:center">${rs ? pct(rs.mean) : '—'}</td>
602
+ <td style="text-align:center">${_deltaCell(r.global_cer, robCerMean)}</td>
603
+ <td style="text-align:center;color:var(--text-muted)">${rs ? pct(rs.median) : '—'}</td>
604
+ <td style="text-align:center;color:var(--text-muted)">${rs ? pct(rs.p90) : '—'}</td>
605
+ <td style="text-align:center;color:var(--text-muted)">${rs ? pct(rs.p95) : '—'}</td>
606
+ <td style="text-align:center">${fmt(r.global_wer)}</td>
607
+ <td style="text-align:center">${fmt(r.robust_wer)}</td>
608
+ <td style="text-align:center">${_deltaCell(r.global_wer, r.robust_wer)}</td>
609
+ <td style="text-align:center">${fmt(r.global_mer)}</td>
610
+ <td style="text-align:center">${fmt(r.robust_mer)}</td>
611
+ <td style="text-align:center">${fmt(r.global_wil)}</td>
612
+ <td style="text-align:center">${fmt(r.robust_wil)}</td>
613
+ <td style="text-align:center;color:var(--text-muted)">${r.robust_gini !== null ? r.robust_gini.toFixed(3) : '—'}</td>
614
+ <td style="text-align:center;color:var(--text-muted)">${r.robust_anchor !== null ? r.robust_anchor.toFixed(3) : '—'}</td>
615
+ <td style="text-align:center;color:var(--text-muted)">${r.excluded_count} / ${r.robust_docs}</td>
616
+ </tr>`;
617
+ }).join('');
618
+
619
+ const thStyle = 'padding:.35rem .5rem;font-size:.75rem;white-space:nowrap;text-align:center;border-bottom:1px solid var(--border)';
620
+ const thStyleL = thStyle + ';text-align:left';
621
+ document.getElementById('robust-table-wrap').innerHTML = `
622
+ <div style="overflow-x:auto">
623
+ <table style="width:100%;border-collapse:collapse;font-size:.82rem">
624
+ <thead>
625
+ <tr style="background:var(--bg)">
626
+ <th style="${thStyleL}">Moteur</th>
627
+ <th colspan="3" style="${thStyle};border-left:2px solid var(--border)">— CER —</th>
628
+ <th colspan="3" style="${thStyle}">— CER robuste détail —</th>
629
+ <th colspan="3" style="${thStyle};border-left:2px solid var(--border)">— WER —</th>
630
+ <th colspan="2" style="${thStyle};border-left:2px solid var(--border)">— MER —</th>
631
+ <th colspan="2" style="${thStyle};border-left:2px solid var(--border)">— WIL —</th>
632
+ <th style="${thStyle};border-left:2px solid var(--border)">Gini rob.</th>
633
+ <th style="${thStyle}">Ancrage rob.</th>
634
+ <th style="${thStyle}">Excl./Incl.</th>
635
+ </tr>
636
+ <tr style="background:var(--bg)">
637
+ <th style="${thStyleL}"></th>
638
+ <th style="${thStyle};border-left:2px solid var(--border)">Global</th>
639
+ <th style="${thStyle}">Robuste</th>
640
+ <th style="${thStyle}">Δ</th>
641
+ <th style="${thStyle}">Médiane</th>
642
+ <th style="${thStyle}">P90</th>
643
+ <th style="${thStyle}">P95</th>
644
+ <th style="${thStyle};border-left:2px solid var(--border)">Global</th>
645
+ <th style="${thStyle}">Robuste</th>
646
+ <th style="${thStyle}">Δ</th>
647
+ <th style="${thStyle};border-left:2px solid var(--border)">Global</th>
648
+ <th style="${thStyle}">Robuste</th>
649
+ <th style="${thStyle};border-left:2px solid var(--border)">Global</th>
650
+ <th style="${thStyle}">Robuste</th>
651
+ <th style="${thStyle};border-left:2px solid var(--border)"></th>
652
+ <th style="${thStyle}"></th>
653
+ <th style="${thStyle}"></th>
654
+ </tr>
655
+ </thead>
656
+ <tbody>${rows}</tbody>
657
+ </table>
658
+ </div>`;
659
+
660
+ // Excluded documents: unified collapsible list
661
+ if (allExcludedIds.size > 0) {
662
+ // Collect info per doc_id (union of the reasons from all engines)
663
+ const docInfoMap = new Map();
664
+ results.forEach(r => {
665
+ r.excluded_docs.forEach(d => {
666
+ if (!docInfoMap.has(d.doc_id)) {
667
+ docInfoMap.set(d.doc_id, { doc_id: d.doc_id, cer: d.cer, anchor: d.anchor, ratio: d.ratio, reasons: new Set() });
668
+ }
669
+ d.reasons.forEach(reason => docInfoMap.get(d.doc_id).reasons.add(reason));
670
+ });
671
+ });
672
+ const uniqDocs = [...docInfoMap.values()].sort((a,b) => a.doc_id.localeCompare(b.doc_id));
673
+ document.getElementById('robust-excluded-docs').innerHTML =
674
+ `<details><summary style="cursor:pointer;font-size:.82rem;color:var(--text-muted)">` +
675
+ `▶ Documents exclus (${uniqDocs.length})</summary>` +
676
+ `<ul style="margin:.4rem 0 0 1rem;font-size:.8rem;color:var(--text-muted);max-height:220px;overflow-y:auto">` +
677
+ uniqDocs.map(d => {
678
+ const cerStr = d.cer !== null && d.cer !== undefined ? ` CER ${(d.cer*100).toFixed(1)}%` : '';
679
+ return `<li><a href="#" onclick="openDocument('${esc(d.doc_id)}');return false">${esc(d.doc_id)}</a>${cerStr} — ${[...d.reasons].join(', ')}</li>`;
680
+ }).join('') +
681
+ `</ul></details>`;
682
+ } else {
683
+ document.getElementById('robust-excluded-docs').innerHTML = '';
684
+ }
685
+ }
686
+
687
+ // ── Gallery view ────────────────────────────────────────────────
688
+ function toggleGalleryExclusion(docId, checked) {
689
+ if (checked) {
690
+ _manualExclusions.delete(docId);
691
+ } else {
692
+ _manualExclusions.add(docId);
693
+ }
694
+ _updateExcludedDocs();
695
+ _updateGalleryExclusionUI();
696
+ }
697
+
698
+ function resetGalleryExclusions() {
699
+ _manualExclusions.clear();
700
+ _updateExcludedDocs();
701
+ renderGallery();
702
+ recalculateAll();
703
+ }
704
+
705
+ function _updateGalleryExclusionUI() {
706
+ const count = _manualExclusions.size;
707
+ const btn = document.getElementById('gallery-reset-btn');
708
+ const info = document.getElementById('gallery-exclusion-info');
709
+ if (count > 0) {
710
+ btn.style.display = '';
711
+ info.style.display = '';
712
+ info.textContent = `${count} document${count>1?'s':''} exclu${count>1?'s':''} manuellement de l'analyse.`;
713
+ } else {
714
+ btn.style.display = 'none';
715
+ info.style.display = 'none';
716
+ }
717
+ recalculateAll();
718
+ }
719
+
720
+ function renderGallery() {
721
+ const sortKey = document.getElementById('gallery-sort').value;
722
+ const filterCer = parseFloat(document.getElementById('gallery-filter-cer').value) / 100 || 0;
723
+ const filterEngine = document.getElementById('gallery-engine-select').value;
724
+
725
+ let docs = [...DATA.documents];
726
+
727
+ // CER filter
728
+ if (filterCer > 0) {
729
+ docs = docs.filter(d => {
730
+ if (filterEngine) {
731
+ const er = d.engine_results.find(r => r.engine === filterEngine);
732
+ return er && er.cer >= filterCer;
733
+ }
734
+ return d.mean_cer >= filterCer;
735
+ });
736
+ }
737
+
738
+ // Sort
739
+ docs.sort((a, b) => {
740
+ if (sortKey === 'mean_cer') return a.mean_cer - b.mean_cer;
741
+ if (sortKey === 'difficulty_score') return (b.difficulty_score||0) - (a.difficulty_score||0);
742
+ if (sortKey === 'best_engine') return a.best_engine.localeCompare(b.best_engine);
743
+ return a.doc_id.localeCompare(b.doc_id);
744
+ });
745
+
746
+ const grid = document.getElementById('gallery-grid');
747
+ const empty = document.getElementById('gallery-empty');
748
+
749
+ if (!docs.length) {
750
+ grid.innerHTML = '';
751
+ empty.style.display = '';
752
+ return;
753
+ }
754
+ empty.style.display = 'none';
755
+
756
+ // Update the reset button
757
+ const btn = document.getElementById('gallery-reset-btn');
758
+ const info = document.getElementById('gallery-exclusion-info');
759
+ if (_manualExclusions.size > 0) {
760
+ btn.style.display = '';
761
+ info.style.display = '';
762
+ info.textContent = `${_manualExclusions.size} document${_manualExclusions.size>1?'s':''} exclu${_manualExclusions.size>1?'s':''} manuellement de l'analyse.`;
763
+ } else {
764
+ btn.style.display = 'none';
765
+ info.style.display = 'none';
766
+ }
767
+
768
+ grid.innerHTML = docs.map(doc => {
769
+ const imgTag = doc.image_b64
770
+ ? `<img src="${doc.image_b64}" alt="${esc(doc.doc_id)}" loading="lazy">`
771
+ : `<div class="img-placeholder">🖹</div>`;
772
+
773
+ const badges = doc.engine_results.map(er => {
774
+ const c = cerColor(er.cer); const bg = cerBg(er.cer);
775
+ const isPipe = er.ocr_intermediate !== undefined;
776
+ const label = isPipe ? '⛓' + er.engine.slice(0,8) : er.engine.slice(0,8);
777
+ return `<span class="engine-cer-badge" style="color:${c};background:${bg}"
778
+ title="${esc(er.engine)}${isPipe?' (pipeline)':''}">${esc(label)} ${pct(er.cer,1)}</span>`;
779
+ }).join('');
780
+
781
+ // Difficulty badge
782
+ let diffBadge = '';
783
+ if (doc.difficulty_score !== undefined) {
784
+ const dScore = doc.difficulty_score;
785
+ const dColor = dScore < 0.25 ? '#16a34a' : dScore < 0.5 ? '#ca8a04' : dScore < 0.75 ? '#ea580c' : '#dc2626';
786
+ const dBg = dScore < 0.25 ? '#f0fdf4' : dScore < 0.5 ? '#fefce8' : dScore < 0.75 ? '#fff7ed' : '#fef2f2';
787
+ diffBadge = `<span class="diff-badge" style="color:${dColor};background:${dBg};margin-left:.3rem"
788
+ title="Difficulté intrinsèque : ${doc.difficulty_label}">⚡ ${doc.difficulty_label}</span>`;
789
+ }
790
+
791
+ const isExcluded = _manualExclusions.has(doc.doc_id);
792
+ const checkboxId = `gal-chk-${doc.doc_id.replace(/[^a-z0-9]/gi,'_')}`;
793
+ const cardStyle = isExcluded ? 'opacity:.5;border:2px dashed #dc2626' : '';
794
+ return `<div class="gallery-card" style="${cardStyle}">
795
+ <label class="gallery-exclude-label" title="${isExcluded ? 'Inclure dans l\'analyse' : 'Exclure de l\'analyse'}"
796
+ style="position:absolute;top:.35rem;right:.35rem;z-index:2;cursor:pointer;background:rgba(255,255,255,.85);border-radius:.25rem;padding:.1rem .25rem;font-size:.7rem;display:flex;align-items:center;gap:.25rem">
797
+ <input type="checkbox" id="${checkboxId}" ${isExcluded ? '' : 'checked'}
798
+ onchange="toggleGalleryExclusion('${esc(doc.doc_id)}',this.checked)"
799
+ onclick="event.stopPropagation()">
800
+ <span>${isExcluded ? 'Exclu' : 'Inclus'}</span>
801
+ </label>
802
+ <div onclick="openDocument('${esc(doc.doc_id)}')">
803
+ ${imgTag}
804
+ <div class="gallery-card-body">
805
+ <div class="gallery-card-title">${esc(doc.doc_id)}${diffBadge}</div>
806
+ <div class="gallery-card-badges">${badges}</div>
807
+ </div>
808
+ </div>
809
+ </div>`;
810
+ }).join('');
811
+ }
812
+
813
+ // ── Document view ───────────────────────────────────────────────
814
+ let currentDocId = null;
815
+ let zoomLevel = 1;
816
+ let dragStart = null;
817
+ let imgOffset = { x: 0, y: 0 };
818
+
819
+ function openDocument(docId) {
820
+ _switchView('document');
821
+ updateURL('document', { doc: docId });
822
+ loadDocument(docId);
823
+ }
824
+
825
+ function loadDocument(docId) {
826
+ const doc = DATA.documents.find(d => d.doc_id === docId);
827
+ if (!doc) return;
828
+ currentDocId = docId;
829
+
830
+ // Sidebar: highlight the active document
831
+ document.querySelectorAll('.doc-list-item').forEach(el => {
832
+ el.classList.toggle('active', el.dataset.docId === docId);
833
+ });
834
+
835
+ // Title
836
+ document.getElementById('doc-detail-title').textContent = doc.doc_id;
837
+
838
+ // Metrics
839
+ const metricsDiv = document.getElementById('doc-detail-metrics');
840
+ const cer = doc.mean_cer;
841
+ const dScore = doc.difficulty_score;
842
+ const dColor = dScore < 0.25 ? '#16a34a' : dScore < 0.5 ? '#ca8a04' : dScore < 0.75 ? '#ea580c' : '#dc2626';
843
+ const dLabel = doc.difficulty_label || '';
844
+ metricsDiv.innerHTML = `<div class="stat">CER moyen <b style="color:${cerColor(cer)}">${pct(cer)}</b></div>
845
+ <div class="stat">Meilleur moteur <b>${esc(doc.best_engine)}</b></div>
846
+ ${dScore !== undefined ? `<div class="stat">Difficulté <b style="color:${dColor}">${dLabel} (${(dScore*100).toFixed(0)}%)</b></div>` : ''}`;
847
+
848
+ // Image
849
+ resetZoom();
850
+ const img = document.getElementById('doc-image');
851
+ const placeholder = document.getElementById('doc-image-placeholder');
852
+ if (doc.image_b64) {
853
+ img.src = doc.image_b64;
854
+ img.style.display = '';
855
+ placeholder.style.display = 'none';
856
+ } else {
857
+ img.style.display = 'none';
858
+ placeholder.style.display = '';
859
+ placeholder.innerHTML = `<span style="font-size:2rem">🖹</span><span>${esc(doc.image_path)}</span>`;
860
+ }
861
+
862
+ // Side-by-side diff: selector for the competing engine
863
+ const selWrap = document.getElementById('sbs-engine-select');
864
+ const sel = document.getElementById('sbs-engine-dropdown');
865
+ if (doc.engine_results.length > 1) {
866
+ sel.innerHTML = doc.engine_results.map((er, i) =>
867
+ `<option value="${i}">${esc(er.engine)}</option>`
868
+ ).join('');
869
+ selWrap.style.display = '';
870
+ } else {
871
+ sel.innerHTML = '';
872
+ selWrap.style.display = 'none';
873
+ }
874
+ renderSideBySide(docId);
875
+
876
+ // ── Sprint 10: per-line CER distribution ────────────────────────────
877
+ const lineCard = document.getElementById('doc-line-metrics-card');
878
+ const lineContent = document.getElementById('doc-line-metrics-content');
879
+ // Take the first engine that has line_metrics
880
+ const erWithLine = doc.engine_results.find(er => er.line_metrics);
881
+ if (erWithLine && erWithLine.line_metrics) {
882
+ lineCard.style.display = '';
883
+ lineContent.innerHTML = renderLineMetrics(doc.engine_results);
884
+ } else {
885
+ lineCard.style.display = 'none';
886
+ }
887
+
888
+ // ── Sprint 10: hallucinations ───────────────────────────────────────
889
+ const hallCard = document.getElementById('doc-hallucination-card');
890
+ const hallContent = document.getElementById('doc-hallucination-content');
891
+ const erWithHall = doc.engine_results.find(er => er.hallucination_metrics && er.hallucination_metrics.is_hallucinating);
892
+ if (erWithHall || doc.engine_results.some(er => er.hallucination_metrics)) {
893
+ hallCard.style.display = '';
894
+ hallContent.innerHTML = renderHallucinationPanel(doc.engine_results);
895
+ } else {
896
+ hallCard.style.display = 'none';
897
+ }
898
+ }
899
+
900
+ // ── Sprint 10: rendering of the per-line CER distribution ───────
901
+ function renderLineMetrics(engineResults) {
902
+ const heatmapColors = (v) => {
903
+ if (v < 0.05) return '#86efac';
904
+ if (v < 0.15) return '#fde68a';
905
+ if (v < 0.30) return '#fb923c';
906
+ return '#f87171';
907
+ };
908
+
909
+ return engineResults.filter(er => er.line_metrics).map(er => {
910
+ const lm = er.line_metrics;
911
+ const c = cerColor(er.cer); const bg = cerBg(er.cer);
912
+
913
+ // Position heatmap
914
+ const heatmap = lm.heatmap || [];
915
+ const maxHeat = Math.max(...heatmap, 0.01);
916
+ const heatmapHtml = heatmap.length > 0
917
+ ? `<div class="heatmap-wrap">` +
918
+ heatmap.map((v, i) => {
919
+ const h = Math.max(4, Math.round(60 * v / maxHeat));
920
+ return `<div class="heatmap-bar" style="height:${h}px;background:${heatmapColors(v)}"
921
+ title="Tranche ${i+1}/${heatmap.length} — CER=${(v*100).toFixed(1)}%"></div>`;
922
+ }).join('') +
923
+ `</div><div class="heatmap-labels"><span>${I18N.heatmap_start||'Début'}</span><span>${I18N.heatmap_mid||'Milieu'}</span><span>${I18N.heatmap_end||'Fin'}</span></div>`
924
+ : '<em style="color:var(--text-muted)">—</em>';
925
+
926
+ // Percentiles
927
+ const p = lm.percentiles || {};
928
+ const pctBars = ['p50','p75','p90','p95','p99'].map(k => {
929
+ const v = p[k] || 0;
930
+ const w = Math.min(100, v * 100 * 2);
931
+ const fillColor = v < 0.15 ? '#86efac' : v < 0.30 ? '#fde68a' : '#f87171';
932
+ return `<div class="pct-bar-row">
933
+ <span class="pct-bar-label">${k}</span>
934
+ <div class="pct-bar-track"><div class="pct-bar-fill" style="width:${w}%;background:${fillColor}"></div></div>
935
+ <span class="pct-bar-val">${(v*100).toFixed(1)}%</span>
936
+ </div>`;
937
+ }).join('');
938
+
939
+ // Catastrophic-line rates
940
+ const cr = lm.catastrophic_rate || {};
941
+ const crRows = Object.entries(cr).map(([t, rate]) => {
942
+ const tPct = (parseFloat(t)*100).toFixed(0);
943
+ const ratePct = (rate*100).toFixed(1);
944
+ const color = rate < 0.05 ? '#16a34a' : rate < 0.15 ? '#ca8a04' : '#dc2626';
945
+ return `<span class="stat"><b style="color:${color}">${ratePct}%</b> lignes CER&gt;${tPct}%</span>`;
946
+ }).join('');
947
+
948
+ // Gini
949
+ const gini = lm.gini != null ? lm.gini.toFixed(3) : '—'; // != null also guards against a null gini
950
+ const giniColor = lm.gini < 0.3 ? '#16a34a' : lm.gini < 0.5 ? '#ca8a04' : '#dc2626';
951
+
952
+ return `<div style="margin-bottom:1.25rem;padding-bottom:1rem;border-bottom:1px solid var(--border)">
953
+ <div style="display:flex;align-items:center;gap:.5rem;margin-bottom:.6rem">
954
+ <strong>${esc(er.engine)}</strong>
955
+ <span class="cer-badge" style="color:${c};background:${bg}">${pct(er.cer)}</span>
956
+ <span class="stat">Gini <b style="color:${giniColor}">${gini}</b></span>
957
+ <span class="stat">${lm.line_count} ${I18N.lines||'lignes'}</span>
958
+ ${crRows}
959
+ </div>
960
+ <div style="display:grid;grid-template-columns:1fr 1fr;gap:1rem">
961
+ <div>
962
+ <div style="font-size:.75rem;font-weight:600;color:var(--text-muted);margin-bottom:.3rem">${I18N.heatmap_title||'CARTE THERMIQUE (position)'}</div>
963
+ ${heatmapHtml}
964
+ </div>
965
+ <div>
966
+ <div style="font-size:.75rem;font-weight:600;color:var(--text-muted);margin-bottom:.3rem">${I18N.percentile_title||'PERCENTILES CER'}</div>
967
+ <div class="pct-bars">${pctBars}</div>
968
+ </div>
969
+ </div>
970
+ </div>`;
971
+ }).join('') || `<em style="color:var(--text-muted)">${I18N.no_line_metrics||'Aucune métrique de ligne disponible.'}</em>`;
972
+ }
973
+
974
+ // ── Sprint 10: rendering of the hallucination panel ─────────────
975
+ function renderHallucinationPanel(engineResults) {
976
+ const withHall = engineResults.filter(er => er.hallucination_metrics);
977
+ if (!withHall.length) return `<em style="color:var(--text-muted)">${I18N.no_hall_metrics||"Aucune métrique d'hallucination disponible."}</em>`;
978
+
979
+ return withHall.map(er => {
980
+ const hm = er.hallucination_metrics;
981
+ const isHall = hm.is_hallucinating;
982
+ const badgeClass = isHall ? 'hallucination-badge' : 'hallucination-badge ok';
983
+ const badgeLabel = isHall ? (I18N.hall_detected||'⚠️ Hallucinations détectées') : (I18N.hall_ok||'✓ Ancrage satisfaisant');
984
+
985
+ const blocksHtml = hm.hallucinated_blocks && hm.hallucinated_blocks.length > 0
986
+ ? hm.hallucinated_blocks.slice(0, 5).map(b =>
987
+ `<div class="halluc-block">
988
+ <div class="halluc-block-meta">${I18N.hall_block_label||'Bloc halluciné'} — ${b.length} mots (tokens ${b.start_token}–${b.end_token})</div>
989
+ ${esc(b.text)}
990
+ </div>`
991
+ ).join('') +
992
+ (hm.hallucinated_blocks.length > 5 ? `<div style="font-size:.72rem;color:var(--text-muted);margin-top:.25rem">… ${hm.hallucinated_blocks.length - 5} ${I18N.hall_more_blocks||'bloc(s) supplémentaire(s)'}</div>` : '')
993
+ : `<em style="color:var(--text-muted);font-size:.8rem">${I18N.no_hall_blocks||'Aucun bloc halluciné détecté.'}</em>`;
994
+
995
+ return `<div style="margin-bottom:1.25rem;padding-bottom:1rem;border-bottom:1px solid var(--border)">
996
+ <div style="display:flex;align-items:center;gap:.5rem;margin-bottom:.6rem;flex-wrap:wrap">
997
+ <strong>${esc(er.engine)}</strong>
998
+ <span class="${badgeClass}">${badgeLabel}</span>
999
+ <span class="stat">Ancrage <b>${(hm.anchor_score*100).toFixed(1)}%</b></span>
1000
+ <span class="stat">Ratio longueur <b>${hm.length_ratio.toFixed(2)}</b></span>
1001
+ <span class="stat">Insertion nette <b>${(hm.net_insertion_rate*100).toFixed(1)}%</b></span>
1002
+ <span class="stat">${hm.gt_word_count} mots GT / ${hm.hyp_word_count} mots sortie</span>
1003
+ </div>
1004
+ ${isHall ? `<div style="margin-bottom:.5rem;font-size:.82rem;font-weight:600;color:#9d174d">${I18N.hall_blocks_title||'Blocs sans ancrage dans le GT :'}</div>` : ''}
1005
+ ${isHall ? blocksHtml : ''}
1006
+ </div>`;
1007
+ }).join('');
1008
+ }
1009
+
1010
+ // ── Sprint 10: scatter plot of Gini vs mean CER ─────────────────
1011
+ function buildGiniCerScatter() {
1012
+ const canvas = document.getElementById('chart-gini-cer');
1013
+ if (!canvas) return;
1014
+ const pts = DATA.gini_vs_cer || [];
1015
+ if (!pts.length) {
1016
+ canvas.parentElement.innerHTML = `<p style="color:var(--text-muted);padding:1rem">${I18N.no_gini||'Données Gini non disponibles.'}</p>`;
1017
+ return;
1018
+ }
1019
+ const datasets = pts.map((p, i) => ({
1020
+ label: p.engine,
1021
+ data: [{ x: p.cer * 100, y: p.gini }],
1022
+ backgroundColor: engineColor(DATA.engines.findIndex(e => e.name === p.engine)) + 'cc',
1023
+ borderColor: engineColor(DATA.engines.findIndex(e => e.name === p.engine)),
1024
+ borderWidth: p.is_pipeline ? 2 : 1,
1025
+ pointRadius: p.is_pipeline ? 9 : 7,
1026
+ pointStyle: p.is_pipeline ? 'triangle' : 'circle',
1027
+ }));
1028
+
1029
+ chartInstances['gini-cer'] = new Chart(canvas.getContext('2d'), {
1030
+ type: 'scatter',
1031
+ data: { datasets },
1032
+ options: {
1033
+ responsive: true, maintainAspectRatio: false,
1034
+ plugins: {
1035
+ legend: { position: 'top', labels: { font: { size: 11 } } },
1036
+ tooltip: { callbacks: {
1037
+ label: ctx => `${ctx.dataset.label}: CER=${ctx.parsed.x.toFixed(2)}%, Gini=${ctx.parsed.y.toFixed(3)}`,
1038
+ } },
1039
+ },
1040
+ scales: {
1041
+ x: { min: 0, title: { display: true, text: 'CER moyen (%)', font: { size: 11 } } },
1042
+ y: { min: 0, max: 1, title: { display: true, text: 'Coefficient de Gini', font: { size: 11 } } },
1043
+ },
1044
+ },
1045
+ });
1046
+ }
1047
+
1048
+ // ── Sprint 10: scatter plot of length ratio vs anchor score ─────
1049
+ function buildRatioAnchorScatter() {
1050
+ const canvas = document.getElementById('chart-ratio-anchor');
1051
+ if (!canvas) return;
1052
+ const pts = DATA.ratio_vs_anchor || [];
1053
+ if (!pts.length) {
1054
+ canvas.parentElement.innerHTML = `<p style="color:var(--text-muted);padding:1rem">Données d'ancrage non disponibles.</p>`;
1055
+ return;
1056
+ }
1057
+
1058
+ // Danger zone (anchor < 0.5 OR ratio > 1.2), drawn via a plugin
1059
+ const datasets = pts.map((p, i) => ({
1060
+ label: p.engine + (p.is_vlm ? ' 👁' : ''),
1061
+ data: [{ x: p.anchor_score, y: p.length_ratio }],
1062
+ backgroundColor: engineColor(DATA.engines.findIndex(e => e.name === p.engine)) + 'cc',
1063
+ borderColor: engineColor(DATA.engines.findIndex(e => e.name === p.engine)),
1064
+ borderWidth: p.is_vlm ? 3 : 1,
1065
+ pointRadius: p.is_vlm ? 10 : 7,
1066
+ pointStyle: p.is_vlm ? 'star' : 'circle',
1067
+ }));
1068
+
1069
+ chartInstances['ratio-anchor'] = new Chart(canvas.getContext('2d'), {
1070
+ type: 'scatter',
1071
+ data: { datasets },
1072
+ options: {
1073
+ responsive: true, maintainAspectRatio: false,
1074
+ plugins: {
1075
+ legend: { position: 'top', labels: { font: { size: 11 } } },
1076
+ tooltip: { callbacks: {
1077
+ label: ctx => `${ctx.dataset.label}: ancrage=${(ctx.parsed.x*100).toFixed(1)}%, ratio=${ctx.parsed.y.toFixed(2)}`,
1078
+ } },
1079
+ },
1080
+ scales: {
1081
+ x: { min: 0, max: 1, title: { display: true, text: "Score d'ancrage [0–1]", font: { size: 11 } } },
1082
+ y: { min: 0, title: { display: true, text: 'Ratio longueur (sortie/GT)', font: { size: 11 } } },
1083
+ },
1084
+ },
1085
+ plugins: [{
1086
+ id: 'danger-zones',
1087
+ beforeDraw(chart) {
1088
+ const { ctx: c, chartArea: { left, top, right, bottom }, scales: { x, y } } = chart;
1089
+ c.save();
1090
+ // Anchor < 0.5 (left)
1091
+ const xHalf = x.getPixelForValue(0.5);
1092
+ c.fillStyle = 'rgba(239,68,68,0.07)';
1093
+ c.fillRect(left, top, xHalf - left, bottom - top);
1094
+ // Ratio > 1.2 (top)
1095
+ const y12 = y.getPixelForValue(1.2);
1096
+ if (y12 > top) {
1097
+ c.fillRect(left, top, right - left, y12 - top);
1098
+ }
1099
+ // Threshold lines
1100
+ c.strokeStyle = 'rgba(239,68,68,0.35)'; c.lineWidth = 1; c.setLineDash([4,4]);
1101
+ c.beginPath(); c.moveTo(xHalf, top); c.lineTo(xHalf, bottom); c.stroke();
1102
+ if (y12 > top) {
1103
+ c.beginPath(); c.moveTo(left, y12); c.lineTo(right, y12); c.stroke();
1104
+ }
1105
+ c.restore();
1106
+ },
1107
+ }],
1108
+ });
1109
+ }
1110
+
1111
+ function buildDocList() {
1112
+ const list = document.getElementById('doc-list');
1113
+ list.innerHTML = DATA.documents.map(doc => {
1114
+ const c = cerColor(doc.mean_cer); const bg = cerBg(doc.mean_cer);
1115
+ return `<div class="doc-list-item" data-doc-id="${esc(doc.doc_id)}"
1116
+ onclick="loadDocument('${esc(doc.doc_id)}')">
1117
+ <span class="doc-list-label">${esc(doc.doc_id)}</span>
1118
+ <span class="doc-list-cer" style="color:${c};background:${bg}">${pct(doc.mean_cer,1)}</span>
1119
+ </div>`;
1120
+ }).join('');
1121
+ if (DATA.documents.length) loadDocument(DATA.documents[0].doc_id);
1122
+ }
1123
+
1124
+ // Zoom
1125
+ function handleZoom(e) {
1126
+ e.preventDefault();
1127
+ zoom(e.deltaY < 0 ? 1.15 : 0.87);
1128
+ }
1129
+ function zoom(factor) {
1130
+ zoomLevel = Math.max(0.5, Math.min(5, zoomLevel * factor));
1131
+ applyZoom();
1132
+ }
1133
+ function resetZoom() {
1134
+ zoomLevel = 1; imgOffset = { x: 0, y: 0 };
1135
+ applyZoom();
1136
+ }
1137
+ function applyZoom() {
1138
+ const img = document.getElementById('doc-image');
1139
+ img.style.transform = `scale(${zoomLevel}) translate(${imgOffset.x}px, ${imgOffset.y}px)`;
1140
+ }
1141
+ function startDrag(e) {
1142
+ if (zoomLevel <= 1) return;
1143
+ dragStart = { x: e.clientX - imgOffset.x * zoomLevel, y: e.clientY - imgOffset.y * zoomLevel };
1144
+ document.getElementById('doc-image-wrap').style.cursor = 'grabbing';
1145
+ }
1146
+ function doDrag(e) {
1147
+ if (!dragStart) return;
1148
+ imgOffset.x = (e.clientX - dragStart.x) / zoomLevel;
1149
+ imgOffset.y = (e.clientY - dragStart.y) / zoomLevel;
1150
+ applyZoom();
1151
+ }
1152
+ function endDrag() {
1153
+ dragStart = null;
1154
+ document.getElementById('doc-image-wrap').style.cursor = zoomLevel > 1 ? 'grab' : 'zoom-in';
1155
+ }
1156
+
1157
+ // ── Charts ──────────────────────────────────────────────────────
1158
+ let chartsBuilt = false;
1159
+ let chartInstances = {};
1160
+
1161
+ function destroyChart(id) {
1162
+ if (chartInstances[id]) { chartInstances[id].destroy(); delete chartInstances[id]; }
1163
+ }
1164
+
1165
+ function buildCharts() {
1166
+ if (chartsBuilt) return;
1167
+ chartsBuilt = true;
1168
+ buildCerHistogram();
1169
+ buildRadar();
1170
+ buildCerPerDoc();
1171
+ buildDurationChart();
1172
+ buildQualityCerScatter();
1173
+ buildTaxonomyChart();
1174
+ // Sprint 7
1175
+ buildReliabilityCurves();
1176
+ buildBootstrapCIChart();
1177
+ buildVennDiagram();
1178
+ buildWilcoxonTable();
1179
+ buildErrorClusters();
1180
+ initCorrelationMatrix();
1181
+ // Sprint 10
1182
+ buildGiniCerScatter();
1183
+ buildRatioAnchorScatter();
1184
+ }
1185
+
1186
+ function buildCerHistogram() {
1187
+ destroyChart('cer-hist');
1188
+ const ctx = document.getElementById('chart-cer-hist').getContext('2d');
1189
+ // Build a histogram with fixed bins [0-5, 5-10, 10-20, 20-30, 30-50, 50+] (in % CER)
1190
+ const bins = [0, 0.05, 0.10, 0.20, 0.30, 0.50, Infinity]; // open-ended last bin: CER above 100% still counts as '>50%'
1191
+ const labels = ['0–5%', '5–10%', '10–20%', '20–30%', '30–50%', '>50%'];
1192
+ const colors = ['#16a34a','#65a30d','#ca8a04','#ea580c','#dc2626','#9f1239'];
1193
+
1194
+ const datasets = DATA.engines.map((e, ei) => {
1195
+ const counts = new Array(labels.length).fill(0);
1196
+ e.cer_values.forEach(v => {
1197
+ for (let i = 0; i < bins.length - 1; i++) {
1198
+ if (v >= bins[i] && v < bins[i+1]) { counts[i]++; break; }
1199
+ }
1200
+ });
1201
+ return {
1202
+ label: e.name, data: counts,
1203
+ backgroundColor: engineColor(ei) + 'aa',
1204
+ borderColor: engineColor(ei),
1205
+ borderWidth: 1,
1206
+ };
1207
+ });
1208
+
1209
+ chartInstances['cer-hist'] = new Chart(ctx, {
1210
+ type: 'bar',
1211
+ data: { labels, datasets },
1212
+ options: {
1213
+ responsive: true, maintainAspectRatio: false,
1214
+ plugins: { legend: { position: 'top', labels: { font: { size: 11 } } } },
1215
+ scales: {
1216
+ x: { title: { display: true, text: 'Plage CER', font: { size: 11 } } },
1217
+ y: { title: { display: true, text: 'Nombre de documents', font: { size: 11 } },
1218
+ ticks: { stepSize: 1 } },
1219
+ },
1220
+ },
1221
+ });
1222
+ }
1223
+
1224
+ function buildRadar() {
1225
+ destroyChart('radar');
1226
+ const ctx = document.getElementById('chart-radar').getContext('2d');
1227
+ // Axes: CER, WER, MER, WIL, inverted (1 - value, so higher is better)
1228
+ const metrics = ['CER', 'WER', 'MER', 'WIL'];
1229
+ const keys = ['cer', 'wer', 'mer', 'wil'];
1230
+ const datasets = DATA.engines.map((e, i) => {
1231
+ const data = keys.map(k => Math.max(0, (1 - (e[k] || 0)) * 100));
1232
+ return {
1233
+ label: e.name, data,
1234
+ backgroundColor: engineColor(i) + '33',
1235
+ borderColor: engineColor(i),
1236
+ borderWidth: 2,
1237
+ pointRadius: 4,
1238
+ pointHoverRadius: 6,
1239
+ };
1240
+ });
1241
+
1242
+ chartInstances['radar'] = new Chart(ctx, {
1243
+ type: 'radar',
1244
+ data: { labels: metrics, datasets },
1245
+ options: {
1246
+ responsive: true, maintainAspectRatio: false,
1247
+ plugins: { legend: { position: 'top', labels: { font: { size: 11 } } } },
1248
+ scales: {
1249
+ r: {
1250
+ min: 0, max: 100,
1251
+ ticks: { stepSize: 20, font: { size: 10 } },
1252
+ pointLabels: { font: { size: 12, weight: 'bold' } },
1253
+ },
1254
+ },
1255
+ },
1256
+ });
1257
+ }
1258
+
1259
+ function buildCerPerDoc() {
1260
+ destroyChart('cer-doc');
1261
+ const ctx = document.getElementById('chart-cer-doc').getContext('2d');
1262
+ const filteredDocs = DATA.documents.filter(d => !EXCLUDED_DOCS.has(d.doc_id));
1263
+ const labels = filteredDocs.map(d => d.doc_id);
1264
+ const datasets = DATA.engines.map((e, ei) => {
1265
+ const data = filteredDocs.map(doc => {
1266
+ const er = doc.engine_results.find(r => r.engine === e.name);
1267
+ return er && er.cer != null ? er.cer * 100 : null; // a missing CER must not be plotted as 0%
1268
+ });
1269
+ return {
1270
+ label: e.name, data,
1271
+ borderColor: engineColor(ei),
1272
+ backgroundColor: engineColor(ei) + '22',
1273
+ tension: 0.3, fill: false,
1274
+ pointRadius: 3, pointHoverRadius: 5,
1275
+ };
1276
+ });
1277
+
1278
+ chartInstances['cer-doc'] = new Chart(ctx, {
1279
+ type: 'line',
1280
+ data: { labels, datasets },
1281
+ options: {
1282
+ responsive: true, maintainAspectRatio: false,
1283
+ plugins: { legend: { position: 'top', labels: { font: { size: 11 } } } },
1284
+ scales: {
1285
+ x: { ticks: { maxRotation: 45, font: { size: 10 } } },
1286
+ y: { title: { display: true, text: 'CER (%)', font: { size: 11 } }, min: 0 },
1287
+ },
1288
+ },
1289
+ });
1290
+ }
1291
+
1292
+ function buildDurationChart() {
1293
+ destroyChart('duration');
1294
+ const ctx = document.getElementById('chart-duration').getContext('2d');
1295
+
1296
+ const filteredDocs = DATA.documents.filter(d => !EXCLUDED_DOCS.has(d.doc_id));
1297
+ const labels = DATA.engines.map(e => e.name);
1298
+ const data = DATA.engines.map(e => {
1299
+ const durs = filteredDocs.flatMap(d => d.engine_results
1300
+ .filter(r => r.engine === e.name)
1301
+ .map(r => r.duration));
1302
+ const mean = durs.length ? durs.reduce((a,b) => a+b, 0) / durs.length : 0;
1303
+ return parseFloat(mean.toFixed(3));
1304
+ });
1305
+
1306
+ chartInstances['duration'] = new Chart(ctx, {
1307
+ type: 'bar',
1308
+ data: {
1309
+ labels,
1310
+ datasets: [{
1311
+ label: 'Durée moy. (s)',
1312
+ data,
1313
+ backgroundColor: DATA.engines.map((_, i) => engineColor(i) + 'aa'),
1314
+ borderColor: DATA.engines.map((_, i) => engineColor(i)),
1315
+ borderWidth: 1,
1316
+ }],
1317
+ },
1318
+ options: {
1319
+ responsive: true, maintainAspectRatio: false,
1320
+ plugins: { legend: { display: false } },
1321
+ scales: {
1322
+ y: { title: { display: true, text: 'Secondes', font: { size: 11 } }, min: 0 },
1323
+ },
1324
+ },
1325
+ });
1326
+ }
1327
+
1328
+ function buildQualityCerScatter() {
1329
+ const ctx = document.getElementById('chart-quality-cer');
1330
+ if (!ctx) return;
1331
+ const filteredDocs = DATA.documents.filter(d => !EXCLUDED_DOCS.has(d.doc_id));
1332
+ // Build the points: one per document, one dataset per engine
1333
+ const datasets = DATA.engines.map((e, ei) => {
1334
+ const points = filteredDocs.flatMap(doc => {
1335
+ const er = doc.engine_results.find(r => r.engine === e.name);
1336
+ if (!er || er.error || !er.image_quality) return [];
1337
+ return [{ x: er.image_quality.quality_score, y: er.cer * 100 }];
1338
+ });
1339
+ return {
1340
+ label: e.name, data: points,
1341
+ backgroundColor: engineColor(ei) + 'bb',
1342
+ borderColor: engineColor(ei),
1343
+ borderWidth: 1, pointRadius: 5, pointHoverRadius: 7,
1344
+ };
1345
+ }).filter(d => d.data.length > 0);
1346
+
1347
+ if (!datasets.length) { ctx.parentElement.innerHTML = '<p style="color:var(--text-muted);padding:1rem">Aucune donnée de qualité image disponible.</p>'; return; }
1348
+
1349
+ chartInstances['quality-cer'] = new Chart(ctx.getContext('2d'), {
1350
+ type: 'scatter',
1351
+ data: { datasets },
1352
+ options: {
1353
+ responsive: true, maintainAspectRatio: false,
1354
+ plugins: {
1355
+ legend: { position: 'top', labels: { font: { size: 11 } } },
1356
+ tooltip: { callbacks: {
1357
+ label: ctx => `${ctx.dataset.label}: qualité=${ctx.parsed.x.toFixed(2)}, CER=${ctx.parsed.y.toFixed(1)}%`,
1358
+ } },
1359
+ },
1360
+ scales: {
1361
+ x: { min: 0, max: 1, title: { display: true, text: 'Score qualité image [0–1]', font: { size: 11 } } },
1362
+ y: { min: 0, title: { display: true, text: 'CER (%)', font: { size: 11 } } },
1363
+ },
1364
+ },
1365
+ });
1366
+ }
1367
+
1368
+ function buildTaxonomyChart() {
1369
+ const ctx = document.getElementById('chart-taxonomy');
1370
+ if (!ctx) return;
1371
+ const taxLabels = ['Confusion visuelle','Diacritique','Casse','Ligature','Abréviation','Hapax','Segmentation','Hors-vocab.','Lacune'];
1372
+ const taxKeys = ['visual_confusion','diacritic_error','case_error','ligature_error','abbreviation_error','hapax','segmentation_error','oov_character','lacuna'];
1373
+ const taxColors = ['#6366f1','#f59e0b','#ec4899','#14b8a6','#8b5cf6','#64748b','#f97316','#06b6d4','#ef4444'];
1374
+
1375
+ const datasets = DATA.engines.map((e, ei) => {
1376
+ const tax = e.aggregated_taxonomy;
1377
+ const data = taxKeys.map(k => tax && tax.counts ? (tax.counts[k] || 0) : 0);
1378
+ return {
1379
+ label: e.name, data,
1380
+ backgroundColor: engineColor(ei) + '99',
1381
+ borderColor: engineColor(ei),
1382
+ borderWidth: 1,
1383
+ };
1384
+ });
1385
+
1386
+ chartInstances['taxonomy'] = new Chart(ctx.getContext('2d'), {
1387
+ type: 'bar',
1388
+ data: { labels: taxLabels, datasets },
1389
+ options: {
1390
+ responsive: true, maintainAspectRatio: false,
1391
+ plugins: { legend: { position: 'top', labels: { font: { size: 11 } } } },
1392
+ scales: {
1393
+ x: { ticks: { font: { size: 10 } } },
1394
+ y: { title: { display: true, text: "Nb d'erreurs", font: { size: 11 } }, min: 0, ticks: { stepSize: 1 } },
1395
+ },
1396
+ },
1397
+ });
1398
+ }
1399
+
1400
+ // ── Sprint 7: Reliability curves ────────────────────────────────
1401
+ function buildReliabilityCurves() {
1402
+ const ctx = document.getElementById('chart-reliability');
1403
+ if (!ctx) return;
1404
+ const curves = DATA.reliability_curves || [];
1405
+ if (!curves.length) { ctx.parentElement.innerHTML = '<p style="color:var(--text-muted);padding:1rem">Données insuffisantes.</p>'; return; }
1406
+ const datasets = curves.map((c, i) => {
1407
+ const points = (c.points || []).map(p => ({ x: p.pct_docs, y: p.mean_cer * 100 }));
1408
+ return {
1409
+ label: c.engine, data: points,
1410
+ borderColor: engineColor(i), backgroundColor: engineColor(i) + '22',
1411
+ tension: 0.3, fill: false, pointRadius: 2, pointHoverRadius: 5,
1412
+ };
1413
+ });
1414
+ destroyChart('reliability');
1415
+ chartInstances['reliability'] = new Chart(ctx.getContext('2d'), {
1416
+ type: 'line',
1417
+ data: { datasets },
1418
+ options: {
1419
+ responsive: true, maintainAspectRatio: false,
1420
+ parsing: { xAxisKey: 'x', yAxisKey: 'y' },
1421
+ plugins: {
1422
+ legend: { position: 'top', labels: { font: { size: 11 } } },
1423
+ tooltip: { callbacks: {
1424
+ title: ([item]) => `${item.parsed.x.toFixed(0)}% docs les plus faciles`,
1425
+ label: item => `${item.dataset.label}: CER moy = ${item.parsed.y.toFixed(2)}%`,
1426
+ } },
1427
+ },
1428
+ scales: {
1429
+ x: { type:'linear', min:0, max:100,
1430
+ title: { display:true, text:'% documents (triés par CER croissant)', font:{ size:11 } } },
1431
+ y: { min:0, title: { display:true, text:'CER moyen (%)', font:{ size:11 } } },
1432
+ },
1433
+ },
1434
+ });
1435
+ }
1436
+
1437
+ // ── Sprint 7 — Bootstrap CI ──────────────────────────────────────
1438
+ function buildBootstrapCIChart() {
1439
+ const ctx = document.getElementById('chart-bootstrap-ci');
1440
+ if (!ctx) return;
1441
+ const cis = DATA.statistics && DATA.statistics.bootstrap_cis || [];
1442
+ if (!cis.length) { ctx.parentElement.innerHTML = '<p style="color:var(--text-muted);padding:1rem">Données insuffisantes.</p>'; return; }
1443
+
1444
+ const labels = cis.map(c => c.engine);
1445
+ const means = cis.map(c => (c.mean * 100));
1446
+ const lowers = cis.map(c => (c.mean - c.ci_lower) * 100);
1447
+ const uppers = cis.map(c => (c.ci_upper - c.mean) * 100);
1448
+
1449
+ destroyChart('bootstrap-ci');
1450
+ chartInstances['bootstrap-ci'] = new Chart(ctx.getContext('2d'), {
1451
+ type: 'bar',
1452
+ data: {
1453
+ labels,
1454
+ datasets: [{
1455
+ label: 'CER moyen (%)',
1456
+ data: means,
1457
+ backgroundColor: cis.map((_, i) => engineColor(i) + 'aa'),
1458
+ borderColor: cis.map((_, i) => engineColor(i)),
1459
+ borderWidth: 1,
1460
+ errorBars: { symmetric: false },
1461
+ }],
1462
+ },
1463
+ options: {
1464
+ responsive: true, maintainAspectRatio: false,
1465
+ plugins: {
1466
+ legend: { display: false },
1467
+ tooltip: {
1468
+ callbacks: {
1469
+ afterLabel: (ctx) => {
1470
+ const ci = cis[ctx.dataIndex];
1471
+ return `IC 95% : [${(ci.ci_lower*100).toFixed(2)}%, ${(ci.ci_upper*100).toFixed(2)}%]`;
1472
+ },
1473
+ },
1474
+ },
1475
+ },
1476
+ scales: { y: { min: 0, title: { display:true, text:'CER (%)', font:{size:11} } } },
1477
+ },
1478
+ plugins: [{
1479
+ id: 'errorBars',
1480
+ afterDatasetsDraw(chart) {
1481
+ const { ctx: c, data, scales: { x, y } } = chart;
1482
+ chart.data.datasets[0].data.forEach((val, i) => {
1483
+ const ci = cis[i];
1484
+ if (!ci) return;
1485
+ const xPos = x.getPixelForValue(i);
1486
+ const yTop = y.getPixelForValue(ci.ci_upper * 100);
1487
+ const yBot = y.getPixelForValue(ci.ci_lower * 100);
1488
+ c.save();
1489
+ c.strokeStyle = '#374151'; c.lineWidth = 2;
1490
+ c.beginPath(); c.moveTo(xPos, yTop); c.lineTo(xPos, yBot); c.stroke();
1491
+ c.beginPath(); c.moveTo(xPos-6, yTop); c.lineTo(xPos+6, yTop); c.stroke();
1492
+ c.beginPath(); c.moveTo(xPos-6, yBot); c.lineTo(xPos+6, yBot); c.stroke();
1493
+ c.restore();
1494
+ });
1495
+ },
1496
+ }],
1497
+ });
1498
+ }
1499
+
1500
+ // ── Sprint 7: Venn diagram ──────────────────────────────────────
1501
+ function buildVennDiagram() {
1502
+ const container = document.getElementById('venn-container');
1503
+ if (!container) return;
1504
+ const venn = DATA.venn_data;
1505
+ if (!venn || !venn.type) {
1506
+ container.innerHTML = '<p style="color:var(--text-muted)">Données insuffisantes pour le diagramme de Venn.</p>';
1507
+ return;
1508
+ }
1509
+
1510
+ if (venn.type === 'venn2') {
1511
+ const total = (venn.only_a || 0) + (venn.both || 0) + (venn.only_b || 0);
1512
+ const maxR = 80;
1513
+ const rA = Math.sqrt((venn.only_a + venn.both) / (total || 1)) * maxR + 30;
1514
+ const rB = Math.sqrt((venn.only_b + venn.both) / (total || 1)) * maxR + 30;
1515
+ const overlap = venn.both > 0 ? Math.min(rA, rB) * 0.6 : 0;
1516
+ const cxA = 140, cxB = cxA + rA + rB - overlap, cy = 130;
1517
+ const w = cxB + rB + 20, h = 260;
1518
+ container.innerHTML = `
1519
+ <div style="text-align:center">
1520
+ <svg width="${w}" height="${h}" viewBox="0 0 ${w} ${h}" style="max-width:100%">
1521
+ <circle cx="${cxA}" cy="${cy}" r="${rA}" fill="#2563eb" fill-opacity="0.25" stroke="#2563eb" stroke-width="2"/>
1522
+ <circle cx="${cxB}" cy="${cy}" r="${rB}" fill="#dc2626" fill-opacity="0.25" stroke="#dc2626" stroke-width="2"/>
1523
+ <text x="${cxA - rA*0.5}" y="${cy}" text-anchor="middle" font-size="13" font-weight="bold" fill="#1e40af">${venn.only_a}</text>
1524
+ <text x="${(cxA + cxB)/2}" y="${cy}" text-anchor="middle" font-size="13" font-weight="bold" fill="#374151">${venn.both}</text>
1525
+ <text x="${cxB + rB*0.5}" y="${cy}" text-anchor="middle" font-size="13" font-weight="bold" fill="#b91c1c">${venn.only_b}</text>
1526
+ <text x="${cxA - rA*0.5}" y="${cy + rA + 14}" text-anchor="middle" font-size="11" fill="#2563eb">${esc(venn.label_a)}</text>
1527
+ <text x="${cxB + rB*0.5}" y="${cy + rB + 14}" text-anchor="middle" font-size="11" fill="#dc2626">${esc(venn.label_b)}</text>
1528
+ <text x="${(cxA+cxB)/2}" y="${cy + Math.min(rA,rB) + 14}" text-anchor="middle" font-size="10" fill="#64748b">commun</text>
1529
+ </svg>
1530
+ <p style="font-size:.75rem;color:var(--text-muted);margin-top:.25rem">
1531
+ Erreurs exclusives ${esc(venn.label_a)} : ${venn.only_a} ·
1532
+ Communes : ${venn.both} ·
1533
+ Exclusives ${esc(venn.label_b)} : ${venn.only_b}
1534
+ </p>
1535
+ </div>
1536
+ `;
1537
+ } else if (venn.type === 'venn3') {
1538
+ // Simplified three-circle Venn diagram
1539
+ const total = (venn.only_a||0)+(venn.only_b||0)+(venn.only_c||0)+(venn.ab||0)+(venn.ac||0)+(venn.bc||0)+(venn.abc||0) || 1;
1540
+ container.innerHTML = `
1541
+ <div style="text-align:center">
1542
+ <svg width="300" height="280" viewBox="0 0 300 280" style="max-width:100%">
1543
+ <circle cx="130" cy="110" r="80" fill="#2563eb" fill-opacity="0.2" stroke="#2563eb" stroke-width="1.5"/>
1544
+ <circle cx="170" cy="110" r="80" fill="#dc2626" fill-opacity="0.2" stroke="#dc2626" stroke-width="1.5"/>
1545
+ <circle cx="150" cy="155" r="80" fill="#16a34a" fill-opacity="0.2" stroke="#16a34a" stroke-width="1.5"/>
1546
+ <text x="95" y="95" text-anchor="middle" font-size="12" font-weight="bold" fill="#1e40af">${venn.only_a}</text>
1547
+ <text x="205" y="95" text-anchor="middle" font-size="12" font-weight="bold" fill="#b91c1c">${venn.only_b}</text>
1548
+ <text x="150" y="230" text-anchor="middle" font-size="12" font-weight="bold" fill="#15803d">${venn.only_c}</text>
1549
+ <text x="148" y="108" text-anchor="middle" font-size="11" fill="#374151">${venn.ab}</text>
1550
+ <text x="120" y="160" text-anchor="middle" font-size="11" fill="#374151">${venn.ac}</text>
1551
+ <text x="180" y="160" text-anchor="middle" font-size="11" fill="#374151">${venn.bc}</text>
1552
+ <text x="150" y="145" text-anchor="middle" font-size="11" font-weight="bold" fill="#374151">${venn.abc}</text>
1553
+ <text x="95" y="127" text-anchor="middle" font-size="9" fill="#2563eb">${esc((venn.label_a||'').slice(0,10))}</text>
1554
+ <text x="205" y="127" text-anchor="middle" font-size="9" fill="#dc2626">${esc((venn.label_b||'').slice(0,10))}</text>
1555
+ <text x="150" y="248" text-anchor="middle" font-size="9" fill="#16a34a">${esc((venn.label_c||'').slice(0,10))}</text>
1556
+ </svg>
1557
+ </div>
1558
+ `;
1559
+ }
1560
+ }
1561
+
1562
+ // ── Sprint 7: Wilcoxon table ────────────────────────────────────
1563
+ function buildWilcoxonTable() {
1564
+ const container = document.getElementById('wilcoxon-table-container');
1565
+ if (!container) return;
1566
+ const stats = DATA.statistics && DATA.statistics.pairwise_wilcoxon || [];
1567
+ if (!stats.length) {
1568
+ container.innerHTML = '<p style="color:var(--text-muted)">Pas assez de données pour les tests statistiques (min 2 concurrents).</p>';
1569
+ return;
1570
+ }
1571
+ const rows = stats.map(s => {
1572
+ const sigClass = s.significant ? 'stat-sig' : 'stat-ns';
1573
+ const sigLabel = s.significant ? '✓ Significative' : '○ Non significative';
1574
+ return `<tr>
1575
+ <td style="padding:.4rem .6rem;font-weight:600">${esc(s.engine_a)}</td>
1576
+ <td style="padding:.4rem .3rem;color:var(--text-muted)">vs</td>
1577
+ <td style="padding:.4rem .6rem;font-weight:600">${esc(s.engine_b)}</td>
1578
+ <td style="padding:.4rem .6rem;text-align:right;font-variant-numeric:tabular-nums">${s.n_pairs}</td>
1579
+ <td style="padding:.4rem .6rem;text-align:right;font-variant-numeric:tabular-nums">${s.statistic}</td>
1580
+ <td style="padding:.4rem .6rem;text-align:right;font-variant-numeric:tabular-nums">${s.p_value}</td>
1581
+ <td style="padding:.4rem .75rem"><span class="${sigClass}">${sigLabel}</span></td>
1582
+ <td style="padding:.4rem .75rem;font-size:.78rem;color:var(--text-muted);max-width:280px">${esc(s.interpretation)}</td>
1583
+ </tr>`;
1584
+ }).join('');
1585
+ container.innerHTML = `
1586
+ <table style="border-collapse:collapse;font-size:.84rem;width:100%">
1587
+ <thead><tr style="background:var(--bg)">
1588
+ <th style="padding:.4rem .6rem;text-align:left;font-size:.75rem;text-transform:uppercase;letter-spacing:.04em">Concurrent A</th>
1589
+ <th></th>
1590
+ <th style="padding:.4rem .6rem;text-align:left;font-size:.75rem;text-transform:uppercase;letter-spacing:.04em">Concurrent B</th>
1591
+ <th style="padding:.4rem .6rem;text-align:right;font-size:.75rem">N paires</th>
1592
+ <th style="padding:.4rem .6rem;text-align:right;font-size:.75rem">W</th>
1593
+ <th style="padding:.4rem .6rem;text-align:right;font-size:.75rem">p-value</th>
1594
+ <th style="padding:.4rem .75rem;text-align:left;font-size:.75rem">Verdict</th>
1595
+ <th style="padding:.4rem .75rem;text-align:left;font-size:.75rem">Interprétation</th>
1596
+ </tr></thead>
1597
+ <tbody>${rows}</tbody>
1598
+ </table>
1599
+ `;
1600
+ }
1601
+
1602
+ // ── Sprint 7: Error clustering ──────────────────────────────────
1603
+ function buildErrorClusters() {
1604
+ const container = document.getElementById('error-clusters-container');
1605
+ if (!container) return;
1606
+ const clusters = DATA.error_clusters || [];
1607
+ if (!clusters.length) {
1608
+ container.innerHTML = `<p style="color:var(--text-muted)">Aucun cluster d'erreur détecté.</p>`;
1609
+ return;
1610
+ }
1611
+ const cards = clusters.map(cl => {
1612
+ const examplesHtml = (cl.examples || []).slice(0, 3).map(ex => {
1613
+ const oldStr = ex.gt_fragment || '';
1614
+ const newStr = ex.ocr_fragment || '';
1615
+ return `<div class="cluster-ex">
1616
+ <span class="ex-old">${esc(oldStr || '∅')}</span>
1617
+ <span style="color:var(--text-muted)">→</span>
1618
+ <span class="ex-new">${esc(newStr || '∅')}</span>
1619
+ <span style="color:var(--text-muted);font-size:.72rem">(${esc(ex.engine || '')})</span>
1620
+ </div>`;
1621
+ }).join('');
1622
+ return `<div class="cluster-card">
1623
+ <div class="cluster-label">Cluster #${cl.cluster_id} : ${esc(cl.label)}</div>
1624
+ <div class="cluster-count">${cl.count} cas détectés</div>
1625
+ <div class="cluster-examples">${examplesHtml}</div>
1626
+ </div>`;
1627
+ }).join('');
1628
+ container.innerHTML = `<div class="cluster-grid">${cards}</div>`;
1629
+ }
1630
+
1631
+ // ── Sprint 7: Correlation matrix ────────────────────────────────
1632
+ function initCorrelationMatrix() {
1633
+ const sel = document.getElementById('corr-engine-select');
1634
+ if (!sel) return;
1635
+ const corrs = DATA.correlation_per_engine || [];
1636
+ sel.innerHTML = '';
1637
+ corrs.forEach(c => {
1638
+ const opt = document.createElement('option');
1639
+ opt.value = c.engine; opt.textContent = c.engine;
1640
+ sel.appendChild(opt);
1641
+ });
1642
+ renderCorrelationMatrix();
1643
+ }
1644
+
1645
+ function renderCorrelationMatrix() {
1646
+ const container = document.getElementById('corr-matrix-container');
1647
+ if (!container) return;
1648
+ const sel = document.getElementById('corr-engine-select');
1649
+ const engineName = sel && sel.value;
1650
+ const corrs = DATA.correlation_per_engine || [];
1651
+ const entry = corrs.find(c => c.engine === engineName) || corrs[0];
1652
+ if (!entry || !entry.labels || !entry.matrix) {
1653
+ container.innerHTML = '<p style="color:var(--text-muted)">Données insuffisantes.</p>';
1654
+ return;
1655
+ }
1656
+ const labels = entry.labels;
1657
+ const matrix = entry.matrix;
1658
+ const n = labels.length;
1659
+
1660
+ const labelNames = {
1661
+ cer: 'CER', wer: 'WER', mer: 'MER', wil: 'WIL',
1662
+ quality_score: 'Qualité img', sharpness: 'Netteté',
1663
+ ligature: 'Ligatures', diacritic: 'Diacritiques',
1664
+ };
1665
+ function corrColor(r) {
1666
+ if (r >= 0.7) return 'background:#dcfce7;color:#14532d';
1667
+ if (r >= 0.3) return 'background:#f0fdf4;color:#166534';
1668
+ if (r >= -0.3) return 'background:#f8fafc;color:#374151';
1669
+ if (r >= -0.7) return 'background:#fef2f2;color:#991b1b';
1670
+ return 'background:#fee2e2;color:#7f1d1d';
1671
+ }
1672
+
1673
+ const headerRow = '<tr><th></th>' + labels.map(l =>
1674
+ `<th>${esc(labelNames[l] || l)}</th>`).join('') + '</tr>';
1675
+ const dataRows = matrix.map((row, i) =>
1676
+ '<tr><th style="text-align:right">' + esc(labelNames[labels[i]] || labels[i]) + '</th>' +
1677
+ row.map((v, j) => {
1678
+ const style = corrColor(v);
1679
+ const display = i === j ? '1.00' : v.toFixed(2);
1680
+ return `<td style="${style}">${display}</td>`;
1681
+ }).join('') + '</tr>'
1682
+ ).join('');
1683
+
1684
+ container.innerHTML = `<table class="corr-table"><thead>${headerRow}</thead><tbody>${dataRows}</tbody></table>`;
1685
+ }
1686
+
1687
+ // ── Sprint 7 — URL stateful ──────────────────────────────────────
1688
+ function updateURL(view, params) {
1689
+ const hash = '#' + view + (params ? '?' + new URLSearchParams(params).toString() : '');
1690
+ history.replaceState(null, '', hash);
1691
+ }
1692
+
1693
+ function readURLState() {
1694
+ const hash = location.hash.slice(1);
1695
+ const [view, query] = hash.split('?');
1696
+ const params = query ? Object.fromEntries(new URLSearchParams(query)) : {};
1697
+ return { view: view || 'ranking', params };
1698
+ }
1699
+
1700
+ // ── Sprint 7: Presentation mode ─────────────────────────────────
1701
+ let presentMode = false;
1702
+ function togglePresentMode() {
1703
+ presentMode = !presentMode;
1704
+ document.body.classList.toggle('present-mode', presentMode);
1705
+ const btn = document.getElementById('btn-present');
1706
+ if (btn) {
1707
+ btn.classList.toggle('active', presentMode);
1708
+ btn.textContent = presentMode ? '⊡ Normal' : '⊞ Présentation';
1709
+ }
1710
+ }
1711
+
1712
+ // ── Sprint 7 — Export CSV ────────────────────────────────────────
1713
+ function _buildCSVRows(docs) {
1714
+ const header = ['doc_id','engine','cer','wer','mer','wil','duration','ligature_score','diacritic_score','difficulty_score','gini','anchor_score','length_ratio','is_hallucinating'];
1715
+ const rows = [header];
1716
+ docs.forEach(doc => {
1717
+ doc.engine_results.forEach(er => {
1718
+ rows.push([
1719
+ doc.doc_id,
1720
+ er.engine,
1721
+ er.cer !== null ? (er.cer * 100).toFixed(4) : '',
1722
+ er.wer !== null ? (er.wer * 100).toFixed(4) : '',
1723
+ er.mer !== null ? (er.mer * 100).toFixed(4) : '',
1724
+ er.wil !== null ? (er.wil * 100).toFixed(4) : '',
1725
+ er.duration !== null ? er.duration : '',
1726
+ er.ligature_score !== null ? er.ligature_score : '',
1727
+ er.diacritic_score !== null ? er.diacritic_score : '',
1728
+ doc.difficulty_score !== undefined ? (doc.difficulty_score * 100).toFixed(2) : '',
1729
+ er.line_metrics ? er.line_metrics.gini.toFixed(6) : '',
1730
+ er.hallucination_metrics ? er.hallucination_metrics.anchor_score.toFixed(6) : '',
1731
+ er.hallucination_metrics ? er.hallucination_metrics.length_ratio.toFixed(4) : '',
1732
+ er.hallucination_metrics ? (er.hallucination_metrics.is_hallucinating ? '1' : '0') : '',
1733
+ ]);
1734
+ });
1735
+ });
1736
+ return rows.map(r => r.map(v => '"' + String(v ?? '').replace(/"/g, '""') + '"').join(',')).join('\n'); // CSV quoting: double embedded quotes
1737
+ }
1738
+
1739
+ function _downloadCSV(content, filename) {
1740
+ const blob = new Blob(['\ufeff' + content], { type: 'text/csv;charset=utf-8' });
1741
+ const url = URL.createObjectURL(blob);
1742
+ const a = document.createElement('a');
1743
+ a.href = url;
1744
+ a.download = filename;
1745
+ document.body.appendChild(a); a.click();
1746
+ setTimeout(() => { document.body.removeChild(a); URL.revokeObjectURL(url); }, 100);
1747
+ }
1748
+
1749
+ function exportCSV() {
1750
+ // Sheet 1: all documents
1751
+ const corpusSlug = DATA.meta.corpus_name.replace(/\s+/g,'-');
1752
+ _downloadCSV(_buildCSVRows(DATA.documents), `picarones_metrics_${corpusSlug}.csv`);
1753
+
1754
+ // Sheet 2: filtered documents (robust exclusions applied)
1755
+ const cerThreshold = parseInt(document.getElementById('robust-cer').value, 10) / 100;
1756
+ const anchorThreshold = parseFloat(document.getElementById('robust-anchor').value);
1757
+ const ratioThreshold = parseFloat(document.getElementById('robust-ratio').value);
1758
+ const filteredDocs = DATA.documents.filter(doc => {
1759
+ // Exclude the document if it is in _manualExclusions
1760
+ if (_manualExclusions.has(doc.doc_id)) return false;
1761
+ // Keep the document if at least one engine passes every threshold (exclude it only when all engines flag it)
1762
+ return doc.engine_results.some(er => {
1763
+ if (!er || er.error) return false;
1764
+ if (cerThreshold < 1.0 && er.cer !== null && er.cer > cerThreshold) return false;
1765
+ const hm = er.hallucination_metrics;
1766
+ if (hm && hm.anchor_score < anchorThreshold) return false;
1767
+ if (hm && hm.length_ratio > ratioThreshold) return false;
1768
+ return true;
1769
+ });
1770
+ });
1771
+ // Delay the second download so it does not block the first one
1772
+ setTimeout(() => {
1773
+ _downloadCSV(_buildCSVRows(filteredDocs), `picarones_metrics_${corpusSlug}_robust.csv`);
1774
+ }, 400);
1775
+ }
1776
+
1777
+ // ── Characters view ─────────────────────────────────────────────
1778
+ let charViewBuilt = false;
1779
+
1780
+ function initCharView() {
1781
+ charViewBuilt = true;
1782
+ // Populate the engine selector
1783
+ const sel = document.getElementById('char-engine-select');
1784
+ sel.innerHTML = '';
1785
+ DATA.engines.forEach(e => {
1786
+ const opt = document.createElement('option');
1787
+ opt.value = e.name; opt.textContent = e.name;
1788
+ sel.appendChild(opt);
1789
+ });
1790
+ renderCharView();
1791
+ }
1792
+
1793
+ function renderCharView() {
1794
+ const engineName = document.getElementById('char-engine-select').value;
1795
+ const eng = DATA.engines.find(e => e.name === engineName);
1796
+ if (!eng) return;
1797
+
1798
+ // Ligature / diacritic scores
1799
+ const scoresRow = document.getElementById('char-scores-row');
1800
+ const ligScore = eng.ligature_score;
1801
+ const diacScore = eng.diacritic_score;
1802
+ scoresRow.innerHTML = `
1803
+ <div class="stat">Ligatures <b>${_scoreBadge(ligScore, 'Ligatures')}</b></div>
1804
+ <div class="stat">Diacritiques <b>${_scoreBadge(diacScore, 'Diacritiques')}</b></div>
1805
+ ${eng.aggregated_structure ? `
1806
+ <div class="stat">Précision lignes <b>${_scoreBadge(eng.aggregated_structure.mean_line_accuracy, 'Précision nb lignes')}</b></div>
1807
+ <div class="stat">Ordre lecture <b>${_scoreBadge(eng.aggregated_structure.mean_reading_order_score, 'Score ordre de lecture')}</b></div>
1808
+ ` : ''}
1809
+ ${eng.aggregated_image_quality ? `
1810
+ <div class="stat">Qualité image moy. <b>${_scoreBadge(eng.aggregated_image_quality.mean_quality_score, 'Qualité image moyenne')}</b></div>
1811
+ ` : ''}
1812
+ `;
1813
+
1814
+ // Confusion matrix heatmap
1815
+ renderConfusionHeatmap(eng);
1816
+
1817
+ // Ligature detail
1818
+ renderLigatureDetail(eng);
1819
+
1820
+ // Detailed taxonomy
1821
+ renderTaxonomyDetail(eng);
1822
+ }
1823
+
1824
+ function renderConfusionHeatmap(eng) {
1825
+ const container = document.getElementById('confusion-heatmap');
1826
+ const cm = eng.aggregated_confusion;
1827
+ if (!cm || !cm.matrix) {
1828
+ container.innerHTML = '<p style="color:var(--text-muted)">Aucune donnée de confusion disponible.</p>';
1829
+ return;
1830
+ }
1831
+
1832
+ // Collect the top confusions (substitutions only, excluding ∅)
1833
+ const pairs = [];
1834
+ for (const [gt, ocrs] of Object.entries(cm.matrix)) {
1835
+ if (gt === '∅') continue;
1836
+ for (const [ocr, cnt] of Object.entries(ocrs)) {
1837
+ if (ocr !== gt && ocr !== '∅' && cnt > 0) {
1838
+ pairs.push({ gt, ocr, cnt });
1839
+ }
1840
+ }
1841
+ }
1842
+ pairs.sort((a,b) => b.cnt - a.cnt);
1843
+ const top = pairs.slice(0, 30);
1844
+
1845
+ if (!top.length) {
1846
+ container.innerHTML = '<p style="color:var(--text-muted)">Aucune substitution détectée.</p>';
1847
+ return;
1848
+ }
1849
+
1850
+ // Heatmap rendered as a compact table
1851
+ const maxCnt = top[0].cnt;
1852
+ const rows = top.map(p => {
1853
+ const intensity = Math.round((p.cnt / maxCnt) * 200 + 55); // 55–255
1854
+ const bg = `rgb(${intensity},50,50)`;
1855
+ const fg = intensity > 150 ? '#fff' : '#222';
1856
+ return `<tr onclick="showConfusionExamples('${esc(p.gt)}','${esc(p.ocr)}')" style="cursor:pointer" title="GT='${esc(p.gt)}' → OCR='${esc(p.ocr)}' : ${p.cnt} fois">
1857
+ <td style="font-family:monospace;font-size:1.1rem;padding:.3rem .6rem;text-align:center">${esc(p.gt)}</td>
1858
+ <td style="padding:.1rem .3rem;color:var(--text-muted)">→</td>
1859
+ <td style="font-family:monospace;font-size:1.1rem;padding:.3rem .6rem;text-align:center">${esc(p.ocr)}</td>
1860
+ <td style="padding:.3rem 1rem">
1861
+ <div style="display:flex;align-items:center;gap:.5rem">
1862
+ <div style="width:${Math.round(p.cnt/maxCnt*120)}px;height:12px;border-radius:3px;background:${bg}"></div>
1863
+ <span style="font-size:.8rem;color:var(--text-muted)">${p.cnt}×</span>
1864
+ </div>
1865
+ </td>
1866
+ </tr>`;
1867
+ }).join('');
1868
+
1869
+ container.innerHTML = `
1870
+ <p style="font-size:.75rem;color:var(--text-muted);margin-bottom:.5rem">
1871
+ Cliquer sur une ligne pour voir les exemples dans la vue Document.
1872
+ Total substitutions : <b>${cm.total_substitutions}</b>
1873
+ · Insertions : <b>${cm.total_insertions}</b>
1874
+ · Suppressions : <b>${cm.total_deletions}</b>
1875
+ </p>
1876
+ <table style="border-collapse:collapse;font-size:.85rem">
1877
+ <thead><tr>
1878
+ <th style="padding:.3rem .6rem;text-align:left">GT</th>
1879
+ <th></th>
1880
+ <th style="padding:.3rem .6rem;text-align:left">OCR</th>
1881
+ <th style="padding:.3rem 1rem;text-align:left">Fréquence</th>
1882
+ </tr></thead>
1883
+ <tbody>${rows}</tbody>
1884
+ </table>
1885
+ `;
1886
+ }
1887
+
1888
+ function showConfusionExamples(gtChar, ocrChar) {
1889
+ // Switch to the Document view and look for an example of this confusion
1890
+ showView('document');
1891
+ const docWithConfusion = DATA.documents.find(doc =>
1892
+ doc.engine_results.some(er => {
1893
+ const h = er.hypothesis || '';
1894
+ const g = doc.ground_truth || '';
1895
+ return g.includes(gtChar) && h.includes(ocrChar);
1896
+ })
1897
+ );
1898
+ if (docWithConfusion) loadDocument(docWithConfusion.doc_id);
1899
+ }
1900
+
1901
+ function renderLigatureDetail(eng) {
1902
+ const container = document.getElementById('ligature-detail');
1903
+ // Aggregate over all documents for this engine
1904
+ const ligData = {};
1905
+ DATA.documents.forEach(doc => {
1906
+ const er = doc.engine_results.find(r => r.engine === eng.name);
1907
+ if (!er || !er.ligature_score) return;
1908
+ // Only the overall per-document score is available here; for the detail, use aggregated_char_scores
1909
+ });
1910
+
1911
+ const agg = eng.aggregated_char_scores;
1912
+ if (!agg || !agg.ligature || !agg.ligature.per_ligature) {
1913
+ const overallScore = eng.ligature_score;
1914
+ if (overallScore !== null && overallScore !== undefined) {
1915
+ container.innerHTML = `<div class="stat">Score global ligatures : ${_scoreBadge(overallScore, 'Ligatures')}</div>`;
1916
+ } else {
1917
+ container.innerHTML = '<p style="color:var(--text-muted)">Aucune donnée ligature disponible (pas de ligatures dans le corpus).</p>';
1918
+ }
1919
+ return;
1920
+ }
1921
+
1922
+ const perLig = agg.ligature.per_ligature;
1923
+ if (!Object.keys(perLig).length) {
1924
+ container.innerHTML = '<p style="color:var(--text-muted)">Aucune ligature trouvée dans le corpus GT.</p>';
1925
+ return;
1926
+ }
1927
+
1928
+ const rows = Object.entries(perLig)
1929
+ .sort((a,b) => b[1].gt_count - a[1].gt_count)
1930
+ .map(([lig, d]) => {
1931
+ const sc = d.score;
1932
+ const color = sc >= 0.9 ? '#16a34a' : sc >= 0.7 ? '#ca8a04' : '#dc2626';
1933
+ const barW = Math.round(sc * 120);
1934
+ return `<tr>
1935
+ <td style="font-family:monospace;font-size:1.2rem;padding:.3rem .6rem">${esc(lig)}</td>
1936
+ <td style="padding:.3rem .6rem;font-size:.8rem;color:var(--text-muted)">${esc(lig.codePointAt(0).toString(16).toUpperCase().padStart(4,'0'))}</td>
1937
+ <td style="padding:.3rem .6rem">${d.gt_count} GT</td>
1938
+ <td style="padding:.3rem .6rem">${d.ocr_correct} corrects</td>
1939
+ <td style="padding:.3rem 1rem">
1940
+ <div style="display:flex;align-items:center;gap:.5rem">
1941
+ <div style="width:${barW}px;height:10px;border-radius:3px;background:${color}"></div>
1942
+ <span style="color:${color};font-weight:600">${(sc*100).toFixed(0)}%</span>
1943
+ </div>
1944
+ </td>
1945
+ </tr>`;
1946
+ }).join('');
1947
+
1948
+ container.innerHTML = `
1949
+ <table style="border-collapse:collapse;font-size:.85rem">
1950
+ <thead><tr>
1951
+ <th style="padding:.3rem .6rem;text-align:left">Ligature</th>
1952
+ <th style="padding:.3rem .6rem;text-align:left">Unicode</th>
1953
+ <th style="padding:.3rem .6rem">GT</th>
1954
+ <th style="padding:.3rem .6rem">Corrects</th>
1955
+ <th style="padding:.3rem 1rem;text-align:left">Score</th>
1956
+ </tr></thead>
1957
+ <tbody>${rows}</tbody>
1958
+ </table>
1959
+ `;
1960
+ }
1961
+
1962
+ function renderTaxonomyDetail(eng) {
1963
+ const container = document.getElementById('taxonomy-detail');
1964
+ const tax = eng.aggregated_taxonomy;
1965
+ if (!tax || !tax.counts) {
1966
+ container.innerHTML = '<p style="color:var(--text-muted)">Aucune donnée taxonomique disponible.</p>';
1967
+ return;
1968
+ }
1969
+
1970
+ const classNames = {
1971
+ visual_confusion: '1 — Confusion visuelle',
1972
+ diacritic_error: '2 — Erreur diacritique',
1973
+ case_error: '3 — Erreur de casse',
1974
+ ligature_error: '4 — Ligature',
1975
+ abbreviation_error: '5 — Abréviation',
1976
+ hapax: '6 — Hapax',
1977
+ segmentation_error: '7 — Segmentation',
1978
+ oov_character: '8 — Hors-vocabulaire',
1979
+ lacuna: '9 — Lacune',
1980
+ };
1981
+ const total = tax.total_errors || 1;
1982
+ const maxCnt = Math.max(...Object.values(tax.counts));
1983
+
1984
+ const rows = Object.entries(tax.counts)
1985
+ .filter(([, cnt]) => cnt > 0)
1986
+ .sort((a,b) => b[1]-a[1])
1987
+ .map(([cls, cnt]) => {
1988
+ const pctVal = (cnt / total * 100).toFixed(1);
1989
+ const barW = maxCnt > 0 ? Math.round(cnt/maxCnt * 200) : 0;
1990
+ return `<tr>
1991
+ <td style="padding:.3rem .6rem;font-size:.85rem">${esc(classNames[cls] || cls)}</td>
1992
+ <td style="padding:.3rem .6rem;text-align:right;font-variant-numeric:tabular-nums">${cnt}</td>
1993
+ <td style="padding:.3rem 1rem">
1994
+ <div style="display:flex;align-items:center;gap:.5rem">
1995
+ <div style="width:${barW}px;height:10px;border-radius:3px;background:#6366f1"></div>
1996
+ <span style="color:var(--text-muted);font-size:.8rem">${pctVal}%</span>
1997
+ </div>
1998
+ </td>
1999
+ </tr>`;
2000
+ }).join('');
2001
+
2002
+ container.innerHTML = `
2003
+ <p style="font-size:.75rem;color:var(--text-muted);margin-bottom:.5rem">Total : <b>${tax.total_errors}</b> erreurs classifiées.</p>
2004
+ <table style="border-collapse:collapse;font-size:.85rem;min-width:400px">
2005
+ <thead><tr>
2006
+ <th style="padding:.3rem .6rem;text-align:left">Classe</th>
2007
+ <th style="padding:.3rem .6rem;text-align:right">N</th>
2008
+ <th style="padding:.3rem 1rem;text-align:left">Proportion</th>
2009
+ </tr></thead>
2010
+ <tbody>${rows}</tbody>
2011
+ </table>
2012
+ `;
2013
+ }
2014
+
2015
+ // ── Init ────────────────────────────────────────────────────────
2016
+ function applyI18n() {
2017
+ // Apply translations to elements carrying data-i18n (textContent)
2018
+ document.querySelectorAll('[data-i18n]').forEach(el => {
2019
+ const key = el.getAttribute('data-i18n');
2020
+ if (I18N[key] !== undefined) el.textContent = I18N[key];
2021
+ });
2022
+ // Select options carrying data-i18n-opt
2023
+ document.querySelectorAll('[data-i18n-opt]').forEach(el => {
2024
+ const key = el.getAttribute('data-i18n-opt');
2025
+ if (I18N[key] !== undefined) el.textContent = I18N[key];
2026
+ });
2027
+ // th tooltips, looked up by id
2028
+ const thMap = {
2029
+ 'th-cer-diplo': 'col_cer_diplo_title',
2030
+ 'th-ligatures': 'col_ligatures_title',
2031
+ 'th-diacritics': 'col_diacritics_title',
2032
+ 'th-gini': 'col_gini_title',
2033
+ 'th-anchor': 'col_anchor_title',
2034
+ 'th-overnorm': 'col_overnorm_title',
2035
+ };
2036
+ Object.entries(thMap).forEach(([id, key]) => {
2037
+ const el = document.getElementById(id);
2038
+ if (el && I18N[key]) el.title = I18N[key];
2039
+ });
2040
+ }
2041
+
2042
+ function init() {
2043
+ // i18n
2044
+ applyI18n();
2045
+
2046
+ // Nav metadata
2047
+ const d = new Date(DATA.meta.run_date);
2048
+ const locale = I18N.date_locale || 'fr-FR';
2049
+ const fmt = d.toLocaleDateString(locale, { year:'numeric', month:'short', day:'numeric' });
2050
+ document.getElementById('nav-meta').textContent =
2051
+ DATA.meta.corpus_name + ' · ' + fmt;
2052
+ document.getElementById('footer-date').textContent =
2053
+ (I18N.footer_generated || 'Rapport généré le') + ' ' + fmt;
2054
+
2055
+ // Gallery engine selector
2056
+ const sel = document.getElementById('gallery-engine-select');
2057
+ DATA.engines.forEach(e => {
2058
+ const opt = document.createElement('option');
2059
+ opt.value = e.name; opt.textContent = e.name;
2060
+ sel.appendChild(opt);
2061
+ });
2062
+
2063
+ renderRanking();
2064
+ renderRobustMetrics();
2065
+ renderGallery();
2066
+ buildDocList();
2067
+
2068
+ // Restore state from the URL
2069
+ const { view, params } = readURLState();
2070
+ if (view && view !== 'ranking') {
2071
+ _switchView(view); // direct call so the URL is not overwritten
2072
+ if (view === 'document' && params.doc) {
2073
+ loadDocument(params.doc);
2074
+ }
2075
+ }
2076
+
2077
+ // Handle the browser back button
2078
+ window.addEventListener('popstate', () => {
2079
+ const { view: v, params: p } = readURLState();
2080
+ _switchView(v || 'ranking');
2081
+ if ((v === 'document') && p.doc) loadDocument(p.doc);
2082
+ });
2083
+ }
2084
+
2085
+ document.addEventListener('DOMContentLoaded', init);
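Review note on `applyI18n` in `_app.js` above: text nodes are replaced whenever the key exists in `I18N` (a `!== undefined` check), while `th` tooltips are replaced only when the translation is truthy. The DOM-free sketch below illustrates that difference. The helper names `resolveLabel` and `resolveTitle` and the sample dictionary are hypothetical and are not part of the commit.

```javascript
// Hypothetical helpers mirroring the two lookup rules seen in applyI18n.
// They are illustrative only and do not exist in _app.js.

// Rule used for data-i18n / data-i18n-opt: any defined value wins,
// including an empty string.
function resolveLabel(i18n, key, fallback) {
  return i18n[key] !== undefined ? i18n[key] : fallback;
}

// Rule used for th tooltips: only a truthy translation replaces the title.
function resolveTitle(i18n, key, current) {
  return i18n[key] ? i18n[key] : current;
}

// Sample dictionary (made up for this sketch, not taken from en.json).
const sampleI18n = { tab_ranking: 'Ranking', col_gini_title: '' };

const labelHit = resolveLabel(sampleI18n, 'tab_ranking', 'Classement');
const labelMiss = resolveLabel(sampleI18n, 'tab_gallery', 'Galerie');
const emptyLabel = resolveLabel(sampleI18n, 'col_gini_title', 'Gini');
const emptyTitle = resolveTitle(sampleI18n, 'col_gini_title', 'Gini');

console.log(labelHit);    // translated value is used
console.log(labelMiss);   // missing key keeps the French default
console.log(JSON.stringify(emptyLabel)); // empty translation overwrites the label
console.log(emptyTitle);  // empty translation leaves the tooltip untouched
```

Under these assumptions, an empty string in a language file would blank a label but leave a tooltip unchanged, which may be worth keeping in mind when editing `{fr,en}.json`.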
picarones/report/templates/_footer.html ADDED
@@ -0,0 +1,7 @@
+ </main>
+
+ <footer>
+ <span data-i18n="footer_by">par Picarones</span> v{{ picarones_version }}
+ — <span id="footer-date"></span>
+ </footer>
+
picarones/report/templates/_header.html ADDED
@@ -0,0 +1,28 @@
+
+ <!-- ── Navigation ─────────────────────────────────────────────────── -->
+ <nav>
+ <div class="brand">
+ Picarones
+ <span data-i18n="nav_report">| rapport OCR</span>
+ </div>
+ <div class="tabs">
+ <button class="tab-btn active" onclick="showView('ranking')" data-i18n="tab_ranking">Classement</button>
+ <button class="tab-btn" onclick="showView('gallery')" data-i18n="tab_gallery">Galerie</button>
+ <button class="tab-btn" onclick="showView('document')" data-i18n="tab_document">Document</button>
+ <button class="tab-btn" onclick="showView('characters')" data-i18n="tab_characters">Caractères</button>
+ <button class="tab-btn" onclick="showView('analyses')" data-i18n="tab_analyses">Analyses</button>
+ </div>
+ <div class="meta" id="nav-meta">—</div>
+ <button class="btn-export-csv" onclick="exportCSV()" title="⬇ CSV">⬇ CSV</button>
+ <button class="btn-present" id="btn-present" onclick="togglePresentMode()" data-i18n="btn_present">⊞ Présentation</button>
+ </nav>
+
+ <!-- ── Global exclusion banner ─────────────────────────────────────── -->
+ <div id="global-exclusion-banner" style="display:none;background:#fef3c7;border-bottom:2px solid #f59e0b;padding:.5rem 1.5rem;font-size:.85rem;font-weight:600;color:#92400e;text-align:center">
+ <span id="global-exclusion-text"></span>
+ <button onclick="resetAllExclusions()" style="margin-left:1rem;font-size:.75rem;padding:.15rem .5rem;border:1px solid #d97706;background:#fff;border-radius:.25rem;cursor:pointer">Réinitialiser</button>
+ </div>
+
+ <!-- ── Main ───────────────────────────────────────────────────────── -->
+ <main>
+
picarones/report/templates/_styles.css ADDED
@@ -0,0 +1,564 @@
1
+ /* ── Reset & base ─────────────────────────────────────────────────── */
2
+ *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
3
+ :root {
4
+ --bg: #f1f5f9;
5
+ --surface: #ffffff;
6
+ --border: #e2e8f0;
7
+ --primary: #1e40af;
8
+ --primary-lt: #dbeafe;
9
+ --text: #1e293b;
10
+ --text-muted: #64748b;
11
+ --ins: #16a34a;
12
+ --ins-bg: #dcfce7;
13
+ --del: #dc2626;
14
+ --del-bg: #fee2e2;
15
+ --rep: #c2410c;
16
+ --rep-bg: #ffedd5;
17
+ --radius: 8px;
18
+ --shadow: 0 1px 3px rgba(0,0,0,.08), 0 1px 2px rgba(0,0,0,.05);
19
+ --nav-h: 56px;
20
+ }
21
+ html { font-size: 14px; scroll-behavior: smooth; }
22
+ body {
23
+ font-family: system-ui, -apple-system, 'Segoe UI', sans-serif;
24
+ background: var(--bg);
25
+ color: var(--text);
26
+ min-height: 100vh;
27
+ }
28
+
29
+ /* ── Navigation ───────────────────────────────────────────────────── */
30
+ nav {
31
+ position: fixed; top: 0; left: 0; right: 0; z-index: 100;
32
+ height: var(--nav-h);
33
+ background: var(--primary);
34
+ display: flex; align-items: center;
35
+ padding: 0 1.5rem;
36
+ gap: 2rem;
37
+ box-shadow: 0 2px 8px rgba(0,0,0,.25);
38
+ }
39
+ nav .brand {
40
+ color: #fff; font-weight: 700; font-size: 1.1rem;
41
+ letter-spacing: -.3px; white-space: nowrap;
42
+ display: flex; align-items: center; gap: .4rem;
43
+ }
44
+ nav .brand span { opacity: .7; font-weight: 400; font-size: .85rem; }
45
+ nav .tabs {
46
+ display: flex; gap: .25rem; flex: 1;
47
+ }
48
+ .tab-btn {
49
+ background: transparent; border: none; cursor: pointer;
50
+ color: rgba(255,255,255,.7);
51
+ padding: .4rem .9rem; border-radius: 6px;
52
+ font-size: .9rem; font-weight: 500;
53
+ transition: background .15s, color .15s;
54
+ }
55
+ .tab-btn:hover { background: rgba(255,255,255,.12); color: #fff; }
56
+ .tab-btn.active { background: rgba(255,255,255,.18); color: #fff; }
57
+ nav .meta {
58
+ color: rgba(255,255,255,.6); font-size: .78rem;
59
+ white-space: nowrap; margin-left: auto;
60
+ }
61
+
62
+ /* ── Layout ───────────────────────────────────────────────────────── */
63
+ main {
64
+ margin-top: var(--nav-h);
65
+ padding: 1.5rem;
66
+ max-width: 1400px;
67
+ margin-left: auto; margin-right: auto;
68
+ }
69
+ .view { display: none; }
70
+ .view.active { display: block; }
71
+ .card {
72
+ background: var(--surface);
73
+ border-radius: var(--radius);
74
+ border: 1px solid var(--border);
75
+ box-shadow: var(--shadow);
76
+ padding: 1.25rem;
77
+ margin-bottom: 1.25rem;
78
+ }
79
+ h2 {
80
+ font-size: 1rem; font-weight: 700;
81
+ color: var(--text); margin-bottom: .75rem;
82
+ border-bottom: 2px solid var(--primary-lt);
83
+ padding-bottom: .4rem;
84
+ }
85
+ h3 { font-size: .9rem; font-weight: 600; margin-bottom: .5rem; }
86
+
87
+ /* ── Ranking table ────────────────────────────────────────────────── */
88
+ .table-wrap { overflow-x: auto; }
89
+ table {
90
+ width: 100%; border-collapse: collapse;
91
+ font-size: .88rem;
92
+ }
93
+ thead tr { background: var(--bg); }
94
+ th {
95
+ text-align: left; padding: .6rem .75rem;
96
+ border-bottom: 2px solid var(--border);
97
+ cursor: pointer; white-space: nowrap;
98
+ color: var(--text-muted); font-weight: 600; font-size: .8rem;
99
+ text-transform: uppercase; letter-spacing: .04em;
100
+ user-select: none;
101
+ }
102
+ th.sortable:hover { color: var(--primary); }
103
+ th .sort-icon { opacity: .4; margin-left: .25rem; font-style: normal; }
104
+ th.sorted .sort-icon { opacity: 1; color: var(--primary); }
105
+ td {
106
+ padding: .55rem .75rem;
107
+ border-bottom: 1px solid var(--border);
108
+ vertical-align: middle;
109
+ }
110
+ tr:last-child td { border-bottom: none; }
111
+ tbody tr:hover { background: #f8fafc; }
112
+ .rank-badge {
113
+ display: inline-flex; align-items: center; justify-content: center;
114
+ width: 1.6rem; height: 1.6rem; border-radius: 50%;
115
+ font-weight: 700; font-size: .75rem;
116
+ background: var(--primary-lt); color: var(--primary);
117
+ }
118
+ .rank-badge.rank-1 { background: #fef3c7; color: #92400e; }
119
+ .engine-name { font-weight: 600; }
120
+ .engine-version { color: var(--text-muted); font-size: .78rem; margin-left: .3rem; }
121
+ .cer-badge {
122
+ display: inline-block;
123
+ padding: .15rem .5rem; border-radius: 4px;
124
+ font-weight: 600; font-size: .82rem;
125
+ }
126
+ .bar {
127
+ display: inline-block; height: 8px; border-radius: 4px;
128
+ vertical-align: middle; margin-right: .4rem;
129
+ }
130
+
131
+ /* ── Gallery ──────────────────────────────────────────────────────── */
132
+ /* Robust metrics controls */
133
+ .robust-controls {
134
+ display: flex; flex-wrap: wrap; gap: 1.5rem; margin-bottom: .75rem;
135
+ }
136
+ .robust-controls label {
137
+ display: flex; align-items: center; gap: .4rem;
138
+ font-size: .82rem; color: var(--text-muted);
139
+ transition: opacity .15s;
140
+ }
141
+ .robust-controls label.criterion-off { opacity: .4; }
142
+ .robust-controls input[type=range] { width: 140px; }
143
+ .slider-val {
144
+ font-weight: 700; color: var(--text); min-width: 2.5rem;
145
+ }
146
+ .robust-toggle {
147
+ cursor: pointer; border: 1px solid; border-radius: .25rem;
148
+ padding: 0 .3rem; font-size: .8rem; font-weight: 700;
149
+ line-height: 1.6; background: none; flex-shrink: 0;
150
+ }
151
+ .robust-toggle[data-active="true"] { color: #16a34a; border-color: #16a34a; }
152
+ .robust-toggle[data-active="false"] { color: var(--text-muted); border-color: var(--border); }
153
+ .robust-table td { padding: .4rem .6rem; font-size: .85rem; }
154
+ .robust-table .improved { color: #16a34a; font-weight: 600; }
155
+ .robust-table .worsened { color: #dc2626; font-weight: 600; }
156
+
157
+ .gallery-controls {
158
+ display: flex; align-items: center; gap: .75rem;
159
+ margin-bottom: 1rem; flex-wrap: wrap;
160
+ }
161
+ .gallery-controls label { font-size: .82rem; color: var(--text-muted); }
162
+ .gallery-controls input[type=range] { width: 120px; }
163
+ .gallery-grid {
164
+ display: grid;
165
+ grid-template-columns: repeat(auto-fill, minmax(200px, 1fr));
166
+ gap: 1rem;
167
+ }
168
+ .gallery-card {
169
+ background: var(--surface);
170
+ border: 1px solid var(--border);
171
+ border-radius: var(--radius);
172
+ overflow: hidden;
173
+ cursor: pointer;
174
+ position: relative;
175
+ transition: transform .15s, box-shadow .15s;
176
+ }
177
+ .gallery-card:hover {
178
+ transform: translateY(-2px);
179
+ box-shadow: 0 4px 12px rgba(0,0,0,.12);
180
+ border-color: var(--primary);
181
+ }
182
+ .gallery-card img, .gallery-card .img-placeholder {
183
+ width: 100%; aspect-ratio: 4/3; object-fit: cover;
184
+ display: block; background: #e8e0d4;
185
+ }
186
+ .img-placeholder {
187
+ display: flex; align-items: center; justify-content: center;
188
+ font-size: 2rem; color: #94a3b8;
189
+ }
190
+ .gallery-card-body {
191
+ padding: .6rem .75rem;
192
+ }
193
+ .gallery-card-title {
194
+ font-size: .8rem; font-weight: 600; margin-bottom: .35rem;
195
+ white-space: nowrap; overflow: hidden; text-overflow: ellipsis;
196
+ }
197
+ .gallery-card-badges {
198
+ display: flex; gap: .3rem; flex-wrap: wrap;
199
+ }
200
+ .engine-cer-badge {
201
+ font-size: .7rem; font-weight: 700;
202
+ padding: .1rem .35rem; border-radius: 3px;
203
+ }
204
+
205
+ /* ── Document detail ──────────────────────────────────────────────── */
206
+ .doc-layout {
207
+ display: grid;
208
+ grid-template-columns: 220px 1fr;
209
+ gap: 1rem;
210
+ align-items: start;
211
+ }
212
+ @media (max-width: 768px) {
213
+ .doc-layout { grid-template-columns: 1fr; }
214
+ }
215
+ .doc-sidebar {
216
+ background: var(--surface);
217
+ border: 1px solid var(--border);
218
+ border-radius: var(--radius);
219
+ max-height: calc(100vh - var(--nav-h) - 3rem);
220
+ overflow-y: auto;
221
+ position: sticky;
222
+ top: calc(var(--nav-h) + 1.5rem);
223
+ }
224
+ .doc-sidebar-header {
225
+ padding: .6rem .75rem;
226
+ font-size: .8rem; font-weight: 700; color: var(--text-muted);
227
+ text-transform: uppercase; letter-spacing: .05em;
228
+ border-bottom: 1px solid var(--border);
229
+ position: sticky; top: 0; background: var(--surface);
230
+ }
231
+ .doc-list-item {
232
+ padding: .5rem .75rem;
233
+ cursor: pointer;
234
+ border-bottom: 1px solid var(--border);
235
+ display: flex; align-items: center; justify-content: space-between;
236
+ gap: .5rem;
237
+ transition: background .1s;
238
+ }
239
+ .doc-list-item:last-child { border-bottom: none; }
240
+ .doc-list-item:hover { background: var(--bg); }
241
+ .doc-list-item.active { background: var(--primary-lt); }
242
+ .doc-list-label { font-size: .82rem; font-weight: 500; }
243
+ .doc-list-cer {
244
+ font-size: .72rem; font-weight: 700;
245
+ padding: .1rem .3rem; border-radius: 3px;
246
+ flex-shrink: 0;
247
+ }
248
+
249
+ /* Image zone */
250
+ .doc-image-wrap {
251
+ position: relative; overflow: hidden;
252
+ border: 1px solid var(--border); border-radius: var(--radius);
253
+ background: #e8e0d4; cursor: zoom-in;
254
+ aspect-ratio: 4/3;
255
+ }
256
+ .doc-image-wrap img {
257
+ width: 100%; height: 100%; object-fit: contain;
258
+ transform-origin: center center;
259
+ transition: transform .2s;
260
+ user-select: none;
261
+ }
262
+ .doc-image-placeholder {
263
+ width: 100%; height: 100%;
264
+ display: flex; align-items: center; justify-content: center;
265
+ flex-direction: column; gap: .5rem; color: #94a3b8;
266
+ font-size: .9rem;
267
+ }
268
+ .zoom-controls {
269
+ position: absolute; bottom: .5rem; right: .5rem;
270
+ display: flex; gap: .3rem;
271
+ }
272
+ .zoom-btn {
273
+ background: rgba(0,0,0,.5); color: #fff;
274
+ border: none; border-radius: 4px; cursor: pointer;
275
+ width: 28px; height: 28px; font-size: .9rem;
276
+ display: flex; align-items: center; justify-content: center;
277
+ transition: background .1s;
278
+ }
279
+ .zoom-btn:hover { background: rgba(0,0,0,.75); }
280
+
281
+ /* Diff panels */
282
+ .diff-panels {
283
+ display: grid;
284
+ grid-template-columns: repeat(auto-fit, minmax(300px, 1fr));
285
+ gap: .75rem;
286
+ margin-top: .75rem;
287
+ }
288
+ .diff-panel {
289
+ background: var(--surface);
290
+ border: 1px solid var(--border);
291
+ border-radius: var(--radius);
292
+ overflow: hidden;
293
+ }
294
+ .diff-panel-header {
295
+ padding: .5rem .75rem;
296
+ background: var(--bg);
297
+ border-bottom: 1px solid var(--border);
298
+ display: flex; align-items: center; justify-content: space-between;
299
+ }
300
+ .diff-panel-title { font-size: .83rem; font-weight: 700; }
301
+ .diff-panel-metrics {
302
+ display: flex; gap: .4rem;
303
+ font-size: .72rem;
304
+ }
305
+ .diff-panel-body {
306
+ padding: .75rem; font-size: .82rem; line-height: 1.7;
307
+ font-family: 'Georgia', serif;
308
+ max-height: 260px; overflow-y: auto;
309
+ }
310
+ /* Diff spans */
311
+ .d-eq { color: var(--text); }
312
+ .d-ins { color: var(--ins); background: var(--ins-bg); border-radius: 2px; padding: 0 1px; }
313
+ .d-del { color: var(--del); background: var(--del-bg); border-radius: 2px; padding: 0 1px; text-decoration: line-through; }
314
+ .d-rep-old { color: var(--del); background: var(--del-bg); border-radius: 2px 0 0 2px; padding: 0 1px; text-decoration: line-through; }
315
+ .d-rep-new { color: var(--rep); background: var(--rep-bg); border-radius: 0 2px 2px 0; padding: 0 1px; }
316
+
317
+ /* Side-by-side diff */
318
+ .sbs-header {
319
+ display: flex; align-items: center; justify-content: space-between;
320
+ flex-wrap: wrap; gap: .5rem; margin-bottom: .75rem;
321
+ }
322
+ .sbs-engine-select {
323
+ display: flex; align-items: center; gap: .4rem; font-size: .82rem;
324
+ }
325
+ .sbs-engine-select select {
326
+ border: 1px solid var(--border); border-radius: 4px;
327
+ padding: .2rem .4rem; font-size: .82rem; background: var(--surface);
328
+ }
329
+ .sbs-columns {
330
+ display: grid; grid-template-columns: 1fr 1fr; gap: .75rem;
331
+ }
332
+ @media (max-width: 700px) {
333
+ .sbs-columns { grid-template-columns: 1fr; }
334
+ }
335
+ .sbs-col {
336
+ border: 1px solid var(--border); border-radius: var(--radius); overflow: hidden;
337
+ }
338
+ .sbs-col-header {
339
+ padding: .45rem .75rem;
340
+ display: flex; align-items: center; justify-content: space-between; gap: .5rem;
341
+ font-size: .83rem; font-weight: 700;
342
+ }
343
+ .sbs-gt-header {
344
+ background: #f0fdf4; border-bottom: 1px solid #bbf7d0; color: #15803d;
345
+ }
346
+ .sbs-ocr-header {
347
+ background: #eff6ff; border-bottom: 1px solid #bfdbfe; color: #1d4ed8;
348
+ }
349
+ .sbs-col-body {
350
+ padding: .75rem; font-size: .82rem; line-height: 1.8;
351
+ font-family: 'Georgia', serif;
352
+ max-height: 340px; overflow-y: auto;
353
+ color: var(--text); white-space: pre-wrap; word-break: break-word;
354
+ }
355
+ /* Missing characters in GT (orange) */
356
+ .d-miss { color: #92400e; background: #fef3c7; border-radius: 2px; padding: 0 1px; }
357
+ /* Erroneous characters in OCR (red) */
358
+ .d-err { color: var(--del); background: var(--del-bg); border-radius: 2px; padding: 0 1px; }
359
+ /* Insertions in OCR (green) */
360
+ .d-ins-ocr { color: var(--ins); background: var(--ins-bg); border-radius: 2px; padding: 0 1px; }
361
+
362
+ /* ── Analyses ─────────────────────────────────────────────────────── */
363
+ .charts-grid {
364
+ display: grid;
365
+ grid-template-columns: repeat(auto-fit, minmax(380px, 1fr));
366
+ gap: 1rem;
367
+ }
368
+ .chart-card {
369
+ background: var(--surface);
370
+ border: 1px solid var(--border);
371
+ border-radius: var(--radius);
372
+ padding: 1rem;
373
+ }
374
+ .chart-canvas-wrap { position: relative; height: 280px; }
375
+
376
+ /* ── Pipeline badges ──────────────────────────────────────────────── */
377
+ .pipeline-tag {
378
+ display: inline-flex; align-items: center; gap: .25rem;
379
+ padding: .12rem .38rem;
380
+ border-radius: 4px; font-size: .67rem; font-weight: 700;
381
+ background: #ede9fe; color: #6d28d9;
382
+ letter-spacing: .02em; vertical-align: middle;
383
+ }
384
+ .pipeline-tag .pipe-arrow { opacity: .7; }
385
+ .over-norm-badge {
386
+ display: inline-block; padding: .12rem .38rem;
387
+ border-radius: 4px; font-size: .67rem; font-weight: 700;
388
+ background: #fef3c7; color: #b45309;
389
+ }
390
+ .over-norm-badge.high { background: #fee2e2; color: #b91c1c; }
391
+ /* Triple-diff view (pipeline) */
392
+ .triple-diff-wrap {
393
+ display: grid; grid-template-columns: 1fr 1fr; gap: .5rem;
394
+ margin-top: .5rem;
395
+ }
396
+ .triple-diff-section { background: var(--bg); border-radius: 6px; padding: .5rem; }
397
+ .triple-diff-section h5 {
398
+ font-size: .73rem; font-weight: 700; color: var(--text-muted);
399
+ margin-bottom: .35rem; text-transform: uppercase; letter-spacing: .04em;
400
+ }
401
+ .pipeline-steps {
402
+ display: flex; align-items: center; gap: .3rem; flex-wrap: wrap;
403
+ margin-top: .25rem;
404
+ }
405
+ .step-chip {
406
+ padding: .12rem .4rem; border-radius: 4px; font-size: .68rem; font-weight: 600;
407
+ }
408
+ .step-chip.ocr { background: #e0f2fe; color: #0369a1; }
409
+ .step-chip.llm { background: #ede9fe; color: #6d28d9; }
410
+ .step-arrow { color: var(--text-muted); font-size: .8rem; }
411
+
412
+ /* ── Misc ─────────────────────────────────────────────────────────── */
413
+ .badge {
414
+ display: inline-block; padding: .15rem .45rem;
415
+ border-radius: 4px; font-size: .72rem; font-weight: 700;
416
+ }
417
+ .pill {
418
+ display: inline-block; padding: .1rem .4rem;
419
+ border-radius: 12px; font-size: .72rem;
420
+ background: var(--primary-lt); color: var(--primary);
421
+ }
422
+ .empty-state {
423
+ text-align: center; padding: 3rem 1rem;
424
+ color: var(--text-muted); font-size: .9rem;
425
+ }
426
+ .legend-dot {
427
+ display: inline-block; width: 8px; height: 8px;
428
+ border-radius: 50%; margin-right: .3rem;
429
+ }
430
+ .legend-row {
431
+ display: flex; align-items: center; gap: .4rem;
432
+ font-size: .78rem; color: var(--text-muted);
433
+ }
434
+ footer {
435
+ text-align: center; padding: 1.5rem;
436
+ color: var(--text-muted); font-size: .75rem;
437
+ border-top: 1px solid var(--border); margin-top: 2rem;
438
+ }
439
+ .stat-row {
440
+ display: flex; gap: 1.5rem; flex-wrap: wrap; margin-bottom: .75rem;
441
+ }
442
+ .stat {
443
+ background: var(--bg); border-radius: 6px; padding: .4rem .75rem;
444
+ font-size: .8rem;
445
+ }
446
+ .stat b { color: var(--primary); }
447
+
448
+ /* ── Difficulty badge ─────────────────────────────────────────── */
449
+ .diff-badge {
450
+ display: inline-flex; align-items: center; gap: .2rem;
451
+ padding: .1rem .4rem; border-radius: 4px;
452
+ font-size: .7rem; font-weight: 700;
453
+ }
454
+
455
+ /* ── Presentation mode ────────────────────────────────────────── */
456
+ .btn-present {
457
+ background: rgba(255,255,255,.15); border: 1px solid rgba(255,255,255,.3);
458
+ color: #fff; padding: .3rem .7rem; border-radius: 6px;
459
+ font-size: .8rem; font-weight: 600; cursor: pointer;
460
+ transition: background .15s;
461
+ white-space: nowrap;
462
+ }
463
+ .btn-present:hover { background: rgba(255,255,255,.28); }
464
+ .btn-present.active { background: rgba(255,255,255,.35); }
465
+ .btn-export-csv {
466
+ background: rgba(255,255,255,.12); border: 1px solid rgba(255,255,255,.25);
467
+ color: rgba(255,255,255,.85); padding: .3rem .7rem; border-radius: 6px;
468
+ font-size: .8rem; font-weight: 600; cursor: pointer;
469
+ transition: background .15s; white-space: nowrap;
470
+ }
471
+ .btn-export-csv:hover { background: rgba(255,255,255,.22); color:#fff; }
472
+ body.present-mode .technical { display: none !important; }
473
+ body.present-mode .chart-card { page-break-inside: avoid; }
474
+ body.present-mode nav .meta { display: none; }
475
+
476
+ /* ── Cluster cards ─────────────────────────────────────────────── */
477
+ .cluster-grid {
478
+ display: grid;
479
+ grid-template-columns: repeat(auto-fill, minmax(240px, 1fr));
480
+ gap: .75rem; margin-top: .75rem;
481
+ }
482
+ .cluster-card {
483
+ background: var(--bg); border: 1px solid var(--border);
484
+ border-radius: var(--radius); padding: .75rem;
485
+ }
486
+ .cluster-label { font-weight: 700; font-size: .88rem; color: var(--primary); margin-bottom: .3rem; }
487
+ .cluster-count { font-size: .75rem; color: var(--text-muted); margin-bottom: .5rem; }
488
+ .cluster-examples {
489
+ display: flex; flex-direction: column; gap: .2rem;
490
+ }
491
+ .cluster-ex {
492
+ font-family: monospace; font-size: .78rem;
493
+ background: var(--surface); border-radius: 3px; padding: .15rem .35rem;
494
+ display: flex; align-items: center; gap: .35rem; color: var(--text-muted);
495
+ }
496
+ .cluster-ex .ex-old { color: var(--del); background: var(--del-bg); border-radius: 2px; padding: 0 3px; }
497
+ .cluster-ex .ex-new { color: var(--rep); background: var(--rep-bg); border-radius: 2px; padding: 0 3px; }
498
+
499
+ /* ── Statistical tests table ─────────────────────────────────────*/
500
+ .stat-sig { color: #dc2626; font-weight: 700; }
501
+ .stat-ns { color: #64748b; }
502
+
503
+ /* ── Venn diagram ────────────────────────────────────────────────*/
504
+ .venn-wrap { display: flex; justify-content: center; padding: 1rem; }
505
+
506
+ /* ── Correlation matrix ──────────────────────────────────────────*/
507
+ .corr-table { border-collapse: collapse; font-size: .8rem; margin: .5rem auto; }
508
+ .corr-table th, .corr-table td {
509
+ padding: .35rem .5rem; text-align: center; border: 1px solid var(--border);
510
+ min-width: 60px;
511
+ }
512
+ .corr-table th { background: var(--bg); font-weight: 600; font-size: .75rem; }
513
+
514
+ /* ── Sprint 10 — error heatmap ───────────────────────────────────*/
515
+ .heatmap-wrap {
516
+ display: flex; gap: 3px; align-items: flex-end;
517
+ height: 60px; margin: .5rem 0;
518
+ }
519
+ .heatmap-bar {
520
+ flex: 1; border-radius: 3px 3px 0 0;
521
+ min-height: 4px;
522
+ transition: opacity .15s;
523
+ }
524
+ .heatmap-bar:hover { opacity: .75; }
525
+ .heatmap-labels {
526
+ display: flex; justify-content: space-between;
527
+ font-size: .65rem; color: var(--text-muted); margin-top: .15rem;
528
+ }
529
+
530
+ /* ── Sprint 10 — hallucination badge ─────────────────────────────*/
531
+ .hallucination-badge {
532
+ display: inline-flex; align-items: center; gap: .25rem;
533
+ padding: .15rem .45rem; border-radius: 4px;
534
+ font-size: .72rem; font-weight: 700;
535
+ background: #fce7f3; color: #9d174d;
536
+ border: 1px solid #fbcfe8;
537
+ }
538
+ .hallucination-badge.ok {
539
+ background: #f0fdf4; color: #15803d;
540
+ border-color: #bbf7d0;
541
+ }
542
+
543
+ /* ── Sprint 10 — hallucinated block ──────────────────────────────*/
544
+ .halluc-block {
545
+ background: #fce7f3; border: 1px solid #f9a8d4;
546
+ border-radius: 4px; padding: .35rem .6rem;
547
+ margin: .25rem 0; font-size: .78rem;
548
+ font-family: 'Georgia', serif; color: #9d174d;
549
+ }
550
+ .halluc-block-meta {
551
+ font-size: .65rem; color: #be185d; font-family: system-ui, sans-serif;
552
+ margin-bottom: .15rem; font-weight: 600;
553
+ }
554
+
555
+ /* ── Sprint 10 — percentile bars ─────────────────────────────────*/
556
+ .pct-bars { display: flex; flex-direction: column; gap: .25rem; margin: .4rem 0; }
557
+ .pct-bar-row { display: flex; align-items: center; gap: .4rem; font-size: .72rem; }
558
+ .pct-bar-label { width: 2.5rem; color: var(--text-muted); text-align: right; flex-shrink: 0; }
559
+ .pct-bar-track {
560
+ flex: 1; height: 8px; background: var(--bg);
561
+ border-radius: 4px; overflow: hidden;
562
+ }
563
+ .pct-bar-fill { height: 100%; border-radius: 4px; }
564
+ .pct-bar-val { width: 3rem; color: var(--text); font-weight: 600; }
picarones/report/templates/base.html.j2 ADDED
@@ -0,0 +1,43 @@
+ <!DOCTYPE html>
+ <html lang="{{ html_lang }}">
+ <head>
+ <meta charset="UTF-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <title>Picarones — {{ corpus_name }}</title>
+
+ <!-- Chart.js (vendored inline) -->
+ <script>{{ chartjs_inline | safe }}</script>
+
+ <style>
+ {% include '_styles.css' %}
+ </style>
+ </head>
+
+ <body>
+
+ {% include '_header.html' %}
+
+ {% include 'view_ranking.html' %}
+
+ {% include 'view_gallery.html' %}
+
+ {% include 'view_document.html' %}
+
+ {% include 'view_analyses.html' %}
+
+ {% include 'view_characters.html' %}
+
+ {% include '_footer.html' %}
+
+ <!-- ── Embedded data ───────────────────────────────────────────────── -->
+ <script>
+ const DATA = {{ report_data_json | safe }};
+ const I18N = {{ i18n_json | safe }};
+ </script>
+
+ <!-- ── Application ────────────────────────────────────────────────── -->
+ <script>
+ {% include '_app.js' %}
+ </script>
+ </body>
+ </html>
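Because `base.html.j2` injects `report_data_json` and `i18n_json` into an inline `<script>` through `| safe`, the strings passed in must already be safe to embed: a literal `</script>` inside a document text would otherwise end the script element early. The following is a minimal sketch of one way to prepare such a string on the Python side. The helper name `to_script_safe_json` is hypothetical, and this diff does not show whether `generator.py` already does something equivalent (Jinja2's built-in `tojson` filter is another option).

```python
import json


def to_script_safe_json(data) -> str:
    """Serialize `data` for embedding inside an inline <script> element.

    Hypothetical helper, not taken from the commit: it escapes the
    characters that could close the script element early or break
    JavaScript parsing of the literal.
    """
    # sort_keys keeps the output stable across runs for identical data.
    text = json.dumps(data, ensure_ascii=False, sort_keys=True)
    return (
        text.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
        .replace("\u2028", "\\u2028")
        .replace("\u2029", "\\u2029")
    )


# Illustrative payload containing a string that would break a naive embed.
payload = {"corpus": "demo </script><b>", "n_docs": 3}
embedded = to_script_safe_json(payload)
```

The escaped string is still valid JSON, so `JSON.parse` in the browser or `json.loads` in Python recovers the original values unchanged.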
picarones/report/templates/view_analyses.html ADDED
@@ -0,0 +1,146 @@
+
+ <!-- ════ View 4: Analyses ══════════════════════════════════════════ -->
+ <div id="view-analyses" class="view">
+   <div class="charts-grid">
+
+     <div class="chart-card">
+       <h3 data-i18n="h_cer_dist">Distribution du CER par moteur</h3>
+       <div class="chart-canvas-wrap">
+         <canvas id="chart-cer-hist"></canvas>
+       </div>
+     </div>
+
+     <div class="chart-card">
+       <h3 data-i18n="h_radar">Profil des moteurs (radar)</h3>
+       <div class="chart-canvas-wrap">
+         <canvas id="chart-radar"></canvas>
+       </div>
+       <div style="font-size:.72rem;color:var(--text-muted);margin-top:.5rem" data-i18n="radar_note">
+         Axe radar : CER, WER, MER, WIL — valeurs inversées (plus c'est haut, meilleur est le moteur).
+       </div>
+     </div>
+
+     <div class="chart-card">
+       <h3 data-i18n="h_cer_doc">CER par document (tous moteurs)</h3>
+       <div class="chart-canvas-wrap">
+         <canvas id="chart-cer-doc"></canvas>
+       </div>
+     </div>
+
+     <div class="chart-card">
+       <h3 data-i18n="h_duration">Temps d'exécution moyen (secondes/document)</h3>
+       <div class="chart-canvas-wrap">
+         <canvas id="chart-duration"></canvas>
+       </div>
+     </div>
+
+     <div class="chart-card">
+       <h3 data-i18n="h_quality_cer">Qualité image ↔ CER (scatter plot)</h3>
+       <div class="chart-canvas-wrap">
+         <canvas id="chart-quality-cer"></canvas>
+       </div>
+       <div style="font-size:.72rem;color:var(--text-muted);margin-top:.4rem" data-i18n="quality_cer_note">
+         Chaque point = un document. Axe X = score qualité image [0–1]. Axe Y = CER. Corrélation négative attendue.
+       </div>
+     </div>
+
+     <div class="chart-card" style="grid-column:1/-1">
+       <h3 data-i18n="h_taxonomy">Taxonomie des erreurs par moteur</h3>
+       <div class="chart-canvas-wrap" style="max-height:300px">
+         <canvas id="chart-taxonomy"></canvas>
+       </div>
+       <div style="font-size:.72rem;color:var(--text-muted);margin-top:.4rem" data-i18n="taxonomy_note">
+         Distribution des classes d'erreurs (classes 1–9 de la taxonomie Picarones).
+       </div>
+     </div>
+
+     <!-- Sprint 7: reliability curve -->
+     <div class="chart-card" style="grid-column:1/-1">
+       <h3 data-i18n="h_reliability">Courbes de fiabilité</h3>
+       <div class="chart-canvas-wrap" style="max-height:300px">
+         <canvas id="chart-reliability"></canvas>
+       </div>
+       <div style="font-size:.72rem;color:var(--text-muted);margin-top:.4rem" data-i18n="reliability_note">
+         Pour les X% documents les plus faciles (triés par CER croissant), quel est le CER moyen cumulé ?
+         Une courbe basse = moteur performant même sur les documents faciles.
+       </div>
+     </div>
+
+     <!-- Sprint 7: confidence intervals -->
+     <div class="chart-card">
+       <h3 data-i18n="h_bootstrap">Intervalles de confiance à 95 % (bootstrap)</h3>
+       <div class="chart-canvas-wrap">
+         <canvas id="chart-bootstrap-ci"></canvas>
+       </div>
+       <div style="font-size:.72rem;color:var(--text-muted);margin-top:.4rem" data-i18n="bootstrap_note">
+         IC à 95% sur le CER moyen par moteur (1000 itérations bootstrap).
+       </div>
+     </div>
+
+     <!-- Sprint 7: Venn diagram -->
+     <div class="chart-card">
+       <h3 data-i18n="h_venn">Erreurs communes / exclusives (Venn)</h3>
+       <div id="venn-container" style="min-height:260px;display:flex;align-items:center;justify-content:center"></div>
+       <div style="font-size:.72rem;color:var(--text-muted);margin-top:.4rem" data-i18n="venn_note">
+         Intersection des ensembles d'erreurs entre les 2 ou 3 premiers concurrents.
+         Erreurs communes = segments partagés.
+       </div>
+     </div>
+
+     <!-- Sprint 7: Wilcoxon tests -->
+     <div class="chart-card technical">
+       <h3 data-i18n="h_pairwise">Tests de Wilcoxon — comparaisons par paires</h3>
+       <div id="wilcoxon-table-container" style="overflow-x:auto"></div>
+       <div style="font-size:.72rem;color:var(--text-muted);margin-top:.4rem" data-i18n="pairwise_note">
+         Test signé-rangé de Wilcoxon (non-paramétrique). Seuil α = 0.05.
+       </div>
+     </div>
+
+     <!-- Sprint 7: error clustering -->
+     <div class="chart-card" style="grid-column:1/-1">
+       <h3 data-i18n="h_clusters">Clustering des patterns d'erreurs</h3>
+       <div id="error-clusters-container"></div>
+     </div>
+
+     <!-- Sprint 10: scatter plot, Gini vs mean CER -->
+     <div class="chart-card">
+       <h3 data-i18n="h_gini_cer">Gini vs CER moyen <span style="font-size:.72rem;font-weight:400;color:var(--text-muted)" data-i18n="gini_cer_ideal">— idéal : bas-gauche</span></h3>
+       <div class="chart-canvas-wrap">
+         <canvas id="chart-gini-cer"></canvas>
+       </div>
+       <div style="font-size:.72rem;color:var(--text-muted);margin-top:.4rem" data-i18n="gini_cer_note">
+         Axe X = CER moyen, Axe Y = coefficient de Gini. Un moteur idéal a CER bas ET Gini bas (erreurs rares et uniformes).
+       </div>
+     </div>
+
+     <!-- Sprint 10: scatter plot, length ratio vs anchoring -->
+     <div class="chart-card">
+       <h3 data-i18n="h_ratio_anchor">Ratio longueur vs ancrage <span style="font-size:.72rem;font-weight:400;color:var(--text-muted)" data-i18n="ratio_anchor_subtitle">— hallucinations VLM</span></h3>
+       <div class="chart-canvas-wrap">
+         <canvas id="chart-ratio-anchor"></canvas>
+       </div>
+       <div style="font-size:.72rem;color:var(--text-muted);margin-top:.4rem" data-i18n="ratio_anchor_note">
+         Axe X = score d'ancrage trigrammes [0–1]. Axe Y = ratio longueur sortie/GT.
+         Zone ⚠️ : ancrage &lt; 0.5 ou ratio &gt; 1.2 → hallucinations probables.
+       </div>
+     </div>
+
+     <!-- Sprint 7: correlation matrix -->
+     <div class="chart-card technical" style="grid-column:1/-1">
+       <h3 data-i18n="h_correlation">Matrice de corrélation entre métriques</h3>
+       <div style="margin-bottom:.5rem">
+         <label style="font-size:.82rem;font-weight:600"><span data-i18n="corr_engine_label">Moteur :</span>
+           <select id="corr-engine-select" onchange="renderCorrelationMatrix()"
+                   style="padding:.25rem .5rem;border-radius:6px;border:1px solid var(--border);margin-left:.25rem"></select>
+         </label>
+       </div>
+       <div id="corr-matrix-container" style="overflow-x:auto"></div>
+       <div style="font-size:.72rem;color:var(--text-muted);margin-top:.4rem" data-i18n="corr_note">
+         Coefficient de Pearson entre les métriques CER, WER, qualité image, ligatures, diacritiques.
+         Vert = corrélation positive, Rouge = corrélation négative.
+       </div>
+     </div>
+
+   </div>
+ </div>
+
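The reliability note in `view_analyses.html` describes the curve as the cumulative mean CER over the X% easiest documents, sorted by ascending CER. As a rough illustration of that definition only (the actual computation lives elsewhere in Picarones and is not part of this diff), a sketch in Python could look like this; the function name `reliability_curve` is an assumption.

```python
from itertools import accumulate


def reliability_curve(cer_values):
    """Return (percent_of_docs, cumulative_mean_cer) points.

    Documents are sorted by ascending CER, so point k is the mean CER of
    the k easiest documents. Hypothetical helper for illustration.
    """
    ordered = sorted(cer_values)
    n = len(ordered)
    running_sums = accumulate(ordered)
    return [
        (100.0 * k / n, total / k)
        for k, total in enumerate(running_sums, start=1)
    ]


# Four documents with illustrative per-document CER values.
points = reliability_curve([0.30, 0.02, 0.10, 0.06])
```

By construction the curve never decreases: each added document has a CER at least as high as the running mean, so a curve that stays low across the whole range indicates an engine that holds up on the hardest documents as well.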
picarones/report/templates/view_characters.html ADDED
@@ -0,0 +1,33 @@
+ <!-- ════ View 5: Characters ════════════════════════════════════════ -->
+ <div id="view-characters" class="view">
+   <div class="card">
+     <h2 data-i18n="h_characters">Analyse des caractères</h2>
+
+     <!-- Engine selector -->
+     <div class="stat-row" style="margin-bottom:1rem">
+       <label for="char-engine-select" style="font-weight:600;margin-right:.5rem" data-i18n="char_engine_label">Moteur :</label>
+       <select id="char-engine-select" onchange="renderCharView()"
+               style="padding:.35rem .7rem;border-radius:6px;border:1px solid var(--border)"></select>
+     </div>
+
+     <!-- Ligature / diacritic scores -->
+     <div class="stat-row" id="char-scores-row" style="gap:1.5rem;margin-bottom:1.5rem"></div>
+
+     <!-- Unicode confusion matrix -->
+     <h3 style="margin-bottom:.75rem">Matrice de confusion unicode
+       <span style="font-size:.75rem;font-weight:400;color:var(--text-muted)">
+         — substitutions les plus fréquentes (caractère GT → caractère OCR)
+       </span>
+     </h3>
+     <div id="confusion-heatmap" style="overflow-x:auto;margin-bottom:1.5rem"></div>
+
+     <!-- Ligature detail by type -->
+     <h3 style="margin-bottom:.75rem">Reconnaissance des ligatures</h3>
+     <div id="ligature-detail" style="margin-bottom:1.5rem"></div>
+
+     <!-- Detailed taxonomy -->
+     <h3 style="margin-bottom:.75rem">Distribution taxonomique des erreurs</h3>
+     <div id="taxonomy-detail"></div>
+   </div>
+ </div>
+
picarones/report/templates/view_document.html ADDED
@@ -0,0 +1,84 @@
+
+ <!-- ════ View 3: Document ══════════════════════════════════════════ -->
+ <div id="view-document" class="view">
+   <div class="doc-layout">
+     <!-- Sidebar -->
+     <aside class="doc-sidebar">
+       <div class="doc-sidebar-header" data-i18n="doc_sidebar_header">Documents</div>
+       <div id="doc-list"></div>
+     </aside>
+
+     <!-- Main content -->
+     <div>
+       <div class="card" id="doc-detail-header">
+         <div style="display:flex; align-items:baseline; justify-content:space-between; flex-wrap:wrap; gap:.5rem">
+           <h2 id="doc-detail-title" data-i18n="doc_title_default">Sélectionner un document</h2>
+           <div class="stat-row" id="doc-detail-metrics"></div>
+         </div>
+       </div>
+
+       <!-- Zoomable image -->
+       <div class="card">
+         <h3 data-i18n="h_image">Image originale</h3>
+         <div class="doc-image-wrap" id="doc-image-wrap"
+              onwheel="handleZoom(event)"
+              onmousedown="startDrag(event)"
+              onmousemove="doDrag(event)"
+              onmouseup="endDrag()"
+              onmouseleave="endDrag()">
+           <div class="doc-image-placeholder" id="doc-image-placeholder">
+             <span style="font-size:2rem">🖼</span>
+             <span>Sélectionnez un document</span>
+           </div>
+           <img id="doc-image" src="" alt="Image du document" style="display:none">
+           <div class="zoom-controls">
+             <button class="zoom-btn" onclick="zoom(1.25)" title="Zoom +">+</button>
+             <button class="zoom-btn" onclick="zoom(0.8)" title="Zoom −">−</button>
+             <button class="zoom-btn" onclick="resetZoom()" title="Réinitialiser">↺</button>
+           </div>
+         </div>
+       </div>
+
+       <!-- Side-by-side GT / OCR diff -->
+       <div class="card" id="doc-sidebyside-card">
+         <div class="sbs-header">
+           <h3 data-i18n="h_diff">Comparaison GT / OCR</h3>
+           <div class="sbs-engine-select" id="sbs-engine-select" style="display:none">
+             <label data-i18n="sbs_engine_label">Concurrent :</label>
+             <select id="sbs-engine-dropdown" onchange="renderSideBySide(currentDocId)"></select>
+           </div>
+         </div>
+         <div class="sbs-columns" id="sbs-columns">
+           <div class="sbs-col sbs-col-gt">
+             <div class="sbs-col-header sbs-gt-header">
+               <span>✓ Vérité terrain (GT)</span>
+             </div>
+             <div class="sbs-col-body" id="sbs-gt-body">—</div>
+           </div>
+           <div class="sbs-col sbs-col-ocr">
+             <div class="sbs-col-header sbs-ocr-header" id="sbs-ocr-header">
+               <span id="sbs-ocr-engine-name">OCR</span>
+               <span class="cer-badge" id="sbs-ocr-cer" style="display:none"></span>
+             </div>
+             <div class="sbs-col-body" id="sbs-ocr-body">—</div>
+           </div>
+         </div>
+         <!-- Pipeline triple-diff (shown below when applicable) -->
+         <div id="sbs-triple-diff" style="display:none"></div>
+       </div>
+
+       <!-- Sprint 10: per-line CER distribution -->
+       <div class="card" id="doc-line-metrics-card" style="display:none">
+         <h3 data-i18n="h_line_metrics">Distribution des erreurs par ligne</h3>
+         <div id="doc-line-metrics-content"></div>
+       </div>
+
+       <!-- Sprint 10: detected hallucinations -->
+       <div class="card" id="doc-hallucination-card" style="display:none">
+         <h3 data-i18n="h_hallucination">Analyse des hallucinations</h3>
+         <div id="doc-hallucination-content"></div>
+       </div>
+     </div>
+   </div>
+ </div>
+
picarones/report/templates/view_gallery.html ADDED
@@ -0,0 +1,36 @@
+
+ <!-- ════ View 2: Gallery ═══════════════════════════════════════════ -->
+ <div id="view-gallery" class="view">
+   <div class="card">
+     <h2 data-i18n="h_gallery">Galerie des documents</h2>
+     <div class="gallery-controls">
+       <label><span data-i18n="gallery_sort_label">Trier par :</span>
+         <select id="gallery-sort" onchange="renderGallery()">
+           <option value="doc_id" data-i18n-opt="gallery_sort_id">Identifiant</option>
+           <option value="mean_cer" data-i18n-opt="gallery_sort_cer">CER moyen</option>
+           <option value="difficulty_score" data-i18n-opt="gallery_sort_difficulty">Difficulté</option>
+           <option value="best_engine" data-i18n-opt="gallery_sort_best">Meilleur moteur</option>
+         </select>
+       </label>
+       <label><span data-i18n="gallery_filter_cer_label">Filtrer CER &gt;</span>
+         <input type="number" id="gallery-filter-cer" min="0" max="100" value="0" step="1"
+                style="width:60px" onchange="renderGallery()"> %
+       </label>
+       <label><span data-i18n="gallery_filter_engine_label">Moteur :</span>
+         <select id="gallery-engine-select" onchange="renderGallery()">
+           <option value="" data-i18n-opt="gallery_filter_all">Tous</option>
+         </select>
+       </label>
+       <button class="btn-secondary" onclick="resetGalleryExclusions()" id="gallery-reset-btn"
+               title="Réinitialiser toutes les exclusions manuelles" style="display:none">
+         ↺ Réinitialiser exclusions
+       </button>
+     </div>
+     <div id="gallery-exclusion-info" style="font-size:.82rem;color:var(--text-muted);margin:.4rem 0;display:none"></div>
+     <div id="gallery-grid" class="gallery-grid"></div>
+     <div id="gallery-empty" class="empty-state" style="display:none" data-i18n="gallery_empty">
+       Aucun document ne correspond aux filtres.
+     </div>
+   </div>
+ </div>
+
picarones/report/templates/view_ranking.html ADDED
@@ -0,0 +1,86 @@
+
+ <!-- ════ View 1: Ranking ═══════════════════════════════════════════ -->
+ <div id="view-ranking" class="view active">
+   <div class="card">
+     <h2 data-i18n="h_ranking">Classement des moteurs</h2>
+     <div class="stat-row" id="ranking-stats"></div>
+     <div class="table-wrap">
+       <table id="ranking-table">
+         <thead>
+           <tr>
+             <th data-col="rank" class="sortable sorted" data-dir="asc" data-i18n="col_rank">#<i class="sort-icon">↑</i></th>
+             <th data-col="name" class="sortable" data-i18n="col_engine">Concurrent<i class="sort-icon">↕</i></th>
+             <th data-col="cer" class="sortable" data-i18n="col_cer">CER exact<i class="sort-icon">↕</i></th>
+             <th data-col="cer_diplomatic" class="sortable" id="th-cer-diplo" data-i18n="col_cer_diplo">CER diplo.<i class="sort-icon">↕</i></th>
+             <th data-col="wer" class="sortable" data-i18n="col_wer">WER<i class="sort-icon">↕</i></th>
+             <th data-col="mer" class="sortable" data-i18n="col_mer">MER<i class="sort-icon">↕</i></th>
+             <th data-col="wil" class="sortable" data-i18n="col_wil">WIL<i class="sort-icon">↕</i></th>
+             <th data-col="ligature_score" class="sortable" id="th-ligatures" data-i18n="col_ligatures">Ligatures<i class="sort-icon">↕</i></th>
+             <th data-col="diacritic_score" class="sortable" id="th-diacritics" data-i18n="col_diacritics">Diacritiques<i class="sort-icon">↕</i></th>
+             <th data-col="gini" class="sortable" id="th-gini" data-i18n="col_gini">Gini<i class="sort-icon">↕</i></th>
+             <th data-col="anchor_score" class="sortable" id="th-anchor" data-i18n="col_anchor">Ancrage<i class="sort-icon">↕</i></th>
+             <th data-i18n="col_cer_median">CER médian</th>
+             <th data-i18n="col_cer_min">CER min</th>
+             <th data-i18n="col_cer_max">CER max</th>
+             <th id="th-overnorm" data-i18n="col_overnorm">Sur-norm.</th>
+             <th data-i18n="col_docs">Docs</th>
+           </tr>
+         </thead>
+         <tbody id="ranking-tbody"></tbody>
+       </table>
+     </div>
+     <div class="stat-row" style="margin-top:.75rem">
+       <div class="legend-row">
+         <span class="legend-dot" style="background:#16a34a"></span>CER &lt; 5 %
+       </div>
+       <div class="legend-row">
+         <span class="legend-dot" style="background:#ca8a04"></span>5–15 %
+       </div>
+       <div class="legend-row">
+         <span class="legend-dot" style="background:#ea580c"></span>15–30 %
+       </div>
+       <div class="legend-row">
+         <span class="legend-dot" style="background:#dc2626"></span>&gt; 30 %
+       </div>
+     </div>
+   </div>
+
+   <!-- ── Robust metrics ────────────────────────────────────────── -->
+   <div class="card" id="robust-metrics-card">
+     <h2 data-i18n="h_robust">Analyse robuste (sans hallucinations)</h2>
+     <p style="font-size:.82rem;color:var(--text-muted);margin-bottom:.75rem" data-i18n="robust_desc">
+       Recalcule CER, WER, MER, WIL, Gini et ancrage en excluant les documents détectés comme hallucinés ou problématiques.
+       Cochez/décochez des documents dans la Galerie pour les exclure manuellement.
+     </p>
+     <div class="robust-controls">
+       <label>
+         <button class="robust-toggle" id="robust-cer-toggle" data-active="true"
+                 onclick="toggleRobustCriterion('cer',this)">✓</button>
+         <span data-i18n="robust_cer_label">CER &gt; seuil :</span>
+         <input type="range" id="robust-cer" min="0" max="100" step="1" value="100"
+                oninput="document.getElementById('robust-cer-val').textContent=parseInt(this.value)+'%';_computeHallucinationExclusions();recalculateAll()">
+         <span id="robust-cer-val" class="slider-val">100%</span>
+       </label>
+       <label>
+         <button class="robust-toggle" id="robust-anchor-toggle" data-active="true"
+                 onclick="toggleRobustCriterion('anchor',this)">✓</button>
+         <span data-i18n="robust_anchor_label">Ancrage &lt; seuil :</span>
+         <input type="range" id="robust-anchor" min="0" max="1" step="0.05" value="0.5"
+                oninput="document.getElementById('robust-anchor-val').textContent=parseFloat(this.value).toFixed(2);_computeHallucinationExclusions();recalculateAll()">
+         <span id="robust-anchor-val" class="slider-val">0.50</span>
+       </label>
+       <label>
+         <button class="robust-toggle" id="robust-ratio-toggle" data-active="true"
+                 onclick="toggleRobustCriterion('ratio',this)">✓</button>
+         <span data-i18n="robust_ratio_label">Ratio longueur &gt; seuil :</span>
+         <input type="range" id="robust-ratio" min="1" max="3" step="0.1" value="1.5"
+                oninput="document.getElementById('robust-ratio-val').textContent=parseFloat(this.value).toFixed(1);_computeHallucinationExclusions();recalculateAll()">
+         <span id="robust-ratio-val" class="slider-val">1.5</span>
+       </label>
+     </div>
+     <div id="robust-summary" style="font-size:.85rem;font-weight:600;margin:.75rem 0;padding:.5rem .75rem;background:var(--bg);border-radius:.4rem;border:1px solid var(--border)"></div>
+     <div id="robust-table-wrap" class="table-wrap"></div>
+     <div id="robust-excluded-docs" style="margin-top:.75rem;font-size:.82rem"></div>
+   </div>
+ </div>
+
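The robust-metrics card exposes three toggleable criteria (CER above a threshold, anchoring below a threshold, length ratio above a threshold) and recomputes the metrics without the excluded documents. The real logic is the JavaScript behind `_computeHallucinationExclusions()` and `recalculateAll()`, which this diff does not show; the Python sketch below only illustrates the rule as the labels and slider defaults describe it. The `Thresholds` class and the per-document keys `cer`, `anchor_score` and `length_ratio` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Slider values and toggle states (defaults mirror the template)."""
    cer_percent: float = 100.0  # exclude when CER (in %) is above this
    anchor: float = 0.5         # exclude when anchoring is below this
    ratio: float = 1.5          # exclude when output/GT length is above this
    use_cer: bool = True
    use_anchor: bool = True
    use_ratio: bool = True


def is_excluded(doc: dict, t: Thresholds) -> bool:
    """Return True if any active criterion flags the document."""
    if t.use_cer and doc["cer"] * 100 > t.cer_percent:
        return True
    if t.use_anchor and doc["anchor_score"] < t.anchor:
        return True
    if t.use_ratio and doc["length_ratio"] > t.ratio:
        return True
    return False


def robust_mean_cer(docs, t: Thresholds):
    """Mean CER over the documents that survive the exclusion rule."""
    kept = [d["cer"] for d in docs if not is_excluded(d, t)]
    return sum(kept) / len(kept) if kept else None


# Illustrative data: the second document looks hallucinated.
docs = [
    {"cer": 0.04, "anchor_score": 0.9, "length_ratio": 1.0},
    {"cer": 0.50, "anchor_score": 0.3, "length_ratio": 2.4},
    {"cer": 0.08, "anchor_score": 0.8, "length_ratio": 1.1},
]
robust = robust_mean_cer(docs, Thresholds())
```

With every criterion switched off, the function falls back to the plain mean over all documents, which matches the intent of the toggles in the card.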
pyproject.toml CHANGED
@@ -30,6 +30,7 @@ dependencies = [
      "pytesseract>=0.3.10",
      "tqdm>=4.66.0",
      "numpy>=1.24.0",
+     "jinja2>=3.1.0",
  ]
 
  [project.urls]
@@ -74,7 +75,15 @@ where = ["."]
  include = ["picarones*"]
 
  [tool.setuptools.package-data]
- picarones = ["prompts/*.txt", "web/static/*.css"]
+ picarones = [
+     "prompts/*.txt",
+     "web/static/*.css",
+     "report/templates/*.j2",
+     "report/templates/*.html",
+     "report/templates/*.css",
+     "report/templates/*.js",
+     "report/i18n/*.json",
+ ]
 
  [tool.pytest.ini_options]
  testpaths = ["tests"]
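A template file that matches none of the `package-data` patterns would be missing from an installed wheel even though it exists in the repository. The sketch below is one hedged way to check coverage of the ten templates and two i18n files listed in the commit message. It approximates setuptools' glob matching with `PurePosixPath.match`, which is an assumption that holds for these simple single-directory patterns; the function name `uncovered` is hypothetical.

```python
from pathlib import PurePosixPath

# Patterns copied from the new [tool.setuptools.package-data] entry.
PACKAGE_DATA_PATTERNS = [
    "prompts/*.txt",
    "web/static/*.css",
    "report/templates/*.j2",
    "report/templates/*.html",
    "report/templates/*.css",
    "report/templates/*.js",
    "report/i18n/*.json",
]

# Files the commit adds, relative to the `picarones` package directory.
SHIPPED_FILES = [
    "report/templates/base.html.j2",
    "report/templates/_header.html",
    "report/templates/_footer.html",
    "report/templates/_styles.css",
    "report/templates/_app.js",
    "report/templates/view_ranking.html",
    "report/templates/view_gallery.html",
    "report/templates/view_document.html",
    "report/templates/view_analyses.html",
    "report/templates/view_characters.html",
    "report/i18n/fr.json",
    "report/i18n/en.json",
]


def uncovered(files, patterns):
    """Return the files that no pattern matches."""
    return [
        f for f in files
        if not any(PurePosixPath(f).match(p) for p in patterns)
    ]


missing = uncovered(SHIPPED_FILES, PACKAGE_DATA_PATTERNS)
```

An empty `missing` list means every listed file is matched by at least one pattern; a new partial with an unlisted extension would show up here.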
tests/test_sprint17_jinja2_refactor.py ADDED
@@ -0,0 +1,211 @@
+ """Sprint 17 tests: refactor of the HTML generator into Jinja2 templates.
+
+ Goal: guarantee that splitting ``_HTML_TEMPLATE`` (3100 monolithic lines)
+ into separate templates (``base.html.j2`` + 9 partials) did not alter the
+ report output. After this sprint, any future change must preserve these
+ invariants.
+ """
+
+ from __future__ import annotations
+
+ import hashlib
+ import json
+ import re
+ from pathlib import Path
+
+ import pytest
+
+ from picarones import fixtures
+ from picarones.report.generator import (
+     ReportGenerator,
+     _build_jinja_env,
+     _TEMPLATES_DIR,
+ )
+
+
+ # ---------------------------------------------------------------------------
+ # Expected file structure
+ # ---------------------------------------------------------------------------
+
+ EXPECTED_TEMPLATE_FILES = {
+     "base.html.j2",
+     "_header.html",
+     "_footer.html",
+     "_styles.css",
+     "_app.js",
+     "view_ranking.html",
+     "view_gallery.html",
+     "view_document.html",
+     "view_analyses.html",
+     "view_characters.html",
+ }
+
+
+ class TestTemplateStructure:
+     def test_all_expected_template_files_exist(self):
+         present = {p.name for p in _TEMPLATES_DIR.iterdir() if p.is_file()}
+         missing = EXPECTED_TEMPLATE_FILES - present
+         assert not missing, f"Missing templates: {missing}"
+
+     def test_jinja_env_can_load_base_template(self):
+         env = _build_jinja_env()
+         tpl = env.get_template("base.html.j2")
+         assert tpl is not None
+
+     def test_no_dangling_format_placeholders_in_templates(self):
+         """No .format()-style {placeholder} should linger; everything must use
+         Jinja2 {{ variable }} syntax."""
+         suspicious_pattern = re.compile(r"(?<!\{)\{[a-z_]+\}(?!\})")
+         for tpl_file in _TEMPLATES_DIR.iterdir():
+             if tpl_file.suffix in (".html", ".j2", ".css"):
+                 content = tpl_file.read_text(encoding="utf-8")
+                 matches = suspicious_pattern.findall(content)
+                 assert not matches, (
+                     f"{tpl_file.name} contains .format()-style placeholders: {matches}"
+                 )
+
+
+ # ---------------------------------------------------------------------------
+ # Report generation and validity
+ # ---------------------------------------------------------------------------
+
+ @pytest.fixture(scope="module")
+ def benchmark_result():
+     return fixtures.generate_sample_benchmark(n_docs=3)
+
+
+ class TestReportGeneration:
+     def test_generate_produces_file(self, benchmark_result, tmp_path):
+         out = tmp_path / "rapport.html"
+         gen = ReportGenerator(benchmark_result)
+         result_path = gen.generate(out)
+         assert result_path.exists()
+         assert result_path.stat().st_size > 10_000  # inline Chart.js alone exceeds this
+
+     def test_report_contains_expected_markers(self, benchmark_result, tmp_path):
+         out = tmp_path / "rapport.html"
+         ReportGenerator(benchmark_result).generate(out)
+         html = out.read_text(encoding="utf-8")
+
+         # Expected HTML structure
+         assert "<!DOCTYPE html>" in html
+         assert "<html lang=\"fr\">" in html
+         assert "Picarones" in html
+         # All 5 views must be present
+         for view in ("view-ranking", "view-gallery", "view-document",
+                      "view-analyses", "view-characters"):
+             assert f'id="{view}"' in html, f"View '{view}' missing from the report"
+         # Embedded data
+         assert "const DATA =" in html
+         assert "const I18N =" in html
+         # Inline Chart.js
+         assert "Chart.js" in html
+
+     def test_report_has_no_nested_script_tags(self, benchmark_result, tmp_path):
+         """A classic refactoring bug: duplicated `<script>` tags when one
+         forgets to strip them from the extracted content."""
+         out = tmp_path / "rapport.html"
+         ReportGenerator(benchmark_result).generate(out)
+         html = out.read_text(encoding="utf-8")
+
+         # Every script block must have a matching closing tag
+         opens = html.count("<script>")
+         # `<script type="...">` is tolerated too, but none is used at the moment
+         closes = html.count("</script>")
+         assert opens == closes, f"Unbalanced script tags: {opens} opening vs {closes} closing"
+
+     def test_report_deterministic_given_same_data(self, benchmark_result, tmp_path):
+         """Two generations from the SAME benchmark produce identical HTML
+         (safeguard for the Sprint 4 narrative engine, which must be deterministic)."""
+         out1 = tmp_path / "r1.html"
+         out2 = tmp_path / "r2.html"
+         ReportGenerator(benchmark_result).generate(out1)
+         ReportGenerator(benchmark_result).generate(out2)
+         h1 = hashlib.sha256(out1.read_bytes()).hexdigest()
+         h2 = hashlib.sha256(out2.read_bytes()).hexdigest()
+         assert h1 == h2, "Report generation must be deterministic"
+
+     def test_english_locale_renders(self, benchmark_result, tmp_path):
+         out = tmp_path / "report_en.html"
+         ReportGenerator(benchmark_result, lang="en").generate(out)
+         html = out.read_text(encoding="utf-8")
+         assert '<html lang="en">' in html
+
+
+ # ---------------------------------------------------------------------------
+ # Loading i18n from JSON
+ # ---------------------------------------------------------------------------
+
+ class TestI18nFromJSON:
+     def test_i18n_directory_exists_and_has_json(self):
+         i18n_dir = Path(__file__).parent.parent / "picarones" / "report" / "i18n"
+         assert i18n_dir.is_dir()
+         files = {p.name for p in i18n_dir.glob("*.json")}
+         assert "fr.json" in files
+         assert "en.json" in files
+
+     def test_all_i18n_files_parse_as_json(self):
+         i18n_dir = Path(__file__).parent.parent / "picarones" / "report" / "i18n"
+         for f in i18n_dir.glob("*.json"):
+             data = json.loads(f.read_text(encoding="utf-8"))
+             assert isinstance(data, dict)
+             assert len(data) > 50  # reasonable: there are 101 keys
+
+     def test_fr_and_en_have_same_keys(self):
+         """Safeguard against missing translations."""
+         from picarones.i18n import TRANSLATIONS
+         fr_keys = set(TRANSLATIONS.get("fr", {}).keys())
+         en_keys = set(TRANSLATIONS.get("en", {}).keys())
+         missing_in_en = fr_keys - en_keys
+         missing_in_fr = en_keys - fr_keys
+         assert not missing_in_en, f"Keys missing in English: {missing_in_en}"
+         assert not missing_in_fr, f"Keys missing in French: {missing_in_fr}"
+
+     def test_translations_load_via_public_api(self):
+         from picarones.i18n import get_labels, SUPPORTED_LANGS
+         assert "fr" in SUPPORTED_LANGS
+         assert "en" in SUPPORTED_LANGS
+         fr = get_labels("fr")
+         en = get_labels("en")
+         assert fr["html_lang"] == "fr"
+         assert en["html_lang"] == "en"
+         # Falls back to fr for an unknown language
+         assert get_labels("xx") == fr
+
+
+ # ---------------------------------------------------------------------------
+ # Validation of the extracted content (no regression in the rendered HTML)
+ # ---------------------------------------------------------------------------
+
+ class TestTemplateContent:
+     def test_css_file_contains_expected_rules(self):
+         css = (_TEMPLATES_DIR / "_styles.css").read_text(encoding="utf-8")
+         # A few canonical report rules that must remain
+         for marker in ("nav", ".cer-badge", ".gallery-card", ".tab-btn"):
+             assert marker in css, f"CSS rule '{marker}' missing"
+
+     def test_app_js_starts_with_use_strict(self):
+         js = (_TEMPLATES_DIR / "_app.js").read_text(encoding="utf-8")
+         first_nonblank = next((line for line in js.splitlines() if line.strip()), "")
+         assert "'use strict'" in first_nonblank
+
+     def test_app_js_has_no_residual_script_tag(self):
+         """Safeguard against a future refactor that would re-include it by mistake."""
+         js = (_TEMPLATES_DIR / "_app.js").read_text(encoding="utf-8")
+         assert "<script" not in js
+         assert "</script>" not in js
+
+     def test_view_files_contain_root_section_element(self):
+         """Each HTML view must have a root element with id='view-<name>'."""
+         view_ids = {
+             "view_ranking.html": "view-ranking",
+             "view_gallery.html": "view-gallery",
+             "view_document.html": "view-document",
+             "view_analyses.html": "view-analyses",
+             "view_characters.html": "view-characters",
+         }
+         for fname, expected_id in view_ids.items():
+             content = (_TEMPLATES_DIR / fname).read_text(encoding="utf-8")
+             assert f'id="{expected_id}"' in content, (
+                 f"{fname} should contain id='{expected_id}'"
+             )
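The placeholder check in `test_no_dangling_format_placeholders_in_templates` relies on a regular expression with a lookbehind and a lookahead, so that single-brace `.format()` fields are flagged while Jinja2 double braces are left alone. The standalone sketch below reuses the same pattern to show what it does and does not catch; the wrapper name `find_format_placeholders` is hypothetical.

```python
import re

# Same pattern as in the test: a single-brace field made of lowercase
# letters and underscores, not preceded by "{" and not followed by "}".
FORMAT_PLACEHOLDER = re.compile(r"(?<!\{)\{[a-z_]+\}(?!\})")


def find_format_placeholders(text: str) -> list[str]:
    """Return every leftover .format()-style placeholder in `text`."""
    return FORMAT_PLACEHOLDER.findall(text)


leftover = find_format_placeholders("<title>{corpus_name}</title>")
jinja_spaced = find_format_placeholders("<title>{{ corpus_name }}</title>")
jinja_tight = find_format_placeholders("<title>{{corpus_name}}</title>")
css_rule = find_format_placeholders(".heatmap-bar:hover { opacity: .75; }")
```

The pattern is deliberately narrow: fields such as `{0}` or `{Name}` are not matched, which keeps false positives on CSS and HTML low at the cost of missing less common placeholder styles.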