Committed by Claude
Commit b914841 · unverified · 1 Parent(s): d86e268

sprint25: refactor web frontend to Jinja2 + external JS (mirrors Sprint 17)


Sprint 17 split the monolithic HTML report in generator.py into
10 Jinja2 files. Sprint 25 applies the same operation to the web SPA:

Before
------
picarones/web/app.py contained ``_HTML_TEMPLATE = r"""..."""``, a raw
string of ~1490 lines mixing HTML, CSS, and 1131 lines of JavaScript
held in a Python string: untestable, unlinted, not type-checkable, and
any change meant rereading a 3163-line file.

After
-----
- picarones/web/templates/ (8 files):
    base.html.j2 (skeleton, includes the partials)
    _ascii_banner.html
    _header_nav.html
    _view_benchmark.html (200 lines, the main view)
    _view_reports.html
    _view_engines.html
    _view_import.html
    _modals.html
- picarones/web/static/web-app.js (1131 lines, all the JS logic)
- picarones/web/app.py: 3163 → 1690 lines (-1473).

Rendering goes through ``_render_index(lang)``, which uses a
cache-friendly ``jinja2.Environment`` with autoescaping enabled for
HTML/J2 templates.
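
A minimal sketch of what such a render path could look like. It is not the code added by this commit: the in-memory ``_TEMPLATES`` dict, the ``_get_env`` helper, the ``render_index`` name, and the ``version`` parameter are all assumptions made for illustration. It only shows a cached ``jinja2.Environment`` with autoescaping enabled for ``.html`` and ``.j2`` template names.

```python
from functools import lru_cache

from jinja2 import DictLoader, Environment, select_autoescape

# Hypothetical in-memory templates standing in for picarones/web/templates/.
_TEMPLATES = {
    "base.html.j2": (
        "<!DOCTYPE html>\n"
        '<html lang="{{ lang }}">\n'
        "<head>\n"
        '<meta name="picarones-lang" content="{{ lang }}">\n'
        '<link rel="stylesheet" href="/static/retro.css?v={{ version }}">\n'
        "</head>\n"
        "<body>\n"
        "{% include '_header_nav.html' %}\n"
        '<script src="/static/web-app.js"></script>\n'
        "</body>\n"
        "</html>\n"
    ),
    "_header_nav.html": '<nav id="nav"></nav>',
}


@lru_cache(maxsize=1)
def _get_env() -> Environment:
    """Build the Environment once so compiled templates are reused."""
    return Environment(
        loader=DictLoader(_TEMPLATES),
        # Escape variables in templates whose names end in .html or .j2.
        autoescape=select_autoescape(["html", "j2"]),
    )


def render_index(lang: str, version: str = "0.0.0") -> str:
    """Render the page skeleton for one language code."""
    template = _get_env().get_template("base.html.j2")
    return template.render(lang=lang, version=version)
```

Compared with the previous ``str.replace`` injection, a value such as ``lang`` is escaped by the template engine rather than spliced into the markup as-is. In a real package the loader would more likely read the template directory than a dict.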

CSP partially hardened
----------------------
``script-src`` keeps ``'unsafe-inline'`` because of the ~30 ``onclick=``
handlers still inline in the partials. Migrating them to
``addEventListener`` remains to be done (in a dedicated sub-sprint, to
avoid mixing it with the template extraction). The real win, however,
is that the 1131 lines of JS are no longer inline: a
``TestNoInlineScriptCode`` test guarantees that no
``<script>...</script>`` block in the rendered page contains code.
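
As an illustration of the kind of check ``TestNoInlineScriptCode`` could perform, here is a standard-library sketch. The helper names are hypothetical and the actual test may be written differently. The 200-character threshold mirrors the one quoted in the test list of this commit message.

```python
from html.parser import HTMLParser


class _ScriptCollector(HTMLParser):
    """Collect the text content of every <script> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_script = False
        self.bodies = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.bodies.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script content may arrive in several chunks; concatenate them.
        if self._in_script:
            self.bodies[-1] += data


def inline_script_bodies(html: str) -> list:
    """Return the non-empty bodies of all <script> elements in the page."""
    parser = _ScriptCollector()
    parser.feed(html)
    parser.close()
    return [body.strip() for body in parser.bodies if body.strip()]


def has_inline_script_code(html: str, max_chars: int = 200) -> bool:
    """True if any <script> body is longer than max_chars characters."""
    return any(len(body) > max_chars for body in inline_script_bodies(html))
```

A page that only references ``/static/web-app.js`` through ``<script src=...></script>`` has no script bodies at all, so it passes this check. Inline ``onclick=`` attributes are not covered by it, which is consistent with ``'unsafe-inline'`` still being required.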

Packaging
---------
``pyproject.toml`` extended to include ``web/static/*.js`` and
``web/templates/*.{j2,html}`` in the wheel.

Tests (+29, for 1323 passing in total)
--------------------------------------

tests/test_sprint25_web_jinja_refactor.py covers:
- presence and size of the 8 templates + ``web-app.js`` (≥ 500 lines)
- determinism of ``_render_index`` (same inputs → byte-for-byte identical output)
- presence of the 4 views + nav + import-modal + script src + retro.css
- no duplicated ``id=`` in the rendered page (guards against double includes)
- no inline ``<script>...</script>`` block containing code (>200 chars)
- ``app.py`` < 2000 lines
- ``_HTML_TEMPLATE = r`` absent from the source
- ``GET /`` returns 200, honors the ``picarones_lang`` cookie, and
  falls back to ``fr`` when the language is unsupported
- ``GET /static/web-app.js`` is served correctly.
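
Two of the checks listed above can be sketched with the standard library alone. This is an illustrative sketch, not the contents of the test file: ``SUPPORTED_LANGS``, ``resolve_lang``, and ``duplicate_ids`` are hypothetical names, and the supported set ``("fr", "en")`` is an assumption based on the French/English SPA.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumed language set; the diff only shows "fr" as the fallback value.
SUPPORTED_LANGS = ("fr", "en")


def resolve_lang(cookie_value: str, default: str = "fr") -> str:
    """Mirror the cookie fallback: unknown languages map to the default."""
    return cookie_value if cookie_value in SUPPORTED_LANGS else default


class _IdCollector(HTMLParser):
    """Count how many times each id attribute value appears."""

    def __init__(self) -> None:
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <input ... />.
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(html: str) -> list:
    """Return the sorted id values that occur more than once."""
    parser = _IdCollector()
    parser.feed(html)
    parser.close()
    return sorted(i for i, count in parser.ids.items() if count > 1)
```

An empty result from ``duplicate_ids`` on the rendered page is what the anti double-include check would assert: including the same partial twice would repeat every ``id`` it defines.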

Side effects
------------
No behavioral change on the API side or the UI side. The SPA remains
strictly identical from the user's point of view: this is a pure
refactoring. The 182 Sprint 6 tests pass unchanged.

https://claude.ai/code/session_01L4RGWMrAajn5ZEFgTKjA5P

picarones/web/app.py CHANGED
@@ -1646,16 +1646,39 @@ def _run_benchmark_thread(job: BenchmarkJob, req: BenchmarkRequest) -> None:
  # Page principale HTML (SPA)
  # ---------------------------------------------------------------------------

  @app.get("/", response_class=HTMLResponse)
  async def index(picarones_lang: str = Cookie(default="fr")) -> HTMLResponse:
      lang = picarones_lang if picarones_lang in _SUPPORTED_LANGS else "fr"
-     # Injecte le code langue dans la SPA via une balise meta
-     page = _HTML_TEMPLATE.replace(
-         "<head>",
-         f'<head>\n<meta name="picarones-lang" content="{lang}">',
-         1,
-     ).replace("__VERSION__", __version__)
-     return HTMLResponse(content=page)


  # ---------------------------------------------------------------------------
@@ -1665,1499 +1688,3 @@ async def index(picarones_lang: str = Cookie(default="fr")) -> HTMLResponse:
  def _iso_now() -> str:
      return datetime.now(timezone.utc).isoformat(timespec="seconds")

-
- # ---------------------------------------------------------------------------
- # HTML Template (SPA, French/English, Vanilla JS)
- # ---------------------------------------------------------------------------
-
- _HTML_TEMPLATE = r"""<!DOCTYPE html>
- <html lang="fr">
- <head>
- <meta charset="UTF-8">
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
- <title>Picarones — OCR Benchmark</title>
- <link rel="stylesheet" href="/static/retro.css?v=__VERSION__">
- <style>
- /* Overrides locaux minimaux — le gros du CSS est dans /static/retro.css */
- </style>
- </head>
- <body>
-
- <div id="ascii-banner">
- <pre>██████╗ ██╗ ██████╗ █████╗ ██████╗ ██████╗ ███╗ ██╗███████╗███████╗
- ██╔══██╗██║██╔════╝██╔══██╗██╔══██╗██╔═══██╗████╗ ██║██╔════╝██╔════╝
- ██████╔╝██║██║ ███████║██████╔╝██║ ██║██╔██╗ ██║█████╗ ███████╗
- ██╔═══╝ ██║██║ ██╔══██║██╔══██╗██║ ██║██║╚██╗██║██╔══╝ ╚════██║
- ██║ ██║╚██████╗██║ ██║██║ ██║╚██████╔╝██║ ╚████║███████╗███████║
- ╚═╝ ╚═╝ ╚═════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═══╝╚══════╝╚══════╝</pre>
- <span class="ascii-subtitle">OCR/HTR Benchmark Platform</span>
- </div>
-
- <div id="header">
- <h1 data-i18n="app_title">Picarones <span class="version" id="app-version"></span></h1>
- <nav id="nav">
- <button class="nav-btn active" onclick="showView('benchmark')" data-i18n="nav_benchmark">Benchmark</button>
- <button class="nav-btn" onclick="showView('reports')" data-i18n="nav_reports">Rapports</button>
- <button class="nav-btn" onclick="showView('engines')" data-i18n="nav_engines">Moteurs</button>
- <button class="nav-btn" onclick="showView('import')" data-i18n="nav_import">Import</button>
- </nav>
- <button id="lang-btn" onclick="toggleLang()">EN</button>
- </div>
-
- <div id="main">
-
- <!-- ===== VUE BENCHMARK ===== -->
- <div id="view-benchmark" class="view active">
-
- <div class="card">
- <h2 data-i18n="bench_corpus_title">1. Corpus</h2>
-
- <!-- Tab bar -->
- <div class="corpus-tabs">
- <button class="corpus-tab active" id="ctab-browse" onclick="switchCorpusTab('browse')" data-i18n="corpus_tab_browse">📁 Parcourir</button>
- <button class="corpus-tab" id="ctab-upload" onclick="switchCorpusTab('upload')" data-i18n="corpus_tab_upload">⬆ Uploader</button>
- </div>
-
- <!-- Browse tab -->
- <div id="corpus-tab-browse">
- <div class="form-group">
- <label data-i18n="bench_corpus_label">Chemin vers le dossier corpus (paires image/.gt.txt)</label>
- <div class="path-input-row">
- <input type="text" id="corpus-path" placeholder="./corpus/" value="" />
- <button class="btn btn-secondary btn-sm" onclick="openFileBrowser()" data-i18n="bench_browse">Parcourir</button>
- </div>
- </div>
- <div id="file-browser-container" style="display:none; margin-top:10px;">
- <div class="fb-path" id="fb-current-path">.</div>
- <div id="file-browser"></div>
- </div>
- </div>
-
- <!-- Upload tab -->
- <div id="corpus-tab-upload" style="display:none;">
- <div class="upload-mode-row">
- <label><input type="radio" name="upload-mode" value="zip" checked onchange="onUploadModeChange()"> 🗜 <span data-i18n="upload_zip_mode">Archive ZIP</span></label>
- <label><input type="radio" name="upload-mode" value="files" onchange="onUploadModeChange()"> 🖼 <span data-i18n="upload_files_mode">Fichiers individuels</span></label>
- </div>
- <!-- Drop zone -->
- <div id="upload-dropzone" class="upload-dropzone"
- onclick="document.getElementById('upload-file-input').click()"
- ondragover="event.preventDefault(); this.classList.add('dragover')"
- ondragleave="this.classList.remove('dragover')"
- ondrop="onDropFiles(event)">
- <span class="upload-icon">⬆</span>
- <span id="upload-dropzone-text" data-i18n="upload_drop_zip">Glissez un .zip ici ou cliquez pour sélectionner</span>
- <input type="file" id="upload-file-input" style="display:none" accept=".zip" onchange="onFileInputChange(event)" />
- </div>
- <!-- Progress -->
- <div id="upload-progress-container" style="display:none; margin-top:10px;">
- <div class="progress-bar-outer">
- <div class="progress-bar-inner" id="upload-progress-bar" style="width:0%; transition:width 0.2s;"></div>
- </div>
- <div id="upload-progress-text" style="font-size:12px; color:var(--text-muted); margin-top:4px;"></div>
- </div>
- <!-- Preview after upload -->
- <div id="upload-preview" style="margin-top:10px;"></div>
- <!-- Previously uploaded corpora -->
- <div id="uploads-list" style="margin-top:14px;"></div>
- </div>
-
- <div id="corpus-info" style="margin-top:8px; font-size:12px; color: var(--text-muted);"></div>
- </div>
-
- <!-- ── Section 1 : Moteurs OCR ─────────────────────────────────── -->
- <div class="card">
- <h2 data-i18n="bench_ocr_title">2. Moteurs OCR</h2>
- <div id="ocr-engines-status-list">
- <div style="color: var(--text-muted); font-size: 12px;"><span class="spinner"></span> Chargement…</div>
- </div>
- </div>
-
- <!-- ── Section 2 : Modèles LLM ──────────────────────────────────── -->
- <div class="card">
- <h2 data-i18n="bench_llm_title">3. Modèles LLM</h2>
- <div id="llm-status-list">
- <div style="color: var(--text-muted); font-size: 12px;"><span class="spinner"></span> Chargement…</div>
- </div>
- </div>
-
- <!-- ── Section 3 : Composition des concurrents ──────────────────── -->
- <div class="card">
- <h2 data-i18n="bench_compose_title">4. Concurrents à benchmarker</h2>
-
- <div class="mode-toggle">
- <label><input type="radio" name="compose-mode" value="ocr" checked onchange="onComposeModeChange()"> 🔍 <span data-i18n="compose_ocr_only">OCR seul</span></label>
- <label><input type="radio" name="compose-mode" value="pipeline" onchange="onComposeModeChange()"> ⛓ <span data-i18n="compose_pipeline">Pipeline OCR+LLM</span></label>
- <label><input type="radio" name="compose-mode" value="postcorrection" onchange="onComposeModeChange()"> 📝 <span data-i18n="compose_postcorrection">Post-correction (corpus OCR)</span></label>
- </div>
-
- <div id="corpus-ocr-notice" style="display:none; margin:8px 0; padding:8px 12px; background:var(--bg-highlight,#f0fdf4); border-radius:6px; font-size:12px; color:var(--success,#16a34a);">
- 📝 <span data-i18n="corpus_has_ocr">Ce corpus contient des fichiers OCR pré-calculés (.ocr.txt) — post-correction disponible.</span>
- </div>
-
- <div id="compose-ocr-section" class="composer-row">
- <div class="form-group">
- <label data-i18n="compose_ocr_engine">Moteur OCR</label>
- <select id="compose-ocr-engine" onchange="onComposeOCRChange()">
- <option value="tesseract">Tesseract</option>
- <option value="mistral_ocr">Mistral OCR</option>
- <option value="google_vision">Google Vision</option>
- <option value="azure_doc_intel">Azure Doc Intel</option>
- </select>
- </div>
- <div class="form-group" style="flex:1;">
- <label data-i18n="compose_ocr_model">Modèle / Langue <span class="spinner" id="sp-ocr-model" style="display:none"></span></label>
- <select id="compose-ocr-model"></select>
- </div>
- </div>
-
- <div id="compose-pipeline-section" style="display:none;">
- <div class="composer-row">
- <div class="form-group">
- <label data-i18n="compose_llm_provider">Provider LLM</label>
- <select id="compose-llm-provider" onchange="onComposeLLMChange()">
- <option value="openai">OpenAI</option>
- <option value="anthropic">Anthropic</option>
- <option value="mistral">Mistral LLM</option>
- <option value="ollama">Ollama</option>
- </select>
- </div>
- <div class="form-group" style="flex:1;">
- <label data-i18n="compose_llm_model">Modèle LLM <span class="spinner" id="sp-llm-model" style="display:none"></span></label>
- <select id="compose-llm-model"></select>
- </div>
- </div>
- <div class="composer-row">
- <div class="form-group">
- <label data-i18n="compose_mode">Mode pipeline</label>
- <select id="compose-pipeline-mode" onchange="onComposePipelineModeChange()">
- <option value="text_only" data-i18n="mode_text_only">Post-correction texte</option>
- <option value="text_and_image" data-i18n="mode_text_image">Post-correction image+texte</option>
- <option value="zero_shot" data-i18n="mode_zero_shot">Zero-shot</option>
- </select>
- </div>
- <div class="form-group" style="flex:1;">
- <label data-i18n="compose_prompt">Prompt <span class="spinner" id="sp-prompt" style="display:none"></span></label>
- <select id="compose-prompt"></select>
- </div>
- </div>
- </div>
-
- <div style="display:flex; gap:10px; align-items:center; margin-top:10px;">
- <button class="btn btn-primary btn-sm" onclick="addCompetitor()" data-i18n="compose_add">+ Ajouter</button>
- <span id="compose-error" style="color: var(--danger); font-size:12px;"></span>
- </div>
-
- <div id="competitors-list" style="margin-top:14px;">
- <div style="color: var(--text-muted); font-size:12px;" data-i18n="compose_empty">Aucun concurrent ajouté.</div>
- </div>
- </div>
-
- <!-- ── 5. Options ─────────────────────────────────────────────────── -->
- <div class="card">
- <h2 data-i18n="bench_options_title">5. Options</h2>
- <div class="form-row">
- <div class="form-group">
- <label data-i18n="bench_norm_label">Profil de normalisation</label>
- <select id="norm-profile">
- <option value="nfc">NFC (standard)</option>
- </select>
- </div>
- <div class="form-group">
- <label data-i18n="bench_char_exclude_label">Caractères à ignorer <span style="color:var(--text-muted);font-size:.75rem">(séparés par virgule, ex : ', -, –)</span></label>
- <input type="text" id="char-exclude" placeholder="ex: ', -, –, ." style="font-family:monospace" />
- </div>
- <div class="form-group">
- <label data-i18n="bench_output_label">Dossier de sortie</label>
- <input type="text" id="output-dir" value="./rapports/" />
- </div>
- <div class="form-group">
- <label data-i18n="bench_name_label">Nom du rapport (optionnel)</label>
- <input type="text" id="report-name" placeholder="rapport_2024_01_15" />
- </div>
- </div>
- </div>
-
- <div style="display:flex; gap:10px; align-items:center; margin-bottom:16px;">
- <button class="btn btn-primary" id="start-btn" onclick="startBenchmark()" data-i18n="bench_start">▶ Lancer le benchmark</button>
- <button class="btn btn-secondary" id="cancel-btn" style="display:none;" onclick="cancelBenchmark()" data-i18n="bench_cancel">✕ Annuler</button>
- <span id="bench-status-text" style="font-size:12px; color: var(--text-muted);"></span>
- </div>
-
- <div id="bench-progress-section" style="display:none;">
- <div class="card">
- <h2 data-i18n="bench_progress_title">Progression</h2>
- <div id="engine-progress-list"></div>
- <div style="margin-top: 12px;">
- <label style="font-size:12px; color: var(--text-muted); display:block; margin-bottom:4px;" data-i18n="bench_log">Journal</label>
- <div class="log-box" id="bench-log"></div>
- </div>
- </div>
- </div>
-
- <div id="bench-result-section" style="display:none;">
- <div class="card">
- <h2 data-i18n="bench_result_title">Résultats</h2>
- <div id="bench-ranking-table"></div>
- <div style="margin-top:12px;">
- <a id="bench-report-link" href="#" class="btn btn-primary" target="_blank" data-i18n="bench_open_report">Ouvrir le rapport</a>
- </div>
- </div>
- </div>
- </div>
-
- <!-- ===== VUE RAPPORTS ===== -->
- <div id="view-reports" class="view">
- <div class="card">
- <h2 data-i18n="reports_title">Rapports générés</h2>
- <div class="form-row" style="margin-bottom:12px;">
- <div class="form-group" style="max-width:320px;">
- <label data-i18n="reports_dir_label">Dossier de rapports</label>
- <div class="path-input-row">
- <input type="text" id="reports-dir" value="." />
- <button class="btn btn-secondary btn-sm" onclick="loadReports()" data-i18n="reports_refresh">Rafraîchir</button>
- </div>
- </div>
- </div>
- <div id="reports-list">
- <div style="color: var(--text-muted); font-size: 12px;" data-i18n="loading">Chargement…</div>
- </div>
- </div>
- </div>
-
- <!-- ===== VUE MOTEURS ===== -->
- <div id="view-engines" class="view">
- <div class="card">
- <h2 data-i18n="engines_ocr_title">Moteurs OCR</h2>
- <div id="engines-ocr-list">
- <div style="color: var(--text-muted); font-size: 12px;" data-i18n="loading">Chargement…</div>
- </div>
- </div>
- <div class="card">
- <h2 data-i18n="engines_llm_title">LLMs disponibles</h2>
- <div id="engines-llm-list">
- <div style="color: var(--text-muted); font-size: 12px;" data-i18n="loading">Chargement…</div>
- </div>
- </div>
- </div>
-
- <!-- ===== VUE IMPORT ===== -->
- <div id="view-import" class="view">
-
- <!-- HTR-United -->
- <div class="card">
- <h2 data-i18n="import_htr_title">Import HTR-United</h2>
- <p style="font-size:12px; color:var(--text-muted); margin-bottom:12px;" data-i18n="import_htr_desc">
- Catalogue communautaire de corpus HTR/OCR pour documents patrimoniaux.
- </p>
- <div class="form-row">
- <div class="form-group" style="flex:2;">
- <label data-i18n="import_search_label">Recherche</label>
- <input type="text" id="htr-search" placeholder="médiéval, latin, manuscrits…" />
- </div>
- <div class="form-group">
- <label data-i18n="import_lang_filter">Langue</label>
- <select id="htr-lang-filter">
- <option value="" data-i18n="all">Toutes</option>
- </select>
- </div>
- <div class="form-group">
- <label data-i18n="import_script_filter">Type d'écriture</label>
- <select id="htr-script-filter">
- <option value="" data-i18n="all">Tous</option>
- </select>
- </div>
- <div class="form-group" style="justify-content: flex-end; padding-top: 18px;">
- <button class="btn btn-primary btn-sm" onclick="searchHTRUnited()" data-i18n="search">Rechercher</button>
- </div>
- </div>
- <div id="htr-results" class="ds-grid"></div>
- </div>
-
- <!-- HuggingFace -->
- <div class="card">
- <h2 data-i18n="import_hf_title">Import HuggingFace Datasets</h2>
- <p style="font-size:12px; color:var(--text-muted); margin-bottom:12px;" data-i18n="import_hf_desc">
- Datasets OCR/HTR publics depuis HuggingFace Hub (IAM, RIMES, CATMuS, Gallica…).
- </p>
- <div class="form-row">
- <div class="form-group" style="flex:2;">
- <label data-i18n="import_search_label">Recherche</label>
- <input type="text" id="hf-search" placeholder="medieval OCR, IAM, RIMES…" />
- </div>
- <div class="form-group">
- <label data-i18n="import_lang_filter">Langue</label>
- <input type="text" id="hf-lang-filter" placeholder="French, Latin…" />
- </div>
- <div class="form-group">
- <label data-i18n="import_tag_filter">Tags</label>
- <input type="text" id="hf-tags" placeholder="ocr, htr, historical…" />
- </div>
- <div class="form-group" style="justify-content: flex-end; padding-top: 18px;">
- <button class="btn btn-primary btn-sm" onclick="searchHuggingFace()" data-i18n="search">Rechercher</button>
- </div>
- </div>
- <div id="hf-results" class="ds-grid"></div>
- </div>
-
- </div><!-- end view-import -->
-
- </div><!-- end #main -->
-
- <!-- Import modal -->
- <div id="import-modal" style="display:none; position:fixed; inset:0; background:rgba(0,0,0,0.4); z-index:200; align-items:center; justify-content:center;">
- <div class="card" style="width: 420px; max-width: 95vw;">
- <h2 id="import-modal-title" data-i18n="import_modal_title">Importer le corpus</h2>
- <input type="hidden" id="import-modal-type" />
- <input type="hidden" id="import-modal-id" />
- <div class="form-group" style="margin-bottom:12px;">
- <label data-i18n="import_output_dir">Dossier de destination</label>
- <input type="text" id="import-modal-output" value="./corpus/" />
- </div>
- <div class="form-group" style="margin-bottom:16px;">
- <label data-i18n="import_max_samples">Nombre max de documents</label>
- <input type="number" id="import-modal-max" value="100" min="1" max="10000" />
- </div>
- <div id="import-modal-status" style="margin-bottom:12px;"></div>
- <div style="display:flex; gap:8px;">
- <button class="btn btn-primary" onclick="confirmImport()" data-i18n="import_confirm">Importer</button>
- <button class="btn btn-secondary" onclick="closeImportModal()" data-i18n="cancel">Annuler</button>
- </div>
- </div>
- </div>
-
- <script>
- // ─── i18n ────────────────────────────────────────────────────────────────────
- const T = {
- fr: {
- app_title: "Picarones",
- nav_benchmark: "Benchmark",
- nav_reports: "Rapports",
- nav_engines: "Moteurs",
- nav_import: "Import",
- loading: "Chargement…",
- search: "Rechercher",
- all: "Tous",
- cancel: "Annuler",
- bench_corpus_title: "1. Corpus",
- bench_corpus_label: "Chemin vers le dossier corpus (paires image / .gt.txt)",
- bench_browse: "Parcourir",
- corpus_tab_browse: "📁 Parcourir",
- corpus_tab_upload: "⬆ Uploader",
- upload_zip_mode: "Archive ZIP",
- upload_files_mode: "Fichiers individuels",
- upload_drop_zip: "Glissez un .zip ici ou cliquez pour sélectionner",
- upload_drop_files: "Glissez des images + .gt.txt ou cliquez pour sélectionner",
- upload_uploading: "Upload en cours…",
- upload_success: "Corpus chargé avec succès",
- upload_no_corpus: "Aucun corpus uploadé.",
- upload_select: "Utiliser ce corpus",
- upload_delete: "Supprimer",
- upload_pairs: "paires",
- upload_missing_gt: "GT manquant(s)",
- bench_engines_title: "2. Moteurs et pipelines",
- bench_ocr_title: "2. Moteurs OCR",
- bench_llm_title: "3. Modèles LLM",
- bench_compose_title: "4. Concurrents à benchmarker",
- bench_options_title: "5. Options",
- compose_ocr_only: "OCR seul",
- compose_pipeline: "Pipeline OCR+LLM",
- compose_postcorrection: "Post-correction (corpus OCR)",
- corpus_has_ocr: "Ce corpus contient des fichiers OCR pré-calculés (.ocr.txt) — post-correction disponible.",
- corpus_no_ocr_warn: "Ce corpus ne contient pas de fichiers .ocr.txt — uploadez un corpus triplet pour la post-correction.",
- compose_ocr_engine: "Moteur OCR",
- compose_ocr_model: "Modèle / Langue",
- compose_llm_provider: "Provider LLM",
- compose_llm_model: "Modèle LLM",
- compose_mode: "Mode pipeline",
- compose_prompt: "Prompt",
- compose_add: "+ Ajouter",
- compose_empty: "Aucun concurrent ajouté.",
- mode_text_only: "Post-correction texte",
- mode_text_image: "Post-correction image+texte",
- mode_zero_shot: "Zero-shot",
- bench_norm_label: "Profil de normalisation",
- bench_lang_label: "Langue (Tesseract)",
- bench_output_label: "Dossier de sortie",
- bench_name_label: "Nom du rapport (optionnel)",
- bench_start: "▶ Lancer le benchmark",
- bench_cancel: "✕ Annuler",
- bench_progress_title: "Progression",
- bench_log: "Journal",
- bench_result_title: "Résultats",
- bench_open_report: "Ouvrir le rapport",
- reports_title: "Rapports générés",
- reports_dir_label: "Dossier de rapports",
- reports_refresh: "Rafraîchir",
- engines_ocr_title: "Moteurs OCR",
- engines_llm_title: "LLMs disponibles",
- import_htr_title: "Import HTR-United",
- import_htr_desc: "Catalogue communautaire de corpus HTR/OCR pour documents patrimoniaux.",
- import_hf_title: "Import HuggingFace Datasets",
- import_hf_desc: "Datasets OCR/HTR publics depuis HuggingFace Hub (IAM, RIMES, CATMuS, Gallica…).",
- import_search_label: "Recherche",
- import_lang_filter: "Langue",
- import_script_filter: "Type d'écriture",
- import_tag_filter: "Tags",
- import_modal_title: "Importer le corpus",
- import_output_dir: "Dossier de destination",
- import_max_samples: "Nombre max de documents",
- import_confirm: "Importer",
- available: "disponible",
- not_installed: "non installé",
- configured: "configuré",
- missing_key: "clé manquante",
- running: "actif",
- not_running: "inactif",
- no_reports: "Aucun rapport trouvé.",
- lines: "lignes",
- centuries: "siècles",
- },
- en: {
- app_title: "Picarones",
- nav_benchmark: "Benchmark",
- nav_reports: "Reports",
- nav_engines: "Engines",
- nav_import: "Import",
- loading: "Loading…",
- search: "Search",
- all: "All",
- cancel: "Cancel",
- bench_corpus_title: "1. Corpus",
- bench_corpus_label: "Path to corpus directory (image / .gt.txt pairs)",
- bench_browse: "Browse",
- corpus_tab_browse: "📁 Browse",
- corpus_tab_upload: "⬆ Upload",
- upload_zip_mode: "ZIP archive",
- upload_files_mode: "Individual files",
- upload_drop_zip: "Drop a .zip here or click to select",
- upload_drop_files: "Drop images + .gt.txt files or click to select",
- upload_uploading: "Uploading…",
- upload_success: "Corpus loaded successfully",
- upload_no_corpus: "No corpus uploaded.",
- upload_select: "Use this corpus",
- upload_delete: "Delete",
- upload_pairs: "pairs",
- upload_missing_gt: "missing GT",
- bench_engines_title: "2. Engines & pipelines",
- bench_ocr_title: "2. OCR Engines",
- bench_llm_title: "3. LLM Models",
- bench_compose_title: "4. Competitors",
- bench_options_title: "5. Options",
- compose_ocr_only: "OCR only",
- compose_pipeline: "OCR+LLM Pipeline",
- compose_postcorrection: "Post-correction (corpus OCR)",
- corpus_has_ocr: "This corpus contains pre-computed OCR files (.ocr.txt) — post-correction available.",
- corpus_no_ocr_warn: "This corpus has no .ocr.txt files — upload a triplet corpus for post-correction.",
- compose_ocr_engine: "OCR Engine",
- compose_ocr_model: "Model / Language",
- compose_llm_provider: "LLM Provider",
- compose_llm_model: "LLM Model",
- compose_mode: "Pipeline mode",
- compose_prompt: "Prompt",
- compose_add: "+ Add",
- compose_empty: "No competitors added.",
- mode_text_only: "Text post-correction",
- mode_text_image: "Image+text post-correction",
- mode_zero_shot: "Zero-shot",
- bench_norm_label: "Normalization profile",
- bench_lang_label: "Language (Tesseract)",
- bench_output_label: "Output directory",
- bench_name_label: "Report name (optional)",
- bench_start: "▶ Start benchmark",
- bench_cancel: "✕ Cancel",
- bench_progress_title: "Progress",
- bench_log: "Log",
- bench_result_title: "Results",
- bench_open_report: "Open report",
- reports_title: "Generated reports",
- reports_dir_label: "Reports directory",
- reports_refresh: "Refresh",
- engines_ocr_title: "OCR Engines",
- engines_llm_title: "Available LLMs",
- import_htr_title: "Import from HTR-United",
- import_htr_desc: "Community catalogue of HTR/OCR datasets for heritage documents.",
- import_hf_title: "Import from HuggingFace Datasets",
- import_hf_desc: "Public OCR/HTR datasets from HuggingFace Hub (IAM, RIMES, CATMuS, Gallica…).",
- import_search_label: "Search",
- import_lang_filter: "Language",
- import_script_filter: "Script type",
- import_tag_filter: "Tags",
- import_modal_title: "Import corpus",
- import_output_dir: "Output directory",
- import_max_samples: "Max documents",
- import_confirm: "Import",
- available: "available",
- not_installed: "not installed",
- configured: "configured",
- missing_key: "key missing",
- running: "running",
- not_running: "not running",
- no_reports: "No reports found.",
- lines: "lines",
- centuries: "centuries",
- },
- };
- let lang = "fr";
- function t(key) { return (T[lang][key]) || key; }
- function toggleLang() {
- lang = lang === "fr" ? "en" : "fr";
- document.getElementById("lang-btn").textContent = lang === "fr" ? "EN" : "FR";
- document.querySelectorAll("[data-i18n]").forEach(el => {
- const k = el.getAttribute("data-i18n");
- if (T[lang][k]) el.textContent = T[lang][k];
- });
- }
-
2212
- // ─── Navigation ──────────────────────────────────────────────────────────────
2213
- function showView(name) {
2214
- document.querySelectorAll(".view").forEach(v => v.classList.remove("active"));
2215
- document.querySelectorAll(".nav-btn").forEach(b => b.classList.remove("active"));
2216
- const view = document.getElementById("view-" + name);
2217
- if (view) view.classList.add("active");
2218
- const btns = document.querySelectorAll(".nav-btn");
2219
- const idx = ["benchmark","reports","engines","import"].indexOf(name);
2220
- if (btns[idx]) btns[idx].classList.add("active");
2221
-
2222
- if (name === "reports") loadReports();
2223
- if (name === "engines") loadEngines();
2224
- if (name === "import") { searchHTRUnited(); searchHuggingFace(); }
2225
- }
2226
-
2227
- // ─── Status / version ────────────────────────────────────────────────────────
2228
- async function loadStatus() {
2229
- try {
2230
- const r = await fetch("/api/status");
2231
- const d = await r.json();
2232
- document.getElementById("app-version").textContent = "v" + d.version;
2233
- } catch(e) {}
2234
- }
2235
-
2236
- // ─── Models cache & fetching ─────────────────────────────────────────────────
2237
- let _modelsCache = {};
2238
- let _enginesData = null;
2239
- let _competitors = [];
2240
- let _refreshIntervalId = null;
2241
- let _pendingOCREngine = null; // garde contre les réponses obsolètes (race condition)
2242
-
2243
- async function fetchModels(provider, capability) {
2244
- const cacheKey = capability ? `${provider}__${capability}` : provider;
2245
- if (_modelsCache[cacheKey]) return _modelsCache[cacheKey];
2246
- const url = capability ? `/api/models/${provider}?capability=${capability}` : `/api/models/${provider}`;
2247
- const r = await fetch(url);
2248
- const d = await r.json();
2249
- // Support both new format (objects with id+capabilities) and old format (flat strings)
2250
- let models = d.model_ids || d.models || [];
2251
- if (models.length > 0 && typeof models[0] === "object") {
2252
- models = models.map(m => m.id || m);
2253
- }
2254
- _modelsCache[cacheKey] = models;
2255
- return models;
2256
- }
2257
-
2258
- function populateSelect(selectId, models, spinnerId) {
-   const sel = document.getElementById(selectId);
-   if (spinnerId) { const sp = document.getElementById(spinnerId); if (sp) sp.style.display = "none"; }
-   if (!sel) return;
-   // Handle both string arrays and object arrays
-   const items = models.map(m => typeof m === "object" ? (m.id || m) : m);
-   sel.innerHTML = items.length === 0
-     ? '<option value="">— aucun modèle —</option>'
-     : items.map(m => `<option value="${m}">${m}</option>`).join("");
- }
-
- // ─── Benchmark sections (OCR + LLM status + composer init) ───────────────────
- async function loadBenchmarkSections() {
-   try {
-     const r = await fetch("/api/engines");
-     const d = await r.json();
-     _enginesData = d;
-     renderOCREnginesSection(d.engines);
-     renderLLMSection(d.llms);
-   } catch(e) {
-     document.getElementById("ocr-engines-status-list").innerHTML =
-       `<div style="color:var(--danger);font-size:12px;">Erreur : ${e.message}</div>`;
-   }
- }
-
- function _makeProviderRow(eng, msId) {
-   const dotCls = eng.available ? "status-ok" : (eng.status === "not_running" ? "status-warn" : "status-err");
-   let statusLabel;
-   if (eng.available) statusLabel = eng.version ? eng.version : (lang === "fr" ? "disponible" : "available");
-   else if (eng.status === "missing_key") statusLabel = eng.key_env ? `<code style="font-size:11px;color:var(--warning)">${eng.key_env}</code>` : (lang === "fr" ? "clé manquante" : "key missing");
-   else if (eng.status === "not_running") statusLabel = lang === "fr" ? "inactif" : "not running";
-   else statusLabel = lang === "fr" ? "non installé" : "not installed";
-
-   const row = document.createElement("div");
-   row.className = "provider-row";
-   row.innerHTML = `
-     <div class="provider-label"><span class="engine-status ${dotCls}"></span><strong>${eng.label}</strong></div>
-     <div class="provider-status">${statusLabel}</div>
-     <div class="provider-model-select" id="${msId}">${eng.available ? '<span class="spinner"></span>' : ""}</div>`;
-   return row;
- }
-
- async function renderOCREnginesSection(engines) {
-   const container = document.getElementById("ocr-engines-status-list");
-   container.innerHTML = "";
-   for (const eng of engines) {
-     const msId = `ms-ocr-${eng.id}`;
-     container.appendChild(_makeProviderRow(eng, msId));
-     if (eng.available) {
-       fetchModels(eng.id).then(models => {
-         const div = document.getElementById(msId);
-         if (!div) return;
-         div.innerHTML = models.length === 0
-           ? `<span style="color:var(--text-muted);font-size:11px;">—</span>`
-           : `<span style="font-size:12px;">${models.slice(0,5).join(", ")}${models.length > 5 ? ` +${models.length-5}` : ""}</span>`;
-       }).catch(() => {
-         const div = document.getElementById(msId);
-         if (div) div.innerHTML = `<span style="color:var(--danger);font-size:11px;">Erreur API</span>`;
-       });
-     }
-   }
- }
-
- async function renderLLMSection(llms) {
-   const container = document.getElementById("llm-status-list");
-   container.innerHTML = "";
-   for (const llm of llms) {
-     const msId = `ms-llm-${llm.id}`;
-     container.appendChild(_makeProviderRow(llm, msId));
-     if (llm.available) {
-       fetchModels(llm.id).then(models => {
-         const div = document.getElementById(msId);
-         if (!div) return;
-         div.innerHTML = models.length === 0
-           ? `<span style="color:var(--text-muted);font-size:11px;">—</span>`
-           : `<span style="font-size:12px;">${models.slice(0,3).join(", ")}${models.length > 3 ? ` +${models.length-3}` : ""}</span>`;
-       }).catch(() => {
-         const div = document.getElementById(msId);
-         if (div) div.innerHTML = `<span style="color:var(--danger);font-size:11px;">Erreur API</span>`;
-       });
-     }
-   }
- }
-
- function startAutoRefresh() {
-   if (_refreshIntervalId) clearInterval(_refreshIntervalId);
-   _refreshIntervalId = setInterval(async () => {
-     try {
-       const r = await fetch("/api/engines");
-       const d = await r.json();
-       if (!_enginesData || JSON.stringify(d) !== JSON.stringify(_enginesData)) {
-         _modelsCache = {};
-         _enginesData = d;
-         renderOCREnginesSection(d.engines);
-         renderLLMSection(d.llms);
-       }
-     } catch(e) {}
-   }, 10000);
- }
-
- // ─── Competitor composer ──────────────────────────────────────────────────────
- async function onComposeOCRChange() {
-   const engine = document.getElementById("compose-ocr-engine").value;
-   _pendingOCREngine = engine;  // mark the current request
-   const sp = document.getElementById("sp-ocr-model");
-   // Google Vision and Azure have static lists; no API call needed
-   if (engine === "google_vision") {
-     sp.style.display = "none";
-     populateSelect("compose-ocr-model", ["document_text_detection", "text_detection"], null);
-     return;
-   }
-   if (engine === "azure_doc_intel") {
-     sp.style.display = "none";
-     populateSelect("compose-ocr-model", ["prebuilt-document", "prebuilt-read"], null);
-     return;
-   }
-   // Tesseract: installed languages; Mistral OCR: vision models (dynamic API)
-   sp.style.display = "inline-block";
-   try {
-     const models = await fetchModels(engine);
-     if (_pendingOCREngine !== engine) return;  // stale response, discard
-     populateSelect("compose-ocr-model", models, "sp-ocr-model");
-   } catch(e) {
-     if (_pendingOCREngine !== engine) return;
-     sp.style.display = "none";
-     document.getElementById("compose-ocr-model").innerHTML = '<option value="">Erreur</option>';
-   }
- }
-
- async function onComposeLLMChange() {
-   const provider = document.getElementById("compose-llm-provider").value;
-   const composeMode = document.querySelector("input[name=compose-mode]:checked").value;
-   const pipelineMode = document.getElementById("compose-pipeline-mode").value;
-   // Apply capability filter for modes requiring vision
-   const needsVision = (pipelineMode === "text_and_image" || pipelineMode === "zero_shot");
-   const capability = (composeMode === "postcorrection" || composeMode === "pipeline") && needsVision ? "vision" : "";
-   _loadLLMModelsWithCapability(provider, capability);
- }
-
- function onComposeModeChange() {
-   const mode = document.querySelector("input[name=compose-mode]:checked").value;
-   const ocrSection = document.getElementById("compose-ocr-section");
-   const pipelineSection = document.getElementById("compose-pipeline-section");
-
-   if (mode === "ocr") {
-     ocrSection.style.display = "flex";
-     pipelineSection.style.display = "none";
-   } else if (mode === "pipeline") {
-     ocrSection.style.display = "flex";
-     pipelineSection.style.display = "block";
-     // Reload LLM models without capability filter
-     onComposeLLMChange();
-   } else if (mode === "postcorrection") {
-     ocrSection.style.display = "none";
-     pipelineSection.style.display = "block";
-     // Reload LLM models with capability filter based on pipeline mode
-     onComposePipelineModeChange();
-   }
- }
-
- function onComposePipelineModeChange() {
-   const composeMode = document.querySelector("input[name=compose-mode]:checked").value;
-   if (composeMode !== "postcorrection" && composeMode !== "pipeline") return;
-   const pipelineMode = document.getElementById("compose-pipeline-mode").value;
-   // Filter by vision capability for modes that need images
-   const needsVision = (pipelineMode === "text_and_image" || pipelineMode === "zero_shot");
-   const capability = needsVision ? "vision" : "";
-   const provider = document.getElementById("compose-llm-provider").value;
-   // Clear cache for this provider to re-fetch with new capability filter
-   const cacheKey = capability ? `${provider}__${capability}` : provider;
-   delete _modelsCache[cacheKey];
-   _loadLLMModelsWithCapability(provider, capability);
- }
-
- async function _loadLLMModelsWithCapability(provider, capability) {
-   document.getElementById("sp-llm-model").style.display = "inline-block";
-   try {
-     const models = await fetchModels(provider, capability);
-     populateSelect("compose-llm-model", models, "sp-llm-model");
-   } catch(e) {
-     document.getElementById("sp-llm-model").style.display = "none";
-     document.getElementById("compose-llm-model").innerHTML = '<option value="">Erreur</option>';
-   }
- }
-
- async function loadComposePrompts() {
-   document.getElementById("sp-prompt").style.display = "inline-block";
-   try {
-     const models = await fetchModels("prompts");
-     populateSelect("compose-prompt", models, "sp-prompt");
-   } catch(e) {
-     document.getElementById("sp-prompt").style.display = "none";
-   }
- }
-
- function addCompetitor() {
-   const mode = document.querySelector("input[name=compose-mode]:checked").value;
-   const errEl = document.getElementById("compose-error");
-
-   const comp = { name: "", ocr_engine: "", ocr_model: "",
-                  llm_provider: "", llm_model: "", pipeline_mode: "", prompt_file: "" };
-
-   if (mode === "postcorrection") {
-     // Post-correction: OCR comes from the corpus (.ocr.txt)
-     comp.ocr_engine = "corpus";
-     comp.llm_provider = document.getElementById("compose-llm-provider").value;
-     comp.llm_model = document.getElementById("compose-llm-model").value;
-     comp.pipeline_mode = document.getElementById("compose-pipeline-mode").value;
-     comp.prompt_file = document.getElementById("compose-prompt").value;
-     if (!comp.llm_provider || !comp.llm_model) {
-       errEl.textContent = lang === "fr" ? "Sélectionnez un provider et un modèle LLM." : "Select an LLM provider and model.";
-       return;
-     }
-     const modeLabel = {"text_only":"texte","text_and_image":"img+texte","zero_shot":"zero-shot"}[comp.pipeline_mode] || comp.pipeline_mode;
-     comp.name = `📝 ${comp.llm_model} [${modeLabel}]`;
-   } else if (mode === "pipeline") {
-     const ocrEngine = document.getElementById("compose-ocr-engine").value;
-     const ocrModel = document.getElementById("compose-ocr-model").value;
-     if (!ocrEngine) {
-       errEl.textContent = lang === "fr" ? "Sélectionnez un moteur OCR." : "Select an OCR engine.";
-       return;
-     }
-     comp.ocr_engine = ocrEngine;
-     comp.ocr_model = ocrModel;
-     comp.llm_provider = document.getElementById("compose-llm-provider").value;
-     comp.llm_model = document.getElementById("compose-llm-model").value;
-     comp.pipeline_mode = document.getElementById("compose-pipeline-mode").value;
-     comp.prompt_file = document.getElementById("compose-prompt").value;
-     if (!comp.llm_provider) {
-       errEl.textContent = lang === "fr" ? "Sélectionnez un provider LLM." : "Select an LLM provider.";
-       return;
-     }
-     comp.name = `${ocrEngine}${ocrModel ? ":"+ocrModel : ""} → ${comp.llm_model || comp.llm_provider}`;
-   } else {
-     // OCR only
-     const ocrEngine = document.getElementById("compose-ocr-engine").value;
-     const ocrModel = document.getElementById("compose-ocr-model").value;
-     if (!ocrEngine) {
-       errEl.textContent = lang === "fr" ? "Sélectionnez un moteur OCR." : "Select an OCR engine.";
-       return;
-     }
-     comp.ocr_engine = ocrEngine;
-     comp.ocr_model = ocrModel;
-     comp.name = `${ocrEngine}${ocrModel ? " ("+ocrModel+")" : ""}`;
-   }
-
-   errEl.textContent = "";
-   _competitors.push(comp);
-   renderCompetitors();
- }
-
- function removeCompetitor(idx) {
-   _competitors.splice(idx, 1);
-   renderCompetitors();
- }
-
- function renderCompetitors() {
-   const container = document.getElementById("competitors-list");
-   if (_competitors.length === 0) {
-     container.innerHTML = `<div style="color:var(--text-muted);font-size:12px;">${t("compose_empty")}</div>`;
-     return;
-   }
-   container.innerHTML = _competitors.map((c, i) => {
-     const isCorpusOCR = c.ocr_engine === "corpus" || (c.ocr_engine === "" && c.llm_provider);
-     const isPipeline = !!c.llm_provider && !isCorpusOCR;
-     let badge, detail;
-     if (isCorpusOCR) {
-       badge = "📝 Post-correction";
-       detail = `corpus_ocr → ${c.llm_provider}:${c.llm_model} [${c.pipeline_mode}]`;
-     } else if (isPipeline) {
-       badge = "⛓ Pipeline";
-       detail = `${c.ocr_engine}:${c.ocr_model} → ${c.llm_provider}:${c.llm_model} [${c.pipeline_mode}]`;
-     } else {
-       badge = "🔍 OCR";
-       detail = `${c.ocr_engine}:${c.ocr_model}`;
-     }
-     return `<div class="competitor-card">
-       <div class="competitor-info">
-         <span class="competitor-badge">${badge}</span>
-         <span class="competitor-name">${c.name}</span>
-         <span class="competitor-detail">${detail}</span>
-       </div>
-       <button class="btn btn-danger btn-sm" onclick="removeCompetitor(${i})">✕</button>
-     </div>`;
-   }).join("");
- }
-
- // ─── Normalization profiles ──────────────────────────────────────────────────
- let _normProfilesData = [];
- async function loadNormProfiles() {
-   try {
-     const r = await fetch("/api/normalization/profiles");
-     const d = await r.json();
-     _normProfilesData = d.profiles || [];
-     const sel = document.getElementById("norm-profile");
-     sel.innerHTML = "";
-     _normProfilesData.forEach(p => {
-       const opt = document.createElement("option");
-       opt.value = p.id;
-       opt.textContent = `${p.name} — ${p.description}`;
-       if (p.id === "nfc") opt.selected = true;
-       sel.appendChild(opt);
-     });
-     sel.addEventListener("change", () => {
-       const p = _normProfilesData.find(x => x.id === sel.value);
-       if (p && p.exclude_chars && p.exclude_chars.length) {
-         document.getElementById("char-exclude").value = p.exclude_chars.join(", ");
-       }
-     });
-   } catch(e) {}
- }
-
- // ─── File browser ────────────────────────────────────────────────────────────
- let _fbVisible = false;
- function openFileBrowser() {
-   _fbVisible = !_fbVisible;
-   const c = document.getElementById("file-browser-container");
-   c.style.display = _fbVisible ? "block" : "none";
-   if (_fbVisible) browsePath(".");
- }
- async function browsePath(path) {
-   try {
-     const r = await fetch(`/api/corpus/browse?path=${encodeURIComponent(path)}`);
-     const d = await r.json();
-     document.getElementById("fb-current-path").textContent = d.current_path;
-     const fb = document.getElementById("file-browser");
-     fb.innerHTML = "";
-     if (d.parent_path) {
-       const up = document.createElement("div");
-       up.className = "fb-item";
-       up.innerHTML = `<span class="fb-icon">⬆</span><span class="fb-name">..</span>`;
-       up.onclick = () => browsePath(d.parent_path);
-       fb.appendChild(up);
-     }
-     d.items.filter(i => i.is_dir).forEach(item => {
-       const el = document.createElement("div");
-       el.className = "fb-item";
-       const hasCorpus = item.has_corpus ? `<span class="fb-badge" style="color:var(--success)">✓ ${item.gt_count} GT</span>` : "";
-       el.innerHTML = `<span class="fb-icon">📁</span><span class="fb-name">${item.name}</span>${hasCorpus}`;
-       el.onclick = () => {
-         if (item.has_corpus) {
-           document.getElementById("corpus-path").value = item.path;
-           document.getElementById("corpus-info").textContent = `✓ ${item.gt_count} documents GT trouvés.`;
-           _fbVisible = false;
-           document.getElementById("file-browser-container").style.display = "none";
-         } else {
-           browsePath(item.path);
-         }
-       };
-       fb.appendChild(el);
-     });
-     if (fb.children.length === 0) {
-       fb.innerHTML = '<div style="padding:12px; color: var(--text-muted); font-size:12px;">Dossier vide</div>';
-     }
-   } catch(e) {
-     document.getElementById("file-browser").innerHTML =
-       `<div style="padding:12px; color: var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
-   }
- }
-
- // ─── Benchmark ───────────────────────────────────────────────────────────────
- let _currentJobId = null;
- let _eventSource = null;
-
- async function startBenchmark() {
-   const corpusPath = document.getElementById("corpus-path").value.trim();
-   if (!corpusPath) {
-     alert(lang === "fr" ? "Veuillez sélectionner un dossier corpus." : "Please select a corpus directory.");
-     return;
-   }
-   if (_competitors.length === 0) {
-     alert(lang === "fr" ? "Ajoutez au moins un concurrent (Section 4)." : "Add at least one competitor (Section 4).");
-     return;
-   }
-
-   const payload = {
-     corpus_path: corpusPath,
-     competitors: _competitors,
-     normalization_profile: document.getElementById("norm-profile").value,
-     char_exclude: document.getElementById("char-exclude").value.trim(),
-     output_dir: document.getElementById("output-dir").value,
-     report_name: document.getElementById("report-name").value,
-   };
-
-   document.getElementById("start-btn").disabled = true;
-   document.getElementById("cancel-btn").style.display = "inline-flex";
-   document.getElementById("bench-progress-section").style.display = "block";
-   document.getElementById("bench-result-section").style.display = "none";
-   document.getElementById("bench-log").textContent = "";
-   document.getElementById("engine-progress-list").innerHTML = "";
-   document.getElementById("bench-status-text").textContent = lang === "fr" ? "Démarrage…" : "Starting…";
-
-   try {
-     const r = await fetch("/api/benchmark/run", {
-       method: "POST",
-       headers: {"Content-Type": "application/json"},
-       body: JSON.stringify(payload),
-     });
-     if (!r.ok) {
-       const err = await r.json();
-       throw new Error(err.detail || "Erreur serveur");
-     }
-     const d = await r.json();
-     _currentJobId = d.job_id;
-     _startSSE(_currentJobId);
-   } catch(e) {
-     appendLog(`Erreur : ${e.message}`, "error");
-     document.getElementById("start-btn").disabled = false;
-     document.getElementById("cancel-btn").style.display = "none";
-     document.getElementById("bench-status-text").textContent = "";
-   }
- }
-
- function _startSSE(jobId) {
-   if (_eventSource) _eventSource.close();
-   const pl = document.getElementById("engine-progress-list");
-   pl.innerHTML = "";
-   const seenEngines = {};
-
-   _eventSource = new EventSource(`/api/benchmark/${jobId}/stream`);
-
-   _eventSource.addEventListener("start", e => {
-     const d = JSON.parse(e.data);
-     appendLog(d.message, "success");
-     document.getElementById("bench-status-text").textContent = lang === "fr" ? "En cours…" : "Running…";
-   });
-
-   _eventSource.addEventListener("log", e => {
-     const d = JSON.parse(e.data);
-     appendLog(d.message);
-   });
-
-   _eventSource.addEventListener("warning", e => {
-     const d = JSON.parse(e.data);
-     appendLog(d.message, "warn");
-   });
-
-   _eventSource.addEventListener("progress", e => {
-     const d = JSON.parse(e.data);
-     const pct = Math.round(d.progress * 100);
-     const engId = d.engine.replace(/[^a-z0-9_-]/gi, "_");
-     if (!seenEngines[engId]) {
-       seenEngines[engId] = true;
-       const div = document.createElement("div");
-       div.style = "margin-bottom: 8px;";
-       div.innerHTML = `<div style="display:flex;justify-content:space-between;font-size:12px;margin-bottom:3px;">
-         <span>${d.engine}</span><span id="eng-pct-${engId}">0%</span></div>
-         <div class="progress-bar-outer"><div class="progress-bar-inner" id="eng-bar-${engId}" style="width:0%"></div></div>`;
-       pl.appendChild(div);
-     }
-     const bar = document.getElementById(`eng-bar-${engId}`);
-     const pctEl = document.getElementById(`eng-pct-${engId}`);
-     if (bar) bar.style.width = pct + "%";
-     if (pctEl) pctEl.textContent = pct + "%";
-     document.getElementById("bench-status-text").textContent =
-       `${pct}% — ${d.engine} (${d.processed}/${d.total})`;
-   });
-
-   _eventSource.addEventListener("complete", e => {
-     const d = JSON.parse(e.data);
-     appendLog(d.message, "success");
-     _showResults(d);
-     _finishBenchmark();
-   });
-
-   _eventSource.addEventListener("error", e => {
-     const d = JSON.parse(e.data);
-     appendLog(d.message, "error");
-     _finishBenchmark();
-   });
-
-   _eventSource.addEventListener("cancelled", e => {
-     appendLog(lang === "fr" ? "Benchmark annulé." : "Benchmark cancelled.", "warn");
-     _finishBenchmark();
-   });
-
-   _eventSource.addEventListener("done", e => { _finishBenchmark(); });
-   _eventSource.onerror = () => { if (_currentJobId) _finishBenchmark(); };
- }
-
- function _showResults(data) {
-   const section = document.getElementById("bench-result-section");
-   section.style.display = "block";
-   if (data.output_html) {
-     const link = document.getElementById("bench-report-link");
-     link.href = `/reports/${data.output_html.split("/").pop()}`;
-   }
-   if (data.ranking) {
-     let html = `<table><thead><tr><th>#</th><th>${lang==="fr"?"Moteur":"Engine"}</th><th>CER</th><th>WER</th><th>${lang==="fr"?"Docs":"Docs"}</th></tr></thead><tbody>`;
-     data.ranking.forEach((row, i) => {
-       const cer = row.mean_cer != null ? (row.mean_cer*100).toFixed(2)+"%" : "N/A";
-       const wer = row.mean_wer != null ? (row.mean_wer*100).toFixed(2)+"%" : "N/A";
-       html += `<tr><td>${i+1}</td><td>${row.engine}</td><td>${cer}</td><td>${wer}</td><td>${row.total_docs || ""}</td></tr>`;
-     });
-     html += "</tbody></table>";
-     document.getElementById("bench-ranking-table").innerHTML = html;
-   }
- }
-
- function _finishBenchmark() {
-   if (_eventSource) { _eventSource.close(); _eventSource = null; }
-   document.getElementById("start-btn").disabled = false;
-   document.getElementById("cancel-btn").style.display = "none";
-   document.getElementById("bench-status-text").textContent = "";
- }
-
- async function cancelBenchmark() {
-   if (!_currentJobId) return;
-   await fetch(`/api/benchmark/${_currentJobId}/cancel`, {method: "POST"});
- }
-
- function appendLog(msg, cls) {
-   const box = document.getElementById("bench-log");
-   const line = document.createElement("div");
-   if (cls === "error") line.className = "log-error";
-   else if (cls === "warn") line.className = "log-warn";
-   else if (cls === "success") line.className = "log-success";
-   line.textContent = msg;
-   box.appendChild(line);
-   box.scrollTop = box.scrollHeight;
- }
-
- // ─── Reports ─────────────────────────────────────────────────────────────────
- async function loadReports() {
-   const dir = document.getElementById("reports-dir").value || ".";
-   const container = document.getElementById("reports-list");
-   container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${t("loading")}</div>`;
-   try {
-     const r = await fetch(`/api/reports?reports_dir=${encodeURIComponent(dir)}`);
-     const d = await r.json();
-     if (d.reports.length === 0) {
-       container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${t("no_reports")}</div>`;
-       return;
-     }
-     let html = `<table><thead><tr><th>${lang==="fr"?"Fichier":"File"}</th><th>${lang==="fr"?"Taille":"Size"}</th><th>${lang==="fr"?"Modifié":"Modified"}</th><th></th></tr></thead><tbody>`;
-     d.reports.forEach(rep => {
-       const date = new Date(rep.modified).toLocaleString(lang === "fr" ? "fr-FR" : "en-US");
-       html += `<tr><td>${rep.filename}</td><td>${rep.size_kb} Ko</td><td>${date}</td>
-         <td><a href="${rep.url}" target="_blank" class="btn btn-primary btn-sm">${lang==="fr"?"Ouvrir":"Open"}</a></td></tr>`;
-     });
-     html += "</tbody></table>";
-     container.innerHTML = html;
-   } catch(e) {
-     container.innerHTML = `<div style="color: var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
-   }
- }
-
- // ─── Engines status ──────────────────────────────────────────────────────────
- async function loadEngines() {
-   try {
-     const r = await fetch("/api/engines");
-     const d = await r.json();
-
-     // OCR
-     let html = `<table><thead><tr><th>ID</th><th>${lang==="fr"?"Nom":"Name"}</th><th>Version</th><th>Statut</th></tr></thead><tbody>`;
-     d.engines.forEach(e => {
-       const cls = e.available ? "badge-ok" : "badge-err";
-       const lbl = e.available ? t("available") : t("not_installed");
-       html += `<tr><td><code>${e.id}</code></td><td>${e.label}</td><td>${e.version||"—"}</td>
-         <td><span class="badge ${cls}">${lbl}</span></td></tr>`;
-     });
-     html += "</tbody></table>";
-     document.getElementById("engines-ocr-list").innerHTML = html;
-
-     // LLMs
-     let llmHtml = `<table><thead><tr><th>ID</th><th>${lang==="fr"?"Nom":"Name"}</th><th>Statut</th><th>${lang==="fr"?"Détail":"Detail"}</th></tr></thead><tbody>`;
-     d.llms.forEach(e => {
-       const cls = e.available ? "badge-ok" : "badge-warn";
-       const statusKey = e.status === "configured" ? "configured"
-         : e.status === "running" ? "running"
-         : e.status === "not_running" ? "not_running"
-         : "missing_key";
-       const lbl = t(statusKey);
-       let detail = "";
-       if (e.key_env) detail = `<code style="font-size:11px;">${e.key_env}</code>`;
-       if (e.models && e.models.length > 0) detail = e.models.slice(0, 3).join(", ");
-       llmHtml += `<tr><td><code>${e.id}</code></td><td>${e.label}</td>
-         <td><span class="badge ${cls}">${lbl}</span></td><td>${detail}</td></tr>`;
-     });
-     llmHtml += "</tbody></table>";
-     document.getElementById("engines-llm-list").innerHTML = llmHtml;
-   } catch(e) {
-     document.getElementById("engines-ocr-list").innerHTML =
-       `<div style="color: var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
-   }
- }
-
- // ─── HTR-United ──────────────────────────────────────────────────────────────
- async function initHTRFilters() {
-   try {
-     const r = await fetch("/api/htr-united/catalogue");
-     const d = await r.json();
-     const langSel = document.getElementById("htr-lang-filter");
-     const scriptSel = document.getElementById("htr-script-filter");
-     langSel.innerHTML = `<option value="">${t("all")}</option>`;
-     d.available_languages.forEach(l => {
-       langSel.innerHTML += `<option value="${l}">${l}</option>`;
-     });
-     scriptSel.innerHTML = `<option value="">${t("all")}</option>`;
-     d.available_scripts.forEach(s => {
-       scriptSel.innerHTML += `<option value="${s}">${s}</option>`;
-     });
-   } catch(e) {}
- }
-
- async function searchHTRUnited() {
-   const q = document.getElementById("htr-search").value;
-   const lang2 = document.getElementById("htr-lang-filter").value;
-   const script = document.getElementById("htr-script-filter").value;
-   const container = document.getElementById("htr-results");
-   container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${t("loading")}</div>`;
-   try {
-     const url = `/api/htr-united/catalogue?query=${encodeURIComponent(q)}&language=${encodeURIComponent(lang2)}&script=${encodeURIComponent(script)}`;
-     const r = await fetch(url);
-     const d = await r.json();
-     if (d.entries.length === 0) {
-       container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${lang==="fr"?"Aucun résultat.":"No results."}</div>`;
-       return;
-     }
-     container.innerHTML = d.entries.map(e => {
-       const tags = [...e.language, ...e.script].map(s => `<span class="ds-tag">${s}</span>`).join("");
-       return `<div class="ds-card">
-         <div style="display:flex; justify-content:space-between; align-items:flex-start;">
-           <h4>${e.title}</h4>
-           <button class="btn btn-primary btn-sm" onclick="openImportModal('htr', '${e.id}', '${e.title.replace(/'/g,"\\'")}')">
-             ${lang==="fr"?"Importer":"Import"}
-           </button>
-         </div>
-         <p>${e.description}</p>
-         <p style="color: var(--text-muted);">${e.institution} — ${e.lines.toLocaleString()} ${t("lines")} — ${e.format}</p>
-         <div class="ds-meta">${tags}</div>
-       </div>`;
-     }).join("");
-   } catch(e) {
-     container.innerHTML = `<div style="color: var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
-   }
- }
-
- async function searchHuggingFace() {
-   const q = document.getElementById("hf-search").value;
-   const langFilter = document.getElementById("hf-lang-filter").value;
-   const tags = document.getElementById("hf-tags").value;
-   const container = document.getElementById("hf-results");
-   container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${t("loading")}</div>`;
-   try {
-     const url = `/api/huggingface/search?query=${encodeURIComponent(q)}&language=${encodeURIComponent(langFilter)}&tags=${encodeURIComponent(tags)}`;
-     const r = await fetch(url);
-     const d = await r.json();
-     if (d.datasets.length === 0) {
-       container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${lang==="fr"?"Aucun résultat.":"No results."}</div>`;
-       return;
-     }
-     container.innerHTML = d.datasets.map(ds => {
-       const tags2 = ds.tags.slice(0,5).map(s => `<span class="ds-tag">${s}</span>`).join("");
-       return `<div class="ds-card">
-         <div style="display:flex; justify-content:space-between; align-items:flex-start;">
-           <h4>${ds.title}</h4>
-           <button class="btn btn-primary btn-sm" onclick="openImportModal('hf', '${ds.dataset_id.replace(/'/g,"\\'")}', '${ds.title.replace(/'/g,"\\'")}')">
-             ${lang==="fr"?"Importer":"Import"}
-           </button>
-         </div>
-         <p>${ds.description}</p>
-         <p style="color: var(--text-muted);">${ds.institution||ds.dataset_id} ${ds.downloads ? "— " + ds.downloads.toLocaleString() + " téléchargements" : ""}</p>
-         <div class="ds-meta">${tags2}</div>
-       </div>`;
-     }).join("");
-   } catch(e) {
-     container.innerHTML = `<div style="color: var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
-   }
- }
-
- // ─── Import modal ─────────────────────────────────────────────────────────────
- function openImportModal(type, id, title) {
-   document.getElementById("import-modal-type").value = type;
-   document.getElementById("import-modal-id").value = id;
-   document.getElementById("import-modal-title").textContent = `${t("import_modal_title")} : ${title}`;
-   document.getElementById("import-modal-status").innerHTML = "";
-   document.getElementById("import-modal").style.display = "flex";
- }
- function closeImportModal() {
-   document.getElementById("import-modal").style.display = "none";
- }
- async function confirmImport() {
2941
- const type = document.getElementById("import-modal-type").value;
2942
- const id = document.getElementById("import-modal-id").value;
2943
- const outputDir = document.getElementById("import-modal-output").value;
2944
- const maxSamples = parseInt(document.getElementById("import-modal-max").value);
2945
- const statusDiv = document.getElementById("import-modal-status");
2946
- statusDiv.innerHTML = `<div class="alert alert-info"><span class="spinner"></span> ${lang==="fr"?"Import en cours…":"Importing…"}</div>`;
2947
-
2948
- try {
2949
- let url, body;
2950
- if (type === "htr") {
2951
- url = "/api/htr-united/import";
2952
- body = {entry_id: id, output_dir: outputDir, max_samples: maxSamples};
2953
- } else {
2954
- url = "/api/huggingface/import";
2955
- body = {dataset_id: id, output_dir: outputDir, max_samples: maxSamples};
2956
- }
2957
- const r = await fetch(url, {method:"POST", headers:{"Content-Type":"application/json"}, body: JSON.stringify(body)});
2958
- const d = await r.json();
2959
- if (!r.ok) throw new Error(d.detail || "Erreur");
2960
- const msg = lang === "fr"
2961
- ? `✓ Import terminé. ${d.files_imported || 0} fichiers dans <code>${d.output_dir}</code>`
2962
- : `✓ Import done. ${d.files_imported || 0} files in <code>${d.output_dir}</code>`;
2963
- statusDiv.innerHTML = `<div class="alert alert-success">${msg}</div>`;
2964
- // Suggestion de corpus path
2965
- document.getElementById("corpus-path").value = d.output_dir;
2966
- } catch(e) {
2967
- statusDiv.innerHTML = `<div class="alert alert-error">Erreur : ${e.message}</div>`;
2968
- }
2969
- }
2970
-
- // ─── Corpus upload ────────────────────────────────────────────────────────────
- let _uploadMode = "zip"; // "zip" | "files"
-
- function switchCorpusTab(tab) {
-   document.getElementById("corpus-tab-browse").style.display = tab === "browse" ? "block" : "none";
-   document.getElementById("corpus-tab-upload").style.display = tab === "upload" ? "block" : "none";
-   document.getElementById("ctab-browse").classList.toggle("active", tab === "browse");
-   document.getElementById("ctab-upload").classList.toggle("active", tab === "upload");
-   if (tab === "upload") loadUploadedCorpora();
- }
-
- function onUploadModeChange() {
-   _uploadMode = document.querySelector("input[name=upload-mode]:checked").value;
-   const input = document.getElementById("upload-file-input");
-   if (_uploadMode === "zip") {
-     input.accept = ".zip";
-     input.multiple = false;
-     document.getElementById("upload-dropzone-text").textContent = t("upload_drop_zip");
-   } else {
-     input.accept = ".jpg,.jpeg,.png,.tif,.tiff,.webp,.gt.txt,.txt";
-     input.multiple = true;
-     document.getElementById("upload-dropzone-text").textContent = t("upload_drop_files");
-   }
- }
-
- function onFileInputChange(event) {
-   const files = Array.from(event.target.files);
-   if (files.length > 0) uploadCorpus(files);
- }
-
- function onDropFiles(event) {
-   event.preventDefault();
-   document.getElementById("upload-dropzone").classList.remove("dragover");
-   const files = Array.from(event.dataTransfer.files);
-   if (files.length > 0) uploadCorpus(files);
- }
-
- async function uploadCorpus(files) {
-   const progressContainer = document.getElementById("upload-progress-container");
-   const progressBar = document.getElementById("upload-progress-bar");
-   const progressText = document.getElementById("upload-progress-text");
-   const previewEl = document.getElementById("upload-preview");
-
-   progressContainer.style.display = "block";
-   progressBar.style.width = "10%";
-   progressText.textContent = t("upload_uploading");
-   previewEl.innerHTML = "";
-
-   const fd = new FormData();
-   for (const f of files) fd.append("files", f);
-
-   try {
-     // Simulate progress during upload
-     let pct = 10;
-     const timer = setInterval(() => {
-       pct = Math.min(pct + 5, 85);
-       progressBar.style.width = pct + "%";
-     }, 200);
-
-     const r = await fetch("/api/corpus/upload", {method: "POST", body: fd});
-     clearInterval(timer);
-     progressBar.style.width = "100%";
-
-     if (!r.ok) {
-       const err = await r.json();
-       throw new Error(err.detail || "Erreur serveur");
-     }
-     const d = await r.json();
-     progressText.textContent = `✓ ${t("upload_success")} — ${d.doc_count} ${t("upload_pairs")}`;
-     progressBar.style.background = "var(--success)";
-
-     // Show preview
-     renderUploadPreview(d, previewEl);
-
-     // Show corpus OCR notice if triplet corpus
-     _updateCorpusOCRNotice(d);
-
-     // Set corpus path and auto-select
-     setCorpusPath(d.corpus_path, `upload:${d.corpus_id} (${d.doc_count} docs)`);
-
-     // Refresh list
-     loadUploadedCorpora();
-   } catch(e) {
-     progressBar.style.width = "100%";
-     progressBar.style.background = "var(--danger)";
-     progressText.textContent = `✗ ${e.message}`;
-   }
- }
-
- function renderUploadPreview(data, container) {
-   const missingBadge = data.has_missing_gt
-     ? `<span class="badge badge-err" style="margin-left:8px;">${data.missing_gt.length} ${t("upload_missing_gt")}</span>`
-     : "";
-   const ocrBadge = (data.has_ocr_text && data.ocr_text_count > 0)
-     ? `<span class="badge" style="margin-left:8px; background:#dcfce7; color:#16a34a;">📝 ${data.ocr_text_count} .ocr.txt</span>`
-     : "";
-   let html = `<div class="corpus-preview">
-     <div class="corpus-preview-header">
-       <span>📄 ${data.doc_count} ${t("upload_pairs")}</span>${ocrBadge}${missingBadge}
-     </div>`;
-   for (const p of data.pairs) {
-     html += `<div class="corpus-preview-pair">
-       <span style="color:var(--text-muted);">🖼</span><span>${p.image}</span>
-       <span style="color:var(--text-muted); margin-left:auto;">↔</span>
-       <span style="color:var(--success);">${p.gt}</span>
-     </div>`;
-   }
-   if (data.total_pairs > data.pairs.length) {
-     html += `<div class="corpus-preview-more">… et ${data.total_pairs - data.pairs.length} autres paires</div>`;
-   }
-   for (const w of (data.warnings || [])) {
-     html += `<div style="padding:5px 12px; font-size:11px; color:var(--warning);">⚠ ${w}</div>`;
-   }
-   html += `</div>`;
-   container.innerHTML = html;
- }
-
- function setCorpusPath(path, label) {
-   document.getElementById("corpus-path").value = path;
-   document.getElementById("corpus-info").textContent = `✓ ${label}`;
- }
-
- function _updateCorpusOCRNotice(corpusData) {
-   const notice = document.getElementById("corpus-ocr-notice");
-   if (!notice) return;
-   if (corpusData && corpusData.has_ocr_text && corpusData.ocr_text_count > 0) {
-     notice.style.display = "block";
-     notice.innerHTML = `📝 ${t("corpus_has_ocr")} <strong>(${corpusData.ocr_text_count} fichiers .ocr.txt)</strong>`;
-   } else {
-     notice.style.display = "none";
-   }
- }
-
- async function loadUploadedCorpora() {
-   const container = document.getElementById("uploads-list");
-   try {
-     const r = await fetch("/api/corpus/uploads");
-     const d = await r.json();
-     if (d.uploads.length === 0) {
-       container.innerHTML = `<div style="color:var(--text-muted); font-size:12px;">${t("upload_no_corpus")}</div>`;
-       return;
-     }
-     const currentPath = document.getElementById("corpus-path").value;
-     container.innerHTML = d.uploads.map(u => {
-       const isSelected = u.corpus_path === currentPath;
-       const missing = u.has_missing_gt
-         ? `<span class="badge badge-warn" style="margin-left:6px;">${t("upload_missing_gt")}</span>` : "";
-       return `<div class="upload-corpus-item${isSelected ? " selected" : ""}"
-         onclick="setCorpusPath('${u.corpus_path}', 'upload (${u.doc_count} docs)'); loadUploadedCorpora()">
-         <span class="upload-corpus-label">
-           <strong>${u.doc_count} ${t("upload_pairs")}</strong>${missing}
-           <span style="display:block; font-size:11px; color:var(--text-muted); font-family:monospace;">${u.corpus_path}</span>
-         </span>
-         <button class="btn btn-danger btn-sm" onclick="event.stopPropagation(); deleteUploadedCorpus('${u.corpus_id}')"
-           title="${t("upload_delete")}">✕</button>
-       </div>`;
-     }).join("");
-   } catch(e) {
-     container.innerHTML = `<div style="color:var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
-   }
- }
-
- async function deleteUploadedCorpus(corpusId) {
-   try {
-     await fetch(`/api/corpus/uploads/${corpusId}`, {method: "DELETE"});
-     loadUploadedCorpora();
-     // Clear corpus path if it was the deleted one
-     const p = document.getElementById("corpus-path").value;
-     if (p.includes(corpusId)) {
-       document.getElementById("corpus-path").value = "";
-       document.getElementById("corpus-info").textContent = "";
-     }
-   } catch(e) {}
- }
-
- // ─── Init ────────────────────────────────────────────────────────────────────
- document.addEventListener("DOMContentLoaded", async () => {
-   loadStatus();
-   loadNormProfiles();
-   initHTRFilters();
-   // Load OCR engines, LLM models, initialize composer
-   await loadBenchmarkSections();
-   onComposeOCRChange(); // Pre-populate Tesseract languages
-   loadComposePrompts(); // Pre-load prompt files
-   startAutoRefresh(); // Auto-detect new API keys every 10 s
-   // Close modal on backdrop click
-   document.getElementById("import-modal").addEventListener("click", e => {
-     if (e.target === document.getElementById("import-modal")) closeImportModal();
-   });
- });
- </script>
- </body>
- </html>"""
 
  # Main HTML page (SPA)
  # ---------------------------------------------------------------------------
  
+ # Sprint 25: shared Jinja2 environment for the SPA.
+ # The inline HTML/CSS/JS that used to live in ``_HTML_TEMPLATE`` (3000+ lines
+ # of Python string) is now split into:
+ #   - picarones/web/templates/ (base + 7 Jinja2 partials)
+ #   - picarones/web/static/web-app.js (all of the JS logic)
+ # This split makes it possible to:
+ #   1. test each view independently;
+ #   2. work toward a ``script-src 'self'`` CSP (the JS is no longer inline);
+ #   3. change the UI without rereading a 3000-line file.
+ from jinja2 import Environment, FileSystemLoader, select_autoescape
+
+ _TEMPLATES_DIR = Path(__file__).parent / "templates"
+ _jinja_env = Environment(
+     loader=FileSystemLoader(str(_TEMPLATES_DIR)),
+     autoescape=select_autoescape(["html", "j2"]),
+     trim_blocks=False,
+     lstrip_blocks=False,
+ )
+
+
+ def _render_index(lang: str) -> str:
+     """Render the SPA from ``base.html.j2``. Deterministic for a given
+     (lang, version) pair; used by the Sprint 25 non-regression test."""
+     return _jinja_env.get_template("base.html.j2").render(
+         lang=lang,
+         version=__version__,
+     )
+
+
  @app.get("/", response_class=HTMLResponse)
  async def index(picarones_lang: str = Cookie(default="fr")) -> HTMLResponse:
      lang = picarones_lang if picarones_lang in _SUPPORTED_LANGS else "fr"
+     return HTMLResponse(content=_render_index(lang))
  
  
  # ---------------------------------------------------------------------------

  def _iso_now() -> str:
      return datetime.now(timezone.utc).isoformat(timespec="seconds")
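
Note on the hunk above: the sketch below illustrates, under stated assumptions, how a shared Jinja2 environment with `select_autoescape(["html", "j2"])` renders a base template that includes a partial, and why the output is reproducible for a given `(lang, version)` pair. The in-memory templates, the `render_index` helper and the `/static/web-app.js` URL are hypothetical stand-ins, not the project's actual files; only `Environment`, `DictLoader` and `select_autoescape` are real Jinja2 APIs.

```python
from jinja2 import DictLoader, Environment, select_autoescape

# Hypothetical in-memory templates standing in for picarones/web/templates/.
TEMPLATES = {
    "base.html.j2": (
        '<html lang="{{ lang }}"><body>'
        '{% include "_header_nav.html" %}'
        # Assumed static URL; the real mount point is not shown in this diff.
        '<script src="/static/web-app.js"></script>'
        "</body></html>"
    ),
    "_header_nav.html": "<header>Picarones v{{ version }}</header>",
}

# Autoescaping is enabled for template names ending in .html or .j2,
# which covers both "base.html.j2" and the "_*.html" partials.
env = Environment(
    loader=DictLoader(TEMPLATES),
    autoescape=select_autoescape(["html", "j2"]),
)


def render_index(lang: str, version: str) -> str:
    """Render the page; the output depends only on (lang, version)."""
    return env.get_template("base.html.j2").render(lang=lang, version=version)


first = render_index("fr", "1.0.0")
second = render_index("fr", "1.0.0")
# A value containing markup is escaped rather than injected as HTML.
escaped = render_index("fr", "<b>")

print(first == second)
print(escaped)
```

Because the environment is built once at import time, repeated calls reuse the compiled templates, which is presumably what the commit message means by "cache-friendly".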
picarones/web/security.py CHANGED
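
Note on this file's hunk: the CSP comment distinguishes inline `<script>` code (gone after Sprint 25) from inline `onclick=` handlers (still present, hence `'unsafe-inline'`). The sketch below shows one way to audit rendered HTML for both, in the spirit of the `TestNoInlineScriptCode` test named in the commit message. It is not that test: the `InlineScriptAudit` class, the `audit` helper and the sample page are hypothetical, and only the standard-library `html.parser` is used.

```python
from html.parser import HTMLParser


class InlineScriptAudit(HTMLParser):
    """Collects inline <script> bodies and inline on* event handlers."""

    def __init__(self) -> None:
        super().__init__()
        self._in_inline_script = False
        self.inline_script_bodies: list[str] = []
        self.inline_handlers: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        # A <script> without a src attribute carries its code inline.
        if tag == "script" and "src" not in dict(attrs):
            self._in_inline_script = True
            self.inline_script_bodies.append("")
        # Attributes such as onclick/onchange are inline event handlers.
        for name, _value in attrs:
            if name.startswith("on"):
                self.inline_handlers.append((tag, name))

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline_script = False

    def handle_data(self, data):
        if self._in_inline_script:
            self.inline_script_bodies[-1] += data


def audit(html: str) -> InlineScriptAudit:
    parser = InlineScriptAudit()
    parser.feed(html)
    parser.close()
    return parser


# Hypothetical rendered page: external JS plus one leftover inline handler.
page = (
    "<html><body>"
    '<button onclick="closeImportModal()">x</button>'
    '<script src="/static/web-app.js"></script>'
    "</body></html>"
)
report = audit(page)
has_inline_code = any(body.strip() for body in report.inline_script_bodies)
print(has_inline_code, report.inline_handlers)
```

Once `inline_handlers` comes back empty for every rendered view, dropping `'unsafe-inline'` from `script-src` becomes a realistic next step.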
@@ -256,9 +256,16 @@ class RateLimiter:
  # CSP middleware
  # ---------------------------------------------------------------------------
  
- #: Default CSP policy. ``unsafe-inline`` stays as long as ``_HTML_TEMPLATE``
- #: has not been refactored (Sprint 25: extraction of the Jinja2 web templates).
- #: Once the template is external, tighten to ``script-src 'self'`` + nonces.
+ #: Default CSP policy.
+ #:
+ #: Sprint 25 extracted all of the SPA's JavaScript (~1131 lines) into
+ #: ``picarones/web/static/web-app.js``, which is the concrete win. About
+ #: 30 inline ``onclick="..."`` handlers remain in the HTML, which forces
+ #: us to keep ``'unsafe-inline'`` in ``script-src``. Their migration to
+ #: ``addEventListener`` is planned (in a dedicated sub-sprint, kept apart
+ #: from the template extraction to limit the risk of regressions).
+ #: ``style-src`` stays on ``'unsafe-inline'`` for the semantic
+ #: ``style="..."`` attributes in the partials (green/red/yellow states).
  DEFAULT_CSP = (
      "default-src 'self'; "
      "script-src 'self' 'unsafe-inline'; "
picarones/web/static/web-app.js ADDED
@@ -0,0 +1,1131 @@
+ // ─── i18n ────────────────────────────────────────────────────────────────────
+ const T = {
+   fr: {
+     app_title: "Picarones",
+     nav_benchmark: "Benchmark",
+     nav_reports: "Rapports",
+     nav_engines: "Moteurs",
+     nav_import: "Import",
+     loading: "Chargement…",
+     search: "Rechercher",
+     all: "Tous",
+     cancel: "Annuler",
+     bench_corpus_title: "1. Corpus",
+     bench_corpus_label: "Chemin vers le dossier corpus (paires image / .gt.txt)",
+     bench_browse: "Parcourir",
+     corpus_tab_browse: "📁 Parcourir",
+     corpus_tab_upload: "⬆ Uploader",
+     upload_zip_mode: "Archive ZIP",
+     upload_files_mode: "Fichiers individuels",
+     upload_drop_zip: "Glissez un .zip ici ou cliquez pour sélectionner",
+     upload_drop_files: "Glissez des images + .gt.txt ou cliquez pour sélectionner",
+     upload_uploading: "Upload en cours…",
+     upload_success: "Corpus chargé avec succès",
+     upload_no_corpus: "Aucun corpus uploadé.",
+     upload_select: "Utiliser ce corpus",
+     upload_delete: "Supprimer",
+     upload_pairs: "paires",
+     upload_missing_gt: "GT manquant(s)",
+     bench_engines_title: "2. Moteurs et pipelines",
+     bench_ocr_title: "2. Moteurs OCR",
+     bench_llm_title: "3. Modèles LLM",
+     bench_compose_title: "4. Concurrents à benchmarker",
+     bench_options_title: "5. Options",
+     compose_ocr_only: "OCR seul",
+     compose_pipeline: "Pipeline OCR+LLM",
+     compose_postcorrection: "Post-correction (corpus OCR)",
+     corpus_has_ocr: "Ce corpus contient des fichiers OCR pré-calculés (.ocr.txt) — post-correction disponible.",
+     corpus_no_ocr_warn: "Ce corpus ne contient pas de fichiers .ocr.txt — uploadez un corpus triplet pour la post-correction.",
+     compose_ocr_engine: "Moteur OCR",
+     compose_ocr_model: "Modèle / Langue",
+     compose_llm_provider: "Provider LLM",
+     compose_llm_model: "Modèle LLM",
+     compose_mode: "Mode pipeline",
+     compose_prompt: "Prompt",
+     compose_add: "+ Ajouter",
+     compose_empty: "Aucun concurrent ajouté.",
+     mode_text_only: "Post-correction texte",
+     mode_text_image: "Post-correction image+texte",
+     mode_zero_shot: "Zero-shot",
+     bench_norm_label: "Profil de normalisation",
+     bench_lang_label: "Langue (Tesseract)",
+     bench_output_label: "Dossier de sortie",
+     bench_name_label: "Nom du rapport (optionnel)",
+     bench_start: "▶ Lancer le benchmark",
+     bench_cancel: "✕ Annuler",
+     bench_progress_title: "Progression",
+     bench_log: "Journal",
+     bench_result_title: "Résultats",
+     bench_open_report: "Ouvrir le rapport",
+     reports_title: "Rapports générés",
+     reports_dir_label: "Dossier de rapports",
+     reports_refresh: "Rafraîchir",
+     engines_ocr_title: "Moteurs OCR",
+     engines_llm_title: "LLMs disponibles",
+     import_htr_title: "Import HTR-United",
+     import_htr_desc: "Catalogue communautaire de corpus HTR/OCR pour documents patrimoniaux.",
+     import_hf_title: "Import HuggingFace Datasets",
+     import_hf_desc: "Datasets OCR/HTR publics depuis HuggingFace Hub (IAM, RIMES, CATMuS, Gallica…).",
+     import_search_label: "Recherche",
+     import_lang_filter: "Langue",
+     import_script_filter: "Type d'écriture",
+     import_tag_filter: "Tags",
+     import_modal_title: "Importer le corpus",
+     import_output_dir: "Dossier de destination",
+     import_max_samples: "Nombre max de documents",
+     import_confirm: "Importer",
+     available: "disponible",
+     not_installed: "non installé",
+     configured: "configuré",
+     missing_key: "clé manquante",
+     running: "actif",
+     not_running: "inactif",
+     no_reports: "Aucun rapport trouvé.",
+     lines: "lignes",
+     centuries: "siècles",
+   },
+   en: {
+     app_title: "Picarones",
+     nav_benchmark: "Benchmark",
+     nav_reports: "Reports",
+     nav_engines: "Engines",
+     nav_import: "Import",
+     loading: "Loading…",
+     search: "Search",
+     all: "All",
+     cancel: "Cancel",
+     bench_corpus_title: "1. Corpus",
+     bench_corpus_label: "Path to corpus directory (image / .gt.txt pairs)",
+     bench_browse: "Browse",
+     corpus_tab_browse: "📁 Browse",
+     corpus_tab_upload: "⬆ Upload",
+     upload_zip_mode: "ZIP archive",
+     upload_files_mode: "Individual files",
+     upload_drop_zip: "Drop a .zip here or click to select",
+     upload_drop_files: "Drop images + .gt.txt files or click to select",
+     upload_uploading: "Uploading…",
+     upload_success: "Corpus loaded successfully",
+     upload_no_corpus: "No corpus uploaded.",
+     upload_select: "Use this corpus",
+     upload_delete: "Delete",
+     upload_pairs: "pairs",
+     upload_missing_gt: "missing GT",
+     bench_engines_title: "2. Engines & pipelines",
+     bench_ocr_title: "2. OCR Engines",
+     bench_llm_title: "3. LLM Models",
+     bench_compose_title: "4. Competitors",
+     bench_options_title: "5. Options",
+     compose_ocr_only: "OCR only",
+     compose_pipeline: "OCR+LLM Pipeline",
+     compose_postcorrection: "Post-correction (corpus OCR)",
+     corpus_has_ocr: "This corpus contains pre-computed OCR files (.ocr.txt) — post-correction available.",
+     corpus_no_ocr_warn: "This corpus has no .ocr.txt files — upload a triplet corpus for post-correction.",
+     compose_ocr_engine: "OCR Engine",
+     compose_ocr_model: "Model / Language",
+     compose_llm_provider: "LLM Provider",
+     compose_llm_model: "LLM Model",
+     compose_mode: "Pipeline mode",
+     compose_prompt: "Prompt",
+     compose_add: "+ Add",
+     compose_empty: "No competitors added.",
+     mode_text_only: "Text post-correction",
+     mode_text_image: "Image+text post-correction",
+     mode_zero_shot: "Zero-shot",
+     bench_norm_label: "Normalization profile",
+     bench_lang_label: "Language (Tesseract)",
+     bench_output_label: "Output directory",
+     bench_name_label: "Report name (optional)",
+     bench_start: "▶ Start benchmark",
+     bench_cancel: "✕ Cancel",
+     bench_progress_title: "Progress",
+     bench_log: "Log",
+     bench_result_title: "Results",
+     bench_open_report: "Open report",
+     reports_title: "Generated reports",
+     reports_dir_label: "Reports directory",
+     reports_refresh: "Refresh",
+     engines_ocr_title: "OCR Engines",
+     engines_llm_title: "Available LLMs",
+     import_htr_title: "Import from HTR-United",
+     import_htr_desc: "Community catalogue of HTR/OCR datasets for heritage documents.",
+     import_hf_title: "Import from HuggingFace Datasets",
+     import_hf_desc: "Public OCR/HTR datasets from HuggingFace Hub (IAM, RIMES, CATMuS, Gallica…).",
+     import_search_label: "Search",
+     import_lang_filter: "Language",
+     import_script_filter: "Script type",
+     import_tag_filter: "Tags",
+     import_modal_title: "Import corpus",
+     import_output_dir: "Output directory",
+     import_max_samples: "Max documents",
+     import_confirm: "Import",
+     available: "available",
+     not_installed: "not installed",
+     configured: "configured",
+     missing_key: "key missing",
+     running: "running",
+     not_running: "not running",
+     no_reports: "No reports found.",
+     lines: "lines",
+     centuries: "centuries",
+   },
+ };
+ let lang = "fr";
+ function t(key) { return (T[lang][key]) || key; }
+ function toggleLang() {
+   lang = lang === "fr" ? "en" : "fr";
+   document.getElementById("lang-btn").textContent = lang === "fr" ? "EN" : "FR";
+   document.querySelectorAll("[data-i18n]").forEach(el => {
+     const k = el.getAttribute("data-i18n");
+     if (T[lang][k]) el.textContent = T[lang][k];
+   });
+ }
+
+ // ─── Navigation ──────────────────────────────────────────────────────────────
+ function showView(name) {
+   document.querySelectorAll(".view").forEach(v => v.classList.remove("active"));
+   document.querySelectorAll(".nav-btn").forEach(b => b.classList.remove("active"));
+   const view = document.getElementById("view-" + name);
+   if (view) view.classList.add("active");
+   const btns = document.querySelectorAll(".nav-btn");
+   const idx = ["benchmark","reports","engines","import"].indexOf(name);
+   if (btns[idx]) btns[idx].classList.add("active");
+
+   if (name === "reports") loadReports();
+   if (name === "engines") loadEngines();
+   if (name === "import") { searchHTRUnited(); searchHuggingFace(); }
+ }
+
+ // ─── Status / version ────────────────────────────────────────────────────────
+ async function loadStatus() {
+   try {
+     const r = await fetch("/api/status");
+     const d = await r.json();
+     document.getElementById("app-version").textContent = "v" + d.version;
+   } catch(e) {}
+ }
+
+ // ─── Models cache & fetching ─────────────────────────────────────────────────
+ let _modelsCache = {};
+ let _enginesData = null;
+ let _competitors = [];
+ let _refreshIntervalId = null;
+ let _pendingOCREngine = null; // guard against stale responses (race condition)
+
+ async function fetchModels(provider, capability) {
+   const cacheKey = capability ? `${provider}__${capability}` : provider;
+   if (_modelsCache[cacheKey]) return _modelsCache[cacheKey];
+   const url = capability ? `/api/models/${provider}?capability=${capability}` : `/api/models/${provider}`;
+   const r = await fetch(url);
+   const d = await r.json();
+   // Support both new format (objects with id+capabilities) and old format (flat strings)
+   let models = d.model_ids || d.models || [];
+   if (models.length > 0 && typeof models[0] === "object") {
+     models = models.map(m => m.id || m);
+   }
+   _modelsCache[cacheKey] = models;
+   return models;
+ }
+
+ function populateSelect(selectId, models, spinnerId) {
+   const sel = document.getElementById(selectId);
+   if (spinnerId) { const sp = document.getElementById(spinnerId); if (sp) sp.style.display = "none"; }
+   if (!sel) return;
+   // Handle both string arrays and object arrays
+   const items = models.map(m => typeof m === "object" ? (m.id || m) : m);
+   sel.innerHTML = items.length === 0
+     ? '<option value="">— aucun modèle —</option>'
+     : items.map(m => `<option value="${m}">${m}</option>`).join("");
+ }
+
+ // ─── Benchmark sections (OCR + LLM status + composer init) ───────────────────
+ async function loadBenchmarkSections() {
+   try {
+     const r = await fetch("/api/engines");
+     const d = await r.json();
+     _enginesData = d;
+     renderOCREnginesSection(d.engines);
+     renderLLMSection(d.llms);
+   } catch(e) {
+     document.getElementById("ocr-engines-status-list").innerHTML =
+       `<div style="color:var(--danger);font-size:12px;">Erreur : ${e.message}</div>`;
+   }
+ }
+
+ function _makeProviderRow(eng, msId) {
+   const dotCls = eng.available ? "status-ok" : (eng.status === "not_running" ? "status-warn" : "status-err");
+   let statusLabel;
+   if (eng.available) statusLabel = eng.version ? eng.version : (lang === "fr" ? "disponible" : "available");
+   else if (eng.status === "missing_key") statusLabel = eng.key_env ? `<code style="font-size:11px;color:var(--warning)">${eng.key_env}</code>` : (lang === "fr" ? "clé manquante" : "key missing");
+   else if (eng.status === "not_running") statusLabel = lang === "fr" ? "inactif" : "not running";
+   else statusLabel = lang === "fr" ? "non installé" : "not installed";
+
+   const row = document.createElement("div");
+   row.className = "provider-row";
+   row.innerHTML = `
+     <div class="provider-label"><span class="engine-status ${dotCls}"></span><strong>${eng.label}</strong></div>
+     <div class="provider-status">${statusLabel}</div>
+     <div class="provider-model-select" id="${msId}">${eng.available ? '<span class="spinner"></span>' : ""}</div>`;
+   return row;
+ }
+
+ async function renderOCREnginesSection(engines) {
+   const container = document.getElementById("ocr-engines-status-list");
+   container.innerHTML = "";
+   for (const eng of engines) {
+     const msId = `ms-ocr-${eng.id}`;
+     container.appendChild(_makeProviderRow(eng, msId));
+     if (eng.available) {
+       fetchModels(eng.id).then(models => {
+         const div = document.getElementById(msId);
+         if (!div) return;
+         div.innerHTML = models.length === 0
+           ? `<span style="color:var(--text-muted);font-size:11px;">—</span>`
+           : `<span style="font-size:12px;">${models.slice(0,5).join(", ")}${models.length > 5 ? ` +${models.length-5}` : ""}</span>`;
+       }).catch(() => {
+         const div = document.getElementById(msId);
+         if (div) div.innerHTML = `<span style="color:var(--danger);font-size:11px;">Erreur API</span>`;
+       });
+     }
+   }
+ }
+
+ async function renderLLMSection(llms) {
+   const container = document.getElementById("llm-status-list");
+   container.innerHTML = "";
+   for (const llm of llms) {
+     const msId = `ms-llm-${llm.id}`;
+     container.appendChild(_makeProviderRow(llm, msId));
+     if (llm.available) {
+       fetchModels(llm.id).then(models => {
+         const div = document.getElementById(msId);
+         if (!div) return;
+         div.innerHTML = models.length === 0
+           ? `<span style="color:var(--text-muted);font-size:11px;">—</span>`
+           : `<span style="font-size:12px;">${models.slice(0,3).join(", ")}${models.length > 3 ? ` +${models.length-3}` : ""}</span>`;
+       }).catch(() => {
+         const div = document.getElementById(msId);
+         if (div) div.innerHTML = `<span style="color:var(--danger);font-size:11px;">Erreur API</span>`;
+       });
+     }
+   }
+ }
+
+ function startAutoRefresh() {
+   if (_refreshIntervalId) clearInterval(_refreshIntervalId);
+   _refreshIntervalId = setInterval(async () => {
+     try {
+       const r = await fetch("/api/engines");
+       const d = await r.json();
+       if (!_enginesData || JSON.stringify(d) !== JSON.stringify(_enginesData)) {
+         _modelsCache = {};
+         _enginesData = d;
+         renderOCREnginesSection(d.engines);
+         renderLLMSection(d.llms);
+       }
+     } catch(e) {}
+   }, 10000);
+ }
+
329
+ // ─── Competitor composer ──────────────────────────────────────────────────────
330
+ async function onComposeOCRChange() {
331
+ const engine = document.getElementById("compose-ocr-engine").value;
332
+ _pendingOCREngine = engine; // mark the current request
333
+ const sp = document.getElementById("sp-ocr-model");
334
+ // Google Vision and Azure have static lists, so no API call is needed
335
+ if (engine === "google_vision") {
336
+ sp.style.display = "none";
337
+ populateSelect("compose-ocr-model", ["document_text_detection", "text_detection"], null);
338
+ return;
339
+ }
340
+ if (engine === "azure_doc_intel") {
341
+ sp.style.display = "none";
342
+ populateSelect("compose-ocr-model", ["prebuilt-document", "prebuilt-read"], null);
343
+ return;
344
+ }
345
+ // Tesseract: installed languages; Mistral OCR: vision models (dynamic API)
346
+ sp.style.display = "inline-block";
347
+ try {
348
+ const models = await fetchModels(engine);
349
+ if (_pendingOCREngine !== engine) return; // stale response, discard
350
+ populateSelect("compose-ocr-model", models, "sp-ocr-model");
351
+ } catch(e) {
352
+ if (_pendingOCREngine !== engine) return;
353
+ sp.style.display = "none";
354
+ document.getElementById("compose-ocr-model").innerHTML = '<option value="">Erreur</option>';
355
+ }
356
+ }
357
+
358
+ async function onComposeLLMChange() {
359
+ const provider = document.getElementById("compose-llm-provider").value;
360
+ const composeMode = document.querySelector("input[name=compose-mode]:checked").value;
361
+ const pipelineMode = document.getElementById("compose-pipeline-mode").value;
362
+ // Apply capability filter for modes requiring vision
363
+ const needsVision = (pipelineMode === "text_and_image" || pipelineMode === "zero_shot");
364
+ const capability = (composeMode === "postcorrection" || composeMode === "pipeline") && needsVision ? "vision" : "";
365
+ _loadLLMModelsWithCapability(provider, capability);
366
+ }
367
+
368
+ function onComposeModeChange() {
369
+ const mode = document.querySelector("input[name=compose-mode]:checked").value;
370
+ const ocrSection = document.getElementById("compose-ocr-section");
371
+ const pipelineSection = document.getElementById("compose-pipeline-section");
372
+
373
+ if (mode === "ocr") {
374
+ ocrSection.style.display = "flex";
375
+ pipelineSection.style.display = "none";
376
+ } else if (mode === "pipeline") {
377
+ ocrSection.style.display = "flex";
378
+ pipelineSection.style.display = "block";
379
+ // Reload LLM models (the vision filter still applies if the pipeline mode needs images)
380
+ onComposeLLMChange();
381
+ } else if (mode === "postcorrection") {
382
+ ocrSection.style.display = "none";
383
+ pipelineSection.style.display = "block";
384
+ // Reload LLM models with capability filter based on pipeline mode
385
+ onComposePipelineModeChange();
386
+ }
387
+ }
388
+
389
+ function onComposePipelineModeChange() {
390
+ const composeMode = document.querySelector("input[name=compose-mode]:checked").value;
391
+ if (composeMode !== "postcorrection" && composeMode !== "pipeline") return;
392
+ const pipelineMode = document.getElementById("compose-pipeline-mode").value;
393
+ // Filter by vision capability for modes that need images
394
+ const needsVision = (pipelineMode === "text_and_image" || pipelineMode === "zero_shot");
395
+ const capability = needsVision ? "vision" : "";
396
+ const provider = document.getElementById("compose-llm-provider").value;
397
+ // Clear cache for this provider to re-fetch with new capability filter
398
+ const cacheKey = capability ? `${provider}__${capability}` : provider;
399
+ delete _modelsCache[cacheKey];
400
+ _loadLLMModelsWithCapability(provider, capability);
401
+ }
402
+
403
+ async function _loadLLMModelsWithCapability(provider, capability) {
404
+ document.getElementById("sp-llm-model").style.display = "inline-block";
405
+ try {
406
+ const models = await fetchModels(provider, capability);
407
+ populateSelect("compose-llm-model", models, "sp-llm-model");
408
+ } catch(e) {
409
+ document.getElementById("sp-llm-model").style.display = "none";
410
+ document.getElementById("compose-llm-model").innerHTML = '<option value="">Erreur</option>';
411
+ }
412
+ }
413
+
414
+ async function loadComposePrompts() {
415
+ document.getElementById("sp-prompt").style.display = "inline-block";
416
+ try {
417
+ const models = await fetchModels("prompts");
418
+ populateSelect("compose-prompt", models, "sp-prompt");
419
+ } catch(e) {
420
+ document.getElementById("sp-prompt").style.display = "none";
421
+ }
422
+ }
423
+
424
+ function addCompetitor() {
425
+ const mode = document.querySelector("input[name=compose-mode]:checked").value;
426
+ const errEl = document.getElementById("compose-error");
427
+
428
+ const comp = { name: "", ocr_engine: "", ocr_model: "",
429
+ llm_provider: "", llm_model: "", pipeline_mode: "", prompt_file: "" };
430
+
431
+ if (mode === "postcorrection") {
432
+ // Post-correction: the OCR text comes from the corpus (.ocr.txt)
433
+ comp.ocr_engine = "corpus";
434
+ comp.llm_provider = document.getElementById("compose-llm-provider").value;
435
+ comp.llm_model = document.getElementById("compose-llm-model").value;
436
+ comp.pipeline_mode = document.getElementById("compose-pipeline-mode").value;
437
+ comp.prompt_file = document.getElementById("compose-prompt").value;
438
+ if (!comp.llm_provider || !comp.llm_model) {
439
+ errEl.textContent = lang === "fr" ? "Sélectionnez un provider et un modèle LLM." : "Select an LLM provider and model.";
440
+ return;
441
+ }
442
+ const modeLabel = {"text_only":"texte","text_and_image":"img+texte","zero_shot":"zero-shot"}[comp.pipeline_mode] || comp.pipeline_mode;
443
+ comp.name = `📝 ${comp.llm_model} [${modeLabel}]`;
444
+ } else if (mode === "pipeline") {
445
+ const ocrEngine = document.getElementById("compose-ocr-engine").value;
446
+ const ocrModel = document.getElementById("compose-ocr-model").value;
447
+ if (!ocrEngine) {
448
+ errEl.textContent = lang === "fr" ? "Sélectionnez un moteur OCR." : "Select an OCR engine.";
449
+ return;
450
+ }
451
+ comp.ocr_engine = ocrEngine;
452
+ comp.ocr_model = ocrModel;
453
+ comp.llm_provider = document.getElementById("compose-llm-provider").value;
454
+ comp.llm_model = document.getElementById("compose-llm-model").value;
455
+ comp.pipeline_mode = document.getElementById("compose-pipeline-mode").value;
456
+ comp.prompt_file = document.getElementById("compose-prompt").value;
457
+ if (!comp.llm_provider) {
458
+ errEl.textContent = lang === "fr" ? "Sélectionnez un provider LLM." : "Select an LLM provider.";
459
+ return;
460
+ }
461
+ comp.name = `${ocrEngine}${ocrModel ? ":"+ocrModel : ""} → ${comp.llm_model || comp.llm_provider}`;
462
+ } else {
463
+ // OCR only
464
+ const ocrEngine = document.getElementById("compose-ocr-engine").value;
465
+ const ocrModel = document.getElementById("compose-ocr-model").value;
466
+ if (!ocrEngine) {
467
+ errEl.textContent = lang === "fr" ? "Sélectionnez un moteur OCR." : "Select an OCR engine.";
468
+ return;
469
+ }
470
+ comp.ocr_engine = ocrEngine;
471
+ comp.ocr_model = ocrModel;
472
+ comp.name = `${ocrEngine}${ocrModel ? " ("+ocrModel+")" : ""}`;
473
+ }
474
+
475
+ errEl.textContent = "";
476
+ _competitors.push(comp);
477
+ renderCompetitors();
478
+ }
479
+
480
+ function removeCompetitor(idx) {
481
+ _competitors.splice(idx, 1);
482
+ renderCompetitors();
483
+ }
484
+
485
+ function renderCompetitors() {
486
+ const container = document.getElementById("competitors-list");
487
+ if (_competitors.length === 0) {
488
+ container.innerHTML = `<div style="color:var(--text-muted);font-size:12px;">${t("compose_empty")}</div>`;
489
+ return;
490
+ }
491
+ container.innerHTML = _competitors.map((c, i) => {
492
+ const isCorpusOCR = c.ocr_engine === "corpus" || (c.ocr_engine === "" && c.llm_provider);
493
+ const isPipeline = !!c.llm_provider && !isCorpusOCR;
494
+ let badge, detail;
495
+ if (isCorpusOCR) {
496
+ badge = "📝 Post-correction";
497
+ detail = `corpus_ocr → ${c.llm_provider}:${c.llm_model} [${c.pipeline_mode}]`;
498
+ } else if (isPipeline) {
499
+ badge = "⛓ Pipeline";
500
+ detail = `${c.ocr_engine}:${c.ocr_model} → ${c.llm_provider}:${c.llm_model} [${c.pipeline_mode}]`;
501
+ } else {
502
+ badge = "🔍 OCR";
503
+ detail = `${c.ocr_engine}:${c.ocr_model}`;
504
+ }
505
+ return `<div class="competitor-card">
506
+ <div class="competitor-info">
507
+ <span class="competitor-badge">${badge}</span>
508
+ <span class="competitor-name">${c.name}</span>
509
+ <span class="competitor-detail">${detail}</span>
510
+ </div>
511
+ <button class="btn btn-danger btn-sm" onclick="removeCompetitor(${i})">✕</button>
512
+ </div>`;
513
+ }).join("");
514
+ }
515
+
516
+ // ─── Normalization profiles ──────────────────────────────────────────────────
517
+ let _normProfilesData = [];
518
+ async function loadNormProfiles() {
519
+ try {
520
+ const r = await fetch("/api/normalization/profiles");
521
+ const d = await r.json();
522
+ _normProfilesData = d.profiles || [];
523
+ const sel = document.getElementById("norm-profile");
524
+ sel.innerHTML = "";
525
+ _normProfilesData.forEach(p => {
526
+ const opt = document.createElement("option");
527
+ opt.value = p.id;
528
+ opt.textContent = `${p.name} — ${p.description}`;
529
+ if (p.id === "nfc") opt.selected = true;
530
+ sel.appendChild(opt);
531
+ });
532
+ sel.addEventListener("change", () => {
533
+ const p = _normProfilesData.find(x => x.id === sel.value);
534
+ if (p && p.exclude_chars && p.exclude_chars.length) {
535
+ document.getElementById("char-exclude").value = p.exclude_chars.join(", ");
536
+ }
537
+ });
538
+ } catch(e) {}
539
+ }
540
+
541
+ // ─── File browser ────────────────────────────────────────────────────────────
542
+ let _fbVisible = false;
543
+ function openFileBrowser() {
544
+ _fbVisible = !_fbVisible;
545
+ const c = document.getElementById("file-browser-container");
546
+ c.style.display = _fbVisible ? "block" : "none";
547
+ if (_fbVisible) browsePath(".");
548
+ }
549
+ async function browsePath(path) {
550
+ try {
551
+ const r = await fetch(`/api/corpus/browse?path=${encodeURIComponent(path)}`);
552
+ const d = await r.json();
553
+ document.getElementById("fb-current-path").textContent = d.current_path;
554
+ const fb = document.getElementById("file-browser");
555
+ fb.innerHTML = "";
556
+ if (d.parent_path) {
557
+ const up = document.createElement("div");
558
+ up.className = "fb-item";
559
+ up.innerHTML = `<span class="fb-icon">⬆</span><span class="fb-name">..</span>`;
560
+ up.onclick = () => browsePath(d.parent_path);
561
+ fb.appendChild(up);
562
+ }
563
+ d.items.filter(i => i.is_dir).forEach(item => {
564
+ const el = document.createElement("div");
565
+ el.className = "fb-item";
566
+ const hasCorpus = item.has_corpus ? `<span class="fb-badge" style="color:var(--success)">✓ ${item.gt_count} GT</span>` : "";
567
+ el.innerHTML = `<span class="fb-icon">📁</span><span class="fb-name">${item.name}</span>${hasCorpus}`;
568
+ el.onclick = () => {
569
+ if (item.has_corpus) {
570
+ document.getElementById("corpus-path").value = item.path;
571
+ document.getElementById("corpus-info").textContent = `✓ ${item.gt_count} documents GT trouvés.`;
572
+ _fbVisible = false;
573
+ document.getElementById("file-browser-container").style.display = "none";
574
+ } else {
575
+ browsePath(item.path);
576
+ }
577
+ };
578
+ fb.appendChild(el);
579
+ });
580
+ if (fb.children.length === 0) {
581
+ fb.innerHTML = '<div style="padding:12px; color: var(--text-muted); font-size:12px;">Dossier vide</div>';
582
+ }
583
+ } catch(e) {
584
+ document.getElementById("file-browser").innerHTML =
585
+ `<div style="padding:12px; color: var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
586
+ }
587
+ }
588
+
589
+ // ─── Benchmark ───────────────────────────────────────────────────────────────
590
+ let _currentJobId = null;
591
+ let _eventSource = null;
592
+
593
+ async function startBenchmark() {
594
+ const corpusPath = document.getElementById("corpus-path").value.trim();
595
+ if (!corpusPath) {
596
+ alert(lang === "fr" ? "Veuillez sélectionner un dossier corpus." : "Please select a corpus directory.");
597
+ return;
598
+ }
599
+ if (_competitors.length === 0) {
600
+ alert(lang === "fr" ? "Ajoutez au moins un concurrent (Section 4)." : "Add at least one competitor (Section 4).");
601
+ return;
602
+ }
603
+
604
+ const payload = {
605
+ corpus_path: corpusPath,
606
+ competitors: _competitors,
607
+ normalization_profile: document.getElementById("norm-profile").value,
608
+ char_exclude: document.getElementById("char-exclude").value.trim(),
609
+ output_dir: document.getElementById("output-dir").value,
610
+ report_name: document.getElementById("report-name").value,
611
+ };
612
+
613
+ document.getElementById("start-btn").disabled = true;
614
+ document.getElementById("cancel-btn").style.display = "inline-flex";
615
+ document.getElementById("bench-progress-section").style.display = "block";
616
+ document.getElementById("bench-result-section").style.display = "none";
617
+ document.getElementById("bench-log").textContent = "";
618
+ document.getElementById("engine-progress-list").innerHTML = "";
619
+ document.getElementById("bench-status-text").textContent = lang === "fr" ? "Démarrage…" : "Starting…";
620
+
621
+ try {
622
+ const r = await fetch("/api/benchmark/run", {
623
+ method: "POST",
624
+ headers: {"Content-Type": "application/json"},
625
+ body: JSON.stringify(payload),
626
+ });
627
+ if (!r.ok) {
628
+ const err = await r.json();
629
+ throw new Error(err.detail || "Erreur serveur");
630
+ }
631
+ const d = await r.json();
632
+ _currentJobId = d.job_id;
633
+ _startSSE(_currentJobId);
634
+ } catch(e) {
635
+ appendLog(`Erreur : ${e.message}`, "error");
636
+ document.getElementById("start-btn").disabled = false;
637
+ document.getElementById("cancel-btn").style.display = "none";
638
+ document.getElementById("bench-status-text").textContent = "";
639
+ }
640
+ }
641
+
642
+ function _startSSE(jobId) {
643
+ if (_eventSource) _eventSource.close();
644
+ const pl = document.getElementById("engine-progress-list");
645
+ pl.innerHTML = "";
646
+ const seenEngines = {};
647
+
648
+ _eventSource = new EventSource(`/api/benchmark/${jobId}/stream`);
649
+
650
+ _eventSource.addEventListener("start", e => {
651
+ const d = JSON.parse(e.data);
652
+ appendLog(d.message, "success");
653
+ document.getElementById("bench-status-text").textContent = lang === "fr" ? "En cours…" : "Running…";
654
+ });
655
+
656
+ _eventSource.addEventListener("log", e => {
657
+ const d = JSON.parse(e.data);
658
+ appendLog(d.message);
659
+ });
660
+
661
+ _eventSource.addEventListener("warning", e => {
662
+ const d = JSON.parse(e.data);
663
+ appendLog(d.message, "warn");
664
+ });
665
+
666
+ _eventSource.addEventListener("progress", e => {
667
+ const d = JSON.parse(e.data);
668
+ const pct = Math.round(d.progress * 100);
669
+ const engId = d.engine.replace(/[^a-z0-9_-]/gi, "_");
670
+ if (!seenEngines[engId]) {
671
+ seenEngines[engId] = true;
672
+ const div = document.createElement("div");
673
+ div.style = "margin-bottom: 8px;";
674
+ div.innerHTML = `<div style="display:flex;justify-content:space-between;font-size:12px;margin-bottom:3px;">
675
+ <span>${d.engine}</span><span id="eng-pct-${engId}">0%</span></div>
676
+ <div class="progress-bar-outer"><div class="progress-bar-inner" id="eng-bar-${engId}" style="width:0%"></div></div>`;
677
+ pl.appendChild(div);
678
+ }
679
+ const bar = document.getElementById(`eng-bar-${engId}`);
680
+ const pctEl = document.getElementById(`eng-pct-${engId}`);
681
+ if (bar) bar.style.width = pct + "%";
682
+ if (pctEl) pctEl.textContent = pct + "%";
683
+ document.getElementById("bench-status-text").textContent =
684
+ `${pct}% — ${d.engine} (${d.processed}/${d.total})`;
685
+ });
686
+
687
+ _eventSource.addEventListener("complete", e => {
688
+ const d = JSON.parse(e.data);
689
+ appendLog(d.message, "success");
690
+ _showResults(d);
691
+ _finishBenchmark();
692
+ });
693
+
694
+ _eventSource.addEventListener("error", e => {
695
+ const d = JSON.parse(e.data);
696
+ appendLog(d.message, "error");
697
+ _finishBenchmark();
698
+ });
699
+
700
+ _eventSource.addEventListener("cancelled", e => {
701
+ appendLog(lang === "fr" ? "Benchmark annulé." : "Benchmark cancelled.", "warn");
702
+ _finishBenchmark();
703
+ });
704
+
705
+ _eventSource.addEventListener("done", e => { _finishBenchmark(); });
706
+ _eventSource.onerror = () => { if (_currentJobId) _finishBenchmark(); };
707
+ }
708
+
709
+ function _showResults(data) {
710
+ const section = document.getElementById("bench-result-section");
711
+ section.style.display = "block";
712
+ if (data.output_html) {
713
+ const link = document.getElementById("bench-report-link");
714
+ link.href = `/reports/${data.output_html.split("/").pop()}`;
715
+ }
716
+ if (data.ranking) {
717
+ let html = `<table><thead><tr><th>#</th><th>${lang==="fr"?"Moteur":"Engine"}</th><th>CER</th><th>WER</th><th>${lang==="fr"?"Docs":"Docs"}</th></tr></thead><tbody>`;
718
+ data.ranking.forEach((row, i) => {
719
+ const cer = row.mean_cer != null ? (row.mean_cer*100).toFixed(2)+"%" : "N/A";
720
+ const wer = row.mean_wer != null ? (row.mean_wer*100).toFixed(2)+"%" : "N/A";
721
+ html += `<tr><td>${i+1}</td><td>${row.engine}</td><td>${cer}</td><td>${wer}</td><td>${row.total_docs || ""}</td></tr>`;
722
+ });
723
+ html += "</tbody></table>";
724
+ document.getElementById("bench-ranking-table").innerHTML = html;
725
+ }
726
+ }
727
+
728
+ function _finishBenchmark() {
729
+ if (_eventSource) { _eventSource.close(); _eventSource = null; }
730
+ document.getElementById("start-btn").disabled = false;
731
+ document.getElementById("cancel-btn").style.display = "none";
732
+ document.getElementById("bench-status-text").textContent = "";
733
+ }
734
+
735
+ async function cancelBenchmark() {
736
+ if (!_currentJobId) return;
737
+ await fetch(`/api/benchmark/${_currentJobId}/cancel`, {method: "POST"});
738
+ }
739
+
740
+ function appendLog(msg, cls) {
741
+ const box = document.getElementById("bench-log");
742
+ const line = document.createElement("div");
743
+ if (cls === "error") line.className = "log-error";
744
+ else if (cls === "warn") line.className = "log-warn";
745
+ else if (cls === "success") line.className = "log-success";
746
+ line.textContent = msg;
747
+ box.appendChild(line);
748
+ box.scrollTop = box.scrollHeight;
749
+ }
750
+
751
+ // ─── Reports ─────────────────────────────────────────────────────────────────
752
+ async function loadReports() {
753
+ const dir = document.getElementById("reports-dir").value || ".";
754
+ const container = document.getElementById("reports-list");
755
+ container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${t("loading")}</div>`;
756
+ try {
757
+ const r = await fetch(`/api/reports?reports_dir=${encodeURIComponent(dir)}`);
758
+ const d = await r.json();
759
+ if (d.reports.length === 0) {
760
+ container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${t("no_reports")}</div>`;
761
+ return;
762
+ }
763
+ let html = `<table><thead><tr><th>${lang==="fr"?"Fichier":"File"}</th><th>${lang==="fr"?"Taille":"Size"}</th><th>${lang==="fr"?"Modifié":"Modified"}</th><th></th></tr></thead><tbody>`;
764
+ d.reports.forEach(rep => {
765
+ const date = new Date(rep.modified).toLocaleString(lang === "fr" ? "fr-FR" : "en-US");
766
+ html += `<tr><td>${rep.filename}</td><td>${rep.size_kb} Ko</td><td>${date}</td>
767
+ <td><a href="${rep.url}" target="_blank" class="btn btn-primary btn-sm">${lang==="fr"?"Ouvrir":"Open"}</a></td></tr>`;
768
+ });
769
+ html += "</tbody></table>";
770
+ container.innerHTML = html;
771
+ } catch(e) {
772
+ container.innerHTML = `<div style="color: var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
773
+ }
774
+ }
775
+
776
+ // ─── Engines status ──────────────────────────────────────────────────────────
777
+ async function loadEngines() {
778
+ try {
779
+ const r = await fetch("/api/engines");
780
+ const d = await r.json();
781
+
782
+ // OCR
783
+ let html = `<table><thead><tr><th>ID</th><th>${lang==="fr"?"Nom":"Name"}</th><th>Version</th><th>Statut</th></tr></thead><tbody>`;
784
+ d.engines.forEach(e => {
785
+ const cls = e.available ? "badge-ok" : "badge-err";
786
+ const lbl = e.available ? t("available") : t("not_installed");
787
+ html += `<tr><td><code>${e.id}</code></td><td>${e.label}</td><td>${e.version||"—"}</td>
788
+ <td><span class="badge ${cls}">${lbl}</span></td></tr>`;
789
+ });
790
+ html += "</tbody></table>";
791
+ document.getElementById("engines-ocr-list").innerHTML = html;
792
+
793
+ // LLMs
794
+ let llmHtml = `<table><thead><tr><th>ID</th><th>${lang==="fr"?"Nom":"Name"}</th><th>Statut</th><th>${lang==="fr"?"Détail":"Detail"}</th></tr></thead><tbody>`;
795
+ d.llms.forEach(e => {
796
+ const cls = e.available ? "badge-ok" : "badge-warn";
797
+ const statusKey = e.status === "configured" ? "configured"
798
+ : e.status === "running" ? "running"
799
+ : e.status === "not_running" ? "not_running"
800
+ : "missing_key";
801
+ const lbl = t(statusKey);
802
+ let detail = "";
803
+ if (e.key_env) detail = `<code style="font-size:11px;">${e.key_env}</code>`;
804
+ if (e.models && e.models.length > 0) detail = e.models.slice(0, 3).join(", ");
805
+ llmHtml += `<tr><td><code>${e.id}</code></td><td>${e.label}</td>
806
+ <td><span class="badge ${cls}">${lbl}</span></td><td>${detail}</td></tr>`;
807
+ });
808
+ llmHtml += "</tbody></table>";
809
+ document.getElementById("engines-llm-list").innerHTML = llmHtml;
810
+ } catch(e) {
811
+ document.getElementById("engines-ocr-list").innerHTML =
812
+ `<div style="color: var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
813
+ }
814
+ }
815
+
816
+ // ─── HTR-United ──────────────────────────────────────────────────────────────
817
+ async function initHTRFilters() {
818
+ try {
819
+ const r = await fetch("/api/htr-united/catalogue");
820
+ const d = await r.json();
821
+ const langSel = document.getElementById("htr-lang-filter");
822
+ const scriptSel = document.getElementById("htr-script-filter");
823
+ langSel.innerHTML = `<option value="">${t("all")}</option>`;
824
+ d.available_languages.forEach(l => {
825
+ langSel.innerHTML += `<option value="${l}">${l}</option>`;
826
+ });
827
+ scriptSel.innerHTML = `<option value="">${t("all")}</option>`;
828
+ d.available_scripts.forEach(s => {
829
+ scriptSel.innerHTML += `<option value="${s}">${s}</option>`;
830
+ });
831
+ } catch(e) {}
832
+ }
833
+
834
+ async function searchHTRUnited() {
835
+ const q = document.getElementById("htr-search").value;
836
+ const lang2 = document.getElementById("htr-lang-filter").value;
837
+ const script = document.getElementById("htr-script-filter").value;
838
+ const container = document.getElementById("htr-results");
839
+ container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${t("loading")}</div>`;
840
+ try {
841
+ const url = `/api/htr-united/catalogue?query=${encodeURIComponent(q)}&language=${encodeURIComponent(lang2)}&script=${encodeURIComponent(script)}`;
842
+ const r = await fetch(url);
843
+ const d = await r.json();
844
+ if (d.entries.length === 0) {
845
+ container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${lang==="fr"?"Aucun résultat.":"No results."}</div>`;
846
+ return;
847
+ }
848
+ container.innerHTML = d.entries.map(e => {
849
+ const tags = [...e.language, ...e.script].map(s => `<span class="ds-tag">${s}</span>`).join("");
850
+ return `<div class="ds-card">
851
+ <div style="display:flex; justify-content:space-between; align-items:flex-start;">
852
+ <h4>${e.title}</h4>
853
+ <button class="btn btn-primary btn-sm" onclick="openImportModal('htr', '${e.id}', '${e.title.replace(/'/g,"\\'")}')">
854
+ ${lang==="fr"?"Importer":"Import"}
855
+ </button>
856
+ </div>
857
+ <p>${e.description}</p>
858
+ <p style="color: var(--text-muted);">${e.institution} — ${e.lines.toLocaleString()} ${t("lines")} — ${e.format}</p>
859
+ <div class="ds-meta">${tags}</div>
860
+ </div>`;
861
+ }).join("");
862
+ } catch(e) {
863
+ container.innerHTML = `<div style="color: var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
864
+ }
865
+ }
866
+
867
+ async function searchHuggingFace() {
868
+ const q = document.getElementById("hf-search").value;
869
+ const langFilter = document.getElementById("hf-lang-filter").value;
870
+ const tags = document.getElementById("hf-tags").value;
871
+ const container = document.getElementById("hf-results");
872
+ container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${t("loading")}</div>`;
873
+ try {
874
+ const url = `/api/huggingface/search?query=${encodeURIComponent(q)}&language=${encodeURIComponent(langFilter)}&tags=${encodeURIComponent(tags)}`;
875
+ const r = await fetch(url);
876
+ const d = await r.json();
877
+ if (d.datasets.length === 0) {
878
+ container.innerHTML = `<div style="color: var(--text-muted); font-size:12px;">${lang==="fr"?"Aucun résultat.":"No results."}</div>`;
879
+ return;
880
+ }
881
+ container.innerHTML = d.datasets.map(ds => {
882
+ const tags2 = ds.tags.slice(0,5).map(s => `<span class="ds-tag">${s}</span>`).join("");
883
+ return `<div class="ds-card">
884
+ <div style="display:flex; justify-content:space-between; align-items:flex-start;">
885
+ <h4>${ds.title}</h4>
886
+ <button class="btn btn-primary btn-sm" onclick="openImportModal('hf', '${ds.dataset_id.replace(/'/g,"\\'")}', '${ds.title.replace(/'/g,"\\'")}')">
887
+ ${lang==="fr"?"Importer":"Import"}
888
+ </button>
889
+ </div>
890
+ <p>${ds.description}</p>
891
+ <p style="color: var(--text-muted);">${ds.institution||ds.dataset_id} ${ds.downloads ? "— " + ds.downloads.toLocaleString() + " téléchargements" : ""}</p>
892
+ <div class="ds-meta">${tags2}</div>
893
+ </div>`;
894
+ }).join("");
895
+ } catch(e) {
896
+ container.innerHTML = `<div style="color: var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
897
+ }
898
+ }
899
+
900
+ // ─── Import modal ─────────────────────────────────────────────────────────────
901
+ function openImportModal(type, id, title) {
902
+ document.getElementById("import-modal-type").value = type;
903
+ document.getElementById("import-modal-id").value = id;
904
+ document.getElementById("import-modal-title").textContent = `${t("import_modal_title")} : ${title}`;
905
+ document.getElementById("import-modal-status").innerHTML = "";
906
+ document.getElementById("import-modal").style.display = "flex";
907
+ }
908
+ function closeImportModal() {
909
+ document.getElementById("import-modal").style.display = "none";
910
+ }
911
+ async function confirmImport() {
912
+ const type = document.getElementById("import-modal-type").value;
913
+ const id = document.getElementById("import-modal-id").value;
914
+ const outputDir = document.getElementById("import-modal-output").value;
915
+ const maxSamples = parseInt(document.getElementById("import-modal-max").value, 10);
916
+ const statusDiv = document.getElementById("import-modal-status");
917
+ statusDiv.innerHTML = `<div class="alert alert-info"><span class="spinner"></span> ${lang==="fr"?"Import en cours…":"Importing…"}</div>`;
918
+
919
+ try {
920
+ let url, body;
921
+ if (type === "htr") {
922
+ url = "/api/htr-united/import";
923
+ body = {entry_id: id, output_dir: outputDir, max_samples: maxSamples};
924
+ } else {
925
+ url = "/api/huggingface/import";
926
+ body = {dataset_id: id, output_dir: outputDir, max_samples: maxSamples};
927
+ }
928
+ const r = await fetch(url, {method:"POST", headers:{"Content-Type":"application/json"}, body: JSON.stringify(body)});
929
+ const d = await r.json();
930
+ if (!r.ok) throw new Error(d.detail || "Erreur");
931
+ const msg = lang === "fr"
932
+ ? `✓ Import terminé. ${d.files_imported || 0} fichiers dans <code>${d.output_dir}</code>`
933
+ : `✓ Import done. ${d.files_imported || 0} files in <code>${d.output_dir}</code>`;
934
+ statusDiv.innerHTML = `<div class="alert alert-success">${msg}</div>`;
935
+ // Suggest the import output directory as the corpus path
936
+ document.getElementById("corpus-path").value = d.output_dir;
937
+ } catch(e) {
938
+ statusDiv.innerHTML = `<div class="alert alert-error">Erreur : ${e.message}</div>`;
939
+ }
940
+ }
941
+
942
+ // ─── Corpus upload ────────────────────────────────────────────────────────────
943
+ let _uploadMode = "zip"; // "zip" | "files"
944
+
945
+ function switchCorpusTab(tab) {
946
+ document.getElementById("corpus-tab-browse").style.display = tab === "browse" ? "block" : "none";
947
+ document.getElementById("corpus-tab-upload").style.display = tab === "upload" ? "block" : "none";
948
+ document.getElementById("ctab-browse").classList.toggle("active", tab === "browse");
949
+ document.getElementById("ctab-upload").classList.toggle("active", tab === "upload");
950
+ if (tab === "upload") loadUploadedCorpora();
951
+ }
952
+
953
+ function onUploadModeChange() {
954
+ _uploadMode = document.querySelector("input[name=upload-mode]:checked").value;
955
+ const input = document.getElementById("upload-file-input");
956
+ if (_uploadMode === "zip") {
957
+ input.accept = ".zip";
958
+ input.multiple = false;
959
+ document.getElementById("upload-dropzone-text").textContent = t("upload_drop_zip");
960
+ } else {
961
+ input.accept = ".jpg,.jpeg,.png,.tif,.tiff,.webp,.gt.txt,.txt";
962
+ input.multiple = true;
963
+ document.getElementById("upload-dropzone-text").textContent = t("upload_drop_files");
964
+ }
965
+ }
966
+
967
+ function onFileInputChange(event) {
968
+ const files = Array.from(event.target.files);
969
+ if (files.length > 0) uploadCorpus(files);
970
+ }
971
+
972
+ function onDropFiles(event) {
973
+ event.preventDefault();
974
+ document.getElementById("upload-dropzone").classList.remove("dragover");
975
+ const files = Array.from(event.dataTransfer.files);
976
+ if (files.length > 0) uploadCorpus(files);
977
+ }
978
+
+ async function uploadCorpus(files) {
+   const progressContainer = document.getElementById("upload-progress-container");
+   const progressBar = document.getElementById("upload-progress-bar");
+   const progressText = document.getElementById("upload-progress-text");
+   const previewEl = document.getElementById("upload-preview");
+
+   progressContainer.style.display = "block";
+   progressBar.style.width = "10%";
+   progressText.textContent = t("upload_uploading");
+   previewEl.innerHTML = "";
+
+   const fd = new FormData();
+   for (const f of files) fd.append("files", f);
+
+   try {
+     // Simulate progress during upload
+     let pct = 10;
+     const timer = setInterval(() => {
+       pct = Math.min(pct + 5, 85);
+       progressBar.style.width = pct + "%";
+     }, 200);
+
+     const r = await fetch("/api/corpus/upload", {method: "POST", body: fd});
+     clearInterval(timer);
+     progressBar.style.width = "100%";
+
+     if (!r.ok) {
+       const err = await r.json();
+       throw new Error(err.detail || "Erreur serveur");
+     }
+     const d = await r.json();
+     progressText.textContent = `✓ ${t("upload_success")} — ${d.doc_count} ${t("upload_pairs")}`;
+     progressBar.style.background = "var(--success)";
+
+     // Show preview
+     renderUploadPreview(d, previewEl);
+
+     // Show corpus OCR notice if triplet corpus
+     _updateCorpusOCRNotice(d);
+
+     // Set corpus path and auto-select
+     setCorpusPath(d.corpus_path, `upload:${d.corpus_id} (${d.doc_count} docs)`);
+
+     // Refresh list
+     loadUploadedCorpora();
+   } catch(e) {
+     progressBar.style.width = "100%";
+     progressBar.style.background = "var(--danger)";
+     progressText.textContent = `✗ ${e.message}`;
+   }
+ }
+
+ function renderUploadPreview(data, container) {
+   const missingBadge = data.has_missing_gt
+     ? `<span class="badge badge-err" style="margin-left:8px;">${data.missing_gt.length} ${t("upload_missing_gt")}</span>`
+     : "";
+   const ocrBadge = (data.has_ocr_text && data.ocr_text_count > 0)
+     ? `<span class="badge" style="margin-left:8px; background:#dcfce7; color:#16a34a;">📝 ${data.ocr_text_count} .ocr.txt</span>`
+     : "";
+   let html = `<div class="corpus-preview">
+     <div class="corpus-preview-header">
+       <span>📄 ${data.doc_count} ${t("upload_pairs")}</span>${ocrBadge}${missingBadge}
+     </div>`;
+   for (const p of data.pairs) {
+     html += `<div class="corpus-preview-pair">
+       <span style="color:var(--text-muted);">🖼</span><span>${p.image}</span>
+       <span style="color:var(--text-muted); margin-left:auto;">↔</span>
+       <span style="color:var(--success);">${p.gt}</span>
+     </div>`;
+   }
+   if (data.total_pairs > data.pairs.length) {
+     html += `<div class="corpus-preview-more">… et ${data.total_pairs - data.pairs.length} autres paires</div>`;
+   }
+   for (const w of (data.warnings || [])) {
+     html += `<div style="padding:5px 12px; font-size:11px; color:var(--warning);">⚠ ${w}</div>`;
+   }
+   html += `</div>`;
+   container.innerHTML = html;
+ }
+
+ function setCorpusPath(path, label) {
+   document.getElementById("corpus-path").value = path;
+   document.getElementById("corpus-info").textContent = `✓ ${label}`;
+ }
+
+ function _updateCorpusOCRNotice(corpusData) {
+   const notice = document.getElementById("corpus-ocr-notice");
+   if (!notice) return;
+   if (corpusData && corpusData.has_ocr_text && corpusData.ocr_text_count > 0) {
+     notice.style.display = "block";
+     notice.innerHTML = `📝 ${t("corpus_has_ocr")} <strong>(${corpusData.ocr_text_count} fichiers .ocr.txt)</strong>`;
+   } else {
+     notice.style.display = "none";
+   }
+ }
+
+ async function loadUploadedCorpora() {
+   const container = document.getElementById("uploads-list");
+   try {
+     const r = await fetch("/api/corpus/uploads");
+     const d = await r.json();
+     if (d.uploads.length === 0) {
+       container.innerHTML = `<div style="color:var(--text-muted); font-size:12px;">${t("upload_no_corpus")}</div>`;
+       return;
+     }
+     const currentPath = document.getElementById("corpus-path").value;
+     container.innerHTML = d.uploads.map(u => {
+       const isSelected = u.corpus_path === currentPath;
+       const missing = u.has_missing_gt
+         ? `<span class="badge badge-warn" style="margin-left:6px;">${t("upload_missing_gt")}</span>` : "";
+       return `<div class="upload-corpus-item${isSelected ? " selected" : ""}"
+         onclick="setCorpusPath('${u.corpus_path}', 'upload (${u.doc_count} docs)'); loadUploadedCorpora()">
+         <span class="upload-corpus-label">
+           <strong>${u.doc_count} ${t("upload_pairs")}</strong>${missing}
+           <span style="display:block; font-size:11px; color:var(--text-muted); font-family:monospace;">${u.corpus_path}</span>
+         </span>
+         <button class="btn btn-danger btn-sm" onclick="event.stopPropagation(); deleteUploadedCorpus('${u.corpus_id}')"
+           title="${t("upload_delete")}">✕</button>
+       </div>`;
+     }).join("");
+   } catch(e) {
+     container.innerHTML = `<div style="color:var(--danger); font-size:12px;">Erreur : ${e.message}</div>`;
+   }
+ }
+
+ async function deleteUploadedCorpus(corpusId) {
+   try {
+     await fetch(`/api/corpus/uploads/${corpusId}`, {method: "DELETE"});
+     loadUploadedCorpora();
+     // Clear corpus path if it was the deleted one
+     const p = document.getElementById("corpus-path").value;
+     if (p.includes(corpusId)) {
+       document.getElementById("corpus-path").value = "";
+       document.getElementById("corpus-info").textContent = "";
+     }
+   } catch(e) {}
+ }
+
+ // ─── Init ────────────────────────────────────────────────────────────────────
+ document.addEventListener("DOMContentLoaded", async () => {
+   loadStatus();
+   loadNormProfiles();
+   initHTRFilters();
+   // Load OCR engines, LLM models, initialize composer
+   await loadBenchmarkSections();
+   onComposeOCRChange();   // Pre-populate Tesseract languages
+   loadComposePrompts();   // Pre-load prompt files
+   startAutoRefresh();     // Auto-detect new API keys every 10 s
+   // Close modal on backdrop click
+   document.getElementById("import-modal").addEventListener("click", e => {
+     if (e.target === document.getElementById("import-modal")) closeImportModal();
+   });
+ });
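Reviewer note: `loadUploadedCorpora` above still emits inline `onclick=` attributes in the markup it generates, which is the kind of handler that keeps `'unsafe-inline'` in `script-src`. A minimal sketch of how such leftovers could be counted in any HTML string, using only the standard library. The names `InlineJsAudit` and `audit_inline_js` are hypothetical and not part of this commit.

```python
from html.parser import HTMLParser


class InlineJsAudit(HTMLParser):
    """Collects inline event-handler attributes and non-empty inline <script> bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.handlers = []          # (tag, attribute name) pairs such as ("div", "onclick")
        self.inline_scripts = []    # stripped bodies of <script> tags that have no src=
        self._in_inline_script = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        names = [name for name, _ in attrs]
        # HTMLParser lowercases attribute names, so "onClick" is caught too.
        for name in names:
            if name.startswith("on"):
                self.handlers.append((tag, name))
        if tag == "script" and "src" not in names:
            self._in_inline_script = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_inline_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_inline_script:
            body = "".join(self._buffer).strip()
            if body:  # whitespace-only blocks are ignored
                self.inline_scripts.append(body)
            self._in_inline_script = False


def audit_inline_js(html):
    """Return (handlers, inline_scripts) found in an HTML string."""
    parser = InlineJsAudit()
    parser.feed(html)
    parser.close()
    return parser.handlers, parser.inline_scripts
```

Run against the rendered page, the first list would show how many inline handlers remain to be migrated to `addEventListener`. Markup built at runtime by `web-app.js` is not covered, since it never appears in the server-rendered HTML.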
picarones/web/templates/base.html.j2 ADDED
@@ -0,0 +1,37 @@
+ {# Sprint 25: base template of the Picarones SPA.
+ #
+ # Assembles the partials extracted from the former monolithic `_HTML_TEMPLATE`
+ # in `web/app.py`. Expected variables (passed by `_render_index`):
+ #   - lang    : language code ("fr" or "en")
+ #   - version : Picarones version (cache-busting for static assets)
+ #
+ # Dynamic content stays minimal: all the logic lives in
+ # `picarones/web/static/web-app.js`. This separation will let the CSP tighten
+ # `script-src` to `'self'` once the inline `onclick=` handlers are migrated.
+ #}<!DOCTYPE html>
+ <html lang="{{ lang }}">
+ <head>
+   <meta charset="UTF-8">
+   <meta name="viewport" content="width=device-width, initial-scale=1.0">
+   <meta name="picarones-lang" content="{{ lang }}">
+   <title>Picarones — OCR Benchmark</title>
+   <link rel="stylesheet" href="/static/retro.css?v={{ version }}">
+ </head>
+ <body>
+
+ {% include '_ascii_banner.html' %}
+
+ {% include '_header_nav.html' %}
+
+ <div id="main">
+   {% include '_view_benchmark.html' %}
+   {% include '_view_reports.html' %}
+   {% include '_view_engines.html' %}
+   {% include '_view_import.html' %}
+ </div><!-- end #main -->
+
+ {% include '_modals.html' %}
+
+ <script src="/static/web-app.js?v={{ version }}"></script>
+ </body>
+ </html>
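Reviewer note: the template trusts `lang` to be either "fr" or "en". A minimal sketch of how a raw cookie value could be normalised before rendering, assuming French is the fallback. `SUPPORTED_LANGS`, `DEFAULT_LANG` and `resolve_lang` are hypothetical names used for illustration; the actual logic in `app.py` is not shown in this diff.

```python
# Hypothetical constants: the template comment only states that `lang`
# is "fr" or "en"; the French default is an assumption.
SUPPORTED_LANGS = ("fr", "en")
DEFAULT_LANG = "fr"


def resolve_lang(cookie_value):
    """Return a supported language code, falling back to the default."""
    if cookie_value in SUPPORTED_LANGS:
        return cookie_value
    return DEFAULT_LANG
```

Normalising before rendering means an unexpected cookie value never reaches the `lang` attribute or the `picarones-lang` meta tag.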
pyproject.toml CHANGED
@@ -81,6 +81,9 @@ include = ["picarones*"]
  picarones = [
      "prompts/*.txt",
      "web/static/*.css",
+     "web/static/*.js",
+     "web/templates/*.j2",
+     "web/templates/*.html",
      "report/templates/*.j2",
      "report/templates/*.html",
      "report/templates/*.css",
tests/test_sprint25_web_jinja_refactor.py ADDED
@@ -0,0 +1,250 @@
+ """Sprint 25 tests: web frontend refactor mirroring Sprint 17.
+
+ Sprint 17 split the monolithic HTML report into 10 Jinja2 files.
+ Sprint 25 does the same for the web SPA: the former ``_HTML_TEMPLATE``, a
+ ~1500-line Python string (3000+ lines in total with the JS) in
+ ``picarones/web/app.py``, is replaced by:
+
+     picarones/web/templates/
+     ├── base.html.j2
+     ├── _ascii_banner.html
+     ├── _header_nav.html
+     ├── _view_benchmark.html
+     ├── _view_reports.html
+     ├── _view_engines.html
+     ├── _view_import.html
+     └── _modals.html
+
+     picarones/web/static/
+     ├── retro.css    (already existed)
+     └── web-app.js   (extracted from the inline <script>)
+
+ This module checks that:
+
+ 1. The expected files exist and are not empty.
+ 2. ``_render_index`` is deterministic (Sprint 17 imposed the same rule).
+ 3. The critical structural elements are present (views, nav, modals).
+ 4. No element is duplicated (e.g. two ``id="view-benchmark"``).
+ 5. No inline ``<script>...</script>`` block carrying code is left in the
+    rendered page, only ``<script src="...">``.
+ 6. ``picarones/web/app.py`` has dropped below 2000 lines
+    (was 3163; long-term Sprint 25 target: ≤ 400, but we start by
+    measuring the win from the template extraction).
+ 7. The rendered HTML reflects the language cookie (FR vs EN).
+ """
+
+ from __future__ import annotations
+
+ import re
+ from pathlib import Path
+
+ import pytest
+ from fastapi.testclient import TestClient
+
+ ROOT = Path(__file__).parent.parent
+ WEB_DIR = ROOT / "picarones" / "web"
+ TEMPLATES_DIR = WEB_DIR / "templates"
+ STATIC_DIR = WEB_DIR / "static"
+ APP_PY = WEB_DIR / "app.py"
+
+
+ # ---------------------------------------------------------------------------
+ # 1. Presence and size of the extracted files
+ # ---------------------------------------------------------------------------
+
+ EXPECTED_TEMPLATES = [
+     "base.html.j2",
+     "_ascii_banner.html",
+     "_header_nav.html",
+     "_view_benchmark.html",
+     "_view_reports.html",
+     "_view_engines.html",
+     "_view_import.html",
+     "_modals.html",
+ ]
+
+
+ class TestTemplateFilesExist:
+     @pytest.mark.parametrize("name", EXPECTED_TEMPLATES)
+     def test_template_present_and_non_empty(self, name):
+         path = TEMPLATES_DIR / name
+         assert path.is_file(), f"Missing template: {path}"
+         assert path.stat().st_size > 30, f"Suspicious template (empty?): {path}"
+
+     def test_web_app_js_extracted(self):
+         path = STATIC_DIR / "web-app.js"
+         assert path.is_file(), "web-app.js must be extracted into static/"
+         # The former inline <script> weighed ~1131 lines
+         line_count = sum(1 for _ in path.read_text(encoding="utf-8").splitlines())
+         assert line_count > 500, (
+             f"web-app.js looks too short ({line_count} lines): "
+             "incomplete extraction?"
+         )
+
+     def test_retro_css_still_present(self):
+         # Sanity check: the main CSS was not accidentally deleted.
+         assert (STATIC_DIR / "retro.css").is_file()
+
+
+ # ---------------------------------------------------------------------------
+ # 2. Rendering determinism
+ # ---------------------------------------------------------------------------
+
+ class TestRenderIndexDeterminism:
+     def test_same_inputs_same_output(self):
+         from picarones.web.app import _render_index
+
+         a = _render_index("fr")
+         b = _render_index("fr")
+         assert a == b, "Non-deterministic rendering for lang=fr"
+
+     def test_lang_change_changes_output(self):
+         from picarones.web.app import _render_index
+
+         fr = _render_index("fr")
+         en = _render_index("en")
+         assert fr != en, "The rendering must depend on the language"
+         assert 'name="picarones-lang" content="fr"' in fr
+         assert 'name="picarones-lang" content="en"' in en
+
+     def test_html_lang_attribute_set(self):
+         from picarones.web.app import _render_index
+         assert '<html lang="en">' in _render_index("en")
+         assert '<html lang="fr">' in _render_index("fr")
+
+
+ # ---------------------------------------------------------------------------
+ # 3. Structural elements present
+ # ---------------------------------------------------------------------------
+
+ class TestStructuralElementsPresent:
+     @pytest.fixture(scope="class")
+     def html(self) -> str:
+         from picarones.web.app import _render_index
+         return _render_index("fr")
+
+     @pytest.mark.parametrize("view_id", [
+         "view-benchmark",
+         "view-reports",
+         "view-engines",
+         "view-import",
+     ])
+     def test_each_view_present(self, html, view_id):
+         assert f'id="{view_id}"' in html, (
+             f"View '{view_id}' missing from the rendered page"
+         )
+
+     def test_nav_buttons_present(self, html):
+         for label in ("nav_benchmark", "nav_reports", "nav_engines", "nav_import"):
+             assert f'data-i18n="{label}"' in html
+
+     def test_import_modal_present(self, html):
+         assert 'id="import-modal"' in html
+
+     def test_external_js_referenced(self, html):
+         # The JS bundle must be loaded via <script src="...">
+         assert re.search(r'<script\s+src="/static/web-app\.js', html), (
+             "The <script src='/static/web-app.js'> tag must be present"
+         )
+
+     def test_retro_css_referenced(self, html):
+         assert re.search(r'<link\s+rel="stylesheet"\s+href="/static/retro\.css', html)
+
+
+ # ---------------------------------------------------------------------------
+ # 4. No duplicated element (guard against a doubled {% include %})
+ # ---------------------------------------------------------------------------
+
+ _ID_RE = re.compile(r'\sid="([a-zA-Z0-9_\-]+)"')
+
+
+ class TestNoDuplicateIds:
+     def test_no_duplicate_ids_in_rendered_page(self):
+         from picarones.web.app import _render_index
+         html = _render_index("fr")
+         ids = _ID_RE.findall(html)
+         # HTML `id` values must be unique (W3C). A duplicate signals an
+         # accidental double include or a botched copy-paste.
+         seen: dict[str, int] = {}
+         for i in ids:
+             seen[i] = seen.get(i, 0) + 1
+         dupes = {k: v for k, v in seen.items() if v > 1}
+         assert not dupes, f"Duplicate IDs in the rendered SPA: {dupes}"
+
+
+ # ---------------------------------------------------------------------------
+ # 5. No large inline <script>...</script> block carrying code
+ # ---------------------------------------------------------------------------
+
+ class TestNoInlineScriptCode:
+     """Sprint 25 extracted all the JS into /static/web-app.js. The rendered
+     page must no longer contain a ``<script>...</script>`` block that
+     embeds code (``<script src="..."></script>`` tags remain
+     allowed)."""
+
+     def test_no_large_inline_script_block(self):
+         from picarones.web.app import _render_index
+         html = _render_index("fr")
+         # Capture everything between a <script> without src= and </script>.
+         pattern = re.compile(
+             r"<script(?![^>]*\bsrc=)[^>]*>(.*?)</script>",
+             re.DOTALL,
+         )
+         for body in pattern.findall(html):
+             # A few whitespace bytes are tolerated (e.g. <script>\n</script>)
+             stripped = body.strip()
+             assert len(stripped) < 200, (
+                 "An inline <script> block still contains code "
+                 f"({len(stripped)} characters). It must live in /static/web-app.js."
+             )
+
+
+ # ---------------------------------------------------------------------------
+ # 6. Measuring how much app.py shrank
+ # ---------------------------------------------------------------------------
+
+ class TestAppPyShrunk:
+     def test_app_py_below_2000_lines(self):
+         n = sum(1 for _ in APP_PY.read_text(encoding="utf-8").splitlines())
+         assert n < 2000, (
+             f"web/app.py is still {n} lines long after Sprint 25: "
+             "was the _HTML_TEMPLATE block really removed?"
+         )
+
+     def test_html_template_string_removed(self):
+         src = APP_PY.read_text(encoding="utf-8")
+         assert "_HTML_TEMPLATE = r" not in src, (
+             "The _HTML_TEMPLATE monolith must be removed from app.py"
+         )
+
+
+ # ---------------------------------------------------------------------------
+ # 7. End-to-end smoke test via TestClient
+ # ---------------------------------------------------------------------------
+
+ class TestEndpointStillServesPage:
+     @pytest.fixture
+     def client(self):
+         from picarones.web.app import app
+         return TestClient(app)
+
+     def test_root_returns_200_and_html(self, client):
+         r = client.get("/")
+         assert r.status_code == 200
+         assert "text/html" in r.headers["content-type"]
+
+     def test_root_respects_cookie_lang(self, client):
+         r = client.get("/", cookies={"picarones_lang": "en"})
+         assert 'content="en"' in r.text
+         r2 = client.get("/", cookies={"picarones_lang": "fr"})
+         assert 'content="fr"' in r2.text
+
+     def test_root_falls_back_on_unsupported_lang(self, client):
+         r = client.get("/", cookies={"picarones_lang": "ne-pas-exister"})
+         # Must fall back to fr (cf. ``_SUPPORTED_LANGS``)
+         assert 'content="fr"' in r.text
+
+     def test_static_js_served(self, client):
+         r = client.get("/static/web-app.js")
+         assert r.status_code == 200
+         assert r.headers["content-type"].startswith(("application/javascript", "text/javascript"))
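Reviewer note: `test_static_js_served` passes from a source checkout even if the wheel omits `web-app.js`, so the package-data globs in `pyproject.toml` are worth checking separately. A minimal sketch, assuming the globs are matched per directory without recursion (as setuptools-style package data usually is). The helper `uncovered_files` is hypothetical; the pattern list mirrors the `picarones = [...]` entries visible in the `pyproject.toml` hunk and may not be the complete list.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the pyproject.toml hunk of this commit.
PACKAGE_DATA_PATTERNS = [
    "prompts/*.txt",
    "web/static/*.css",
    "web/static/*.js",
    "web/templates/*.j2",
    "web/templates/*.html",
    "report/templates/*.j2",
    "report/templates/*.html",
    "report/templates/*.css",
]


def is_covered(path, patterns=PACKAGE_DATA_PATTERNS):
    """True if `path` (relative to the package root) matches one pattern."""
    candidate = PurePosixPath(path)
    for pattern in patterns:
        pat = PurePosixPath(pattern)
        # Compare directories exactly so that `*` never crosses a `/`.
        if candidate.parent == pat.parent and fnmatchcase(candidate.name, pat.name):
            return True
    return False


def uncovered_files(paths, patterns=PACKAGE_DATA_PATTERNS):
    """Return the paths that no package-data pattern would ship."""
    return [p for p in paths if not is_covered(p, patterns)]
```

Feeding it the relative paths found under `picarones/` (for example via `Path.rglob`) would list any non-Python file that the wheel is likely to miss.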