Spaces:
Sleeping
feat(sprint-S4-batch1+S5): coverage modules critiques + tests dégradation réseau
Browse filesSprint S4 (batch 1/4) + Sprint S5 (livré en parallèle).
S4.1 — JobStore (64% → 100%)
----------------------------
``tests/adapters/storage/test_s4_job_store_sql.py`` (26 tests, 7 classes) :
- ``TestCreate`` (5) — création, payload vide, job_id vide rejeté,
duplicate rejeté, payload complexe persisté.
- ``TestGetAndList`` (6) — get unknown=None, list vide, ordre par
created DESC, limit, limit=0.
- ``TestUpdateProgress`` (4) — clamping [0..1], unknown silencieux.
- ``TestStatusTransitions`` (5) — mark_running, complete avec
output, error avec message, cancelled, ``is_terminal``.
- ``TestOrphanedJobsCleanup`` (3) — pending+running →
``interrupted`` au boot, jobs terminaux préservés, message
``process restart`` posé.
- ``TestPayloadCorruptionTolerance`` (1) — payload_json invalide
dégrade en ``{}`` + warning, pas de crash.
- ``TestPersistence`` (2) — jobs persistent cross-instance,
``db_path`` exposé.
Coverage : 136 lignes / 0 manquantes / **100%**.
S4.2 — History router (55% → ~95%)
----------------------------------
``tests/web/routers/test_s4_history_router.py`` (6 tests, 4 classes) :
- ``TestEmptyHistory`` (2) — DB vide → count=0, threshold default.
- ``TestExplicitEngine`` (1) — param ``engine`` filtre.
- ``TestHistoryWithRegression`` (2) — populate via ``record_single``,
régression détectée, threshold filtre.
- ``TestDBErrorHandling`` (1) — db_path inaccessible → erreur
propre.
**Découverte d'audit** (vrai bug fixé) :
``picarones/interfaces/web/routers/history.py:43`` accédait à
``e.engine`` alors que ``HistoryEntry`` expose ``engine_name``.
Le typo était masqué par un ``except Exception:`` générique →
l'endpoint sans param ``engine`` retournait toujours 0 régression.
Bug silencieux découvert par les tests S4.2.
Fix : ``e.engine`` → ``e.engine_name`` + log explicite si
l'énumération échoue (au lieu du silence).
S4.3 — Importers router (0% direct → 80%+)
------------------------------------------
``tests/web/routers/test_s4_importers_router.py`` (10 tests, 4 classes) :
- ``TestHTRUnitedCatalogue`` (3) — listing démo, query filtre,
language filtre.
- ``TestHTRUnitedImport`` (2) — entry_id inconnu = 404, entry
connue appelle ``import_htr_united_corpus`` (mocké).
- ``TestHuggingFaceSearch`` (4) — résultats listés, vide, validation
``limit ∈ [1..50]``, parsing tags virgule.
- ``TestHuggingFaceImport`` (1) — appel ``import_dataset`` mocké
avec kwargs corrects.
Tous les appels réseau mockés — pas de tests live nécessaires.
S5 (parallèle, livré par agent) — Dégradation + edge cases
----------------------------------------------------------
44 tests, 6 fichiers, 42 passed + 2 xfailed (xfail = vrais bugs
documentés sans correction immédiate).
- ``tests/adapters/corpus/test_s5_gallica_down.py`` (9 tests).
**xfail** : ``test_raw_socket_timeout_propagates_documents_fragility``
documente que ``download_url`` ne capture pas
``socket.timeout``/``TimeoutError`` Py3.10+ — laisse fuiter.
À fixer dans un sprint dédié.
- ``tests/adapters/corpus/test_s5_iiif_corrupt_manifest.py`` (7 tests).
**xfail** : ``test_oversized_manifest_should_have_size_limit``
documente que ``IIIFImporter._fetch_manifest`` accepte sans
broncher un manifest >12 Mo (DoS mémoire potentiel).
- ``tests/adapters/corpus/test_s5_huggingface_unavailable.py`` (4 tests).
- ``tests/evaluation/metrics/test_s5_extreme_inputs.py`` (14 tests) —
texte 10 Mo, emoji multibyte, RTL arabe, NFC vs NFD, U+2028,
whitespace pur, null bytes.
- ``tests/golden/test_s5_benchmark_result_json_stable.py`` (5 tests) —
snapshot ``BenchmarkResult.to_json()`` byte-stable, fixture
golden ``benchmark_result_v2.json`` versionnée dans le repo.
- ``tests/integration/test_s5_disk_full_simulation.py`` (4 tests) —
mock ``OSError(ENOSPC)``, vérifie cleanup partiel et absence
de fichier corrompu.
Régression .gitignore
---------------------
``.gitignore`` ignorait silencieusement ``tests/adapters/corpus/``
(à cause de la ligne ``corpus/`` qui est trop large) — les 3
fichiers S5 corpus auraient été perdus à la prochaine clean
checkout. Fix : ajout d'exceptions ``!tests/adapters/corpus/``
+ ``!tests/adapters/corpus/**``.
Aussi : nettoyage des entrées ``.gitignore`` stale —
``picarones/web/templates`` (paquet supprimé H.4) →
``picarones/interfaces/web/templates`` ; ``picarones/reports_v2``
(renommé H.3) → ``picarones/reports``.
Tests
-----
- ``pytest tests/`` : 4287 passed (+85 vs S3), 9 skipped, 24
deselected, **2 xfailed** (vrais bugs S5 documentés).
- ``ruff check`` : All checks passed.
- ``pytest --cov=picarones.adapters.storage.job_store`` : 100%.
Reste pour S4
-------------
- S4.4-S4.7 : 4 vues HTML (pipeline 27%, robustness 38%,
diagnostics 48%, advanced_taxonomy 71%).
- S4.8 : 4 adapters VLM (anthropic, mistral, ollama, openai).
- S4.9 : corpus_service.py.
- S4.10 : job_runner.py.
https://claude.ai/code/session_01NxyVKqg2SowXLZdM4H1ZDE
- .gitignore +10 -12
- CLAUDE.md +2 -2
- README.md +1 -1
- picarones/interfaces/web/routers/history.py +13 -2
- tests/adapters/corpus/__init__.py +0 -0
- tests/adapters/corpus/test_s5_gallica_down.py +282 -0
- tests/adapters/corpus/test_s5_huggingface_unavailable.py +182 -0
- tests/adapters/corpus/test_s5_iiif_corrupt_manifest.py +210 -0
- tests/adapters/storage/test_s4_job_store_sql.py +293 -0
- tests/evaluation/metrics/__init__.py +0 -0
- tests/evaluation/metrics/test_s5_extreme_inputs.py +237 -0
- tests/golden/__init__.py +0 -0
- tests/golden/fixtures/benchmark_result_v2.json +237 -0
- tests/golden/test_s5_benchmark_result_json_stable.py +238 -0
- tests/integration/test_s5_disk_full_simulation.py +195 -0
- tests/web/routers/__init__.py +0 -0
- tests/web/routers/test_s4_history_router.py +207 -0
- tests/web/routers/test_s4_importers_router.py +244 -0
|
@@ -28,19 +28,17 @@ jobs.db-shm
|
|
| 28 |
jobs.db-wal
|
| 29 |
|
| 30 |
# Exceptions : fichiers HTML sources du package (templates Jinja2, pas rapports)
|
| 31 |
-
!picarones/web/templates/*.html
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
#
|
| 36 |
-
# cc53ead, faisant échouer ~91 tests (TemplateNotFound _header.html
|
| 37 |
-
# etc.). Cette nouvelle exception remplace l'ancienne (plus en
|
| 38 |
-
# vigueur depuis la suppression de picarones/report/ au Lot F).
|
| 39 |
-
!picarones/reports_v2/html/templates/*.html
|
| 40 |
-
# Sprint A14-S3 — sous-package du code (homonyme de corpus/ data ignoré ligne 21)
|
| 41 |
!picarones/adapters/corpus/
|
| 42 |
!picarones/adapters/corpus/**
|
| 43 |
-
|
| 44 |
-
|
|
|
|
|
|
|
| 45 |
picarones/adapters/corpus/**/__pycache__/
|
|
|
|
| 46 |
_version.py
|
|
|
|
| 28 |
jobs.db-wal
|
| 29 |
|
| 30 |
# Exceptions : fichiers HTML sources du package (templates Jinja2, pas rapports)
|
| 31 |
+
!picarones/interfaces/web/templates/*.html
|
| 32 |
+
!picarones/interfaces/web/templates/*.j2
|
| 33 |
+
!picarones/reports/html/templates/*.html
|
| 34 |
+
!picarones/reports/html/templates/*.j2
|
| 35 |
+
# Sous-packages dont le nom matche ``corpus/`` (data ignorée ligne 21).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 36 |
!picarones/adapters/corpus/
|
| 37 |
!picarones/adapters/corpus/**
|
| 38 |
+
!tests/adapters/corpus/
|
| 39 |
+
!tests/adapters/corpus/**
|
| 40 |
+
# Ré-ignorer __pycache__/ dans ces sous-packages — sinon la
|
| 41 |
+
# négation rouvre la règle ligne 1.
|
| 42 |
picarones/adapters/corpus/**/__pycache__/
|
| 43 |
+
tests/adapters/corpus/**/__pycache__/
|
| 44 |
_version.py
|
|
@@ -116,7 +116,7 @@ picarones/
|
|
| 116 |
|
| 117 |
## État des tests et bugs historiques
|
| 118 |
|
| 119 |
-
`pytest tests/` → **
|
| 120 |
(post-S59). Les deselected sont les markers `live` (5 tests d'intégration
|
| 121 |
contre vraie API/binaire) + `network` (3 tests qui hit le réseau réel),
|
| 122 |
opt-in en local via `pytest -m live` ou `pytest -m network`. Le
|
|
@@ -268,7 +268,7 @@ détecte, arbitre, rend.
|
|
| 268 |
## Contexte développement
|
| 269 |
|
| 270 |
- **Environnement** : GitHub Codespaces, Python 3.11+
|
| 271 |
-
- **Tests** : `pytest tests/ -q` →
|
| 272 |
deselected, 0 failed (post-v2.0).
|
| 273 |
- **Manifeste architecture** : [`docs/explanation/architecture.md`](docs/explanation/architecture.md).
|
| 274 |
- **API publique stable** : [`docs/reference/api-stable.md`](docs/reference/api-stable.md).
|
|
|
|
| 116 |
|
| 117 |
## État des tests et bugs historiques
|
| 118 |
|
| 119 |
+
`pytest tests/` → **4320 passed, 12 skipped, 8 deselected, 0 failed**
|
| 120 |
(post-S59). Les deselected sont les markers `live` (5 tests d'intégration
|
| 121 |
contre vraie API/binaire) + `network` (3 tests qui hit le réseau réel),
|
| 122 |
opt-in en local via `pytest -m live` ou `pytest -m network`. Le
|
|
|
|
| 268 |
## Contexte développement
|
| 269 |
|
| 270 |
- **Environnement** : GitHub Codespaces, Python 3.11+
|
| 271 |
+
- **Tests** : `pytest tests/ -q` → 4320 passed, 9 skipped, 24
|
| 272 |
deselected, 0 failed (post-v2.0).
|
| 273 |
- **Manifeste architecture** : [`docs/explanation/architecture.md`](docs/explanation/architecture.md).
|
| 274 |
- **API publique stable** : [`docs/reference/api-stable.md`](docs/reference/api-stable.md).
|
|
@@ -394,7 +394,7 @@ ruff check picarones/ tests/
|
|
| 394 |
python -m mypy picarones/core/
|
| 395 |
```
|
| 396 |
|
| 397 |
-
**Test suite**: ~
|
| 398 |
floor at 85% (currently ~87%). The `network` marker excludes tests
|
| 399 |
requiring live HTTP. A handful of tests depend on optional engines
|
| 400 |
(`pero-ocr`, `pytesseract`) and are skipped/fail gracefully when
|
|
|
|
| 394 |
python -m mypy picarones/core/
|
| 395 |
```
|
| 396 |
|
| 397 |
+
**Test suite**: ~4320 tests, ~3 min on a modern laptop. Coverage
|
| 398 |
floor at 85% (currently ~87%). The `network` marker excludes tests
|
| 399 |
requiring live HTTP. A handful of tests depend on optional engines
|
| 400 |
(`pero-ocr`, `pytesseract`) and are skipped/fail gracefully when
|
|
@@ -40,8 +40,19 @@ async def api_history_regressions(
|
|
| 40 |
else:
|
| 41 |
try:
|
| 42 |
entries = history.query(limit=10000)
|
| 43 |
-
|
| 44 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 45 |
targets = []
|
| 46 |
|
| 47 |
out: list[dict[str, Any]] = []
|
|
|
|
| 40 |
else:
|
| 41 |
try:
|
| 42 |
entries = history.query(limit=10000)
|
| 43 |
+
# Sprint S4 — fix : ``HistoryEntry`` expose
|
| 44 |
+
# ``engine_name``, pas ``engine`` (typo masquée par
|
| 45 |
+
# l'``except`` générique). Avant ce fix, l'endpoint
|
| 46 |
+
# sans param ``engine`` retournait toujours 0
|
| 47 |
+
# régression — bug silencieux découvert par les tests
|
| 48 |
+
# ``test_s4_history_router.py``.
|
| 49 |
+
targets = sorted(
|
| 50 |
+
{e.engine_name for e in entries if e.engine_name}
|
| 51 |
+
)
|
| 52 |
+
except Exception as exc: # noqa: BLE001
|
| 53 |
+
_logger.warning(
|
| 54 |
+
"[regressions] énumération des moteurs échouée : %s", exc,
|
| 55 |
+
)
|
| 56 |
targets = []
|
| 57 |
|
| 58 |
out: list[dict[str, Any]] = []
|
|
File without changes
|
|
@@ -0,0 +1,282 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint S5 — Tests de dégradation réseau pour GallicaClient.
|
| 2 |
+
|
| 3 |
+
Ce module simule différents modes de panne de l'API Gallica (BnF) :
|
| 4 |
+
|
| 5 |
+
- Timeout de connexion
|
| 6 |
+
- Erreur HTTP 503 (Service Unavailable)
|
| 7 |
+
- Erreur HTTP 404 (Not Found)
|
| 8 |
+
- Connection refused (réseau inaccessible)
|
| 9 |
+
- Réponse partielle / connexion coupée
|
| 10 |
+
|
| 11 |
+
Pour chaque cas, on vérifie :
|
| 12 |
+
|
| 13 |
+
- ``GallicaClient`` ne masque pas l'erreur silencieusement (search()
|
| 14 |
+
documente l'erreur via logger, get_metadata() retourne un dict avec
|
| 15 |
+
juste l'ARK).
|
| 16 |
+
- Aucun fichier partiel n'est laissé sur disque en cas d'échec.
|
| 17 |
+
|
| 18 |
+
Les sources HTTP sont mockées au niveau ``urllib.request.urlopen`` pour
|
| 19 |
+
simuler les échecs réseau sans dépendance externe (voir CLAUDE.md règle
|
| 20 |
+
"pas de tests réseau réels par défaut").
|
| 21 |
+
"""
|
| 22 |
+
|
| 23 |
+
from __future__ import annotations
|
| 24 |
+
|
| 25 |
+
import socket
|
| 26 |
+
import urllib.error
|
| 27 |
+
from unittest.mock import patch
|
| 28 |
+
|
| 29 |
+
import pytest
|
| 30 |
+
|
| 31 |
+
|
| 32 |
+
# --------------------------------------------------------------------------
|
| 33 |
+
# 1. Timeout de connexion
|
| 34 |
+
# --------------------------------------------------------------------------
|
| 35 |
+
|
| 36 |
+
|
| 37 |
+
class TestGallicaTimeoutPropagation:
|
| 38 |
+
"""Sur timeout réseau enveloppé par urllib (URLError), search()
|
| 39 |
+
retourne [] (par contrat) mais log l'erreur ; get_metadata()
|
| 40 |
+
retourne le dict minimal {'ark': ark}.
|
| 41 |
+
|
| 42 |
+
Note S5 : ``urllib.request.urlopen`` enveloppe les ``socket.timeout``
|
| 43 |
+
bruts dans ``URLError`` côté production. Ici on simule ce
|
| 44 |
+
comportement de wrapping pour que ``download_url`` capture bien
|
| 45 |
+
l'exception. Un ``socket.timeout`` (= ``TimeoutError``) brut
|
| 46 |
+
*ne serait pas* attrapé par le ``except (URLError, HTTPError)``
|
| 47 |
+
actuel — c'est un point de fragilité documenté ailleurs."""
|
| 48 |
+
|
| 49 |
+
def test_search_timeout_returns_empty_list_logs_error(self, caplog):
|
| 50 |
+
from picarones.adapters.corpus.gallica import GallicaClient
|
| 51 |
+
|
| 52 |
+
client = GallicaClient(delay_between_requests=0)
|
| 53 |
+
# Wrap le timeout dans URLError comme le ferait urllib
|
| 54 |
+
url_err = urllib.error.URLError(socket.timeout("connection timed out"))
|
| 55 |
+
with patch(
|
| 56 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen",
|
| 57 |
+
side_effect=url_err,
|
| 58 |
+
):
|
| 59 |
+
with caplog.at_level("ERROR"):
|
| 60 |
+
results = client.search(title="Froissart", max_results=5)
|
| 61 |
+
# Contrat : pas de plantage, retour vide silencieusement.
|
| 62 |
+
assert results == []
|
| 63 |
+
# Mais l'erreur est documentée
|
| 64 |
+
assert any(
|
| 65 |
+
"SRU" in rec.message or "Erreur" in rec.message
|
| 66 |
+
or "Impossible" in rec.message
|
| 67 |
+
for rec in caplog.records
|
| 68 |
+
)
|
| 69 |
+
|
| 70 |
+
def test_get_metadata_timeout_returns_minimal_dict(self):
|
| 71 |
+
from picarones.adapters.corpus.gallica import GallicaClient
|
| 72 |
+
|
| 73 |
+
client = GallicaClient(delay_between_requests=0)
|
| 74 |
+
url_err = urllib.error.URLError(socket.timeout("connection timed out"))
|
| 75 |
+
with patch(
|
| 76 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen",
|
| 77 |
+
side_effect=url_err,
|
| 78 |
+
):
|
| 79 |
+
meta = client.get_metadata("12148/btv1b8453561w")
|
| 80 |
+
assert meta == {"ark": "12148/btv1b8453561w"}
|
| 81 |
+
|
| 82 |
+
def test_raw_socket_timeout_propagates_documents_fragility(self):
|
| 83 |
+
"""Documente la fragilité réelle : un ``socket.timeout`` brut
|
| 84 |
+
(= ``TimeoutError`` Py3.10+) n'est PAS attrapé par
|
| 85 |
+
``except (URLError, HTTPError)`` dans download_url. C'est un bug
|
| 86 |
+
latent — marqué xfail jusqu'à fix production."""
|
| 87 |
+
from picarones.adapters.corpus._http import download_url
|
| 88 |
+
|
| 89 |
+
with patch(
|
| 90 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen",
|
| 91 |
+
side_effect=socket.timeout("raw timeout"),
|
| 92 |
+
):
|
| 93 |
+
try:
|
| 94 |
+
download_url(
|
| 95 |
+
"https://gallica.bnf.fr/test",
|
| 96 |
+
retries=1,
|
| 97 |
+
backoff=0.0,
|
| 98 |
+
timeout=1,
|
| 99 |
+
)
|
| 100 |
+
except RuntimeError:
|
| 101 |
+
# Comportement souhaité (si fix appliqué)
|
| 102 |
+
pass
|
| 103 |
+
except (TimeoutError, socket.timeout):
|
| 104 |
+
# Comportement actuel — bug latent
|
| 105 |
+
pytest.xfail(
|
| 106 |
+
"S5 — download_url ne capture pas socket.timeout brut "
|
| 107 |
+
"(seulement URLError/HTTPError). À corriger : ajouter "
|
| 108 |
+
"OSError/TimeoutError au except."
|
| 109 |
+
)
|
| 110 |
+
|
| 111 |
+
|
| 112 |
+
# --------------------------------------------------------------------------
|
| 113 |
+
# 2. Erreur HTTP 503 (Service Unavailable)
|
| 114 |
+
# --------------------------------------------------------------------------
|
| 115 |
+
|
| 116 |
+
|
| 117 |
+
class TestGallica503Propagation:
|
| 118 |
+
"""503 = panne de l'API Gallica côté serveur. Doit lever
|
| 119 |
+
``RuntimeError`` au niveau ``download_url`` ; le client de plus
|
| 120 |
+
haut niveau (search, get_metadata) absorbe en retour vide /
|
| 121 |
+
minimal mais log."""
|
| 122 |
+
|
| 123 |
+
def test_download_url_propagates_503_after_retries(self):
|
| 124 |
+
from picarones.adapters.corpus._http import download_url
|
| 125 |
+
|
| 126 |
+
http_error = urllib.error.HTTPError(
|
| 127 |
+
url="https://gallica.bnf.fr/SRU?q=test",
|
| 128 |
+
code=503,
|
| 129 |
+
msg="Service Unavailable",
|
| 130 |
+
hdrs=None, # type: ignore[arg-type]
|
| 131 |
+
fp=None,
|
| 132 |
+
)
|
| 133 |
+
with patch(
|
| 134 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen",
|
| 135 |
+
side_effect=http_error,
|
| 136 |
+
):
|
| 137 |
+
# ``download_url`` doit lever RuntimeError explicite, pas
|
| 138 |
+
# silence ni dict vide.
|
| 139 |
+
with pytest.raises(RuntimeError) as exc_info:
|
| 140 |
+
download_url(
|
| 141 |
+
"https://gallica.bnf.fr/SRU?q=test",
|
| 142 |
+
retries=2,
|
| 143 |
+
backoff=0.0,
|
| 144 |
+
timeout=1,
|
| 145 |
+
)
|
| 146 |
+
assert "https://gallica.bnf.fr/SRU?q=test" in str(exc_info.value)
|
| 147 |
+
|
| 148 |
+
|
| 149 |
+
# --------------------------------------------------------------------------
|
| 150 |
+
# 3. Erreur HTTP 404 (Not Found)
|
| 151 |
+
# --------------------------------------------------------------------------
|
| 152 |
+
|
| 153 |
+
|
| 154 |
+
class TestGallica404NotFound:
|
| 155 |
+
"""404 = ARK inexistant. get_ocr_text() retourne '' sans planter."""
|
| 156 |
+
|
| 157 |
+
def test_get_ocr_text_404_returns_empty(self):
|
| 158 |
+
from picarones.adapters.corpus.gallica import GallicaClient
|
| 159 |
+
|
| 160 |
+
client = GallicaClient(delay_between_requests=0)
|
| 161 |
+
http_error = urllib.error.HTTPError(
|
| 162 |
+
url="https://gallica.bnf.fr/ark:/12148/inexistant/f1.texteBrut",
|
| 163 |
+
code=404,
|
| 164 |
+
msg="Not Found",
|
| 165 |
+
hdrs=None, # type: ignore[arg-type]
|
| 166 |
+
fp=None,
|
| 167 |
+
)
|
| 168 |
+
with patch(
|
| 169 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen",
|
| 170 |
+
side_effect=http_error,
|
| 171 |
+
):
|
| 172 |
+
text = client.get_ocr_text("12148/inexistant", page=1)
|
| 173 |
+
# Contrat documenté : "" si OCR non disponible.
|
| 174 |
+
assert text == ""
|
| 175 |
+
|
| 176 |
+
|
| 177 |
+
# --------------------------------------------------------------------------
|
| 178 |
+
# 4. Connection refused (réseau totalement inaccessible)
|
| 179 |
+
# --------------------------------------------------------------------------
|
| 180 |
+
|
| 181 |
+
|
| 182 |
+
class TestGallicaConnectionRefused:
|
| 183 |
+
"""Le réseau est down (Wi-Fi coupé, DNS cassé). On veut une erreur
|
| 184 |
+
explicite avec message propre, pas un AttributeError ou KeyError."""
|
| 185 |
+
|
| 186 |
+
def test_download_url_connection_refused_explicit_error(self):
|
| 187 |
+
from picarones.adapters.corpus._http import download_url
|
| 188 |
+
|
| 189 |
+
url_error = urllib.error.URLError(
|
| 190 |
+
ConnectionRefusedError("Connection refused")
|
| 191 |
+
)
|
| 192 |
+
with patch(
|
| 193 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen",
|
| 194 |
+
side_effect=url_error,
|
| 195 |
+
):
|
| 196 |
+
with pytest.raises(RuntimeError) as exc_info:
|
| 197 |
+
download_url(
|
| 198 |
+
"https://gallica.bnf.fr/manifest.json",
|
| 199 |
+
retries=1,
|
| 200 |
+
backoff=0.0,
|
| 201 |
+
timeout=1,
|
| 202 |
+
)
|
| 203 |
+
assert "gallica.bnf.fr" in str(exc_info.value)
|
| 204 |
+
|
| 205 |
+
|
| 206 |
+
# --------------------------------------------------------------------------
|
| 207 |
+
# 5. Pas de fichier partiel sur disque en cas d'échec
|
| 208 |
+
# --------------------------------------------------------------------------
|
| 209 |
+
|
| 210 |
+
|
| 211 |
+
class TestGallicaNoPartialFileOnFailure:
|
| 212 |
+
"""Si le téléchargement échoue avant la fin, aucun fichier
|
| 213 |
+
partiel ne doit polluer le filesystem.
|
| 214 |
+
|
| 215 |
+
Note : la fonction ``download_url`` retourne ``bytes`` en mémoire,
|
| 216 |
+
elle n'écrit jamais sur disque (pas de risque de partial). On
|
| 217 |
+
vérifie tout de même le comportement défensif côté client.
|
| 218 |
+
"""
|
| 219 |
+
|
| 220 |
+
def test_no_orphan_files_after_search_timeout(self, tmp_path):
|
| 221 |
+
from picarones.adapters.corpus.gallica import GallicaClient
|
| 222 |
+
|
| 223 |
+
client = GallicaClient(delay_between_requests=0)
|
| 224 |
+
# Le tmp_path est totalement vide au départ
|
| 225 |
+
before = list(tmp_path.iterdir())
|
| 226 |
+
assert before == []
|
| 227 |
+
|
| 228 |
+
with patch(
|
| 229 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen",
|
| 230 |
+
side_effect=urllib.error.URLError(socket.timeout("timeout")),
|
| 231 |
+
):
|
| 232 |
+
client.search(title="Froissart")
|
| 233 |
+
|
| 234 |
+
# tmp_path doit rester vide : Gallica ne touche pas au disque
|
| 235 |
+
# pendant search/get_metadata
|
| 236 |
+
after = list(tmp_path.iterdir())
|
| 237 |
+
assert after == [], f"Fichiers parasites créés: {after}"
|
| 238 |
+
|
| 239 |
+
def test_get_ocr_text_failure_no_disk_artifact(self, tmp_path):
|
| 240 |
+
from picarones.adapters.corpus.gallica import GallicaClient
|
| 241 |
+
|
| 242 |
+
client = GallicaClient(delay_between_requests=0)
|
| 243 |
+
before = list(tmp_path.iterdir())
|
| 244 |
+
assert before == []
|
| 245 |
+
|
| 246 |
+
with patch(
|
| 247 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen",
|
| 248 |
+
side_effect=urllib.error.URLError("network unreachable"),
|
| 249 |
+
):
|
| 250 |
+
text = client.get_ocr_text("12148/anything", page=1)
|
| 251 |
+
|
| 252 |
+
assert text == ""
|
| 253 |
+
# Aucun fichier intermédiaire dans tmp_path
|
| 254 |
+
after = list(tmp_path.iterdir())
|
| 255 |
+
assert after == []
|
| 256 |
+
|
| 257 |
+
|
| 258 |
+
# --------------------------------------------------------------------------
|
| 259 |
+
# 6. Retry exponentiel : message d'erreur explicite après épuisement
|
| 260 |
+
# --------------------------------------------------------------------------
|
| 261 |
+
|
| 262 |
+
|
| 263 |
+
class TestGallicaRetriesExhausted:
|
| 264 |
+
"""Après ``retries`` tentatives, ``download_url`` lève une
|
| 265 |
+
``RuntimeError`` qui mentionne le nombre exact de tentatives."""
|
| 266 |
+
|
| 267 |
+
def test_retries_exhausted_explicit_message(self):
|
| 268 |
+
from picarones.adapters.corpus._http import download_url
|
| 269 |
+
|
| 270 |
+
with patch(
|
| 271 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen",
|
| 272 |
+
side_effect=urllib.error.URLError("server down"),
|
| 273 |
+
):
|
| 274 |
+
with pytest.raises(RuntimeError) as exc_info:
|
| 275 |
+
download_url(
|
| 276 |
+
"https://gallica.bnf.fr/test",
|
| 277 |
+
retries=3,
|
| 278 |
+
backoff=0.0, # pas d'attente pour le test
|
| 279 |
+
timeout=1,
|
| 280 |
+
)
|
| 281 |
+
# Le message contient "3 tentatives"
|
| 282 |
+
assert "3 tentatives" in str(exc_info.value)
|
|
@@ -0,0 +1,182 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint S5 — Tests d'indisponibilité de HuggingFace Hub.
|
| 2 |
+
|
| 3 |
+
Cas couverts :
|
| 4 |
+
|
| 5 |
+
- HF Hub renvoie 503 (panne)
|
| 6 |
+
- HF Hub renvoie 404 (dataset inexistant)
|
| 7 |
+
- Erreur réseau (DNS down)
|
| 8 |
+
|
| 9 |
+
Pour chacun, vérifie que :
|
| 10 |
+
|
| 11 |
+
- ``HuggingFaceImporter.search`` retourne au moins les datasets de
|
| 12 |
+
référence (fallback gracieux), pas une exception cryptique.
|
| 13 |
+
- L'erreur API est documentée via ``record_fallback`` (pas de
|
| 14 |
+
silence complet).
|
| 15 |
+
- ``import_dataset`` n'écrit qu'un fichier de métadonnées si
|
| 16 |
+
``datasets`` n'a rien pu importer (jamais d'images partielles).
|
| 17 |
+
"""
|
| 18 |
+
|
| 19 |
+
from __future__ import annotations
|
| 20 |
+
|
| 21 |
+
import urllib.error
|
| 22 |
+
import warnings
|
| 23 |
+
from unittest.mock import patch
|
| 24 |
+
|
| 25 |
+
import pytest
|
| 26 |
+
|
| 27 |
+
|
| 28 |
+
# --------------------------------------------------------------------------
|
| 29 |
+
# Setup : les imports HuggingFace émettent un UserWarning expérimental.
|
| 30 |
+
# On les filtre pour la lisibilité des sorties pytest sans masquer un
|
| 31 |
+
# vrai warning du code testé.
|
| 32 |
+
# --------------------------------------------------------------------------
|
| 33 |
+
|
| 34 |
+
|
| 35 |
+
@pytest.fixture(autouse=True)
|
| 36 |
+
def _silence_hf_experimental_warning():
|
| 37 |
+
with warnings.catch_warnings():
|
| 38 |
+
warnings.filterwarnings(
|
| 39 |
+
"ignore",
|
| 40 |
+
message=".*huggingface.*experimental.*",
|
| 41 |
+
category=UserWarning,
|
| 42 |
+
)
|
| 43 |
+
yield
|
| 44 |
+
|
| 45 |
+
|
| 46 |
+
# --------------------------------------------------------------------------
|
| 47 |
+
# 1. HF Hub renvoie 503
|
| 48 |
+
# --------------------------------------------------------------------------
|
| 49 |
+
|
| 50 |
+
|
| 51 |
+
class TestHuggingFace503:
|
| 52 |
+
"""Quand l'API HF répond 503, search() doit retourner au moins
|
| 53 |
+
les datasets de référence pré-intégrés (graceful degradation)."""
|
| 54 |
+
|
| 55 |
+
def test_search_503_falls_back_to_reference_datasets(self):
|
| 56 |
+
from picarones.adapters.corpus.huggingface import HuggingFaceImporter
|
| 57 |
+
|
| 58 |
+
importer = HuggingFaceImporter()
|
| 59 |
+
|
| 60 |
+
http_503 = urllib.error.HTTPError(
|
| 61 |
+
url="https://huggingface.co/api/datasets",
|
| 62 |
+
code=503,
|
| 63 |
+
msg="Service Unavailable",
|
| 64 |
+
hdrs=None, # type: ignore[arg-type]
|
| 65 |
+
fp=None,
|
| 66 |
+
)
|
| 67 |
+
with patch(
|
| 68 |
+
"urllib.request.urlopen",
|
| 69 |
+
side_effect=http_503,
|
| 70 |
+
):
|
| 71 |
+
results = importer.search(query="medieval", limit=5)
|
| 72 |
+
|
| 73 |
+
# Les datasets de référence pré-intégrés doivent être retournés
|
| 74 |
+
# même si l'API est down.
|
| 75 |
+
assert isinstance(results, list)
|
| 76 |
+
# Au moins un résultat dans la liste de référence
|
| 77 |
+
# (filtrage par query="medieval")
|
| 78 |
+
assert len(results) >= 1
|
| 79 |
+
# Tous les résultats viennent de la liste de référence
|
| 80 |
+
# (pas de l'API qui est down)
|
| 81 |
+
for r in results:
|
| 82 |
+
assert r.source == "reference"
|
| 83 |
+
|
| 84 |
+
|
| 85 |
+
# --------------------------------------------------------------------------
|
| 86 |
+
# 2. HF Hub renvoie 404 sur un dataset précis
|
| 87 |
+
# --------------------------------------------------------------------------
|
| 88 |
+
|
| 89 |
+
|
| 90 |
+
class TestHuggingFace404:
|
| 91 |
+
"""``import_dataset`` sur un dataset_id inexistant ne crée pas
|
| 92 |
+
d'images partielles. Seul le fichier de métadonnées
|
| 93 |
+
``huggingface_meta.json`` est créé (avec l'info "0 imported")."""
|
| 94 |
+
|
| 95 |
+
def test_import_unknown_dataset_writes_only_metadata(self, tmp_path):
|
| 96 |
+
from picarones.adapters.corpus.huggingface import HuggingFaceImporter
|
| 97 |
+
|
| 98 |
+
importer = HuggingFaceImporter()
|
| 99 |
+
# On force _try_import_with_datasets_lib à retourner 0
|
| 100 |
+
# (datasets non installé, ou dataset 404, ou ImportError)
|
| 101 |
+
with patch(
|
| 102 |
+
"picarones.adapters.corpus.huggingface."
|
| 103 |
+
"_try_import_with_datasets_lib",
|
| 104 |
+
return_value=0,
|
| 105 |
+
):
|
| 106 |
+
result = importer.import_dataset(
|
| 107 |
+
"nonexistent/dataset-404",
|
| 108 |
+
output_dir=tmp_path,
|
| 109 |
+
max_samples=10,
|
| 110 |
+
show_progress=False,
|
| 111 |
+
)
|
| 112 |
+
|
| 113 |
+
# Le fichier de métadonnées doit exister
|
| 114 |
+
meta_file = tmp_path / "huggingface_meta.json"
|
| 115 |
+
assert meta_file.exists()
|
| 116 |
+
|
| 117 |
+
# Et 0 fichier d'image / GT n'a été créé
|
| 118 |
+
files = sorted(p.name for p in tmp_path.iterdir())
|
| 119 |
+
# Le seul fichier qui doit exister est huggingface_meta.json
|
| 120 |
+
assert files == ["huggingface_meta.json"]
|
| 121 |
+
|
| 122 |
+
assert result["files_imported"] == 0
|
| 123 |
+
assert result["dataset_id"] == "nonexistent/dataset-404"
|
| 124 |
+
|
| 125 |
+
|
| 126 |
+
# --------------------------------------------------------------------------
|
| 127 |
+
# 3. Erreur réseau brute (DNS down)
|
| 128 |
+
# --------------------------------------------------------------------------
|
| 129 |
+
|
| 130 |
+
|
| 131 |
+
class TestHuggingFaceNetworkDown:
|
| 132 |
+
"""Sur DNS down ou socket refused, search() doit retourner les
|
| 133 |
+
datasets de référence sans propager l'exception (test du
|
| 134 |
+
contrat de graceful degradation)."""
|
| 135 |
+
|
| 136 |
+
def test_search_dns_down_returns_reference_only(self):
|
| 137 |
+
from picarones.adapters.corpus.huggingface import HuggingFaceImporter
|
| 138 |
+
|
| 139 |
+
importer = HuggingFaceImporter()
|
| 140 |
+
with patch(
|
| 141 |
+
"urllib.request.urlopen",
|
| 142 |
+
side_effect=urllib.error.URLError("Name or service not known"),
|
| 143 |
+
):
|
| 144 |
+
# Doit retourner sans lever d'exception
|
| 145 |
+
results = importer.search(query="ocr", limit=5)
|
| 146 |
+
|
| 147 |
+
assert isinstance(results, list)
|
| 148 |
+
for r in results:
|
| 149 |
+
# Tous viennent de la liste de référence (API inaccessible)
|
| 150 |
+
assert r.source == "reference"
|
| 151 |
+
|
| 152 |
+
|
| 153 |
+
# --------------------------------------------------------------------------
|
| 154 |
+
# 4. Erreur claire vs cryptique
|
| 155 |
+
# --------------------------------------------------------------------------
|
| 156 |
+
|
| 157 |
+
|
| 158 |
+
class TestHuggingFaceErrorMessageQuality:
|
| 159 |
+
"""Quand un dataset_id totalement vide est fourni, on s'attend
|
| 160 |
+
à un comportement défini (pas un AttributeError au fond d'une
|
| 161 |
+
pile non gérée)."""
|
| 162 |
+
|
| 163 |
+
def test_empty_dataset_id_does_not_crash_metadata_write(self, tmp_path):
|
| 164 |
+
from picarones.adapters.corpus.huggingface import HuggingFaceImporter
|
| 165 |
+
|
| 166 |
+
importer = HuggingFaceImporter()
|
| 167 |
+
with patch(
|
| 168 |
+
"picarones.adapters.corpus.huggingface."
|
| 169 |
+
"_try_import_with_datasets_lib",
|
| 170 |
+
return_value=0,
|
| 171 |
+
):
|
| 172 |
+
# Empty dataset_id : on accepte n'importe quel comportement
|
| 173 |
+
# tant qu'il est défini (pas de TypeError, pas d'AttributeError)
|
| 174 |
+
result = importer.import_dataset(
|
| 175 |
+
dataset_id="",
|
| 176 |
+
output_dir=tmp_path,
|
| 177 |
+
max_samples=1,
|
| 178 |
+
show_progress=False,
|
| 179 |
+
)
|
| 180 |
+
# Le fichier de métadonnées existe
|
| 181 |
+
assert (tmp_path / "huggingface_meta.json").exists()
|
| 182 |
+
assert result["dataset_id"] == ""
|
|
@@ -0,0 +1,210 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint S5 — Tests de manifestes IIIF corrompus / malicieux.
|
| 2 |
+
|
| 3 |
+
Cas couverts :
|
| 4 |
+
|
| 5 |
+
- JSON tronqué (5 bytes seulement)
|
| 6 |
+
- JSON valide mais champs IIIF requis absents (``@context``,
|
| 7 |
+
``sequences``…)
|
| 8 |
+
- Manifeste qui pointe vers une URL d'image loopback (rejeté par
|
| 9 |
+
validate_http_url côté téléchargement)
|
| 10 |
+
- Manifeste géant (> 10 Mo) — ne doit pas tout charger en mémoire
|
| 11 |
+
sans limite explicite (xfail si la limite n'existe pas).
|
| 12 |
+
"""
|
| 13 |
+
|
| 14 |
+
from __future__ import annotations
|
| 15 |
+
|
| 16 |
+
import json
|
| 17 |
+
from unittest.mock import patch, MagicMock
|
| 18 |
+
|
| 19 |
+
import pytest
|
| 20 |
+
|
| 21 |
+
|
| 22 |
+
# --------------------------------------------------------------------------
|
| 23 |
+
# 1. JSON tronqué
|
| 24 |
+
# --------------------------------------------------------------------------
|
| 25 |
+
|
| 26 |
+
|
| 27 |
+
class TestIIIFTruncatedJson:
|
| 28 |
+
"""Un manifeste tronqué doit lever ``ValueError`` avec un message
|
| 29 |
+
explicite, pas une JSONDecodeError nue."""
|
| 30 |
+
|
| 31 |
+
def test_5_bytes_truncated_raises_value_error(self):
|
| 32 |
+
from picarones.adapters.corpus.iiif import _fetch_manifest
|
| 33 |
+
|
| 34 |
+
# 5 bytes de JSON mal formé
|
| 35 |
+
with patch(
|
| 36 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen"
|
| 37 |
+
) as mock_urlopen:
|
| 38 |
+
mock_resp = MagicMock()
|
| 39 |
+
mock_resp.read.return_value = b'{"@co'
|
| 40 |
+
mock_resp.__enter__ = lambda self: self
|
| 41 |
+
mock_resp.__exit__ = lambda self, *a: None
|
| 42 |
+
mock_urlopen.return_value = mock_resp
|
| 43 |
+
|
| 44 |
+
with pytest.raises(ValueError) as exc_info:
|
| 45 |
+
_fetch_manifest("https://example.org/manifest.json")
|
| 46 |
+
# Doit mentionner JSON ou manifeste
|
| 47 |
+
msg = str(exc_info.value).lower()
|
| 48 |
+
assert "json" in msg or "manifeste" in msg or "manifest" in msg
|
| 49 |
+
|
| 50 |
+
def test_empty_response_raises_value_error(self):
|
| 51 |
+
from picarones.adapters.corpus.iiif import _fetch_manifest
|
| 52 |
+
|
| 53 |
+
with patch(
|
| 54 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen"
|
| 55 |
+
) as mock_urlopen:
|
| 56 |
+
mock_resp = MagicMock()
|
| 57 |
+
mock_resp.read.return_value = b""
|
| 58 |
+
mock_resp.__enter__ = lambda self: self
|
| 59 |
+
mock_resp.__exit__ = lambda self, *a: None
|
| 60 |
+
mock_urlopen.return_value = mock_resp
|
| 61 |
+
|
| 62 |
+
with pytest.raises(ValueError):
|
| 63 |
+
_fetch_manifest("https://example.org/manifest.json")
|
| 64 |
+
|
| 65 |
+
|
| 66 |
+
# --------------------------------------------------------------------------
|
| 67 |
+
# 2. JSON valide mais champs IIIF requis absents
|
| 68 |
+
# --------------------------------------------------------------------------
|
| 69 |
+
|
| 70 |
+
|
| 71 |
+
class TestIIIFMissingFields:
|
| 72 |
+
"""Un manifeste sans ``@context`` ni ``items``/``sequences`` doit
|
| 73 |
+
pouvoir être détecté comme invalide par le parseur (ou produire 0
|
| 74 |
+
canvases sans plantage)."""
|
| 75 |
+
|
| 76 |
+
def test_no_context_no_sequences_yields_empty_canvases(self):
|
| 77 |
+
from picarones.adapters.corpus.iiif import IIIFManifestParser
|
| 78 |
+
|
| 79 |
+
# Manifeste valide JSON mais vide de toute donnée IIIF
|
| 80 |
+
empty = {}
|
| 81 |
+
parser = IIIFManifestParser(empty)
|
| 82 |
+
canvases = parser.canvases()
|
| 83 |
+
# Le parser ne doit pas planter sur un manifeste vide.
|
| 84 |
+
# Acceptable : retour vide.
|
| 85 |
+
assert canvases == []
|
| 86 |
+
|
| 87 |
+
def test_missing_sequences_v2_yields_empty(self):
|
| 88 |
+
from picarones.adapters.corpus.iiif import IIIFManifestParser
|
| 89 |
+
|
| 90 |
+
# Manifeste v2-like sans sequences
|
| 91 |
+
manifest = {
|
| 92 |
+
"@context": "http://iiif.io/api/presentation/2/context.json",
|
| 93 |
+
"@type": "sc:Manifest",
|
| 94 |
+
"label": "doc sans pages",
|
| 95 |
+
}
|
| 96 |
+
parser = IIIFManifestParser(manifest)
|
| 97 |
+
canvases = parser.canvases()
|
| 98 |
+
assert canvases == []
|
| 99 |
+
|
| 100 |
+
|
| 101 |
+
# --------------------------------------------------------------------------
|
| 102 |
+
# 3. Manifeste avec URL d'image loopback
|
| 103 |
+
# --------------------------------------------------------------------------
|
| 104 |
+
|
| 105 |
+
|
| 106 |
+
class TestIIIFLoopbackImageURL:
|
| 107 |
+
"""Si le manifeste pointe une image vers ``http://127.0.0.1/...``,
|
| 108 |
+
le téléchargement doit être bloqué par validate_http_url (anti-SSRF)."""
|
| 109 |
+
|
| 110 |
+
def test_download_loopback_image_rejected(self):
|
| 111 |
+
from picarones.adapters.corpus._http import download_url
|
| 112 |
+
|
| 113 |
+
# Une URL d'image qui pointe vers loopback doit être refusée
|
| 114 |
+
# avant la résolution réseau.
|
| 115 |
+
with pytest.raises(ValueError) as exc_info:
|
| 116 |
+
download_url("http://127.0.0.1/iiif/image/full/max/0/default.jpg")
|
| 117 |
+
msg = str(exc_info.value).lower()
|
| 118 |
+
assert "loopback" in msg or "ssrf" in msg or "interne" in msg or "127" in msg
|
| 119 |
+
|
| 120 |
+
def test_fetch_manifest_loopback_url_rejected(self):
|
| 121 |
+
from picarones.adapters.corpus.iiif import _fetch_manifest
|
| 122 |
+
|
| 123 |
+
# Manifeste hébergé sur loopback : refus immédiat (anti-SSRF
|
| 124 |
+
# statique côté validate_http_url).
|
| 125 |
+
with pytest.raises(ValueError):
|
| 126 |
+
_fetch_manifest("http://127.0.0.1/manifest.json")
|
| 127 |
+
|
| 128 |
+
|
| 129 |
+
# --------------------------------------------------------------------------
|
| 130 |
+
# 4. Manifeste géant (> 10 Mo)
|
| 131 |
+
# --------------------------------------------------------------------------
|
| 132 |
+
|
| 133 |
+
|
| 134 |
+
class TestIIIFOversizedManifest:
|
| 135 |
+
"""Un manifeste de plusieurs dizaines de Mo doit avoir une borne
|
| 136 |
+
de taille pour éviter un DoS mémoire.
|
| 137 |
+
|
| 138 |
+
Si la borne n'existe pas dans le code actuel, ce test est marqué
|
| 139 |
+
``xfail`` pour signaler explicitement l'absence de la fonctionnalité
|
| 140 |
+
(sans casser la suite ni masquer le problème).
|
| 141 |
+
"""
|
| 142 |
+
|
| 143 |
+
def test_oversized_manifest_should_have_size_limit(self):
|
| 144 |
+
from picarones.adapters.corpus.iiif import _fetch_manifest
|
| 145 |
+
|
| 146 |
+
# Manifeste valide mais artificiellement gonflé à ~12 Mo
|
| 147 |
+
# par un padding du label.
|
| 148 |
+
big_label = "x" * (12 * 1024 * 1024)
|
| 149 |
+
big_manifest = {
|
| 150 |
+
"@context": "http://iiif.io/api/presentation/2/context.json",
|
| 151 |
+
"@type": "sc:Manifest",
|
| 152 |
+
"label": big_label,
|
| 153 |
+
"sequences": [],
|
| 154 |
+
}
|
| 155 |
+
big_bytes = json.dumps(big_manifest).encode("utf-8")
|
| 156 |
+
|
| 157 |
+
with patch(
|
| 158 |
+
"picarones.adapters.corpus._http.urllib.request.urlopen"
|
| 159 |
+
) as mock_urlopen:
|
| 160 |
+
mock_resp = MagicMock()
|
| 161 |
+
mock_resp.read.return_value = big_bytes
|
| 162 |
+
mock_resp.__enter__ = lambda self: self
|
| 163 |
+
mock_resp.__exit__ = lambda self, *a: None
|
| 164 |
+
mock_urlopen.return_value = mock_resp
|
| 165 |
+
|
| 166 |
+
# Si une limite existe, on s'attend à une exception (ValueError
|
| 167 |
+
# ou OSError ou MemoryError selon implémentation). Sinon le
|
| 168 |
+
# manifeste est chargé entièrement — révélateur de l'absence
|
| 169 |
+
# de garde.
|
| 170 |
+
try:
|
| 171 |
+
manifest = _fetch_manifest("https://example.org/big.json")
|
| 172 |
+
# Pas de garde-fou : on charge tout. C'est la vérité du
|
| 173 |
+
# code actuel — on signale via xfail.
|
| 174 |
+
assert isinstance(manifest, dict)
|
| 175 |
+
pytest.xfail(
|
| 176 |
+
"S5 — IIIFImporter._fetch_manifest accepte sans broncher "
|
| 177 |
+
"un manifeste de >10 Mo : pas de borne de taille. "
|
| 178 |
+
"À durcir : ajouter une lecture par chunks avec MAX_MANIFEST_SIZE."
|
| 179 |
+
)
|
| 180 |
+
except (ValueError, MemoryError, OSError):
|
| 181 |
+
# Une garde existe — comportement souhaité.
|
| 182 |
+
pass
|
| 183 |
+
|
| 184 |
+
|
| 185 |
+
# --------------------------------------------------------------------------
|
| 186 |
+
# 5. Manifeste avec contenu malformé (clés bizarres)
|
| 187 |
+
# --------------------------------------------------------------------------
|
| 188 |
+
|
| 189 |
+
|
| 190 |
+
class TestIIIFMalformedFields:
|
| 191 |
+
"""Un canvas avec des champs ``label``/``image_url`` de types
|
| 192 |
+
inattendus doit être absorbé par le parseur sans crash."""
|
| 193 |
+
|
| 194 |
+
def test_canvas_with_int_label_does_not_crash(self):
|
| 195 |
+
from picarones.adapters.corpus.iiif import IIIFManifestParser
|
| 196 |
+
|
| 197 |
+
manifest = {
|
| 198 |
+
"@context": "http://iiif.io/api/presentation/2/context.json",
|
| 199 |
+
"@type": "sc:Manifest",
|
| 200 |
+
"sequences": [{
|
| 201 |
+
"canvases": [
|
| 202 |
+
{"label": 12345, "images": []},
|
| 203 |
+
],
|
| 204 |
+
}],
|
| 205 |
+
}
|
| 206 |
+
parser = IIIFManifestParser(manifest)
|
| 207 |
+
canvases = parser.canvases()
|
| 208 |
+
# Un canvas, pas de plantage
|
| 209 |
+
assert len(canvases) == 1
|
| 210 |
+
assert isinstance(canvases[0].label, str)
|
|
@@ -0,0 +1,293 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint S4.1 — couverture des opérations SQL de ``JobStore``.
|
| 2 |
+
|
| 3 |
+
Avant S4 : ``job_store.py`` à 64% de couverture. Lignes non
|
| 4 |
+
couvertes : ``create``, ``get``, ``list``, ``update_progress``,
|
| 5 |
+
``mark_*``, ``mark_orphaned_jobs_interrupted``, ``_set_status``,
|
| 6 |
+
``_row_to_record`` (gestion payload corrompu).
|
| 7 |
+
|
| 8 |
+
Cible : 90%+ de couverture.
|
| 9 |
+
"""
|
| 10 |
+
|
| 11 |
+
from __future__ import annotations
|
| 12 |
+
|
| 13 |
+
import sqlite3
|
| 14 |
+
from pathlib import Path
|
| 15 |
+
|
| 16 |
+
import pytest
|
| 17 |
+
|
| 18 |
+
from picarones.adapters.storage.job_store import JobRecord, JobStore, JobStoreError
|
| 19 |
+
|
| 20 |
+
|
| 21 |
+
@pytest.fixture
|
| 22 |
+
def store(tmp_path: Path) -> JobStore:
|
| 23 |
+
"""JobStore fraîchement créé sur un tmp_path."""
|
| 24 |
+
return JobStore(db_path=tmp_path / "jobs.sqlite")
|
| 25 |
+
|
| 26 |
+
|
| 27 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 28 |
+
# create
|
| 29 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 30 |
+
|
| 31 |
+
|
| 32 |
+
class TestCreate:
|
| 33 |
+
def test_create_returns_record(self, store: JobStore) -> None:
|
| 34 |
+
rec = store.create("job_001", payload={"corpus": "test"}, total_docs=10)
|
| 35 |
+
assert isinstance(rec, JobRecord)
|
| 36 |
+
assert rec.job_id == "job_001"
|
| 37 |
+
assert rec.status == "pending"
|
| 38 |
+
assert rec.total_docs == 10
|
| 39 |
+
assert rec.progress == 0.0
|
| 40 |
+
|
| 41 |
+
def test_create_with_no_payload_uses_empty_dict(
|
| 42 |
+
self, store: JobStore,
|
| 43 |
+
) -> None:
|
| 44 |
+
rec = store.create("job_002")
|
| 45 |
+
assert rec is not None
|
| 46 |
+
assert rec.status == "pending"
|
| 47 |
+
|
| 48 |
+
def test_create_empty_job_id_raises(self, store: JobStore) -> None:
|
| 49 |
+
with pytest.raises(JobStoreError, match="vide"):
|
| 50 |
+
store.create("")
|
| 51 |
+
|
| 52 |
+
def test_create_duplicate_job_id_raises(self, store: JobStore) -> None:
|
| 53 |
+
store.create("dup")
|
| 54 |
+
with pytest.raises(JobStoreError, match="déjà existant"):
|
| 55 |
+
store.create("dup")
|
| 56 |
+
|
| 57 |
+
def test_create_persists_payload_json(self, store: JobStore) -> None:
|
| 58 |
+
complex_payload = {
|
| 59 |
+
"corpus": "manuscrits",
|
| 60 |
+
"engines": ["tesseract", "pero"],
|
| 61 |
+
"options": {"lang": "fra"},
|
| 62 |
+
}
|
| 63 |
+
store.create("payload_test", payload=complex_payload)
|
| 64 |
+
rec = store.get("payload_test")
|
| 65 |
+
assert rec is not None
|
| 66 |
+
# Le payload est exposé via JobRecord.payload (dict).
|
| 67 |
+
assert rec.payload == complex_payload
|
| 68 |
+
|
| 69 |
+
|
| 70 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 71 |
+
# get + list
|
| 72 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 73 |
+
|
| 74 |
+
|
| 75 |
+
class TestGetAndList:
|
| 76 |
+
def test_get_unknown_returns_none(self, store: JobStore) -> None:
|
| 77 |
+
assert store.get("does_not_exist") is None
|
| 78 |
+
|
| 79 |
+
def test_get_returns_existing_record(self, store: JobStore) -> None:
|
| 80 |
+
store.create("a")
|
| 81 |
+
rec = store.get("a")
|
| 82 |
+
assert rec is not None
|
| 83 |
+
assert rec.job_id == "a"
|
| 84 |
+
|
| 85 |
+
def test_list_empty_store_returns_empty_tuple(
|
| 86 |
+
self, store: JobStore,
|
| 87 |
+
) -> None:
|
| 88 |
+
assert store.list() == ()
|
| 89 |
+
|
| 90 |
+
def test_list_orders_by_created_desc(self, store: JobStore) -> None:
|
| 91 |
+
# Crée 3 jobs avec un délai pour garantir l'ordre temporel
|
| 92 |
+
import time
|
| 93 |
+
for i in range(3):
|
| 94 |
+
store.create(f"job_{i:02d}")
|
| 95 |
+
time.sleep(0.01)
|
| 96 |
+
records = store.list()
|
| 97 |
+
assert len(records) == 3
|
| 98 |
+
# Le plus récent en premier
|
| 99 |
+
assert records[0].job_id == "job_02"
|
| 100 |
+
assert records[2].job_id == "job_00"
|
| 101 |
+
|
| 102 |
+
def test_list_respects_limit(self, store: JobStore) -> None:
|
| 103 |
+
for i in range(5):
|
| 104 |
+
store.create(f"j{i}")
|
| 105 |
+
results = store.list(limit=2)
|
| 106 |
+
assert len(results) == 2
|
| 107 |
+
|
| 108 |
+
def test_list_limit_zero_returns_empty(self, store: JobStore) -> None:
|
| 109 |
+
store.create("j")
|
| 110 |
+
assert store.list(limit=0) == ()
|
| 111 |
+
|
| 112 |
+
|
| 113 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 114 |
+
# update_progress
|
| 115 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 116 |
+
|
| 117 |
+
|
| 118 |
+
class TestUpdateProgress:
|
| 119 |
+
def test_update_progress_sets_value(self, store: JobStore) -> None:
|
| 120 |
+
store.create("p", total_docs=10)
|
| 121 |
+
store.update_progress("p", progress=0.5, processed_docs=5,
|
| 122 |
+
current_engine="tesseract")
|
| 123 |
+
rec = store.get("p")
|
| 124 |
+
assert rec is not None
|
| 125 |
+
assert rec.progress == 0.5
|
| 126 |
+
assert rec.processed_docs == 5
|
| 127 |
+
assert rec.current_engine == "tesseract"
|
| 128 |
+
|
| 129 |
+
def test_update_progress_clamps_above_one(self, store: JobStore) -> None:
|
| 130 |
+
store.create("p")
|
| 131 |
+
store.update_progress("p", progress=2.5)
|
| 132 |
+
rec = store.get("p")
|
| 133 |
+
assert rec is not None
|
| 134 |
+
assert rec.progress == 1.0
|
| 135 |
+
|
| 136 |
+
def test_update_progress_clamps_below_zero(self, store: JobStore) -> None:
|
| 137 |
+
store.create("p")
|
| 138 |
+
store.update_progress("p", progress=-0.5)
|
| 139 |
+
rec = store.get("p")
|
| 140 |
+
assert rec is not None
|
| 141 |
+
assert rec.progress == 0.0
|
| 142 |
+
|
| 143 |
+
def test_update_progress_unknown_job_is_silent(
|
| 144 |
+
self, store: JobStore,
|
| 145 |
+
) -> None:
|
| 146 |
+
# UPDATE WHERE job_id matches nothing — ne lève pas, mutation 0 ligne.
|
| 147 |
+
store.update_progress("ghost", progress=0.3)
|
| 148 |
+
# Aucun side-effect : le job n'apparaît pas après l'opération.
|
| 149 |
+
assert store.get("ghost") is None
|
| 150 |
+
|
| 151 |
+
|
| 152 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 153 |
+
# mark_* (transitions de statut)
|
| 154 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 155 |
+
|
| 156 |
+
|
| 157 |
+
class TestStatusTransitions:
|
| 158 |
+
def test_mark_running(self, store: JobStore) -> None:
|
| 159 |
+
store.create("r")
|
| 160 |
+
store.mark_running("r")
|
| 161 |
+
rec = store.get("r")
|
| 162 |
+
assert rec is not None
|
| 163 |
+
assert rec.status == "running"
|
| 164 |
+
assert rec.finished_at is None
|
| 165 |
+
|
| 166 |
+
def test_mark_complete_sets_output(self, store: JobStore) -> None:
|
| 167 |
+
store.create("c")
|
| 168 |
+
store.mark_complete("c", output_path="/tmp/report.html")
|
| 169 |
+
rec = store.get("c")
|
| 170 |
+
assert rec is not None
|
| 171 |
+
assert rec.status == "complete"
|
| 172 |
+
assert rec.output_path == "/tmp/report.html"
|
| 173 |
+
assert rec.finished_at is not None
|
| 174 |
+
|
| 175 |
+
def test_mark_error_sets_message(self, store: JobStore) -> None:
|
| 176 |
+
store.create("e")
|
| 177 |
+
store.mark_error("e", error_message="OCR engine failed")
|
| 178 |
+
rec = store.get("e")
|
| 179 |
+
assert rec is not None
|
| 180 |
+
assert rec.status == "error"
|
| 181 |
+
assert rec.error == "OCR engine failed"
|
| 182 |
+
assert rec.finished_at is not None
|
| 183 |
+
|
| 184 |
+
def test_mark_cancelled(self, store: JobStore) -> None:
|
| 185 |
+
store.create("x")
|
| 186 |
+
store.mark_cancelled("x")
|
| 187 |
+
rec = store.get("x")
|
| 188 |
+
assert rec is not None
|
| 189 |
+
assert rec.status == "cancelled"
|
| 190 |
+
assert rec.finished_at is not None
|
| 191 |
+
|
| 192 |
+
def test_is_terminal_helper(self, store: JobStore) -> None:
|
| 193 |
+
store.create("t")
|
| 194 |
+
store.mark_complete("t")
|
| 195 |
+
rec = store.get("t")
|
| 196 |
+
assert rec is not None
|
| 197 |
+
assert rec.is_terminal is True
|
| 198 |
+
assert rec.is_live is False
|
| 199 |
+
|
| 200 |
+
|
| 201 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 202 |
+
# mark_orphaned_jobs_interrupted (boot cleanup)
|
| 203 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 204 |
+
|
| 205 |
+
|
| 206 |
+
class TestOrphanedJobsCleanup:
|
| 207 |
+
def test_pending_and_running_become_interrupted(
|
| 208 |
+
self, store: JobStore,
|
| 209 |
+
) -> None:
|
| 210 |
+
store.create("p") # pending
|
| 211 |
+
store.create("r")
|
| 212 |
+
store.mark_running("r") # running
|
| 213 |
+
store.create("c")
|
| 214 |
+
store.mark_complete("c") # complete (terminal)
|
| 215 |
+
|
| 216 |
+
n = store.mark_orphaned_jobs_interrupted()
|
| 217 |
+
assert n == 2 # p + r
|
| 218 |
+
|
| 219 |
+
assert store.get("p").status == "interrupted" # type: ignore[union-attr]
|
| 220 |
+
assert store.get("r").status == "interrupted" # type: ignore[union-attr]
|
| 221 |
+
# Le job complete n'est pas affecté.
|
| 222 |
+
assert store.get("c").status == "complete" # type: ignore[union-attr]
|
| 223 |
+
|
| 224 |
+
def test_no_orphans_returns_zero(self, store: JobStore) -> None:
|
| 225 |
+
# Aucun job ou tous terminaux.
|
| 226 |
+
assert store.mark_orphaned_jobs_interrupted() == 0
|
| 227 |
+
|
| 228 |
+
def test_orphan_records_carry_explanation(
|
| 229 |
+
self, store: JobStore,
|
| 230 |
+
) -> None:
|
| 231 |
+
store.create("p")
|
| 232 |
+
store.mark_orphaned_jobs_interrupted()
|
| 233 |
+
rec = store.get("p")
|
| 234 |
+
assert rec is not None
|
| 235 |
+
assert rec.error == "process restart"
|
| 236 |
+
|
| 237 |
+
|
| 238 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 239 |
+
# _row_to_record — payload corrompu
|
| 240 |
+
# ──────────────────────────────────────────────────────���───────────────
|
| 241 |
+
|
| 242 |
+
|
| 243 |
+
class TestPayloadCorruptionTolerance:
|
| 244 |
+
"""Le store doit tolérer un payload_json corrompu (downgrade
|
| 245 |
+
de version, écriture concurrente cassée, etc.) sans crasher."""
|
| 246 |
+
|
| 247 |
+
def test_corrupted_payload_yields_empty_dict_with_warning(
|
| 248 |
+
self,
|
| 249 |
+
store: JobStore,
|
| 250 |
+
caplog: pytest.LogCaptureFixture,
|
| 251 |
+
) -> None:
|
| 252 |
+
store.create("corrupt")
|
| 253 |
+
# Réécriture brutale du payload_json en JSON invalide.
|
| 254 |
+
with sqlite3.connect(str(store.db_path)) as conn:
|
| 255 |
+
conn.execute(
|
| 256 |
+
"UPDATE jobs SET payload_json = ? WHERE job_id = ?",
|
| 257 |
+
("{not valid json", "corrupt"),
|
| 258 |
+
)
|
| 259 |
+
conn.commit()
|
| 260 |
+
|
| 261 |
+
import logging
|
| 262 |
+
with caplog.at_level(logging.WARNING):
|
| 263 |
+
rec = store.get("corrupt")
|
| 264 |
+
|
| 265 |
+
assert rec is not None
|
| 266 |
+
assert rec.payload == {}
|
| 267 |
+
# Un warning doit avoir été émis.
|
| 268 |
+
assert any(
|
| 269 |
+
"corrompu" in r.message.lower() or "corrupt" in r.message
|
| 270 |
+
for r in caplog.records
|
| 271 |
+
)
|
| 272 |
+
|
| 273 |
+
|
| 274 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 275 |
+
# Persistence cross-instance — db_path
|
| 276 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 277 |
+
|
| 278 |
+
|
| 279 |
+
class TestPersistence:
|
| 280 |
+
def test_jobs_persist_across_store_instances(
|
| 281 |
+
self, tmp_path: Path,
|
| 282 |
+
) -> None:
|
| 283 |
+
db = tmp_path / "shared.sqlite"
|
| 284 |
+
s1 = JobStore(db_path=db)
|
| 285 |
+
s1.create("persisted", total_docs=42)
|
| 286 |
+
|
| 287 |
+
s2 = JobStore(db_path=db)
|
| 288 |
+
rec = s2.get("persisted")
|
| 289 |
+
assert rec is not None
|
| 290 |
+
assert rec.total_docs == 42
|
| 291 |
+
|
| 292 |
+
def test_db_path_property_returns_path(self, store: JobStore) -> None:
|
| 293 |
+
assert isinstance(store.db_path, Path)
|
|
File without changes
|
|
@@ -0,0 +1,237 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint S5 — Tests d'entrées extrêmes pour ``compute_metrics``.
|
| 2 |
+
|
| 3 |
+
Robustesse face à :
|
| 4 |
+
|
| 5 |
+
- Texte 10 Mo
|
| 6 |
+
- Emoji multibyte (🎉🎊)
|
| 7 |
+
- RTL arabe
|
| 8 |
+
- NFC vs NFD (formes Unicode équivalentes mais bytes différents)
|
| 9 |
+
- Null bytes / whitespace seul
|
| 10 |
+
- Line / Paragraph separator U+2028 / U+2029
|
| 11 |
+
|
| 12 |
+
Pour chacun, on vérifie qu'aucune exception ne fuit hors de
|
| 13 |
+
``compute_metrics`` (le décorateur try/except interne doit retourner
|
| 14 |
+
un MetricsResult avec ``error`` non-None ou des métriques numériques
|
| 15 |
+
correctes).
|
| 16 |
+
"""
|
| 17 |
+
|
| 18 |
+
from __future__ import annotations
|
| 19 |
+
|
| 20 |
+
import unicodedata
|
| 21 |
+
|
| 22 |
+
|
| 23 |
+
# --------------------------------------------------------------------------
|
| 24 |
+
# 1. Texte de 10 Mo
|
| 25 |
+
# --------------------------------------------------------------------------
|
| 26 |
+
|
| 27 |
+
|
| 28 |
+
class TestExtremeLengthInputs:
|
| 29 |
+
def test_10mb_text_does_not_crash(self):
|
| 30 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 31 |
+
|
| 32 |
+
# 10 Mo de texte ASCII (caractère unique répété)
|
| 33 |
+
big = "a" * (10 * 1024 * 1024)
|
| 34 |
+
result = compute_metrics(big, big)
|
| 35 |
+
|
| 36 |
+
# Identité parfaite : CER = 0.0
|
| 37 |
+
# Si jiwer absent, error est non-None mais pas crash.
|
| 38 |
+
if result.error is None:
|
| 39 |
+
assert result.cer == 0.0
|
| 40 |
+
assert result.cer_nfc == 0.0
|
| 41 |
+
else:
|
| 42 |
+
# Échec géré sans exception remontante
|
| 43 |
+
assert isinstance(result.error, str)
|
| 44 |
+
|
| 45 |
+
|
| 46 |
+
# --------------------------------------------------------------------------
|
| 47 |
+
# 2. Emoji multibyte
|
| 48 |
+
# --------------------------------------------------------------------------
|
| 49 |
+
|
| 50 |
+
|
| 51 |
+
class TestEmojiInputs:
|
| 52 |
+
def test_emoji_identity_is_zero_cer(self):
|
| 53 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 54 |
+
|
| 55 |
+
ref = "Bonjour 🎉🎊 monde"
|
| 56 |
+
hyp = "Bonjour 🎉🎊 monde"
|
| 57 |
+
result = compute_metrics(ref, hyp)
|
| 58 |
+
|
| 59 |
+
if result.error is None:
|
| 60 |
+
assert result.cer == 0.0
|
| 61 |
+
|
| 62 |
+
def test_emoji_substitution_yields_positive_cer(self):
|
| 63 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 64 |
+
|
| 65 |
+
ref = "Bonjour 🎉🎊 monde"
|
| 66 |
+
hyp = "Bonjour 🎯🎯 monde"
|
| 67 |
+
result = compute_metrics(ref, hyp)
|
| 68 |
+
|
| 69 |
+
# Soit erreur gérée, soit CER > 0
|
| 70 |
+
if result.error is None:
|
| 71 |
+
assert result.cer is not None
|
| 72 |
+
assert result.cer > 0.0
|
| 73 |
+
|
| 74 |
+
|
| 75 |
+
# --------------------------------------------------------------------------
|
| 76 |
+
# 3. RTL arabe
|
| 77 |
+
# --------------------------------------------------------------------------
|
| 78 |
+
|
| 79 |
+
|
| 80 |
+
class TestRTLArabicInputs:
|
| 81 |
+
def test_arabic_identity_zero_cer(self):
|
| 82 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 83 |
+
|
| 84 |
+
ref = "السلام عليكم"
|
| 85 |
+
hyp = "السلام عليكم"
|
| 86 |
+
result = compute_metrics(ref, hyp)
|
| 87 |
+
|
| 88 |
+
if result.error is None:
|
| 89 |
+
assert result.cer == 0.0
|
| 90 |
+
# Tous les caractères doivent être comptés
|
| 91 |
+
assert result.reference_length == len(ref)
|
| 92 |
+
|
| 93 |
+
def test_arabic_one_char_diff_cer_positive(self):
|
| 94 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 95 |
+
|
| 96 |
+
ref = "السلام عليكم"
|
| 97 |
+
hyp = "السلام عليك" # un caractère manquant à la fin
|
| 98 |
+
result = compute_metrics(ref, hyp)
|
| 99 |
+
|
| 100 |
+
if result.error is None:
|
| 101 |
+
assert result.cer is not None
|
| 102 |
+
assert result.cer > 0.0
|
| 103 |
+
|
| 104 |
+
|
| 105 |
+
# --------------------------------------------------------------------------
|
| 106 |
+
# 4. NFC vs NFD : "é" en deux formes différentes
|
| 107 |
+
# --------------------------------------------------------------------------
|
| 108 |
+
|
| 109 |
+
|
| 110 |
+
class TestUnicodeNormalizationForms:
|
| 111 |
+
def test_nfc_vs_nfd_same_apparent_content(self):
|
| 112 |
+
"""``é`` NFC = U+00E9 ; ``é`` NFD = U+0065 + U+0301.
|
| 113 |
+
Le CER brut devrait être > 0 (bytes différents),
|
| 114 |
+
mais le CER NFC = 0 (les deux formes sont normalisées)."""
|
| 115 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 116 |
+
|
| 117 |
+
ref_nfc = unicodedata.normalize("NFC", "café") # 4 chars
|
| 118 |
+
ref_nfd = unicodedata.normalize("NFD", "café") # 5 chars
|
| 119 |
+
# Sanité : les deux représentations sont effectivement distinctes
|
| 120 |
+
assert ref_nfc != ref_nfd
|
| 121 |
+
assert len(ref_nfc) != len(ref_nfd)
|
| 122 |
+
|
| 123 |
+
result = compute_metrics(ref_nfc, ref_nfd)
|
| 124 |
+
|
| 125 |
+
if result.error is None:
|
| 126 |
+
# Le CER normalisé NFC doit être 0
|
| 127 |
+
assert result.cer_nfc == 0.0
|
| 128 |
+
|
| 129 |
+
def test_pure_combining_chars_handled(self):
|
| 130 |
+
"""Texte composé uniquement de caractères combinants
|
| 131 |
+
(par ex. accents seuls)."""
|
| 132 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 133 |
+
|
| 134 |
+
# Combining grave + combining acute
|
| 135 |
+
ref = "̀́̂"
|
| 136 |
+
hyp = "̀́̂"
|
| 137 |
+
result = compute_metrics(ref, hyp)
|
| 138 |
+
# Soit error gérée, soit identité parfaite
|
| 139 |
+
if result.error is None:
|
| 140 |
+
assert result.cer == 0.0
|
| 141 |
+
|
| 142 |
+
|
| 143 |
+
# --------------------------------------------------------------------------
|
| 144 |
+
# 5. Null bytes / whitespace seulement
|
| 145 |
+
# --------------------------------------------------------------------------
|
| 146 |
+
|
| 147 |
+
|
| 148 |
+
class TestNullAndWhitespaceInputs:
|
| 149 |
+
def test_null_bytes_only(self):
|
| 150 |
+
"""Texte uniquement composé de \\x00 — pas de crash."""
|
| 151 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 152 |
+
|
| 153 |
+
ref = "\x00\x00\x00"
|
| 154 |
+
hyp = "\x00\x00\x00"
|
| 155 |
+
result = compute_metrics(ref, hyp)
|
| 156 |
+
# Pas d'exception, comportement défini.
|
| 157 |
+
assert result is not None
|
| 158 |
+
|
| 159 |
+
def test_whitespace_only_strings(self):
|
| 160 |
+
"""Texte uniquement composé d'espaces — comportement défini."""
|
| 161 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 162 |
+
|
| 163 |
+
ref = " "
|
| 164 |
+
hyp = " "
|
| 165 |
+
result = compute_metrics(ref, hyp)
|
| 166 |
+
# Pas de crash. Le ``ref.strip()`` vide → la branche "ref vide"
|
| 167 |
+
# ou bien CER = 0.
|
| 168 |
+
assert result is not None
|
| 169 |
+
|
| 170 |
+
def test_empty_string_both_sides(self):
|
| 171 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 172 |
+
|
| 173 |
+
result = compute_metrics("", "")
|
| 174 |
+
# Comportement défini : pas de crash, error éventuelle
|
| 175 |
+
assert result is not None
|
| 176 |
+
|
| 177 |
+
|
| 178 |
+
# --------------------------------------------------------------------------
|
| 179 |
+
# 6. U+2028 / U+2029 (Line / Paragraph separator)
|
| 180 |
+
# --------------------------------------------------------------------------
|
| 181 |
+
|
| 182 |
+
|
| 183 |
+
class TestLineParagraphSeparators:
|
| 184 |
+
def test_u2028_line_separator(self):
|
| 185 |
+
"""U+2028 : LINE SEPARATOR. Doit être traité comme un caractère
|
| 186 |
+
normal par compute_metrics (jiwer travaille sur des codepoints)."""
|
| 187 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 188 |
+
|
| 189 |
+
ref = "ligne 1
ligne 2"
|
| 190 |
+
hyp = "ligne 1
ligne 2"
|
| 191 |
+
result = compute_metrics(ref, hyp)
|
| 192 |
+
if result.error is None:
|
| 193 |
+
assert result.cer == 0.0
|
| 194 |
+
|
| 195 |
+
def test_u2029_paragraph_separator(self):
|
| 196 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 197 |
+
|
| 198 |
+
ref = "para 1
para 2"
|
| 199 |
+
hyp = "para 1
para 2"
|
| 200 |
+
result = compute_metrics(ref, hyp)
|
| 201 |
+
if result.error is None:
|
| 202 |
+
assert result.cer == 0.0
|
| 203 |
+
|
| 204 |
+
|
| 205 |
+
# --------------------------------------------------------------------------
|
| 206 |
+
# 7. Mélange de scripts
|
| 207 |
+
# --------------------------------------------------------------------------
|
| 208 |
+
|
| 209 |
+
|
| 210 |
+
class TestMixedScripts:
|
| 211 |
+
def test_mixed_arabic_latin_emoji(self):
|
| 212 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 213 |
+
|
| 214 |
+
ref = "Hello مرحبا 🌍 sweet world"
|
| 215 |
+
hyp = "Hello مرحبا 🌍 sweet world"
|
| 216 |
+
result = compute_metrics(ref, hyp)
|
| 217 |
+
if result.error is None:
|
| 218 |
+
assert result.cer == 0.0
|
| 219 |
+
# On a bien des bytes / caractères tous comptés
|
| 220 |
+
assert result.reference_length > 0
|
| 221 |
+
|
| 222 |
+
|
| 223 |
+
# --------------------------------------------------------------------------
|
| 224 |
+
# 8. Texte avec uniquement contrôles ASCII
|
| 225 |
+
# --------------------------------------------------------------------------
|
| 226 |
+
|
| 227 |
+
|
| 228 |
+
class TestControlCharacters:
|
| 229 |
+
def test_only_control_chars(self):
|
| 230 |
+
"""Caractères de contrôle ASCII (BEL, BS, FF…)."""
|
| 231 |
+
from picarones.evaluation.metrics.text_metrics import compute_metrics
|
| 232 |
+
|
| 233 |
+
ref = "\x07\x08\x0c"
|
| 234 |
+
hyp = "\x07\x08\x0c"
|
| 235 |
+
result = compute_metrics(ref, hyp)
|
| 236 |
+
# Pas de crash
|
| 237 |
+
assert result is not None
|
|
File without changes
|
|
@@ -0,0 +1,237 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
{
|
| 2 |
+
"corpus": {
|
| 3 |
+
"document_count": 2,
|
| 4 |
+
"name": "test_corpus_s5",
|
| 5 |
+
"source": "/fixtures/corpus.zip"
|
| 6 |
+
},
|
| 7 |
+
"engine_reports": [
|
| 8 |
+
{
|
| 9 |
+
"aggregated_metrics": {
|
| 10 |
+
"cer": {
|
| 11 |
+
"max": 0.05,
|
| 12 |
+
"mean": 0.025,
|
| 13 |
+
"median": 0.025,
|
| 14 |
+
"min": 0.0,
|
| 15 |
+
"stdev": 0.035355
|
| 16 |
+
},
|
| 17 |
+
"cer_caseless": {
|
| 18 |
+
"max": 0.05,
|
| 19 |
+
"mean": 0.025,
|
| 20 |
+
"median": 0.025,
|
| 21 |
+
"min": 0.0,
|
| 22 |
+
"stdev": 0.035355
|
| 23 |
+
},
|
| 24 |
+
"cer_nfc": {
|
| 25 |
+
"max": 0.05,
|
| 26 |
+
"mean": 0.025,
|
| 27 |
+
"median": 0.025,
|
| 28 |
+
"min": 0.0,
|
| 29 |
+
"stdev": 0.035355
|
| 30 |
+
},
|
| 31 |
+
"document_count": 2,
|
| 32 |
+
"failed_count": 0,
|
| 33 |
+
"mer": {
|
| 34 |
+
"max": 0.05,
|
| 35 |
+
"mean": 0.025,
|
| 36 |
+
"median": 0.025,
|
| 37 |
+
"min": 0.0,
|
| 38 |
+
"stdev": 0.035355
|
| 39 |
+
},
|
| 40 |
+
"wer": {
|
| 41 |
+
"max": 0.1,
|
| 42 |
+
"mean": 0.05,
|
| 43 |
+
"median": 0.05,
|
| 44 |
+
"min": 0.0,
|
| 45 |
+
"stdev": 0.070711
|
| 46 |
+
},
|
| 47 |
+
"wer_normalized": {
|
| 48 |
+
"max": 0.1,
|
| 49 |
+
"mean": 0.05,
|
| 50 |
+
"median": 0.05,
|
| 51 |
+
"min": 0.0,
|
| 52 |
+
"stdev": 0.070711
|
| 53 |
+
},
|
| 54 |
+
"wil": {
|
| 55 |
+
"max": 0.1,
|
| 56 |
+
"mean": 0.05,
|
| 57 |
+
"median": 0.05,
|
| 58 |
+
"min": 0.0,
|
| 59 |
+
"stdev": 0.070711
|
| 60 |
+
}
|
| 61 |
+
},
|
| 62 |
+
"document_results": [
|
| 63 |
+
{
|
| 64 |
+
"doc_id": "doc1",
|
| 65 |
+
"duration_seconds": 1.5,
|
| 66 |
+
"engine_error": null,
|
| 67 |
+
"ground_truth": "Bonjour le monde",
|
| 68 |
+
"hypothesis": "Bonjour le monde",
|
| 69 |
+
"image_path": "/fixtures/doc1.jpg",
|
| 70 |
+
"metrics": {
|
| 71 |
+
"cer": 0.0,
|
| 72 |
+
"cer_caseless": 0.0,
|
| 73 |
+
"cer_nfc": 0.0,
|
| 74 |
+
"error": null,
|
| 75 |
+
"hypothesis_length": 16,
|
| 76 |
+
"mer": 0.0,
|
| 77 |
+
"reference_length": 16,
|
| 78 |
+
"wer": 0.0,
|
| 79 |
+
"wer_normalized": 0.0,
|
| 80 |
+
"wil": 0.0
|
| 81 |
+
}
|
| 82 |
+
},
|
| 83 |
+
{
|
| 84 |
+
"doc_id": "doc2",
|
| 85 |
+
"duration_seconds": 2.0,
|
| 86 |
+
"engine_error": null,
|
| 87 |
+
"ground_truth": "Au revoir",
|
| 88 |
+
"hypothesis": "Au revoir!",
|
| 89 |
+
"image_path": "/fixtures/doc2.jpg",
|
| 90 |
+
"metrics": {
|
| 91 |
+
"cer": 0.05,
|
| 92 |
+
"cer_caseless": 0.05,
|
| 93 |
+
"cer_nfc": 0.05,
|
| 94 |
+
"error": null,
|
| 95 |
+
"hypothesis_length": 10,
|
| 96 |
+
"mer": 0.05,
|
| 97 |
+
"reference_length": 9,
|
| 98 |
+
"wer": 0.1,
|
| 99 |
+
"wer_normalized": 0.1,
|
| 100 |
+
"wil": 0.1
|
| 101 |
+
}
|
| 102 |
+
}
|
| 103 |
+
],
|
| 104 |
+
"engine_config": {
|
| 105 |
+
"lang": "fra"
|
| 106 |
+
},
|
| 107 |
+
"engine_name": "engine_alpha",
|
| 108 |
+
"engine_version": "1.0.0"
|
| 109 |
+
},
|
| 110 |
+
{
|
| 111 |
+
"aggregated_metrics": {
|
| 112 |
+
"cer": {
|
| 113 |
+
"max": 0.0625,
|
| 114 |
+
"mean": 0.03125,
|
| 115 |
+
"median": 0.03125,
|
| 116 |
+
"min": 0.0,
|
| 117 |
+
"stdev": 0.044194
|
| 118 |
+
},
|
| 119 |
+
"cer_caseless": {
|
| 120 |
+
"max": 0.0,
|
| 121 |
+
"mean": 0.0,
|
| 122 |
+
"median": 0.0,
|
| 123 |
+
"min": 0.0,
|
| 124 |
+
"stdev": 0.0
|
| 125 |
+
},
|
| 126 |
+
"cer_nfc": {
|
| 127 |
+
"max": 0.0625,
|
| 128 |
+
"mean": 0.03125,
|
| 129 |
+
"median": 0.03125,
|
| 130 |
+
"min": 0.0,
|
| 131 |
+
"stdev": 0.044194
|
| 132 |
+
},
|
| 133 |
+
"document_count": 2,
|
| 134 |
+
"failed_count": 0,
|
| 135 |
+
"mer": {
|
| 136 |
+
"max": 0.0625,
|
| 137 |
+
"mean": 0.03125,
|
| 138 |
+
"median": 0.03125,
|
| 139 |
+
"min": 0.0,
|
| 140 |
+
"stdev": 0.044194
|
| 141 |
+
},
|
| 142 |
+
"wer": {
|
| 143 |
+
"max": 0.333333,
|
| 144 |
+
"mean": 0.166666,
|
| 145 |
+
"median": 0.166666,
|
| 146 |
+
"min": 0.0,
|
| 147 |
+
"stdev": 0.235702
|
| 148 |
+
},
|
| 149 |
+
"wer_normalized": {
|
| 150 |
+
"max": 0.333333,
|
| 151 |
+
"mean": 0.166666,
|
| 152 |
+
"median": 0.166666,
|
| 153 |
+
"min": 0.0,
|
| 154 |
+
"stdev": 0.235702
|
| 155 |
+
},
|
| 156 |
+
"wil": {
|
| 157 |
+
"max": 0.111111,
|
| 158 |
+
"mean": 0.055556,
|
| 159 |
+
"median": 0.055556,
|
| 160 |
+
"min": 0.0,
|
| 161 |
+
"stdev": 0.078567
|
| 162 |
+
}
|
| 163 |
+
},
|
| 164 |
+
"document_results": [
|
| 165 |
+
{
|
| 166 |
+
"doc_id": "doc1",
|
| 167 |
+
"duration_seconds": 2.5,
|
| 168 |
+
"engine_error": null,
|
| 169 |
+
"ground_truth": "Bonjour le monde",
|
| 170 |
+
"hypothesis": "Bonjour Ie monde",
|
| 171 |
+
"image_path": "/fixtures/doc1.jpg",
|
| 172 |
+
"metrics": {
|
| 173 |
+
"cer": 0.0625,
|
| 174 |
+
"cer_caseless": 0.0,
|
| 175 |
+
"cer_nfc": 0.0625,
|
| 176 |
+
"error": null,
|
| 177 |
+
"hypothesis_length": 16,
|
| 178 |
+
"mer": 0.0625,
|
| 179 |
+
"reference_length": 16,
|
| 180 |
+
"wer": 0.333333,
|
| 181 |
+
"wer_normalized": 0.333333,
|
| 182 |
+
"wil": 0.111111
|
| 183 |
+
}
|
| 184 |
+
},
|
| 185 |
+
{
|
| 186 |
+
"doc_id": "doc2",
|
| 187 |
+
"duration_seconds": 1.8,
|
| 188 |
+
"engine_error": null,
|
| 189 |
+
"ground_truth": "Au revoir",
|
| 190 |
+
"hypothesis": "Au revoir",
|
| 191 |
+
"image_path": "/fixtures/doc2.jpg",
|
| 192 |
+
"metrics": {
|
| 193 |
+
"cer": 0.0,
|
| 194 |
+
"cer_caseless": 0.0,
|
| 195 |
+
"cer_nfc": 0.0,
|
| 196 |
+
"error": null,
|
| 197 |
+
"hypothesis_length": 9,
|
| 198 |
+
"mer": 0.0,
|
| 199 |
+
"reference_length": 9,
|
| 200 |
+
"wer": 0.0,
|
| 201 |
+
"wer_normalized": 0.0,
|
| 202 |
+
"wil": 0.0
|
| 203 |
+
}
|
| 204 |
+
}
|
| 205 |
+
],
|
| 206 |
+
"engine_config": {
|
| 207 |
+
"lang": "fra"
|
| 208 |
+
},
|
| 209 |
+
"engine_name": "engine_beta",
|
| 210 |
+
"engine_version": "2.1.3"
|
| 211 |
+
}
|
| 212 |
+
],
|
| 213 |
+
"metadata": {
|
| 214 |
+
"deterministic": true,
|
| 215 |
+
"sprint": "S5"
|
| 216 |
+
},
|
| 217 |
+
"picarones_version": "2.0.0-test",
|
| 218 |
+
"ranking": [
|
| 219 |
+
{
|
| 220 |
+
"documents": 2,
|
| 221 |
+
"engine": "engine_alpha",
|
| 222 |
+
"failed": 0,
|
| 223 |
+
"mean_cer": 0.025,
|
| 224 |
+
"mean_wer": 0.05,
|
| 225 |
+
"median_cer": 0.025
|
| 226 |
+
},
|
| 227 |
+
{
|
| 228 |
+
"documents": 2,
|
| 229 |
+
"engine": "engine_beta",
|
| 230 |
+
"failed": 0,
|
| 231 |
+
"mean_cer": 0.03125,
|
| 232 |
+
"mean_wer": 0.166666,
|
| 233 |
+
"median_cer": 0.03125
|
| 234 |
+
}
|
| 235 |
+
],
|
| 236 |
+
"run_date": "2026-05-09T00:00:00+00:00"
|
| 237 |
+
}
|
|
@@ -0,0 +1,238 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint S5 — Tests de stabilité du JSON ``BenchmarkResult``.
|
| 2 |
+
|
| 3 |
+
Garantit que la sérialisation JSON de ``BenchmarkResult.as_dict``/
|
| 4 |
+
``to_json`` est :
|
| 5 |
+
|
| 6 |
+
- **Stable** : deux sérialisations successives produisent les mêmes
|
| 7 |
+
bytes (modulo la clé ``run_date`` qui est forcée déterministe).
|
| 8 |
+
- **Conforme au snapshot** : le JSON correspond à un golden file
|
| 9 |
+
versionné dans ``tests/golden/fixtures/benchmark_result_v2.json``.
|
| 10 |
+
|
| 11 |
+
Si le snapshot n'existe pas au premier run, il est créé et le test
|
| 12 |
+
échoue avec un message demandant de commit le fichier.
|
| 13 |
+
"""
|
| 14 |
+
|
| 15 |
+
from __future__ import annotations
|
| 16 |
+
|
| 17 |
+
import json
|
| 18 |
+
from pathlib import Path
|
| 19 |
+
|
| 20 |
+
import pytest
|
| 21 |
+
|
| 22 |
+
|
| 23 |
+
GOLDEN_PATH = (
|
| 24 |
+
Path(__file__).parent / "fixtures" / "benchmark_result_v2.json"
|
| 25 |
+
)
|
| 26 |
+
|
| 27 |
+
|
| 28 |
+
def _build_deterministic_benchmark_result():
|
| 29 |
+
"""Construit un BenchmarkResult totalement déterministe pour le snapshot.
|
| 30 |
+
|
| 31 |
+
- Date fixée
|
| 32 |
+
- Version fixée
|
| 33 |
+
- 2 documents, 2 moteurs
|
| 34 |
+
- Pas de valeurs aléatoires
|
| 35 |
+
"""
|
| 36 |
+
from picarones.evaluation.benchmark_result import (
|
| 37 |
+
BenchmarkResult,
|
| 38 |
+
DocumentResult,
|
| 39 |
+
EngineReport,
|
| 40 |
+
)
|
| 41 |
+
from picarones.evaluation.metric_result import MetricsResult
|
| 42 |
+
|
| 43 |
+
# Document 1, moteur A
|
| 44 |
+
dr_a_1 = DocumentResult(
|
| 45 |
+
doc_id="doc1",
|
| 46 |
+
image_path="/fixtures/doc1.jpg",
|
| 47 |
+
ground_truth="Bonjour le monde",
|
| 48 |
+
hypothesis="Bonjour le monde",
|
| 49 |
+
metrics=MetricsResult(
|
| 50 |
+
cer=0.0,
|
| 51 |
+
cer_nfc=0.0,
|
| 52 |
+
cer_caseless=0.0,
|
| 53 |
+
wer=0.0,
|
| 54 |
+
wer_normalized=0.0,
|
| 55 |
+
mer=0.0,
|
| 56 |
+
wil=0.0,
|
| 57 |
+
reference_length=16,
|
| 58 |
+
hypothesis_length=16,
|
| 59 |
+
),
|
| 60 |
+
duration_seconds=1.5,
|
| 61 |
+
)
|
| 62 |
+
dr_a_2 = DocumentResult(
|
| 63 |
+
doc_id="doc2",
|
| 64 |
+
image_path="/fixtures/doc2.jpg",
|
| 65 |
+
ground_truth="Au revoir",
|
| 66 |
+
hypothesis="Au revoir!",
|
| 67 |
+
metrics=MetricsResult(
|
| 68 |
+
cer=0.05,
|
| 69 |
+
cer_nfc=0.05,
|
| 70 |
+
cer_caseless=0.05,
|
| 71 |
+
wer=0.1,
|
| 72 |
+
wer_normalized=0.1,
|
| 73 |
+
mer=0.05,
|
| 74 |
+
wil=0.1,
|
| 75 |
+
reference_length=9,
|
| 76 |
+
hypothesis_length=10,
|
| 77 |
+
),
|
| 78 |
+
duration_seconds=2.0,
|
| 79 |
+
)
|
| 80 |
+
|
| 81 |
+
# Document 1, moteur B
|
| 82 |
+
dr_b_1 = DocumentResult(
|
| 83 |
+
doc_id="doc1",
|
| 84 |
+
image_path="/fixtures/doc1.jpg",
|
| 85 |
+
ground_truth="Bonjour le monde",
|
| 86 |
+
hypothesis="Bonjour Ie monde", # I capital au lieu de l minuscule
|
| 87 |
+
metrics=MetricsResult(
|
| 88 |
+
cer=0.0625,
|
| 89 |
+
cer_nfc=0.0625,
|
| 90 |
+
cer_caseless=0.0,
|
| 91 |
+
wer=0.333333,
|
| 92 |
+
wer_normalized=0.333333,
|
| 93 |
+
mer=0.0625,
|
| 94 |
+
wil=0.111111,
|
| 95 |
+
reference_length=16,
|
| 96 |
+
hypothesis_length=16,
|
| 97 |
+
),
|
| 98 |
+
duration_seconds=2.5,
|
| 99 |
+
)
|
| 100 |
+
dr_b_2 = DocumentResult(
|
| 101 |
+
doc_id="doc2",
|
| 102 |
+
image_path="/fixtures/doc2.jpg",
|
| 103 |
+
ground_truth="Au revoir",
|
| 104 |
+
hypothesis="Au revoir",
|
| 105 |
+
metrics=MetricsResult(
|
| 106 |
+
cer=0.0,
|
| 107 |
+
cer_nfc=0.0,
|
| 108 |
+
cer_caseless=0.0,
|
| 109 |
+
wer=0.0,
|
| 110 |
+
wer_normalized=0.0,
|
| 111 |
+
mer=0.0,
|
| 112 |
+
wil=0.0,
|
| 113 |
+
reference_length=9,
|
| 114 |
+
hypothesis_length=9,
|
| 115 |
+
),
|
| 116 |
+
duration_seconds=1.8,
|
| 117 |
+
)
|
| 118 |
+
|
| 119 |
+
report_a = EngineReport(
|
| 120 |
+
engine_name="engine_alpha",
|
| 121 |
+
engine_version="1.0.0",
|
| 122 |
+
engine_config={"lang": "fra"},
|
| 123 |
+
document_results=[dr_a_1, dr_a_2],
|
| 124 |
+
)
|
| 125 |
+
report_b = EngineReport(
|
| 126 |
+
engine_name="engine_beta",
|
| 127 |
+
engine_version="2.1.3",
|
| 128 |
+
engine_config={"lang": "fra"},
|
| 129 |
+
document_results=[dr_b_1, dr_b_2],
|
| 130 |
+
)
|
| 131 |
+
|
| 132 |
+
bench = BenchmarkResult(
|
| 133 |
+
corpus_name="test_corpus_s5",
|
| 134 |
+
corpus_source="/fixtures/corpus.zip",
|
| 135 |
+
document_count=2,
|
| 136 |
+
engine_reports=[report_a, report_b],
|
| 137 |
+
run_date="2026-05-09T00:00:00+00:00", # forcée déterministe
|
| 138 |
+
picarones_version="2.0.0-test",
|
| 139 |
+
metadata={"sprint": "S5", "deterministic": True},
|
| 140 |
+
)
|
| 141 |
+
return bench
|
| 142 |
+
|
| 143 |
+
|
| 144 |
+
# --------------------------------------------------------------------------
|
| 145 |
+
# 1. Stabilité : sérialiser 2 fois doit produire les mêmes bytes
|
| 146 |
+
# --------------------------------------------------------------------------
|
| 147 |
+
|
| 148 |
+
|
| 149 |
+
class TestBenchmarkResultSerializationStability:
|
| 150 |
+
def test_two_serializations_same_bytes(self):
|
| 151 |
+
bench = _build_deterministic_benchmark_result()
|
| 152 |
+
# JSON sérialisation déterministe : ensure_ascii + sort_keys
|
| 153 |
+
# via json.dumps explicite.
|
| 154 |
+
s1 = json.dumps(
|
| 155 |
+
bench.as_dict(), ensure_ascii=False, sort_keys=True, indent=2,
|
| 156 |
+
)
|
| 157 |
+
s2 = json.dumps(
|
| 158 |
+
bench.as_dict(), ensure_ascii=False, sort_keys=True, indent=2,
|
| 159 |
+
)
|
| 160 |
+
assert s1 == s2, "BenchmarkResult.as_dict instable entre 2 appels"
|
| 161 |
+
|
| 162 |
+
def test_serialization_via_to_json_stable(self, tmp_path):
|
| 163 |
+
bench = _build_deterministic_benchmark_result()
|
| 164 |
+
path1 = bench.to_json(tmp_path / "bench1.json")
|
| 165 |
+
path2 = bench.to_json(tmp_path / "bench2.json")
|
| 166 |
+
# Les deux fichiers doivent avoir le même contenu byte-pour-byte
|
| 167 |
+
b1 = path1.read_bytes()
|
| 168 |
+
b2 = path2.read_bytes()
|
| 169 |
+
assert b1 == b2, "to_json non déterministe entre 2 écritures"
|
| 170 |
+
|
| 171 |
+
|
| 172 |
+
# --------------------------------------------------------------------------
|
| 173 |
+
# 2. Snapshot golden
|
| 174 |
+
# --------------------------------------------------------------------------
|
| 175 |
+
|
| 176 |
+
|
| 177 |
+
class TestBenchmarkResultGoldenSnapshot:
|
| 178 |
+
def test_matches_golden_fixture(self):
|
| 179 |
+
bench = _build_deterministic_benchmark_result()
|
| 180 |
+
# Sérialisation canonique avec sort_keys pour stabilité
|
| 181 |
+
actual = json.dumps(
|
| 182 |
+
bench.as_dict(), ensure_ascii=False, sort_keys=True, indent=2,
|
| 183 |
+
)
|
| 184 |
+
|
| 185 |
+
if not GOLDEN_PATH.exists():
|
| 186 |
+
# Premier run : on crée le snapshot et on échoue
|
| 187 |
+
# explicitement pour forcer l'opérateur à commit.
|
| 188 |
+
GOLDEN_PATH.parent.mkdir(parents=True, exist_ok=True)
|
| 189 |
+
GOLDEN_PATH.write_text(actual + "\n", encoding="utf-8")
|
| 190 |
+
pytest.fail(
|
| 191 |
+
f"Snapshot golden créé dans {GOLDEN_PATH} — "
|
| 192 |
+
"vérifier le contenu et commit le fichier."
|
| 193 |
+
)
|
| 194 |
+
|
| 195 |
+
expected = GOLDEN_PATH.read_text(encoding="utf-8").rstrip("\n")
|
| 196 |
+
assert actual == expected, (
|
| 197 |
+
f"Snapshot divergeant. Golden: {GOLDEN_PATH}.\n"
|
| 198 |
+
"Si le changement est intentionnel, supprimer le golden et "
|
| 199 |
+
"relancer le test pour le régénérer."
|
| 200 |
+
)
|
| 201 |
+
|
| 202 |
+
|
| 203 |
+
# --------------------------------------------------------------------------
|
| 204 |
+
# 3. Structure invariante : les clés de premier niveau ne changent pas
|
| 205 |
+
# --------------------------------------------------------------------------
|
| 206 |
+
|
| 207 |
+
|
| 208 |
+
class TestBenchmarkResultTopLevelKeys:
|
| 209 |
+
"""Les clés top-level du JSON font partie de l'API publique
|
| 210 |
+
(consommée par les rapports HTML, l'export CSV…). Les changer
|
| 211 |
+
sans préavis casse les consommateurs."""
|
| 212 |
+
|
| 213 |
+
def test_top_level_keys_preserved(self):
|
| 214 |
+
bench = _build_deterministic_benchmark_result()
|
| 215 |
+
d = bench.as_dict()
|
| 216 |
+
|
| 217 |
+
expected_keys = {
|
| 218 |
+
"picarones_version",
|
| 219 |
+
"run_date",
|
| 220 |
+
"corpus",
|
| 221 |
+
"ranking",
|
| 222 |
+
"engine_reports",
|
| 223 |
+
"metadata",
|
| 224 |
+
}
|
| 225 |
+
actual_keys = set(d.keys())
|
| 226 |
+
# Toutes les clés requises présentes
|
| 227 |
+
missing = expected_keys - actual_keys
|
| 228 |
+
assert not missing, (
|
| 229 |
+
f"Clés top-level manquantes dans BenchmarkResult.as_dict: {missing}"
|
| 230 |
+
)
|
| 231 |
+
|
| 232 |
+
def test_corpus_substructure_keys(self):
|
| 233 |
+
bench = _build_deterministic_benchmark_result()
|
| 234 |
+
d = bench.as_dict()
|
| 235 |
+
corpus = d["corpus"]
|
| 236 |
+
assert "name" in corpus
|
| 237 |
+
assert "source" in corpus
|
| 238 |
+
assert "document_count" in corpus
|
|
@@ -0,0 +1,195 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint S5 — Simulation de disque plein (ENOSPC).
|
| 2 |
+
|
| 3 |
+
Vérifie la robustesse du chemin "écriture sur disque" face à un
|
| 4 |
+
``OSError(28, 'No space left on device')``.
|
| 5 |
+
|
| 6 |
+
Cas couverts :
|
| 7 |
+
|
| 8 |
+
- ``partial_store._save_partial_line`` doit logger un warning et NE
|
| 9 |
+
PAS lever (le benchmark continue, on ne casse pas tout pour une
|
| 10 |
+
ligne perdue).
|
| 11 |
+
- ``BenchmarkResult.to_json`` doit propager l'OSError (l'utilisateur
|
| 12 |
+
veut savoir que le rapport n'a pas pu être écrit).
|
| 13 |
+
- Aucun fichier corrompu / partiel n'est laissé.
|
| 14 |
+
"""
|
| 15 |
+
|
| 16 |
+
from __future__ import annotations
|
| 17 |
+
|
| 18 |
+
import errno
|
| 19 |
+
import os
|
| 20 |
+
import json
|
| 21 |
+
from pathlib import Path
|
| 22 |
+
from unittest.mock import patch
|
| 23 |
+
|
| 24 |
+
import pytest
|
| 25 |
+
|
| 26 |
+
|
| 27 |
+
def _enospc_oserror():
|
| 28 |
+
"""Construit un OSError(ENOSPC) prêt à utiliser comme side_effect."""
|
| 29 |
+
return OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
|
| 30 |
+
|
| 31 |
+
|
| 32 |
+
# --------------------------------------------------------------------------
|
| 33 |
+
# 1. partial_store._save_partial_line absorbe ENOSPC
|
| 34 |
+
# --------------------------------------------------------------------------
|
| 35 |
+
|
| 36 |
+
|
| 37 |
+
class TestPartialStoreEnospcAbsorbed:
|
| 38 |
+
"""Quand le disque est plein, on ne veut pas casser un
|
| 39 |
+
benchmark de 1000 docs juste parce que le partial_dir est full :
|
| 40 |
+
``_save_partial_line`` log warning et retourne."""
|
| 41 |
+
|
| 42 |
+
def test_save_partial_line_enospc_logs_warning_no_raise(
|
| 43 |
+
self, tmp_path, caplog,
|
| 44 |
+
):
|
| 45 |
+
from picarones.app.services.partial_store import _save_partial_line
|
| 46 |
+
from picarones.evaluation.benchmark_result import DocumentResult
|
| 47 |
+
from picarones.evaluation.metric_result import MetricsResult
|
| 48 |
+
|
| 49 |
+
partial_path = tmp_path / "p.partial.jsonl"
|
| 50 |
+
doc = DocumentResult(
|
| 51 |
+
doc_id="d1",
|
| 52 |
+
image_path="/a/b.jpg",
|
| 53 |
+
ground_truth="x",
|
| 54 |
+
hypothesis="x",
|
| 55 |
+
metrics=MetricsResult(reference_length=1, hypothesis_length=1),
|
| 56 |
+
duration_seconds=0.1,
|
| 57 |
+
)
|
| 58 |
+
|
| 59 |
+
# Patch ``open`` pour lever ENOSPC à l'ouverture en append.
|
| 60 |
+
original_open = Path.open
|
| 61 |
+
|
| 62 |
+
def _open_with_enospc(self, mode="r", *args, **kwargs):
|
| 63 |
+
if "a" in mode and self == partial_path:
|
| 64 |
+
raise _enospc_oserror()
|
| 65 |
+
return original_open(self, mode, *args, **kwargs)
|
| 66 |
+
|
| 67 |
+
with patch.object(Path, "open", _open_with_enospc):
|
| 68 |
+
with caplog.at_level("WARNING"):
|
| 69 |
+
# Ne doit PAS lever
|
| 70 |
+
_save_partial_line(partial_path, doc)
|
| 71 |
+
|
| 72 |
+
# Le warning a été loggé
|
| 73 |
+
assert any(
|
| 74 |
+
"partial_dir" in rec.message or "impossible" in rec.message.lower()
|
| 75 |
+
for rec in caplog.records
|
| 76 |
+
)
|
| 77 |
+
# Aucun fichier partiel n'a été créé (open a échoué avant écriture)
|
| 78 |
+
assert not partial_path.exists()
|
| 79 |
+
|
| 80 |
+
|
| 81 |
+
# --------------------------------------------------------------------------
|
| 82 |
+
# 2. _delete_partial absorbe ENOSPC
|
| 83 |
+
# --------------------------------------------------------------------------
|
| 84 |
+
|
| 85 |
+
|
| 86 |
+
class TestDeletePartialEnospcAbsorbed:
|
| 87 |
+
def test_delete_partial_oserror_logs_warning(self, tmp_path, caplog):
|
| 88 |
+
from picarones.app.services.partial_store import _delete_partial
|
| 89 |
+
|
| 90 |
+
# Créer un fichier réel
|
| 91 |
+
partial_path = tmp_path / "p.partial.jsonl"
|
| 92 |
+
partial_path.write_text('{"doc_id": "x"}\n', encoding="utf-8")
|
| 93 |
+
|
| 94 |
+
with patch.object(Path, "unlink", side_effect=_enospc_oserror()):
|
| 95 |
+
with caplog.at_level("WARNING"):
|
| 96 |
+
# Ne lève pas
|
| 97 |
+
_delete_partial(partial_path)
|
| 98 |
+
|
| 99 |
+
# Le warning est loggé
|
| 100 |
+
assert any(
|
| 101 |
+
"partial_dir" in rec.message or "impossible" in rec.message.lower()
|
| 102 |
+
for rec in caplog.records
|
| 103 |
+
)
|
| 104 |
+
|
| 105 |
+
|
| 106 |
+
# --------------------------------------------------------------------------
|
| 107 |
+
# 3. BenchmarkResult.to_json sur disque plein
|
| 108 |
+
# --------------------------------------------------------------------------
|
| 109 |
+
|
| 110 |
+
|
| 111 |
+
class TestBenchmarkResultToJsonEnospc:
|
| 112 |
+
"""``to_json`` ouvre un fichier et écrit en JSON. Sur ENOSPC,
|
| 113 |
+
on doit propager l'OSError (l'utilisateur veut le savoir, le
|
| 114 |
+
rapport est critique). Et aucun fichier corrompu ne doit
|
| 115 |
+
rester sur disque (le file handler ferme automatiquement, mais
|
| 116 |
+
on vérifie qu'aucun .json tronqué ne pollue le résultat).
|
| 117 |
+
"""
|
| 118 |
+
|
| 119 |
+
def test_to_json_enospc_propagates_and_no_garbage(self, tmp_path):
|
| 120 |
+
from picarones.evaluation.benchmark_result import (
|
| 121 |
+
BenchmarkResult,
|
| 122 |
+
EngineReport,
|
| 123 |
+
DocumentResult,
|
| 124 |
+
)
|
| 125 |
+
from picarones.evaluation.metric_result import MetricsResult
|
| 126 |
+
|
| 127 |
+
dr = DocumentResult(
|
| 128 |
+
doc_id="d1",
|
| 129 |
+
image_path="/a/b.jpg",
|
| 130 |
+
ground_truth="x",
|
| 131 |
+
hypothesis="x",
|
| 132 |
+
metrics=MetricsResult(reference_length=1, hypothesis_length=1),
|
| 133 |
+
duration_seconds=0.1,
|
| 134 |
+
)
|
| 135 |
+
report = EngineReport(
|
| 136 |
+
engine_name="e",
|
| 137 |
+
engine_version="1",
|
| 138 |
+
engine_config={},
|
| 139 |
+
document_results=[dr],
|
| 140 |
+
)
|
| 141 |
+
bench = BenchmarkResult(
|
| 142 |
+
corpus_name="c",
|
| 143 |
+
corpus_source=None,
|
| 144 |
+
document_count=1,
|
| 145 |
+
engine_reports=[report],
|
| 146 |
+
)
|
| 147 |
+
|
| 148 |
+
out = tmp_path / "rapport.json"
|
| 149 |
+
|
| 150 |
+
# Patch json.dump pour lever ENOSPC pendant l'écriture
|
| 151 |
+
# (simule un disque qui se remplit pendant l'écriture).
|
| 152 |
+
with patch(
|
| 153 |
+
"picarones.evaluation.benchmark_result.json.dump",
|
| 154 |
+
side_effect=_enospc_oserror(),
|
| 155 |
+
):
|
| 156 |
+
with pytest.raises(OSError) as exc_info:
|
| 157 |
+
bench.to_json(out)
|
| 158 |
+
assert exc_info.value.errno == errno.ENOSPC
|
| 159 |
+
|
| 160 |
+
# Le fichier a pu être créé (ouverture en mode "w" précède dump)
|
| 161 |
+
# mais s'il existe il doit être vide (aucune ligne JSON valide).
|
| 162 |
+
if out.exists():
|
| 163 |
+
content = out.read_text(encoding="utf-8")
|
| 164 |
+
# Pas de JSON tronqué : soit vide, soit explicitement
|
| 165 |
+
# incomplet. On ne tolère pas un demi-objet.
|
| 166 |
+
if content:
|
| 167 |
+
# Doit être impossible de parser comme JSON valide
|
| 168 |
+
with pytest.raises(json.JSONDecodeError):
|
| 169 |
+
json.loads(content)
|
| 170 |
+
|
| 171 |
+
|
| 172 |
+
# --------------------------------------------------------------------------
|
| 173 |
+
# 4. Idempotence du delete_partial absent
|
| 174 |
+
# --------------------------------------------------------------------------
|
| 175 |
+
|
| 176 |
+
|
| 177 |
+
class TestDeletePartialAbsent:
|
| 178 |
+
"""Si le fichier n'existe pas, ``_delete_partial`` est un no-op
|
| 179 |
+
silencieux (pas de FileNotFoundError, pas de warning)."""
|
| 180 |
+
|
| 181 |
+
def test_delete_nonexistent_partial_silent_noop(self, tmp_path, caplog):
|
| 182 |
+
from picarones.app.services.partial_store import _delete_partial
|
| 183 |
+
|
| 184 |
+
nonexistent = tmp_path / "absent.partial.jsonl"
|
| 185 |
+
assert not nonexistent.exists()
|
| 186 |
+
|
| 187 |
+
with caplog.at_level("WARNING"):
|
| 188 |
+
_delete_partial(nonexistent)
|
| 189 |
+
|
| 190 |
+
# Pas de warning : c'est un no-op silencieux par contrat
|
| 191 |
+
warnings = [
|
| 192 |
+
r for r in caplog.records
|
| 193 |
+
if r.levelname == "WARNING"
|
| 194 |
+
]
|
| 195 |
+
assert warnings == []
|
|
File without changes
|
|
@@ -0,0 +1,207 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint S4.2 — couverture du router ``/api/history/regressions``.
|
| 2 |
+
|
| 3 |
+
Avant S4 : ``routers/history.py`` à 55%. Lignes non couvertes :
|
| 4 |
+
branche ``engine`` explicite, gestion d'exceptions sur ouverture
|
| 5 |
+
DB et sur ``detect_regression``, filtrage des régressions.
|
| 6 |
+
"""
|
| 7 |
+
|
| 8 |
+
from __future__ import annotations
|
| 9 |
+
|
| 10 |
+
from pathlib import Path
|
| 11 |
+
|
| 12 |
+
import pytest
|
| 13 |
+
|
| 14 |
+
|
| 15 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 16 |
+
# App de test minimaliste
|
| 17 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 18 |
+
|
| 19 |
+
|
| 20 |
+
def _make_app():
|
| 21 |
+
from fastapi import FastAPI
|
| 22 |
+
from picarones.interfaces.web.routers import history as history_router
|
| 23 |
+
|
| 24 |
+
app = FastAPI()
|
| 25 |
+
app.include_router(history_router.router)
|
| 26 |
+
return app
|
| 27 |
+
|
| 28 |
+
|
| 29 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 30 |
+
# 1. Endpoint sans historique — retourne 0 régression
|
| 31 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 32 |
+
|
| 33 |
+
|
| 34 |
+
class TestEmptyHistory:
|
| 35 |
+
def test_no_db_returns_empty_regressions(self, tmp_path: Path) -> None:
|
| 36 |
+
from fastapi.testclient import TestClient
|
| 37 |
+
|
| 38 |
+
app = _make_app()
|
| 39 |
+
# On pointe vers un fichier SQLite qui sera créé vide.
|
| 40 |
+
db_path = tmp_path / "empty.sqlite"
|
| 41 |
+
|
| 42 |
+
with TestClient(app) as client:
|
| 43 |
+
r = client.get(
|
| 44 |
+
"/api/history/regressions",
|
| 45 |
+
params={"db_path": str(db_path)},
|
| 46 |
+
)
|
| 47 |
+
assert r.status_code == 200
|
| 48 |
+
body = r.json()
|
| 49 |
+
assert body["count"] == 0
|
| 50 |
+
assert body["regressions"] == []
|
| 51 |
+
|
| 52 |
+
def test_threshold_default_is_001(self, tmp_path: Path) -> None:
|
| 53 |
+
from fastapi.testclient import TestClient
|
| 54 |
+
|
| 55 |
+
app = _make_app()
|
| 56 |
+
db_path = tmp_path / "empty.sqlite"
|
| 57 |
+
|
| 58 |
+
with TestClient(app) as client:
|
| 59 |
+
r = client.get(
|
| 60 |
+
"/api/history/regressions",
|
| 61 |
+
params={"db_path": str(db_path)},
|
| 62 |
+
)
|
| 63 |
+
assert r.status_code == 200
|
| 64 |
+
body = r.json()
|
| 65 |
+
assert body["threshold"] == 0.01
|
| 66 |
+
|
| 67 |
+
|
| 68 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 69 |
+
# 2. Endpoint avec engine explicite
|
| 70 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 71 |
+
|
| 72 |
+
|
| 73 |
+
class TestExplicitEngine:
|
| 74 |
+
def test_engine_param_filters_targets(self, tmp_path: Path) -> None:
|
| 75 |
+
from fastapi.testclient import TestClient
|
| 76 |
+
|
| 77 |
+
app = _make_app()
|
| 78 |
+
db_path = tmp_path / "engine_filter.sqlite"
|
| 79 |
+
|
| 80 |
+
with TestClient(app) as client:
|
| 81 |
+
r = client.get(
|
| 82 |
+
"/api/history/regressions",
|
| 83 |
+
params={
|
| 84 |
+
"engine": "tesseract",
|
| 85 |
+
"db_path": str(db_path),
|
| 86 |
+
"threshold": 0.05,
|
| 87 |
+
},
|
| 88 |
+
)
|
| 89 |
+
assert r.status_code == 200
|
| 90 |
+
body = r.json()
|
| 91 |
+
assert body["threshold"] == 0.05
|
| 92 |
+
# Aucune régression possible (DB vide) mais l'endpoint
|
| 93 |
+
# ne doit pas crasher.
|
| 94 |
+
assert body["count"] == 0
|
| 95 |
+
|
| 96 |
+
|
| 97 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 98 |
+
# 3. Avec historique simulé qui contient une régression
|
| 99 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 100 |
+
|
| 101 |
+
|
| 102 |
+
class TestHistoryWithRegression:
|
| 103 |
+
@pytest.fixture
|
| 104 |
+
def populated_db(self, tmp_path: Path) -> Path:
|
| 105 |
+
"""Crée une DB historique avec 2 runs tesseract qui régressent."""
|
| 106 |
+
from picarones.evaluation.metrics.history import BenchmarkHistory
|
| 107 |
+
|
| 108 |
+
db = tmp_path / "history.sqlite"
|
| 109 |
+
h = BenchmarkHistory(db_path=str(db))
|
| 110 |
+
# Baseline : CER faible
|
| 111 |
+
h.record_single(
|
| 112 |
+
run_id="baseline_run",
|
| 113 |
+
corpus_name="test_corpus",
|
| 114 |
+
engine_name="tesseract",
|
| 115 |
+
cer_mean=0.05,
|
| 116 |
+
wer_mean=0.10,
|
| 117 |
+
doc_count=10,
|
| 118 |
+
timestamp="2026-01-01T00:00:00+00:00",
|
| 119 |
+
)
|
| 120 |
+
# Actuel : CER plus haut (régression)
|
| 121 |
+
h.record_single(
|
| 122 |
+
run_id="current_run",
|
| 123 |
+
corpus_name="test_corpus",
|
| 124 |
+
engine_name="tesseract",
|
| 125 |
+
cer_mean=0.15,
|
| 126 |
+
wer_mean=0.20,
|
| 127 |
+
doc_count=10,
|
| 128 |
+
timestamp="2026-05-01T00:00:00+00:00",
|
| 129 |
+
)
|
| 130 |
+
return db
|
| 131 |
+
|
| 132 |
+
def test_regression_detected_above_threshold(
|
| 133 |
+
self, populated_db: Path,
|
| 134 |
+
) -> None:
|
| 135 |
+
from fastapi.testclient import TestClient
|
| 136 |
+
|
| 137 |
+
app = _make_app()
|
| 138 |
+
with TestClient(app) as client:
|
| 139 |
+
r = client.get(
|
| 140 |
+
"/api/history/regressions",
|
| 141 |
+
params={
|
| 142 |
+
"db_path": str(populated_db),
|
| 143 |
+
"threshold": 0.01,
|
| 144 |
+
},
|
| 145 |
+
)
|
| 146 |
+
assert r.status_code == 200
|
| 147 |
+
body = r.json()
|
| 148 |
+
# Au moins une régression sur tesseract.
|
| 149 |
+
assert body["count"] >= 1
|
| 150 |
+
assert any(reg["engine"] == "tesseract"
|
| 151 |
+
for reg in body["regressions"])
|
| 152 |
+
# Les champs contractuels du payload sont présents.
|
| 153 |
+
for reg in body["regressions"]:
|
| 154 |
+
assert "delta_cer" in reg
|
| 155 |
+
assert "current_cer" in reg
|
| 156 |
+
assert "baseline_cer" in reg
|
| 157 |
+
assert "is_regression" in reg
|
| 158 |
+
|
| 159 |
+
def test_high_threshold_filters_out_small_regressions(
|
| 160 |
+
self, populated_db: Path,
|
| 161 |
+
) -> None:
|
| 162 |
+
from fastapi.testclient import TestClient
|
| 163 |
+
|
| 164 |
+
app = _make_app()
|
| 165 |
+
with TestClient(app) as client:
|
| 166 |
+
# Seuil 99% : aucune régression < 99 pp.
|
| 167 |
+
r = client.get(
|
| 168 |
+
"/api/history/regressions",
|
| 169 |
+
params={
|
| 170 |
+
"db_path": str(populated_db),
|
| 171 |
+
"threshold": 0.99,
|
| 172 |
+
},
|
| 173 |
+
)
|
| 174 |
+
assert r.status_code == 200
|
| 175 |
+
body = r.json()
|
| 176 |
+
assert body["count"] == 0
|
| 177 |
+
|
| 178 |
+
|
| 179 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 180 |
+
# 4. Erreur d'ouverture DB → 500 propre
|
| 181 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 182 |
+
|
| 183 |
+
|
| 184 |
+
class TestDBErrorHandling:
|
| 185 |
+
def test_db_path_unwritable_returns_500_or_empty(
|
| 186 |
+
self, tmp_path: Path,
|
| 187 |
+
) -> None:
|
| 188 |
+
"""db_path qui pointe sur un répertoire inexistant + non
|
| 189 |
+
créable doit produire une erreur compréhensible (500 ou
|
| 190 |
+
body avec count=0 mais sans crash silencieux)."""
|
| 191 |
+
from fastapi.testclient import TestClient
|
| 192 |
+
|
| 193 |
+
app = _make_app()
|
| 194 |
+
# Chemin qui devrait être impossible à créer (sous /proc).
|
| 195 |
+
impossible_path = "/proc/cannot_write/history.sqlite"
|
| 196 |
+
|
| 197 |
+
with TestClient(app, raise_server_exceptions=False) as client:
|
| 198 |
+
r = client.get(
|
| 199 |
+
"/api/history/regressions",
|
| 200 |
+
params={"db_path": impossible_path},
|
| 201 |
+
)
|
| 202 |
+
# Soit 500 (le bon comportement), soit 200 mais avec
|
| 203 |
+
# count=0. Pas de crash, pas de stack trace au client.
|
| 204 |
+
assert r.status_code in (200, 500)
|
| 205 |
+
if r.status_code == 500:
|
| 206 |
+
body = r.json()
|
| 207 |
+
assert "detail" in body
|
|
@@ -0,0 +1,244 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint S4.3 — couverture des endpoints HTR-United / HuggingFace.
|
| 2 |
+
|
| 3 |
+
Avant S4 : ``routers/importers.py`` à 0% direct (testé
|
| 4 |
+
transitivement par d'autres tests web mais sans ciblage).
|
| 5 |
+
|
| 6 |
+
Cible : 80%+ de couverture des 4 endpoints :
|
| 7 |
+
- ``GET /api/htr-united/catalogue``
|
| 8 |
+
- ``POST /api/htr-united/import``
|
| 9 |
+
- ``GET /api/huggingface/search``
|
| 10 |
+
- ``POST /api/huggingface/import``
|
| 11 |
+
|
| 12 |
+
Mocking : les appels réseau sont mockés ; aucun test n'a besoin
|
| 13 |
+
d'Internet.
|
| 14 |
+
"""
|
| 15 |
+
|
| 16 |
+
from __future__ import annotations
|
| 17 |
+
|
| 18 |
+
from pathlib import Path
|
| 19 |
+
from unittest.mock import MagicMock, patch
|
| 20 |
+
|
| 21 |
+
import pytest
|
| 22 |
+
|
| 23 |
+
|
| 24 |
+
def _make_app():
|
| 25 |
+
from fastapi import FastAPI
|
| 26 |
+
from picarones.interfaces.web.routers import importers as imp_router
|
| 27 |
+
|
| 28 |
+
app = FastAPI()
|
| 29 |
+
app.include_router(imp_router.router)
|
| 30 |
+
return app
|
| 31 |
+
|
| 32 |
+
|
| 33 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 34 |
+
# 1. HTR-United catalogue (GET)
|
| 35 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 36 |
+
|
| 37 |
+
|
| 38 |
+
class TestHTRUnitedCatalogue:
|
| 39 |
+
def test_default_lists_demo_catalogue(self) -> None:
|
| 40 |
+
from fastapi.testclient import TestClient
|
| 41 |
+
|
| 42 |
+
app = _make_app()
|
| 43 |
+
with TestClient(app) as client:
|
| 44 |
+
r = client.get("/api/htr-united/catalogue")
|
| 45 |
+
assert r.status_code == 200
|
| 46 |
+
body = r.json()
|
| 47 |
+
assert "source" in body
|
| 48 |
+
assert "total" in body
|
| 49 |
+
assert "entries" in body
|
| 50 |
+
assert isinstance(body["entries"], list)
|
| 51 |
+
# La démo embarque au moins 1 entrée.
|
| 52 |
+
assert body["total"] >= 1
|
| 53 |
+
# Champs filtres exposés.
|
| 54 |
+
assert "available_languages" in body
|
| 55 |
+
assert "available_scripts" in body
|
| 56 |
+
|
| 57 |
+
def test_query_filter_reduces_results(self) -> None:
|
| 58 |
+
from fastapi.testclient import TestClient
|
| 59 |
+
|
| 60 |
+
app = _make_app()
|
| 61 |
+
with TestClient(app) as client:
|
| 62 |
+
r1 = client.get("/api/htr-united/catalogue").json()
|
| 63 |
+
r2 = client.get(
|
| 64 |
+
"/api/htr-united/catalogue",
|
| 65 |
+
params={"query": "zzzznonexistent"},
|
| 66 |
+
).json()
|
| 67 |
+
assert r2["total"] <= r1["total"]
|
| 68 |
+
# Une recherche bidon → 0 résultat (typiquement).
|
| 69 |
+
assert r2["total"] == 0
|
| 70 |
+
|
| 71 |
+
def test_language_filter_applied(self) -> None:
|
| 72 |
+
from fastapi.testclient import TestClient
|
| 73 |
+
|
| 74 |
+
app = _make_app()
|
| 75 |
+
with TestClient(app) as client:
|
| 76 |
+
# Premier appel : récupérer une langue valide.
|
| 77 |
+
full = client.get("/api/htr-united/catalogue").json()
|
| 78 |
+
available = full.get("available_languages", [])
|
| 79 |
+
if not available:
|
| 80 |
+
pytest.skip("Catalogue démo sans langues — fixture vide")
|
| 81 |
+
lang = available[0]
|
| 82 |
+
r = client.get(
|
| 83 |
+
"/api/htr-united/catalogue",
|
| 84 |
+
params={"language": lang},
|
| 85 |
+
)
|
| 86 |
+
assert r.status_code == 200
|
| 87 |
+
|
| 88 |
+
|
| 89 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 90 |
+
# 2. HTR-United import (POST)
|
| 91 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 92 |
+
|
| 93 |
+
|
| 94 |
+
class TestHTRUnitedImport:
|
| 95 |
+
def test_unknown_entry_id_returns_404(self, tmp_path: Path) -> None:
|
| 96 |
+
from fastapi.testclient import TestClient
|
| 97 |
+
|
| 98 |
+
app = _make_app()
|
| 99 |
+
with TestClient(app) as client:
|
| 100 |
+
r = client.post(
|
| 101 |
+
"/api/htr-united/import",
|
| 102 |
+
json={
|
| 103 |
+
"entry_id": "non_existent_id",
|
| 104 |
+
"output_dir": str(tmp_path),
|
| 105 |
+
"max_samples": 5,
|
| 106 |
+
},
|
| 107 |
+
)
|
| 108 |
+
assert r.status_code == 404
|
| 109 |
+
assert "non trouvée" in r.json()["detail"]
|
| 110 |
+
|
| 111 |
+
def test_known_entry_calls_importer(self, tmp_path: Path) -> None:
|
| 112 |
+
"""Avec un entry_id du catalogue démo, l'endpoint appelle
|
| 113 |
+
``import_htr_united_corpus``. On mocke pour éviter le
|
| 114 |
+
download réel."""
|
| 115 |
+
from fastapi.testclient import TestClient
|
| 116 |
+
|
| 117 |
+
app = _make_app()
|
| 118 |
+
with patch(
|
| 119 |
+
"picarones.adapters.corpus.htr_united.import_htr_united_corpus",
|
| 120 |
+
) as mock_import:
|
| 121 |
+
mock_import.return_value = {"imported": 3, "output_dir": str(tmp_path)}
|
| 122 |
+
|
| 123 |
+
# Récupère un entry_id du catalogue démo.
|
| 124 |
+
with TestClient(app) as client:
|
| 125 |
+
catalog = client.get("/api/htr-united/catalogue").json()
|
| 126 |
+
if not catalog["entries"]:
|
| 127 |
+
pytest.skip("Catalogue démo vide")
|
| 128 |
+
entry_id = catalog["entries"][0]["id"]
|
| 129 |
+
|
| 130 |
+
r = client.post(
|
| 131 |
+
"/api/htr-united/import",
|
| 132 |
+
json={
|
| 133 |
+
"entry_id": entry_id,
|
| 134 |
+
"output_dir": str(tmp_path),
|
| 135 |
+
"max_samples": 3,
|
| 136 |
+
},
|
| 137 |
+
)
|
| 138 |
+
assert r.status_code == 200
|
| 139 |
+
assert mock_import.called
|
| 140 |
+
|
| 141 |
+
|
| 142 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 143 |
+
# 3. HuggingFace search (GET)
|
| 144 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 145 |
+
|
| 146 |
+
|
| 147 |
+
class TestHuggingFaceSearch:
|
| 148 |
+
def test_search_returns_list(self) -> None:
|
| 149 |
+
from fastapi.testclient import TestClient
|
| 150 |
+
|
| 151 |
+
app = _make_app()
|
| 152 |
+
# Mock le HF Hub pour ne pas appeler le vrai réseau.
|
| 153 |
+
with patch(
|
| 154 |
+
"picarones.adapters.corpus.huggingface.HuggingFaceImporter.search",
|
| 155 |
+
) as mock_search:
|
| 156 |
+
fake_dataset = MagicMock()
|
| 157 |
+
fake_dataset.as_dict.return_value = {
|
| 158 |
+
"id": "test/dataset", "tags": ["ocr"], "language": "fr",
|
| 159 |
+
}
|
| 160 |
+
mock_search.return_value = [fake_dataset]
|
| 161 |
+
|
| 162 |
+
with TestClient(app) as client:
|
| 163 |
+
r = client.get(
|
| 164 |
+
"/api/huggingface/search",
|
| 165 |
+
params={"query": "ocr"},
|
| 166 |
+
)
|
| 167 |
+
assert r.status_code == 200
|
| 168 |
+
body = r.json()
|
| 169 |
+
assert body["total"] == 1
|
| 170 |
+
assert body["datasets"][0]["id"] == "test/dataset"
|
| 171 |
+
|
| 172 |
+
def test_search_empty_returns_empty_list(self) -> None:
|
| 173 |
+
from fastapi.testclient import TestClient
|
| 174 |
+
|
| 175 |
+
app = _make_app()
|
| 176 |
+
with patch(
|
| 177 |
+
"picarones.adapters.corpus.huggingface.HuggingFaceImporter.search",
|
| 178 |
+
return_value=[],
|
| 179 |
+
):
|
| 180 |
+
with TestClient(app) as client:
|
| 181 |
+
r = client.get("/api/huggingface/search", params={"query": "x"})
|
| 182 |
+
assert r.status_code == 200
|
| 183 |
+
assert r.json() == {"total": 0, "datasets": []}
|
| 184 |
+
|
| 185 |
+
def test_search_limit_validation(self) -> None:
|
| 186 |
+
"""``limit`` est entre 1 et 50 — au-delà, validation FastAPI."""
|
| 187 |
+
from fastapi.testclient import TestClient
|
| 188 |
+
|
| 189 |
+
app = _make_app()
|
| 190 |
+
with TestClient(app) as client:
|
| 191 |
+
r = client.get("/api/huggingface/search", params={"limit": 100})
|
| 192 |
+
assert r.status_code == 422 # validation pydantic
|
| 193 |
+
|
| 194 |
+
def test_search_tags_parsed_as_list(self) -> None:
|
| 195 |
+
from fastapi.testclient import TestClient
|
| 196 |
+
|
| 197 |
+
app = _make_app()
|
| 198 |
+
with patch(
|
| 199 |
+
"picarones.adapters.corpus.huggingface.HuggingFaceImporter.search",
|
| 200 |
+
) as mock_search:
|
| 201 |
+
mock_search.return_value = []
|
| 202 |
+
with TestClient(app) as client:
|
| 203 |
+
client.get(
|
| 204 |
+
"/api/huggingface/search",
|
| 205 |
+
params={"tags": "ocr,manuscript,medieval"},
|
| 206 |
+
)
|
| 207 |
+
# Vérifie que les tags ont été splitté correctement.
|
| 208 |
+
_, kwargs = mock_search.call_args
|
| 209 |
+
assert kwargs["tags"] == ["ocr", "manuscript", "medieval"]
|
| 210 |
+
|
| 211 |
+
|
| 212 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 213 |
+
# 4. HuggingFace import (POST)
|
| 214 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 215 |
+
|
| 216 |
+
|
| 217 |
+
class TestHuggingFaceImport:
|
| 218 |
+
def test_import_calls_importer(self, tmp_path: Path) -> None:
|
| 219 |
+
from fastapi.testclient import TestClient
|
| 220 |
+
|
| 221 |
+
app = _make_app()
|
| 222 |
+
with patch(
|
| 223 |
+
"picarones.adapters.corpus.huggingface.HuggingFaceImporter.import_dataset",
|
| 224 |
+
) as mock_import:
|
| 225 |
+
mock_import.return_value = {
|
| 226 |
+
"imported": 5,
|
| 227 |
+
"output_dir": str(tmp_path),
|
| 228 |
+
}
|
| 229 |
+
|
| 230 |
+
with TestClient(app) as client:
|
| 231 |
+
r = client.post(
|
| 232 |
+
"/api/huggingface/import",
|
| 233 |
+
json={
|
| 234 |
+
"dataset_id": "test/dataset",
|
| 235 |
+
"output_dir": str(tmp_path),
|
| 236 |
+
"split": "train",
|
| 237 |
+
"max_samples": 5,
|
| 238 |
+
},
|
| 239 |
+
)
|
| 240 |
+
assert r.status_code == 200
|
| 241 |
+
assert mock_import.called
|
| 242 |
+
_, kwargs = mock_import.call_args
|
| 243 |
+
assert kwargs["dataset_id"] == "test/dataset"
|
| 244 |
+
assert kwargs["max_samples"] == 5
|