feat(sprint-D.1.b): engine_to_pipeline_spec + build_adapter_resolver
Sprint D.1.b of plan v2.0: second building block of the
``run_benchmark_via_service`` compatibility adapter. Adds the
``BaseOCREngine`` → ``PipelineSpec`` mapping and the adapter resolver that
will let ``BenchmarkService`` consume legacy engines.
Helpers added
-------------
- ``engine_to_pipeline_spec(engine)``: produces the matching spec:
  - **OCR only** (``is_pipeline=False``) → single-step spec
    ``IMAGE → RAW_TEXT`` with ``adapter_name=engine.name``.
  - **OCRLLMPipeline** (``is_pipeline=True``) → composite spec built via
    ``make_ocr_llm_pipeline_spec`` (commit f894bf0) from the mode, the
    upstream OCR and the LLM; the ``prompt_template`` is passed in
    ``llm_params``.
- ``build_adapter_resolver(engines)``: builds a
  ``Callable[[str], Any]`` that ``PipelineExecutor`` can consume:
  - For a plain OCR engine: registers
    ``LegacyOCREngineExecutor(engine)`` under ``engine.name``.
  - For an ``OCRLLMPipeline``: registers both sub-components
    (the wrapped ``ocr_engine`` and the ``llm_adapter``, which has
    been a native ``StepExecutor`` since Sprint A14-S44).
    The pipeline itself is not registered directly;
    its spec references its sub-steps by their ``adapter_name``.
  - Raises ``PicaronesError`` if two engines share the same
    ``name`` with different instances (collision).
  - Raises ``KeyError`` at call time for an unknown name.
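The resolver contract described above (name → executor, collision detection at build time, ``KeyError`` at call time) can be sketched in isolation. This is a minimal illustration, not the module's code: ``ResolverCollisionError`` and ``build_resolver`` are hypothetical stand-ins for ``PicaronesError`` and ``build_adapter_resolver``, and the executors are plain objects.

```python
from typing import Any, Callable


class ResolverCollisionError(Exception):
    """Hypothetical stand-in for PicaronesError."""


def build_resolver(entries: list[tuple[str, Any]]) -> Callable[[str], Any]:
    """Build a name -> executor lookup that rejects ambiguous names."""
    registry: dict[str, Any] = {}
    for name, executor in entries:
        existing = registry.get(name)
        # Re-registering the very same instance is harmless; a different
        # instance under the same name cannot be disambiguated later.
        if existing is not None and existing is not executor:
            raise ResolverCollisionError(f"name {name!r} registered twice")
        registry[name] = executor

    def resolver(name: str) -> Any:
        if name not in registry:
            raise KeyError(
                f"unknown adapter {name!r}; registered: {sorted(registry)!r}"
            )
        return registry[name]

    return resolver
```

Failing at build time for collisions, and only at call time for unknown names, keeps misconfiguration visible before any document is processed.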
Private helpers
---------------
- ``_ocr_only_to_spec`` (single-step IMAGE → RAW_TEXT).
- ``_ocr_llm_pipeline_to_spec`` (3 modes via the builder).
- ``_llm_adapter_name`` (``provider:model`` format, consistent with
  Sprint B).
- ``_safe_pipeline_name`` (sanitizes a name for ``PipelineSpec.name``).
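The two naming helpers are small enough to illustrate standalone. The sketch below mirrors the behaviour visible in the diff (alphanumerics plus ``_-`` are kept, everything else becomes ``_``, with an ``"engine"`` fallback); the function names without leading underscore and the two-argument form of ``llm_adapter_name`` are illustrative choices, not the module's signatures.

```python
def safe_pipeline_name(name: str) -> str:
    """Replace every character outside alphanumerics, '_' and '-' with '_'."""
    cleaned = "".join(ch if ch.isalnum() or ch in "_-" else "_" for ch in name)
    # Trim leading/trailing underscores; fall back to a fixed name if
    # nothing usable remains.
    return cleaned.strip("_") or "engine"


def llm_adapter_name(provider: str, model: str) -> str:
    """Join provider and model into the 'provider:model' identifier."""
    return f"{provider}:{model}"


print(safe_pipeline_name("weird name (v2)"))  # weird_name__v2
print(safe_pipeline_name("???"))              # engine
print(llm_adapter_name("mock_llm", "mock-1"))  # mock_llm:mock-1
```

Note that ``str.isalnum`` is Unicode-aware, so accented letters pass through unchanged; whether ``PipelineSpec.name`` accepts them is not shown in this commit.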
Architecture
------------
``test_layer_imports_are_legal[layer-app]``: the ``app/`` layer
is not allowed to import ``picarones.pipelines.base`` (legacy).
``_ocr_llm_pipeline_to_spec`` therefore consumes an
``OCRLLMPipeline`` exclusively through **duck typing**
(``is_pipeline``, ``ocr_engine``, ``llm_adapter``, ``mode``,
``prompt_template``). No direct import.
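A rough sketch of what duck typing buys here: the consuming function never imports the pipeline class, it only reads the five attributes listed above. ``FakeMode``, ``FakeOCR``, ``FakePipeline`` and ``describe`` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class FakeMode(Enum):
    """Hypothetical stand-in for the legacy pipeline mode enum."""
    TEXT_ONLY = "text_only"
    ZERO_SHOT = "zero_shot"


@dataclass
class FakeOCR:
    name: str


@dataclass
class FakePipeline:
    llm_adapter: Any
    mode: FakeMode
    ocr_engine: Any = None
    prompt_template: str = "{ocr_text}"
    is_pipeline: bool = True


def describe(engine: Any) -> dict[str, Any]:
    """Branch on attributes only; no isinstance check, no legacy import."""
    if not getattr(engine, "is_pipeline", False):
        return {"kind": "ocr_only", "adapter": engine.name}
    return {
        "kind": "ocr_llm",
        "mode": engine.mode.value,
        "has_ocr": engine.ocr_engine is not None,
        "prompt_template": engine.prompt_template,
    }
```

The trade-off is that a missing attribute only surfaces at run time, which is why the commit covers each mode with a test.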
``test_file_budgets``: entry added for
``picarones/app/services/_legacy_runner_adapter.py`` (budget 575,
currently 498). Transitional module that will be removed in D.6 together
with ``measurements/runner/``.
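For readers unfamiliar with the file-budget idea, a check of this kind might look like the sketch below. Only the ``path → max lines`` mapping shape comes from the diff; ``over_budget`` is a hypothetical helper, and how the real ``test_file_budgets`` counts lines (blank lines, comments) is an assumption.

```python
from pathlib import Path


def over_budget(root: Path, budgets: dict[str, int]) -> dict[str, int]:
    """Return {relative_path: line_count} for files exceeding their budget."""
    offenders: dict[str, int] = {}
    for rel_path, budget in budgets.items():
        text = (root / rel_path).read_text(encoding="utf-8")
        # Assumption: every physical line counts, blank or not.
        line_count = len(text.splitlines())
        if line_count > budget:
            offenders[rel_path] = line_count
    return offenders
```

A test would then assert that ``over_budget(repo_root, FILE_BUDGETS)`` is empty, so growth past the margin fails CI instead of accumulating silently.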
Tests
-----
``tests/app/test_sprint_d_legacy_runner_adapter.py`` extended with
13 new tests (29 in total):
- ``TestEngineToPipelineSpec`` (5 tests):
  - OCR only produces 1 step (``IMAGE`` → ``RAW_TEXT``).
  - ``initial_inputs`` is ``(IMAGE,)``.
  - The spec name is sanitized (safe characters only).
  - ``OCRLLMPipeline`` text_only produces 2 steps (OCR + LLM).
  - ``OCRLLMPipeline`` zero_shot produces 1 step (VLM).
- ``TestBuildAdapterResolver`` (6 tests):
  - A plain engine resolves by its name.
  - Unknown name → ``KeyError``.
  - Several engines coexist.
  - Name collision → ``PicaronesError``.
  - A pipeline registers its sub-components (not the pipeline itself).
  - A zero_shot pipeline registers only its LLM.
- ``TestEngineSpecResolverIntegration`` (2 tests):
  - Every ``adapter_name`` of the spec produced by
    ``engine_to_pipeline_spec`` is resolved by
    ``build_adapter_resolver([engine])``, for a plain OCR engine and
    for a ``text_and_image`` pipeline.
Summary
-------
- ``pytest tests/``: 4788 passed (+15), 0 failed.
- ``ruff check``: clean.
- 1 module extended (262 → 498 LOC), 1 test file extended (16 → 29 tests).
Sprint D.1.c: next step
-----------------------
Conversion of ``RunResult`` (rewrite) → ``BenchmarkResult`` (legacy).
Mapping:
- ``RunDocumentResult.pipeline_results`` × ``EvaluationView`` →
  ``EngineReport.document_results``.
- CER/WER metrics computed via the rewrite ``TextView``.
- Rebuild ``EngineReport.aggregated_metrics`` and
  ``pipeline_info``.
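As background for the CER/WER step, the textbook definitions are an edit distance divided by the reference length, over characters or over whitespace-separated words. The sketch below shows only that definition; the normalization applied by ``TextView`` and the convention for an empty reference are assumptions, not taken from this commit.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (insert, delete, substitute all cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence, hyp: Sequence) -> float:
    # Assumed convention: an empty reference scores 0.0 against an empty
    # hypothesis and 1.0 otherwise.
    if len(ref) == 0:
        return 0.0 if len(hyp) == 0 else 1.0
    return edit_distance(ref, hyp) / len(ref)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate."""
    return error_rate(reference, hypothesis)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate over whitespace-separated tokens."""
    return error_rate(reference.split(), hypothesis.split())
```

Both rates can exceed 1.0 when the hypothesis is much longer than the reference, which matters when aggregating per-document values.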
https://claude.ai/code/session_011XQZNitg1rCgia8ZD1a2hP
@@ -123,7 +123,7 @@ picarones/
 
 ## Test status and historical bugs
 
-`pytest tests/` → **
+`pytest tests/` → **4820 passed, 12 skipped, 8 deselected, 0 failed**
 (post-S59). The deselected tests are those marked `live` (5 integration tests
 against a real API/binary) and `network` (3 tests that hit the real network),
 opt-in locally via `pytest -m live` or `pytest -m network`. The
@@ -253,7 +253,7 @@ Quick summary:
 
 1. `git branch --show-current` → `claude/repo-analysis-cukvm`.
 2. `git status` → working tree clean.
-3. `pytest tests/ -q --no-header --tb=line` →
+3. `pytest tests/ -q --no-header --tb=line` → 4820 passed.
 4. `git log -1 --format=%B` → describes the next sub-phase.
 
 **Critical architecture rules** (learned the hard way):
@@ -341,7 +341,7 @@ detects, arbitrates, renders.
 ## Development context
 
 - **Environment**: GitHub Codespaces, Python 3.11+
-- **Tests**: `pytest tests/ -q` →
+- **Tests**: `pytest tests/ -q` → 4820 passed, 12 skipped, 24
   deselected, 0 failed (at the time the session was paused).
 - **Active evolution plan**: [`docs/roadmap/evolution-2026.md`](docs/roadmap/evolution-2026.md).
 - **Legacy retirement plan (master)**: [`docs/migration/legacy-retirement-plan.md`](docs/migration/legacy-retirement-plan.md).
@@ -395,7 +395,7 @@ ruff check picarones/ tests/
 python -m mypy picarones/core/
 ```
 
-**Test suite**: ~
+**Test suite**: ~4820 tests, ~3 min on a modern laptop. Coverage
 floor at 85% (currently ~87%). The `network` marker excludes tests
 requiring live HTTP. A handful of tests depend on optional engines
 (`pero-ocr`, `pytesseract`) and are skipped/fail gracefully when
@@ -34,16 +34,32 @@ once all the building blocks are in place.
 from __future__ import annotations
 
 from pathlib import Path
-from typing import TYPE_CHECKING
+from typing import TYPE_CHECKING, Any, Callable
 
+from picarones.adapters.legacy_engines._step_executor import (
+    LegacyOCREngineExecutor,
+)
 from picarones.domain.artifacts import ArtifactType
 from picarones.domain.corpus import CorpusSpec
 from picarones.domain.documents import DocumentRef, GroundTruthRef
 from picarones.domain.errors import PicaronesError
+from picarones.domain.pipeline_spec import (
+    INITIAL_STEP_ID,
+    PipelineSpec,
+    PipelineStep,
+)
+from picarones.pipeline.llm_pipeline_builder import make_ocr_llm_pipeline_spec
 
 if TYPE_CHECKING:
+    from picarones.adapters.legacy_engines.base import BaseOCREngine
     from picarones.evaluation.corpus import Corpus, Document
 
+# No direct import of ``picarones.pipelines.base.OCRLLMPipeline`` here:
+# the architectural invariant ``test_layer_imports_are_legal[layer-app]``
+# forbids ``app/`` from depending on legacy code. An
+# ``OCRLLMPipeline`` is consumed exclusively through duck typing (``is_pipeline``,
+# ``ocr_engine``, ``llm_adapter``, ``mode``, ``prompt_template``).
+
 
 # ──────────────────────────────────────────────────────────────────────
 # Mapping Document (legacy) → DocumentRef (rewrite)
@@ -198,11 +214,191 @@ def corpus_to_corpus_spec(
     )
 
 
+# ──────────────────────────────────────────────────────────────────────
+# Mapping BaseOCREngine → PipelineSpec
+# ──────────────────────────────────────────────────────────────────────
+
+
+def engine_to_pipeline_spec(engine: "BaseOCREngine") -> PipelineSpec:
+    """Convert a legacy ``BaseOCREngine`` into a rewrite ``PipelineSpec``.
+
+    Two cases:
+
+    - **OCRLLMPipeline** (``engine.is_pipeline = True``): the composite
+      spec is built via ``make_ocr_llm_pipeline_spec``
+      with the mode (``text_only`` / ``text_and_image`` /
+      ``zero_shot``), the upstream OCR (if any), the LLM, and the
+      prompt template in ``llm_params``.
+    - **OCR only**: single-step spec (IMAGE → RAW_TEXT). The step
+      references ``engine.name``; the caller registers it in the
+      adapter resolver via a ``LegacyOCREngineExecutor(engine)``.
+
+    Parameters
+    ----------
+    engine:
+        Instance of a ``BaseOCREngine`` subclass (Tesseract,
+        Pero, Mistral OCR, Google Vision, Azure DI) or an
+        ``OCRLLMPipeline``.
+
+    Returns
+    -------
+    PipelineSpec
+        Immutable spec consumable by ``BenchmarkService``.
+    """
+    if getattr(engine, "is_pipeline", False):
+        return _ocr_llm_pipeline_to_spec(engine)
+    return _ocr_only_to_spec(engine)
+
+
+def _ocr_only_to_spec(engine: "BaseOCREngine") -> PipelineSpec:
+    """Single-step spec: a plain OCR consuming IMAGE and producing RAW_TEXT."""
+    name = engine.name
+    safe_name = _safe_pipeline_name(name)
+    return PipelineSpec(
+        name=f"ocr_only_{safe_name}",
+        description=f"OCR step seul ({name}) — IMAGE → RAW_TEXT.",
+        initial_inputs=(ArtifactType.IMAGE,),
+        steps=(
+            PipelineStep(
+                id="ocr",
+                kind="ocr",
+                adapter_name=name,
+                input_types=(ArtifactType.IMAGE,),
+                output_types=(ArtifactType.RAW_TEXT,),
+                inputs_from={ArtifactType.IMAGE: INITIAL_STEP_ID},
+            ),
+        ),
+    )
+
+
+def _ocr_llm_pipeline_to_spec(pipeline: Any) -> PipelineSpec:
+    """Composite spec for an ``OCRLLMPipeline`` (3 modes)."""
+    mode = pipeline.mode.value
+    llm_name = _llm_adapter_name(pipeline.llm_adapter)
+    llm_params: dict[str, str | int | float | bool] = {
+        "prompt_template": pipeline.prompt_template,
+    }
+    if mode == "zero_shot":
+        return make_ocr_llm_pipeline_spec(
+            mode="zero_shot",
+            llm_adapter_name=llm_name,
+            llm_params=llm_params,
+        )
+    if pipeline.ocr_engine is None:
+        raise PicaronesError(
+            f"OCRLLMPipeline mode {mode!r} requiert un ocr_engine — "
+            "valeur None inattendue.",
+        )
+    return make_ocr_llm_pipeline_spec(
+        mode=mode,
+        ocr_adapter_name=pipeline.ocr_engine.name,
+        llm_adapter_name=llm_name,
+        llm_params=llm_params,
+    )
+
+
+# ──────────────────────────────────────────────────────────────────────
+# Adapter resolver
+# ──────────────────────────────────────────────────────────────────────
+
+
+def build_adapter_resolver(
+    engines: list["BaseOCREngine"],
+) -> Callable[[str], Any]:
+    """Build an adapter resolver for ``PipelineExecutor``.
+
+    Walks the given engines and maps each ``name`` to a valid
+    ``StepExecutor``:
+
+    - **Plain OCR** (``BaseOCREngine``) → wrapped in a
+      ``LegacyOCREngineExecutor`` (which satisfies the
+      ``StepExecutor`` contract).
+    - **OCRLLMPipeline** → registers both sub-components:
+      ``ocr_engine`` (wrapped) and ``llm_adapter`` (already a
+      native ``StepExecutor`` since Sprint A14-S44). The pipeline
+      itself is not registered directly; its spec
+      references its sub-steps by their ``adapter_name``.
+
+    The returned resolver raises ``KeyError`` when an unknown name is
+    requested.
+
+    Parameters
+    ----------
+    engines:
+        List of legacy engines/pipelines to register.
+
+    Returns
+    -------
+    Callable[[str], Any]
+        Function ``resolver(name) -> step_executor``.
+
+    Raises
+    ------
+    PicaronesError
+        If two engines share the same ``name`` (collision).
+    """
+    name_to_executor: dict[str, Any] = {}
+
+    def _register(name: str, executor: Any) -> None:
+        existing = name_to_executor.get(name)
+        if existing is not None and existing is not executor:
+            raise PicaronesError(
+                f"Adapter resolver : nom {name!r} enregistré "
+                "deux fois avec des instances différentes — "
+                "collision impossible à résoudre.",
+            )
+        name_to_executor[name] = executor
+
+    for engine in engines:
+        if getattr(engine, "is_pipeline", False):
+            # OCRLLMPipeline: register the underlying OCR and LLM.
+            ocr_engine = getattr(engine, "ocr_engine", None)
+            llm_adapter = getattr(engine, "llm_adapter", None)
+            if ocr_engine is not None:
+                _register(ocr_engine.name, LegacyOCREngineExecutor(ocr_engine))
+            if llm_adapter is not None:
+                _register(_llm_adapter_name(llm_adapter), llm_adapter)
+        else:
+            _register(engine.name, LegacyOCREngineExecutor(engine))
+
+    def resolver(name: str) -> Any:
+        if name not in name_to_executor:
+            raise KeyError(
+                f"adapter inconnu pour le resolver legacy : {name!r}. "
+                f"Enregistrés : {sorted(name_to_executor.keys())!r}."
+            )
+        return name_to_executor[name]
+
+    return resolver
+
+
 # ──────────────────────────────────────────────────────────────────────
 # Private helpers
 # ──────────────────────────────────────────────────────────────────────
 
 
+def _llm_adapter_name(llm_adapter: Any) -> str:
+    """Stable ``provider:model`` identifier for an LLM/VLM adapter.
+
+    Same convention as the one used by
+    ``picarones.pipelines._executor_runner`` (Sprint B): the
+    internal adapter resolvers expect this format.
+    """
+    return f"{llm_adapter.name}:{llm_adapter.model}"
+
+
+def _safe_pipeline_name(name: str) -> str:
+    """Convert an arbitrary ``engine.name`` into an identifier suffix
+    valid for ``PipelineSpec.name`` (alphanumerics + ``_-``)."""
+    out: list[str] = []
+    for ch in name:
+        if ch.isalnum() or ch in "_-":
+            out.append(ch)
+        else:
+            out.append("_")
+    return "".join(out).strip("_") or "engine"
+
+
 def _safe_doc_id(doc_id: str) -> str:
     """Coerce a ``Document.doc_id`` to the ``DocumentRef.id`` regex.
 
@@ -297,4 +493,6 @@ def _payload_to_text(level: ArtifactType, payload: object) -> str:
 __all__ = [
     "document_to_document_ref",
     "corpus_to_corpus_spec",
+    "engine_to_pipeline_spec",
+    "build_adapter_resolver",
 ]
@@ -17,12 +17,20 @@ from pathlib import Path
 
 import pytest
 
+from picarones.adapters.legacy_engines._step_executor import (
+    LegacyOCREngineExecutor,
+)
+from picarones.adapters.legacy_engines.base import BaseOCREngine
+from picarones.adapters.llm.base import BaseLLMAdapter
 from picarones.app.services._legacy_runner_adapter import (
+    build_adapter_resolver,
     corpus_to_corpus_spec,
     document_to_document_ref,
+    engine_to_pipeline_spec,
 )
 from picarones.domain.artifacts import ArtifactType
 from picarones.domain.errors import PicaronesError
+from picarones.domain.pipeline_spec import INITIAL_STEP_ID
 from picarones.evaluation.corpus import (
     AltoGT,
     Corpus,
@@ -34,6 +42,43 @@ from picarones.evaluation.corpus import (
 )
 
 
+# ──────────────────────────────────────────────────────────────────────
+# Mocks reused for D.1.b
+# ──────────────────────────────────────────────────────────────────────
+
+
+class _MockOCR(BaseOCREngine):
+    def __init__(self, name: str = "mock_ocr") -> None:
+        super().__init__(config={})
+        self._name = name
+
+    @property
+    def name(self) -> str:  # type: ignore[override]
+        return self._name
+
+    def version(self) -> str:
+        return "1.0"
+
+    def _run_ocr(self, image_path):
+        return "ocr text"
+
+
+class _MockLLM(BaseLLMAdapter):
+    def __init__(self, model: str = "mock-1") -> None:
+        super().__init__(model=model, config={})
+
+    @property
+    def name(self) -> str:
+        return "mock_llm"
+
+    @property
+    def default_model(self) -> str:
+        return "mock-1"
+
+    def _call(self, prompt, image_b64=None):
+        return "corrected"
+
+
 # ──────────────────────────────────────────────────────────────────────
 # document_to_document_ref
 # ──────────────────────────────────────────────────────────────────────
@@ -312,3 +357,172 @@ class TestCorpusToCorpusSpec:
         corpus = Corpus(name="dup", documents=docs)
         with pytest.raises(CorpusSpecError, match="dupliqu"):
             corpus_to_corpus_spec(corpus, workspace_dir=tmp_path)
+
+
+# ──────────────────────────────────────────────────────────────────────
+# engine_to_pipeline_spec
+# ──────────────────────────────────────────────────────────────────────
+
+
+class TestEngineToPipelineSpec:
+    def test_ocr_only_produces_single_step_spec(self) -> None:
+        ocr = _MockOCR(name="my_ocr")
+        spec = engine_to_pipeline_spec(ocr)
+        assert len(spec.steps) == 1
+        step = spec.steps[0]
+        assert step.id == "ocr"
+        assert step.kind == "ocr"
+        assert step.adapter_name == "my_ocr"
+        assert ArtifactType.IMAGE in step.input_types
+        assert ArtifactType.RAW_TEXT in step.output_types
+        assert step.inputs_from[ArtifactType.IMAGE] == INITIAL_STEP_ID
+
+    def test_ocr_only_initial_inputs_is_image(self) -> None:
+        ocr = _MockOCR()
+        spec = engine_to_pipeline_spec(ocr)
+        assert spec.initial_inputs == (ArtifactType.IMAGE,)
+
+    def test_ocr_only_name_is_safe(self) -> None:
+        """An engine.name with special characters still yields a
+        conforming spec.name."""
+        ocr = _MockOCR(name="weird name (v2)")
+        spec = engine_to_pipeline_spec(ocr)
+        # The spec name must contain only allowed characters.
+        for ch in spec.name:
+            assert ch.isalnum() or ch in "_-"
+
+    def test_ocr_llm_pipeline_text_only(self) -> None:
+        from picarones.pipelines.base import OCRLLMPipeline, PipelineMode
+
+        ocr = _MockOCR(name="upstream_ocr")
+        llm = _MockLLM(model="mock-1")
+        pipeline = OCRLLMPipeline(
+            ocr_engine=ocr,
+            llm_adapter=llm,
+            mode=PipelineMode.TEXT_ONLY,
+        )
+        spec = engine_to_pipeline_spec(pipeline)
+        # Composite spec: 2 steps (OCR + LLM).
+        assert len(spec.steps) == 2
+        assert spec.steps[0].adapter_name == "upstream_ocr"
+        assert spec.steps[1].adapter_name == "mock_llm:mock-1"
+        # The LLM step receives the prompt template via params.
+        assert "prompt_template" in spec.steps[1].params
+
+    def test_ocr_llm_pipeline_zero_shot_no_ocr_step(self) -> None:
+        from picarones.pipelines.base import OCRLLMPipeline, PipelineMode
+
+        llm = _MockLLM(model="vlm-1")
+        pipeline = OCRLLMPipeline(
+            llm_adapter=llm,
+            mode=PipelineMode.ZERO_SHOT,
+        )
+        spec = engine_to_pipeline_spec(pipeline)
+        # A single step (VLM).
+        assert len(spec.steps) == 1
+        assert spec.steps[0].adapter_name == "mock_llm:vlm-1"
+        assert ArtifactType.RAW_TEXT in spec.steps[0].output_types
+
+
+# ──────────────────────────────────────────────────────────────────────
+# build_adapter_resolver
+# ──────────────────────────────────────────────────────────────────────
+
+
+class TestBuildAdapterResolver:
+    def test_single_ocr_engine_registered(self) -> None:
+        ocr = _MockOCR(name="my_ocr")
+        resolver = build_adapter_resolver([ocr])
+        step = resolver("my_ocr")
+        assert isinstance(step, LegacyOCREngineExecutor)
+
+    def test_unknown_name_raises_keyerror(self) -> None:
+        ocr = _MockOCR()
+        resolver = build_adapter_resolver([ocr])
+        with pytest.raises(KeyError, match="adapter inconnu"):
+            resolver("unknown_engine")
+
+    def test_multiple_engines_registered(self) -> None:
+        ocr_a = _MockOCR(name="engine_a")
+        ocr_b = _MockOCR(name="engine_b")
+        resolver = build_adapter_resolver([ocr_a, ocr_b])
+        step_a = resolver("engine_a")
+        step_b = resolver("engine_b")
+        assert isinstance(step_a, LegacyOCREngineExecutor)
+        assert isinstance(step_b, LegacyOCREngineExecutor)
+
+    def test_collision_on_same_name_raises(self) -> None:
+        """Two engines with the same name → PicaronesError (the resolver
+        cannot tell the two instances apart)."""
+        ocr_a = _MockOCR(name="dup")
+        ocr_b = _MockOCR(name="dup")  # same name, different instance
+        with pytest.raises(PicaronesError, match="enregistré"):
+            build_adapter_resolver([ocr_a, ocr_b])
+
+    def test_pipeline_registers_subcomponents(self) -> None:
+        """For an OCRLLMPipeline, the resolver registers the underlying
+        OCR (wrapped) and the LLM (already a StepExecutor),
+        not the pipeline itself."""
+        from picarones.pipelines.base import OCRLLMPipeline, PipelineMode
+
+        ocr = _MockOCR(name="inner_ocr")
+        llm = _MockLLM(model="mock-1")
+        pipeline = OCRLLMPipeline(
+            ocr_engine=ocr,
+            llm_adapter=llm,
+            mode=PipelineMode.TEXT_ONLY,
+        )
+        resolver = build_adapter_resolver([pipeline])
+        # The sub-components are available…
+        assert isinstance(resolver("inner_ocr"), LegacyOCREngineExecutor)
+        assert resolver("mock_llm:mock-1") is llm
+        # …but not the pipeline itself under its own name (the spec
+        # references steps by adapter_name, not by engine).
+        with pytest.raises(KeyError):
+            resolver(pipeline.name)
+
+    def test_zero_shot_pipeline_only_registers_llm(self) -> None:
+        """In zero_shot, ocr_engine=None → only the LLM is registered."""
+        from picarones.pipelines.base import OCRLLMPipeline, PipelineMode
+
+        llm = _MockLLM(model="vlm-1")
+        pipeline = OCRLLMPipeline(
+            llm_adapter=llm,
+            mode=PipelineMode.ZERO_SHOT,
+        )
+        resolver = build_adapter_resolver([pipeline])
+        assert resolver("mock_llm:vlm-1") is llm
+
+
+# ──────────────────────────────────────────────────────────────────────
+# Integration: engine_to_pipeline_spec + build_adapter_resolver
+# ──────────────────────────────────────────────────────────────────────
+
+
+class TestEngineSpecResolverIntegration:
+    def test_spec_adapter_names_resolve(self) -> None:
+        """Every ``adapter_name`` of the spec produced by
+        ``engine_to_pipeline_spec`` must be resolvable by
+        ``build_adapter_resolver([engine])``."""
+        ocr = _MockOCR(name="resolved_ocr")
+        spec = engine_to_pipeline_spec(ocr)
+        resolver = build_adapter_resolver([ocr])
+        for step in spec.steps:
+            executor = resolver(step.adapter_name)
+            assert executor is not None
+
+    def test_pipeline_spec_resolvers_all_steps(self) -> None:
+        from picarones.pipelines.base import OCRLLMPipeline, PipelineMode
+
+        ocr = _MockOCR(name="upstream")
+        llm = _MockLLM(model="mock-1")
+        pipeline = OCRLLMPipeline(
+            ocr_engine=ocr,
+            llm_adapter=llm,
+            mode=PipelineMode.TEXT_AND_IMAGE,
+        )
+        spec = engine_to_pipeline_spec(pipeline)
+        resolver = build_adapter_resolver([pipeline])
+        # Both steps (OCR + LLM) must be resolvable.
+        for step in spec.steps:
+            assert resolver(step.adapter_name) is not None
@@ -38,6 +38,10 @@ FILE_BUDGETS: dict[str, int] = {
     # Will be removed in Sprint C-D once callers consume
     # PipelineSpec directly.
     "picarones/pipelines/_executor_runner.py": 470,  # current 410
+    # Sprint D.1 (plan v2.0): compat adapter from legacy run_benchmark
+    # to the rewrite BenchmarkService. Transitional module that will be
+    # removed in D.6 together with measurements/runner/.
+    "picarones/app/services/_legacy_runner_adapter.py": 575,  # current 498
     # --- God-modules: current budget + 15 % margin.
     # Shrinking them will be the subject of a dedicated refactor sprint.
     # statistics.py (1128 lines) was split into a sub-package