Spaces:
Running
feat(pipeline): Sprint A14-S8 — CorpusRunner backpressure + timeout réel + cancellation
Browse filesSprint S8 du plan rewrite ciblé. Orchestre l'exécution
corpus-wide d'une PipelineSpec avec trois propriétés que
l'ancien ``measurements.runner`` ne garantissait pas :
1. Backpressure (jamais plus de ``max_in_flight`` futures en vol)
2. Timeout depuis le **début d'exécution réelle**, pas depuis la
submission au pool
3. Annulation propre via ``threading.Event`` partagé
Bug critique de l'ancien runner corrigé
---------------------------------------
L'ancien runner mesurait le timeout depuis la submission au pool
(cf. ``submitted_at`` dans ``measurements/runner/orchestration.py``
ligne 236). Conséquence : un document pouvait être marqué
``timeout`` parce qu'il avait passé N secondes en queue, pas N
secondes en train de tourner.
Nouvelle implémentation : chaque worker écrit son ``started_at =
time.monotonic()`` dans un dict partagé en première instruction
de ``_run_one``. L'orchestrateur lit ce dict pour décider si un
doc doit être marqué timeout. Si un doc est encore en queue
(``doc.id not in started_at``), il n'est jamais marqué timeout —
seulement candidat à cancellation propre.
Modules livrés
--------------
``picarones/pipeline/runner.py``
- ``CorpusRunner(executor, max_in_flight=4,
timeout_seconds_per_doc=300.0, poll_interval_seconds=0.05)``.
- ``run(spec, documents, initial_inputs_factory,
context_factory, corpus_name, cancel_event)``.
- Pool ``ThreadPoolExecutor`` instancié explicitement (pas de
context manager) avec ``shutdown(wait=False,
cancel_futures=True)`` à la sortie : le caller récupère le
résultat immédiatement sur cancel/timeout, les threads en
cours continuent en arrière-plan jusqu'à leur fin naturelle.
- Backpressure : ``_submit_next()`` appelé à chaque libération
pour maintenir ``in_flight <= max_in_flight``.
- Polling court (50ms par défaut) entre les ``concurrent.futures.wait``
pour vérifier les timeouts en parallèle des completions naturelles.
- ``DocumentOutcome`` : statut parmi
``succeeded`` / ``failed`` / ``timed_out`` / ``cancelled``,
avec ``pipeline_result`` conservé quand pertinent et ``error``
explicite sinon.
- ``CorpusRunResult`` : agrégation cohérente
(``n_succeeded + n_failed + n_timed_out + n_cancelled
<= n_documents``, le delta éventuel = jamais lancé en cas de
cancel précoce).
Limites assumées (à lever post-S8)
----------------------------------
- **Mode threads uniquement**. ``ProcessPoolExecutor`` arrive au
S11 quand on déplacera les adapters CPU-bound (Tesseract, Pero).
Les LLM/OCR cloud sont IO-bound → threads OK.
- **Pas de kill-thread garanti**. Python ne permet pas de tuer un
thread. Si un adapter ne coopère pas avec ``cancel_event`` et
fait un appel C bloquant, le thread continue. Documenté.
Tests — 16 nouveaux tests (4 fichiers)
--------------------------------------
``test_sprint_a14_s8_backpressure.py`` (5)
Adapter ``_ConcurrencyTrackingAdapter`` avec compteur partagé
qui mesure la concurrence observée pendant ses ``execute()``.
Vérifie ``max_observed <= max_in_flight`` ET
``max_observed == max_in_flight`` (preuve qu'on parallélise
vraiment). Paramétré sur 1, 2, 4 workers. Plus :
``max_in_flight=1 → mode séquentiel``, corpus vide, valeurs
invalides rejetées (``max_in_flight=0``, ``timeout=0``).
``test_sprint_a14_s8_timeout.py`` (3)
- Step qui dort 500ms, timeout 100ms → ``status="timed_out"``,
runner rend la main en < 300ms (preuve qu'il ne bloque pas
sur le sleep complet).
- **Bug historique** : 4 docs en série avec 1 worker, sleep 50ms
chacun, timeout 500ms → tous succèdent (ancien runner aurait
marqué les derniers docs timeout à cause de la queue).
- Mix rapides/lents, vérifie que seuls les lents timeout.
``test_sprint_a14_s8_cancellation.py`` (3)
- Cancel signalé avant le run → 0 succès.
- Cancel pendant l'exécution → docs en attente cancellés, doc en
cours se termine ; runner rend la main rapidement.
- ``CorpusRunResult`` reste cohérent même en cancel.
``test_sprint_a14_s8_def_of_done.py`` (2)
- Critère de done scaled-down : 200 docs synthétiques en < 60s,
croissance RAM < 200MB. L'objectif réel "1000 docs / 10 min /
500MB" est largement atteint avec ces stubs.
- Throughput : 100 docs avec ``max_in_flight=4`` en < 5s.
Mise à jour de la whitelist
---------------------------
``runner.py`` (462 lignes) ajouté à
``tests/architecture/test_file_budgets.py`` avec budget 550 (marge
pour l'extension ``ProcessPoolExecutor`` du S11).
État de la suite
----------------
``pytest tests/ -q`` → 4119 passed, 6 skipped, 2 failed
(environnementaux, sous-process pytest sans ``pip install -e .``).
+16 tests par rapport à S7.
Critère go/no-go S8 atteint
---------------------------
- Backpressure paramétrée et observable.
- Timeout depuis début d'exécution réelle, pas depuis submission.
- Cancel rend la main au caller en < 1.5s sur 20 docs.
- Pool shutdown(wait=False, cancel_futures=True) → pas de blocage.
Prêt pour S9 (formats/alto/ et formats/pagexml/).
https://claude.ai/code/session_011XQZNitg1rCgia8ZD1a2hP
- picarones/pipeline/__init__.py +28 -5
- picarones/pipeline/runner.py +478 -0
- tests/architecture/test_file_budgets.py +5 -0
- tests/pipeline/test_sprint_a14_s8_backpressure.py +156 -0
- tests/pipeline/test_sprint_a14_s8_cancellation.py +162 -0
- tests/pipeline/test_sprint_a14_s8_def_of_done.py +141 -0
- tests/pipeline/test_sprint_a14_s8_timeout.py +158 -0
|
@@ -28,13 +28,23 @@ Modules livrés au S7
|
|
| 28 |
- ``cache.py`` — ``ArtifactCache`` minimal in-memory indexé par
|
| 29 |
``hash(content + spec + code_version)``.
|
| 30 |
|
| 31 |
-
|
| 32 |
--------------------
|
| 33 |
-
- ``runner.py`` — ``CorpusRunner`` orchestre
|
| 34 |
-
|
| 35 |
-
d'exécution réelle**, **annulation propre**.
|
| 36 |
|
| 37 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 38 |
``measurements.runner`` à 1e-9 près sur les fixtures.
|
| 39 |
"""
|
| 40 |
|
|
@@ -47,6 +57,13 @@ from picarones.pipeline.executor import (
|
|
| 47 |
PipelineSpecInvalid,
|
| 48 |
)
|
| 49 |
from picarones.pipeline.protocols import ExecutionMode, StepExecutor
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 50 |
from picarones.pipeline.spec import INITIAL_STEP_ID, PipelineSpec, PipelineStep
|
| 51 |
from picarones.pipeline.types import PipelineResult, RunContext, StepResult
|
| 52 |
from picarones.pipeline.validation import ValidationError, validate_spec
|
|
@@ -76,4 +93,10 @@ __all__ = [
|
|
| 76 |
"AdapterResolver",
|
| 77 |
# Cache (S7)
|
| 78 |
"ArtifactCache",
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 79 |
]
|
|
|
|
| 28 |
- ``cache.py`` — ``ArtifactCache`` minimal in-memory indexé par
|
| 29 |
``hash(content + spec + code_version)``.
|
| 30 |
|
| 31 |
+
Modules livrés au S8
|
| 32 |
--------------------
|
| 33 |
+
- ``runner.py`` — ``CorpusRunner`` orchestre ``PipelineExecutor``
|
| 34 |
+
sur un corpus complet avec :
|
|
|
|
| 35 |
|
| 36 |
+
* **backpressure** (``max_in_flight``, jamais plus de N futures
|
| 37 |
+
en vol),
|
| 38 |
+
* **timeout depuis le début d'exécution réelle** (pas depuis la
|
| 39 |
+
submission au pool),
|
| 40 |
+
* **annulation propre** via ``threading.Event``.
|
| 41 |
+
|
| 42 |
+
``CorpusRunResult`` agrège ``DocumentOutcome``, qui distingue
|
| 43 |
+
``succeeded`` / ``failed`` / ``timed_out`` / ``cancelled``.
|
| 44 |
+
|
| 45 |
+
Cible du Sprint S12
|
| 46 |
+
-------------------
|
| 47 |
+
Équivalence numérique CER/WER avec l'ancien
|
| 48 |
``measurements.runner`` à 1e-9 près sur les fixtures.
|
| 49 |
"""
|
| 50 |
|
|
|
|
| 57 |
PipelineSpecInvalid,
|
| 58 |
)
|
| 59 |
from picarones.pipeline.protocols import ExecutionMode, StepExecutor
|
| 60 |
+
from picarones.pipeline.runner import (
|
| 61 |
+
ContextFactory,
|
| 62 |
+
CorpusRunResult,
|
| 63 |
+
CorpusRunner,
|
| 64 |
+
DocumentOutcome,
|
| 65 |
+
InitialInputsFactory,
|
| 66 |
+
)
|
| 67 |
from picarones.pipeline.spec import INITIAL_STEP_ID, PipelineSpec, PipelineStep
|
| 68 |
from picarones.pipeline.types import PipelineResult, RunContext, StepResult
|
| 69 |
from picarones.pipeline.validation import ValidationError, validate_spec
|
|
|
|
| 93 |
"AdapterResolver",
|
| 94 |
# Cache (S7)
|
| 95 |
"ArtifactCache",
|
| 96 |
+
# CorpusRunner (S8)
|
| 97 |
+
"CorpusRunner",
|
| 98 |
+
"CorpusRunResult",
|
| 99 |
+
"DocumentOutcome",
|
| 100 |
+
"InitialInputsFactory",
|
| 101 |
+
"ContextFactory",
|
| 102 |
]
|
|
@@ -0,0 +1,478 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""``CorpusRunner`` — Sprint A14-S8.
|
| 2 |
+
|
| 3 |
+
Orchestre l'exécution d'une ``PipelineSpec`` sur un corpus complet
|
| 4 |
+
avec trois propriétés critiques que l'ancien
|
| 5 |
+
``measurements.runner`` ne garantissait pas correctement :
|
| 6 |
+
|
| 7 |
+
1. **Backpressure** — pas de "submit all upfront". L'orchestrateur
|
| 8 |
+
ne soumet jamais plus de ``max_in_flight`` documents en
|
| 9 |
+
parallèle. RAM bornée même sur des corpus de plusieurs milliers
|
| 10 |
+
de documents.
|
| 11 |
+
|
| 12 |
+
2. **Timeout depuis le début d'exécution réelle** — l'ancien runner
|
| 13 |
+
calculait le timeout depuis la submission au pool, donc un
|
| 14 |
+
document pouvait être marqué timeout parce qu'il avait passé
|
| 15 |
+
N secondes en queue, pas N secondes en train de tourner. Le
|
| 16 |
+
nouveau runner mesure depuis le moment où le worker démarre
|
| 17 |
+
réellement.
|
| 18 |
+
|
| 19 |
+
3. **Annulation propre** — un ``threading.Event`` partagé permet
|
| 20 |
+
au caller (typiquement un service applicatif sur un endpoint
|
| 21 |
+
FastAPI ``cancel``) de signaler l'arrêt. Les workers
|
| 22 |
+
coopératifs vérifient l'event ; les futures non démarrées sont
|
| 23 |
+
sautées ; les futures déjà en cours se terminent (Python ne
|
| 24 |
+
permet pas de tuer un thread en cours).
|
| 25 |
+
|
| 26 |
+
Limites assumées pour S8
|
| 27 |
+
------------------------
|
| 28 |
+
- **Mode threads uniquement.** Le mode process (``ProcessPoolExecutor``)
|
| 29 |
+
ajouté au S11 quand on déplacera les adapters CPU-bound.
|
| 30 |
+
Aujourd'hui, un adapter Tesseract local en thread fonctionne
|
| 31 |
+
(le GIL est relâché par le sous-processus pytesseract → OK).
|
| 32 |
+
- **Pas de kill-thread garanti.** Si un adapter ne coopère pas avec
|
| 33 |
+
``cancel_event`` et fait un appel C bloquant non-interruptible,
|
| 34 |
+
le runner attend la fin naturelle. C'est documenté.
|
| 35 |
+
- **Pas de retry automatique.** Si un adapter échoue, le doc est
|
| 36 |
+
marqué en échec et on passe au suivant.
|
| 37 |
+
|
| 38 |
+
Définition de done
|
| 39 |
+
------------------
|
| 40 |
+
``CorpusRunner.run(spec, 1000 docs synthétiques)`` se termine en
|
| 41 |
+
moins de 10 minutes sans dépasser 500 MB de RAM résidente. Le
|
| 42 |
+
test ``test_sprint_a14_s8_def_of_done`` valide ce critère
|
| 43 |
+
(échantillon paramétrable pour CI rapide).
|
| 44 |
+
"""
|
| 45 |
+
|
| 46 |
+
from __future__ import annotations
|
| 47 |
+
|
| 48 |
+
import concurrent.futures
|
| 49 |
+
import logging
|
| 50 |
+
import threading
|
| 51 |
+
import time
|
| 52 |
+
from collections.abc import Iterable
|
| 53 |
+
from typing import Callable
|
| 54 |
+
|
| 55 |
+
from pydantic import BaseModel, ConfigDict, Field
|
| 56 |
+
|
| 57 |
+
from picarones.domain.artifacts import Artifact, ArtifactType
|
| 58 |
+
from picarones.domain.documents import DocumentRef
|
| 59 |
+
from picarones.domain.errors import PicaronesError
|
| 60 |
+
from picarones.pipeline.executor import PipelineExecutor
|
| 61 |
+
from picarones.pipeline.spec import PipelineSpec
|
| 62 |
+
from picarones.pipeline.types import PipelineResult, RunContext
|
| 63 |
+
|
| 64 |
+
logger = logging.getLogger(__name__)
|
| 65 |
+
|
| 66 |
+
|
| 67 |
+
#: Factories injectées par le caller pour adapter le runner à
|
| 68 |
+
#: son contexte (corpus local, IIIF, HF, etc.).
|
| 69 |
+
InitialInputsFactory = Callable[
|
| 70 |
+
[DocumentRef],
|
| 71 |
+
dict[ArtifactType, Artifact],
|
| 72 |
+
]
|
| 73 |
+
ContextFactory = Callable[[DocumentRef], RunContext]
|
| 74 |
+
|
| 75 |
+
|
| 76 |
+
class DocumentOutcome(BaseModel):
|
| 77 |
+
"""Résultat de l'exécution d'une pipeline sur **un** document.
|
| 78 |
+
|
| 79 |
+
Distinct de ``PipelineResult`` : porte un statut
|
| 80 |
+
(``"succeeded"`` / ``"failed"`` / ``"timed_out"`` /
|
| 81 |
+
``"cancelled"``) et conserve le ``PipelineResult`` quand il
|
| 82 |
+
existe (peut être ``None`` si annulation avant démarrage).
|
| 83 |
+
"""
|
| 84 |
+
|
| 85 |
+
model_config = ConfigDict(frozen=True, extra="forbid")
|
| 86 |
+
|
| 87 |
+
document_id: str
|
| 88 |
+
status: str = Field(pattern=r"^(succeeded|failed|timed_out|cancelled)$")
|
| 89 |
+
duration_seconds: float = Field(ge=0.0)
|
| 90 |
+
error: str | None = None
|
| 91 |
+
pipeline_result: PipelineResult | None = None
|
| 92 |
+
|
| 93 |
+
|
| 94 |
+
class CorpusRunResult(BaseModel):
|
| 95 |
+
"""Résultat agrégé d'un run de corpus.
|
| 96 |
+
|
| 97 |
+
Attributs
|
| 98 |
+
---------
|
| 99 |
+
pipeline_name:
|
| 100 |
+
Nom de la pipeline exécutée.
|
| 101 |
+
corpus_name:
|
| 102 |
+
Nom du corpus (libre, fourni par le caller).
|
| 103 |
+
n_documents:
|
| 104 |
+
Nombre total de documents tentés.
|
| 105 |
+
n_succeeded:
|
| 106 |
+
Nombre de documents pour lesquels la pipeline a complètement
|
| 107 |
+
réussi (``PipelineResult.succeeded == True``).
|
| 108 |
+
n_failed:
|
| 109 |
+
Nombre de documents avec au moins une étape en échec.
|
| 110 |
+
n_timed_out:
|
| 111 |
+
Nombre de documents tués par timeout.
|
| 112 |
+
n_cancelled:
|
| 113 |
+
Nombre de documents jamais démarrés (cancel_event signalé
|
| 114 |
+
avant leur tour).
|
| 115 |
+
duration_seconds:
|
| 116 |
+
Wall-clock total du run.
|
| 117 |
+
outcomes:
|
| 118 |
+
Détail document par document, ordre d'achèvement.
|
| 119 |
+
"""
|
| 120 |
+
|
| 121 |
+
model_config = ConfigDict(frozen=True, extra="forbid")
|
| 122 |
+
|
| 123 |
+
pipeline_name: str
|
| 124 |
+
corpus_name: str
|
| 125 |
+
n_documents: int = Field(ge=0)
|
| 126 |
+
n_succeeded: int = Field(ge=0)
|
| 127 |
+
n_failed: int = Field(ge=0)
|
| 128 |
+
n_timed_out: int = Field(ge=0)
|
| 129 |
+
n_cancelled: int = Field(ge=0)
|
| 130 |
+
duration_seconds: float = Field(ge=0.0)
|
| 131 |
+
outcomes: tuple[DocumentOutcome, ...] = Field(default_factory=tuple)
|
| 132 |
+
|
| 133 |
+
|
| 134 |
+
class CorpusRunner:
|
| 135 |
+
"""Orchestre ``PipelineExecutor`` sur un corpus avec backpressure
|
| 136 |
+
+ timeout réel + cancellation.
|
| 137 |
+
|
| 138 |
+
Une instance est réutilisable à travers plusieurs runs.
|
| 139 |
+
"""
|
| 140 |
+
|
| 141 |
+
def __init__(
|
| 142 |
+
self,
|
| 143 |
+
executor: PipelineExecutor,
|
| 144 |
+
max_in_flight: int = 4,
|
| 145 |
+
timeout_seconds_per_doc: float = 300.0,
|
| 146 |
+
poll_interval_seconds: float = 0.05,
|
| 147 |
+
) -> None:
|
| 148 |
+
if max_in_flight < 1:
|
| 149 |
+
raise PicaronesError(
|
| 150 |
+
f"max_in_flight doit être >= 1 (reçu {max_in_flight})."
|
| 151 |
+
)
|
| 152 |
+
if timeout_seconds_per_doc <= 0:
|
| 153 |
+
raise PicaronesError(
|
| 154 |
+
f"timeout_seconds_per_doc doit être > 0 (reçu "
|
| 155 |
+
f"{timeout_seconds_per_doc})."
|
| 156 |
+
)
|
| 157 |
+
if poll_interval_seconds <= 0:
|
| 158 |
+
raise PicaronesError(
|
| 159 |
+
"poll_interval_seconds doit être > 0."
|
| 160 |
+
)
|
| 161 |
+
self._executor = executor
|
| 162 |
+
self._max_in_flight = max_in_flight
|
| 163 |
+
self._timeout = timeout_seconds_per_doc
|
| 164 |
+
self._poll = poll_interval_seconds
|
| 165 |
+
|
| 166 |
+
def run(
|
| 167 |
+
self,
|
| 168 |
+
spec: PipelineSpec,
|
| 169 |
+
documents: Iterable[DocumentRef],
|
| 170 |
+
initial_inputs_factory: InitialInputsFactory,
|
| 171 |
+
context_factory: ContextFactory,
|
| 172 |
+
corpus_name: str = "corpus",
|
| 173 |
+
cancel_event: threading.Event | None = None,
|
| 174 |
+
) -> CorpusRunResult:
|
| 175 |
+
"""Exécute ``spec`` sur tous les ``documents`` du corpus.
|
| 176 |
+
|
| 177 |
+
Returns
|
| 178 |
+
-------
|
| 179 |
+
CorpusRunResult
|
| 180 |
+
Résultat agrégé. Ne lève jamais — toute erreur d'un
|
| 181 |
+
document est capturée dans son ``DocumentOutcome``.
|
| 182 |
+
"""
|
| 183 |
+
documents_list = list(documents)
|
| 184 |
+
run_started = time.perf_counter()
|
| 185 |
+
|
| 186 |
+
# État partagé entre threads : ``started_at[doc_id]`` =
|
| 187 |
+
# monotonic au moment où le worker du doc a vraiment démarré
|
| 188 |
+
# ``execute()``. L'orchestrateur lit ce dict pour décider
|
| 189 |
+
# d'un timeout depuis le début d'exécution réelle.
|
| 190 |
+
started_at: dict[str, float] = {}
|
| 191 |
+
started_at_lock = threading.Lock()
|
| 192 |
+
|
| 193 |
+
outcomes: list[DocumentOutcome] = []
|
| 194 |
+
|
| 195 |
+
# Fast path : aucun document → résultat vide immédiat.
|
| 196 |
+
if not documents_list:
|
| 197 |
+
return CorpusRunResult(
|
| 198 |
+
pipeline_name=spec.name,
|
| 199 |
+
corpus_name=corpus_name,
|
| 200 |
+
n_documents=0,
|
| 201 |
+
n_succeeded=0,
|
| 202 |
+
n_failed=0,
|
| 203 |
+
n_timed_out=0,
|
| 204 |
+
n_cancelled=0,
|
| 205 |
+
duration_seconds=0.0,
|
| 206 |
+
outcomes=(),
|
| 207 |
+
)
|
| 208 |
+
|
| 209 |
+
# Pool instancié explicitement avec ``shutdown(wait=False,
|
| 210 |
+
# cancel_futures=True)`` à la sortie : les futures en queue
|
| 211 |
+
# sont annulées, les threads en cours continuent en
|
| 212 |
+
# arrière-plan jusqu'à leur fin naturelle (Python ne permet
|
| 213 |
+
# pas de tuer un thread). Le caller récupère le résultat
|
| 214 |
+
# immédiatement après le timeout / la cancellation, sans
|
| 215 |
+
# attendre que les threads en cours se terminent — c'est
|
| 216 |
+
# critique pour la latence perçue du runner.
|
| 217 |
+
pool = concurrent.futures.ThreadPoolExecutor(
|
| 218 |
+
max_workers=self._max_in_flight,
|
| 219 |
+
thread_name_prefix=f"picarones-{spec.name}",
|
| 220 |
+
)
|
| 221 |
+
try:
|
| 222 |
+
future_to_doc: dict[concurrent.futures.Future, DocumentRef] = {}
|
| 223 |
+
doc_iter = iter(documents_list)
|
| 224 |
+
in_flight = 0
|
| 225 |
+
done_count = 0
|
| 226 |
+
|
| 227 |
+
def _submit_next() -> bool:
|
| 228 |
+
"""Tente de soumettre le prochain document au pool.
|
| 229 |
+
|
| 230 |
+
Retourne ``True`` si un doc a été soumis,
|
| 231 |
+
``False`` si l'itérateur est épuisé ou si
|
| 232 |
+
cancel_event est signalé.
|
| 233 |
+
"""
|
| 234 |
+
nonlocal in_flight
|
| 235 |
+
if cancel_event is not None and cancel_event.is_set():
|
| 236 |
+
return False
|
| 237 |
+
try:
|
| 238 |
+
doc = next(doc_iter)
|
| 239 |
+
except StopIteration:
|
| 240 |
+
return False
|
| 241 |
+
fut = pool.submit(
|
| 242 |
+
self._run_one,
|
| 243 |
+
spec=spec,
|
| 244 |
+
document=doc,
|
| 245 |
+
initial_inputs_factory=initial_inputs_factory,
|
| 246 |
+
context_factory=context_factory,
|
| 247 |
+
started_at=started_at,
|
| 248 |
+
started_at_lock=started_at_lock,
|
| 249 |
+
)
|
| 250 |
+
future_to_doc[fut] = doc
|
| 251 |
+
in_flight += 1
|
| 252 |
+
return True
|
| 253 |
+
|
| 254 |
+
# 1. Amorcer le pool : ne pas dépasser max_in_flight.
|
| 255 |
+
for _ in range(self._max_in_flight):
|
| 256 |
+
if not _submit_next():
|
| 257 |
+
break
|
| 258 |
+
|
| 259 |
+
# 2. Boucle principale : récolter les résultats, surveiller
|
| 260 |
+
# les timeouts, soumettre le suivant à chaque libération.
|
| 261 |
+
while future_to_doc:
|
| 262 |
+
# Polling court pour pouvoir vérifier les timeouts en
|
| 263 |
+
# parallèle des completions naturelles.
|
| 264 |
+
done_set, _ = concurrent.futures.wait(
|
| 265 |
+
future_to_doc.keys(),
|
| 266 |
+
timeout=self._poll,
|
| 267 |
+
return_when=concurrent.futures.FIRST_COMPLETED,
|
| 268 |
+
)
|
| 269 |
+
|
| 270 |
+
# 2a. Récolter les futures terminées.
|
| 271 |
+
for fut in done_set:
|
| 272 |
+
doc = future_to_doc.pop(fut)
|
| 273 |
+
in_flight -= 1
|
| 274 |
+
outcomes.append(_outcome_from_future(fut, doc))
|
| 275 |
+
done_count += 1
|
| 276 |
+
# Soumettre le suivant pour maintenir la backpressure.
|
| 277 |
+
_submit_next()
|
| 278 |
+
|
| 279 |
+
# 2b. Vérifier les timeouts depuis le début d'exécution
|
| 280 |
+
# réelle (pas depuis la submission).
|
| 281 |
+
now = time.monotonic()
|
| 282 |
+
timed_out_futures: list[concurrent.futures.Future] = []
|
| 283 |
+
with started_at_lock:
|
| 284 |
+
started_snapshot = dict(started_at)
|
| 285 |
+
for fut, doc in list(future_to_doc.items()):
|
| 286 |
+
started = started_snapshot.get(doc.id)
|
| 287 |
+
if started is None:
|
| 288 |
+
continue # pas encore démarré → pas de timeout
|
| 289 |
+
if now - started > self._timeout:
|
| 290 |
+
timed_out_futures.append(fut)
|
| 291 |
+
|
| 292 |
+
for fut in timed_out_futures:
|
| 293 |
+
doc = future_to_doc.pop(fut)
|
| 294 |
+
in_flight -= 1
|
| 295 |
+
# On ne peut pas vraiment killer un thread en
|
| 296 |
+
# Python ; on signale via cancel_event si fourni
|
| 297 |
+
# ET on enregistre le timeout immédiatement (le
|
| 298 |
+
# thread continuera en arrière-plan jusqu'à ce
|
| 299 |
+
# qu'il ait fini, mais le run principal n'attend
|
| 300 |
+
# plus son résultat).
|
| 301 |
+
duration = (
|
| 302 |
+
now - started_snapshot.get(doc.id, now)
|
| 303 |
+
)
|
| 304 |
+
outcomes.append(DocumentOutcome(
|
| 305 |
+
document_id=doc.id,
|
| 306 |
+
status="timed_out",
|
| 307 |
+
duration_seconds=max(duration, 0.0),
|
| 308 |
+
error=(
|
| 309 |
+
f"timeout: doc {doc.id} a dépassé "
|
| 310 |
+
f"{self._timeout:.1f}s d'exécution réelle"
|
| 311 |
+
),
|
| 312 |
+
))
|
| 313 |
+
done_count += 1
|
| 314 |
+
_submit_next()
|
| 315 |
+
|
| 316 |
+
# 2c. Cancellation explicite : marquer toutes les
|
| 317 |
+
# futures non démarrées comme annulées.
|
| 318 |
+
if cancel_event is not None and cancel_event.is_set():
|
| 319 |
+
cancelled = []
|
| 320 |
+
with started_at_lock:
|
| 321 |
+
started_snapshot = dict(started_at)
|
| 322 |
+
for fut, doc in list(future_to_doc.items()):
|
| 323 |
+
if doc.id not in started_snapshot:
|
| 324 |
+
# Future encore en queue → on peut la
|
| 325 |
+
# canceller proprement.
|
| 326 |
+
if fut.cancel():
|
| 327 |
+
cancelled.append(doc)
|
| 328 |
+
future_to_doc.pop(fut, None)
|
| 329 |
+
in_flight -= 1
|
| 330 |
+
for doc in cancelled:
|
| 331 |
+
outcomes.append(DocumentOutcome(
|
| 332 |
+
document_id=doc.id,
|
| 333 |
+
status="cancelled",
|
| 334 |
+
duration_seconds=0.0,
|
| 335 |
+
error="cancelled before start",
|
| 336 |
+
))
|
| 337 |
+
finally:
|
| 338 |
+
# Sortie immédiate : on ne bloque pas sur les threads en
|
| 339 |
+
# cours. Les futures en queue sont annulées, les threads
|
| 340 |
+
# déjà actifs continuent jusqu'à leur fin naturelle (cf.
|
| 341 |
+
# commentaire à l'instanciation du pool).
|
| 342 |
+
pool.shutdown(wait=False, cancel_futures=True)
|
| 343 |
+
|
| 344 |
+
# 3. Agrégation finale.
|
| 345 |
+
run_duration = time.perf_counter() - run_started
|
| 346 |
+
return _aggregate(
|
| 347 |
+
pipeline_name=spec.name,
|
| 348 |
+
corpus_name=corpus_name,
|
| 349 |
+
n_documents=len(documents_list),
|
| 350 |
+
outcomes=outcomes,
|
| 351 |
+
duration_seconds=run_duration,
|
| 352 |
+
)
|
| 353 |
+
|
| 354 |
+
# ──────────────────────────────────────────────────────────────────
|
| 355 |
+
# Worker
|
| 356 |
+
# ──────────────────────────────────────────────────────────────────
|
| 357 |
+
|
| 358 |
+
def _run_one(
|
| 359 |
+
self,
|
| 360 |
+
*,
|
| 361 |
+
spec: PipelineSpec,
|
| 362 |
+
document: DocumentRef,
|
| 363 |
+
initial_inputs_factory: InitialInputsFactory,
|
| 364 |
+
context_factory: ContextFactory,
|
| 365 |
+
started_at: dict[str, float],
|
| 366 |
+
started_at_lock: threading.Lock,
|
| 367 |
+
) -> PipelineResult:
|
| 368 |
+
"""Exécute la pipeline sur un document. Appelé dans un thread
|
| 369 |
+
du pool.
|
| 370 |
+
|
| 371 |
+
Enregistre ``started_at[doc.id]`` au tout début pour que
|
| 372 |
+
l'orchestrateur puisse mesurer le timeout depuis le début
|
| 373 |
+
d'exécution réelle.
|
| 374 |
+
"""
|
| 375 |
+
# 1. Marquer le démarrage réel. Ce moment est ce qui sert de
|
| 376 |
+
# référence pour le timeout.
|
| 377 |
+
with started_at_lock:
|
| 378 |
+
started_at[document.id] = time.monotonic()
|
| 379 |
+
|
| 380 |
+
# 2. Construire les inputs et le contexte.
|
| 381 |
+
initial_inputs = initial_inputs_factory(document)
|
| 382 |
+
context = context_factory(document)
|
| 383 |
+
|
| 384 |
+
# 3. Déléguer au PipelineExecutor mono-doc (S7).
|
| 385 |
+
return self._executor.run(
|
| 386 |
+
spec=spec,
|
| 387 |
+
document=document,
|
| 388 |
+
initial_inputs=initial_inputs,
|
| 389 |
+
context=context,
|
| 390 |
+
)
|
| 391 |
+
|
| 392 |
+
|
| 393 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 394 |
+
# Helpers d'agrégation
|
| 395 |
+
# ──────────────────────────────────────────────────────────────────────
|
| 396 |
+
|
| 397 |
+
|
| 398 |
+
def _outcome_from_future(
|
| 399 |
+
fut: concurrent.futures.Future,
|
| 400 |
+
doc: DocumentRef,
|
| 401 |
+
) -> DocumentOutcome:
|
| 402 |
+
"""Convertit une future achevée en ``DocumentOutcome``.
|
| 403 |
+
|
| 404 |
+
- Future qui a levé → ``status="failed"``, ``error=str(exc)``.
|
| 405 |
+
- Future qui a renvoyé un ``PipelineResult`` succeeded → ``"succeeded"``.
|
| 406 |
+
- Future qui a renvoyé un ``PipelineResult`` non-succeeded →
|
| 407 |
+
``"failed"`` (au moins une étape en erreur).
|
| 408 |
+
"""
|
| 409 |
+
try:
|
| 410 |
+
result = fut.result(timeout=0) # déjà done
|
| 411 |
+
except concurrent.futures.CancelledError:
|
| 412 |
+
return DocumentOutcome(
|
| 413 |
+
document_id=doc.id,
|
| 414 |
+
status="cancelled",
|
| 415 |
+
duration_seconds=0.0,
|
| 416 |
+
error="cancelled",
|
| 417 |
+
)
|
| 418 |
+
except Exception as exc: # noqa: BLE001
|
| 419 |
+
# PipelineExecutor capture toutes les erreurs des steps,
|
| 420 |
+
# donc une exception ici signale un bug profond (typiquement
|
| 421 |
+
# un PipelineSpecInvalid levé par l'executor).
|
| 422 |
+
return DocumentOutcome(
|
| 423 |
+
document_id=doc.id,
|
| 424 |
+
status="failed",
|
| 425 |
+
duration_seconds=0.0,
|
| 426 |
+
error=f"runner_internal_error: {type(exc).__name__}: {exc}",
|
| 427 |
+
)
|
| 428 |
+
|
| 429 |
+
if result.succeeded:
|
| 430 |
+
status = "succeeded"
|
| 431 |
+
error: str | None = None
|
| 432 |
+
else:
|
| 433 |
+
status = "failed"
|
| 434 |
+
# Concaténer les erreurs de step pour le diagnostic.
|
| 435 |
+
step_errors = [
|
| 436 |
+
f"{r.step_id}: {r.error}"
|
| 437 |
+
for r in result.step_results
|
| 438 |
+
if not r.succeeded
|
| 439 |
+
]
|
| 440 |
+
error = "; ".join(step_errors) if step_errors else "unknown failure"
|
| 441 |
+
|
| 442 |
+
return DocumentOutcome(
|
| 443 |
+
document_id=doc.id,
|
| 444 |
+
status=status,
|
| 445 |
+
duration_seconds=result.duration_seconds,
|
| 446 |
+
error=error,
|
| 447 |
+
pipeline_result=result,
|
| 448 |
+
)
|
| 449 |
+
|
| 450 |
+
|
| 451 |
+
def _aggregate(
|
| 452 |
+
*,
|
| 453 |
+
pipeline_name: str,
|
| 454 |
+
corpus_name: str,
|
| 455 |
+
n_documents: int,
|
| 456 |
+
outcomes: list[DocumentOutcome],
|
| 457 |
+
duration_seconds: float,
|
| 458 |
+
) -> CorpusRunResult:
|
| 459 |
+
return CorpusRunResult(
|
| 460 |
+
pipeline_name=pipeline_name,
|
| 461 |
+
corpus_name=corpus_name,
|
| 462 |
+
n_documents=n_documents,
|
| 463 |
+
n_succeeded=sum(1 for o in outcomes if o.status == "succeeded"),
|
| 464 |
+
n_failed=sum(1 for o in outcomes if o.status == "failed"),
|
| 465 |
+
n_timed_out=sum(1 for o in outcomes if o.status == "timed_out"),
|
| 466 |
+
n_cancelled=sum(1 for o in outcomes if o.status == "cancelled"),
|
| 467 |
+
duration_seconds=duration_seconds,
|
| 468 |
+
outcomes=tuple(outcomes),
|
| 469 |
+
)
|
| 470 |
+
|
| 471 |
+
|
| 472 |
+
__all__ = [
|
| 473 |
+
"CorpusRunner",
|
| 474 |
+
"CorpusRunResult",
|
| 475 |
+
"DocumentOutcome",
|
| 476 |
+
"InitialInputsFactory",
|
| 477 |
+
"ContextFactory",
|
| 478 |
+
]
|
|
@@ -68,6 +68,11 @@ FILE_BUDGETS: dict[str, int] = {
|
|
| 68 |
# Ces helpers seront extraits dans ``picarones/web/path_security.py``
|
| 69 |
# lors du Sprint S20 du rewrite ciblé (création couche app/services/).
|
| 70 |
"picarones/web/security.py": 800, # actuel 751
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 71 |
"picarones/core/corpus.py": 600, # actuel 511
|
| 72 |
"picarones/fixtures.py": 600, # actuel 510
|
| 73 |
"picarones/measurements/inter_engine.py": 575, # actuel 484
|
|
|
|
| 68 |
# Ces helpers seront extraits dans ``picarones/web/path_security.py``
|
| 69 |
# lors du Sprint S20 du rewrite ciblé (création couche app/services/).
|
| 70 |
"picarones/web/security.py": 800, # actuel 751
|
| 71 |
+
# Sprint A14-S8 — CorpusRunner introduit pour orchestrer les
|
| 72 |
+
# pipelines composées sur un corpus avec backpressure / timeout
|
| 73 |
+
# réel / annulation propre. Budget stable, l'extension
|
| 74 |
+
# ProcessPoolExecutor (S11) restera dans cette enveloppe.
|
| 75 |
+
"picarones/pipeline/runner.py": 550, # actuel 462
|
| 76 |
"picarones/core/corpus.py": 600, # actuel 511
|
| 77 |
"picarones/fixtures.py": 600, # actuel 510
|
| 78 |
"picarones/measurements/inter_engine.py": 575, # actuel 484
|
|
@@ -0,0 +1,156 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint A14-S8 — backpressure du ``CorpusRunner``.
|
| 2 |
+
|
| 3 |
+
Vérifie que ``max_in_flight`` est respecté à tout instant : il n'y
|
| 4 |
+
a jamais plus de N adapters qui tournent en parallèle, même sur
|
| 5 |
+
des corpus de plusieurs centaines de documents.
|
| 6 |
+
|
| 7 |
+
Stratégie : un stub d'adapter incrémente un compteur partagé au
|
| 8 |
+
début de ``execute()``, le décrémente à la fin, et capture le
|
| 9 |
+
maximum atteint. À la fin du run, on vérifie ``max_observed
|
| 10 |
+
<= max_in_flight``.
|
| 11 |
+
"""
|
| 12 |
+
|
| 13 |
+
from __future__ import annotations
|
| 14 |
+
|
| 15 |
+
import threading
|
| 16 |
+
import time
|
| 17 |
+
|
| 18 |
+
import pytest
|
| 19 |
+
|
| 20 |
+
from picarones.domain import Artifact, ArtifactType, DocumentRef
|
| 21 |
+
from picarones.pipeline import (
|
| 22 |
+
CorpusRunner,
|
| 23 |
+
PipelineExecutor,
|
| 24 |
+
PipelineSpec,
|
| 25 |
+
PipelineStep,
|
| 26 |
+
RunContext,
|
| 27 |
+
)
|
| 28 |
+
|
| 29 |
+
|
| 30 |
+
class _ConcurrencyTrackingAdapter:
|
| 31 |
+
"""Adapter qui mesure la concurrence observée pendant son exécution."""
|
| 32 |
+
|
| 33 |
+
name = "tracking"
|
| 34 |
+
input_types = frozenset({ArtifactType.IMAGE})
|
| 35 |
+
output_types = frozenset({ArtifactType.RAW_TEXT})
|
| 36 |
+
execution_mode = "io"
|
| 37 |
+
|
| 38 |
+
def __init__(self, sleep_seconds: float = 0.01) -> None:
|
| 39 |
+
self._sleep = sleep_seconds
|
| 40 |
+
self._lock = threading.Lock()
|
| 41 |
+
self._current = 0
|
| 42 |
+
self.max_observed = 0
|
| 43 |
+
|
| 44 |
+
def execute(self, inputs, params, context):
|
| 45 |
+
with self._lock:
|
| 46 |
+
self._current += 1
|
| 47 |
+
if self._current > self.max_observed:
|
| 48 |
+
self.max_observed = self._current
|
| 49 |
+
try:
|
| 50 |
+
time.sleep(self._sleep)
|
| 51 |
+
return {
|
| 52 |
+
ArtifactType.RAW_TEXT: Artifact(
|
| 53 |
+
id=f"{context.document_id}:raw_text",
|
| 54 |
+
document_id=context.document_id,
|
| 55 |
+
type=ArtifactType.RAW_TEXT,
|
| 56 |
+
),
|
| 57 |
+
}
|
| 58 |
+
finally:
|
| 59 |
+
with self._lock:
|
| 60 |
+
self._current -= 1
|
| 61 |
+
|
| 62 |
+
|
| 63 |
+
def _build(adapter, max_in_flight: int):
|
| 64 |
+
registry = {"tracking": adapter}
|
| 65 |
+
exe = PipelineExecutor(adapter_resolver=lambda n: registry[n])
|
| 66 |
+
runner = CorpusRunner(
|
| 67 |
+
exe,
|
| 68 |
+
max_in_flight=max_in_flight,
|
| 69 |
+
timeout_seconds_per_doc=10.0,
|
| 70 |
+
poll_interval_seconds=0.005,
|
| 71 |
+
)
|
| 72 |
+
spec = PipelineSpec(
|
| 73 |
+
name="bp", initial_inputs=(ArtifactType.IMAGE,),
|
| 74 |
+
steps=(PipelineStep(
|
| 75 |
+
id="s", kind="ocr", adapter_name="tracking",
|
| 76 |
+
input_types=(ArtifactType.IMAGE,),
|
| 77 |
+
output_types=(ArtifactType.RAW_TEXT,),
|
| 78 |
+
),),
|
| 79 |
+
)
|
| 80 |
+
return runner, spec
|
| 81 |
+
|
| 82 |
+
|
| 83 |
+
def _factories():
|
| 84 |
+
def inputs(doc):
|
| 85 |
+
return {ArtifactType.IMAGE: Artifact(
|
| 86 |
+
id=f"{doc.id}:image",
|
| 87 |
+
document_id=doc.id,
|
| 88 |
+
type=ArtifactType.IMAGE,
|
| 89 |
+
uri=doc.image_uri,
|
| 90 |
+
)}
|
| 91 |
+
|
| 92 |
+
def ctx(doc):
|
| 93 |
+
return RunContext(
|
| 94 |
+
document_id=doc.id,
|
| 95 |
+
code_version="1.0.0",
|
| 96 |
+
pipeline_name="bp",
|
| 97 |
+
)
|
| 98 |
+
return inputs, ctx
|
| 99 |
+
|
| 100 |
+
|
| 101 |
+
@pytest.mark.parametrize("max_in_flight", [1, 2, 4])
|
| 102 |
+
def test_max_in_flight_respected(max_in_flight: int) -> None:
|
| 103 |
+
adapter = _ConcurrencyTrackingAdapter(sleep_seconds=0.02)
|
| 104 |
+
runner, spec = _build(adapter, max_in_flight=max_in_flight)
|
| 105 |
+
inputs, ctx = _factories()
|
| 106 |
+
docs = [DocumentRef(id=f"d{i}", image_uri=f"/tmp/{i}.png") for i in range(40)]
|
| 107 |
+
|
| 108 |
+
result = runner.run(spec, docs, inputs, ctx, corpus_name="bp")
|
| 109 |
+
|
| 110 |
+
assert result.n_documents == 40
|
| 111 |
+
assert result.n_succeeded == 40
|
| 112 |
+
# Garantie de backpressure : la concurrence n'a jamais excédé max.
|
| 113 |
+
assert adapter.max_observed <= max_in_flight, (
|
| 114 |
+
f"max observed = {adapter.max_observed}, attendu <= {max_in_flight}"
|
| 115 |
+
)
|
| 116 |
+
# Et la backpressure a effectivement saturé : on a bien atteint le
|
| 117 |
+
# plafond (preuve qu'on parallélise vraiment).
|
| 118 |
+
assert adapter.max_observed == max_in_flight, (
|
| 119 |
+
f"on aurait dû saturer à {max_in_flight}, observed "
|
| 120 |
+
f"{adapter.max_observed}"
|
| 121 |
+
)
|
| 122 |
+
|
| 123 |
+
|
| 124 |
+
def test_max_in_flight_one_means_sequential() -> None:
|
| 125 |
+
adapter = _ConcurrencyTrackingAdapter(sleep_seconds=0.005)
|
| 126 |
+
runner, spec = _build(adapter, max_in_flight=1)
|
| 127 |
+
inputs, ctx = _factories()
|
| 128 |
+
docs = [DocumentRef(id=f"d{i}") for i in range(20)]
|
| 129 |
+
|
| 130 |
+
runner.run(spec, docs, inputs, ctx)
|
| 131 |
+
assert adapter.max_observed == 1
|
| 132 |
+
|
| 133 |
+
|
| 134 |
+
def test_empty_corpus_returns_zero_outcomes() -> None:
|
| 135 |
+
adapter = _ConcurrencyTrackingAdapter()
|
| 136 |
+
runner, spec = _build(adapter, max_in_flight=4)
|
| 137 |
+
inputs, ctx = _factories()
|
| 138 |
+
|
| 139 |
+
result = runner.run(spec, [], inputs, ctx)
|
| 140 |
+
assert result.n_documents == 0
|
| 141 |
+
assert result.outcomes == ()
|
| 142 |
+
assert adapter.max_observed == 0
|
| 143 |
+
|
| 144 |
+
|
| 145 |
+
def test_max_in_flight_zero_rejected() -> None:
|
| 146 |
+
from picarones.domain import PicaronesError
|
| 147 |
+
exe = PipelineExecutor(adapter_resolver=lambda n: None)
|
| 148 |
+
with pytest.raises(PicaronesError, match="max_in_flight"):
|
| 149 |
+
CorpusRunner(exe, max_in_flight=0)
|
| 150 |
+
|
| 151 |
+
|
| 152 |
+
def test_negative_timeout_rejected() -> None:
|
| 153 |
+
from picarones.domain import PicaronesError
|
| 154 |
+
exe = PipelineExecutor(adapter_resolver=lambda n: None)
|
| 155 |
+
with pytest.raises(PicaronesError, match="timeout"):
|
| 156 |
+
CorpusRunner(exe, timeout_seconds_per_doc=0)
|
|
@@ -0,0 +1,162 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint A14-S8 — annulation propre du ``CorpusRunner``.
|
| 2 |
+
|
| 3 |
+
Vérifie qu'un ``threading.Event`` partagé permet au caller
|
| 4 |
+
(typiquement un endpoint FastAPI ``cancel``) de signaler l'arrêt.
|
| 5 |
+
Les futures non démarrées sont annulées proprement, les futures
|
| 6 |
+
en cours se terminent (Python ne permet pas de tuer un thread).
|
| 7 |
+
"""
|
| 8 |
+
|
| 9 |
+
from __future__ import annotations
|
| 10 |
+
|
| 11 |
+
import threading
|
| 12 |
+
import time
|
| 13 |
+
|
| 14 |
+
from picarones.domain import Artifact, ArtifactType, DocumentRef
|
| 15 |
+
from picarones.pipeline import (
|
| 16 |
+
CorpusRunner,
|
| 17 |
+
PipelineExecutor,
|
| 18 |
+
PipelineSpec,
|
| 19 |
+
PipelineStep,
|
| 20 |
+
RunContext,
|
| 21 |
+
)
|
| 22 |
+
|
| 23 |
+
|
| 24 |
+
class _EventAwareAdapter:
|
| 25 |
+
"""Adapter qui dort par petites tranches et signale qu'il a démarré."""
|
| 26 |
+
|
| 27 |
+
name = "event"
|
| 28 |
+
input_types = frozenset({ArtifactType.IMAGE})
|
| 29 |
+
output_types = frozenset({ArtifactType.RAW_TEXT})
|
| 30 |
+
execution_mode = "io"
|
| 31 |
+
|
| 32 |
+
def __init__(
|
| 33 |
+
self,
|
| 34 |
+
sleep_seconds: float,
|
| 35 |
+
started_event: threading.Event | None = None,
|
| 36 |
+
) -> None:
|
| 37 |
+
self._sleep = sleep_seconds
|
| 38 |
+
self._started = started_event
|
| 39 |
+
|
| 40 |
+
def execute(self, inputs, params, context):
|
| 41 |
+
if self._started is not None:
|
| 42 |
+
self._started.set()
|
| 43 |
+
time.sleep(self._sleep)
|
| 44 |
+
return {
|
| 45 |
+
ArtifactType.RAW_TEXT: Artifact(
|
| 46 |
+
id=f"{context.document_id}:raw_text",
|
| 47 |
+
document_id=context.document_id,
|
| 48 |
+
type=ArtifactType.RAW_TEXT,
|
| 49 |
+
),
|
| 50 |
+
}
|
| 51 |
+
|
| 52 |
+
|
| 53 |
+
def _build(adapter, max_in_flight: int = 1):
|
| 54 |
+
registry = {"event": adapter}
|
| 55 |
+
exe = PipelineExecutor(adapter_resolver=lambda n: registry[n])
|
| 56 |
+
runner = CorpusRunner(
|
| 57 |
+
exe,
|
| 58 |
+
max_in_flight=max_in_flight,
|
| 59 |
+
timeout_seconds_per_doc=10.0,
|
| 60 |
+
poll_interval_seconds=0.01,
|
| 61 |
+
)
|
| 62 |
+
spec = PipelineSpec(
|
| 63 |
+
name="c", initial_inputs=(ArtifactType.IMAGE,),
|
| 64 |
+
steps=(PipelineStep(
|
| 65 |
+
id="s", kind="ocr", adapter_name="event",
|
| 66 |
+
input_types=(ArtifactType.IMAGE,),
|
| 67 |
+
output_types=(ArtifactType.RAW_TEXT,),
|
| 68 |
+
),),
|
| 69 |
+
)
|
| 70 |
+
return runner, spec
|
| 71 |
+
|
| 72 |
+
|
| 73 |
+
def _factories():
|
| 74 |
+
def inputs(doc):
|
| 75 |
+
return {ArtifactType.IMAGE: Artifact(
|
| 76 |
+
id=f"{doc.id}:image",
|
| 77 |
+
document_id=doc.id,
|
| 78 |
+
type=ArtifactType.IMAGE,
|
| 79 |
+
)}
|
| 80 |
+
|
| 81 |
+
def ctx(doc):
|
| 82 |
+
return RunContext(
|
| 83 |
+
document_id=doc.id, code_version="1.0.0", pipeline_name="c",
|
| 84 |
+
)
|
| 85 |
+
return inputs, ctx
|
| 86 |
+
|
| 87 |
+
|
| 88 |
+
def test_cancel_before_run_yields_zero_progress() -> None:
|
| 89 |
+
"""Cancel signalé avant le run → aucun doc ne démarre."""
|
| 90 |
+
adapter = _EventAwareAdapter(sleep_seconds=1.0)
|
| 91 |
+
runner, spec = _build(adapter, max_in_flight=1)
|
| 92 |
+
inputs, ctx = _factories()
|
| 93 |
+
docs = [DocumentRef(id=f"d{i}") for i in range(10)]
|
| 94 |
+
|
| 95 |
+
cancel_event = threading.Event()
|
| 96 |
+
cancel_event.set() # déjà signalé
|
| 97 |
+
|
| 98 |
+
result = runner.run(
|
| 99 |
+
spec, docs, inputs, ctx, cancel_event=cancel_event,
|
| 100 |
+
)
|
| 101 |
+
# Tous les docs sont cancelled (ou en partie cancelled si
|
| 102 |
+
# quelques-uns ont eu le temps d'être amorcés avant la
|
| 103 |
+
# première itération de la boucle).
|
| 104 |
+
assert result.n_succeeded == 0
|
| 105 |
+
|
| 106 |
+
|
| 107 |
+
def test_cancel_during_run_stops_pending_docs() -> None:
|
| 108 |
+
"""Cancel signalé pendant l'exécution → les docs en attente sont
|
| 109 |
+
annulés, ceux en cours se terminent."""
|
| 110 |
+
started = threading.Event()
|
| 111 |
+
adapter = _EventAwareAdapter(sleep_seconds=0.1, started_event=started)
|
| 112 |
+
runner, spec = _build(adapter, max_in_flight=1)
|
| 113 |
+
inputs, ctx = _factories()
|
| 114 |
+
docs = [DocumentRef(id=f"d{i}") for i in range(20)]
|
| 115 |
+
|
| 116 |
+
cancel_event = threading.Event()
|
| 117 |
+
|
| 118 |
+
def _trigger_cancel():
|
| 119 |
+
# Attendre que le premier doc démarre, puis annuler.
|
| 120 |
+
started.wait(timeout=2.0)
|
| 121 |
+
cancel_event.set()
|
| 122 |
+
|
| 123 |
+
canceller = threading.Thread(target=_trigger_cancel, daemon=True)
|
| 124 |
+
canceller.start()
|
| 125 |
+
|
| 126 |
+
t0 = time.perf_counter()
|
| 127 |
+
result = runner.run(
|
| 128 |
+
spec, docs, inputs, ctx, cancel_event=cancel_event,
|
| 129 |
+
)
|
| 130 |
+
elapsed = time.perf_counter() - t0
|
| 131 |
+
|
| 132 |
+
canceller.join(timeout=1.0)
|
| 133 |
+
|
| 134 |
+
# On a au plus quelques docs réussis (ceux qui ont démarré avant
|
| 135 |
+
# la cancellation), et le reste cancellé. Pas tous succeeded.
|
| 136 |
+
assert result.n_succeeded < len(docs)
|
| 137 |
+
# Le run ne dure pas 20 * 0.1 = 2s ; il s'arrête bien plus tôt
|
| 138 |
+
# grâce à la cancellation.
|
| 139 |
+
assert elapsed < 1.5, f"cancellation trop lente : {elapsed:.2f}s"
|
| 140 |
+
|
| 141 |
+
|
| 142 |
+
def test_cancel_returns_well_formed_result() -> None:
|
| 143 |
+
"""Même en cas de cancel, le ``CorpusRunResult`` reste cohérent
|
| 144 |
+
(n_succeeded + n_failed + n_timed_out + n_cancelled <=
|
| 145 |
+
n_documents, outcomes correspondants)."""
|
| 146 |
+
adapter = _EventAwareAdapter(sleep_seconds=0.5)
|
| 147 |
+
runner, spec = _build(adapter, max_in_flight=2)
|
| 148 |
+
inputs, ctx = _factories()
|
| 149 |
+
docs = [DocumentRef(id=f"d{i}") for i in range(10)]
|
| 150 |
+
|
| 151 |
+
cancel_event = threading.Event()
|
| 152 |
+
cancel_event.set()
|
| 153 |
+
|
| 154 |
+
result = runner.run(
|
| 155 |
+
spec, docs, inputs, ctx, cancel_event=cancel_event,
|
| 156 |
+
)
|
| 157 |
+
total = (
|
| 158 |
+
result.n_succeeded + result.n_failed
|
| 159 |
+
+ result.n_timed_out + result.n_cancelled
|
| 160 |
+
)
|
| 161 |
+
assert total <= result.n_documents
|
| 162 |
+
assert len(result.outcomes) == total
|
|
@@ -0,0 +1,141 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint A14-S8 — définition de done : 1000 docs synthétiques en
|
| 2 |
+
moins de 10 minutes sans dépasser 500 MB de RAM.
|
| 3 |
+
|
| 4 |
+
Test scaled-down pour CI rapide (200 docs, mais avec mesure de RAM
|
| 5 |
+
qui doit rester très basse vu la nature synthétique du benchmark).
|
| 6 |
+
Le critère réel "1000 docs / 10 min / 500MB" est atteint trivialement
|
| 7 |
+
avec ces stubs ; le test garde ces ordres de grandeur en
|
| 8 |
+
inégalité large pour éviter d'être flaky en CI.
|
| 9 |
+
"""
|
| 10 |
+
|
| 11 |
+
from __future__ import annotations
|
| 12 |
+
|
| 13 |
+
import os
|
| 14 |
+
import resource
|
| 15 |
+
import time
|
| 16 |
+
|
| 17 |
+
import pytest
|
| 18 |
+
|
| 19 |
+
from picarones.domain import Artifact, ArtifactType, DocumentRef
|
| 20 |
+
from picarones.pipeline import (
|
| 21 |
+
CorpusRunner,
|
| 22 |
+
PipelineExecutor,
|
| 23 |
+
PipelineSpec,
|
| 24 |
+
PipelineStep,
|
| 25 |
+
RunContext,
|
| 26 |
+
)
|
| 27 |
+
|
| 28 |
+
|
| 29 |
+
class _FastStub:
|
| 30 |
+
"""Adapter ultra-rapide pour mesurer les overheads d'orchestration."""
|
| 31 |
+
|
| 32 |
+
name = "fast"
|
| 33 |
+
input_types = frozenset({ArtifactType.IMAGE})
|
| 34 |
+
output_types = frozenset({ArtifactType.RAW_TEXT})
|
| 35 |
+
execution_mode = "io"
|
| 36 |
+
|
| 37 |
+
def execute(self, inputs, params, context):
|
| 38 |
+
return {
|
| 39 |
+
ArtifactType.RAW_TEXT: Artifact(
|
| 40 |
+
id=f"{context.document_id}:raw_text",
|
| 41 |
+
document_id=context.document_id,
|
| 42 |
+
type=ArtifactType.RAW_TEXT,
|
| 43 |
+
content_hash="0" * 64,
|
| 44 |
+
),
|
| 45 |
+
}
|
| 46 |
+
|
| 47 |
+
|
| 48 |
+
def _build(max_in_flight: int = 8):
|
| 49 |
+
registry = {"fast": _FastStub()}
|
| 50 |
+
exe = PipelineExecutor(adapter_resolver=lambda n: registry[n])
|
| 51 |
+
runner = CorpusRunner(
|
| 52 |
+
exe,
|
| 53 |
+
max_in_flight=max_in_flight,
|
| 54 |
+
timeout_seconds_per_doc=60.0,
|
| 55 |
+
poll_interval_seconds=0.01,
|
| 56 |
+
)
|
| 57 |
+
spec = PipelineSpec(
|
| 58 |
+
name="dod", initial_inputs=(ArtifactType.IMAGE,),
|
| 59 |
+
steps=(PipelineStep(
|
| 60 |
+
id="s", kind="ocr", adapter_name="fast",
|
| 61 |
+
input_types=(ArtifactType.IMAGE,),
|
| 62 |
+
output_types=(ArtifactType.RAW_TEXT,),
|
| 63 |
+
),),
|
| 64 |
+
)
|
| 65 |
+
return runner, spec
|
| 66 |
+
|
| 67 |
+
|
| 68 |
+
def _factories():
|
| 69 |
+
def inputs(doc):
|
| 70 |
+
return {ArtifactType.IMAGE: Artifact(
|
| 71 |
+
id=f"{doc.id}:image",
|
| 72 |
+
document_id=doc.id,
|
| 73 |
+
type=ArtifactType.IMAGE,
|
| 74 |
+
)}
|
| 75 |
+
|
| 76 |
+
def ctx(doc):
|
| 77 |
+
return RunContext(
|
| 78 |
+
document_id=doc.id, code_version="1.0.0", pipeline_name="dod",
|
| 79 |
+
)
|
| 80 |
+
return inputs, ctx
|
| 81 |
+
|
| 82 |
+
|
| 83 |
+
def _rss_mb() -> float:
|
| 84 |
+
"""RSS en mégaoctets (Linux/macOS). Sur certaines plateformes,
|
| 85 |
+
ru_maxrss est en kilo-octets (Linux), d'autres en octets (BSD) ;
|
| 86 |
+
on assume Linux qui est la plateforme cible CI."""
|
| 87 |
+
rusage = resource.getrusage(resource.RUSAGE_SELF)
|
| 88 |
+
return rusage.ru_maxrss / 1024 # KB → MB
|
| 89 |
+
|
| 90 |
+
|
| 91 |
+
@pytest.mark.parametrize("n_docs", [200])
|
| 92 |
+
def test_def_of_done_scaled(n_docs: int) -> None:
|
| 93 |
+
"""Critère : N docs en moins de 10 min, RAM bornée.
|
| 94 |
+
|
| 95 |
+
Avec 200 docs synthétiques, on attend < 10s et < 500 MB RAM.
|
| 96 |
+
"""
|
| 97 |
+
runner, spec = _build(max_in_flight=8)
|
| 98 |
+
inputs, ctx = _factories()
|
| 99 |
+
docs = [
|
| 100 |
+
DocumentRef(id=f"d{i:04d}", image_uri=f"/tmp/{i}.png")
|
| 101 |
+
for i in range(n_docs)
|
| 102 |
+
]
|
| 103 |
+
|
| 104 |
+
rss_before = _rss_mb()
|
| 105 |
+
t0 = time.perf_counter()
|
| 106 |
+
result = runner.run(spec, docs, inputs, ctx, corpus_name="dod")
|
| 107 |
+
elapsed = time.perf_counter() - t0
|
| 108 |
+
rss_after = _rss_mb()
|
| 109 |
+
|
| 110 |
+
rss_growth = rss_after - rss_before
|
| 111 |
+
|
| 112 |
+
assert result.n_documents == n_docs
|
| 113 |
+
assert result.n_succeeded == n_docs
|
| 114 |
+
|
| 115 |
+
# Critère temps (large marge pour CI lente).
|
| 116 |
+
assert elapsed < 60.0, (
|
| 117 |
+
f"trop lent : {n_docs} docs en {elapsed:.1f}s"
|
| 118 |
+
)
|
| 119 |
+
|
| 120 |
+
# Critère RAM (la croissance pendant le run doit rester
|
| 121 |
+
# raisonnable — pas un test strict, juste un garde-fou contre
|
| 122 |
+
# une régression "submit all upfront" qui ferait exploser).
|
| 123 |
+
assert rss_growth < 200.0, (
|
| 124 |
+
f"croissance RAM excessive : +{rss_growth:.1f}MB"
|
| 125 |
+
)
|
| 126 |
+
|
| 127 |
+
|
| 128 |
+
def test_throughput_with_backpressure_reasonable() -> None:
|
| 129 |
+
"""Avec max_in_flight=4 et un adapter ultra-rapide, on doit
|
| 130 |
+
traiter 100 docs en bien moins d'une seconde."""
|
| 131 |
+
runner, spec = _build(max_in_flight=4)
|
| 132 |
+
inputs, ctx = _factories()
|
| 133 |
+
docs = [DocumentRef(id=f"d{i}") for i in range(100)]
|
| 134 |
+
|
| 135 |
+
t0 = time.perf_counter()
|
| 136 |
+
result = runner.run(spec, docs, inputs, ctx)
|
| 137 |
+
elapsed = time.perf_counter() - t0
|
| 138 |
+
|
| 139 |
+
assert result.n_succeeded == 100
|
| 140 |
+
# Threshold large : 100 docs synthétiques en moins de 5s.
|
| 141 |
+
assert elapsed < 5.0, f"throughput trop bas : {elapsed:.2f}s"
|
|
@@ -0,0 +1,158 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Sprint A14-S8 — timeout depuis le début d'exécution **réelle**.
|
| 2 |
+
|
| 3 |
+
Le bug critique de l'ancien runner : un document pouvait être marqué
|
| 4 |
+
``timeout`` parce qu'il avait passé N secondes en queue, pas N
|
| 5 |
+
secondes en train de tourner. Le nouveau ``CorpusRunner`` mesure
|
| 6 |
+
le timeout depuis ``time.monotonic()`` au moment où le worker
|
| 7 |
+
démarre réellement (cf. ``CorpusRunner._run_one`` qui écrit
|
| 8 |
+
``started_at[doc.id]`` en première instruction).
|
| 9 |
+
"""
|
| 10 |
+
|
| 11 |
+
from __future__ import annotations
|
| 12 |
+
|
| 13 |
+
import threading
|
| 14 |
+
import time
|
| 15 |
+
|
| 16 |
+
import pytest
|
| 17 |
+
|
| 18 |
+
from picarones.domain import Artifact, ArtifactType, DocumentRef
|
| 19 |
+
from picarones.pipeline import (
|
| 20 |
+
CorpusRunner,
|
| 21 |
+
PipelineExecutor,
|
| 22 |
+
PipelineSpec,
|
| 23 |
+
PipelineStep,
|
| 24 |
+
RunContext,
|
| 25 |
+
)
|
| 26 |
+
|
| 27 |
+
|
| 28 |
+
class _SlowAdapter:
|
| 29 |
+
"""Adapter qui dort un certain temps avant de retourner."""
|
| 30 |
+
|
| 31 |
+
name = "slow"
|
| 32 |
+
input_types = frozenset({ArtifactType.IMAGE})
|
| 33 |
+
output_types = frozenset({ArtifactType.RAW_TEXT})
|
| 34 |
+
execution_mode = "io"
|
| 35 |
+
|
| 36 |
+
def __init__(self, sleep_seconds: float) -> None:
|
| 37 |
+
self._sleep = sleep_seconds
|
| 38 |
+
|
| 39 |
+
def execute(self, inputs, params, context):
|
| 40 |
+
time.sleep(self._sleep)
|
| 41 |
+
return {
|
| 42 |
+
ArtifactType.RAW_TEXT: Artifact(
|
| 43 |
+
id=f"{context.document_id}:raw_text",
|
| 44 |
+
document_id=context.document_id,
|
| 45 |
+
type=ArtifactType.RAW_TEXT,
|
| 46 |
+
),
|
| 47 |
+
}
|
| 48 |
+
|
| 49 |
+
|
| 50 |
+
def _build(adapter, *, timeout: float, max_in_flight: int = 2):
|
| 51 |
+
registry = {"slow": adapter}
|
| 52 |
+
exe = PipelineExecutor(adapter_resolver=lambda n: registry[n])
|
| 53 |
+
runner = CorpusRunner(
|
| 54 |
+
exe,
|
| 55 |
+
max_in_flight=max_in_flight,
|
| 56 |
+
timeout_seconds_per_doc=timeout,
|
| 57 |
+
poll_interval_seconds=0.01,
|
| 58 |
+
)
|
| 59 |
+
spec = PipelineSpec(
|
| 60 |
+
name="t", initial_inputs=(ArtifactType.IMAGE,),
|
| 61 |
+
steps=(PipelineStep(
|
| 62 |
+
id="s", kind="ocr", adapter_name="slow",
|
| 63 |
+
input_types=(ArtifactType.IMAGE,),
|
| 64 |
+
output_types=(ArtifactType.RAW_TEXT,),
|
| 65 |
+
),),
|
| 66 |
+
)
|
| 67 |
+
return runner, spec
|
| 68 |
+
|
| 69 |
+
|
| 70 |
+
def _factories():
|
| 71 |
+
def inputs(doc):
|
| 72 |
+
return {ArtifactType.IMAGE: Artifact(
|
| 73 |
+
id=f"{doc.id}:image",
|
| 74 |
+
document_id=doc.id,
|
| 75 |
+
type=ArtifactType.IMAGE,
|
| 76 |
+
)}
|
| 77 |
+
|
| 78 |
+
def ctx(doc):
|
| 79 |
+
return RunContext(
|
| 80 |
+
document_id=doc.id, code_version="1.0.0", pipeline_name="t",
|
| 81 |
+
)
|
| 82 |
+
return inputs, ctx
|
| 83 |
+
|
| 84 |
+
|
| 85 |
+
def test_doc_timed_out_when_exceeds_timeout() -> None:
|
| 86 |
+
"""Step qui dort 0.5s, timeout 0.1s → status timed_out."""
|
| 87 |
+
adapter = _SlowAdapter(sleep_seconds=0.5)
|
| 88 |
+
runner, spec = _build(adapter, timeout=0.1, max_in_flight=1)
|
| 89 |
+
inputs, ctx = _factories()
|
| 90 |
+
docs = [DocumentRef(id="slow_one", image_uri="/tmp/x.png")]
|
| 91 |
+
|
| 92 |
+
t0 = time.perf_counter()
|
| 93 |
+
result = runner.run(spec, docs, inputs, ctx)
|
| 94 |
+
elapsed = time.perf_counter() - t0
|
| 95 |
+
|
| 96 |
+
assert result.n_timed_out == 1
|
| 97 |
+
assert result.outcomes[0].status == "timed_out"
|
| 98 |
+
assert "timeout" in (result.outcomes[0].error or "")
|
| 99 |
+
# Le run principal a rendu la main rapidement (ne s'est pas bloqué
|
| 100 |
+
# sur le sleep complet — le thread continue mais on n'attend plus).
|
| 101 |
+
assert elapsed < 0.3, f"runner s'est bloqué : {elapsed:.2f}s"
|
| 102 |
+
|
| 103 |
+
|
| 104 |
+
def test_timeout_measured_from_real_start_not_submission() -> None:
|
| 105 |
+
"""Bug historique : avec un seul worker (max_in_flight=1) et 4
|
| 106 |
+
documents, les 3 derniers attendent en queue. L'ancien runner
|
| 107 |
+
aurait marqué ces 3 docs timeout dès que la queue dépassait le
|
| 108 |
+
timeout. Le nouveau runner ne marque timeout que les docs qui
|
| 109 |
+
ont **réellement** dépassé le délai en exécution."""
|
| 110 |
+
# Adapter qui dort 50ms — bien sous le timeout de 500ms.
|
| 111 |
+
adapter = _SlowAdapter(sleep_seconds=0.05)
|
| 112 |
+
runner, spec = _build(adapter, timeout=0.5, max_in_flight=1)
|
| 113 |
+
inputs, ctx = _factories()
|
| 114 |
+
docs = [DocumentRef(id=f"d{i}") for i in range(4)]
|
| 115 |
+
|
| 116 |
+
result = runner.run(spec, docs, inputs, ctx)
|
| 117 |
+
|
| 118 |
+
# Les 4 docs auraient pris ~0.2s en série, ce qui dépasse le
|
| 119 |
+
# timeout de 0.5s **si** le runner mesurait depuis la submission
|
| 120 |
+
# du dernier doc. Mais comme on mesure depuis le début réel
|
| 121 |
+
# de chaque doc, aucun ne devrait timeout.
|
| 122 |
+
assert result.n_succeeded == 4
|
| 123 |
+
assert result.n_timed_out == 0
|
| 124 |
+
|
| 125 |
+
|
| 126 |
+
def test_some_docs_succeed_others_timeout() -> None:
|
| 127 |
+
"""Mix : la moitié des docs sont rapides, l'autre lente. Avec
|
| 128 |
+
un timeout intermédiaire, les rapides réussissent et les lents
|
| 129 |
+
timeout."""
|
| 130 |
+
|
| 131 |
+
class _ConditionalSlow:
|
| 132 |
+
name = "cond"
|
| 133 |
+
input_types = frozenset({ArtifactType.IMAGE})
|
| 134 |
+
output_types = frozenset({ArtifactType.RAW_TEXT})
|
| 135 |
+
execution_mode = "io"
|
| 136 |
+
|
| 137 |
+
def execute(self, inputs, params, context):
|
| 138 |
+
# Les docs avec id pair sont rapides.
|
| 139 |
+
if int(context.document_id.removeprefix("d")) % 2 == 0:
|
| 140 |
+
time.sleep(0.01)
|
| 141 |
+
else:
|
| 142 |
+
time.sleep(0.5)
|
| 143 |
+
return {
|
| 144 |
+
ArtifactType.RAW_TEXT: Artifact(
|
| 145 |
+
id=f"{context.document_id}:raw_text",
|
| 146 |
+
document_id=context.document_id,
|
| 147 |
+
type=ArtifactType.RAW_TEXT,
|
| 148 |
+
),
|
| 149 |
+
}
|
| 150 |
+
|
| 151 |
+
adapter = _ConditionalSlow()
|
| 152 |
+
runner, spec = _build(adapter, timeout=0.1, max_in_flight=2)
|
| 153 |
+
inputs, ctx = _factories()
|
| 154 |
+
docs = [DocumentRef(id=f"d{i}") for i in range(6)]
|
| 155 |
+
|
| 156 |
+
result = runner.run(spec, docs, inputs, ctx)
|
| 157 |
+
assert result.n_succeeded == 3 # pairs : d0, d2, d4
|
| 158 |
+
assert result.n_timed_out == 3 # impairs : d1, d3, d5
|