cy0307 committed on 3 days ago

Commit

146ae33

verified ·

1 Parent(s): 965d0da

Add files using upload-large-folder tool

Files changed (43)

ARTIFACT_GUIDE.md +3 -3
EVALUATION_PROTOCOL.md +22 -22
FIGURE_INDEX.md +2 -2
PROJECT_README.md +17 -25
PROJECT_STATUS.md +2 -2
README.md +17 -25
RESEARCH_TAKEAWAYS.md +1 -1
TASK_METHOD_20_GAP_AUDIT.md +1 -1
TASK_SUITE_20.md +22 -22
data/artifact_index.json +64 -64
data/evaluation_protocol.json +23 -23
data/live_publication_status.json +0 -0
data/mirror_parity.json +0 -0
data/omni_model_comparison.json +2 -2
data/project_manifest.json +3 -4
data/project_packet.json +3 -4
data/project_status.json +5 -6
data/publication_audit.json +1 -1
data/quality_gates.json +1 -1
data/reproducibility_matrix.json +4 -4
data/research_takeaways.json +2 -2
data/scope_claims_audit.json +1 -1
data/single_episode_task_model_radar.json +21 -21
data/source_alignment_audit.json +1 -1
data/task_method_20_gap_audit.json +1 -1
data/task_method_20_result_matrix.json +1 -1
data/task_suite_20.json +46 -46
data/task_surface_integrity.json +1 -1
data/tier2_task_suite.json +24 -25
data/unified_task_model_radar.json +21 -21
data/website_integrity.json +24 -31
index.html +12 -70
metrics/episode128_task_model_radar.json +21 -21
metrics/figure_index.json +7 -7
metrics/live_publication_status.json +0 -0
metrics/omni_model_comparison.json +2 -2
metrics/project_brief.json +1 -1
metrics/project_packet.json +3 -4
metrics/public_surface_qa.json +7 -7
metrics/reproducibility_matrix.json +4 -4
metrics/research_takeaways.json +2 -2
metrics/task_method_20_gap_audit.json +1 -1
metrics/task_surface_integrity.json +1 -1

ARTIFACT_GUIDE.md CHANGED

@@ -20,7 +20,7 @@ Xperience-native pretraining goal.
 | [`XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md`](XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md) | Describes the future full-corpus Xperience Embodied Foundation Model goal, including modules, objectives, staged scale-up, hardware ranges, and evaluation. |
 | [`EVALUATION_PROTOCOL.md`](EVALUATION_PROTOCOL.md) | Defines the task unit, chronological split, metrics, leakage controls, and current limitations. |
 | [`REPRODUCIBILITY.md`](REPRODUCIBILITY.md) | Defines public reproduction commands, expected outputs, and unreproducible boundaries. |
-| [`results/audio_ablation/AUDIO_ABLATION_SUMMARY.md`](results/audio_ablation/AUDIO_ABLATION_SUMMARY.md) | Shows measured current-audio and raw log-mel replacement deltas across the original task contracts. |
 | [`docs/single_episode_explorer.html`](docs/single_episode_explorer.html) | Gives a static window-level explorer for the public sample episode. |
 | [`XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`](XPERIENCE10M_DATASET_CARD_ALIGNMENT.md) | Optional detail for readers who need official dataset and access-term context. |
@@ -74,13 +74,13 @@ Xperience-native pretraining goal.
 | --- | --- |
 | [`TASK_SUITE_20.md`](TASK_SUITE_20.md) | Reader-facing table for the unified 20-task suite. |
 | [`docs/data/task_suite_20.json`](docs/data/task_suite_20.json) | Machine-readable unified 20-task suite for the website and Hugging Face mirrors. |
-| [`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json) | The original task contracts, chronological split, and minimal/neural metrics. |
 | [`results/episode_task_suite/neural_mlp/`](results/episode_task_suite/neural_mlp/) | Matching PyTorch MLP heads for the same task contracts and feature windows. |
 | [`results/episode_task_suite/research_directions/`](results/episode_task_suite/research_directions/) | Mapping from the unified 20-task suite to the four Ropedia research directions. |
 | [`results/episode_task_suite/research_direction_extensions/`](results/episode_task_suite/research_direction_extensions/) | Four additional coded probes, one per research direction. |
 | [`results/episode_task_suite/tier2_task_suite/`](results/episode_task_suite/tier2_task_suite/) | Historical provenance path inside the unified 20-task suite. |
 | [`results/episode_task_suite/task_walkthroughs/`](results/episode_task_suite/task_walkthroughs/) | Human-readable research names and case studies explaining input, process modules, output, metric, limitation, and the website task-player data. |
-| [`results/audio_ablation/audio_ablation_metrics.csv`](results/audio_ablation/audio_ablation_metrics.csv) | All measured audio rows for the original task contracts across six variants, including no-audio, audio-only, alternate-audio-only, representation replacement, and all-input variants. |
 | [`results/audio_ablation/audio_delta_summary.csv`](results/audio_ablation/audio_delta_summary.csv) | Compact per-task audio delta table for quick manual inspection. |
 | [`scripts/audio_ablation_and_raw_upgrade.py`](scripts/audio_ablation_and_raw_upgrade.py) | Regenerates audio contribution results from real task-suite artifacts plus the local public-sample MP4. |
 | [`scripts/validate_task_surface.py`](scripts/validate_task_surface.py) | Fails publication if public task cards drift back to raw artifact ids or lose their thumbnail/player wiring. |

 | [`XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md`](XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md) | Describes the future full-corpus Xperience Embodied Foundation Model goal, including modules, objectives, staged scale-up, hardware ranges, and evaluation. |
 | [`EVALUATION_PROTOCOL.md`](EVALUATION_PROTOCOL.md) | Defines the task unit, chronological split, metrics, leakage controls, and current limitations. |
 | [`REPRODUCIBILITY.md`](REPRODUCIBILITY.md) | Defines public reproduction commands, expected outputs, and unreproducible boundaries. |
+| [`results/audio_ablation/AUDIO_ABLATION_SUMMARY.md`](results/audio_ablation/AUDIO_ABLATION_SUMMARY.md) | Shows measured current-audio and raw log-mel replacement deltas across the walkthrough-backed task contracts. |
 | [`docs/single_episode_explorer.html`](docs/single_episode_explorer.html) | Gives a static window-level explorer for the public sample episode. |
 | [`XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`](XPERIENCE10M_DATASET_CARD_ALIGNMENT.md) | Optional detail for readers who need official dataset and access-term context. |
 | --- | --- |
 | [`TASK_SUITE_20.md`](TASK_SUITE_20.md) | Reader-facing table for the unified 20-task suite. |
 | [`docs/data/task_suite_20.json`](docs/data/task_suite_20.json) | Machine-readable unified 20-task suite for the website and Hugging Face mirrors. |
+| [`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json) | The walkthrough-backed task contracts, chronological split, and minimal/neural metrics. |
 | [`results/episode_task_suite/neural_mlp/`](results/episode_task_suite/neural_mlp/) | Matching PyTorch MLP heads for the same task contracts and feature windows. |
 | [`results/episode_task_suite/research_directions/`](results/episode_task_suite/research_directions/) | Mapping from the unified 20-task suite to the four Ropedia research directions. |
 | [`results/episode_task_suite/research_direction_extensions/`](results/episode_task_suite/research_direction_extensions/) | Four additional coded probes, one per research direction. |
 | [`results/episode_task_suite/tier2_task_suite/`](results/episode_task_suite/tier2_task_suite/) | Historical provenance path inside the unified 20-task suite. |
 | [`results/episode_task_suite/task_walkthroughs/`](results/episode_task_suite/task_walkthroughs/) | Human-readable research names and case studies explaining input, process modules, output, metric, limitation, and the website task-player data. |
+| [`results/audio_ablation/audio_ablation_metrics.csv`](results/audio_ablation/audio_ablation_metrics.csv) | All measured audio rows for the walkthrough-backed task contracts across six variants, including no-audio, audio-only, alternate-audio-only, representation replacement, and all-input variants. |
 | [`results/audio_ablation/audio_delta_summary.csv`](results/audio_ablation/audio_delta_summary.csv) | Compact per-task audio delta table for quick manual inspection. |
 | [`scripts/audio_ablation_and_raw_upgrade.py`](scripts/audio_ablation_and_raw_upgrade.py) | Regenerates audio contribution results from real task-suite artifacts plus the local public-sample MP4. |
 | [`scripts/validate_task_surface.py`](scripts/validate_task_surface.py) | Fails publication if public task cards drift back to raw artifact ids or lose their thumbnail/player wiring. |

EVALUATION_PROTOCOL.md CHANGED

@@ -50,28 +50,28 @@ All 20 public-sample task contracts are presented together under the same
 minimal/neural baseline setup. Historical `tier2_task_suite` paths are
 retained only as stable provenance artifact locations inside the unified suite.
-| # | Task | Artifact id | Origin | Family | Unit | Input -> target | Primary metric | Minimal | Neural |
-| ---: | --- | --- | --- | --- | --- | --- | --- | ---: | ---: |
-| 1 | Action Recognition | `timeline_action` | original | supervised classification | single window | current 20-frame all-feature window -> current action label | macro_f1 (higher better) | 0.0500 | 0.0148 |
-| 2 | Procedure Step Recognition | `timeline_subtask` | original | supervised classification | single window | current 20-frame all-feature window -> current subtask label | macro_f1 (higher better) | 0.0506 | 0.0281 |
-| 3 | Action Boundary Detection | `transition_detection` | original | temporal diagnostic | single window | current 20-frame all-feature window -> action boundary versus steady | macro_f1 (higher better) | 0.6118 | 0.5862 |
-| 4 | Next-Action Prediction | `next_action` | original | short-horizon prediction | single window | current 20-frame all-feature window at time t -> action label at t + 20 frames | macro_f1 (higher better) | 0.0593 | 0.0419 |
-| 5 | Hand Trajectory Forecasting | `hand_trajectory_forecast` | original | trajectory regression | single window | current all-feature window -> future left/right hand 3D joints for 10 frames | mpjpe (lower better) | 0.8647 | 0.1079 |
-| 6 | Contact State Prediction | `contact_prediction` | original | binary classification | single window | non-contact and non-caption feature blocks -> any body contact | macro_f1 (higher better) | 1.0000 | 1.0000 |
-| 7 | Object Relevance Prediction | `object_relevance` | original | multi-label classification | single window | non-caption feature blocks -> current relevant object set | micro_f1 (higher better) | 0.1803 | 0.1679 |
-| 8 | Language Grounding | `caption_grounding` | original | retrieval | caption query | caption object/interaction query plus candidate sensor windows -> matching time window | mrr (higher better) | 0.0160 | 0.0168 |
-| 9 | Cross-Modal Retrieval | `cross_modal_retrieval` | original | retrieval | sensor query | motion, IMU, and camera query features -> matching depth/video window | top5_accuracy (higher better) | 0.3678 | 0.1983 |
-| 10 | Cross-Modal Reconstruction | `modality_reconstruction` | original | cross-modal regression | single window | motion, IMU, and camera features -> depth/video feature vector | r2 (higher better) | -0.0153 | -0.0102 |
-| 11 | Temporal Order Verification | `temporal_order` | original | pairwise diagnostic | adjacent window pair | two adjacent windows -> correct versus reversed order | f1 (higher better) | 0.5400 | 0.8520 |
-| 12 | Multimodal Synchronization Detection | `misalignment_detection` | original | pairwise diagnostic | paired modality window | motion side plus visual/depth side -> aligned versus shifted by 8 windows | f1 (higher better) | 0.5052 | 0.7153 |
-| 13 | Long-Horizon Next-Action Forecasting | `long_horizon_next_action` | additional | classification | single aligned window | Current 20-frame non-caption multimodal window. -> Action label five seconds later. | macro_f1 (higher better) | 0.0750 | 0.0655 |
-| 14 | Long-Horizon Next-Subtask Forecasting | `next_subtask_forecast` | additional | classification | single aligned window | Current 20-frame non-caption multimodal window. -> Procedure subtask label five seconds later. | macro_f1 (higher better) | 0.0455 | 0.0507 |
-| 15 | Interaction Text Prediction | `interaction_text_prediction` | additional | classification | single aligned window | Current 20-frame sensor window with caption-text features removed. -> Raw annotation interaction phrase for the same window. | macro_f1 (higher better) | 0.0444 | 0.0381 |
-| 16 | Action-Object Relation Prediction | `action_object_relation` | additional | classification | single aligned window | Current 20-frame sensor window with caption-text features removed. -> Joint action plus active object-set relation. | macro_f1 (higher better) | 0.0000 | 0.0000 |
-| 17 | Future Object-Set Forecasting | `object_set_forecast` | additional | multi_label | single aligned window | Current 20-frame sensor window with caption-text features removed. -> Object set active five seconds later. | micro_f1 (higher better) | 0.1694 | 0.1972 |
-| 18 | IMU-to-Hand Pose Reconstruction | `imu_to_hand_pose` | additional | regression | single aligned window | Current IMU acceleration/gyroscope feature block only. -> Current left/right hand joint feature blocks. | mae (lower better) | 0.0420 | 0.0426 |
-| 19 | Camera-View Synchronization Retrieval | `camera_view_sync_retrieval` | additional | retrieval | held-out query window | Fisheye camera-1 feature query projected into fisheye camera-3 feature space. -> The synchronized held-out camera-3 window. | mrr (higher better) | 0.4943 | 0.2409 |
-| 20 | Time-to-Next-Transition Regression | `time_to_transition` | additional | regression | single aligned window | Current 20-frame non-caption multimodal window. -> Frames until the next action-label boundary, capped at 200 frames. | mae (lower better) | 10.5374 | 10.5545 |
 ## Leakage Controls

 minimal/neural baseline setup. Historical `tier2_task_suite` paths are
 retained only as stable provenance artifact locations inside the unified suite.
+| # | Task | Artifact id | Family | Unit | Input -> target | Primary metric | Minimal | Neural |
+| ---: | --- | --- | --- | --- | --- | --- | ---: | ---: |
+| 1 | Action Recognition | `timeline_action` | supervised classification | single window | current 20-frame all-feature window -> current action label | macro_f1 (higher better) | 0.0500 | 0.0148 |
+| 2 | Procedure Step Recognition | `timeline_subtask` | supervised classification | single window | current 20-frame all-feature window -> current subtask label | macro_f1 (higher better) | 0.0506 | 0.0281 |
+| 3 | Action Boundary Detection | `transition_detection` | temporal diagnostic | single window | current 20-frame all-feature window -> action boundary versus steady | macro_f1 (higher better) | 0.6118 | 0.5862 |
+| 4 | Next-Action Prediction | `next_action` | short-horizon prediction | single window | current 20-frame all-feature window at time t -> action label at t + 20 frames | macro_f1 (higher better) | 0.0593 | 0.0419 |
+| 5 | Hand Trajectory Forecasting | `hand_trajectory_forecast` | trajectory regression | single window | current all-feature window -> future left/right hand 3D joints for 10 frames | mpjpe (lower better) | 0.8647 | 0.1079 |
+| 6 | Contact State Prediction | `contact_prediction` | binary classification | single window | non-contact and non-caption feature blocks -> any body contact | macro_f1 (higher better) | 1.0000 | 1.0000 |
+| 7 | Object Relevance Prediction | `object_relevance` | multi-label classification | single window | non-caption feature blocks -> current relevant object set | micro_f1 (higher better) | 0.1803 | 0.1679 |
+| 8 | Language Grounding | `caption_grounding` | retrieval | caption query | caption object/interaction query plus candidate sensor windows -> matching time window | mrr (higher better) | 0.0160 | 0.0168 |
+| 9 | Cross-Modal Retrieval | `cross_modal_retrieval` | retrieval | sensor query | motion, IMU, and camera query features -> matching depth/video window | top5_accuracy (higher better) | 0.3678 | 0.1983 |
+| 10 | Cross-Modal Reconstruction | `modality_reconstruction` | cross-modal regression | single window | motion, IMU, and camera features -> depth/video feature vector | r2 (higher better) | -0.0153 | -0.0102 |
+| 11 | Temporal Order Verification | `temporal_order` | pairwise diagnostic | adjacent window pair | two adjacent windows -> correct versus reversed order | f1 (higher better) | 0.5400 | 0.8520 |
+| 12 | Multimodal Synchronization Detection | `misalignment_detection` | pairwise diagnostic | paired modality window | motion side plus visual/depth side -> aligned versus shifted by 8 windows | f1 (higher better) | 0.5052 | 0.7153 |
+| 13 | Long-Horizon Next-Action Forecasting | `long_horizon_next_action` | classification | single aligned window | Current 20-frame non-caption multimodal window. -> Action label five seconds later. | macro_f1 (higher better) | 0.0750 | 0.0655 |
+| 14 | Long-Horizon Next-Subtask Forecasting | `next_subtask_forecast` | classification | single aligned window | Current 20-frame non-caption multimodal window. -> Procedure subtask label five seconds later. | macro_f1 (higher better) | 0.0455 | 0.0507 |
+| 15 | Interaction Text Prediction | `interaction_text_prediction` | classification | single aligned window | Current 20-frame sensor window with caption-text features removed. -> Raw annotation interaction phrase for the same window. | macro_f1 (higher better) | 0.0444 | 0.0381 |
+| 16 | Action-Object Relation Prediction | `action_object_relation` | classification | single aligned window | Current 20-frame sensor window with caption-text features removed. -> Joint action plus active object-set relation. | macro_f1 (higher better) | 0.0000 | 0.0000 |
+| 17 | Future Object-Set Forecasting | `object_set_forecast` | multi_label | single aligned window | Current 20-frame sensor window with caption-text features removed. -> Object set active five seconds later. | micro_f1 (higher better) | 0.1694 | 0.1972 |
+| 18 | IMU-to-Hand Pose Reconstruction | `imu_to_hand_pose` | regression | single aligned window | Current IMU acceleration/gyroscope feature block only. -> Current left/right hand joint feature blocks. | mae (lower better) | 0.0420 | 0.0426 |
+| 19 | Camera-View Synchronization Retrieval | `camera_view_sync_retrieval` | retrieval | held-out query window | Fisheye camera-1 feature query projected into fisheye camera-3 feature space. -> The synchronized held-out camera-3 window. | mrr (higher better) | 0.4943 | 0.2409 |
+| 20 | Time-to-Next-Transition Regression | `time_to_transition` | regression | single aligned window | Current 20-frame non-caption multimodal window. -> Frames until the next action-label boundary, capped at 200 frames. | mae (lower better) | 10.5374 | 10.5545 |
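
Because the table mixes higher-is-better metrics (macro_f1, micro_f1, mrr, top5_accuracy, r2, f1) with lower-is-better ones (mpjpe, mae), comparing the Minimal and Neural columns by eye is error-prone. The following is a minimal sketch of a direction-aware tally, using the values copied from the rows above. The names `ROWS`, `winner`, and `tally` are illustrative assumptions and are not part of the repository.

```python
from collections import Counter

# (artifact_id, higher_is_better, minimal, neural), copied from the table above.
ROWS = [
    ("timeline_action", True, 0.0500, 0.0148),
    ("timeline_subtask", True, 0.0506, 0.0281),
    ("transition_detection", True, 0.6118, 0.5862),
    ("next_action", True, 0.0593, 0.0419),
    ("hand_trajectory_forecast", False, 0.8647, 0.1079),
    ("contact_prediction", True, 1.0000, 1.0000),
    ("object_relevance", True, 0.1803, 0.1679),
    ("caption_grounding", True, 0.0160, 0.0168),
    ("cross_modal_retrieval", True, 0.3678, 0.1983),
    ("modality_reconstruction", True, -0.0153, -0.0102),
    ("temporal_order", True, 0.5400, 0.8520),
    ("misalignment_detection", True, 0.5052, 0.7153),
    ("long_horizon_next_action", True, 0.0750, 0.0655),
    ("next_subtask_forecast", True, 0.0455, 0.0507),
    ("interaction_text_prediction", True, 0.0444, 0.0381),
    ("action_object_relation", True, 0.0000, 0.0000),
    ("object_set_forecast", True, 0.1694, 0.1972),
    ("imu_to_hand_pose", False, 0.0420, 0.0426),
    ("camera_view_sync_retrieval", True, 0.4943, 0.2409),
    ("time_to_transition", False, 10.5374, 10.5545),
]


def winner(higher_is_better, minimal, neural):
    """Return which head scores better once metric direction is respected."""
    if minimal == neural:
        return "tie"
    neural_wins = neural > minimal if higher_is_better else neural < minimal
    return "neural" if neural_wins else "minimal"


# Count how many tasks each head wins on its primary metric.
tally = Counter(winner(h, m, n) for _, h, m, n in ROWS)
print(dict(tally))
```

On these 20 rows the tally is 11 tasks for the minimal head, 7 for the neural head, and 2 ties (`contact_prediction` and `action_object_relation`). It counts wins only and ignores margin, so near-equal pairs such as `time_to_transition` weigh the same as large gaps such as `hand_trajectory_forecast`.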
 ## Leakage Controls

FIGURE_INDEX.md CHANGED

@@ -14,13 +14,13 @@ Public figures, diagrams, charts, and derived modality thumbnails. Raw Xperience
 | Project logo mark | `docs/assets/brand/xperience10m-logo-mark-512.png` | 512 x 512 | `scripts/build_brand_assets.py` | Primary X-shaped multimodal camera mark used for the website header, README, HF cards, and brand identity. |
 | Project logo social card | `docs/assets/brand/xperience10m-logo-social-card.png` | 1200 x 630 | `scripts/build_brand_assets.py` | Large preview image for README, Hugging Face cards, and Open Graph/Twitter social sharing. |
 | Project favicon | `docs/assets/brand/xperience10m-logo-favicon-64.png` | 64 x 64 | `scripts/build_brand_assets.py` | Small dark-tile logo for browser tabs and compact navigation. |
-| Original task-suite infographic | `docs/assets/task_suite_infographic.png` | 1800 x 7600 | `scripts/render_task_suite_infographic.py` | Primary visual map of the original task families, verified metrics, and sample modalities; the unified public suite is now documented as 20 tasks. |
 | Episode-to-task pipeline diagram | `docs/assets/pipeline_diagram.png` | 1800 x 1120 | `scripts/generate_visualizations.py` | End-to-end data processing and evaluation pipeline overview. |
 | Qwen3-Omni LoRA training pipeline | `docs/assets/qwen3_omni_lora_pipeline.png` | 1536 x 1024 | `docs/assets/qwen3_omni_lora_pipeline.prompt.md` | Detailed raw-data-to-adapter flow for staged Xperience-10M Qwen3-Omni LoRA training. |
 | Spatial intelligence slide diagram | `docs/assets/foundation-pipelines/spatial-intelligence-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the spatial intelligence pipeline track. |
 | Human-video world model slide diagram | `docs/assets/foundation-pipelines/human-video-world-model-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the human-video world-model pipeline track. |
 | Vision-language-action slide diagram | `docs/assets/foundation-pipelines/vision-language-action-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the VLA/action-policy pipeline track. |
-| Minimal and neural task architecture map | `docs/assets/task_architectures.png` | 1800 x 2450 | `scripts/render_overview_figures.py` | Minimal and neural heads for the original task contracts and shared feature contracts. |
 | Video modality thumbnail | `docs/assets/modalities/video.jpg` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived thumbnail for synchronized camera streams. |
 | Audio modality thumbnail | `docs/assets/modalities/audio.png` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived waveform thumbnail for the MP4 AAC stream. |
 | Depth modality thumbnail | `docs/assets/modalities/depth.jpg` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived depth and confidence thumbnail. |

 | Project logo mark | `docs/assets/brand/xperience10m-logo-mark-512.png` | 512 x 512 | `scripts/build_brand_assets.py` | Primary X-shaped multimodal camera mark used for the website header, README, HF cards, and brand identity. |
 | Project logo social card | `docs/assets/brand/xperience10m-logo-social-card.png` | 1200 x 630 | `scripts/build_brand_assets.py` | Large preview image for README, Hugging Face cards, and Open Graph/Twitter social sharing. |
 | Project favicon | `docs/assets/brand/xperience10m-logo-favicon-64.png` | 64 x 64 | `scripts/build_brand_assets.py` | Small dark-tile logo for browser tabs and compact navigation. |
+| Original task-suite infographic | `docs/assets/task_suite_infographic.png` | 1800 x 7600 | `scripts/render_task_suite_infographic.py` | Primary visual map of the walkthrough-backed task families, verified metrics, and sample modalities; the unified public suite is documented as 20 tasks. |
 | Episode-to-task pipeline diagram | `docs/assets/pipeline_diagram.png` | 1800 x 1120 | `scripts/generate_visualizations.py` | End-to-end data processing and evaluation pipeline overview. |
 | Qwen3-Omni LoRA training pipeline | `docs/assets/qwen3_omni_lora_pipeline.png` | 1536 x 1024 | `docs/assets/qwen3_omni_lora_pipeline.prompt.md` | Detailed raw-data-to-adapter flow for staged Xperience-10M Qwen3-Omni LoRA training. |
 | Spatial intelligence slide diagram | `docs/assets/foundation-pipelines/spatial-intelligence-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the spatial intelligence pipeline track. |
 | Human-video world model slide diagram | `docs/assets/foundation-pipelines/human-video-world-model-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the human-video world-model pipeline track. |
 | Vision-language-action slide diagram | `docs/assets/foundation-pipelines/vision-language-action-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the VLA/action-policy pipeline track. |
+| Minimal and neural task architecture map | `docs/assets/task_architectures.png` | 1800 x 2450 | `scripts/render_overview_figures.py` | Minimal and neural heads for the walkthrough-backed task contracts and shared feature contracts. |
 | Video modality thumbnail | `docs/assets/modalities/video.jpg` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived thumbnail for synchronized camera streams. |
 | Audio modality thumbnail | `docs/assets/modalities/audio.png` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived waveform thumbnail for the MP4 AAC stream. |
 | Depth modality thumbnail | `docs/assets/modalities/depth.jpg` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived depth and confidence thumbnail. |

PROJECT_README.md CHANGED

@@ -850,9 +850,9 @@ and verified Qwen3-Omni/Cosmos3 diagnostic artifacts.
 scripts/
   train_min_action_model.py         # motion/IMU baseline
   train_all_modalities_model.py     # current all-feature lightweight baseline
-  episode_task_suite.py             # original end-to-end task definitions
   neural_task_models.py             # optional PyTorch MLP heads for task contracts
-  research_direction_taxonomy.py    # maps original tasks to the four research tracks
   research_direction_extension_tasks.py # one extra data-backed probe per track
   tier2_task_suite.py              # historical-name provenance builder for unified task rows
   build_unified_task_suite.py       # builds TASK_SUITE_20.md and task_suite_20.json
@@ -890,7 +890,7 @@ results/
     research_directions/            # four-track taxonomy, CSV, and summary
     research_direction_extensions/  # four extra direction probes + predictions
     tier2_task_suite/               # provenance baseline tasks + predictions; historical path
-    task_walkthroughs/              # case-study walkthroughs for original tasks
   omni_exploration/                 # ModelScope readiness-check artifacts
   omni_finetune/model_output_task_probes_20260616/ # task-13/task-16 probes derived from verified model JSON
@@ -1028,7 +1028,7 @@ cd ropedia-xperience-10m-task-suite
 python scripts/episode_task_suite.py --workspace /path/to/workspace
 ```
-Run the original task definitions with lightweight neural heads:
 ```bash
 pip install torch
@@ -1449,7 +1449,7 @@ and [`docs/data/additional_development_directions.json`](docs/data/additional_de
 ## Four Research Directions
-The original task contracts are organized against the four Ropedia research directions in
 a generated artifact, not only in prose:
 - [`research_direction_taxonomy.json`](results/episode_task_suite/research_directions/research_direction_taxonomy.json)
@@ -1475,13 +1475,13 @@ Current direction-level coverage:
 The important interpretation is that all four directions can be **started** from
 the Xperience-10M sample modalities, but only direction C is strongly represented
-by the original task suite. Directions A, B, and D need additional targets and
 multi-episode training before they become full research deliverables.
-## Four Direction-Extension Probes
-Beyond the original task contracts, the repo now includes one extra data-backed
-probe for each research direction. These probes are computed from the same
 `shared_windows.npz`, `windows.csv`, and `feature_manifest.json` artifacts, so
 the reported numbers are computed from sample-derived features and saved metric artifacts.
@@ -1543,18 +1543,10 @@ unified 20-task suite, not as a separate benchmark tier.
 ![128-episode 20-task model radar](docs/assets/charts/episode128_task_model_radar.svg)
-![Unified 20-task provenance chart](docs/assets/charts/tier2_task_suite.svg)
-| # | Task | Input | Output | Minimal | Neural MLP | Meaning |
-| ---: | --- | --- | --- | ---: | ---: | --- |
-| 13 | Long-Horizon Next-Action Forecasting | current non-caption multimodal window | action label five seconds later | `0.0750` macro-F1 | `0.0655` macro-F1 | Tests procedure context beyond the one-second next-action task. |
-| 14 | Long-Horizon Next-Subtask Forecasting | current non-caption multimodal window | subtask five seconds later | `0.0455` macro-F1 | `0.0507` macro-F1 | Moves anticipation from low-level action to high-level procedure state. |
-| 15 | Interaction Text Prediction | current sensor window without caption text | raw interaction phrase | `0.0444` macro-F1 | `0.0381` macro-F1 | Uses the original annotation interaction text instead of only hashed features. |
-| 16 | Action-Object Relation Prediction | current sensor window without caption text | joint action plus object-set label | `0.0000` macro-F1 | `0.0000` macro-F1 | Exposes a hard binding target for action-object reasoning. |
-| 17 | Future Object-Set Forecasting | current sensor window without caption text | object set five seconds later | `0.1694` micro-F1 | `0.1972` micro-F1 | Predicts which objects become relevant soon. |
-| 18 | IMU-to-Hand Pose Reconstruction | IMU feature block only | current left/right hand joints | `0.0420` MAE | `0.0426` MAE | Tests inertial-to-hand sensor bridging. |
-| 19 | Camera-View Synchronization Retrieval | fisheye camera-1 query | synchronized fisheye camera-3 window | `0.4943` MRR | `0.2409` MRR | Stress-tests multi-camera temporal alignment. |
-| 20 | Time-to-Next-Transition Regression | current non-caption multimodal window | capped frames until next action boundary | `10.5374` MAE frames | `10.5545` MAE frames | Converts boundary detection into continuous timing. |
 Run:
@@ -1632,7 +1624,7 @@ PyTorch MLP classifiers or regressors. Its outputs live under
 and the rollup is stored in the `neural_tasks` section of
 [`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json).
-The original task-specific heads are:
 | Task | Input | Minimal head | Output |
 | --- | --- | --- | --- |
@@ -1663,8 +1655,8 @@ The original task-specific heads are:
 | Neural MLP hand forecast | 0.1079 MPJPE | n/a | Same features/split, nonlinear regression head |
 | Neural MLP temporal order | 0.8520 F1 | 0.8578 | Strong improvement on adjacent-window ordering |
 | Neural MLP misalignment | 0.7153 F1 | 0.7009 | Detects shifted motion/visual/audio pairs better than the linear head |
-| Audio ablation | +0.0418 mean delta | n/a | Current audio variant improves the primary metric on 6 of the original task contracts |
-| Alternate audio representation | +0.0936 mean delta | n/a | Alternate audio-window representation improves over the baseline audio variant on 6 of the original task contracts |
 ## Audio Contribution Study
@@ -1743,7 +1735,7 @@ episodes; they are not reported as multi-episode benchmark results.
 I re-ran the full pipeline from the local raw public sample into a temporary
 local workspace and compared regenerated metrics with the committed
-artifacts. The baseline metrics, original task metrics, feature manifest, and
 available modality manifest matched exactly after float normalization.
 See [`notes/reproducibility_audit.md`](notes/reproducibility_audit.md) for the

 scripts/
   train_min_action_model.py         # motion/IMU baseline
   train_all_modalities_model.py     # current all-feature lightweight baseline
+  episode_task_suite.py             # public-sample task definitions
   neural_task_models.py             # optional PyTorch MLP heads for task contracts
+  research_direction_taxonomy.py    # maps walkthrough-backed tasks to the four research tracks
   research_direction_extension_tasks.py # one extra data-backed probe per track
   tier2_task_suite.py              # historical-name provenance builder for unified task rows
   build_unified_task_suite.py       # builds TASK_SUITE_20.md and task_suite_20.json
     research_directions/            # four-track taxonomy, CSV, and summary
     research_direction_extensions/  # four extra direction probes + predictions
     tier2_task_suite/               # provenance baseline tasks + predictions; historical path
+    task_walkthroughs/              # case-study walkthroughs for walkthrough-backed tasks
   omni_exploration/                 # ModelScope readiness-check artifacts
   omni_finetune/model_output_task_probes_20260616/ # task-13/task-16 probes derived from verified model JSON
 python scripts/episode_task_suite.py --workspace /path/to/workspace
 ```
+Run the public-sample task definitions with lightweight neural heads:
 ```bash
 pip install torch
 ## Four Research Directions
+The walkthrough-backed task contracts are organized against the four Ropedia research directions in
 a generated artifact, not only in prose:
 - [`research_direction_taxonomy.json`](results/episode_task_suite/research_directions/research_direction_taxonomy.json)
 The important interpretation is that all four directions can be **started** from
 the Xperience-10M sample modalities, but only direction C is strongly represented
+by the current task evidence. Directions A, B, and D need additional targets and
 multi-episode training before they become full research deliverables.
+## Four Direction Probes
+Alongside the unified 20-task suite, the repo includes one data-backed probe for
+each research direction. These probes are computed from the same
 `shared_windows.npz`, `windows.csv`, and `feature_manifest.json` artifacts, so
 the reported numbers are computed from sample-derived features and saved metric artifacts.
 ![128-episode 20-task model radar](docs/assets/charts/episode128_task_model_radar.svg)
+The all-task table, including every input/output contract and minimal/neural
+metric, is in [`TASK_SUITE_20.md`](TASK_SUITE_20.md). Historical provenance
+links remain listed above for exact source tracing, but the public task surface
+should be read as one integrated 20-task suite.
 Run:
 and the rollup is stored in the `neural_tasks` section of
 [`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json).
+The walkthrough-backed task heads are:
 | Task | Input | Minimal head | Output |
 | --- | --- | --- | --- |
 | Neural MLP hand forecast | 0.1079 MPJPE | n/a | Same features/split, nonlinear regression head |
 | Neural MLP temporal order | 0.8520 F1 | 0.8578 | Strong improvement on adjacent-window ordering |
 | Neural MLP misalignment | 0.7153 F1 | 0.7009 | Detects shifted motion/visual/audio pairs better than the linear head |
+| Audio ablation | +0.0418 mean delta | n/a | Current audio variant improves the primary metric on 6 walkthrough-backed task contracts |
+| Alternate audio representation | +0.0936 mean delta | n/a | Alternate audio-window representation improves over the baseline audio variant on 6 walkthrough-backed task contracts |
 ## Audio Contribution Study
 I re-ran the full pipeline from the local raw public sample into a temporary
 local workspace and compared regenerated metrics with the committed
+artifacts. The baseline metrics, task metrics, feature manifest, and
 available modality manifest matched exactly after float normalization.
 See [`notes/reproducibility_audit.md`](notes/reproducibility_audit.md) for the

PROJECT_STATUS.md CHANGED

@@ -33,7 +33,7 @@ prior multiscale release, and v6 is the current public 20-task Qwen3-Omni row.
 | Unified 20-task suite | Verified | `TASK_SUITE_20.md`, `docs/data/task_suite_20.json`, `results/episode_task_suite/`, `results/episode_task_suite/tier2_task_suite/` | All 20 task contracts have committed minimal metrics and share the same 20-frame windows, 5-frame stride, chronological split, and minimal/neural head pattern. The `tier2_task_suite` path is historical provenance inside the unified suite, not a separate public tier. |
 | 180-result method matrix | Verified complete | `docs/data/task_method_20_result_matrix.json`, `TASK_METHOD_20_RESULT_MATRIX.md`, `docs/data/task_method_20_gap_audit.json`, `docs/assets/charts/unified_task_model_radar.svg` | The public comparison matrix now has 9 methods x 20 tasks = 180/180 scored method-task records. Six rows are explicitly marked as compact-proxy scores where the public 128-episode export lacks the direct raw target. |
 | Neural heads | Verified | `scripts/neural_task_models.py`, `results/episode_task_suite/neural_mlp/` | Each task also has a compact PyTorch MLP run over the same feature tensor and chronological split. |
-| Audio contribution study | Verified | `scripts/audio_ablation_and_raw_upgrade.py`, `results/audio_ablation/`, `docs/data/audio_ablation_summary.json` | Audio variants are compared across the original task contracts; audio improves the primary metric on 6 of those contracts, and a 588-d audio-window representation improves over the baseline audio variant on 6 of those contracts. |
 | Research takeaways | Verified | `RESEARCH_TAKEAWAYS.md`, `docs/data/research_takeaways.json`, `scripts/build_research_takeaways.py` | The main result interpretation is generated from committed metrics: chronological class shift, neural gains on dynamics/order/alignment, open retrieval/reconstruction problems, and the need for held-out episodes. |
 | Research roadmap | Current | `RESEARCH_ROADMAP.md`, `docs/data/research_roadmap.json` | The roadmap connects public-sample task development to the final verified Qwen3-Omni diagnostic result, same-split baseline alignment, action/subtask error analysis, robustness runs, world/policy tracks, and the future Xperience-native pretraining goal. |
 | 128-episode task-suite enhancement pack | Current no-new-episode plan | `TASK_SUITE_ENHANCEMENT_128.md`, `docs/data/task_suite_enhancement_128.json`, `results/omni_finetune/task_suite_enhancement_128_v1_20260608/enhancement_plan.json`, `scripts/omni/build_task_suite_enhancement_128.py` | The current 3,808-window selected split can be stressed without more episodes by exporting denser and multiscale windows. The recommended next export is `multiscale_20s10_40s20_80s40`, estimated at 106,095 windows from the observed frame spans; the pack also defines hierarchical action/subtask targets, raw-feature shard priorities for unsupported tasks, and Qwen3-Omni/Cosmos3 follow-up run cards. |
@@ -112,7 +112,7 @@ prior multiscale release, and v6 is the current public 20-task Qwen3-Omni row.
 - The current reconstruction task reconstructs feature vectors, not pixel
   depth, meshes, NeRF outputs, or Gaussian splats.
 - Audio is part of the current 8,546-dimensional baseline feature vector.
-- Audio contribution is evaluated across the original task contracts in
   `results/audio_ablation/`.
 - Foundation-model selection is now explicit: Qwen3-Omni is the immediate
   trainable pilot, Cosmos 3 is the first world-model track, and Cosmos3-Super

 | Unified 20-task suite | Verified | `TASK_SUITE_20.md`, `docs/data/task_suite_20.json`, `results/episode_task_suite/`, `results/episode_task_suite/tier2_task_suite/` | All 20 task contracts have committed minimal metrics and share the same 20-frame windows, 5-frame stride, chronological split, and minimal/neural head pattern. The `tier2_task_suite` path is historical provenance inside the unified suite, not a separate public tier. |
 | 180-result method matrix | Verified complete | `docs/data/task_method_20_result_matrix.json`, `TASK_METHOD_20_RESULT_MATRIX.md`, `docs/data/task_method_20_gap_audit.json`, `docs/assets/charts/unified_task_model_radar.svg` | The public comparison matrix now has 9 methods x 20 tasks = 180/180 scored method-task records. Six rows are explicitly marked as compact-proxy scores where the public 128-episode export lacks the direct raw target. |
 | Neural heads | Verified | `scripts/neural_task_models.py`, `results/episode_task_suite/neural_mlp/` | Each task also has a compact PyTorch MLP run over the same feature tensor and chronological split. |
+| Audio contribution study | Verified | `scripts/audio_ablation_and_raw_upgrade.py`, `results/audio_ablation/`, `docs/data/audio_ablation_summary.json` | Audio variants are compared across the walkthrough-backed task contracts; audio improves the primary metric on 6 of those contracts, and a 588-d audio-window representation improves over the baseline audio variant on 6 of those contracts. |
 | Research takeaways | Verified | `RESEARCH_TAKEAWAYS.md`, `docs/data/research_takeaways.json`, `scripts/build_research_takeaways.py` | The main result interpretation is generated from committed metrics: chronological class shift, neural gains on dynamics/order/alignment, open retrieval/reconstruction problems, and the need for held-out episodes. |
 | Research roadmap | Current | `RESEARCH_ROADMAP.md`, `docs/data/research_roadmap.json` | The roadmap connects public-sample task development to the final verified Qwen3-Omni diagnostic result, same-split baseline alignment, action/subtask error analysis, robustness runs, world/policy tracks, and the future Xperience-native pretraining goal. |
 | 128-episode task-suite enhancement pack | Current no-new-episode plan | `TASK_SUITE_ENHANCEMENT_128.md`, `docs/data/task_suite_enhancement_128.json`, `results/omni_finetune/task_suite_enhancement_128_v1_20260608/enhancement_plan.json`, `scripts/omni/build_task_suite_enhancement_128.py` | The current 3,808-window selected split can be stressed without more episodes by exporting denser and multiscale windows. The recommended next export is `multiscale_20s10_40s20_80s40`, estimated at 106,095 windows from the observed frame spans; the pack also defines hierarchical action/subtask targets, raw-feature shard priorities for unsupported tasks, and Qwen3-Omni/Cosmos3 follow-up run cards. |
 - The current reconstruction task reconstructs feature vectors, not pixel
   depth, meshes, NeRF outputs, or Gaussian splats.
 - Audio is part of the current 8,546-dimensional baseline feature vector.
+- Audio contribution is evaluated across the walkthrough-backed task contracts in
   `results/audio_ablation/`.
 - Foundation-model selection is now explicit: Qwen3-Omni is the immediate
   trainable pilot, Cosmos 3 is the first world-model track, and Cosmos3-Super
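
The status table above says every task shares the same 20-frame windows, 5-frame stride, and chronological split. A minimal sketch of that windowing contract follows. The function names and the `train_fraction` value are hypothetical, chosen for illustration; the repository's own scripts may implement this differently.

```python
def window_starts(num_frames, window=20, stride=5):
    """Start indices of fixed-length windows that fit fully inside the episode."""
    if num_frames < window:
        return []
    return list(range(0, num_frames - window + 1, stride))


def chronological_split(starts, train_fraction=0.7):
    """Earlier windows go to train, later windows to test (no shuffling).

    train_fraction is an assumed value, not taken from the repository.
    """
    cut = int(len(starts) * train_fraction)
    return starts[:cut], starts[cut:]


starts = window_starts(100)          # a hypothetical 100-frame episode
train, test = chronological_split(starts)
print(len(starts), len(train), len(test))
```

With a 5-frame stride and 20-frame windows, neighbouring windows overlap, so the last training windows and the first test windows share frames. A gap at the split boundary is one way to avoid that overlap if it matters for a given task.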

README.md CHANGED

@@ -872,9 +872,9 @@ and verified Qwen3-Omni/Cosmos3 diagnostic artifacts.
 scripts/
   train_min_action_model.py         # motion/IMU baseline
   train_all_modalities_model.py     # current all-feature lightweight baseline
-  episode_task_suite.py             # original end-to-end task definitions
   neural_task_models.py             # optional PyTorch MLP heads for task contracts
-  research_direction_taxonomy.py    # maps original tasks to the four research tracks
   research_direction_extension_tasks.py # one extra data-backed probe per track
   tier2_task_suite.py              # historical-name provenance builder for unified task rows
   build_unified_task_suite.py       # builds TASK_SUITE_20.md and task_suite_20.json
@@ -912,7 +912,7 @@ results/
     research_directions/            # four-track taxonomy, CSV, and summary
     research_direction_extensions/  # four extra direction probes + predictions
     tier2_task_suite/               # provenance baseline tasks + predictions; historical path
-    task_walkthroughs/              # case-study walkthroughs for original tasks
   omni_exploration/                 # ModelScope readiness-check artifacts
   omni_finetune/model_output_task_probes_20260616/ # task-13/task-16 probes derived from verified model JSON
@@ -1050,7 +1050,7 @@ cd ropedia-xperience-10m-task-suite
 python scripts/episode_task_suite.py --workspace /path/to/workspace
 ```
-Run the original task definitions with lightweight neural heads:
 ```bash
 pip install torch
@@ -1471,7 +1471,7 @@ and [`docs/data/additional_development_directions.json`](docs/data/additional_de
 ## Four Research Directions
-The original task contracts are organized against the four Ropedia research directions in
 a generated artifact, not only in prose:
 - [`research_direction_taxonomy.json`](results/episode_task_suite/research_directions/research_direction_taxonomy.json)
@@ -1497,13 +1497,13 @@ Current direction-level coverage:
 The important interpretation is that all four directions can be **started** from
 the Xperience-10M sample modalities, but only direction C is strongly represented
-by the original task suite. Directions A, B, and D need additional targets and
 multi-episode training before they become full research deliverables.
-## Four Direction-Extension Probes
-Beyond the original task contracts, the repo now includes one extra data-backed
-probe for each research direction. These probes are computed from the same
 `shared_windows.npz`, `windows.csv`, and `feature_manifest.json` artifacts, so
 the reported numbers are computed from sample-derived features and saved metric artifacts.
@@ -1565,18 +1565,10 @@ unified 20-task suite, not as a separate benchmark tier.
 ![128-episode 20-task model radar](docs/assets/charts/episode128_task_model_radar.svg)
-![Unified 20-task provenance chart](docs/assets/charts/tier2_task_suite.svg)
-| # | Task | Input | Output | Minimal | Neural MLP | Meaning |
-| ---: | --- | --- | --- | ---: | ---: | --- |
-| 13 | Long-Horizon Next-Action Forecasting | current non-caption multimodal window | action label five seconds later | `0.0750` macro-F1 | `0.0655` macro-F1 | Tests procedure context beyond the one-second next-action task. |
-| 14 | Long-Horizon Next-Subtask Forecasting | current non-caption multimodal window | subtask five seconds later | `0.0455` macro-F1 | `0.0507` macro-F1 | Moves anticipation from low-level action to high-level procedure state. |
-| 15 | Interaction Text Prediction | current sensor window without caption text | raw interaction phrase | `0.0444` macro-F1 | `0.0381` macro-F1 | Uses the original annotation interaction text instead of only hashed features. |
-| 16 | Action-Object Relation Prediction | current sensor window without caption text | joint action plus object-set label | `0.0000` macro-F1 | `0.0000` macro-F1 | Exposes a hard binding target for action-object reasoning. |
-| 17 | Future Object-Set Forecasting | current sensor window without caption text | object set five seconds later | `0.1694` micro-F1 | `0.1972` micro-F1 | Predicts which objects become relevant soon. |
-| 18 | IMU-to-Hand Pose Reconstruction | IMU feature block only | current left/right hand joints | `0.0420` MAE | `0.0426` MAE | Tests inertial-to-hand sensor bridging. |
-| 19 | Camera-View Synchronization Retrieval | fisheye camera-1 query | synchronized fisheye camera-3 window | `0.4943` MRR | `0.2409` MRR | Stress-tests multi-camera temporal alignment. |
-| 20 | Time-to-Next-Transition Regression | current non-caption multimodal window | capped frames until next action boundary | `10.5374` MAE frames | `10.5545` MAE frames | Converts boundary detection into continuous timing. |
 Run:
@@ -1654,7 +1646,7 @@ PyTorch MLP classifiers or regressors. Its outputs live under
 and the rollup is stored in the `neural_tasks` section of
 [`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json).
-The original task-specific heads are:
 | Task | Input | Minimal head | Output |
 | --- | --- | --- | --- |
@@ -1685,8 +1677,8 @@ The original task-specific heads are:
 | Neural MLP hand forecast | 0.1079 MPJPE | n/a | Same features/split, nonlinear regression head |
 | Neural MLP temporal order | 0.8520 F1 | 0.8578 | Strong improvement on adjacent-window ordering |
 | Neural MLP misalignment | 0.7153 F1 | 0.7009 | Detects shifted motion/visual/audio pairs better than the linear head |
-| Audio ablation | +0.0418 mean delta | n/a | Current audio variant improves the primary metric on 6 of the original task contracts |
-| Alternate audio representation | +0.0936 mean delta | n/a | Alternate audio-window representation improves over the baseline audio variant on 6 of the original task contracts |
 ## Audio Contribution Study
@@ -1765,7 +1757,7 @@ episodes; they are not reported as multi-episode benchmark results.
 I re-ran the full pipeline from the local raw public sample into a temporary
 local workspace and compared regenerated metrics with the committed
-artifacts. The baseline metrics, original task metrics, feature manifest, and
 available modality manifest matched exactly after float normalization.
 See [`notes/reproducibility_audit.md`](notes/reproducibility_audit.md) for the

 scripts/
   train_min_action_model.py         # motion/IMU baseline
   train_all_modalities_model.py     # current all-feature lightweight baseline
+  episode_task_suite.py             # public-sample task definitions
   neural_task_models.py             # optional PyTorch MLP heads for task contracts
+  research_direction_taxonomy.py    # maps walkthrough-backed tasks to the four research tracks
   research_direction_extension_tasks.py # one extra data-backed probe per track
   tier2_task_suite.py              # historical-name provenance builder for unified task rows
   build_unified_task_suite.py       # builds TASK_SUITE_20.md and task_suite_20.json
     research_directions/            # four-track taxonomy, CSV, and summary
     research_direction_extensions/  # four extra direction probes + predictions
     tier2_task_suite/               # provenance baseline tasks + predictions; historical path
+    task_walkthroughs/              # case-study walkthroughs for walkthrough-backed tasks
   omni_exploration/                 # ModelScope readiness-check artifacts
   omni_finetune/model_output_task_probes_20260616/ # task-13/task-16 probes derived from verified model JSON
 python scripts/episode_task_suite.py --workspace /path/to/workspace
 ```
+Run the public-sample task definitions with lightweight neural heads:
 ```bash
 pip install torch
 ## Four Research Directions
+The walkthrough-backed task contracts are organized against the four Ropedia research directions in
 a generated artifact, not only in prose:
 - [`research_direction_taxonomy.json`](results/episode_task_suite/research_directions/research_direction_taxonomy.json)
 The important interpretation is that all four directions can be **started** from
 the Xperience-10M sample modalities, but only direction C is strongly represented
+by the current task evidence. Directions A, B, and D need additional targets and
 multi-episode training before they become full research deliverables.
+## Four Direction Probes
+Alongside the unified 20-task suite, the repo includes one data-backed probe for
+each research direction. These probes are computed from the same
 `shared_windows.npz`, `windows.csv`, and `feature_manifest.json` artifacts, so
 the reported numbers are computed from sample-derived features and saved metric artifacts.
 ![128-episode 20-task model radar](docs/assets/charts/episode128_task_model_radar.svg)
+The all-task table, including every input/output contract and minimal/neural
+metric, is in [`TASK_SUITE_20.md`](TASK_SUITE_20.md). Historical provenance
+links remain listed above for exact source tracing, but the public task surface
+should be read as one integrated 20-task suite.
 Run:
 and the rollup is stored in the `neural_tasks` section of
 [`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json).
+The walkthrough-backed task heads are:
 | Task | Input | Minimal head | Output |
 | --- | --- | --- | --- |
 | Neural MLP hand forecast | 0.1079 MPJPE | n/a | Same features/split, nonlinear regression head |
 | Neural MLP temporal order | 0.8520 F1 | 0.8578 | Strong improvement on adjacent-window ordering |
 | Neural MLP misalignment | 0.7153 F1 | 0.7009 | Detects shifted motion/visual/audio pairs better than the linear head |
+| Audio ablation | +0.0418 mean delta | n/a | Current audio variant improves the primary metric on 6 walkthrough-backed task contracts |
+| Alternate audio representation | +0.0936 mean delta | n/a | Alternate audio-window representation improves over the baseline audio variant on 6 walkthrough-backed task contracts |
 ## Audio Contribution Study
 I re-ran the full pipeline from the local raw public sample into a temporary
 local workspace and compared regenerated metrics with the committed
+artifacts. The baseline metrics, task metrics, feature manifest, and
 available modality manifest matched exactly after float normalization.
 See [`notes/reproducibility_audit.md`](notes/reproducibility_audit.md) for the
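
The reproducibility note above reports that regenerated metrics matched the committed artifacts "exactly after float normalization". One way such a comparison could work is sketched below. The helper names and the six-digit rounding are assumptions for illustration, not the audit's actual procedure.

```python
import json
import math


def normalize(value, digits=6):
    """Recursively round floats so serialization noise does not break equality."""
    if isinstance(value, bool):
        return value
    if isinstance(value, float):
        # NaN never equals itself, so map it to a comparable token.
        return "nan" if math.isnan(value) else round(value, digits)
    if isinstance(value, dict):
        return {key: normalize(item, digits) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(item, digits) for item in value]
    return value


def metrics_match(committed, regenerated, digits=6):
    """True when both metric trees are equal after float rounding."""
    return normalize(committed, digits) == normalize(regenerated, digits)


# Hypothetical metric payloads; the keys are illustrative only.
committed = json.loads('{"temporal_order": {"f1": 0.8520}, "tasks": [0.1079, 1]}')
regenerated = json.loads('{"tasks": [0.10790000001, 1], "temporal_order": {"f1": 0.85199999999}}')
print(metrics_match(committed, regenerated))
```

Rounding to a fixed number of digits is simple but can split two nearly equal values that straddle a rounding boundary; a tolerance check with `math.isclose` is an alternative if that case matters.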

RESEARCH_TAKEAWAYS.md CHANGED

@@ -80,7 +80,7 @@ Current scope: The current reconstruction task predicts feature vectors; depth,
 ### Audio helps some tasks and hurts others on the public sample
-Audio improves the primary metric on 6 of the original task contracts, while raw log-mel replacement improves over the current handcrafted block on 6 of those contracts. The largest current-audio gain appears in feature reconstruction, not in action classification.
 | Metric | Value |
 | --- | ---: |

 ### Audio helps some tasks and hurts others on the public sample
+Audio improves the primary metric on 6 walkthrough-backed task contracts, while raw log-mel replacement improves over the current handcrafted block on 6 of those contracts. The largest current-audio gain appears in feature reconstruction, not in action classification.
 | Metric | Value |
 | --- | ---: |

TASK_METHOD_20_GAP_AUDIT.md CHANGED

@@ -1,6 +1,6 @@
 # Task Method 20-Result Completion Audit
-Generated: `2026-06-21T08:38:20+00:00`
 This audit is the explicit completion ledger for the 9-method x 20-task result
 matrix. The current public matrix is complete at 180/180 scored records while

 # Task Method 20-Result Completion Audit
+Generated: `2026-06-21T15:21:42+00:00`
 This audit is the explicit completion ledger for the 9-method x 20-task result
 matrix. The current public matrix is complete at 180/180 scored records while

TASK_SUITE_20.md CHANGED

@@ -20,28 +20,28 @@ as a separate benchmark tier.
 ## Task Table
-| # | Task | Artifact id | Origin | Input -> output | Primary metric | Minimal | Neural |
-| ---: | --- | --- | --- | --- | --- | ---: | ---: |
-| 1 | Action Recognition | `timeline_action` | original task | 20-frame multimodal window -> current action class | macro-F1 (higher better) | 0.0500 | 0.0148 |
-| 2 | Procedure Step Recognition | `timeline_subtask` | original task | 20-frame multimodal window -> current procedure step | macro-F1 (higher better) | 0.0506 | 0.0281 |
-| 3 | Action Boundary Detection | `transition_detection` | original task | current window with boundary target -> boundary or steady | macro-F1 (higher better) | 0.6118 | 0.5862 |
-| 4 | Next-Action Prediction | `next_action` | original task | current window at time t -> action at t+20 frames | macro-F1 (higher better) | 0.0593 | 0.0419 |
-| 5 | Hand Trajectory Forecasting | `hand_trajectory_forecast` | original task | current multimodal window -> future hand-joint trajectory | MPJPE (lower better) | 0.8647 | 0.1079 |
-| 6 | Contact State Prediction | `contact_prediction` | original task | non-contact, non-caption features -> contact or no contact | macro-F1 (higher better) | 1.0000 | 1.0000 |
-| 7 | Object Relevance Prediction | `object_relevance` | original task | non-caption multimodal features -> relevant object set | micro-F1 (higher better) | 0.1803 | 0.1679 |
-| 8 | Language Grounding | `caption_grounding` | original task | text-like query and candidate windows -> ranked matching moments | MRR (higher better) | 0.0160 | 0.0168 |
-| 9 | Cross-Modal Retrieval | `cross_modal_retrieval` | original task | motion/IMU/pose query; depth/video candidates -> ranked visual windows | MRR (higher better) | 0.2693 | 0.1300 |
-| 10 | Cross-Modal Reconstruction | `modality_reconstruction` | original task | motion, IMU, and camera/pose features -> reconstructed depth/video vector | R2 (higher better) | -0.0153 | -0.0102 |
-| 11 | Temporal Order Verification | `temporal_order` | original task | two adjacent windows plus difference vector -> correct or reversed | F1 (higher better) | 0.5400 | 0.8520 |
-| 12 | Multimodal Synchronization Detection | `misalignment_detection` | original task | motion-side and visual/depth-side feature groups -> aligned or shifted | F1 (higher better) | 0.5052 | 0.7153 |
-| 13 | Long-Horizon Next-Action Forecasting | `long_horizon_next_action` | additional task | Current 20-frame non-caption multimodal window. -> Action label five seconds later. | macro-F1 (higher better) | 0.0750 | 0.0655 |
-| 14 | Long-Horizon Next-Subtask Forecasting | `next_subtask_forecast` | additional task | Current 20-frame non-caption multimodal window. -> Procedure subtask label five seconds later. | macro-F1 (higher better) | 0.0455 | 0.0507 |
-| 15 | Interaction Text Prediction | `interaction_text_prediction` | additional task | Current 20-frame sensor window with caption-text features removed. -> Raw annotation interaction phrase for the same window. | macro-F1 (higher better) | 0.0444 | 0.0381 |
-| 16 | Action-Object Relation Prediction | `action_object_relation` | additional task | Current 20-frame sensor window with caption-text features removed. -> Joint action plus active object-set relation. | macro-F1 (higher better) | 0.0000 | 0.0000 |
-| 17 | Future Object-Set Forecasting | `object_set_forecast` | additional task | Current 20-frame sensor window with caption-text features removed. -> Object set active five seconds later. | micro-F1 (higher better) | 0.1694 | 0.1972 |
-| 18 | IMU-to-Hand Pose Reconstruction | `imu_to_hand_pose` | additional task | Current IMU acceleration/gyroscope feature block only. -> Current left/right hand joint feature blocks. | MAE (lower better) | 0.0420 | 0.0426 |
-| 19 | Camera-View Synchronization Retrieval | `camera_view_sync_retrieval` | additional task | Fisheye camera-1 feature query projected into fisheye camera-3 feature space. -> The synchronized held-out camera-3 window. | MRR (higher better) | 0.4943 | 0.2409 |
-| 20 | Time-to-Next-Transition Regression | `time_to_transition` | additional task | Current 20-frame non-caption multimodal window. -> Frames until the next action-label boundary, capped at 200 frames. | MAE frames (lower better) | 10.5374 | 10.5545 |
 ## Machine-Readable Copy

 ## Task Table
+| # | Task | Artifact id | Input -> output | Primary metric | Minimal | Neural |
+| ---: | --- | --- | --- | --- | ---: | ---: |
+| 1 | Action Recognition | `timeline_action` | 20-frame multimodal window -> current action class | macro-F1 (higher better) | 0.0500 | 0.0148 |
+| 2 | Procedure Step Recognition | `timeline_subtask` | 20-frame multimodal window -> current procedure step | macro-F1 (higher better) | 0.0506 | 0.0281 |
+| 3 | Action Boundary Detection | `transition_detection` | current window with boundary target -> boundary or steady | macro-F1 (higher better) | 0.6118 | 0.5862 |
+| 4 | Next-Action Prediction | `next_action` | current window at time t -> action at t+20 frames | macro-F1 (higher better) | 0.0593 | 0.0419 |
+| 5 | Hand Trajectory Forecasting | `hand_trajectory_forecast` | current multimodal window -> future hand-joint trajectory | MPJPE (lower better) | 0.8647 | 0.1079 |
+| 6 | Contact State Prediction | `contact_prediction` | non-contact, non-caption features -> contact or no contact | macro-F1 (higher better) | 1.0000 | 1.0000 |
+| 7 | Object Relevance Prediction | `object_relevance` | non-caption multimodal features -> relevant object set | micro-F1 (higher better) | 0.1803 | 0.1679 |
+| 8 | Language Grounding | `caption_grounding` | text-like query and candidate windows -> ranked matching moments | MRR (higher better) | 0.0160 | 0.0168 |
+| 9 | Cross-Modal Retrieval | `cross_modal_retrieval` | motion/IMU/pose query; depth/video candidates -> ranked visual windows | MRR (higher better) | 0.2693 | 0.1300 |
+| 10 | Cross-Modal Reconstruction | `modality_reconstruction` | motion, IMU, and camera/pose features -> reconstructed depth/video vector | R2 (higher better) | -0.0153 | -0.0102 |
+| 11 | Temporal Order Verification | `temporal_order` | two adjacent windows plus difference vector -> correct or reversed | F1 (higher better) | 0.5400 | 0.8520 |
+| 12 | Multimodal Synchronization Detection | `misalignment_detection` | motion-side and visual/depth-side feature groups -> aligned or shifted | F1 (higher better) | 0.5052 | 0.7153 |
+| 13 | Long-Horizon Next-Action Forecasting | `long_horizon_next_action` | Current 20-frame non-caption multimodal window. -> Action label five seconds later. | macro-F1 (higher better) | 0.0750 | 0.0655 |
+| 14 | Long-Horizon Next-Subtask Forecasting | `next_subtask_forecast` | Current 20-frame non-caption multimodal window. -> Procedure subtask label five seconds later. | macro-F1 (higher better) | 0.0455 | 0.0507 |
+| 15 | Interaction Text Prediction | `interaction_text_prediction` | Current 20-frame sensor window with caption-text features removed. -> Raw annotation interaction phrase for the same window. | macro-F1 (higher better) | 0.0444 | 0.0381 |
+| 16 | Action-Object Relation Prediction | `action_object_relation` | Current 20-frame sensor window with caption-text features removed. -> Joint action plus active object-set relation. | macro-F1 (higher better) | 0.0000 | 0.0000 |
+| 17 | Future Object-Set Forecasting | `object_set_forecast` | Current 20-frame sensor window with caption-text features removed. -> Object set active five seconds later. | micro-F1 (higher better) | 0.1694 | 0.1972 |
+| 18 | IMU-to-Hand Pose Reconstruction | `imu_to_hand_pose` | Current IMU acceleration/gyroscope feature block only. -> Current left/right hand joint feature blocks. | MAE (lower better) | 0.0420 | 0.0426 |
+| 19 | Camera-View Synchronization Retrieval | `camera_view_sync_retrieval` | Fisheye camera-1 feature query projected into fisheye camera-3 feature space. -> The synchronized held-out camera-3 window. | MRR (higher better) | 0.4943 | 0.2409 |
+| 20 | Time-to-Next-Transition Regression | `time_to_transition` | Current 20-frame non-caption multimodal window. -> Frames until the next action-label boundary, capped at 200 frames. | MAE frames (lower better) | 10.5374 | 10.5545 |
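
The retrieval rows in the table above (tasks 8, 9, and 19) report MRR. For readers unfamiliar with the metric, here is a minimal sketch of mean reciprocal rank over a score matrix. The function name, the toy scores, and the tie handling are assumptions for illustration, not the suite's evaluation code.

```python
def mean_reciprocal_rank(score_rows, correct_indices):
    """Average of 1/rank of the correct candidate over all queries.

    score_rows[i][j] is the similarity of query i to candidate j;
    correct_indices[i] is the index of the true match for query i.
    Ties are resolved optimistically: only strictly higher scores outrank it.
    """
    total = 0.0
    for scores, correct in zip(score_rows, correct_indices):
        rank = 1 + sum(1 for score in scores if score > scores[correct])
        total += 1.0 / rank
    return total / len(score_rows)


# Toy example: three queries, three candidates each.
toy_scores = [
    [0.9, 0.1, 0.3],  # correct candidate 0 is ranked 1st
    [0.2, 0.8, 0.5],  # correct candidate 2 is ranked 2nd
    [0.4, 0.6, 0.7],  # correct candidate 0 is ranked 3rd
]
toy_correct = [0, 2, 0]
mrr = mean_reciprocal_rank(toy_scores, toy_correct)
print(round(mrr, 4))
```

An MRR of 1.0 means the correct window is always ranked first; values near 1/N for N candidates are what random ranking would roughly give, which is useful context when reading low scores.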
 ## Machine-Readable Copy

data/artifact_index.json CHANGED

@@ -1,6 +1,6 @@
 {
   "title": "Ropedia Xperience-10M Task Suite Artifact Index",
-  "generated_at_utc": "2026-06-21T14:40:34+00:00",
   "status": "pass",
   "artifact_count": 228,
   "missing": [],
@@ -59,8 +59,8 @@
       "surface": "website_hf",
       "shows": "Machine-readable first-reader project brief for the website and Hugging Face mirrors.",
       "exists": true,
-      "bytes": 4019,
-      "sha256": "9521556a750941a0f9ee8e9541903acbb0fbec2501fd05ed4e7a017fc18cf794"
     },
     {
       "id": "project_status",
@@ -70,8 +70,8 @@
       "surface": "repo_hf",
       "shows": "Gives a compact current-state table for first-pass readers.",
       "exists": true,
-      "bytes": 15993,
-      "sha256": "96bf5d894ace804aea2f3889a4d99a802a5e015405e7eed573eb3a98882ce968"
     },
     {
       "id": "project_status_json",
@@ -81,8 +81,8 @@
       "surface": "website_hf",
       "shows": "Machine-readable copy of the current project status for website and HF mirrors.",
       "exists": true,
-      "bytes": 23255,
-      "sha256": "874f1133ee75f060735f0c9e763cf81463f304432f1dbca3ebc9837225c0259d"
     },
     {
       "id": "glossary",
@@ -576,8 +576,8 @@
       "surface": "website_hf",
       "shows": "Gives a short project path with scope status and public surfaces.",
       "exists": true,
-      "bytes": 10009,
-      "sha256": "e0f8bd65cd15b0fe68c8079045b4c72552daaf644c35b8a7a68426250a4aa441"
     },
     {
       "id": "artifact_guide",
@@ -587,8 +587,8 @@
       "surface": "repo_hf",
       "shows": "Gives the human-readable map from project scope to data, tasks, platform mirrors, and scale-up status.",
       "exists": true,
-      "bytes": 20571,
-      "sha256": "217e3eb2cf82999f75ce6e132f567fa1ed08d319bf7a44f77b7150a45fae5274"
     },
     {
       "id": "official_dataset_card_alignment",
@@ -632,7 +632,7 @@
       "shows": "Machine-readable source-alignment pass/fail check for repo, website, and HF surfaces.",
       "exists": true,
       "bytes": 4432,
-      "sha256": "3def3dc923162ad0d2802acdca8a689a4e9ad1408f36edae8f77f49c4507cef1"
     },
     {
       "id": "source_alignment_validator",
@@ -686,8 +686,8 @@
       "surface": "repo_hf",
       "shows": "Defines the window unit, chronological split, task metrics, leakage controls, and current limitations.",
       "exists": true,
-      "bytes": 9156,
-      "sha256": "cfc23b3115ebce2b41a349b8a2cd6989aaf2294790c79e2b17545ebede2b2df0"
     },
     {
       "id": "evaluation_protocol_json",
@@ -697,8 +697,8 @@
       "surface": "website_hf",
       "shows": "Machine-readable protocol generated from committed task metrics for website and HF mirrors.",
       "exists": true,
-      "bytes": 24007,
-      "sha256": "dde490d175f0d6828f5973f1b24e696a3d1b3b09d65a59cb9c1dde5c38845b66"
     },
     {
       "id": "evaluation_protocol_builder",
@@ -708,8 +708,8 @@
       "surface": "repo_hf",
       "shows": "Regenerates the protocol from committed summary metrics and task artifacts.",
       "exists": true,
-      "bytes": 19931,
-      "sha256": "080f894b50c609e3a467c8c513dfe441f90ba0dad3586dd1cb88de6e58eedb3b"
     },
     {
       "id": "task_suite_20",
@@ -719,8 +719,8 @@
       "surface": "repo_hf",
       "shows": "Reader-facing table for the single unified public-sample task suite under the same window, split, feature, and baseline contract.",
       "exists": true,
-      "bytes": 5196,
-      "sha256": "473891503dcd1251a2cc9a16e6642ce16fbca9d264a734d2397c2afc60977195"
     },
     {
       "id": "task_suite_20_json",
@@ -730,8 +730,8 @@
       "surface": "website_hf",
       "shows": "Machine-readable unified 20-task index for the website, Hugging Face mirrors, and live verification.",
       "exists": true,
-      "bytes": 34597,
-      "sha256": "2029f7f9744001861ac00acabdb578fe97d3b39a1c16a7c2d19c56347ded22d7"
     },
     {
       "id": "task_suite_20_builder",
@@ -741,8 +741,8 @@
       "surface": "repo_hf",
       "shows": "Regenerates the unified 20-task JSON and Markdown from the public-sample metrics plus the historical provenance result bundle.",
       "exists": true,
-      "bytes": 12213,
-      "sha256": "1421593f05e345799007bbcdf138f81dfdb7c511ec1e31b56d00e2cdaed3d7de"
     },
     {
       "id": "unified_task_model_radar_json",
@@ -1005,8 +1005,8 @@
       "surface": "repo_hf",
       "shows": "Summarizes the main research lessons from committed metrics and identifies which experiments need held-out episodes.",
       "exists": true,
-      "bytes": 5172,
-      "sha256": "39978c1e30b6aa76c5fd2684e9a1111ec2e813423feaff6053084b0335968db8"
     },
     {
       "id": "research_takeaways_json",
@@ -1016,8 +1016,8 @@
       "surface": "website_hf",
       "shows": "Machine-readable result interpretation for the website, HF cards, and mirror checks.",
       "exists": true,
-      "bytes": 7162,
-      "sha256": "9899c5cb6b92bcfe5e64f98503af5b7d0759ad1a9c5098dbfe4146f54ee26656"
     },
     {
       "id": "research_takeaways_builder",
@@ -1027,8 +1027,8 @@
       "surface": "repo_hf",
       "shows": "Regenerates the research takeaways from committed summary metrics and task result artifacts.",
       "exists": true,
-      "bytes": 13496,
-      "sha256": "c35995607dc16fa2a318c626b84323eb47b61a373a492c22cf9fdac851b4d9b5"
     },
     {
       "id": "audio_ablation_script",
@@ -1036,7 +1036,7 @@
       "path": "scripts/audio_ablation_and_raw_upgrade.py",
       "kind": "result_interpretation",
       "surface": "repo_hf",
-      "shows": "Measures audio contribution variants across the original task contracts.",
       "exists": true,
       "bytes": 43159,
       "sha256": "2444f2e52efb975be931b33d66b7180d53031e1d5e821719122160f92f4540aa"
@@ -1080,7 +1080,7 @@
       "path": "docs/assets/charts/audio_ablation_delta.svg",
       "kind": "visual_evidence",
       "surface": "website_hf",
-      "shows": "Bar chart of measured current-audio primary-metric deltas across the original tasks.",
       "exists": true,
       "bytes": 4146,
       "sha256": "187dbabe01f9ff18841ff61a1e7fbf85bebdd188cc0f248bb5090d64528e7568"
@@ -1093,8 +1093,8 @@
       "surface": "repo_hf",
       "shows": "Catalogs public figures, charts, modality thumbnails, dimensions, hashes, roles, and source scripts.",
       "exists": true,
-      "bytes": 7011,
-      "sha256": "f6554cd980efa6c0b3b8feac5ff3e19c3e2e74ccf2d446ac4afb5ee5d65413f3"
     },
     {
       "id": "figure_index_json",
@@ -1104,8 +1104,8 @@
       "surface": "website_hf",
       "shows": "Machine-readable visual asset index for website and Hugging Face mirrors.",
       "exists": true,
-      "bytes": 19469,
-      "sha256": "11a06ee64d28f81f3280eb99327d99b47dc58fb1521332434b9df11c97b9b4e8"
     },
     {
       "id": "figure_index_builder",
@@ -1115,8 +1115,8 @@
       "surface": "repo_hf",
       "shows": "Regenerates visual-asset hashes, dimensions, and source-script provenance.",
       "exists": true,
-      "bytes": 16829,
-      "sha256": "14f1ed7f94630c8f70fbc14547071db251647f3d527cf760341b7a233883d069"
     },
     {
       "id": "brand_assets_json",
@@ -1182,7 +1182,7 @@
       "shows": "Machine-readable release-check summary for validators, mirrors, and public project surfaces.",
       "exists": true,
       "bytes": 8640,
-      "sha256": "c8ce99ac63ab70e3696386671bf201f5605b6a88c8be8f288d44a122bad9025e"
     },
     {
       "id": "public_surface_qa",
@@ -1226,7 +1226,7 @@
       "volatile": true,
       "shows": "Machine-readable report for SEO/social metadata, accessible tab semantics, public links, project links, and clear project presentation.",
       "exists": true,
-      "bytes": 7690,
       "hash_policy": "existence_and_size_only"
     },
     {
@@ -1307,7 +1307,7 @@
       "volatile": true,
       "shows": "Records the last live GitHub/HF URL verification after upload.",
       "exists": true,
-      "bytes": 189922,
       "hash_policy": "existence_and_size_only"
     },
     {
@@ -1340,8 +1340,8 @@
       "surface": "website_hf",
       "shows": "Machine-readable reproduction steps with expected artifacts and public boundaries.",
       "exists": true,
-      "bytes": 6815,
-      "sha256": "ff44893cac56c229d6eb5d20d8cb261ea38e0358e6444615406affd692d8d98e"
     },
     {
       "id": "artifact_index_builder",
@@ -1351,8 +1351,8 @@
       "surface": "repo_hf",
       "shows": "Generates the selective artifact catalog from local files.",
       "exists": true,
-      "bytes": 68232,
-      "sha256": "ee1b210688c1b722d6ca94d1c1706c1a510218c964298b91dd3e596fa19ed2a1"
     },
     {
       "id": "publication_audit",
@@ -1410,8 +1410,8 @@
       "surface": "website_hf",
       "shows": "Lists public URLs, upstream sources, and machine-readable project metadata.",
       "exists": true,
-      "bytes": 5774,
-      "sha256": "8da6063de9e0b888089aa62daac6d323057dd80247b8f38be5fbce0b370ef6ac"
     },
     {
       "id": "task_summary",
@@ -1474,7 +1474,7 @@
       "path": "results/episode_task_suite/neural_mlp",
       "kind": "result_directory",
       "surface": "repo_hf_model",
-      "shows": "Stores matching PyTorch MLP results for the original task contracts.",
       "exists": true,
       "file_count": 60,
       "bytes": 90609517
@@ -1485,7 +1485,7 @@
       "path": "results/episode_task_suite/research_directions/research_direction_taxonomy.json",
       "kind": "taxonomy",
       "surface": "repo_hf",
-      "shows": "Maps the original tasks to the four Ropedia research directions as direct/proxy/diagnostic.",
       "exists": true,
       "bytes": 25046,
       "sha256": "0e3c442e5eb9057b04b1e8c8fa723dfde6f72e7fae1378d5ea022d93f7d25ca3"
@@ -1509,8 +1509,8 @@
       "surface": "repo_hf",
       "shows": "Stores the historical result bundle for provenance rows with minimal and neural baselines aligned to the same 20-task window/split setup.",
       "exists": true,
-      "bytes": 33402,
-      "sha256": "5a1051d25ceafe53c60dbd5b81d4b686a421c493ad09a462ad96bac100c5f3f3"
     },
     {
       "id": "tier2_task_suite_json",
@@ -1520,8 +1520,8 @@
       "surface": "website_hf",
       "shows": "Machine-readable provenance definitions, setup alignment, metrics, and public source paths; the file name is historical.",
       "exists": true,
-      "bytes": 33402,
-      "sha256": "5a1051d25ceafe53c60dbd5b81d4b686a421c493ad09a462ad96bac100c5f3f3"
     },
     {
       "id": "tier2_task_suite_chart",
@@ -1531,8 +1531,8 @@
       "surface": "website_hf",
       "shows": "Visual summary of the historical provenance baseline metrics inside the unified 20-task suite.",
       "exists": true,
-      "bytes": 5437,
-      "sha256": "3e35e476f559cd6188e5417e4d28c25efc130abafc9cab2d941bc77d559177a1"
     },
     {
       "id": "tier2_task_suite_builder",
@@ -1542,8 +1542,8 @@
       "surface": "repo_hf",
       "shows": "Regenerates the historical provenance rows from shared windows plus the local public-sample annotation HDF5; the script name is historical.",
       "exists": true,
-      "bytes": 47102,
-      "sha256": "3cddefaaeedd8efb65e6db956cbd13605e4a5b3772d98fa831d34fd6f92850de"
     },
     {
       "id": "task_walkthroughs",
@@ -1564,8 +1564,8 @@
       "surface": "website_hf",
       "shows": "Presents the task suite and sample modality thumbnails with metrics generated from committed files.",
       "exists": true,
-      "bytes": 1903454,
-      "sha256": "6667eb856cf61ada9f868807b5d5c6ccde06e4f791b2f9dd567d98b71b307415"
     },
     {
       "id": "modality_atlas",
@@ -1672,7 +1672,7 @@
       "path": "results/omni_finetune/multi_episode_128_task_baselines/BASELINE_ALIGNMENT_REPORT.md",
       "kind": "scaleup_status",
       "surface": "repo_hf",
-      "shows": "Summarizes same-split simple and neural metadata baselines for the 12 original task ids, with unsupported markers for tasks that need missing raw 128 feature blocks.",
       "exists": true,
       "bytes": 2238,
       "sha256": "c70440aa502ec569a840159ab7e05b8e7d4ed70e0091ad9a4b2fb3fb0d3803c1"
@@ -1696,8 +1696,8 @@
       "surface": "repo_hf",
       "shows": "Reader-facing comparison of the single-episode task suite, 128-episode aligned baselines, Qwen3-Omni packages, and Cosmos3 future-window branch.",
       "exists": true,
-      "bytes": 15983,
-      "sha256": "4db248566972e811aac6ca06582f233414821624f00f9d4fc4a1b66b2e00401f"
     },
     {
       "id": "omni_model_comparison_json",
@@ -1707,8 +1707,8 @@
       "surface": "repo_hf",
       "shows": "Machine-readable comparison of the current result versions, per-task aligned baselines, verified Qwen3 packages, and Cosmos3 package.",
       "exists": true,
-      "bytes": 82088,
-      "sha256": "82ccc2932cad63a9ebad85da53e694b18ef626aa3720bda3ed5da30f3dc5e121"
     },
     {
       "id": "cosmos3_nano_verified_summary",

 {
   "title": "Ropedia Xperience-10M Task Suite Artifact Index",
+  "generated_at_utc": "2026-06-21T15:19:00+00:00",
   "status": "pass",
   "artifact_count": 228,
   "missing": [],
       "surface": "website_hf",
       "shows": "Machine-readable first-reader project brief for the website and Hugging Face mirrors.",
       "exists": true,
+      "bytes": 4032,
+      "sha256": "328d601390fdd61c836434e00cfe27670ef5fb96252270975c4ca339f2a51bfa"
     },
     {
       "id": "project_status",
       "surface": "repo_hf",
       "shows": "Gives a compact current-state table for first-pass readers.",
       "exists": true,
+      "bytes": 16013,
+      "sha256": "5ad142b601ad982ce59620bd7fa50446c8837050b0331b2be4a357280b295c21"
     },
     {
       "id": "project_status_json",
       "surface": "website_hf",
       "shows": "Machine-readable copy of the current project status for website and HF mirrors.",
       "exists": true,
+      "bytes": 23232,
+      "sha256": "406c48ec858b5f288c7ebef6eefc0ed94dc8bad11bf9221f435b9c8aca547ea3"
     },
     {
       "id": "glossary",
       "surface": "website_hf",
       "shows": "Gives a short project path with scope status and public surfaces.",
       "exists": true,
+      "bytes": 10018,
+      "sha256": "6b7ae7fe0df1a9e4a12d241a3162540b0cf1ade86803dec8aac68e3dc99bfc66"
     },
     {
       "id": "artifact_guide",
       "surface": "repo_hf",
       "shows": "Gives the human-readable map from project scope to data, tasks, platform mirrors, and scale-up status.",
       "exists": true,
+      "bytes": 20601,
+      "sha256": "e0e4ad50271ab1d58d2fe97de5b3451a52f034996b54d0ee9499b562b9decbbf"
     },
     {
       "id": "official_dataset_card_alignment",
       "shows": "Machine-readable source-alignment pass/fail check for repo, website, and HF surfaces.",
       "exists": true,
       "bytes": 4432,
+      "sha256": "5ab2ea4bfefe9f5bc7854f02b2e1e2b5206766a54447647191828da1a1a2077c"
     },
     {
       "id": "source_alignment_validator",
       "surface": "repo_hf",
       "shows": "Defines the window unit, chronological split, task metrics, leakage controls, and current limitations.",
       "exists": true,
+      "bytes": 8905,
+      "sha256": "f82e9b9c4a07e95776005968788e7acdaae9e322991113d79432d59057181add"
     },
     {
       "id": "evaluation_protocol_json",
       "surface": "website_hf",
       "shows": "Machine-readable protocol generated from committed task metrics for website and HF mirrors.",
       "exists": true,
+      "bytes": 24047,
+      "sha256": "d8f61b646a2f3f1e0af901dbdaff310ebfeea90622c93a34b9e35f34be98b896"
     },
     {
       "id": "evaluation_protocol_builder",
       "surface": "repo_hf",
       "shows": "Regenerates the protocol from committed summary metrics and task artifacts.",
       "exists": true,
+      "bytes": 19825,
+      "sha256": "aa9de1582f8fa79c1850e10e69fb125c0e3c1add433c7ebedc104c2efb42272e"
     },
     {
       "id": "task_suite_20",
       "surface": "repo_hf",
       "shows": "Reader-facing table for the single unified public-sample task suite under the same window, split, feature, and baseline contract.",
       "exists": true,
+      "bytes": 4845,
+      "sha256": "076a68734f20e2660d1eddba460672c1246951b893494396f1281d6423f3627a"
     },
     {
       "id": "task_suite_20_json",
       "surface": "website_hf",
       "shows": "Machine-readable unified 20-task index for the website, Hugging Face mirrors, and live verification.",
       "exists": true,
+      "bytes": 34585,
+      "sha256": "75145285cf71bc3bb9a10377a1921b60e85c4546dc8b858102b3c26e94c11a01"
     },
     {
       "id": "task_suite_20_builder",
       "surface": "repo_hf",
       "shows": "Regenerates the unified 20-task JSON and Markdown from the public-sample metrics plus the historical provenance result bundle.",
       "exists": true,
+      "bytes": 12157,
+      "sha256": "157265b5c025f279ce1eb56c52dd720ce0969b8426d5887030bfa179a3b565e0"
     },
     {
       "id": "unified_task_model_radar_json",
       "surface": "repo_hf",
       "shows": "Summarizes the main research lessons from committed metrics and identifies which experiments need held-out episodes.",
       "exists": true,
+      "bytes": 5175,
+      "sha256": "385d1b77b41c632925bbd27878c334839303462d03a3b9d358326951b1088da8"
     },
     {
       "id": "research_takeaways_json",
       "surface": "website_hf",
       "shows": "Machine-readable result interpretation for the website, HF cards, and mirror checks.",
       "exists": true,
+      "bytes": 7165,
+      "sha256": "f1ddead60f986e3036206bc3c70d4bdda422a8be4761b285eb89c9c49d9832b6"
     },
     {
       "id": "research_takeaways_builder",
       "surface": "repo_hf",
       "shows": "Regenerates the research takeaways from committed summary metrics and task result artifacts.",
       "exists": true,
+      "bytes": 13499,
+      "sha256": "fc749125f9be87ee0db5b66918342da5c0378d6c97fb1acabe9688f920554c39"
     },
     {
       "id": "audio_ablation_script",
       "path": "scripts/audio_ablation_and_raw_upgrade.py",
       "kind": "result_interpretation",
       "surface": "repo_hf",
+      "shows": "Measures audio contribution variants across the walkthrough-backed task contracts.",
       "exists": true,
       "bytes": 43159,
       "sha256": "2444f2e52efb975be931b33d66b7180d53031e1d5e821719122160f92f4540aa"
       "path": "docs/assets/charts/audio_ablation_delta.svg",
       "kind": "visual_evidence",
       "surface": "website_hf",
+      "shows": "Bar chart of measured current-audio primary-metric deltas across the walkthrough-backed tasks.",
       "exists": true,
       "bytes": 4146,
       "sha256": "187dbabe01f9ff18841ff61a1e7fbf85bebdd188cc0f248bb5090d64528e7568"
       "surface": "repo_hf",
       "shows": "Catalogs public figures, charts, modality thumbnails, dimensions, hashes, roles, and source scripts.",
       "exists": true,
+      "bytes": 7027,
+      "sha256": "b7b507c35cd3cba2765586e9703a447c8025c89658c3daa390df67db4211d0fc"
     },
     {
       "id": "figure_index_json",
       "surface": "website_hf",
       "shows": "Machine-readable visual asset index for website and Hugging Face mirrors.",
       "exists": true,
+      "bytes": 19485,
+      "sha256": "4f225bf08f00fbe843999d6bd2b3d5f5d6c17f2ff67e1f6a85eee9094c6bb6a3"
     },
     {
       "id": "figure_index_builder",
       "surface": "repo_hf",
       "shows": "Regenerates visual-asset hashes, dimensions, and source-script provenance.",
       "exists": true,
+      "bytes": 16845,
+      "sha256": "3f91f7f13a3fb08ab57c2f0a6b320102e9d5ae19b102b71499edb5b8fd5a2cec"
     },
     {
       "id": "brand_assets_json",
       "shows": "Machine-readable release-check summary for validators, mirrors, and public project surfaces.",
       "exists": true,
       "bytes": 8640,
+      "sha256": "6e54f6828b8fef97e963a9a56bccc91162b8a632f6897743095e32407fa0db98"
     },
     {
       "id": "public_surface_qa",
       "volatile": true,
       "shows": "Machine-readable report for SEO/social metadata, accessible tab semantics, public links, project links, and clear project presentation.",
       "exists": true,
+      "bytes": 7691,
       "hash_policy": "existence_and_size_only"
     },
     {
       "volatile": true,
       "shows": "Records the last live GitHub/HF URL verification after upload.",
       "exists": true,
+      "bytes": 189990,
       "hash_policy": "existence_and_size_only"
     },
     {
       "surface": "website_hf",
       "shows": "Machine-readable reproduction steps with expected artifacts and public boundaries.",
       "exists": true,
+      "bytes": 6836,
+      "sha256": "3f1e1615c6c0853d21bc14a8eab20af3757ecc443e72dab7744b3c0ec149fa87"
     },
     {
       "id": "artifact_index_builder",
       "surface": "repo_hf",
       "shows": "Generates the selective artifact catalog from local files.",
       "exists": true,
+      "bytes": 68279,
+      "sha256": "69b43ad5d3dc5a6893c4592fa47fff6a7a87691728ec2c61b121ec262d00bf2a"
     },
     {
       "id": "publication_audit",
       "surface": "website_hf",
       "shows": "Lists public URLs, upstream sources, and machine-readable project metadata.",
       "exists": true,
+      "bytes": 5739,
+      "sha256": "d972f30552dd346ec296f88d004c70bf2fb99e92e44ddc8d3a6dad5634f0336d"
     },
     {
       "id": "task_summary",
       "path": "results/episode_task_suite/neural_mlp",
       "kind": "result_directory",
       "surface": "repo_hf_model",
+      "shows": "Stores matching PyTorch MLP results for the walkthrough-backed task contracts.",
       "exists": true,
       "file_count": 60,
       "bytes": 90609517
       "path": "results/episode_task_suite/research_directions/research_direction_taxonomy.json",
       "kind": "taxonomy",
       "surface": "repo_hf",
+      "shows": "Maps the walkthrough-backed tasks to the four Ropedia research directions as direct/proxy/diagnostic.",
       "exists": true,
       "bytes": 25046,
       "sha256": "0e3c442e5eb9057b04b1e8c8fa723dfde6f72e7fae1378d5ea022d93f7d25ca3"
       "surface": "repo_hf",
       "shows": "Stores the historical result bundle for provenance rows with minimal and neural baselines aligned to the same 20-task window/split setup.",
       "exists": true,
+      "bytes": 33575,
+      "sha256": "d6d2f851325a691e77aed6d948f7355b16cf8d81ca35bf115e7309a7b7308efd"
     },
     {
       "id": "tier2_task_suite_json",
       "surface": "website_hf",
       "shows": "Machine-readable provenance definitions, setup alignment, metrics, and public source paths; the file name is historical.",
       "exists": true,
+      "bytes": 33575,
+      "sha256": "d6d2f851325a691e77aed6d948f7355b16cf8d81ca35bf115e7309a7b7308efd"
     },
     {
       "id": "tier2_task_suite_chart",
       "surface": "website_hf",
       "shows": "Visual summary of the historical provenance baseline metrics inside the unified 20-task suite.",
       "exists": true,
+      "bytes": 5453,
+      "sha256": "e9da29c57f42b29a7a05622fee1335089ac2b6fc9692a3b49fa5b753904db9dc"
     },
     {
       "id": "tier2_task_suite_builder",
       "surface": "repo_hf",
       "shows": "Regenerates the historical provenance rows from shared windows plus the local public-sample annotation HDF5; the script name is historical.",
       "exists": true,
+      "bytes": 47155,
+      "sha256": "569f05c1299f5186778ec75280188969fe1a5a76ae8553738fd44fc2faaab195"
     },
     {
       "id": "task_walkthroughs",
       "surface": "website_hf",
       "shows": "Presents the task suite and sample modality thumbnails with metrics generated from committed files.",
       "exists": true,
+      "bytes": 1897278,
+      "sha256": "71b1ab150e952cf902488226c65b3822d8016974f63d111204c1eb1a7745faad"
     },
     {
       "id": "modality_atlas",
       "path": "results/omni_finetune/multi_episode_128_task_baselines/BASELINE_ALIGNMENT_REPORT.md",
       "kind": "scaleup_status",
       "surface": "repo_hf",
+      "shows": "Summarizes same-split simple and neural metadata baselines for the walkthrough-backed task ids, with unsupported markers for tasks that need missing raw 128 feature blocks.",
       "exists": true,
       "bytes": 2238,
       "sha256": "c70440aa502ec569a840159ab7e05b8e7d4ed70e0091ad9a4b2fb3fb0d3803c1"
       "surface": "repo_hf",
       "shows": "Reader-facing comparison of the single-episode task suite, 128-episode aligned baselines, Qwen3-Omni packages, and Cosmos3 future-window branch.",
       "exists": true,
+      "bytes": 15997,
+      "sha256": "c8296c51eb1d67d155b84e3a39f703642d30e855fee7ee7d6ca437966b5c760b"
     },
     {
       "id": "omni_model_comparison_json",
       "surface": "repo_hf",
       "shows": "Machine-readable comparison of the current result versions, per-task aligned baselines, verified Qwen3 packages, and Cosmos3 package.",
       "exists": true,
+      "bytes": 82102,
+      "sha256": "6b246dbdb2685efdc9d0a92bb8c446a89523a1787ebc8a883805b4179e266dd1"
     },
     {
       "id": "cosmos3_nano_verified_summary",
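
The entries above pair each artifact with `bytes` and `sha256`, while volatile entries carry `"hash_policy": "existence_and_size_only"` and no hash. A minimal sketch of how one such entry could be checked against a local checkout follows. The function name `check_entry` and its return shape are hypothetical and are not taken from the repository's index builder; it assumes each entry has a file `path` relative to the repository root and does not handle directory entries (those with `file_count`).

```python
import hashlib
import tempfile
from pathlib import Path


def check_entry(entry: dict, root: Path) -> list:
    """Return a list of problems for one index entry (empty list means it passes).

    Hypothetical helper: field names follow the diff above, the logic is assumed.
    """
    path = root / entry["path"]
    if not path.is_file():
        return [f"missing: {entry['path']}"]

    problems = []
    data = path.read_bytes()

    # Size is checked for every entry, including volatile ones.
    if len(data) != entry["bytes"]:
        problems.append(f"size mismatch: {len(data)} != {entry['bytes']}")

    # Assumption: "existence_and_size_only" means the hash is never compared.
    size_only = entry.get("hash_policy") == "existence_and_size_only"
    if not size_only and "sha256" in entry:
        digest = hashlib.sha256(data).hexdigest()
        if digest != entry["sha256"]:
            problems.append(f"sha256 mismatch: {digest}")

    return problems


# Demo against a throwaway file rather than the real repository.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    payload = b'{"status": "pass"}\n'
    (root / "example.json").write_bytes(payload)
    entry = {
        "path": "example.json",
        "bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    print(check_entry(entry, root))                         # matching entry
    print(check_entry(dict(entry, sha256="0" * 64), root))  # stale hash
```

Reading the file once and deriving both the size and the digest from the same bytes keeps the two checks consistent with each other, which matters when a regenerated file changes both fields in the same commit, as most entries in this diff do.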

data/evaluation_protocol.json CHANGED

@@ -2,7 +2,7 @@
   "title": "Ropedia Xperience-10M Task Suite Evaluation Protocol",
   "status": "pass",
   "version": "2026-06-01",
-  "generated_at_utc": "2026-06-21T14:40:33+00:00",
   "source_files": [
     "docs/data/summary_metrics.json",
     "results/episode_task_suite/summary_report.json",
@@ -26,8 +26,8 @@
   "task_suite": {
     "status": "unified_public_sample_suite",
     "task_count": 20,
-    "original_public_sample_tasks": 12,
-    "additional_public_sample_tasks": 8,
     "unified_results": "docs/data/task_suite_20.json",
     "legacy_additional_task_result_path": "docs/data/tier2_task_suite.json",
     "legacy_path_note": "The tier2_task_suite path is retained for stable links only; it is provenance inside the same 20-task suite."
@@ -82,7 +82,7 @@
     {
       "task": "timeline_action",
       "task_display_name": "Action Recognition",
-      "origin": "original_public_sample_tasks",
       "family": "supervised classification",
       "unit": "single window",
       "input": "current 20-frame all-feature window",
@@ -105,7 +105,7 @@
     {
       "task": "timeline_subtask",
       "task_display_name": "Procedure Step Recognition",
-      "origin": "original_public_sample_tasks",
       "family": "supervised classification",
       "unit": "single window",
       "input": "current 20-frame all-feature window",
@@ -128,7 +128,7 @@
     {
       "task": "transition_detection",
       "task_display_name": "Action Boundary Detection",
-      "origin": "original_public_sample_tasks",
       "family": "temporal diagnostic",
       "unit": "single window",
       "input": "current 20-frame all-feature window",
@@ -151,7 +151,7 @@
     {
       "task": "next_action",
       "task_display_name": "Next-Action Prediction",
-      "origin": "original_public_sample_tasks",
       "family": "short-horizon prediction",
       "unit": "single window",
       "input": "current 20-frame all-feature window at time t",
@@ -174,7 +174,7 @@
     {
       "task": "hand_trajectory_forecast",
       "task_display_name": "Hand Trajectory Forecasting",
-      "origin": "original_public_sample_tasks",
       "family": "trajectory regression",
       "unit": "single window",
       "input": "current all-feature window",
@@ -197,7 +197,7 @@
     {
       "task": "contact_prediction",
       "task_display_name": "Contact State Prediction",
-      "origin": "original_public_sample_tasks",
       "family": "binary classification",
       "unit": "single window",
       "input": "non-contact and non-caption feature blocks",
@@ -220,7 +220,7 @@
     {
       "task": "object_relevance",
       "task_display_name": "Object Relevance Prediction",
-      "origin": "original_public_sample_tasks",
       "family": "multi-label classification",
       "unit": "single window",
       "input": "non-caption feature blocks",
@@ -243,7 +243,7 @@
     {
       "task": "caption_grounding",
       "task_display_name": "Language Grounding",
-      "origin": "original_public_sample_tasks",
       "family": "retrieval",
       "unit": "caption query",
       "input": "caption object/interaction query plus candidate sensor windows",
@@ -266,7 +266,7 @@
     {
       "task": "cross_modal_retrieval",
       "task_display_name": "Cross-Modal Retrieval",
-      "origin": "original_public_sample_tasks",
       "family": "retrieval",
       "unit": "sensor query",
       "input": "motion, IMU, and camera query features",
@@ -289,7 +289,7 @@
     {
       "task": "modality_reconstruction",
       "task_display_name": "Cross-Modal Reconstruction",
-      "origin": "original_public_sample_tasks",
       "family": "cross-modal regression",
       "unit": "single window",
       "input": "motion, IMU, and camera features",
@@ -311,7 +311,7 @@
     {
       "task": "temporal_order",
       "task_display_name": "Temporal Order Verification",
-      "origin": "original_public_sample_tasks",
       "family": "pairwise diagnostic",
       "unit": "adjacent window pair",
       "input": "two adjacent windows",
@@ -334,7 +334,7 @@
     {
       "task": "misalignment_detection",
       "task_display_name": "Multimodal Synchronization Detection",
-      "origin": "original_public_sample_tasks",
       "family": "pairwise diagnostic",
       "unit": "paired modality window",
       "input": "motion side plus visual/depth side",
@@ -357,7 +357,7 @@
     {
       "task": "long_horizon_next_action",
       "task_display_name": "Long-Horizon Next-Action Forecasting",
-      "origin": "additional_public_sample_tasks",
       "family": "classification",
       "unit": "single aligned window",
       "input": "Current 20-frame non-caption multimodal window.",
@@ -375,7 +375,7 @@
     {
       "task": "next_subtask_forecast",
       "task_display_name": "Long-Horizon Next-Subtask Forecasting",
-      "origin": "additional_public_sample_tasks",
       "family": "classification",
       "unit": "single aligned window",
       "input": "Current 20-frame non-caption multimodal window.",
@@ -393,7 +393,7 @@
     {
       "task": "interaction_text_prediction",
       "task_display_name": "Interaction Text Prediction",
-      "origin": "additional_public_sample_tasks",
       "family": "classification",
       "unit": "single aligned window",
       "input": "Current 20-frame sensor window with caption-text features removed.",
@@ -411,7 +411,7 @@
     {
       "task": "action_object_relation",
       "task_display_name": "Action-Object Relation Prediction",
-      "origin": "additional_public_sample_tasks",
       "family": "classification",
       "unit": "single aligned window",
       "input": "Current 20-frame sensor window with caption-text features removed.",
@@ -429,7 +429,7 @@
     {
       "task": "object_set_forecast",
       "task_display_name": "Future Object-Set Forecasting",
-      "origin": "additional_public_sample_tasks",
       "family": "multi_label",
       "unit": "single aligned window",
       "input": "Current 20-frame sensor window with caption-text features removed.",
@@ -447,7 +447,7 @@
     {
       "task": "imu_to_hand_pose",
       "task_display_name": "IMU-to-Hand Pose Reconstruction",
-      "origin": "additional_public_sample_tasks",
       "family": "regression",
       "unit": "single aligned window",
       "input": "Current IMU acceleration/gyroscope feature block only.",
@@ -465,7 +465,7 @@
     {
       "task": "camera_view_sync_retrieval",
       "task_display_name": "Camera-View Synchronization Retrieval",
-      "origin": "additional_public_sample_tasks",
       "family": "retrieval",
       "unit": "held-out query window",
       "input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
@@ -483,7 +483,7 @@
     {
       "task": "time_to_transition",
       "task_display_name": "Time-to-Next-Transition Regression",
-      "origin": "additional_public_sample_tasks",
       "family": "regression",
       "unit": "single aligned window",
       "input": "Current 20-frame non-caption multimodal window.",

   "title": "Ropedia Xperience-10M Task Suite Evaluation Protocol",
   "status": "pass",
   "version": "2026-06-01",
+  "generated_at_utc": "2026-06-21T15:20:33+00:00",
   "source_files": [
     "docs/data/summary_metrics.json",
     "results/episode_task_suite/summary_report.json",
   "task_suite": {
     "status": "unified_public_sample_suite",
     "task_count": 20,
+    "public_framing": "all 20 public-sample task contracts are presented as one suite",
+    "legacy_provenance_rows": 8,
     "unified_results": "docs/data/task_suite_20.json",
     "legacy_additional_task_result_path": "docs/data/tier2_task_suite.json",
     "legacy_path_note": "The tier2_task_suite path is retained for stable links only; it is provenance inside the same 20-task suite."
     {
       "task": "timeline_action",
       "task_display_name": "Action Recognition",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "supervised classification",
       "unit": "single window",
       "input": "current 20-frame all-feature window",
     {
       "task": "timeline_subtask",
       "task_display_name": "Procedure Step Recognition",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "supervised classification",
       "unit": "single window",
       "input": "current 20-frame all-feature window",
     {
       "task": "transition_detection",
       "task_display_name": "Action Boundary Detection",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "temporal diagnostic",
       "unit": "single window",
       "input": "current 20-frame all-feature window",
     {
       "task": "next_action",
       "task_display_name": "Next-Action Prediction",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "short-horizon prediction",
       "unit": "single window",
       "input": "current 20-frame all-feature window at time t",
     {
       "task": "hand_trajectory_forecast",
       "task_display_name": "Hand Trajectory Forecasting",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "trajectory regression",
       "unit": "single window",
       "input": "current all-feature window",
     {
       "task": "contact_prediction",
       "task_display_name": "Contact State Prediction",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "binary classification",
       "unit": "single window",
       "input": "non-contact and non-caption feature blocks",
     {
       "task": "object_relevance",
       "task_display_name": "Object Relevance Prediction",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "multi-label classification",
       "unit": "single window",
       "input": "non-caption feature blocks",
     {
       "task": "caption_grounding",
       "task_display_name": "Language Grounding",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "retrieval",
       "unit": "caption query",
       "input": "caption object/interaction query plus candidate sensor windows",
     {
       "task": "cross_modal_retrieval",
       "task_display_name": "Cross-Modal Retrieval",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "retrieval",
       "unit": "sensor query",
       "input": "motion, IMU, and camera query features",
     {
       "task": "modality_reconstruction",
       "task_display_name": "Cross-Modal Reconstruction",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "cross-modal regression",
       "unit": "single window",
       "input": "motion, IMU, and camera features",
     {
       "task": "temporal_order",
       "task_display_name": "Temporal Order Verification",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "pairwise diagnostic",
       "unit": "adjacent window pair",
       "input": "two adjacent windows",
     {
       "task": "misalignment_detection",
       "task_display_name": "Multimodal Synchronization Detection",
+      "provenance_source": "walkthrough_backed_task_contract",
       "family": "pairwise diagnostic",
       "unit": "paired modality window",
       "input": "motion side plus visual/depth side",
     {
       "task": "long_horizon_next_action",
       "task_display_name": "Long-Horizon Next-Action Forecasting",
+      "provenance_source": "historical_result_bundle",
       "family": "classification",
       "unit": "single aligned window",
       "input": "Current 20-frame non-caption multimodal window.",
     {
       "task": "next_subtask_forecast",
       "task_display_name": "Long-Horizon Next-Subtask Forecasting",
+      "provenance_source": "historical_result_bundle",
       "family": "classification",
       "unit": "single aligned window",
       "input": "Current 20-frame non-caption multimodal window.",
     {
       "task": "interaction_text_prediction",
       "task_display_name": "Interaction Text Prediction",
+      "provenance_source": "historical_result_bundle",
       "family": "classification",
       "unit": "single aligned window",
       "input": "Current 20-frame sensor window with caption-text features removed.",
     {
       "task": "action_object_relation",
       "task_display_name": "Action-Object Relation Prediction",
+      "provenance_source": "historical_result_bundle",
       "family": "classification",
       "unit": "single aligned window",
       "input": "Current 20-frame sensor window with caption-text features removed.",
     {
       "task": "object_set_forecast",
       "task_display_name": "Future Object-Set Forecasting",
+      "provenance_source": "historical_result_bundle",
       "family": "multi_label",
       "unit": "single aligned window",
       "input": "Current 20-frame sensor window with caption-text features removed.",
     {
       "task": "imu_to_hand_pose",
       "task_display_name": "IMU-to-Hand Pose Reconstruction",
+      "provenance_source": "historical_result_bundle",
       "family": "regression",
       "unit": "single aligned window",
       "input": "Current IMU acceleration/gyroscope feature block only.",
     {
       "task": "camera_view_sync_retrieval",
       "task_display_name": "Camera-View Synchronization Retrieval",
+      "provenance_source": "historical_result_bundle",
       "family": "retrieval",
       "unit": "held-out query window",
       "input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
     {
       "task": "time_to_transition",
       "task_display_name": "Time-to-Next-Transition Regression",
+      "provenance_source": "historical_result_bundle",
       "family": "regression",
       "unit": "single aligned window",
       "input": "Current 20-frame non-caption multimodal window.",

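The provenance labels added in this file can be cross-checked against the suite-level counts. A minimal sketch, assuming the task entries are available as a list of dicts: the diff does not show the enclosing key, so the `tasks` argument and the `check_suite` helper are hypothetical, and only `task_count`, `legacy_provenance_rows`, and `provenance_source` come from the file itself.

```python
from collections import Counter


def check_suite(task_suite, tasks):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    counts = Counter(t.get("provenance_source", "missing") for t in tasks)
    if len(tasks) != task_suite["task_count"]:
        problems.append(
            f"expected {task_suite['task_count']} tasks, found {len(tasks)}"
        )
    legacy = counts["historical_result_bundle"]
    if legacy != task_suite["legacy_provenance_rows"]:
        problems.append(
            f"expected {task_suite['legacy_provenance_rows']} legacy rows, found {legacy}"
        )
    if counts["missing"]:
        problems.append(f"{counts['missing']} tasks lack provenance_source")
    return problems


# Stand-in entries mirroring the 12 + 8 split visible in the diff;
# the task names here are placeholders, not the real task ids.
tasks = [
    {"task": f"walkthrough_{i}", "provenance_source": "walkthrough_backed_task_contract"}
    for i in range(12)
] + [
    {"task": f"historical_{i}", "provenance_source": "historical_result_bundle"}
    for i in range(8)
]
suite = {"task_count": 20, "legacy_provenance_rows": 8}
print(check_suite(suite, tasks))  # prints []
```

Returning a list of problems rather than raising keeps the check usable inside a larger audit report, where several mismatches may need to be shown at once.
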
data/live_publication_status.json CHANGED

The diff for this file is too large to render. See raw diff

data/mirror_parity.json CHANGED

The diff for this file is too large to render. See raw diff

data/omni_model_comparison.json CHANGED

@@ -1,12 +1,12 @@
 {
   "title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
-  "generated_at_utc": "2026-06-21T10:47:04+00:00",
   "status": "pass",
   "version_count": 3,
   "model_group_count": 5,
   "comparison_rule": "Compare only rows with the same scope and target. Single-episode raw-feature metrics, 128-episode metadata baselines, Qwen3 structured JSON metrics, and the two Cosmos3 targets answer different questions: Nano future-window retrieval versus Super structured JSON Reasoner evaluation.",
   "version_reading_notes": [
-    "Version 1 is the public-sample 20-task surface: original core heads, tasks 13-20, and the 180-row method-task matrix.",
     "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
     "The selected-128 model-diagnostic group contains the current Qwen3-Omni LoRA JSON-task row, Cosmos3-Nano future-window compatibility result, Cosmos3-Super Reasoner base-weight JSON-task evaluation, and the separate Cosmos3-Super Forward-Dynamics LoRA adapter artifact."
   ],

 {
   "title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
+  "generated_at_utc": "2026-06-21T15:17:00+00:00",
   "status": "pass",
   "version_count": 3,
   "model_group_count": 5,
   "comparison_rule": "Compare only rows with the same scope and target. Single-episode raw-feature metrics, 128-episode metadata baselines, Qwen3 structured JSON metrics, and the two Cosmos3 targets answer different questions: Nano future-window retrieval versus Super structured JSON Reasoner evaluation.",
   "version_reading_notes": [
+    "Version 1 is the public-sample 20-task surface: unified task heads, historical provenance rows, and the 180-row method-task matrix.",
     "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
     "The selected-128 model-diagnostic group contains the current Qwen3-Omni LoRA JSON-task row, Cosmos3-Nano future-window compatibility result, Cosmos3-Super Reasoner base-weight JSON-task evaluation, and the separate Cosmos3-Super Forward-Dynamics LoRA adapter artifact."
   ],
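
The `comparison_rule` above ("compare only rows with the same scope and target") can be sketched as a grouping step. This is an illustration only: the row fields `model`, `scope`, and `target`, the label values, and both helper names are assumptions, since the diff does not show the row schema of this file.

```python
from collections import defaultdict


def comparable_groups(rows):
    """Group model names by (scope, target); only same-group rows are comparable."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["scope"], row["target"])].append(row["model"])
    return dict(groups)


def can_compare(a, b):
    """True when two rows share both scope and target."""
    return a["scope"] == b["scope"] and a["target"] == b["target"]


# Hypothetical rows; the labels are placeholders and carry no metric values.
rows = [
    {"model": "simple_baseline", "scope": "selected_128", "target": "same_split_labels"},
    {"model": "nn_baseline", "scope": "selected_128", "target": "same_split_labels"},
    {"model": "cosmos3_nano", "scope": "selected_128", "target": "future_window_retrieval"},
    {"model": "cosmos3_super_reasoner", "scope": "selected_128", "target": "structured_json"},
]

for key, models in comparable_groups(rows).items():
    print(key, models)
```

Under these placeholder labels, the two baselines land in one group, while the Nano and Super rows each sit alone, matching the rule's point that they answer different questions.
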

data/project_manifest.json CHANGED

@@ -23,9 +23,8 @@
     "qwen3_omni_json_quality_target_met": true,
     "qwen3_omni_lora_adapter_repo": "https://huggingface.co/cy0307/ropedia-qwen3-omni-lora-128ep",
     "task_count": 20,
-    "original_public_sample_task_count": 12,
-    "additional_public_sample_task_count": 8,
-    "legacy_tasks_13_to_20_result_path": "docs/data/tier2_task_suite.json"
   },
   "public_surfaces": {
     "github_repo": "https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite",
@@ -96,7 +95,7 @@
     "task_walkthroughs": "docs/data/task_walkthroughs.json",
     "task_suite_20": "TASK_SUITE_20.md",
     "task_suite_20_json": "docs/data/task_suite_20.json",
-    "tasks_13_to_20_result_bundle": "docs/data/tier2_task_suite.json"
   },
   "citation_files": {
     "citation_cff": "CITATION.cff",

     "qwen3_omni_json_quality_target_met": true,
     "qwen3_omni_lora_adapter_repo": "https://huggingface.co/cy0307/ropedia-qwen3-omni-lora-128ep",
     "task_count": 20,
+    "task_surface_framing": "unified_20_task_suite",
+    "legacy_provenance_result_path": "docs/data/tier2_task_suite.json"
   },
   "public_surfaces": {
     "github_repo": "https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite",
     "task_walkthroughs": "docs/data/task_walkthroughs.json",
     "task_suite_20": "TASK_SUITE_20.md",
     "task_suite_20_json": "docs/data/task_suite_20.json",
+    "historical_provenance_result_bundle": "docs/data/tier2_task_suite.json"
   },
   "citation_files": {
     "citation_cff": "CITATION.cff",

data/project_packet.json CHANGED

@@ -15,9 +15,8 @@
     "cosmos3_super_forward_dynamics_lora_status": "The first Cosmos3-Super fine-tuned adapter branch is verified as a forward-dynamics LoRA over camera-pose proxy targets; it reports loss metrics, not JSON action-label accuracy.",
     "task_suite_enhancement_128_status": "Current no-new-episode enhancement pack recommends multiscale_20s10_40s20_80s40, hierarchical action/subtask targets, label-normalized scoring, and raw-feature shards before adding more episodes.",
     "task_count": 20,
-    "original_public_sample_task_count": 12,
-    "additional_public_sample_task_count": 8,
-    "legacy_tasks_13_to_20_result_path": "docs/data/tier2_task_suite.json"
   },
   "reading_path": [
     {
@@ -110,7 +109,7 @@
         "results/episode_task_suite/neural_mlp/",
         "docs/data/summary_metrics.json"
       ],
-      "readout": "The unified suite has 20 task contracts; tasks 1-12 have walkthroughs and neural MLP heads, and tasks 13-20 have aligned minimal/neural result bundles under the historical tier2_task_suite path."
     },
     {
       "step": 8,

     "cosmos3_super_forward_dynamics_lora_status": "The first Cosmos3-Super fine-tuned adapter branch is verified as a forward-dynamics LoRA over camera-pose proxy targets; it reports loss metrics, not JSON action-label accuracy.",
     "task_suite_enhancement_128_status": "Current no-new-episode enhancement pack recommends multiscale_20s10_40s20_80s40, hierarchical action/subtask targets, label-normalized scoring, and raw-feature shards before adding more episodes.",
     "task_count": 20,
+    "task_surface_framing": "unified_20_task_suite",
+    "legacy_provenance_result_path": "docs/data/tier2_task_suite.json"
   },
   "reading_path": [
     {
         "results/episode_task_suite/neural_mlp/",
         "docs/data/summary_metrics.json"
       ],
+      "readout": "The unified suite has 20 task contracts in one task surface. Walkthrough-backed tasks, aligned minimal/neural result bundles, and historical tier2_task_suite provenance paths are all linked from TASK_SUITE_20.md and docs/data/task_suite_20.json."
     },
     {
       "step": 8,

data/project_status.json CHANGED

@@ -62,9 +62,8 @@
     "task_suite_enhancement_128_recommended_export": "multiscale_20s10_40s20_80s40",
     "task_suite_enhancement_128_estimated_windows": 106095,
     "task_count": 20,
-    "original_public_sample_task_count": 12,
-    "additional_public_sample_task_count": 8,
-    "legacy_tasks_13_to_20_result_path": "docs/data/tier2_task_suite.json"
   },
   "rows": [
     {
@@ -86,7 +85,7 @@
         "results/episode_task_suite/",
         "results/episode_task_suite/tier2_task_suite/"
       ],
-      "readout": "All 20 task contracts have committed minimal metrics; tasks 13-20 reuse the same 20-frame windows, 5-frame stride, chronological split, and minimal/neural head pattern. The tier2_task_suite path is historical and now stores tasks 13-20, not a separate public tier."
     },
     {
       "area": "180-result method matrix",
@@ -116,7 +115,7 @@
         "results/audio_ablation/",
         "docs/data/audio_ablation_summary.json"
       ],
-      "readout": "Audio variants improve the primary metric on 6 of the original task contracts in this single-episode setting."
     },
     {
       "area": "Evaluation protocol",
@@ -355,7 +354,7 @@
     "The Cosmos3-Nano future-window package is verified as a compatibility adapter result, Cosmos3-Super Reasoner is verified as a base-weight evaluation, and Cosmos3-Super Forward-Dynamics LoRA is verified as the first fine-tuned Super adapter artifact. Cosmos3-Super adapter weights belong in cy0307/ropedia-cosmos3-super-forward-dynamics-lora-128ep; verified_public packages exclude safetensors.",
     "The current reconstruction task reconstructs feature vectors, not pixel-depth, mesh, NeRF, or Gaussian reconstruction.",
     "Audio is one of the synchronized source modalities in the current task representation.",
-    "The audio ablation report compares audio/no-audio variants across the original task contracts in results/audio_ablation/.",
     "Foundation-model selection is explicit: Qwen3-Omni is the structured JSON baseline, Cosmos 3 is the world-model track with Nano compatibility and Super forward-dynamics LoRA results, and policy models such as OpenVLA/openpi/GR00T wait for robot-compatible action-target conversion.",
     "Future model tracks should be added through the backbone registry and verified package contract, not as one-off result folders with incompatible metrics or publication rules.",
     "The Xperience Embodied Foundation Model is a future native-pretraining goal, not a completed model or current benchmark."

     "task_suite_enhancement_128_recommended_export": "multiscale_20s10_40s20_80s40",
     "task_suite_enhancement_128_estimated_windows": 106095,
     "task_count": 20,
+    "task_surface_framing": "unified_20_task_suite",
+    "legacy_provenance_result_path": "docs/data/tier2_task_suite.json"
   },
   "rows": [
     {
         "results/episode_task_suite/",
         "results/episode_task_suite/tier2_task_suite/"
       ],
+      "readout": "All 20 task contracts are presented together with committed minimal metrics, the same 20-frame windows, 5-frame stride, chronological split, and minimal/neural head pattern. The tier2_task_suite path is historical provenance inside the suite, not a separate public tier."
     },
     {
       "area": "180-result method matrix",
         "results/audio_ablation/",
         "docs/data/audio_ablation_summary.json"
       ],
+      "readout": "Audio variants improve the primary metric on 6 walkthrough-backed task contracts in this single-episode setting."
     },
     {
       "area": "Evaluation protocol",
     "The Cosmos3-Nano future-window package is verified as a compatibility adapter result, Cosmos3-Super Reasoner is verified as a base-weight evaluation, and Cosmos3-Super Forward-Dynamics LoRA is verified as the first fine-tuned Super adapter artifact. Cosmos3-Super adapter weights belong in cy0307/ropedia-cosmos3-super-forward-dynamics-lora-128ep; verified_public packages exclude safetensors.",
     "The current reconstruction task reconstructs feature vectors, not pixel-depth, mesh, NeRF, or Gaussian reconstruction.",
     "Audio is one of the synchronized source modalities in the current task representation.",
+    "The audio ablation report compares audio/no-audio variants across the walkthrough-backed task contracts in results/audio_ablation/.",
     "Foundation-model selection is explicit: Qwen3-Omni is the structured JSON baseline, Cosmos 3 is the world-model track with Nano compatibility and Super forward-dynamics LoRA results, and policy models such as OpenVLA/openpi/GR00T wait for robot-compatible action-target conversion.",
     "Future model tracks should be added through the backbone registry and verified package contract, not as one-off result folders with incompatible metrics or publication rules.",
     "The Xperience Embodied Foundation Model is a future native-pretraining goal, not a completed model or current benchmark."

data/publication_audit.json CHANGED

@@ -1,6 +1,6 @@
 {
   "status": "pass",
-  "generated_at_utc": "2026-06-21T14:46:11+00:00",
   "checks": [
     {
       "name": "required_publication_assets_present",

 {
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:22:42+00:00",
   "checks": [
     {
       "name": "required_publication_assets_present",

data/quality_gates.json CHANGED

@@ -1,7 +1,7 @@
 {
   "title": "Ropedia Xperience-10M Release Checks",
   "status": "pass",
-  "generated_at_utc": "2026-06-21T14:46:48+00:00",
   "rule": "A release is current when the automated reports pass and the live GitHub/Hugging Face mirrors are verified after publishing.",
   "automated_gates": [
     {

 {
   "title": "Ropedia Xperience-10M Release Checks",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:21:42+00:00",
   "rule": "A release is current when the automated reports pass and the live GitHub/Hugging Face mirrors are verified after publishing.",
   "automated_gates": [
     {

data/reproducibility_matrix.json CHANGED

@@ -39,7 +39,7 @@
       "id": "original_task_suite",
       "status": "reproducible",
       "command": "python scripts/episode_task_suite.py --workspace $WORKSPACE --include-neural",
-      "expected": "original task metrics, predictions, manifests, and neural_mlp task-head artifacts",
       "boundary": "8,546-dimensional multimodal window contract"
     },
     {
@@ -50,11 +50,11 @@
       "boundary": "single-episode probes, not full research-direction solutions"
     },
     {
-      "id": "tasks_13_to_20_and_unified_index",
       "status": "reproducible",
       "command": "python scripts/tier2_task_suite.py && python scripts/build_unified_task_suite.py && python scripts/build_unified_task_model_radar.py",
-      "expected": "tasks 13-20 metrics, prediction/rank artifacts, TASK_SUITE_20.md, docs/data/task_suite_20.json, docs/data/tier2_task_suite.json, docs/assets/charts/tier2_task_suite.svg, docs/data/unified_task_model_radar.json, and docs/assets/charts/unified_task_model_radar.svg",
-      "boundary": "requires local public-sample annotation.hdf5 plus HOMIE Toolkit or h5py for tasks 13-20; raw HDF5 and MP4 files are not redistributed"
     },
     {
       "id": "source_alignment_audit",

       "id": "original_task_suite",
       "status": "reproducible",
       "command": "python scripts/episode_task_suite.py --workspace $WORKSPACE --include-neural",
+      "expected": "walkthrough-backed task metrics, predictions, manifests, and neural_mlp task-head artifacts",
       "boundary": "8,546-dimensional multimodal window contract"
     },
     {
       "boundary": "single-episode probes, not full research-direction solutions"
     },
     {
+      "id": "unified_20_task_index",
       "status": "reproducible",
       "command": "python scripts/tier2_task_suite.py && python scripts/build_unified_task_suite.py && python scripts/build_unified_task_model_radar.py",
+      "expected": "unified 20-task metrics, prediction/rank artifacts, TASK_SUITE_20.md, docs/data/task_suite_20.json, docs/data/tier2_task_suite.json, docs/assets/charts/tier2_task_suite.svg, docs/data/unified_task_model_radar.json, and docs/assets/charts/unified_task_model_radar.svg",
+      "boundary": "requires local public-sample annotation.hdf5 plus HOMIE Toolkit or h5py for full public-task regeneration; raw HDF5 and MP4 files are not redistributed"
     },
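
The `unified_20_task_index` command chains three scripts with `&&`, so a later step runs only if the earlier one succeeded. A minimal sketch of the same behaviour from Python, assuming it is launched from the repository root; `run_steps` and `dry_run` are hypothetical helpers, and the dry run only prints the commands rather than executing the scripts.

```python
import subprocess
import sys
from types import SimpleNamespace

# Script paths copied from the "command" field of unified_20_task_index.
STEPS = [
    "scripts/tier2_task_suite.py",
    "scripts/build_unified_task_suite.py",
    "scripts/build_unified_task_model_radar.py",
]


def run_steps(steps, run=subprocess.run):
    """Run each script in order and stop at the first non-zero exit, like `&&`."""
    completed = []
    for script in steps:
        result = run([sys.executable, script])
        if result.returncode != 0:
            break
        completed.append(script)
    return completed


def dry_run(cmd):
    """Stand-in runner: print the command and report success without executing."""
    print("would run:", " ".join(cmd))
    return SimpleNamespace(returncode=0)


print(run_steps(STEPS, run=dry_run))
```

Calling `run_steps(STEPS)` without the `run` argument would execute the real scripts, which per the `boundary` field needs the local public-sample `annotation.hdf5` plus HOMIE Toolkit or h5py.
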
     {
       "id": "source_alignment_audit",

data/research_takeaways.json CHANGED

@@ -1,7 +1,7 @@
 {
   "title": "Ropedia Xperience-10M Research Takeaways",
   "status": "pass",
-  "generated_at_utc": "2026-06-20T21:27:21+00:00",
   "source_files": [
     "docs/data/summary_metrics.json",
     "results/episode_task_suite/summary_report.json",
@@ -133,7 +133,7 @@
     {
       "id": "audio_contribution_is_task_specific",
       "title": "Audio helps some tasks and hurts others on the public sample",
-      "readout": "Audio improves the primary metric on 6 of the original task contracts, while raw log-mel replacement improves over the current handcrafted block on 6 of those contracts. The largest current-audio gain appears in feature reconstruction, not in action classification.",
       "evidence": [
         {
           "label": "tasks_where_current_audio_improves",

 {
   "title": "Ropedia Xperience-10M Research Takeaways",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:18:59+00:00",
   "source_files": [
     "docs/data/summary_metrics.json",
     "results/episode_task_suite/summary_report.json",
     {
       "id": "audio_contribution_is_task_specific",
       "title": "Audio helps some tasks and hurts others on the public sample",
+      "readout": "Audio improves the primary metric on 6 walkthrough-backed task contracts, while raw log-mel replacement improves over the current handcrafted block on 6 of those contracts. The largest current-audio gain appears in feature reconstruction, not in action classification.",
       "evidence": [
         {
           "label": "tasks_where_current_audio_improves",

data/scope_claims_audit.json CHANGED

@@ -1,6 +1,6 @@
 {
   "status": "pass",
-  "generated_at_utc": "2026-06-21T14:47:03+00:00",
   "summary": {
     "qwen3_omni_verified_diagnostic_pilot": true,
     "dataset_manifest_num_episodes": 119,

 {
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:23:13+00:00",
   "summary": {
     "qwen3_omni_verified_diagnostic_pilot": true,
     "dataset_manifest_num_episodes": 119,

data/single_episode_task_model_radar.json CHANGED

@@ -1,7 +1,7 @@
 {
   "title": "Single-Episode 20-Task Radar",
   "status": "pass",
-  "generated_at_utc": "2026-06-21T10:47:17+00:00",
   "description": "Minimal and Neural MLP baselines on the one public sample episode, both scored on all 20 task contracts.",
   "task_count": 20,
   "method_count": 2,
@@ -73,7 +73,7 @@
       "label": "Action Recognition",
       "axis_label": "01 Action Recognition",
       "short_label": "Action",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -107,7 +107,7 @@
       "label": "Procedure Step Recognition",
       "axis_label": "02 Procedure Step Recognition",
       "short_label": "Step",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -141,7 +141,7 @@
       "label": "Action Boundary Detection",
       "axis_label": "03 Action Boundary Detection",
       "short_label": "Boundary",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -175,7 +175,7 @@
       "label": "Next-Action Prediction",
       "axis_label": "04 Next-Action Prediction",
       "short_label": "Next act",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -209,7 +209,7 @@
       "label": "Hand Trajectory Forecasting",
       "axis_label": "05 Hand Trajectory Forecasting",
       "short_label": "Hand traj",
-      "origin": "original_public_sample_tasks",
       "metric_key": "mpjpe",
       "metric_name": "MPJPE",
       "metric_direction": "lower",
@@ -243,7 +243,7 @@
       "label": "Contact State Prediction",
       "axis_label": "06 Contact State Prediction",
       "short_label": "Contact",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -277,7 +277,7 @@
       "label": "Object Relevance Prediction",
       "axis_label": "07 Object Relevance Prediction",
       "short_label": "Objects",
-      "origin": "original_public_sample_tasks",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
@@ -311,7 +311,7 @@
       "label": "Language Grounding",
       "axis_label": "08 Language Grounding",
       "short_label": "Language",
-      "origin": "original_public_sample_tasks",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
@@ -345,7 +345,7 @@
       "label": "Cross-Modal Retrieval",
       "axis_label": "09 Cross-Modal Retrieval",
       "short_label": "X-modal",
-      "origin": "original_public_sample_tasks",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
@@ -379,7 +379,7 @@
       "label": "Cross-Modal Reconstruction",
       "axis_label": "10 Cross-Modal Reconstruction",
       "short_label": "Recon",
-      "origin": "original_public_sample_tasks",
       "metric_key": "r2",
       "metric_name": "R2",
       "metric_direction": "higher",
@@ -413,7 +413,7 @@
       "label": "Temporal Order Verification",
       "axis_label": "11 Temporal Order Verification",
       "short_label": "Order",
-      "origin": "original_public_sample_tasks",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
@@ -447,7 +447,7 @@
       "label": "Multimodal Synchronization Detection",
       "axis_label": "12 Multimodal Synchronization Detection",
       "short_label": "Sync",
-      "origin": "original_public_sample_tasks",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
@@ -481,7 +481,7 @@
       "label": "Long-Horizon Next-Action Forecasting",
       "axis_label": "13 Long-Horizon Next-Action Forecasting",
       "short_label": "Long act",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -515,7 +515,7 @@
       "label": "Long-Horizon Next-Subtask Forecasting",
       "axis_label": "14 Long-Horizon Next-Subtask Forecasting",
       "short_label": "Long step",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -549,7 +549,7 @@
       "label": "Interaction Text Prediction",
       "axis_label": "15 Interaction Text Prediction",
       "short_label": "Interact txt",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -583,7 +583,7 @@
       "label": "Action-Object Relation Prediction",
       "axis_label": "16 Action-Object Relation Prediction",
       "short_label": "Act+obj",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -617,7 +617,7 @@
       "label": "Future Object-Set Forecasting",
       "axis_label": "17 Future Object-Set Forecasting",
       "short_label": "Future obj",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
@@ -651,7 +651,7 @@
       "label": "IMU-to-Hand Pose Reconstruction",
       "axis_label": "18 IMU-to-Hand Pose Reconstruction",
       "short_label": "IMU->hand",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "mae",
       "metric_name": "MAE",
       "metric_direction": "lower",
@@ -685,7 +685,7 @@
       "label": "Camera-View Synchronization Retrieval",
       "axis_label": "19 Camera-View Synchronization Retrieval",
       "short_label": "Cam sync",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
@@ -719,7 +719,7 @@
       "label": "Time-to-Next-Transition Regression",
       "axis_label": "20 Time-to-Next-Transition Regression",
       "short_label": "Time2bdry",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "mae",
       "metric_name": "MAE frames",
       "metric_direction": "lower",

 {
   "title": "Single-Episode 20-Task Radar",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:20:34+00:00",
   "description": "Minimal and Neural MLP baselines on the one public sample episode, both scored on all 20 task contracts.",
   "task_count": 20,
   "method_count": 2,
       "label": "Action Recognition",
       "axis_label": "01 Action Recognition",
       "short_label": "Action",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Procedure Step Recognition",
       "axis_label": "02 Procedure Step Recognition",
       "short_label": "Step",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Action Boundary Detection",
       "axis_label": "03 Action Boundary Detection",
       "short_label": "Boundary",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Next-Action Prediction",
       "axis_label": "04 Next-Action Prediction",
       "short_label": "Next act",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Hand Trajectory Forecasting",
       "axis_label": "05 Hand Trajectory Forecasting",
       "short_label": "Hand traj",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "mpjpe",
       "metric_name": "MPJPE",
       "metric_direction": "lower",
       "label": "Contact State Prediction",
       "axis_label": "06 Contact State Prediction",
       "short_label": "Contact",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Object Relevance Prediction",
       "axis_label": "07 Object Relevance Prediction",
       "short_label": "Objects",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
       "label": "Language Grounding",
       "axis_label": "08 Language Grounding",
       "short_label": "Language",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
       "label": "Cross-Modal Retrieval",
       "axis_label": "09 Cross-Modal Retrieval",
       "short_label": "X-modal",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
       "label": "Cross-Modal Reconstruction",
       "axis_label": "10 Cross-Modal Reconstruction",
       "short_label": "Recon",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "r2",
       "metric_name": "R2",
       "metric_direction": "higher",
       "label": "Temporal Order Verification",
       "axis_label": "11 Temporal Order Verification",
       "short_label": "Order",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
       "label": "Multimodal Synchronization Detection",
       "axis_label": "12 Multimodal Synchronization Detection",
       "short_label": "Sync",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
       "label": "Long-Horizon Next-Action Forecasting",
       "axis_label": "13 Long-Horizon Next-Action Forecasting",
       "short_label": "Long act",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Long-Horizon Next-Subtask Forecasting",
       "axis_label": "14 Long-Horizon Next-Subtask Forecasting",
       "short_label": "Long step",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Interaction Text Prediction",
       "axis_label": "15 Interaction Text Prediction",
       "short_label": "Interact txt",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Action-Object Relation Prediction",
       "axis_label": "16 Action-Object Relation Prediction",
       "short_label": "Act+obj",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Future Object-Set Forecasting",
       "axis_label": "17 Future Object-Set Forecasting",
       "short_label": "Future obj",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
       "label": "IMU-to-Hand Pose Reconstruction",
       "axis_label": "18 IMU-to-Hand Pose Reconstruction",
       "short_label": "IMU->hand",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "mae",
       "metric_name": "MAE",
       "metric_direction": "lower",
       "label": "Camera-View Synchronization Retrieval",
       "axis_label": "19 Camera-View Synchronization Retrieval",
       "short_label": "Cam sync",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
       "label": "Time-to-Next-Transition Regression",
       "axis_label": "20 Time-to-Next-Transition Regression",
       "short_label": "Time2bdry",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "mae",
       "metric_name": "MAE frames",
       "metric_direction": "lower",
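
Reviewer note: every radar axis above carries a `metric_direction` of `higher` or `lower` (MAE is the only `lower` metric here). The following is a minimal sketch, not the repository's code, of how a consumer might honor that field when ranking models on one axis. The function names and the model scores in the usage note are hypothetical; the actual radar normalization is not shown in this diff.

```python
from typing import Literal

Direction = Literal["higher", "lower"]


def oriented_score(value: float, direction: Direction) -> float:
    """Return a score where larger is always better.

    "higher" metrics pass through unchanged; "lower" metrics such as MAE
    are negated so a descending sort puts the best model first.
    """
    if direction == "higher":
        return value
    if direction == "lower":
        return -value
    raise ValueError(f"unknown metric_direction: {direction!r}")


def rank_models(scores: dict[str, float], direction: Direction) -> list[str]:
    """Rank model names best-first for a single radar axis."""
    return sorted(
        scores,
        key=lambda name: oriented_score(scores[name], direction),
        reverse=True,
    )
```

For example, `rank_models({"ridge": 2.0, "mlp": 1.0}, "lower")` places `mlp` first, while the same call with `"higher"` places `ridge` first. Negation is only one possible orientation; a plotted radar would likely also need per-axis rescaling, which this sketch leaves out.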

data/source_alignment_audit.json CHANGED

@@ -1,7 +1,7 @@
 {
   "title": "Ropedia Xperience-10M Source Alignment Note",
   "status": "pass",
-  "generated_at_utc": "2026-06-21T14:46:49+00:00",
   "alignment_json": "docs/data/xperience10m_dataset_card_alignment.json",
   "alignment_summary": {
     "full_dataset_repo": "ropedia-ai/xperience-10m",

 {
   "title": "Ropedia Xperience-10M Source Alignment Note",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:21:55+00:00",
   "alignment_json": "docs/data/xperience10m_dataset_card_alignment.json",
   "alignment_summary": {
     "full_dataset_repo": "ropedia-ai/xperience-10m",

data/task_method_20_gap_audit.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "generated_at_utc": "2026-06-21T08:38:20+00:00",
   "immediate_actions": [
     {
       "artifact": "docs/data/task_method_20_gap_audit.json",

 {
+  "generated_at_utc": "2026-06-21T15:21:42+00:00",
   "immediate_actions": [
     {
       "artifact": "docs/data/task_method_20_gap_audit.json",

data/task_method_20_result_matrix.json CHANGED

@@ -1,7 +1,7 @@
 {
   "title": "Task Method 20-Result Matrix",
   "status": "pass",
-  "generated_at_utc": "2026-06-21T10:47:17+00:00",
   "task_count": 20,
   "method_count": 9,
   "method_task_record_count": 180,

 {
   "title": "Task Method 20-Result Matrix",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:20:34+00:00",
   "task_count": 20,
   "method_count": 9,
   "method_task_record_count": 180,

data/task_suite_20.json CHANGED

@@ -1,12 +1,12 @@
 {
   "title": "Ropedia Xperience-10M Unified 20-Task Suite",
   "status": "pass",
-  "generated_at_utc": "2026-06-21T14:40:33+00:00",
   "task_count": 20,
-  "task_count_breakdown": {
-    "original_public_sample_tasks": 12,
-    "additional_public_sample_tasks": 8,
-    "total_unified_tasks": 20
   },
   "unification_policy": {
     "public_framing": "The suite is presented as one 20-task benchmark surface. All task contracts share the same window, split, feature, baseline, and leakage-control language.",
@@ -21,7 +21,7 @@
     "window_frames": 20,
     "stride_frames": 5,
     "split_policy": "single_episode_chronological_70_30",
-    "raw_hdf5_required_for_tasks_13_20_regeneration": true,
     "raw_data_redistributed": false
   },
   "setup_alignment": {
@@ -47,8 +47,8 @@
       "task_id": "timeline_action",
       "task_display_name": "Action Recognition",
       "research_name": "Egocentric Action Recognition",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "supervised",
       "architecture_family": "multiclass classifier",
       "primary_direction": "C. Egocentric Vision & Interaction",
@@ -82,8 +82,8 @@
       "task_id": "timeline_subtask",
       "task_display_name": "Procedure Step Recognition",
       "research_name": "Temporal Subtask Recognition",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "supervised",
       "architecture_family": "multiclass classifier",
       "primary_direction": "C. Egocentric Vision & Interaction",
@@ -117,8 +117,8 @@
       "task_id": "transition_detection",
       "task_display_name": "Action Boundary Detection",
       "research_name": "Temporal Action Segmentation",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "diagnostic",
       "architecture_family": "binary classifier",
       "primary_direction": "C. Egocentric Vision & Interaction",
@@ -152,8 +152,8 @@
       "task_id": "next_action",
       "task_display_name": "Next-Action Prediction",
       "research_name": "Short-Horizon Intention Prediction",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "supervised",
       "architecture_family": "future-label classifier",
       "primary_direction": "C. Egocentric Vision & Interaction",
@@ -187,8 +187,8 @@
       "task_id": "hand_trajectory_forecast",
       "task_display_name": "Hand Trajectory Forecasting",
       "research_name": "3D Hand Motion Forecasting",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "forecast",
       "architecture_family": "continuous regressor",
       "primary_direction": "A. Human Modeling & Motion Understanding",
@@ -220,8 +220,8 @@
       "task_id": "contact_prediction",
       "task_display_name": "Contact State Prediction",
       "research_name": "Human-Object Contact Prediction",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "supervised",
       "architecture_family": "binary classifier",
       "primary_direction": "A. Human Modeling & Motion Understanding",
@@ -255,8 +255,8 @@
       "task_id": "object_relevance",
       "task_display_name": "Object Relevance Prediction",
       "research_name": "Object-Centric Interaction Recognition",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "supervised",
       "architecture_family": "multi-label classifier",
       "primary_direction": "C. Egocentric Vision & Interaction",
@@ -288,8 +288,8 @@
       "task_id": "caption_grounding",
       "task_display_name": "Language Grounding",
       "research_name": "Language-to-Moment Grounding",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "retrieval",
       "architecture_family": "retrieval ranker",
       "primary_direction": "C. Egocentric Vision & Interaction",
@@ -321,8 +321,8 @@
       "task_id": "cross_modal_retrieval",
       "task_display_name": "Cross-Modal Retrieval",
       "research_name": "Multimodal Representation Retrieval",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "retrieval",
       "architecture_family": "two-tower retrieval head",
       "primary_direction": "D. Scene Reconstruction & World Modeling",
@@ -354,8 +354,8 @@
       "task_id": "modality_reconstruction",
       "task_display_name": "Cross-Modal Reconstruction",
       "research_name": "Modality Feature Reconstruction",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "forecast",
       "architecture_family": "feature regressor",
       "primary_direction": "B. 3D/4D Reconstruction & Neural Rendering",
@@ -386,8 +386,8 @@
       "task_id": "temporal_order",
       "task_display_name": "Temporal Order Verification",
       "research_name": "Temporal Order Verification",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "diagnostic",
       "architecture_family": "pairwise classifier",
       "primary_direction": "D. Scene Reconstruction & World Modeling",
@@ -419,8 +419,8 @@
       "task_id": "misalignment_detection",
       "task_display_name": "Multimodal Synchronization Detection",
       "research_name": "Cross-Modal Misalignment Detection",
-      "origin": "original_public_sample_tasks",
-      "origin_count_label": "original task",
       "family": "diagnostic",
       "architecture_family": "pairwise classifier",
       "primary_direction": "B. 3D/4D Reconstruction & Neural Rendering",
@@ -452,8 +452,8 @@
       "task_id": "long_horizon_next_action",
       "task_display_name": "Long-Horizon Next-Action Forecasting",
       "research_name": "Long-Horizon Next-Action Forecasting",
-      "origin": "additional_public_sample_tasks",
-      "origin_count_label": "additional task",
       "family": "classification",
       "architecture_family": "minimal_softmax",
       "primary_direction": "sample-supported extension",
@@ -487,8 +487,8 @@
       "task_id": "next_subtask_forecast",
       "task_display_name": "Long-Horizon Next-Subtask Forecasting",
       "research_name": "Long-Horizon Next-Subtask Forecasting",
-      "origin": "additional_public_sample_tasks",
-      "origin_count_label": "additional task",
       "family": "classification",
       "architecture_family": "minimal_softmax",
       "primary_direction": "sample-supported extension",
@@ -522,8 +522,8 @@
       "task_id": "interaction_text_prediction",
       "task_display_name": "Interaction Text Prediction",
       "research_name": "Interaction Text Prediction",
-      "origin": "additional_public_sample_tasks",
-      "origin_count_label": "additional task",
       "family": "classification",
       "architecture_family": "minimal_softmax",
       "primary_direction": "sample-supported extension",
@@ -557,8 +557,8 @@
       "task_id": "action_object_relation",
       "task_display_name": "Action-Object Relation Prediction",
       "research_name": "Action-Object Relation Prediction",
-      "origin": "additional_public_sample_tasks",
-      "origin_count_label": "additional task",
       "family": "classification",
       "architecture_family": "minimal_softmax",
       "primary_direction": "sample-supported extension",
@@ -592,8 +592,8 @@
       "task_id": "object_set_forecast",
       "task_display_name": "Future Object-Set Forecasting",
       "research_name": "Future Object-Set Forecasting",
-      "origin": "additional_public_sample_tasks",
-      "origin_count_label": "additional task",
       "family": "multi_label",
       "architecture_family": "minimal_ridge_multilabel",
       "primary_direction": "sample-supported extension",
@@ -625,8 +625,8 @@
       "task_id": "imu_to_hand_pose",
       "task_display_name": "IMU-to-Hand Pose Reconstruction",
       "research_name": "IMU-to-Hand Pose Reconstruction",
-      "origin": "additional_public_sample_tasks",
-      "origin_count_label": "additional task",
       "family": "regression",
       "architecture_family": "minimal_ridge_regression",
       "primary_direction": "sample-supported extension",
@@ -658,8 +658,8 @@
       "task_id": "camera_view_sync_retrieval",
       "task_display_name": "Camera-View Synchronization Retrieval",
       "research_name": "Camera-View Synchronization Retrieval",
-      "origin": "additional_public_sample_tasks",
-      "origin_count_label": "additional task",
       "family": "retrieval",
       "architecture_family": "minimal_ridge_projection_cosine_retrieval",
       "primary_direction": "sample-supported extension",
@@ -690,8 +690,8 @@
       "task_id": "time_to_transition",
       "task_display_name": "Time-to-Next-Transition Regression",
       "research_name": "Time-to-Next-Transition Regression",
-      "origin": "additional_public_sample_tasks",
-      "origin_count_label": "additional task",
       "family": "regression",
       "architecture_family": "minimal_ridge_regression",
       "primary_direction": "sample-supported extension",

 {
   "title": "Ropedia Xperience-10M Unified 20-Task Suite",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:21:12+00:00",
   "task_count": 20,
+  "task_count_summary": {
+    "total_unified_tasks": 20,
+    "public_framing": "all 20 task contracts are presented as one suite",
+    "legacy_provenance_rows": 8
   },
   "unification_policy": {
     "public_framing": "The suite is presented as one 20-task benchmark surface. All task contracts share the same window, split, feature, baseline, and leakage-control language.",
     "window_frames": 20,
     "stride_frames": 5,
     "split_policy": "single_episode_chronological_70_30",
+    "raw_hdf5_required_for_full_public_regeneration": true,
     "raw_data_redistributed": false
   },
   "setup_alignment": {
       "task_id": "timeline_action",
       "task_display_name": "Action Recognition",
       "research_name": "Egocentric Action Recognition",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "supervised",
       "architecture_family": "multiclass classifier",
       "primary_direction": "C. Egocentric Vision & Interaction",
       "task_id": "timeline_subtask",
       "task_display_name": "Procedure Step Recognition",
       "research_name": "Temporal Subtask Recognition",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "supervised",
       "architecture_family": "multiclass classifier",
       "primary_direction": "C. Egocentric Vision & Interaction",
       "task_id": "transition_detection",
       "task_display_name": "Action Boundary Detection",
       "research_name": "Temporal Action Segmentation",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "diagnostic",
       "architecture_family": "binary classifier",
       "primary_direction": "C. Egocentric Vision & Interaction",
       "task_id": "next_action",
       "task_display_name": "Next-Action Prediction",
       "research_name": "Short-Horizon Intention Prediction",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "supervised",
       "architecture_family": "future-label classifier",
       "primary_direction": "C. Egocentric Vision & Interaction",
       "task_id": "hand_trajectory_forecast",
       "task_display_name": "Hand Trajectory Forecasting",
       "research_name": "3D Hand Motion Forecasting",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "forecast",
       "architecture_family": "continuous regressor",
       "primary_direction": "A. Human Modeling & Motion Understanding",
       "task_id": "contact_prediction",
       "task_display_name": "Contact State Prediction",
       "research_name": "Human-Object Contact Prediction",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "supervised",
       "architecture_family": "binary classifier",
       "primary_direction": "A. Human Modeling & Motion Understanding",
       "task_id": "object_relevance",
       "task_display_name": "Object Relevance Prediction",
       "research_name": "Object-Centric Interaction Recognition",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "supervised",
       "architecture_family": "multi-label classifier",
       "primary_direction": "C. Egocentric Vision & Interaction",
       "task_id": "caption_grounding",
       "task_display_name": "Language Grounding",
       "research_name": "Language-to-Moment Grounding",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "retrieval",
       "architecture_family": "retrieval ranker",
       "primary_direction": "C. Egocentric Vision & Interaction",
       "task_id": "cross_modal_retrieval",
       "task_display_name": "Cross-Modal Retrieval",
       "research_name": "Multimodal Representation Retrieval",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "retrieval",
       "architecture_family": "two-tower retrieval head",
       "primary_direction": "D. Scene Reconstruction & World Modeling",
       "task_id": "modality_reconstruction",
       "task_display_name": "Cross-Modal Reconstruction",
       "research_name": "Modality Feature Reconstruction",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "forecast",
       "architecture_family": "feature regressor",
       "primary_direction": "B. 3D/4D Reconstruction & Neural Rendering",
       "task_id": "temporal_order",
       "task_display_name": "Temporal Order Verification",
       "research_name": "Temporal Order Verification",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "diagnostic",
       "architecture_family": "pairwise classifier",
       "primary_direction": "D. Scene Reconstruction & World Modeling",
       "task_id": "misalignment_detection",
       "task_display_name": "Multimodal Synchronization Detection",
       "research_name": "Cross-Modal Misalignment Detection",
+      "provenance_source": "walkthrough_backed_task_contract",
+      "origin_count_label": "unified task",
       "family": "diagnostic",
       "architecture_family": "pairwise classifier",
       "primary_direction": "B. 3D/4D Reconstruction & Neural Rendering",
       "task_id": "long_horizon_next_action",
       "task_display_name": "Long-Horizon Next-Action Forecasting",
       "research_name": "Long-Horizon Next-Action Forecasting",
+      "provenance_source": "historical_result_bundle",
+      "origin_count_label": "unified task",
       "family": "classification",
       "architecture_family": "minimal_softmax",
       "primary_direction": "sample-supported extension",
       "task_id": "next_subtask_forecast",
       "task_display_name": "Long-Horizon Next-Subtask Forecasting",
       "research_name": "Long-Horizon Next-Subtask Forecasting",
+      "provenance_source": "historical_result_bundle",
+      "origin_count_label": "unified task",
       "family": "classification",
       "architecture_family": "minimal_softmax",
       "primary_direction": "sample-supported extension",
       "task_id": "interaction_text_prediction",
       "task_display_name": "Interaction Text Prediction",
       "research_name": "Interaction Text Prediction",
+      "provenance_source": "historical_result_bundle",
+      "origin_count_label": "unified task",
       "family": "classification",
       "architecture_family": "minimal_softmax",
       "primary_direction": "sample-supported extension",
       "task_id": "action_object_relation",
       "task_display_name": "Action-Object Relation Prediction",
       "research_name": "Action-Object Relation Prediction",
+      "provenance_source": "historical_result_bundle",
+      "origin_count_label": "unified task",
       "family": "classification",
       "architecture_family": "minimal_softmax",
       "primary_direction": "sample-supported extension",
       "task_id": "object_set_forecast",
       "task_display_name": "Future Object-Set Forecasting",
       "research_name": "Future Object-Set Forecasting",
+      "provenance_source": "historical_result_bundle",
+      "origin_count_label": "unified task",
       "family": "multi_label",
       "architecture_family": "minimal_ridge_multilabel",
       "primary_direction": "sample-supported extension",
       "task_id": "imu_to_hand_pose",
       "task_display_name": "IMU-to-Hand Pose Reconstruction",
       "research_name": "IMU-to-Hand Pose Reconstruction",
+      "provenance_source": "historical_result_bundle",
+      "origin_count_label": "unified task",
       "family": "regression",
       "architecture_family": "minimal_ridge_regression",
       "primary_direction": "sample-supported extension",
       "task_id": "camera_view_sync_retrieval",
       "task_display_name": "Camera-View Synchronization Retrieval",
       "research_name": "Camera-View Synchronization Retrieval",
+      "provenance_source": "historical_result_bundle",
+      "origin_count_label": "unified task",
       "family": "retrieval",
       "architecture_family": "minimal_ridge_projection_cosine_retrieval",
       "primary_direction": "sample-supported extension",
       "task_id": "time_to_transition",
       "task_display_name": "Time-to-Next-Transition Regression",
       "research_name": "Time-to-Next-Transition Regression",
+      "provenance_source": "historical_result_bundle",
+      "origin_count_label": "unified task",
       "family": "regression",
       "architecture_family": "minimal_ridge_regression",
       "primary_direction": "sample-supported extension",
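
Reviewer note: across all 20 task records, this change replaces the `origin` key with `provenance_source` and sets `origin_count_label` to `unified task`. The two value mappings below are read directly from the hunks above; the helper itself is a hypothetical sketch of the rewrite, not the script that produced this commit.

```python
# Old `origin` value -> new `provenance_source` value, as seen in this diff.
ORIGIN_TO_PROVENANCE = {
    "original_public_sample_tasks": "walkthrough_backed_task_contract",
    "additional_public_sample_tasks": "historical_result_bundle",
}


def migrate_task_record(record: dict) -> dict:
    """Return a copy of a task record in the new schema, preserving key order."""
    migrated = {}
    for key, value in record.items():
        if key == "origin":
            # Rename the key and translate its value in place.
            migrated["provenance_source"] = ORIGIN_TO_PROVENANCE[value]
        elif key == "origin_count_label":
            # Every task now carries the same public-facing label.
            migrated[key] = "unified task"
        else:
            migrated[key] = value
    return migrated


old = {
    "task_id": "time_to_transition",
    "origin": "additional_public_sample_tasks",
    "origin_count_label": "additional task",
    "family": "regression",
}
new = migrate_task_record(old)
```

Counting records whose `provenance_source` is `historical_result_bundle` after migration should give 8, which matches the `legacy_provenance_rows` value in the new `task_count_summary`.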

data/task_surface_integrity.json CHANGED

@@ -1,6 +1,6 @@
 {
   "status": "pass",
-  "generated_at_utc": "2026-06-21T14:45:00+00:00",
   "summary": {
     "original_walkthrough_task_count": 12,
     "expected_original_walkthrough_task_count": 12,

 {
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:21:55+00:00",
   "summary": {
     "original_walkthrough_task_count": 12,
     "expected_original_walkthrough_task_count": 12,

data/tier2_task_suite.json CHANGED

@@ -2,13 +2,12 @@
   "title": "Ropedia Xperience-10M Unified 20-Task Provenance Bundle",
   "status": "pass",
   "generated_at_utc": "2026-06-16T06:25:58+00:00",
-  "suite_position": "tasks_13_to_20",
   "legacy_path_note": "The tier2_task_suite file and directory names are retained for stable public links; this bundle is provenance inside the unified 20-task suite, not a separate public tier.",
-  "integrated_with_tasks_1_to_12": {
-    "tasks_1_to_12_count": 12,
-    "additional_task_count": 8,
-    "combined_task_count": 20,
-    "tasks_1_to_12_metrics": "docs/data/summary_metrics.json",
     "unified_protocol": "docs/data/evaluation_protocol.json"
   },
   "dataset_scope": {
@@ -28,9 +27,9 @@
     "raw_data_redistributed": false
   },
   "setup_alignment": {
-    "same_window_unit_as_tasks_1_to_12": true,
-    "same_feature_manifest_as_tasks_1_to_12": "results/episode_task_suite/feature_manifest.json",
-    "same_shared_tensor_as_tasks_1_to_12": "results/episode_task_suite/shared_windows.npz",
     "minimal_baselines": "softmax, ridge regression/projection, and ridge multilabel heads",
     "neural_baselines": "compact one-hidden-layer/two-layer PyTorch MLP heads with the same chronological split",
     "leakage_policy": "Caption-derived text features are removed whenever the target is a label, object, relation, interaction phrase, or future semantic state."
@@ -135,7 +134,7 @@
         "status": "pass",
         "task": "long_horizon_next_action",
         "task_display_name": "Long-Horizon Next-Action Forecasting",
-        "suite_position": "tasks_13_to_20",
         "model_family": "minimal_softmax",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
@@ -221,7 +220,7 @@
         "status": "pass",
         "task": "long_horizon_next_action",
         "task_display_name": "Long-Horizon Next-Action Forecasting",
-        "suite_position": "tasks_13_to_20",
         "model_family": "neural_mlp",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
@@ -276,7 +275,7 @@
         "status": "pass",
         "task": "next_subtask_forecast",
         "task_display_name": "Long-Horizon Next-Subtask Forecasting",
-        "suite_position": "tasks_13_to_20",
         "model_family": "minimal_softmax",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
@@ -361,7 +360,7 @@
         "status": "pass",
         "task": "next_subtask_forecast",
         "task_display_name": "Long-Horizon Next-Subtask Forecasting",
-        "suite_position": "tasks_13_to_20",
         "model_family": "neural_mlp",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
@@ -416,7 +415,7 @@
         "status": "pass",
         "task": "interaction_text_prediction",
         "task_display_name": "Interaction Text Prediction",
-        "suite_position": "tasks_13_to_20",
         "model_family": "minimal_softmax",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
@@ -512,7 +511,7 @@
         "status": "pass",
         "task": "interaction_text_prediction",
         "task_display_name": "Interaction Text Prediction",
-        "suite_position": "tasks_13_to_20",
         "model_family": "neural_mlp",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
@@ -567,7 +566,7 @@
         "status": "pass",
         "task": "action_object_relation",
         "task_display_name": "Action-Object Relation Prediction",
-        "suite_position": "tasks_13_to_20",
         "model_family": "minimal_softmax",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
@@ -659,7 +658,7 @@
         "status": "pass",
         "task": "action_object_relation",
         "task_display_name": "Action-Object Relation Prediction",
-        "suite_position": "tasks_13_to_20",
         "model_family": "neural_mlp",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
@@ -713,7 +712,7 @@
         "status": "pass",
         "task": "object_set_forecast",
         "task_display_name": "Future Object-Set Forecasting",
-        "suite_position": "tasks_13_to_20",
         "model_family": "minimal_ridge_multilabel",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
@@ -747,7 +746,7 @@
         "status": "pass",
         "task": "object_set_forecast",
         "task_display_name": "Future Object-Set Forecasting",
-        "suite_position": "tasks_13_to_20",
         "model_family": "neural_mlp_multilabel",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
@@ -795,7 +794,7 @@
         "status": "pass",
         "task": "imu_to_hand_pose",
         "task_display_name": "IMU-to-Hand Pose Reconstruction",
-        "suite_position": "tasks_13_to_20",
         "model_family": "minimal_ridge_regression",
         "input": "Current IMU acceleration/gyroscope feature block only.",
         "split": "single_episode_chronological",
@@ -814,7 +813,7 @@
         "status": "pass",
         "task": "imu_to_hand_pose",
         "task_display_name": "IMU-to-Hand Pose Reconstruction",
-        "suite_position": "tasks_13_to_20",
         "model_family": "neural_mlp_regression",
         "input": "Current IMU acceleration/gyroscope feature block only.",
         "split": "single_episode_chronological",
@@ -864,7 +863,7 @@
         "status": "pass",
         "task": "camera_view_sync_retrieval",
         "task_display_name": "Camera-View Synchronization Retrieval",
-        "suite_position": "tasks_13_to_20",
         "model_family": "minimal_ridge_projection_cosine_retrieval",
         "input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
         "split": "single_episode_chronological",
@@ -885,7 +884,7 @@
         "status": "pass",
         "task": "camera_view_sync_retrieval",
         "task_display_name": "Camera-View Synchronization Retrieval",
-        "suite_position": "tasks_13_to_20",
         "model_family": "neural_mlp_projection_cosine_retrieval",
         "input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
         "split": "single_episode_chronological",
@@ -934,7 +933,7 @@
         "status": "pass",
         "task": "time_to_transition",
         "task_display_name": "Time-to-Next-Transition Regression",
-        "suite_position": "tasks_13_to_20",
         "model_family": "minimal_ridge_regression",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
@@ -954,7 +953,7 @@
         "status": "pass",
         "task": "time_to_transition",
         "task_display_name": "Time-to-Next-Transition Regression",
-        "suite_position": "tasks_13_to_20",
         "model_family": "neural_mlp_regression",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
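
Reviewer note: the `leakage_policy` string in this bundle says caption-derived text features are removed whenever the target is a label, object, relation, interaction phrase, or future semantic state. A minimal sketch of that rule follows, assuming feature blocks are addressed by name and caption-derived blocks share a `caption_` prefix. The block names, the prefix, and the target-kind strings are all hypothetical, since the real feature manifest is not part of this diff.

```python
# Target kinds for which caption-derived inputs would leak the answer
# (taken from the wording of the leakage_policy string).
SEMANTIC_TARGET_KINDS = {
    "label",
    "object",
    "relation",
    "interaction_phrase",
    "future_semantic_state",
}


def allowed_feature_blocks(
    block_names: list[str],
    target_kind: str,
    caption_prefix: str = "caption_",  # hypothetical naming convention
) -> list[str]:
    """Drop caption-derived blocks when the target is semantic; keep all otherwise."""
    if target_kind not in SEMANTIC_TARGET_KINDS:
        return list(block_names)
    return [name for name in block_names if not name.startswith(caption_prefix)]


blocks = ["imu", "hand_pose", "caption_text"]  # hypothetical block names
semantic_inputs = allowed_feature_blocks(blocks, "label")
regression_inputs = allowed_feature_blocks(blocks, "continuous_pose")
```

Keeping the rule in one function means every task head could apply the same filter before reading the shared window tensor, which is the kind of uniformity the bundle's `setup_alignment` block describes.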

   "title": "Ropedia Xperience-10M Unified 20-Task Provenance Bundle",
   "status": "pass",
   "generated_at_utc": "2026-06-16T06:25:58+00:00",
+  "suite_position": "unified_20_task_provenance",
   "legacy_path_note": "The tier2_task_suite file and directory names are retained for stable public links; this bundle is provenance inside the unified 20-task suite, not a separate public tier.",
+  "unified_task_integration": {
+    "total_task_count": 20,
+    "legacy_provenance_row_count": 8,
+    "shared_metrics": "docs/data/summary_metrics.json",
     "unified_protocol": "docs/data/evaluation_protocol.json"
   },
   "dataset_scope": {
     "raw_data_redistributed": false
   },
   "setup_alignment": {
+    "same_window_unit_as_unified_suite": true,
+    "same_feature_manifest_as_unified_suite": "results/episode_task_suite/feature_manifest.json",
+    "same_shared_tensor_as_unified_suite": "results/episode_task_suite/shared_windows.npz",
     "minimal_baselines": "softmax, ridge regression/projection, and ridge multilabel heads",
     "neural_baselines": "compact one-hidden-layer/two-layer PyTorch MLP heads with the same chronological split",
     "leakage_policy": "Caption-derived text features are removed whenever the target is a label, object, relation, interaction phrase, or future semantic state."
         "status": "pass",
         "task": "long_horizon_next_action",
         "task_display_name": "Long-Horizon Next-Action Forecasting",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "minimal_softmax",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "long_horizon_next_action",
         "task_display_name": "Long-Horizon Next-Action Forecasting",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "neural_mlp",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "next_subtask_forecast",
         "task_display_name": "Long-Horizon Next-Subtask Forecasting",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "minimal_softmax",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "next_subtask_forecast",
         "task_display_name": "Long-Horizon Next-Subtask Forecasting",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "neural_mlp",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "interaction_text_prediction",
         "task_display_name": "Interaction Text Prediction",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "minimal_softmax",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "interaction_text_prediction",
         "task_display_name": "Interaction Text Prediction",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "neural_mlp",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "action_object_relation",
         "task_display_name": "Action-Object Relation Prediction",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "minimal_softmax",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "action_object_relation",
         "task_display_name": "Action-Object Relation Prediction",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "neural_mlp",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "object_set_forecast",
         "task_display_name": "Future Object-Set Forecasting",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "minimal_ridge_multilabel",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "object_set_forecast",
         "task_display_name": "Future Object-Set Forecasting",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "neural_mlp_multilabel",
         "input": "Current 20-frame sensor window with caption-text features removed.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "imu_to_hand_pose",
         "task_display_name": "IMU-to-Hand Pose Reconstruction",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "minimal_ridge_regression",
         "input": "Current IMU acceleration/gyroscope feature block only.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "imu_to_hand_pose",
         "task_display_name": "IMU-to-Hand Pose Reconstruction",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "neural_mlp_regression",
         "input": "Current IMU acceleration/gyroscope feature block only.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "camera_view_sync_retrieval",
         "task_display_name": "Camera-View Synchronization Retrieval",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "minimal_ridge_projection_cosine_retrieval",
         "input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "camera_view_sync_retrieval",
         "task_display_name": "Camera-View Synchronization Retrieval",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "neural_mlp_projection_cosine_retrieval",
         "input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "time_to_transition",
         "task_display_name": "Time-to-Next-Transition Regression",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "minimal_ridge_regression",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
         "status": "pass",
         "task": "time_to_transition",
         "task_display_name": "Time-to-Next-Transition Regression",
+        "suite_position": "unified_20_task_provenance",
         "model_family": "neural_mlp_regression",
         "input": "Current 20-frame non-caption multimodal window.",
         "split": "single_episode_chronological",
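
Review note: every result record on the new side of this file gains a `suite_position` field set to `unified_20_task_provenance`. A minimal sketch of a check for that invariant follows. The helper name `missing_suite_position` and the flat list of record dicts are assumptions made for illustration; the diff does not show the key that holds the record list or how the builder validates it.

```python
EXPECTED_POSITION = "unified_20_task_provenance"


def missing_suite_position(records):
    """Return (task, model_family) pairs whose suite_position is absent or unexpected."""
    return [
        (record.get("task"), record.get("model_family"))
        for record in records
        if record.get("suite_position") != EXPECTED_POSITION
    ]


# Hypothetical sample: the second record lacks the new field.
sample_records = [
    {
        "task": "time_to_transition",
        "model_family": "minimal_ridge_regression",
        "suite_position": "unified_20_task_provenance",
    },
    {
        "task": "time_to_transition",
        "model_family": "neural_mlp_regression",
    },
]

print(missing_suite_position(sample_records))
```

An empty list would indicate that all records carry the expected value.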

data/unified_task_model_radar.json CHANGED

@@ -1,7 +1,7 @@
 {
   "title": "Unified 20-Task Model Radar",
   "status": "pass",
-  "generated_at_utc": "2026-06-21T10:47:17+00:00",
   "task_count": 20,
   "method_count": 9,
   "method_task_record_count": 180,
@@ -235,7 +235,7 @@
       "label": "Action Recognition",
       "axis_label": "01 Action Recognition",
       "short_label": "Action",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -346,7 +346,7 @@
       "label": "Procedure Step Recognition",
       "axis_label": "02 Procedure Step Recognition",
       "short_label": "Step",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -457,7 +457,7 @@
       "label": "Action Boundary Detection",
       "axis_label": "03 Action Boundary Detection",
       "short_label": "Boundary",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -568,7 +568,7 @@
       "label": "Next-Action Prediction",
       "axis_label": "04 Next-Action Prediction",
       "short_label": "Next act",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -679,7 +679,7 @@
       "label": "Hand Trajectory Forecasting",
       "axis_label": "05 Hand Trajectory Forecasting",
       "short_label": "Hand traj",
-      "origin": "original_public_sample_tasks",
       "metric_key": "mpjpe",
       "metric_name": "MPJPE",
       "metric_direction": "lower",
@@ -790,7 +790,7 @@
       "label": "Contact State Prediction",
       "axis_label": "06 Contact State Prediction",
       "short_label": "Contact",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -901,7 +901,7 @@
       "label": "Object Relevance Prediction",
       "axis_label": "07 Object Relevance Prediction",
       "short_label": "Objects",
-      "origin": "original_public_sample_tasks",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
@@ -1012,7 +1012,7 @@
       "label": "Language Grounding",
       "axis_label": "08 Language Grounding",
       "short_label": "Language",
-      "origin": "original_public_sample_tasks",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
@@ -1123,7 +1123,7 @@
       "label": "Cross-Modal Retrieval",
       "axis_label": "09 Cross-Modal Retrieval",
       "short_label": "X-modal",
-      "origin": "original_public_sample_tasks",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
@@ -1234,7 +1234,7 @@
       "label": "Cross-Modal Reconstruction",
       "axis_label": "10 Cross-Modal Reconstruction",
       "short_label": "Recon",
-      "origin": "original_public_sample_tasks",
       "metric_key": "r2",
       "metric_name": "R2",
       "metric_direction": "higher",
@@ -1345,7 +1345,7 @@
       "label": "Temporal Order Verification",
       "axis_label": "11 Temporal Order Verification",
       "short_label": "Order",
-      "origin": "original_public_sample_tasks",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
@@ -1456,7 +1456,7 @@
       "label": "Multimodal Synchronization Detection",
       "axis_label": "12 Multimodal Synchronization Detection",
       "short_label": "Sync",
-      "origin": "original_public_sample_tasks",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
@@ -1567,7 +1567,7 @@
       "label": "Long-Horizon Next-Action Forecasting",
       "axis_label": "13 Long-Horizon Next-Action Forecasting",
       "short_label": "Long act",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -1678,7 +1678,7 @@
       "label": "Long-Horizon Next-Subtask Forecasting",
       "axis_label": "14 Long-Horizon Next-Subtask Forecasting",
       "short_label": "Long step",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -1789,7 +1789,7 @@
       "label": "Interaction Text Prediction",
       "axis_label": "15 Interaction Text Prediction",
       "short_label": "Interact txt",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -1900,7 +1900,7 @@
       "label": "Action-Object Relation Prediction",
       "axis_label": "16 Action-Object Relation Prediction",
       "short_label": "Act+obj",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -2011,7 +2011,7 @@
       "label": "Future Object-Set Forecasting",
       "axis_label": "17 Future Object-Set Forecasting",
       "short_label": "Future obj",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
@@ -2122,7 +2122,7 @@
       "label": "IMU-to-Hand Pose Reconstruction",
       "axis_label": "18 IMU-to-Hand Pose Reconstruction",
       "short_label": "IMU->hand",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "mae",
       "metric_name": "MAE",
       "metric_direction": "lower",
@@ -2233,7 +2233,7 @@
       "label": "Camera-View Synchronization Retrieval",
       "axis_label": "19 Camera-View Synchronization Retrieval",
       "short_label": "Cam sync",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
@@ -2344,7 +2344,7 @@
       "label": "Time-to-Next-Transition Regression",
       "axis_label": "20 Time-to-Next-Transition Regression",
       "short_label": "Time2bdry",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "mae",
       "metric_name": "MAE frames",
       "metric_direction": "lower",

 {
   "title": "Unified 20-Task Model Radar",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:20:34+00:00",
   "task_count": 20,
   "method_count": 9,
   "method_task_record_count": 180,
       "label": "Action Recognition",
       "axis_label": "01 Action Recognition",
       "short_label": "Action",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Procedure Step Recognition",
       "axis_label": "02 Procedure Step Recognition",
       "short_label": "Step",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Action Boundary Detection",
       "axis_label": "03 Action Boundary Detection",
       "short_label": "Boundary",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Next-Action Prediction",
       "axis_label": "04 Next-Action Prediction",
       "short_label": "Next act",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Hand Trajectory Forecasting",
       "axis_label": "05 Hand Trajectory Forecasting",
       "short_label": "Hand traj",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "mpjpe",
       "metric_name": "MPJPE",
       "metric_direction": "lower",
       "label": "Contact State Prediction",
       "axis_label": "06 Contact State Prediction",
       "short_label": "Contact",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Object Relevance Prediction",
       "axis_label": "07 Object Relevance Prediction",
       "short_label": "Objects",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
       "label": "Language Grounding",
       "axis_label": "08 Language Grounding",
       "short_label": "Language",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
       "label": "Cross-Modal Retrieval",
       "axis_label": "09 Cross-Modal Retrieval",
       "short_label": "X-modal",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
       "label": "Cross-Modal Reconstruction",
       "axis_label": "10 Cross-Modal Reconstruction",
       "short_label": "Recon",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "r2",
       "metric_name": "R2",
       "metric_direction": "higher",
       "label": "Temporal Order Verification",
       "axis_label": "11 Temporal Order Verification",
       "short_label": "Order",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
       "label": "Multimodal Synchronization Detection",
       "axis_label": "12 Multimodal Synchronization Detection",
       "short_label": "Sync",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
       "label": "Long-Horizon Next-Action Forecasting",
       "axis_label": "13 Long-Horizon Next-Action Forecasting",
       "short_label": "Long act",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Long-Horizon Next-Subtask Forecasting",
       "axis_label": "14 Long-Horizon Next-Subtask Forecasting",
       "short_label": "Long step",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Interaction Text Prediction",
       "axis_label": "15 Interaction Text Prediction",
       "short_label": "Interact txt",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Action-Object Relation Prediction",
       "axis_label": "16 Action-Object Relation Prediction",
       "short_label": "Act+obj",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Future Object-Set Forecasting",
       "axis_label": "17 Future Object-Set Forecasting",
       "short_label": "Future obj",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
       "label": "IMU-to-Hand Pose Reconstruction",
       "axis_label": "18 IMU-to-Hand Pose Reconstruction",
       "short_label": "IMU->hand",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "mae",
       "metric_name": "MAE",
       "metric_direction": "lower",
       "label": "Camera-View Synchronization Retrieval",
       "axis_label": "19 Camera-View Synchronization Retrieval",
       "short_label": "Cam sync",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
       "label": "Time-to-Next-Transition Regression",
       "axis_label": "20 Time-to-Next-Transition Regression",
       "short_label": "Time2bdry",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "mae",
       "metric_name": "MAE frames",
       "metric_direction": "lower",
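
Review note: across the 20 hunks in this file, the `origin` field is replaced by `provenance_source`. Tasks 01 to 12 change from `original_public_sample_tasks` to `walkthrough_backed_task_contract`, and tasks 13 to 20 change from `additional_public_sample_tasks` to `historical_result_bundle`. The sketch below expresses that rename as a function. The name `rename_origin` and the idea that the change was applied as a per-entry mapping are assumptions; the commit does not show the generating script.

```python
# Mapping read off the old and new sides of the diff.
ORIGIN_TO_PROVENANCE = {
    "original_public_sample_tasks": "walkthrough_backed_task_contract",
    "additional_public_sample_tasks": "historical_result_bundle",
}


def rename_origin(task):
    """Return a copy of a task entry with `origin` replaced by `provenance_source`.

    The new key takes the position of the old one so the field order
    matches the layout shown in the diff.
    """
    updated = {}
    for key, value in task.items():
        if key == "origin":
            updated["provenance_source"] = ORIGIN_TO_PROVENANCE[value]
        else:
            updated[key] = value
    return updated


old_entry = {
    "label": "Time-to-Next-Transition Regression",
    "short_label": "Time2bdry",
    "origin": "additional_public_sample_tasks",
    "metric_key": "mae",
}
new_entry = rename_origin(old_entry)
print(new_entry["provenance_source"])
```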

data/website_integrity.json CHANGED

@@ -1,14 +1,14 @@
 {
   "status": "pass",
-  "generated_at_utc": "2026-06-21T14:45:02+00:00",
   "docs_root": "docs",
   "site_base": "/ropedia-xperience-10m-task-suite/",
   "summary": {
     "html_pages": 4,
-    "local_references": 254,
     "external_reference_count": 157,
     "json_files": 55,
-    "image_assets_referenced": 29,
     "failure_count": 0
   },
   "failures": {
@@ -81,7 +81,7 @@
       "status": "pass",
       "reason": "The project overview should appear before the deeper progress ledger.",
       "overview_index": 121816,
-      "evidence_index": 167645
     },
     {
       "name": "project_status_links_json",
@@ -161,7 +161,7 @@
       "reason": "The evaluation protocol should appear before the deeper evidence ledger.",
       "overview_index": 121816,
       "protocol_index": 163835,
-      "evidence_index": 167645
     },
     {
       "name": "evaluation_protocol_links_json",
@@ -277,8 +277,8 @@
     {
       "path": "index.html",
       "id_count": 96,
-      "reference_count": 226,
-      "image_count": 35
     },
     {
       "path": "research_roadmap.html",
@@ -301,7 +301,7 @@
     },
     {
       "path": "data/artifact_index.json",
-      "bytes": 124294,
       "top_level_type": "dict"
     },
     {
@@ -316,12 +316,12 @@
     },
     {
       "path": "data/episode128_task_model_radar.json",
-      "bytes": 184992,
       "top_level_type": "dict"
     },
     {
       "path": "data/evaluation_protocol.json",
-      "bytes": 24007,
       "top_level_type": "dict"
     },
     {
@@ -331,7 +331,7 @@
     },
     {
       "path": "data/figure_index.json",
-      "bytes": 19469,
       "top_level_type": "dict"
     },
     {
@@ -351,7 +351,7 @@
     },
     {
       "path": "data/live_publication_status.json",
-      "bytes": 189922,
       "top_level_type": "dict"
     },
     {
@@ -371,27 +371,27 @@
     },
     {
       "path": "data/omni_model_comparison.json",
-      "bytes": 82088,
       "top_level_type": "dict"
     },
     {
       "path": "data/project_brief.json",
-      "bytes": 4019,
       "top_level_type": "dict"
     },
     {
       "path": "data/project_manifest.json",
-      "bytes": 5774,
       "top_level_type": "dict"
     },
     {
       "path": "data/project_packet.json",
-      "bytes": 10009,
       "top_level_type": "dict"
     },
     {
       "path": "data/project_status.json",
-      "bytes": 23255,
       "top_level_type": "dict"
     },
     {
@@ -401,7 +401,7 @@
     },
     {
       "path": "data/public_surface_qa.json",
-      "bytes": 7690,
       "top_level_type": "dict"
     },
     {
@@ -441,7 +441,7 @@
     },
     {
       "path": "data/reproducibility_matrix.json",
-      "bytes": 6815,
       "top_level_type": "dict"
     },
     {
@@ -466,7 +466,7 @@
     },
     {
       "path": "data/research_takeaways.json",
-      "bytes": 7162,
       "top_level_type": "dict"
     },
     {
@@ -481,7 +481,7 @@
     },
     {
       "path": "data/single_episode_task_model_radar.json",
-      "bytes": 51107,
       "top_level_type": "dict"
     },
     {
@@ -511,7 +511,7 @@
     },
     {
       "path": "data/task_suite_20.json",
-      "bytes": 34597,
       "top_level_type": "dict"
     },
     {
@@ -536,7 +536,7 @@
     },
     {
       "path": "data/tier2_task_suite.json",
-      "bytes": 33411,
       "top_level_type": "dict"
     },
     {
@@ -551,7 +551,7 @@
     },
     {
       "path": "data/unified_task_model_radar.json",
-      "bytes": 228815,
       "top_level_type": "dict"
     },
     {
@@ -656,13 +656,6 @@
       "format": "SVG",
       "has_viewbox": true
     },
-    {
-      "path": "assets/charts/tier2_task_suite.svg",
-      "exists": true,
-      "bytes": 5453,
-      "format": "SVG",
-      "has_viewbox": true
-    },
     {
       "path": "assets/charts/two_evidence_line_map.svg",
       "exists": true,

 {
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:21:58+00:00",
   "docs_root": "docs",
   "site_base": "/ropedia-xperience-10m-task-suite/",
   "summary": {
     "html_pages": 4,
+    "local_references": 256,
     "external_reference_count": 157,
     "json_files": 55,
+    "image_assets_referenced": 28,
     "failure_count": 0
   },
   "failures": {
       "status": "pass",
       "reason": "The project overview should appear before the deeper progress ledger.",
       "overview_index": 121816,
+      "evidence_index": 167655
     },
     {
       "name": "project_status_links_json",
       "reason": "The evaluation protocol should appear before the deeper evidence ledger.",
       "overview_index": 121816,
       "protocol_index": 163835,
+      "evidence_index": 167655
     },
     {
       "name": "evaluation_protocol_links_json",
     {
       "path": "index.html",
       "id_count": 96,
+      "reference_count": 228,
+      "image_count": 34
     },
     {
       "path": "research_roadmap.html",
     },
     {
       "path": "data/artifact_index.json",
+      "bytes": 124341,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/episode128_task_model_radar.json",
+      "bytes": 185212,
       "top_level_type": "dict"
     },
     {
       "path": "data/evaluation_protocol.json",
+      "bytes": 24267,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/figure_index.json",
+      "bytes": 19485,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/live_publication_status.json",
+      "bytes": 189990,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/omni_model_comparison.json",
+      "bytes": 82102,
       "top_level_type": "dict"
     },
     {
       "path": "data/project_brief.json",
+      "bytes": 4032,
       "top_level_type": "dict"
     },
     {
       "path": "data/project_manifest.json",
+      "bytes": 5739,
       "top_level_type": "dict"
     },
     {
       "path": "data/project_packet.json",
+      "bytes": 10018,
       "top_level_type": "dict"
     },
     {
       "path": "data/project_status.json",
+      "bytes": 23232,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/public_surface_qa.json",
+      "bytes": 7691,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/reproducibility_matrix.json",
+      "bytes": 6836,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/research_takeaways.json",
+      "bytes": 7165,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/single_episode_task_model_radar.json",
+      "bytes": 51327,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/task_suite_20.json",
+      "bytes": 34805,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/tier2_task_suite.json",
+      "bytes": 33575,
       "top_level_type": "dict"
     },
     {
     },
     {
       "path": "data/unified_task_model_radar.json",
+      "bytes": 229035,
       "top_level_type": "dict"
     },
     {
       "format": "SVG",
       "has_viewbox": true
     },
     {
       "path": "assets/charts/two_evidence_line_map.svg",
       "exists": true,
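
Review note: most changes in this file are updated `bytes` values for JSON files that the same commit touched. A rough sketch of how one such entry could be recomputed is shown below. It assumes that `bytes` is the on-disk file size and that `top_level_type` is the Python type name of the parsed document; neither is confirmed by the diff, and `describe_json_file` is a hypothetical helper, not the project's integrity builder.

```python
import json
import os
import tempfile


def describe_json_file(path):
    """Build an entry with the file path, its size in bytes, and its top-level JSON type."""
    with open(path, "rb") as handle:
        raw = handle.read()
    return {
        "path": path,
        "bytes": len(raw),
        "top_level_type": type(json.loads(raw)).__name__,
    }


# Demonstration on a temporary file rather than a real project artifact.
payload = b'{"status": "pass"}'
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.json")
    with open(demo_path, "wb") as handle:
        handle.write(payload)
    entry = describe_json_file(demo_path)

print(entry["bytes"], entry["top_level_type"])
```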

index.html CHANGED

@@ -4787,7 +4787,7 @@
           <article class="artifact"><h3>Split policy</h3><p>Single-episode chronological 70/30 train/test split. This avoids random future-window mixing; cross-episode generalization is measured in the later multi-episode pilot.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/EVALUATION_PROTOCOL.md">protocol document</a></article>
           <article class="artifact"><h3>Metric contract</h3><p>All 20 tasks list input, target, primary metric, baseline score, and source artifact path in the unified suite file.</p><a href="data/task_suite_20.json">task_suite_20.json</a></article>
           <article class="artifact"><h3>Leakage controls</h3><p>Scalers fit on train windows only; future labels, target-side signals, caption/object labels, and contact labels stay on the target side unless explicitly queried.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/scripts/build_evaluation_protocol.py">builder script</a></article>
-          <article class="artifact"><h3>Audio ablation</h3><p>Audio and no-audio variants are evaluated across the original task contracts under the same chronological split.</p><a href="data/audio_ablation_summary.json">audio summary</a></article>
           <article class="artifact"><h3>Foundation track selection</h3><p>Qwen3-Omni is the first trainable baseline, Cosmos 3 is the world-model track with a camera-pose proxy forward-dynamics contract ready for trainer work, policy models wait for robot-compatible action targets, and Xperience-native pretraining remains a later full-corpus goal.</p><a href="data/foundation_model_plan.json">backbone plan</a></article>
           <article class="artifact"><h3>Next evaluation stage</h3><p>This public-sample run covers single-episode task development. The selected multi-episode Qwen3-Omni final diagnostic result is verified and meets the JSON-validity target; Cosmos3-Nano has a verified future-window compatibility package; and Cosmos3-Super has a verified base-weight JSON-task evaluation plus a fine-tuned forward-dynamics LoRA branch. The next stage is action/subtask error analysis, stronger model-quality runs, and policy-target conversion.</p><a href="data/omni_model_comparison.json">result comparison</a></article>
           <article class="artifact"><h3>128-Episode Task Suite Enhancement Pack</h3><p>Before adding episodes, the suite should try `multiscale_20s10_40s20_80s40`, hierarchical action/subtask targets, label-normalized scoring, and compact raw-feature shards for unsupported tasks.</p><a href="data/task_suite_enhancement_128.json">task_suite_enhancement_128.json</a></article>
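
Review note: the "Split policy" and "Leakage controls" cards in this hunk describe a chronological 70/30 split with scalers fit on train windows only. The sketch below illustrates that pattern on a toy one-dimensional feature. The function names, the integer-percent cut, and the use of mean and population standard deviation as the scaler are assumptions for illustration; the project's actual builder script is not shown in this diff.

```python
from statistics import mean, pstdev


def chronological_split(windows, train_percent=70):
    """Split time-ordered windows so that every test window follows every train window."""
    cut = len(windows) * train_percent // 100
    return windows[:cut], windows[cut:]


def fit_scaler(train_values):
    """Estimate center and scale from train windows only."""
    center = mean(train_values)
    scale = pstdev(train_values) or 1.0  # guard against constant features
    return center, scale


def apply_scaler(values, center, scale):
    """Standardize values with statistics that were fit on the train split."""
    return [(value - center) / scale for value in values]


values = [float(i) for i in range(10)]  # toy feature, already in time order
train, test = chronological_split(values)
center, scale = fit_scaler(train)
train_scaled = apply_scaler(train, center, scale)
test_scaled = apply_scaler(test, center, scale)

print(len(train), len(test))
```

Because the test windows are scaled with train-side statistics, no information from later windows reaches the fitted scaler.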
@@ -4824,7 +4824,7 @@
           <article class="evidence-card">
             <span class="status-pill">verified</span>
             <h3>Audio contribution is measured task by task</h3>
-            <p>Audio variants improve the primary metric on 6 of the original task contracts in this single-episode setting.</p>
             <div class="evidence-links">
               <a href="data/audio_ablation_summary.json">audio summary</a>
               <a href="assets/charts/audio_ablation_delta.svg">delta chart</a>
@@ -5463,7 +5463,7 @@
     <section id="directions" data-project-tab="directions" role="tabpanel" aria-labelledby="tab-directions" tabindex="-1">
       <div class="wrap">
         <div class="section-head">
-          <h2>The original tasks organized into four research directions.</h2>
           <p>Each task is mapped as direct, proxy, or diagnostic evidence for the Ropedia research tracks. The mapping uses two current baselines: minimal interpretable heads and neural MLP heads over the same feature contract.</p>
         </div>
         <div class="direction-grid">
@@ -5510,76 +5510,18 @@
       <div class="wrap">
         <div class="section-head">
           <h2>Unified 20-task evidence and provenance.</h2>
-          <p>All 20 tasks now live in the same task table, task-card grid, radar, and 180-record result matrix. The chart below is retained as provenance for the historically named result bundle, not as a separate task tier.</p>
-        </div>
-        <img class="chart" src="assets/charts/tier2_task_suite.svg?v=xperience10m-tier2" alt="Historical additional-task provenance chart for the unified Xperience-10M 20-task suite">
-        <div class="extension-grid">
-          <article class="extension-card">
-            <span class="status-pill">Task 13 / forecast</span>
-            <h3>Long-Horizon Next-Action Forecasting</h3>
-            <p><strong>Input:</strong> current non-caption multimodal window.</p>
-            <p><strong>Output:</strong> action label five seconds later.</p>
-            <div class="extension-metrics"><span><strong>0.0750</strong>minimal macro-F1</span><span><strong>0.0655</strong>neural macro-F1</span></div>
-          </article>
-          <article class="extension-card">
-            <span class="status-pill">Task 14 / procedure</span>
-            <h3>Long-Horizon Next-Subtask Forecasting</h3>
-            <p><strong>Input:</strong> current non-caption multimodal window.</p>
-            <p><strong>Output:</strong> procedure subtask five seconds later.</p>
-            <div class="extension-metrics"><span><strong>0.0455</strong>minimal macro-F1</span><span><strong>0.0507</strong>neural macro-F1</span></div>
-          </article>
-          <article class="extension-card">
-            <span class="status-pill">Task 15 / language</span>
-            <h3>Interaction Text Prediction</h3>
-            <p><strong>Input:</strong> current sensor window with caption features removed.</p>
-            <p><strong>Output:</strong> raw annotation interaction phrase.</p>
-            <div class="extension-metrics"><span><strong>0.0444</strong>minimal macro-F1</span><span><strong>0.0381</strong>neural macro-F1</span></div>
-          </article>
-          <article class="extension-card">
-            <span class="status-pill">Task 16 / relation</span>
-            <h3>Action-Object Relation Prediction</h3>
-            <p><strong>Input:</strong> current sensor window with caption features removed.</p>
-            <p><strong>Output:</strong> joint action plus active object-set label.</p>
-            <div class="extension-metrics"><span><strong>0.0000</strong>minimal macro-F1</span><span><strong>0.0000</strong>neural macro-F1</span></div>
-          </article>
-          <article class="extension-card">
-            <span class="status-pill">Task 17 / objects</span>
-            <h3>Future Object-Set Forecasting</h3>
-            <p><strong>Input:</strong> current sensor window with caption features removed.</p>
-            <p><strong>Output:</strong> object set active five seconds later.</p>
-            <div class="extension-metrics"><span><strong>0.1694</strong>minimal micro-F1</span><span><strong>0.1972</strong>neural micro-F1</span></div>
-          </article>
-          <article class="extension-card">
-            <span class="status-pill">Task 18 / sensor bridge</span>
-            <h3>IMU-to-Hand Pose Reconstruction</h3>
-            <p><strong>Input:</strong> IMU acceleration and gyroscope features only.</p>
-            <p><strong>Output:</strong> current left/right hand joint feature blocks.</p>
-            <div class="extension-metrics"><span><strong>0.0420</strong>minimal MAE</span><span><strong>0.0426</strong>neural MAE</span></div>
-          </article>
-          <article class="extension-card">
-            <span class="status-pill">Task 19 / camera sync</span>
-            <h3>Camera-View Synchronization Retrieval</h3>
-            <p><strong>Input:</strong> fisheye camera-1 feature query.</p>
-            <p><strong>Output:</strong> synchronized fisheye camera-3 window rank.</p>
-            <div class="extension-metrics"><span><strong>0.4943</strong>minimal MRR</span><span><strong>0.2409</strong>neural MRR</span></div>
-          </article>
-          <article class="extension-card">
-            <span class="status-pill">Task 20 / timing</span>
-            <h3>Time-to-Next-Transition Regression</h3>
-            <p><strong>Input:</strong> current non-caption multimodal window.</p>
-            <p><strong>Output:</strong> capped frames until the next action boundary.</p>
-            <div class="extension-metrics"><span><strong>10.5374</strong>minimal MAE frames</span><span><strong>10.5545</strong>neural MAE frames</span></div>
-          </article>
         </div>
         <div class="callout-row">
           <div class="callout">
             <h3>Unified task artifact package</h3>
-            <p>The public task package has the 20-task JSON, per-task metrics, prediction/rank files, Markdown summaries, and charts generated from the local public-sample annotation and committed shared-window tensor.</p>
-            <p><a href="data/task_suite_20.json">Open unified 20-task JSON</a> · <a href="data/tier2_task_suite.json">Open historical provenance JSON</a></p>
           </div>
           <div class="callout">
             <h3>One setup, one task surface</h3>
             <p>Every task uses the same 20-frame window unit, 5-frame stride, 8,546-dimensional feature manifest, chronological split discipline, and minimal/neural comparison pattern unless a task-specific leakage rule removes target-side features.</p>
           </div>
         </div>
         <img class="chart" src="assets/charts/research_direction_extension_tasks.svg?v=xperience10m-ext" alt="Four Xperience-10M research-direction extension probes with minimal and neural metrics">
@@ -5633,7 +5575,7 @@
     <section id="architectures" data-project-tab="method" role="tabpanel" aria-labelledby="tab-method" tabindex="-1">
       <div class="wrap">
         <div class="section-head">
-          <h2>The original task heads share four head families.</h2>
           <p>The diagram separates the shared episode-window representation from the task-specific heads, so the task contracts stay readable before scaling to larger models.</p>
         </div>
         <img class="architecture-image" src="assets/task_architectures.png?v=xperience10m-nn" alt="Verified minimal and neural architecture diagram for Ropedia Xperience-10M task heads">
@@ -5732,7 +5674,7 @@
           <img class="chart" src="assets/charts/cross_modal_retrieval.svg" alt="Cross modal retrieval chart">
           <img class="chart" src="assets/charts/episode_task_scores_neural_mlp.svg" alt="Neural MLP task score chart">
           <img class="chart" src="assets/charts/episode_task_scores_minimal_vs_neural.svg" alt="Minimal versus neural score chart">
-          <img class="chart" src="assets/charts/audio_ablation_delta.svg" alt="Measured audio delta chart across original task contracts">
         </div>
         <p class="section-note"><a href="single_episode_explorer.html">Open the single-episode explorer</a> to inspect window-level labels, predictions, modality statistics, object labels, and diagnostic scores. The <a href="data/audio_ablation_summary.json">audio ablation summary</a> records the task-by-task audio contribution.</p>
       </div>
@@ -5861,9 +5803,9 @@
               <article class="artifact"><h3>Windows table</h3><p>Window start/end frames and aligned action/subtask labels for the public sample episode.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/windows.csv">window table</a></article>
               <article class="artifact"><h3>Feature inputs</h3><p>Source map for the current modality inputs used by the task suite.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/feature_manifest.json">feature inputs</a></article>
               <article class="artifact"><h3>Neural MLP task results</h3><p>Per-task PyTorch MLP metrics, predictions, histories, and checkpoints for the unified task contracts, with historical result-bundle paths retained for provenance.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/neural_mlp">neural MLP outputs</a></article>
-              <article class="artifact"><h3>Four-direction taxonomy</h3><p>Maps the original tasks to the four research tracks: human modeling, 3D/4D reconstruction, egocentric interaction, and world modeling.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/research_directions">research direction outputs</a></article>
               <article class="artifact"><h3>Direction extension probes</h3><p>Four coded probes, one per research direction, with minimal and neural metrics plus prediction/rank CSVs.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/research_direction_extensions">extension probe outputs</a></article>
-              <article class="artifact"><h3>Task walkthroughs</h3><p>Case studies for the original tasks, including input, middle process modules, output, metric, limitation, and task-player data.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/task_walkthroughs">walkthrough outputs</a></article>
               <article class="artifact"><h3>Audio ablation and raw upgrade</h3><p>All 72 task/variant rows comparing current audio, no audio, raw audio, replacement, and combined-input settings.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/audio_ablation">audio ablation outputs</a></article>
               <article class="artifact"><h3>Single-episode explorer</h3><p>Interactive window-level view of labels, predictions, modality statistics, object labels, and diagnostics.</p><a href="single_episode_explorer.html">open explorer</a></article>
               <article class="artifact"><h3>Cross-modal retrieval</h3><p>The strongest self-supervised signal from the single episode.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/cross_modal_retrieval/metrics.json">retrieval metrics</a></article>
@@ -5917,7 +5859,7 @@
             <div class="artifact-grid">
               <article class="artifact"><h3>Project brief</h3><p>The fastest written overview of the dataset sample, tasks, baselines, and scale-up plan.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/PROJECT_BRIEF.md">brief</a></article>
               <article class="artifact"><h3>Glossary</h3><p>Plain-language definitions for the terms most likely to confuse first-time readers and reviewers.</p><a href="data/glossary.json">glossary</a></article>
-              <article class="artifact"><h3>Task walkthroughs</h3><p>Human-readable case studies for the original tasks, including input, process modules, output, metric, and limitation.</p><a href="data/task_walkthroughs.json">walkthroughs</a></article>
               <article class="artifact"><h3>Task results</h3><p>Minimal and neural-head metrics for the same sample windows and chronological split.</p><a href="data/summary_metrics.json">metrics</a></article>
               <article class="artifact"><h3>Visual figures</h3><p>Task-suite map, modality atlas, pipeline diagram, model architecture figure, and Qwen3-Omni LoRA training-flow figure.</p><a href="assets/task_suite_infographic.png">task-suite figure</a></article>
               <article class="artifact"><h3>Dataset notes</h3><p>Official dataset links, public sample source, modalities, access boundary, and current project subset.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/XPERIENCE10M_DATASET_CARD_ALIGNMENT.md">dataset notes</a></article>

           <article class="artifact"><h3>Split policy</h3><p>Single-episode chronological 70/30 train/test split. This avoids random future-window mixing; cross-episode generalization is measured in the later multi-episode pilot.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/EVALUATION_PROTOCOL.md">protocol document</a></article>
           <article class="artifact"><h3>Metric contract</h3><p>All 20 tasks list input, target, primary metric, baseline score, and source artifact path in the unified suite file.</p><a href="data/task_suite_20.json">task_suite_20.json</a></article>
           <article class="artifact"><h3>Leakage controls</h3><p>Scalers fit on train windows only; future labels, target-side signals, caption/object labels, and contact labels stay on the target side unless explicitly queried.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/scripts/build_evaluation_protocol.py">builder script</a></article>
+          <article class="artifact"><h3>Audio ablation</h3><p>Audio and no-audio variants are evaluated across the walkthrough-backed task contracts under the same chronological split.</p><a href="data/audio_ablation_summary.json">audio summary</a></article>
           <article class="artifact"><h3>Foundation track selection</h3><p>Qwen3-Omni is the first trainable baseline, Cosmos 3 is the world-model track with a camera-pose proxy forward-dynamics contract ready for trainer work, policy models wait for robot-compatible action targets, and Xperience-native pretraining remains a later full-corpus goal.</p><a href="data/foundation_model_plan.json">backbone plan</a></article>
           <article class="artifact"><h3>Next evaluation stage</h3><p>This public-sample run covers single-episode task development. The selected multi-episode Qwen3-Omni final diagnostic result is verified and meets the JSON-validity target; Cosmos3-Nano has a verified future-window compatibility package; and Cosmos3-Super has a verified base-weight JSON-task evaluation plus a fine-tuned forward-dynamics LoRA branch. The next stage is action/subtask error analysis, stronger model-quality runs, and policy-target conversion.</p><a href="data/omni_model_comparison.json">result comparison</a></article>
           <article class="artifact"><h3>128-Episode Task Suite Enhancement Pack</h3><p>Before adding episodes, the suite should try `multiscale_20s10_40s20_80s40`, hierarchical action/subtask targets, label-normalized scoring, and compact raw-feature shards for unsupported tasks.</p><a href="data/task_suite_enhancement_128.json">task_suite_enhancement_128.json</a></article>
           <article class="evidence-card">
             <span class="status-pill">verified</span>
             <h3>Audio contribution is measured task by task</h3>
+            <p>Audio variants improve the primary metric on 6 walkthrough-backed task contracts in this single-episode setting.</p>
             <div class="evidence-links">
               <a href="data/audio_ablation_summary.json">audio summary</a>
               <a href="assets/charts/audio_ablation_delta.svg">delta chart</a>
     <section id="directions" data-project-tab="directions" role="tabpanel" aria-labelledby="tab-directions" tabindex="-1">
       <div class="wrap">
         <div class="section-head">
+          <h2>The walkthrough-backed tasks are organized into four research directions.</h2>
           <p>Each task is mapped as direct, proxy, or diagnostic evidence for the Ropedia research tracks. The mapping uses two current baselines: minimal interpretable heads and neural MLP heads over the same feature contract.</p>
         </div>
         <div class="direction-grid">
       <div class="wrap">
         <div class="section-head">
           <h2>Unified 20-task evidence and provenance.</h2>
+          <p>All 20 tasks live in the same task table, task-card grid, radar, and 180-record result matrix. Historical result paths are retained only for exact provenance links.</p>
         </div>
         <div class="callout-row">
           <div class="callout">
             <h3>Unified task artifact package</h3>
+            <p>The public task package has one 20-task JSON, per-task metrics, prediction/rank files, Markdown summaries, radar charts, and the 180-record method-task matrix.</p>
+            <p><a href="data/task_suite_20.json">Open unified 20-task JSON</a> · <a href="data/task_method_20_result_matrix.json">Open 180-record matrix</a> · <a href="assets/charts/unified_task_model_radar.svg">Open unified radar</a></p>
           </div>
           <div class="callout">
             <h3>One setup, one task surface</h3>
             <p>Every task uses the same 20-frame window unit, 5-frame stride, 8,546-dimensional feature manifest, chronological split discipline, and minimal/neural comparison pattern unless a task-specific leakage rule removes target-side features.</p>
+            <p><a href="data/tier2_task_suite.json">Historical provenance JSON</a> and <a href="assets/charts/tier2_task_suite.svg">historical provenance chart</a> remain available for exact source tracing.</p>
           </div>
         </div>
         <img class="chart" src="assets/charts/research_direction_extension_tasks.svg?v=xperience10m-ext" alt="Four Xperience-10M research-direction extension probes with minimal and neural metrics">
     <section id="architectures" data-project-tab="method" role="tabpanel" aria-labelledby="tab-method" tabindex="-1">
       <div class="wrap">
         <div class="section-head">
+          <h2>The baseline task heads share four head families.</h2>
           <p>The diagram separates the shared episode-window representation from the task-specific heads, so the task contracts stay readable before scaling to larger models.</p>
         </div>
         <img class="architecture-image" src="assets/task_architectures.png?v=xperience10m-nn" alt="Verified minimal and neural architecture diagram for Ropedia Xperience-10M task heads">
           <img class="chart" src="assets/charts/cross_modal_retrieval.svg" alt="Cross modal retrieval chart">
           <img class="chart" src="assets/charts/episode_task_scores_neural_mlp.svg" alt="Neural MLP task score chart">
           <img class="chart" src="assets/charts/episode_task_scores_minimal_vs_neural.svg" alt="Minimal versus neural score chart">
+          <img class="chart" src="assets/charts/audio_ablation_delta.svg" alt="Measured audio delta chart across walkthrough-backed task contracts">
         </div>
         <p class="section-note"><a href="single_episode_explorer.html">Open the single-episode explorer</a> to inspect window-level labels, predictions, modality statistics, object labels, and diagnostic scores. The <a href="data/audio_ablation_summary.json">audio ablation summary</a> records the task-by-task audio contribution.</p>
       </div>
               <article class="artifact"><h3>Windows table</h3><p>Window start/end frames and aligned action/subtask labels for the public sample episode.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/windows.csv">window table</a></article>
               <article class="artifact"><h3>Feature inputs</h3><p>Source map for the current modality inputs used by the task suite.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/feature_manifest.json">feature inputs</a></article>
               <article class="artifact"><h3>Neural MLP task results</h3><p>Per-task PyTorch MLP metrics, predictions, histories, and checkpoints for the unified task contracts, with historical result-bundle paths retained for provenance.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/neural_mlp">neural MLP outputs</a></article>
+              <article class="artifact"><h3>Four-direction taxonomy</h3><p>Maps the walkthrough-backed task contracts to the four research tracks: human modeling, 3D/4D reconstruction, egocentric interaction, and world modeling.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/research_directions">research direction outputs</a></article>
               <article class="artifact"><h3>Direction extension probes</h3><p>Four coded probes, one per research direction, with minimal and neural metrics plus prediction/rank CSVs.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/research_direction_extensions">extension probe outputs</a></article>
+              <article class="artifact"><h3>Task walkthroughs</h3><p>Case studies for the walkthrough-backed task contracts, including input, middle process modules, output, metric, limitation, and task-player data.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/task_walkthroughs">walkthrough outputs</a></article>
               <article class="artifact"><h3>Audio ablation and raw upgrade</h3><p>All 72 task/variant rows comparing current audio, no audio, raw audio, replacement, and combined-input settings.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/audio_ablation">audio ablation outputs</a></article>
               <article class="artifact"><h3>Single-episode explorer</h3><p>Interactive window-level view of labels, predictions, modality statistics, object labels, and diagnostics.</p><a href="single_episode_explorer.html">open explorer</a></article>
               <article class="artifact"><h3>Cross-modal retrieval</h3><p>The strongest self-supervised signal from the single episode.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/cross_modal_retrieval/metrics.json">retrieval metrics</a></article>
             <div class="artifact-grid">
               <article class="artifact"><h3>Project brief</h3><p>The fastest written overview of the dataset sample, tasks, baselines, and scale-up plan.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/PROJECT_BRIEF.md">brief</a></article>
               <article class="artifact"><h3>Glossary</h3><p>Plain-language definitions for the terms most likely to confuse first-time readers and reviewers.</p><a href="data/glossary.json">glossary</a></article>
+              <article class="artifact"><h3>Task walkthroughs</h3><p>Human-readable case studies for the walkthrough-backed task contracts, including input, process modules, output, metric, and limitation.</p><a href="data/task_walkthroughs.json">walkthroughs</a></article>
               <article class="artifact"><h3>Task results</h3><p>Minimal and neural-head metrics for the same sample windows and chronological split.</p><a href="data/summary_metrics.json">metrics</a></article>
               <article class="artifact"><h3>Visual figures</h3><p>Task-suite map, modality atlas, pipeline diagram, model architecture figure, and Qwen3-Omni LoRA training-flow figure.</p><a href="assets/task_suite_infographic.png">task-suite figure</a></article>
               <article class="artifact"><h3>Dataset notes</h3><p>Official dataset links, public sample source, modalities, access boundary, and current project subset.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/XPERIENCE10M_DATASET_CARD_ALIGNMENT.md">dataset notes</a></article>

metrics/episode128_task_model_radar.json CHANGED

@@ -1,7 +1,7 @@
 {
   "title": "128-Episode 20-Task Radar",
   "status": "pass",
-  "generated_at_utc": "2026-06-21T10:47:17+00:00",
   "description": "Selected 128-episode metadata/raw baselines plus verified Qwen3-Omni v6, Cosmos3-Super, and Cosmos3-Nano diagnostics. Every method has 20 records; numeric scores appear only where the public artifact produced that task target.",
   "task_count": 20,
   "method_count": 7,
@@ -192,7 +192,7 @@
       "label": "Action Recognition",
       "axis_label": "01 Action Recognition",
       "short_label": "Action",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -283,7 +283,7 @@
       "label": "Procedure Step Recognition",
       "axis_label": "02 Procedure Step Recognition",
       "short_label": "Step",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -374,7 +374,7 @@
       "label": "Action Boundary Detection",
       "axis_label": "03 Action Boundary Detection",
       "short_label": "Boundary",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -465,7 +465,7 @@
       "label": "Next-Action Prediction",
       "axis_label": "04 Next-Action Prediction",
       "short_label": "Next act",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -556,7 +556,7 @@
       "label": "Hand Trajectory Forecasting",
       "axis_label": "05 Hand Trajectory Forecasting",
       "short_label": "Hand traj",
-      "origin": "original_public_sample_tasks",
       "metric_key": "mpjpe",
       "metric_name": "MPJPE",
       "metric_direction": "lower",
@@ -647,7 +647,7 @@
       "label": "Contact State Prediction",
       "axis_label": "06 Contact State Prediction",
       "short_label": "Contact",
-      "origin": "original_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -738,7 +738,7 @@
       "label": "Object Relevance Prediction",
       "axis_label": "07 Object Relevance Prediction",
       "short_label": "Objects",
-      "origin": "original_public_sample_tasks",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
@@ -829,7 +829,7 @@
       "label": "Language Grounding",
       "axis_label": "08 Language Grounding",
       "short_label": "Language",
-      "origin": "original_public_sample_tasks",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
@@ -920,7 +920,7 @@
       "label": "Cross-Modal Retrieval",
       "axis_label": "09 Cross-Modal Retrieval",
       "short_label": "X-modal",
-      "origin": "original_public_sample_tasks",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
@@ -1011,7 +1011,7 @@
       "label": "Cross-Modal Reconstruction",
       "axis_label": "10 Cross-Modal Reconstruction",
       "short_label": "Recon",
-      "origin": "original_public_sample_tasks",
       "metric_key": "r2",
       "metric_name": "R2",
       "metric_direction": "higher",
@@ -1102,7 +1102,7 @@
       "label": "Temporal Order Verification",
       "axis_label": "11 Temporal Order Verification",
       "short_label": "Order",
-      "origin": "original_public_sample_tasks",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
@@ -1193,7 +1193,7 @@
       "label": "Multimodal Synchronization Detection",
       "axis_label": "12 Multimodal Synchronization Detection",
       "short_label": "Sync",
-      "origin": "original_public_sample_tasks",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
@@ -1284,7 +1284,7 @@
       "label": "Long-Horizon Next-Action Forecasting",
       "axis_label": "13 Long-Horizon Next-Action Forecasting",
       "short_label": "Long act",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -1375,7 +1375,7 @@
       "label": "Long-Horizon Next-Subtask Forecasting",
       "axis_label": "14 Long-Horizon Next-Subtask Forecasting",
       "short_label": "Long step",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -1466,7 +1466,7 @@
       "label": "Interaction Text Prediction",
       "axis_label": "15 Interaction Text Prediction",
       "short_label": "Interact txt",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -1557,7 +1557,7 @@
       "label": "Action-Object Relation Prediction",
       "axis_label": "16 Action-Object Relation Prediction",
       "short_label": "Act+obj",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
@@ -1648,7 +1648,7 @@
       "label": "Future Object-Set Forecasting",
       "axis_label": "17 Future Object-Set Forecasting",
       "short_label": "Future obj",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
@@ -1739,7 +1739,7 @@
       "label": "IMU-to-Hand Pose Reconstruction",
       "axis_label": "18 IMU-to-Hand Pose Reconstruction",
       "short_label": "IMU->hand",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "mae",
       "metric_name": "MAE",
       "metric_direction": "lower",
@@ -1830,7 +1830,7 @@
       "label": "Camera-View Synchronization Retrieval",
       "axis_label": "19 Camera-View Synchronization Retrieval",
       "short_label": "Cam sync",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
@@ -1921,7 +1921,7 @@
       "label": "Time-to-Next-Transition Regression",
       "axis_label": "20 Time-to-Next-Transition Regression",
       "short_label": "Time2bdry",
-      "origin": "additional_public_sample_tasks",
       "metric_key": "mae",
       "metric_name": "MAE frames",
       "metric_direction": "lower",

 {
   "title": "128-Episode 20-Task Radar",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:20:34+00:00",
   "description": "Selected 128-episode metadata/raw baselines plus verified Qwen3-Omni v6, Cosmos3-Super, and Cosmos3-Nano diagnostics. Every method has 20 records; numeric scores appear only where the public artifact produced that task target.",
   "task_count": 20,
   "method_count": 7,
       "label": "Action Recognition",
       "axis_label": "01 Action Recognition",
       "short_label": "Action",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Procedure Step Recognition",
       "axis_label": "02 Procedure Step Recognition",
       "short_label": "Step",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Action Boundary Detection",
       "axis_label": "03 Action Boundary Detection",
       "short_label": "Boundary",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Next-Action Prediction",
       "axis_label": "04 Next-Action Prediction",
       "short_label": "Next act",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Hand Trajectory Forecasting",
       "axis_label": "05 Hand Trajectory Forecasting",
       "short_label": "Hand traj",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "mpjpe",
       "metric_name": "MPJPE",
       "metric_direction": "lower",
       "label": "Contact State Prediction",
       "axis_label": "06 Contact State Prediction",
       "short_label": "Contact",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Object Relevance Prediction",
       "axis_label": "07 Object Relevance Prediction",
       "short_label": "Objects",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
       "label": "Language Grounding",
       "axis_label": "08 Language Grounding",
       "short_label": "Language",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
       "label": "Cross-Modal Retrieval",
       "axis_label": "09 Cross-Modal Retrieval",
       "short_label": "X-modal",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
       "label": "Cross-Modal Reconstruction",
       "axis_label": "10 Cross-Modal Reconstruction",
       "short_label": "Recon",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "r2",
       "metric_name": "R2",
       "metric_direction": "higher",
       "label": "Temporal Order Verification",
       "axis_label": "11 Temporal Order Verification",
       "short_label": "Order",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
       "label": "Multimodal Synchronization Detection",
       "axis_label": "12 Multimodal Synchronization Detection",
       "short_label": "Sync",
+      "provenance_source": "walkthrough_backed_task_contract",
       "metric_key": "f1",
       "metric_name": "F1",
       "metric_direction": "higher",
       "label": "Long-Horizon Next-Action Forecasting",
       "axis_label": "13 Long-Horizon Next-Action Forecasting",
       "short_label": "Long act",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Long-Horizon Next-Subtask Forecasting",
       "axis_label": "14 Long-Horizon Next-Subtask Forecasting",
       "short_label": "Long step",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Interaction Text Prediction",
       "axis_label": "15 Interaction Text Prediction",
       "short_label": "Interact txt",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Action-Object Relation Prediction",
       "axis_label": "16 Action-Object Relation Prediction",
       "short_label": "Act+obj",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "macro_f1",
       "metric_name": "macro-F1",
       "metric_direction": "higher",
       "label": "Future Object-Set Forecasting",
       "axis_label": "17 Future Object-Set Forecasting",
       "short_label": "Future obj",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "micro_f1",
       "metric_name": "micro-F1",
       "metric_direction": "higher",
       "label": "IMU-to-Hand Pose Reconstruction",
       "axis_label": "18 IMU-to-Hand Pose Reconstruction",
       "short_label": "IMU->hand",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "mae",
       "metric_name": "MAE",
       "metric_direction": "lower",
       "label": "Camera-View Synchronization Retrieval",
       "axis_label": "19 Camera-View Synchronization Retrieval",
       "short_label": "Cam sync",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "mrr",
       "metric_name": "MRR",
       "metric_direction": "higher",
       "label": "Time-to-Next-Transition Regression",
       "axis_label": "20 Time-to-Next-Transition Regression",
       "short_label": "Time2bdry",
+      "provenance_source": "historical_result_bundle",
       "metric_key": "mae",
       "metric_name": "MAE frames",
       "metric_direction": "lower",
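
The axis entries above mix higher-is-better metrics (MRR, R2, F1 variants) with lower-is-better ones (MAE, MAE frames), so two scores on the same axis cannot be compared without consulting `metric_direction`. The following is a minimal sketch of a direction-aware comparison. Only the `metric_direction` field and its two values come from the JSON shown; the helper name `is_improvement` and the numeric scores are hypothetical illustrations, not values from the repository.

```python
def is_improvement(metric_direction: str, baseline: float, candidate: float) -> bool:
    """Return True when `candidate` beats `baseline` for the given direction.

    `metric_direction` is expected to be "higher" or "lower", matching the
    values that appear in the radar JSON entries.
    """
    if metric_direction == "higher":
        return candidate > baseline
    if metric_direction == "lower":
        return candidate < baseline
    raise ValueError(f"unknown metric_direction: {metric_direction!r}")


# Two axis entries copied from the diff; the scores below are made up.
axes = [
    {"short_label": "Cam sync", "metric_key": "mrr", "metric_direction": "higher"},
    {"short_label": "Time2bdry", "metric_key": "mae", "metric_direction": "lower"},
]
illustrative_scores = {"Cam sync": (0.40, 0.50), "Time2bdry": (12.0, 9.5)}

for axis in axes:
    baseline, candidate = illustrative_scores[axis["short_label"]]
    improved = is_improvement(axis["metric_direction"], baseline, candidate)
    print(axis["short_label"], axis["metric_key"], "improved" if improved else "not improved")
```

Raising on an unknown direction, rather than defaulting to "higher", keeps a malformed entry from silently inverting an axis.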

metrics/figure_index.json CHANGED

@@ -1,7 +1,7 @@
 {
   "title": "Ropedia Xperience-10M Figure Index",
   "status": "pass",
-  "generated_at_utc": "2026-06-21T14:40:33+00:00",
   "scope": "Public figures, diagrams, charts, and derived modality thumbnails. Raw Xperience-10M videos, annotations, RRD files, and Qwen weights are excluded.",
   "figure_count": 29,
   "figures": [
@@ -60,12 +60,12 @@
       "id": "task_suite_infographic",
       "title": "Original task-suite infographic",
       "path": "docs/assets/task_suite_infographic.png",
-      "role": "Primary visual map of the original task families, verified metrics, and sample modalities; the unified public suite is now documented as 20 tasks.",
       "source_script": "scripts/render_task_suite_infographic.py",
       "surface": "README, website, HF Space, artifact dataset, model card",
       "exists": true,
-      "bytes": 1903454,
-      "sha256": "6667eb856cf61ada9f868807b5d5c6ccde06e4f791b2f9dd567d98b71b307415",
       "dimensions": {
         "format": "PNG",
         "width": 1800,
@@ -162,7 +162,7 @@
       "id": "task_architectures",
       "title": "Minimal and neural task architecture map",
       "path": "docs/assets/task_architectures.png",
-      "role": "Minimal and neural heads for the original task contracts and shared feature contracts.",
       "source_script": "scripts/render_overview_figures.py",
       "surface": "README, website, HF artifact dataset, model card",
       "exists": true,
@@ -392,8 +392,8 @@
       "source_script": "scripts/tier2_task_suite.py",
       "surface": "website unified task section, README, HF mirrors",
       "exists": true,
-      "bytes": 5437,
-      "sha256": "3e35e476f559cd6188e5417e4d28c25efc130abafc9cab2d941bc77d559177a1",
       "dimensions": {
         "format": "SVG",
         "width": 1440,

 {
   "title": "Ropedia Xperience-10M Figure Index",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:19:00+00:00",
   "scope": "Public figures, diagrams, charts, and derived modality thumbnails. Raw Xperience-10M videos, annotations, RRD files, and Qwen weights are excluded.",
   "figure_count": 29,
   "figures": [
       "id": "task_suite_infographic",
       "title": "Original task-suite infographic",
       "path": "docs/assets/task_suite_infographic.png",
+      "role": "Primary visual map of the walkthrough-backed task families, verified metrics, and sample modalities; the unified public suite is documented as 20 tasks.",
       "source_script": "scripts/render_task_suite_infographic.py",
       "surface": "README, website, HF Space, artifact dataset, model card",
       "exists": true,
+      "bytes": 1897278,
+      "sha256": "71b1ab150e952cf902488226c65b3822d8016974f63d111204c1eb1a7745faad",
       "dimensions": {
         "format": "PNG",
         "width": 1800,
       "id": "task_architectures",
       "title": "Minimal and neural task architecture map",
       "path": "docs/assets/task_architectures.png",
+      "role": "Minimal and neural heads for the walkthrough-backed task contracts and shared feature contracts.",
       "source_script": "scripts/render_overview_figures.py",
       "surface": "README, website, HF artifact dataset, model card",
       "exists": true,
       "source_script": "scripts/tier2_task_suite.py",
       "surface": "website unified task section, README, HF mirrors",
       "exists": true,
+      "bytes": 5453,
+      "sha256": "e9da29c57f42b29a7a05622fee1335089ac2b6fc9692a3b49fa5b753904db9dc",
       "dimensions": {
         "format": "SVG",
         "width": 1440,
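
Each figure entry in this index records `path`, `bytes`, and `sha256`, and both the PNG and the SVG entries changed size and digest in this commit. A reader who wants to confirm that a local copy matches the index could recompute those two fields. The sketch below assumes the entry layout shown in the hunks; the helper names `file_fingerprint` and `matches_index`, and the `root` parameter, are hypothetical and not taken from the repository's scripts.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path: Path) -> dict:
    """Compute the size in bytes and the SHA-256 hex digest of a file."""
    data = path.read_bytes()
    return {"bytes": len(data), "sha256": hashlib.sha256(data).hexdigest()}


def matches_index(entry: dict, root: Path = Path(".")) -> bool:
    """Check one figure-index entry against the file on disk.

    `entry` is assumed to carry the "path", "bytes", and "sha256" keys
    that appear in the figure index; `root` is the checkout directory.
    """
    target = root / entry["path"]
    if not target.is_file():
        return False
    fingerprint = file_fingerprint(target)
    return (
        fingerprint["bytes"] == entry["bytes"]
        and fingerprint["sha256"] == entry["sha256"]
    )
```

Comparing the byte count as well as the digest is redundant for integrity, but it mirrors the two fields the index stores and gives a quicker hint about what changed when a check fails.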

metrics/live_publication_status.json CHANGED

The diff for this file is too large to render.

metrics/omni_model_comparison.json CHANGED

@@ -1,12 +1,12 @@
 {
   "title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
-  "generated_at_utc": "2026-06-21T10:47:04+00:00",
   "status": "pass",
   "version_count": 3,
   "model_group_count": 5,
   "comparison_rule": "Compare only rows with the same scope and target. Single-episode raw-feature metrics, 128-episode metadata baselines, Qwen3 structured JSON metrics, and the two Cosmos3 targets answer different questions: Nano future-window retrieval versus Super structured JSON Reasoner evaluation.",
   "version_reading_notes": [
-    "Version 1 is the public-sample 20-task surface: original core heads, tasks 13-20, and the 180-row method-task matrix.",
     "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
     "The selected-128 model-diagnostic group contains the current Qwen3-Omni LoRA JSON-task row, Cosmos3-Nano future-window compatibility result, Cosmos3-Super Reasoner base-weight JSON-task evaluation, and the separate Cosmos3-Super Forward-Dynamics LoRA adapter artifact."
   ],

 {
   "title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
+  "generated_at_utc": "2026-06-21T15:17:00+00:00",
   "status": "pass",
   "version_count": 3,
   "model_group_count": 5,
   "comparison_rule": "Compare only rows with the same scope and target. Single-episode raw-feature metrics, 128-episode metadata baselines, Qwen3 structured JSON metrics, and the two Cosmos3 targets answer different questions: Nano future-window retrieval versus Super structured JSON Reasoner evaluation.",
   "version_reading_notes": [
+    "Version 1 is the public-sample 20-task surface: unified task heads, historical provenance rows, and the 180-row method-task matrix.",
     "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
     "The selected-128 model-diagnostic group contains the current Qwen3-Omni LoRA JSON-task row, Cosmos3-Nano future-window compatibility result, Cosmos3-Super Reasoner base-weight JSON-task evaluation, and the separate Cosmos3-Super Forward-Dynamics LoRA adapter artifact."
   ],

metrics/project_brief.json CHANGED

@@ -52,7 +52,7 @@
     "Open EVALUATION_PROTOCOL.md before comparing task scores.",
     "Use RESEARCH_TAKEAWAYS.md for the current metric interpretation.",
     "Inspect results/episode_task_suite/feature_manifest.json to understand one model input.",
-    "Use TASK_SUITE_20.md and docs/data/task_suite_20.json to read the unified 20-task suite; the historical docs/data/tier2_task_suite.json path stores the tasks 13-20 result bundle.",
     "Use docs/data/omni_finetune_verified_result.json for the current multi-episode Qwen3-Omni pilot result."
   ],
   "scope_boundary": "The public sample is enough to build and verify task definitions, feature contracts, metrics, visualization, and baseline code. The final multi-episode Qwen3-Omni diagnostic result verifies the training loop and strict-JSON output reliability, but does not yet show strong action/subtask model quality.",

     "Open EVALUATION_PROTOCOL.md before comparing task scores.",
     "Use RESEARCH_TAKEAWAYS.md for the current metric interpretation.",
     "Inspect results/episode_task_suite/feature_manifest.json to understand one model input.",
+    "Use TASK_SUITE_20.md and docs/data/task_suite_20.json to read the unified 20-task suite; the historical docs/data/tier2_task_suite.json path is retained only for provenance inside that suite.",
     "Use docs/data/omni_finetune_verified_result.json for the current multi-episode Qwen3-Omni pilot result."
   ],
   "scope_boundary": "The public sample is enough to build and verify task definitions, feature contracts, metrics, visualization, and baseline code. The final multi-episode Qwen3-Omni diagnostic result verifies the training loop and strict-JSON output reliability, but does not yet show strong action/subtask model quality.",

metrics/project_packet.json CHANGED

@@ -15,9 +15,8 @@
     "cosmos3_super_forward_dynamics_lora_status": "The first Cosmos3-Super fine-tuned adapter branch is verified as a forward-dynamics LoRA over camera-pose proxy targets; it reports loss metrics, not JSON action-label accuracy.",
     "task_suite_enhancement_128_status": "Current no-new-episode enhancement pack recommends multiscale_20s10_40s20_80s40, hierarchical action/subtask targets, label-normalized scoring, and raw-feature shards before adding more episodes.",
     "task_count": 20,
-    "original_public_sample_task_count": 12,
-    "additional_public_sample_task_count": 8,
-    "legacy_tasks_13_to_20_result_path": "docs/data/tier2_task_suite.json"
   },
   "reading_path": [
     {
@@ -110,7 +109,7 @@
         "results/episode_task_suite/neural_mlp/",
         "docs/data/summary_metrics.json"
       ],
-      "readout": "The unified suite has 20 task contracts; tasks 1-12 have walkthroughs and neural MLP heads, and tasks 13-20 have aligned minimal/neural result bundles under the historical tier2_task_suite path."
     },
     {
       "step": 8,

     "cosmos3_super_forward_dynamics_lora_status": "The first Cosmos3-Super fine-tuned adapter branch is verified as a forward-dynamics LoRA over camera-pose proxy targets; it reports loss metrics, not JSON action-label accuracy.",
     "task_suite_enhancement_128_status": "Current no-new-episode enhancement pack recommends multiscale_20s10_40s20_80s40, hierarchical action/subtask targets, label-normalized scoring, and raw-feature shards before adding more episodes.",
     "task_count": 20,
+    "task_surface_framing": "unified_20_task_suite",
+    "legacy_provenance_result_path": "docs/data/tier2_task_suite.json"
   },
   "reading_path": [
     {
         "results/episode_task_suite/neural_mlp/",
         "docs/data/summary_metrics.json"
       ],
+      "readout": "The unified suite has 20 task contracts in one task surface. Walkthrough-backed tasks, aligned minimal/neural result bundles, and historical tier2_task_suite provenance paths are all linked from TASK_SUITE_20.md and docs/data/task_suite_20.json."
     },
     {
       "step": 8,

metrics/public_surface_qa.json CHANGED

@@ -1,7 +1,7 @@
 {
   "title": "Ropedia Xperience-10M Public Project Surface",
   "status": "pass",
-  "generated_at_utc": "2026-06-21T14:46:49+00:00",
   "scope": "Repo README, GitHub Pages HTML, Hugging Face Space card, artifact dataset card, and model card.",
   "checks": [
     {
@@ -33,12 +33,12 @@
         "source_alignment": {
           "exists": true,
           "status": "pass",
-          "generated_at_utc": "2026-06-21T13:32:47+00:00"
         },
         "scale_up_status": {
           "exists": true,
           "status": "pass",
-          "generated_at_utc": "2026-06-21T13:32:50+00:00"
         },
         "publication_package": {
           "exists": true,
@@ -48,7 +48,7 @@
         "mirror_parity": {
           "exists": true,
           "status": "pass",
-          "generated_at_utc": "2026-06-21T14:13:08+00:00"
         }
       },
       "failures": {}
@@ -96,7 +96,7 @@
       "reason": "Public copy should consistently present the project as Ropedia Xperience-10M, with the Qwen3-Omni scale-up status.",
       "marker_counts": {
         "Ropedia Xperience-10M Task Suite": 20,
-        "Xperience-10M": 167,
         "20-task": 100,
         "Qwen3-Omni": 245,
         "128-episode pilot": 1
@@ -137,11 +137,11 @@
         "data/unified_task_model_radar.json": 21,
         "data/single_episode_task_model_radar.json": 17,
         "data/episode128_task_model_radar.json": 16,
-        "data/task_method_20_result_matrix.json": 24,
         "data/task_method_20_gap_audit.json": 23,
         "data/language_versions.json": 3,
         "assets/charts/two_evidence_line_map.svg": 5,
-        "assets/charts/unified_task_model_radar.svg": 17,
         "assets/charts/single_episode_task_model_radar.svg": 19,
         "assets/charts/episode128_task_model_radar.svg": 19,
         "data/tier2_task_suite.json": 11

 {
   "title": "Ropedia Xperience-10M Public Project Surface",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:21:42+00:00",
   "scope": "Repo README, GitHub Pages HTML, Hugging Face Space card, artifact dataset card, and model card.",
   "checks": [
     {
         "source_alignment": {
           "exists": true,
           "status": "pass",
+          "generated_at_utc": "2026-06-21T14:46:49+00:00"
         },
         "scale_up_status": {
           "exists": true,
           "status": "pass",
+          "generated_at_utc": "2026-06-21T14:47:03+00:00"
         },
         "publication_package": {
           "exists": true,
         "mirror_parity": {
           "exists": true,
           "status": "pass",
+          "generated_at_utc": "2026-06-21T14:53:27+00:00"
         }
       },
       "failures": {}
       "reason": "Public copy should consistently present the project as Ropedia Xperience-10M, with the Qwen3-Omni scale-up status.",
       "marker_counts": {
         "Ropedia Xperience-10M Task Suite": 20,
+        "Xperience-10M": 166,
         "20-task": 100,
         "Qwen3-Omni": 245,
         "128-episode pilot": 1
         "data/unified_task_model_radar.json": 21,
         "data/single_episode_task_model_radar.json": 17,
         "data/episode128_task_model_radar.json": 16,
+        "data/task_method_20_result_matrix.json": 25,
         "data/task_method_20_gap_audit.json": 23,
         "data/language_versions.json": 3,
         "assets/charts/two_evidence_line_map.svg": 5,
+        "assets/charts/unified_task_model_radar.svg": 18,
         "assets/charts/single_episode_task_model_radar.svg": 19,
         "assets/charts/episode128_task_model_radar.svg": 19,
         "data/tier2_task_suite.json": 11
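
The `marker_counts` block above records how often each marker string appears on the public surfaces, and this commit shifts `"Xperience-10M"` from 167 to 166. How the QA script counts markers is not shown in the diff. The sketch below assumes plain, case-sensitive substring counting summed over a list of surface texts; the function name `count_markers` and the sample texts are hypothetical.

```python
def count_markers(texts: list[str], markers: list[str]) -> dict[str, int]:
    """Count non-overlapping, case-sensitive occurrences of each marker.

    Counts are summed across all supplied texts. This is an assumed
    counting rule, not the repository's confirmed behaviour.
    """
    return {marker: sum(text.count(marker) for text in texts) for marker in markers}


# Made-up stand-ins for the README, website HTML, and card texts.
sample_texts = [
    "Ropedia Xperience-10M 20-task suite",
    "Xperience-10M with Qwen3-Omni",
]
markers = ["Xperience-10M", "20-task", "Qwen3-Omni", "128-episode pilot"]
print(count_markers(sample_texts, markers))
```

Under this rule a marker that is a substring of another marker is counted inside it as well, which is one reason the totals for short markers can be much larger than for full titles.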

metrics/reproducibility_matrix.json CHANGED

@@ -39,7 +39,7 @@
       "id": "original_task_suite",
       "status": "reproducible",
       "command": "python scripts/episode_task_suite.py --workspace $WORKSPACE --include-neural",
-      "expected": "original task metrics, predictions, manifests, and neural_mlp task-head artifacts",
       "boundary": "8,546-dimensional multimodal window contract"
     },
     {
@@ -50,11 +50,11 @@
       "boundary": "single-episode probes, not full research-direction solutions"
     },
     {
-      "id": "tasks_13_to_20_and_unified_index",
       "status": "reproducible",
       "command": "python scripts/tier2_task_suite.py && python scripts/build_unified_task_suite.py && python scripts/build_unified_task_model_radar.py",
-      "expected": "tasks 13-20 metrics, prediction/rank artifacts, TASK_SUITE_20.md, docs/data/task_suite_20.json, docs/data/tier2_task_suite.json, docs/assets/charts/tier2_task_suite.svg, docs/data/unified_task_model_radar.json, and docs/assets/charts/unified_task_model_radar.svg",
-      "boundary": "requires local public-sample annotation.hdf5 plus HOMIE Toolkit or h5py for tasks 13-20; raw HDF5 and MP4 files are not redistributed"
     },
     {
       "id": "source_alignment_audit",

       "id": "original_task_suite",
       "status": "reproducible",
       "command": "python scripts/episode_task_suite.py --workspace $WORKSPACE --include-neural",
+      "expected": "walkthrough-backed task metrics, predictions, manifests, and neural_mlp task-head artifacts",
       "boundary": "8,546-dimensional multimodal window contract"
     },
     {
       "boundary": "single-episode probes, not full research-direction solutions"
     },
     {
+      "id": "unified_20_task_index",
       "status": "reproducible",
       "command": "python scripts/tier2_task_suite.py && python scripts/build_unified_task_suite.py && python scripts/build_unified_task_model_radar.py",
+      "expected": "unified 20-task metrics, prediction/rank artifacts, TASK_SUITE_20.md, docs/data/task_suite_20.json, docs/data/tier2_task_suite.json, docs/assets/charts/tier2_task_suite.svg, docs/data/unified_task_model_radar.json, and docs/assets/charts/unified_task_model_radar.svg",
+      "boundary": "requires local public-sample annotation.hdf5 plus HOMIE Toolkit or h5py for full public-task regeneration; raw HDF5 and MP4 files are not redistributed"
     },
     {
       "id": "source_alignment_audit",

metrics/research_takeaways.json CHANGED

@@ -1,7 +1,7 @@
 {
   "title": "Ropedia Xperience-10M Research Takeaways",
   "status": "pass",
-  "generated_at_utc": "2026-06-20T21:27:21+00:00",
   "source_files": [
     "docs/data/summary_metrics.json",
     "results/episode_task_suite/summary_report.json",
@@ -133,7 +133,7 @@
     {
       "id": "audio_contribution_is_task_specific",
       "title": "Audio helps some tasks and hurts others on the public sample",
-      "readout": "Audio improves the primary metric on 6 of the original task contracts, while raw log-mel replacement improves over the current handcrafted block on 6 of those contracts. The largest current-audio gain appears in feature reconstruction, not in action classification.",
       "evidence": [
         {
           "label": "tasks_where_current_audio_improves",

 {
   "title": "Ropedia Xperience-10M Research Takeaways",
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:18:59+00:00",
   "source_files": [
     "docs/data/summary_metrics.json",
     "results/episode_task_suite/summary_report.json",
     {
       "id": "audio_contribution_is_task_specific",
       "title": "Audio helps some tasks and hurts others on the public sample",
+      "readout": "Audio improves the primary metric on 6 walkthrough-backed task contracts, while raw log-mel replacement improves over the current handcrafted block on 6 of those contracts. The largest current-audio gain appears in feature reconstruction, not in action classification.",
       "evidence": [
         {
           "label": "tasks_where_current_audio_improves",

metrics/task_method_20_gap_audit.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "generated_at_utc": "2026-06-21T08:38:20+00:00",
   "immediate_actions": [
     {
       "artifact": "docs/data/task_method_20_gap_audit.json",

 {
+  "generated_at_utc": "2026-06-21T15:21:42+00:00",
   "immediate_actions": [
     {
       "artifact": "docs/data/task_method_20_gap_audit.json",

metrics/task_surface_integrity.json CHANGED

@@ -1,6 +1,6 @@
 {
   "status": "pass",
-  "generated_at_utc": "2026-06-21T14:45:00+00:00",
   "summary": {
     "original_walkthrough_task_count": 12,
     "expected_original_walkthrough_task_count": 12,

 {
   "status": "pass",
+  "generated_at_utc": "2026-06-21T15:21:55+00:00",
   "summary": {
     "original_walkthrough_task_count": 12,
     "expected_original_walkthrough_task_count": 12,