Add files using upload-large-folder tool
- ARTIFACT_GUIDE.md +3 -3
- EVALUATION_PROTOCOL.md +22 -22
- FIGURE_INDEX.md +2 -2
- PROJECT_README.md +17 -25
- PROJECT_STATUS.md +2 -2
- README.md +17 -25
- RESEARCH_TAKEAWAYS.md +1 -1
- TASK_METHOD_20_GAP_AUDIT.md +1 -1
- TASK_SUITE_20.md +22 -22
- data/artifact_index.json +64 -64
- data/evaluation_protocol.json +23 -23
- data/live_publication_status.json +0 -0
- data/mirror_parity.json +0 -0
- data/omni_model_comparison.json +2 -2
- data/project_manifest.json +3 -4
- data/project_packet.json +3 -4
- data/project_status.json +5 -6
- data/publication_audit.json +1 -1
- data/quality_gates.json +1 -1
- data/reproducibility_matrix.json +4 -4
- data/research_takeaways.json +2 -2
- data/scope_claims_audit.json +1 -1
- data/single_episode_task_model_radar.json +21 -21
- data/source_alignment_audit.json +1 -1
- data/task_method_20_gap_audit.json +1 -1
- data/task_method_20_result_matrix.json +1 -1
- data/task_suite_20.json +46 -46
- data/task_surface_integrity.json +1 -1
- data/tier2_task_suite.json +24 -25
- data/unified_task_model_radar.json +21 -21
- data/website_integrity.json +24 -31
- index.html +12 -70
- metrics/episode128_task_model_radar.json +21 -21
- metrics/figure_index.json +7 -7
- metrics/live_publication_status.json +0 -0
- metrics/omni_model_comparison.json +2 -2
- metrics/project_brief.json +1 -1
- metrics/project_packet.json +3 -4
- metrics/public_surface_qa.json +7 -7
- metrics/reproducibility_matrix.json +4 -4
- metrics/research_takeaways.json +2 -2
- metrics/task_method_20_gap_audit.json +1 -1
- metrics/task_surface_integrity.json +1 -1
ARTIFACT_GUIDE.md
CHANGED
|
@@ -20,7 +20,7 @@ Xperience-native pretraining goal.
|
|
| 20 |
| [`XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md`](XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md) | Describes the future full-corpus Xperience Embodied Foundation Model goal, including modules, objectives, staged scale-up, hardware ranges, and evaluation. |
|
| 21 |
| [`EVALUATION_PROTOCOL.md`](EVALUATION_PROTOCOL.md) | Defines the task unit, chronological split, metrics, leakage controls, and current limitations. |
|
| 22 |
| [`REPRODUCIBILITY.md`](REPRODUCIBILITY.md) | Defines public reproduction commands, expected outputs, and unreproducible boundaries. |
|
| 23 |
-
| [`results/audio_ablation/AUDIO_ABLATION_SUMMARY.md`](results/audio_ablation/AUDIO_ABLATION_SUMMARY.md) | Shows measured current-audio and raw log-mel replacement deltas across the
|
| 24 |
| [`docs/single_episode_explorer.html`](docs/single_episode_explorer.html) | Gives a static window-level explorer for the public sample episode. |
|
| 25 |
| [`XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`](XPERIENCE10M_DATASET_CARD_ALIGNMENT.md) | Optional detail for readers who need official dataset and access-term context. |
|
| 26 |
|
|
@@ -74,13 +74,13 @@ Xperience-native pretraining goal.
|
|
| 74 |
| --- | --- |
|
| 75 |
| [`TASK_SUITE_20.md`](TASK_SUITE_20.md) | Reader-facing table for the unified 20-task suite. |
|
| 76 |
| [`docs/data/task_suite_20.json`](docs/data/task_suite_20.json) | Machine-readable unified 20-task suite for the website and Hugging Face mirrors. |
|
| 77 |
-
| [`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json) | The
|
| 78 |
| [`results/episode_task_suite/neural_mlp/`](results/episode_task_suite/neural_mlp/) | Matching PyTorch MLP heads for the same task contracts and feature windows. |
|
| 79 |
| [`results/episode_task_suite/research_directions/`](results/episode_task_suite/research_directions/) | Mapping from the unified 20-task suite to the four Ropedia research directions. |
|
| 80 |
| [`results/episode_task_suite/research_direction_extensions/`](results/episode_task_suite/research_direction_extensions/) | Four additional coded probes, one per research direction. |
|
| 81 |
| [`results/episode_task_suite/tier2_task_suite/`](results/episode_task_suite/tier2_task_suite/) | Historical provenance path inside the unified 20-task suite. |
|
| 82 |
| [`results/episode_task_suite/task_walkthroughs/`](results/episode_task_suite/task_walkthroughs/) | Human-readable research names and case studies explaining input, process modules, output, metric, limitation, and the website task-player data. |
|
| 83 |
-
| [`results/audio_ablation/audio_ablation_metrics.csv`](results/audio_ablation/audio_ablation_metrics.csv) | All measured audio rows for the
|
| 84 |
| [`results/audio_ablation/audio_delta_summary.csv`](results/audio_ablation/audio_delta_summary.csv) | Compact per-task audio delta table for quick manual inspection. |
|
| 85 |
| [`scripts/audio_ablation_and_raw_upgrade.py`](scripts/audio_ablation_and_raw_upgrade.py) | Regenerates audio contribution results from real task-suite artifacts plus the local public-sample MP4. |
|
| 86 |
| [`scripts/validate_task_surface.py`](scripts/validate_task_surface.py) | Fails publication if public task cards drift back to raw artifact ids or lose their thumbnail/player wiring. |
|
|
|
|
| 20 |
| [`XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md`](XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md) | Describes the future full-corpus Xperience Embodied Foundation Model goal, including modules, objectives, staged scale-up, hardware ranges, and evaluation. |
|
| 21 |
| [`EVALUATION_PROTOCOL.md`](EVALUATION_PROTOCOL.md) | Defines the task unit, chronological split, metrics, leakage controls, and current limitations. |
|
| 22 |
| [`REPRODUCIBILITY.md`](REPRODUCIBILITY.md) | Defines public reproduction commands, expected outputs, and unreproducible boundaries. |
|
| 23 |
+
| [`results/audio_ablation/AUDIO_ABLATION_SUMMARY.md`](results/audio_ablation/AUDIO_ABLATION_SUMMARY.md) | Shows measured current-audio and raw log-mel replacement deltas across the walkthrough-backed task contracts. |
|
| 24 |
| [`docs/single_episode_explorer.html`](docs/single_episode_explorer.html) | Gives a static window-level explorer for the public sample episode. |
|
| 25 |
| [`XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`](XPERIENCE10M_DATASET_CARD_ALIGNMENT.md) | Optional detail for readers who need official dataset and access-term context. |
|
| 26 |
|
|
|
|
| 74 |
| --- | --- |
|
| 75 |
| [`TASK_SUITE_20.md`](TASK_SUITE_20.md) | Reader-facing table for the unified 20-task suite. |
|
| 76 |
| [`docs/data/task_suite_20.json`](docs/data/task_suite_20.json) | Machine-readable unified 20-task suite for the website and Hugging Face mirrors. |
|
| 77 |
+
| [`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json) | The walkthrough-backed task contracts, chronological split, and minimal/neural metrics. |
|
| 78 |
| [`results/episode_task_suite/neural_mlp/`](results/episode_task_suite/neural_mlp/) | Matching PyTorch MLP heads for the same task contracts and feature windows. |
|
| 79 |
| [`results/episode_task_suite/research_directions/`](results/episode_task_suite/research_directions/) | Mapping from the unified 20-task suite to the four Ropedia research directions. |
|
| 80 |
| [`results/episode_task_suite/research_direction_extensions/`](results/episode_task_suite/research_direction_extensions/) | Four additional coded probes, one per research direction. |
|
| 81 |
| [`results/episode_task_suite/tier2_task_suite/`](results/episode_task_suite/tier2_task_suite/) | Historical provenance path inside the unified 20-task suite. |
|
| 82 |
| [`results/episode_task_suite/task_walkthroughs/`](results/episode_task_suite/task_walkthroughs/) | Human-readable research names and case studies explaining input, process modules, output, metric, limitation, and the website task-player data. |
|
| 83 |
+
| [`results/audio_ablation/audio_ablation_metrics.csv`](results/audio_ablation/audio_ablation_metrics.csv) | All measured audio rows for the walkthrough-backed task contracts across six variants, including no-audio, audio-only, alternate-audio-only, representation replacement, and all-input variants. |
|
| 84 |
| [`results/audio_ablation/audio_delta_summary.csv`](results/audio_ablation/audio_delta_summary.csv) | Compact per-task audio delta table for quick manual inspection. |
|
| 85 |
| [`scripts/audio_ablation_and_raw_upgrade.py`](scripts/audio_ablation_and_raw_upgrade.py) | Regenerates audio contribution results from real task-suite artifacts plus the local public-sample MP4. |
|
| 86 |
| [`scripts/validate_task_surface.py`](scripts/validate_task_surface.py) | Fails publication if public task cards drift back to raw artifact ids or lose their thumbnail/player wiring. |
|
EVALUATION_PROTOCOL.md
CHANGED
|
@@ -50,28 +50,28 @@ All 20 public-sample task contracts are presented together under the same
|
|
| 50 |
minimal/neural baseline setup. Historical `tier2_task_suite` paths are
|
| 51 |
retained only as stable provenance artifact locations inside the unified suite.
|
| 52 |
|
| 53 |
-
| # | Task | Artifact id |
|
| 54 |
-
| ---: | --- | --- | --- | --- | --- | --- | ---
|
| 55 |
-
| 1 | Action Recognition | `timeline_action` |
|
| 56 |
-
| 2 | Procedure Step Recognition | `timeline_subtask` |
|
| 57 |
-
| 3 | Action Boundary Detection | `transition_detection` |
|
| 58 |
-
| 4 | Next-Action Prediction | `next_action` |
|
| 59 |
-
| 5 | Hand Trajectory Forecasting | `hand_trajectory_forecast` |
|
| 60 |
-
| 6 | Contact State Prediction | `contact_prediction` |
|
| 61 |
-
| 7 | Object Relevance Prediction | `object_relevance` |
|
| 62 |
-
| 8 | Language Grounding | `caption_grounding` |
|
| 63 |
-
| 9 | Cross-Modal Retrieval | `cross_modal_retrieval` |
|
| 64 |
-
| 10 | Cross-Modal Reconstruction | `modality_reconstruction` |
|
| 65 |
-
| 11 | Temporal Order Verification | `temporal_order` |
|
| 66 |
-
| 12 | Multimodal Synchronization Detection | `misalignment_detection` |
|
| 67 |
-
| 13 | Long-Horizon Next-Action Forecasting | `long_horizon_next_action` |
|
| 68 |
-
| 14 | Long-Horizon Next-Subtask Forecasting | `next_subtask_forecast` |
|
| 69 |
-
| 15 | Interaction Text Prediction | `interaction_text_prediction` |
|
| 70 |
-
| 16 | Action-Object Relation Prediction | `action_object_relation` |
|
| 71 |
-
| 17 | Future Object-Set Forecasting | `object_set_forecast` |
|
| 72 |
-
| 18 | IMU-to-Hand Pose Reconstruction | `imu_to_hand_pose` |
|
| 73 |
-
| 19 | Camera-View Synchronization Retrieval | `camera_view_sync_retrieval` |
|
| 74 |
-
| 20 | Time-to-Next-Transition Regression | `time_to_transition` |
|
| 75 |
|
| 76 |
## Leakage Controls
|
| 77 |
|
|
|
|
| 50 |
minimal/neural baseline setup. Historical `tier2_task_suite` paths are
|
| 51 |
retained only as stable provenance artifact locations inside the unified suite.
|
| 52 |
|
| 53 |
+
| # | Task | Artifact id | Family | Unit | Input -> target | Primary metric | Minimal | Neural |
|
| 54 |
+
| ---: | --- | --- | --- | --- | --- | --- | ---: | ---: |
|
| 55 |
+
| 1 | Action Recognition | `timeline_action` | supervised classification | single window | current 20-frame all-feature window -> current action label | macro_f1 (higher better) | 0.0500 | 0.0148 |
|
| 56 |
+
| 2 | Procedure Step Recognition | `timeline_subtask` | supervised classification | single window | current 20-frame all-feature window -> current subtask label | macro_f1 (higher better) | 0.0506 | 0.0281 |
|
| 57 |
+
| 3 | Action Boundary Detection | `transition_detection` | temporal diagnostic | single window | current 20-frame all-feature window -> action boundary versus steady | macro_f1 (higher better) | 0.6118 | 0.5862 |
|
| 58 |
+
| 4 | Next-Action Prediction | `next_action` | short-horizon prediction | single window | current 20-frame all-feature window at time t -> action label at t + 20 frames | macro_f1 (higher better) | 0.0593 | 0.0419 |
|
| 59 |
+
| 5 | Hand Trajectory Forecasting | `hand_trajectory_forecast` | trajectory regression | single window | current all-feature window -> future left/right hand 3D joints for 10 frames | mpjpe (lower better) | 0.8647 | 0.1079 |
|
| 60 |
+
| 6 | Contact State Prediction | `contact_prediction` | binary classification | single window | non-contact and non-caption feature blocks -> any body contact | macro_f1 (higher better) | 1.0000 | 1.0000 |
|
| 61 |
+
| 7 | Object Relevance Prediction | `object_relevance` | multi-label classification | single window | non-caption feature blocks -> current relevant object set | micro_f1 (higher better) | 0.1803 | 0.1679 |
|
| 62 |
+
| 8 | Language Grounding | `caption_grounding` | retrieval | caption query | caption object/interaction query plus candidate sensor windows -> matching time window | mrr (higher better) | 0.0160 | 0.0168 |
|
| 63 |
+
| 9 | Cross-Modal Retrieval | `cross_modal_retrieval` | retrieval | sensor query | motion, IMU, and camera query features -> matching depth/video window | top5_accuracy (higher better) | 0.3678 | 0.1983 |
|
| 64 |
+
| 10 | Cross-Modal Reconstruction | `modality_reconstruction` | cross-modal regression | single window | motion, IMU, and camera features -> depth/video feature vector | r2 (higher better) | -0.0153 | -0.0102 |
|
| 65 |
+
| 11 | Temporal Order Verification | `temporal_order` | pairwise diagnostic | adjacent window pair | two adjacent windows -> correct versus reversed order | f1 (higher better) | 0.5400 | 0.8520 |
|
| 66 |
+
| 12 | Multimodal Synchronization Detection | `misalignment_detection` | pairwise diagnostic | paired modality window | motion side plus visual/depth side -> aligned versus shifted by 8 windows | f1 (higher better) | 0.5052 | 0.7153 |
|
| 67 |
+
| 13 | Long-Horizon Next-Action Forecasting | `long_horizon_next_action` | classification | single aligned window | Current 20-frame non-caption multimodal window. -> Action label five seconds later. | macro_f1 (higher better) | 0.0750 | 0.0655 |
|
| 68 |
+
| 14 | Long-Horizon Next-Subtask Forecasting | `next_subtask_forecast` | classification | single aligned window | Current 20-frame non-caption multimodal window. -> Procedure subtask label five seconds later. | macro_f1 (higher better) | 0.0455 | 0.0507 |
|
| 69 |
+
| 15 | Interaction Text Prediction | `interaction_text_prediction` | classification | single aligned window | Current 20-frame sensor window with caption-text features removed. -> Raw annotation interaction phrase for the same window. | macro_f1 (higher better) | 0.0444 | 0.0381 |
|
| 70 |
+
| 16 | Action-Object Relation Prediction | `action_object_relation` | classification | single aligned window | Current 20-frame sensor window with caption-text features removed. -> Joint action plus active object-set relation. | macro_f1 (higher better) | 0.0000 | 0.0000 |
|
| 71 |
+
| 17 | Future Object-Set Forecasting | `object_set_forecast` | multi_label | single aligned window | Current 20-frame sensor window with caption-text features removed. -> Object set active five seconds later. | micro_f1 (higher better) | 0.1694 | 0.1972 |
|
| 72 |
+
| 18 | IMU-to-Hand Pose Reconstruction | `imu_to_hand_pose` | regression | single aligned window | Current IMU acceleration/gyroscope feature block only. -> Current left/right hand joint feature blocks. | mae (lower better) | 0.0420 | 0.0426 |
|
| 73 |
+
| 19 | Camera-View Synchronization Retrieval | `camera_view_sync_retrieval` | retrieval | held-out query window | Fisheye camera-1 feature query projected into fisheye camera-3 feature space. -> The synchronized held-out camera-3 window. | mrr (higher better) | 0.4943 | 0.2409 |
|
| 74 |
+
| 20 | Time-to-Next-Transition Regression | `time_to_transition` | regression | single aligned window | Current 20-frame non-caption multimodal window. -> Frames until the next action-label boundary, capped at 200 frames. | mae (lower better) | 10.5374 | 10.5545 |
|
| 75 |
|
| 76 |
## Leakage Controls
|
| 77 |
|
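The primary-metric column in the table above mixes `macro_f1` and `mrr` scores, which are computed in quite different ways. The sketch below shows one common definition of each, in plain Python. It is not the repository's implementation: the function names `macro_f1` and `mean_reciprocal_rank` are hypothetical, and it assumes macro-F1 is averaged over every class seen in either the labels or the predictions, and that a query whose target is missing from its ranking contributes zero to MRR.

```python
from typing import Hashable, List, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 (assumed label set: union of both inputs)."""
    classes = set(y_true) | set(y_pred)
    scores: List[float] = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true positives scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]], targets: Sequence[Hashable]
) -> float:
    """Mean of 1 / rank of the correct candidate, with ranks starting at 1."""
    total = 0.0
    for ranking, target in zip(rankings, targets):
        if target in ranking:
            total += 1.0 / (list(ranking).index(target) + 1)
    return total / len(targets) if targets else 0.0


if __name__ == "__main__":
    # Toy labels only; these are not values from the task suite.
    print(macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
    print(mean_reciprocal_rank([[3, 1, 2], [2, 1, 3]], [1, 3]))
```

Because macro-F1 gives every class the same weight, a head that only predicts a few frequent labels can score low even when its plain accuracy looks reasonable, which is worth keeping in mind when reading the classification rows.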
FIGURE_INDEX.md
CHANGED
|
@@ -14,13 +14,13 @@ Public figures, diagrams, charts, and derived modality thumbnails. Raw Xperience
|
|
| 14 |
| Project logo mark | `docs/assets/brand/xperience10m-logo-mark-512.png` | 512 x 512 | `scripts/build_brand_assets.py` | Primary X-shaped multimodal camera mark used for the website header, README, HF cards, and brand identity. |
|
| 15 |
| Project logo social card | `docs/assets/brand/xperience10m-logo-social-card.png` | 1200 x 630 | `scripts/build_brand_assets.py` | Large preview image for README, Hugging Face cards, and Open Graph/Twitter social sharing. |
|
| 16 |
| Project favicon | `docs/assets/brand/xperience10m-logo-favicon-64.png` | 64 x 64 | `scripts/build_brand_assets.py` | Small dark-tile logo for browser tabs and compact navigation. |
|
| 17 |
-
| Original task-suite infographic | `docs/assets/task_suite_infographic.png` | 1800 x 7600 | `scripts/render_task_suite_infographic.py` | Primary visual map of the
|
| 18 |
| Episode-to-task pipeline diagram | `docs/assets/pipeline_diagram.png` | 1800 x 1120 | `scripts/generate_visualizations.py` | End-to-end data processing and evaluation pipeline overview. |
|
| 19 |
| Qwen3-Omni LoRA training pipeline | `docs/assets/qwen3_omni_lora_pipeline.png` | 1536 x 1024 | `docs/assets/qwen3_omni_lora_pipeline.prompt.md` | Detailed raw-data-to-adapter flow for staged Xperience-10M Qwen3-Omni LoRA training. |
|
| 20 |
| Spatial intelligence slide diagram | `docs/assets/foundation-pipelines/spatial-intelligence-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the spatial intelligence pipeline track. |
|
| 21 |
| Human-video world model slide diagram | `docs/assets/foundation-pipelines/human-video-world-model-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the human-video world-model pipeline track. |
|
| 22 |
| Vision-language-action slide diagram | `docs/assets/foundation-pipelines/vision-language-action-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the VLA/action-policy pipeline track. |
|
| 23 |
-
| Minimal and neural task architecture map | `docs/assets/task_architectures.png` | 1800 x 2450 | `scripts/render_overview_figures.py` | Minimal and neural heads for the
|
| 24 |
| Video modality thumbnail | `docs/assets/modalities/video.jpg` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived thumbnail for synchronized camera streams. |
|
| 25 |
| Audio modality thumbnail | `docs/assets/modalities/audio.png` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived waveform thumbnail for the MP4 AAC stream. |
|
| 26 |
| Depth modality thumbnail | `docs/assets/modalities/depth.jpg` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived depth and confidence thumbnail. |
|
|
|
|
| 14 |
| Project logo mark | `docs/assets/brand/xperience10m-logo-mark-512.png` | 512 x 512 | `scripts/build_brand_assets.py` | Primary X-shaped multimodal camera mark used for the website header, README, HF cards, and brand identity. |
|
| 15 |
| Project logo social card | `docs/assets/brand/xperience10m-logo-social-card.png` | 1200 x 630 | `scripts/build_brand_assets.py` | Large preview image for README, Hugging Face cards, and Open Graph/Twitter social sharing. |
|
| 16 |
| Project favicon | `docs/assets/brand/xperience10m-logo-favicon-64.png` | 64 x 64 | `scripts/build_brand_assets.py` | Small dark-tile logo for browser tabs and compact navigation. |
|
| 17 |
+
| Original task-suite infographic | `docs/assets/task_suite_infographic.png` | 1800 x 7600 | `scripts/render_task_suite_infographic.py` | Primary visual map of the walkthrough-backed task families, verified metrics, and sample modalities; the unified public suite is documented as 20 tasks. |
|
| 18 |
| Episode-to-task pipeline diagram | `docs/assets/pipeline_diagram.png` | 1800 x 1120 | `scripts/generate_visualizations.py` | End-to-end data processing and evaluation pipeline overview. |
|
| 19 |
| Qwen3-Omni LoRA training pipeline | `docs/assets/qwen3_omni_lora_pipeline.png` | 1536 x 1024 | `docs/assets/qwen3_omni_lora_pipeline.prompt.md` | Detailed raw-data-to-adapter flow for staged Xperience-10M Qwen3-Omni LoRA training. |
|
| 20 |
| Spatial intelligence slide diagram | `docs/assets/foundation-pipelines/spatial-intelligence-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the spatial intelligence pipeline track. |
|
| 21 |
| Human-video world model slide diagram | `docs/assets/foundation-pipelines/human-video-world-model-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the human-video world-model pipeline track. |
|
| 22 |
| Vision-language-action slide diagram | `docs/assets/foundation-pipelines/vision-language-action-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the VLA/action-policy pipeline track. |
|
| 23 |
+
| Minimal and neural task architecture map | `docs/assets/task_architectures.png` | 1800 x 2450 | `scripts/render_overview_figures.py` | Minimal and neural heads for the walkthrough-backed task contracts and shared feature contracts. |
|
| 24 |
| Video modality thumbnail | `docs/assets/modalities/video.jpg` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived thumbnail for synchronized camera streams. |
|
| 25 |
| Audio modality thumbnail | `docs/assets/modalities/audio.png` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived waveform thumbnail for the MP4 AAC stream. |
|
| 26 |
| Depth modality thumbnail | `docs/assets/modalities/depth.jpg` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived depth and confidence thumbnail. |
|
PROJECT_README.md
CHANGED
|
@@ -850,9 +850,9 @@ and verified Qwen3-Omni/Cosmos3 diagnostic artifacts.
|
|
| 850 |
scripts/
|
| 851 |
train_min_action_model.py # motion/IMU baseline
|
| 852 |
train_all_modalities_model.py # current all-feature lightweight baseline
|
| 853 |
-
episode_task_suite.py #
|
| 854 |
neural_task_models.py # optional PyTorch MLP heads for task contracts
|
| 855 |
-
research_direction_taxonomy.py # maps
|
| 856 |
research_direction_extension_tasks.py # one extra data-backed probe per track
|
| 857 |
tier2_task_suite.py # historical-name provenance builder for unified task rows
|
| 858 |
build_unified_task_suite.py # builds TASK_SUITE_20.md and task_suite_20.json
|
|
@@ -890,7 +890,7 @@ results/
|
|
| 890 |
research_directions/ # four-track taxonomy, CSV, and summary
|
| 891 |
research_direction_extensions/ # four extra direction probes + predictions
|
| 892 |
tier2_task_suite/ # provenance baseline tasks + predictions; historical path
|
| 893 |
-
task_walkthroughs/ # case-study walkthroughs for
|
| 894 |
omni_exploration/ # ModelScope readiness-check artifacts
|
| 895 |
omni_finetune/model_output_task_probes_20260616/ # task-13/task-16 probes derived from verified model JSON
|
| 896 |
|
|
@@ -1028,7 +1028,7 @@ cd ropedia-xperience-10m-task-suite
|
|
| 1028 |
python scripts/episode_task_suite.py --workspace /path/to/workspace
|
| 1029 |
```
|
| 1030 |
|
| 1031 |
-
Run the
|
| 1032 |
|
| 1033 |
```bash
|
| 1034 |
pip install torch
|
|
@@ -1449,7 +1449,7 @@ and [`docs/data/additional_development_directions.json`](docs/data/additional_de
|
|
| 1449 |
|
| 1450 |
## Four Research Directions
|
| 1451 |
|
| 1452 |
-
The
|
| 1453 |
a generated artifact, not only in prose:
|
| 1454 |
|
| 1455 |
- [`research_direction_taxonomy.json`](results/episode_task_suite/research_directions/research_direction_taxonomy.json)
|
|
@@ -1475,13 +1475,13 @@ Current direction-level coverage:
|
|
| 1475 |
|
| 1476 |
The important interpretation is that all four directions can be **started** from
|
| 1477 |
the Xperience-10M sample modalities, but only direction C is strongly represented
|
| 1478 |
-
by the
|
| 1479 |
multi-episode training before they become full research deliverables.
|
| 1480 |
|
| 1481 |
-
## Four Direction
|
| 1482 |
|
| 1483 |
-
|
| 1484 |
-
|
| 1485 |
`shared_windows.npz`, `windows.csv`, and `feature_manifest.json` artifacts, so
|
| 1486 |
the reported numbers are computed from sample-derived features and saved metric artifacts.
|
| 1487 |
|
|
@@ -1543,18 +1543,10 @@ unified 20-task suite, not as a separate benchmark tier.
|
|
| 1543 |
|
| 1544 |

|
| 1545 |
|
| 1546 |
-
|
| 1547 |
-
|
| 1548 |
-
|
| 1549 |
-
|
| 1550 |
-
| 13 | Long-Horizon Next-Action Forecasting | current non-caption multimodal window | action label five seconds later | `0.0750` macro-F1 | `0.0655` macro-F1 | Tests procedure context beyond the one-second next-action task. |
|
| 1551 |
-
| 14 | Long-Horizon Next-Subtask Forecasting | current non-caption multimodal window | subtask five seconds later | `0.0455` macro-F1 | `0.0507` macro-F1 | Moves anticipation from low-level action to high-level procedure state. |
|
| 1552 |
-
| 15 | Interaction Text Prediction | current sensor window without caption text | raw interaction phrase | `0.0444` macro-F1 | `0.0381` macro-F1 | Uses the original annotation interaction text instead of only hashed features. |
|
| 1553 |
-
| 16 | Action-Object Relation Prediction | current sensor window without caption text | joint action plus object-set label | `0.0000` macro-F1 | `0.0000` macro-F1 | Exposes a hard binding target for action-object reasoning. |
|
| 1554 |
-
| 17 | Future Object-Set Forecasting | current sensor window without caption text | object set five seconds later | `0.1694` micro-F1 | `0.1972` micro-F1 | Predicts which objects become relevant soon. |
|
| 1555 |
-
| 18 | IMU-to-Hand Pose Reconstruction | IMU feature block only | current left/right hand joints | `0.0420` MAE | `0.0426` MAE | Tests inertial-to-hand sensor bridging. |
|
| 1556 |
-
| 19 | Camera-View Synchronization Retrieval | fisheye camera-1 query | synchronized fisheye camera-3 window | `0.4943` MRR | `0.2409` MRR | Stress-tests multi-camera temporal alignment. |
|
| 1557 |
-
| 20 | Time-to-Next-Transition Regression | current non-caption multimodal window | capped frames until next action boundary | `10.5374` MAE frames | `10.5545` MAE frames | Converts boundary detection into continuous timing. |
|
| 1558 |
|
| 1559 |
Run:
|
| 1560 |
|
|
@@ -1632,7 +1624,7 @@ PyTorch MLP classifiers or regressors. Its outputs live under
|
|
| 1632 |
and the rollup is stored in the `neural_tasks` section of
|
| 1633 |
[`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json).
|
| 1634 |
|
| 1635 |
-
The
|
| 1636 |
|
| 1637 |
| Task | Input | Minimal head | Output |
|
| 1638 |
| --- | --- | --- | --- |
|
|
@@ -1663,8 +1655,8 @@ The original task-specific heads are:
|
|
| 1663 |
| Neural MLP hand forecast | 0.1079 MPJPE | n/a | Same features/split, nonlinear regression head |
|
| 1664 |
| Neural MLP temporal order | 0.8520 F1 | 0.8578 | Strong improvement on adjacent-window ordering |
|
| 1665 |
| Neural MLP misalignment | 0.7153 F1 | 0.7009 | Detects shifted motion/visual/audio pairs better than the linear head |
|
| 1666 |
-
| Audio ablation | +0.0418 mean delta | n/a | Current audio variant improves the primary metric on 6
|
| 1667 |
-
| Alternate audio representation | +0.0936 mean delta | n/a | Alternate audio-window representation improves over the baseline audio variant on 6
|
| 1668 |
|
| 1669 |
## Audio Contribution Study
|
| 1670 |
|
|
@@ -1743,7 +1735,7 @@ episodes; they are not reported as multi-episode benchmark results.
|
|
| 1743 |
|
| 1744 |
I re-ran the full pipeline from the local raw public sample into a temporary
|
| 1745 |
local workspace and compared regenerated metrics with the committed
|
| 1746 |
-
artifacts. The baseline metrics,
|
| 1747 |
available modality manifest matched exactly after float normalization.
|
| 1748 |
|
| 1749 |
See [`notes/reproducibility_audit.md`](notes/reproducibility_audit.md) for the
|
|
|
|
| 850 |
scripts/
|
| 851 |
train_min_action_model.py # motion/IMU baseline
|
| 852 |
train_all_modalities_model.py # current all-feature lightweight baseline
|
| 853 |
+
episode_task_suite.py # public-sample task definitions
|
| 854 |
neural_task_models.py # optional PyTorch MLP heads for task contracts
|
| 855 |
+
research_direction_taxonomy.py # maps walkthrough-backed tasks to the four research tracks
|
| 856 |
research_direction_extension_tasks.py # one extra data-backed probe per track
|
| 857 |
tier2_task_suite.py # historical-name provenance builder for unified task rows
|
| 858 |
build_unified_task_suite.py # builds TASK_SUITE_20.md and task_suite_20.json
|
|
|
|
| 890 |
research_directions/ # four-track taxonomy, CSV, and summary
|
| 891 |
research_direction_extensions/ # four extra direction probes + predictions
|
| 892 |
tier2_task_suite/ # provenance baseline tasks + predictions; historical path
|
| 893 |
+
task_walkthroughs/ # case-study walkthroughs for walkthrough-backed tasks
|
| 894 |
omni_exploration/ # ModelScope readiness-check artifacts
|
| 895 |
omni_finetune/model_output_task_probes_20260616/ # task-13/task-16 probes derived from verified model JSON
|
| 896 |
|
|
|
|
| 1028 |
python scripts/episode_task_suite.py --workspace /path/to/workspace
|
| 1029 |
```
|
| 1030 |
|
| 1031 |
+
Run the public-sample task definitions with lightweight neural heads:
|
| 1032 |
|
| 1033 |
```bash
|
| 1034 |
pip install torch
|
|
|
|
| 1449 |
|
| 1450 |
## Four Research Directions
|
| 1451 |
|
| 1452 |
+
The walkthrough-backed task contracts are organized against the four Ropedia research directions in
|
| 1453 |
a generated artifact, not only in prose:
|
| 1454 |
|
| 1455 |
- [`research_direction_taxonomy.json`](results/episode_task_suite/research_directions/research_direction_taxonomy.json)
|
|
|
|
| 1475 |
|
| 1476 |
The important interpretation is that all four directions can be **started** from
|
| 1477 |
the Xperience-10M sample modalities, but only direction C is strongly represented
|
| 1478 |
+
by the current task evidence. Directions A, B, and D need additional targets and
|
| 1479 |
multi-episode training before they become full research deliverables.
|
| 1480 |
|
| 1481 |
+
## Four Direction Probes
|
| 1482 |
|
| 1483 |
+
Alongside the unified 20-task suite, the repo includes one data-backed probe for
|
| 1484 |
+
each research direction. These probes are computed from the same
|
| 1485 |
`shared_windows.npz`, `windows.csv`, and `feature_manifest.json` artifacts, so
|
| 1486 |
the reported numbers are computed from sample-derived features and saved metric artifacts.
|
| 1487 |
|
|
|
|
| 1543 |
|
| 1544 |

|
| 1545 |
|
| 1546 |
+
The all-task table, including every input/output contract and minimal/neural
|
| 1547 |
+
metric, is in [`TASK_SUITE_20.md`](TASK_SUITE_20.md). Historical provenance
|
| 1548 |
+
links remain listed above for exact source tracing, but the public task surface
|
| 1549 |
+
should be read as one integrated 20-task suite.
|
| 1550 |
|
| 1551 |
Run:
|
| 1552 |
|
|
|
|
| 1624 |
and the rollup is stored in the `neural_tasks` section of
|
| 1625 |
[`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json).
|
| 1626 |
|
| 1627 |
+
The walkthrough-backed task heads are:
|
| 1628 |
|
| 1629 |
| Task | Input | Minimal head | Output |
|
| 1630 |
| --- | --- | --- | --- |
|
|
|
|
| 1655 |
| Neural MLP hand forecast | 0.1079 MPJPE | n/a | Same features/split, nonlinear regression head |
|
| 1656 |
| Neural MLP temporal order | 0.8520 F1 | 0.8578 | Strong improvement on adjacent-window ordering |
|
| 1657 |
| Neural MLP misalignment | 0.7153 F1 | 0.7009 | Detects shifted motion/visual/audio pairs better than the linear head |
|
| 1658 |
+
| Audio ablation | +0.0418 mean delta | n/a | Current audio variant improves the primary metric on 6 walkthrough-backed task contracts |
|
| 1659 |
+
| Alternate audio representation | +0.0936 mean delta | n/a | Alternate audio-window representation improves over the baseline audio variant on 6 walkthrough-backed task contracts |
|
| 1660 |
|
| 1661 |
## Audio Contribution Study
|
| 1662 |
|
|
|
|
| 1735 |
|
| 1736 |
I re-ran the full pipeline from the local raw public sample into a temporary
|
| 1737 |
local workspace and compared regenerated metrics with the committed
|
| 1738 |
+
artifacts. The baseline metrics, task metrics, feature manifest, and
|
| 1739 |
available modality manifest matched exactly after float normalization.
|
| 1740 |
|
| 1741 |
See [`notes/reproducibility_audit.md`](notes/reproducibility_audit.md) for the
|
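The reproducibility note in this hunk says the regenerated metrics matched the committed artifacts "exactly after float normalization". A minimal sketch of what such a comparison could look like is below. The function names, the rounding precision, and the metric keys are assumptions for illustration, not the repo's actual audit code.

```python
def normalize_floats(value, ndigits=6):
    """Recursively round floats so serialization noise does not count as a diff.

    ndigits=6 is an assumed precision; the source does not state one.
    """
    if isinstance(value, bool):
        return value  # bool is a subclass of int; leave it untouched
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        return {key: normalize_floats(item, ndigits) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize_floats(item, ndigits) for item in value]
    return value


def metrics_match(committed, regenerated, ndigits=6):
    """Return True when two metric structures agree after float rounding."""
    return normalize_floats(committed, ndigits) == normalize_floats(regenerated, ndigits)


# Hypothetical metric dictionaries; the keys are illustrative only.
committed = {"macro_f1": 0.05, "per_task": [0.8647, 0.1079]}
regenerated = {"macro_f1": 0.05000000001, "per_task": [0.8647000000002, 0.1079]}
print(metrics_match(committed, regenerated))
```

Rounding is simple but can report a false mismatch when two nearly equal values fall on opposite sides of a rounding boundary; a tolerance-based comparison such as `math.isclose` avoids that at the cost of a slightly longer traversal.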
PROJECT_STATUS.md
CHANGED
|
@@ -33,7 +33,7 @@ prior multiscale release, and v6 is the current public 20-task Qwen3-Omni row.
|
|
| 33 |
| Unified 20-task suite | Verified | `TASK_SUITE_20.md`, `docs/data/task_suite_20.json`, `results/episode_task_suite/`, `results/episode_task_suite/tier2_task_suite/` | All 20 task contracts have committed minimal metrics and share the same 20-frame windows, 5-frame stride, chronological split, and minimal/neural head pattern. The `tier2_task_suite` path is historical provenance inside the unified suite, not a separate public tier. |
|
| 34 |
| 180-result method matrix | Verified complete | `docs/data/task_method_20_result_matrix.json`, `TASK_METHOD_20_RESULT_MATRIX.md`, `docs/data/task_method_20_gap_audit.json`, `docs/assets/charts/unified_task_model_radar.svg` | The public comparison matrix now has 9 methods x 20 tasks = 180/180 scored method-task records. Six rows are explicitly marked as compact-proxy scores where the public 128-episode export lacks the direct raw target. |
|
| 35 |
| Neural heads | Verified | `scripts/neural_task_models.py`, `results/episode_task_suite/neural_mlp/` | Each task also has a compact PyTorch MLP run over the same feature tensor and chronological split. |
|
| 36 |
-
| Audio contribution study | Verified | `scripts/audio_ablation_and_raw_upgrade.py`, `results/audio_ablation/`, `docs/data/audio_ablation_summary.json` | Audio variants are compared across the
|
| 37 |
| Research takeaways | Verified | `RESEARCH_TAKEAWAYS.md`, `docs/data/research_takeaways.json`, `scripts/build_research_takeaways.py` | The main result interpretation is generated from committed metrics: chronological class shift, neural gains on dynamics/order/alignment, open retrieval/reconstruction problems, and the need for held-out episodes. |
|
| 38 |
| Research roadmap | Current | `RESEARCH_ROADMAP.md`, `docs/data/research_roadmap.json` | The roadmap connects public-sample task development to the final verified Qwen3-Omni diagnostic result, same-split baseline alignment, action/subtask error analysis, robustness runs, world/policy tracks, and the future Xperience-native pretraining goal. |
|
| 39 |
| 128-episode task-suite enhancement pack | Current no-new-episode plan | `TASK_SUITE_ENHANCEMENT_128.md`, `docs/data/task_suite_enhancement_128.json`, `results/omni_finetune/task_suite_enhancement_128_v1_20260608/enhancement_plan.json`, `scripts/omni/build_task_suite_enhancement_128.py` | The current 3,808-window selected split can be stressed without more episodes by exporting denser and multiscale windows. The recommended next export is `multiscale_20s10_40s20_80s40`, estimated at 106,095 windows from the observed frame spans; the pack also defines hierarchical action/subtask targets, raw-feature shard priorities for unsupported tasks, and Qwen3-Omni/Cosmos3 follow-up run cards. |
|
|
@@ -112,7 +112,7 @@ prior multiscale release, and v6 is the current public 20-task Qwen3-Omni row.
|
|
| 112 |
- The current reconstruction task reconstructs feature vectors, not pixel
|
| 113 |
depth, meshes, NeRF outputs, or Gaussian splats.
|
| 114 |
- Audio is part of the current 8,546-dimensional baseline feature vector.
|
| 115 |
-
- Audio contribution is evaluated across the
|
| 116 |
`results/audio_ablation/`.
|
| 117 |
- Foundation-model selection is now explicit: Qwen3-Omni is the immediate
|
| 118 |
trainable pilot, Cosmos 3 is the first world-model track, and Cosmos3-Super
|
|
|
|
| 33 |
| Unified 20-task suite | Verified | `TASK_SUITE_20.md`, `docs/data/task_suite_20.json`, `results/episode_task_suite/`, `results/episode_task_suite/tier2_task_suite/` | All 20 task contracts have committed minimal metrics and share the same 20-frame windows, 5-frame stride, chronological split, and minimal/neural head pattern. The `tier2_task_suite` path is historical provenance inside the unified suite, not a separate public tier. |
|
| 34 |
| 180-result method matrix | Verified complete | `docs/data/task_method_20_result_matrix.json`, `TASK_METHOD_20_RESULT_MATRIX.md`, `docs/data/task_method_20_gap_audit.json`, `docs/assets/charts/unified_task_model_radar.svg` | The public comparison matrix now has 9 methods x 20 tasks = 180/180 scored method-task records. Six rows are explicitly marked as compact-proxy scores where the public 128-episode export lacks the direct raw target. |
|
| 35 |
| Neural heads | Verified | `scripts/neural_task_models.py`, `results/episode_task_suite/neural_mlp/` | Each task also has a compact PyTorch MLP run over the same feature tensor and chronological split. |
|
| 36 |
+
| Audio contribution study | Verified | `scripts/audio_ablation_and_raw_upgrade.py`, `results/audio_ablation/`, `docs/data/audio_ablation_summary.json` | Audio variants are compared across the walkthrough-backed task contracts; audio improves the primary metric on 6 of those contracts, and a 588-d audio-window representation improves over the baseline audio variant on 6 of those contracts. |
|
| 37 |
| Research takeaways | Verified | `RESEARCH_TAKEAWAYS.md`, `docs/data/research_takeaways.json`, `scripts/build_research_takeaways.py` | The main result interpretation is generated from committed metrics: chronological class shift, neural gains on dynamics/order/alignment, open retrieval/reconstruction problems, and the need for held-out episodes. |
|
| 38 |
| Research roadmap | Current | `RESEARCH_ROADMAP.md`, `docs/data/research_roadmap.json` | The roadmap connects public-sample task development to the final verified Qwen3-Omni diagnostic result, same-split baseline alignment, action/subtask error analysis, robustness runs, world/policy tracks, and the future Xperience-native pretraining goal. |
|
| 39 |
| 128-episode task-suite enhancement pack | Current no-new-episode plan | `TASK_SUITE_ENHANCEMENT_128.md`, `docs/data/task_suite_enhancement_128.json`, `results/omni_finetune/task_suite_enhancement_128_v1_20260608/enhancement_plan.json`, `scripts/omni/build_task_suite_enhancement_128.py` | The current 3,808-window selected split can be stressed without more episodes by exporting denser and multiscale windows. The recommended next export is `multiscale_20s10_40s20_80s40`, estimated at 106,095 windows from the observed frame spans; the pack also defines hierarchical action/subtask targets, raw-feature shard priorities for unsupported tasks, and Qwen3-Omni/Cosmos3 follow-up run cards. |
|
|
|
|
| 112 |
- The current reconstruction task reconstructs feature vectors, not pixel
|
| 113 |
depth, meshes, NeRF outputs, or Gaussian splats.
|
| 114 |
- Audio is part of the current 8,546-dimensional baseline feature vector.
|
| 115 |
+
- Audio contribution is evaluated across the walkthrough-backed task contracts in
|
| 116 |
`results/audio_ablation/`.
|
| 117 |
- Foundation-model selection is now explicit: Qwen3-Omni is the immediate
|
| 118 |
trainable pilot, Cosmos 3 is the first world-model track, and Cosmos3-Super
|
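The status table above states that all task contracts share 20-frame windows, a 5-frame stride, and a chronological split. As a rough sketch of that windowing contract (the function names and the 0.7 train fraction are assumptions; the source does not give the split ratio):

```python
def window_starts(num_frames, window=20, stride=5):
    """Start indices of every full window that fits inside the episode."""
    if num_frames < window:
        return []
    return list(range(0, num_frames - window + 1, stride))


def chronological_split(starts, train_fraction=0.7):
    """Earlier windows go to train, later windows to test; no shuffling.

    train_fraction is a hypothetical value chosen for illustration.
    """
    cut = int(len(starts) * train_fraction)
    return starts[:cut], starts[cut:]


starts = window_starts(num_frames=100)
train_starts, test_starts = chronological_split(starts)
print(len(starts), len(train_starts), len(test_starts))
```

Because the stride is shorter than the window, the last training window and the first test window overlap in frames. Whether the repo inserts a gap at the boundary is not stated in this diff.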
README.md
CHANGED
|
@@ -872,9 +872,9 @@ and verified Qwen3-Omni/Cosmos3 diagnostic artifacts.
|
|
| 872 |
scripts/
|
| 873 |
train_min_action_model.py # motion/IMU baseline
|
| 874 |
train_all_modalities_model.py # current all-feature lightweight baseline
|
| 875 |
-
episode_task_suite.py #
|
| 876 |
neural_task_models.py # optional PyTorch MLP heads for task contracts
|
| 877 |
-
research_direction_taxonomy.py # maps
|
| 878 |
research_direction_extension_tasks.py # one extra data-backed probe per track
|
| 879 |
tier2_task_suite.py # historical-name provenance builder for unified task rows
|
| 880 |
build_unified_task_suite.py # builds TASK_SUITE_20.md and task_suite_20.json
|
|
@@ -912,7 +912,7 @@ results/
|
|
| 912 |
research_directions/ # four-track taxonomy, CSV, and summary
|
| 913 |
research_direction_extensions/ # four extra direction probes + predictions
|
| 914 |
tier2_task_suite/ # provenance baseline tasks + predictions; historical path
|
| 915 |
-
task_walkthroughs/ # case-study walkthroughs for
|
| 916 |
omni_exploration/ # ModelScope readiness-check artifacts
|
| 917 |
omni_finetune/model_output_task_probes_20260616/ # task-13/task-16 probes derived from verified model JSON
|
| 918 |
|
|
@@ -1050,7 +1050,7 @@ cd ropedia-xperience-10m-task-suite
|
|
| 1050 |
python scripts/episode_task_suite.py --workspace /path/to/workspace
|
| 1051 |
```
|
| 1052 |
|
| 1053 |
-
Run the
|
| 1054 |
|
| 1055 |
```bash
|
| 1056 |
pip install torch
|
|
@@ -1471,7 +1471,7 @@ and [`docs/data/additional_development_directions.json`](docs/data/additional_de
|
|
| 1471 |
|
| 1472 |
## Four Research Directions
|
| 1473 |
|
| 1474 |
-
The
|
| 1475 |
a generated artifact, not only in prose:
|
| 1476 |
|
| 1477 |
- [`research_direction_taxonomy.json`](results/episode_task_suite/research_directions/research_direction_taxonomy.json)
|
|
@@ -1497,13 +1497,13 @@ Current direction-level coverage:
|
|
| 1497 |
|
| 1498 |
The important interpretation is that all four directions can be **started** from
|
| 1499 |
the Xperience-10M sample modalities, but only direction C is strongly represented
|
| 1500 |
-
by the
|
| 1501 |
multi-episode training before they become full research deliverables.
|
| 1502 |
|
| 1503 |
-
## Four Direction
|
| 1504 |
|
| 1505 |
-
|
| 1506 |
-
|
| 1507 |
`shared_windows.npz`, `windows.csv`, and `feature_manifest.json` artifacts, so
|
| 1508 |
the reported numbers trace back to sample-derived features and saved metric artifacts.
|
| 1509 |
|
|
@@ -1565,18 +1565,10 @@ unified 20-task suite, not as a separate benchmark tier.
|
|
| 1565 |
|
| 1566 |

|
| 1567 |
|
| 1568 |
-
|
| 1569 |
-
|
| 1570 |
-
|
| 1571 |
-
|
| 1572 |
-
| 13 | Long-Horizon Next-Action Forecasting | current non-caption multimodal window | action label five seconds later | `0.0750` macro-F1 | `0.0655` macro-F1 | Tests procedure context beyond the one-second next-action task. |
|
| 1573 |
-
| 14 | Long-Horizon Next-Subtask Forecasting | current non-caption multimodal window | subtask five seconds later | `0.0455` macro-F1 | `0.0507` macro-F1 | Moves anticipation from low-level action to high-level procedure state. |
|
| 1574 |
-
| 15 | Interaction Text Prediction | current sensor window without caption text | raw interaction phrase | `0.0444` macro-F1 | `0.0381` macro-F1 | Uses the original annotation interaction text instead of only hashed features. |
|
| 1575 |
-
| 16 | Action-Object Relation Prediction | current sensor window without caption text | joint action plus object-set label | `0.0000` macro-F1 | `0.0000` macro-F1 | Exposes a hard binding target for action-object reasoning. |
|
| 1576 |
-
| 17 | Future Object-Set Forecasting | current sensor window without caption text | object set five seconds later | `0.1694` micro-F1 | `0.1972` micro-F1 | Predicts which objects become relevant soon. |
|
| 1577 |
-
| 18 | IMU-to-Hand Pose Reconstruction | IMU feature block only | current left/right hand joints | `0.0420` MAE | `0.0426` MAE | Tests inertial-to-hand sensor bridging. |
|
| 1578 |
-
| 19 | Camera-View Synchronization Retrieval | fisheye camera-1 query | synchronized fisheye camera-3 window | `0.4943` MRR | `0.2409` MRR | Stress-tests multi-camera temporal alignment. |
|
| 1579 |
-
| 20 | Time-to-Next-Transition Regression | current non-caption multimodal window | capped frames until next action boundary | `10.5374` MAE frames | `10.5545` MAE frames | Converts boundary detection into continuous timing. |
|
| 1580 |
|
| 1581 |
Run:
|
| 1582 |
|
|
@@ -1654,7 +1646,7 @@ PyTorch MLP classifiers or regressors. Its outputs live under
|
|
| 1654 |
and the rollup is stored in the `neural_tasks` section of
|
| 1655 |
[`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json).
|
| 1656 |
|
| 1657 |
-
The
|
| 1658 |
|
| 1659 |
| Task | Input | Minimal head | Output |
|
| 1660 |
| --- | --- | --- | --- |
|
|
@@ -1685,8 +1677,8 @@ The original task-specific heads are:
|
|
| 1685 |
| Neural MLP hand forecast | 0.1079 MPJPE | n/a | Same features/split, nonlinear regression head |
|
| 1686 |
| Neural MLP temporal order | 0.8520 F1 | 0.8578 | Strong improvement on adjacent-window ordering |
|
| 1687 |
| Neural MLP misalignment | 0.7153 F1 | 0.7009 | Detects shifted motion/visual/audio pairs better than the linear head |
|
| 1688 |
-
| Audio ablation | +0.0418 mean delta | n/a | Current audio variant improves the primary metric on 6
|
| 1689 |
-
| Alternate audio representation | +0.0936 mean delta | n/a | Alternate audio-window representation improves over the baseline audio variant on 6
|
| 1690 |
|
| 1691 |
## Audio Contribution Study
|
| 1692 |
|
|
@@ -1765,7 +1757,7 @@ episodes; they are not reported as multi-episode benchmark results.
|
|
| 1765 |
|
| 1766 |
I re-ran the full pipeline from the local raw public sample into a temporary
|
| 1767 |
local workspace and compared regenerated metrics with the committed
|
| 1768 |
-
artifacts. The baseline metrics,
|
| 1769 |
available modality manifest matched exactly after float normalization.
|
| 1770 |
|
| 1771 |
See [`notes/reproducibility_audit.md`](notes/reproducibility_audit.md) for the
|
|
|
|
| 872 |
scripts/
|
| 873 |
train_min_action_model.py # motion/IMU baseline
|
| 874 |
train_all_modalities_model.py # current all-feature lightweight baseline
|
| 875 |
+
episode_task_suite.py # public-sample task definitions
|
| 876 |
neural_task_models.py # optional PyTorch MLP heads for task contracts
|
| 877 |
+
research_direction_taxonomy.py # maps walkthrough-backed tasks to the four research tracks
|
| 878 |
research_direction_extension_tasks.py # one extra data-backed probe per track
|
| 879 |
tier2_task_suite.py # historical-name provenance builder for unified task rows
|
| 880 |
build_unified_task_suite.py # builds TASK_SUITE_20.md and task_suite_20.json
|
|
|
|
| 912 |
research_directions/ # four-track taxonomy, CSV, and summary
|
| 913 |
research_direction_extensions/ # four extra direction probes + predictions
|
| 914 |
tier2_task_suite/ # provenance baseline tasks + predictions; historical path
|
| 915 |
+
task_walkthroughs/ # case-study walkthroughs for walkthrough-backed tasks
|
| 916 |
omni_exploration/ # ModelScope readiness-check artifacts
|
| 917 |
omni_finetune/model_output_task_probes_20260616/ # task-13/task-16 probes derived from verified model JSON
|
| 918 |
|
|
|
|
| 1050 |
python scripts/episode_task_suite.py --workspace /path/to/workspace
|
| 1051 |
```
|
| 1052 |
|
| 1053 |
+
Run the public-sample task definitions with lightweight neural heads:
|
| 1054 |
|
| 1055 |
```bash
|
| 1056 |
pip install torch
|
|
|
|
| 1471 |
|
| 1472 |
## Four Research Directions
|
| 1473 |
|
| 1474 |
+
The walkthrough-backed task contracts are organized against the four Ropedia research directions in
|
| 1475 |
a generated artifact, not only in prose:
|
| 1476 |
|
| 1477 |
- [`research_direction_taxonomy.json`](results/episode_task_suite/research_directions/research_direction_taxonomy.json)
|
|
|
|
| 1497 |
|
| 1498 |
The important interpretation is that all four directions can be **started** from
|
| 1499 |
the Xperience-10M sample modalities, but only direction C is strongly represented
|
| 1500 |
+
by the current task evidence. Directions A, B, and D need additional targets and
|
| 1501 |
multi-episode training before they become full research deliverables.
|
| 1502 |
|
| 1503 |
+
## Four Direction Probes
|
| 1504 |
|
| 1505 |
+
Alongside the unified 20-task suite, the repo includes one data-backed probe for
|
| 1506 |
+
each research direction. These probes are computed from the same
|
| 1507 |
`shared_windows.npz`, `windows.csv`, and `feature_manifest.json` artifacts, so
|
| 1508 |
the reported numbers trace back to sample-derived features and saved metric artifacts.
|
| 1509 |
|
|
|
|
| 1565 |
|
| 1566 |

|
| 1567 |
|
| 1568 |
+
The all-task table, including every input/output contract and minimal/neural
|
| 1569 |
+
metric, is in [`TASK_SUITE_20.md`](TASK_SUITE_20.md). Historical provenance
|
| 1570 |
+
links remain listed above for exact source tracing, but the public task surface
|
| 1571 |
+
should be read as one integrated 20-task suite.
|
| 1572 |
|
| 1573 |
Run:
|
| 1574 |
|
|
|
|
| 1646 |
and the rollup is stored in the `neural_tasks` section of
|
| 1647 |
[`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json).
|
| 1648 |
|
| 1649 |
+
The walkthrough-backed task heads are:
|
| 1650 |
|
| 1651 |
| Task | Input | Minimal head | Output |
|
| 1652 |
| --- | --- | --- | --- |
|
|
|
|
| 1677 |
| Neural MLP hand forecast | 0.1079 MPJPE | n/a | Same features/split, nonlinear regression head |
|
| 1678 |
| Neural MLP temporal order | 0.8520 F1 | 0.8578 | Strong improvement on adjacent-window ordering |
|
| 1679 |
| Neural MLP misalignment | 0.7153 F1 | 0.7009 | Detects shifted motion/visual/audio pairs better than the linear head |
|
| 1680 |
+
| Audio ablation | +0.0418 mean delta | n/a | Current audio variant improves the primary metric on 6 walkthrough-backed task contracts |
|
| 1681 |
+
| Alternate audio representation | +0.0936 mean delta | n/a | Alternate audio-window representation improves over the baseline audio variant on 6 walkthrough-backed task contracts |
|
| 1682 |
|
| 1683 |
## Audio Contribution Study
|
| 1684 |
|
|
|
|
| 1757 |
|
| 1758 |
I re-ran the full pipeline from the local raw public sample into a temporary
|
| 1759 |
local workspace and compared regenerated metrics with the committed
|
| 1760 |
+
artifacts. The baseline metrics, task metrics, feature manifest, and
|
| 1761 |
available modality manifest matched exactly after float normalization.
|
| 1762 |
|
| 1763 |
See [`notes/reproducibility_audit.md`](notes/reproducibility_audit.md) for the
|
RESEARCH_TAKEAWAYS.md
CHANGED
|
@@ -80,7 +80,7 @@ Current scope: The current reconstruction task predicts feature vectors; depth,
|
|
| 80 |
|
| 81 |
### Audio helps some tasks and hurts others on the public sample
|
| 82 |
|
| 83 |
-
Audio improves the primary metric on 6
|
| 84 |
|
| 85 |
| Metric | Value |
|
| 86 |
| --- | ---: |
|
|
|
|
| 80 |
|
| 81 |
### Audio helps some tasks and hurts others on the public sample
|
| 82 |
|
| 83 |
+
Audio improves the primary metric on 6 walkthrough-backed task contracts, while raw log-mel replacement improves over the current handcrafted block on 6 of those contracts. The largest current-audio gain appears in feature reconstruction, not in action classification.
|
| 84 |
|
| 85 |
| Metric | Value |
|
| 86 |
| --- | ---: |
|
TASK_METHOD_20_GAP_AUDIT.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
# Task Method 20-Result Completion Audit
|
| 2 |
|
| 3 |
-
Generated: `2026-06-
|
| 4 |
|
| 5 |
This audit is the explicit completion ledger for the 9-method x 20-task result
|
| 6 |
matrix. The current public matrix is complete at 180/180 scored records while
|
|
|
|
| 1 |
# Task Method 20-Result Completion Audit
|
| 2 |
|
| 3 |
+
Generated: `2026-06-21T15:21:42+00:00`
|
| 4 |
|
| 5 |
This audit is the explicit completion ledger for the 9-method x 20-task result
|
| 6 |
matrix. The current public matrix is complete at 180/180 scored records while
|
TASK_SUITE_20.md
CHANGED
|
@@ -20,28 +20,28 @@ as a separate benchmark tier.
|
|
| 20 |
|
| 21 |
## Task Table
|
| 22 |
|
| 23 |
-
| # | Task | Artifact id |
|
| 24 |
-
| ---: | --- | --- | --- | --- | ---
|
| 25 |
-
| 1 | Action Recognition | `timeline_action` |
|
| 26 |
-
| 2 | Procedure Step Recognition | `timeline_subtask` |
|
| 27 |
-
| 3 | Action Boundary Detection | `transition_detection` |
|
| 28 |
-
| 4 | Next-Action Prediction | `next_action` |
|
| 29 |
-
| 5 | Hand Trajectory Forecasting | `hand_trajectory_forecast` |
|
| 30 |
-
| 6 | Contact State Prediction | `contact_prediction` |
|
| 31 |
-
| 7 | Object Relevance Prediction | `object_relevance` |
|
| 32 |
-
| 8 | Language Grounding | `caption_grounding` |
|
| 33 |
-
| 9 | Cross-Modal Retrieval | `cross_modal_retrieval` |
|
| 34 |
-
| 10 | Cross-Modal Reconstruction | `modality_reconstruction` |
|
| 35 |
-
| 11 | Temporal Order Verification | `temporal_order` |
|
| 36 |
-
| 12 | Multimodal Synchronization Detection | `misalignment_detection` |
|
| 37 |
-
| 13 | Long-Horizon Next-Action Forecasting | `long_horizon_next_action` |
|
| 38 |
-
| 14 | Long-Horizon Next-Subtask Forecasting | `next_subtask_forecast` |
|
| 39 |
-
| 15 | Interaction Text Prediction | `interaction_text_prediction` |
|
| 40 |
-
| 16 | Action-Object Relation Prediction | `action_object_relation` |
|
| 41 |
-
| 17 | Future Object-Set Forecasting | `object_set_forecast` |
|
| 42 |
-
| 18 | IMU-to-Hand Pose Reconstruction | `imu_to_hand_pose` |
|
| 43 |
-
| 19 | Camera-View Synchronization Retrieval | `camera_view_sync_retrieval` |
|
| 44 |
-
| 20 | Time-to-Next-Transition Regression | `time_to_transition` |
|
| 45 |
|
| 46 |
## Machine-Readable Copy
|
| 47 |
|
|
|
|
| 20 |
|
| 21 |
## Task Table
|
| 22 |
|
| 23 |
+
| # | Task | Artifact id | Input -> output | Primary metric | Minimal | Neural |
|
| 24 |
+
| ---: | --- | --- | --- | --- | ---: | ---: |
|
| 25 |
+
| 1 | Action Recognition | `timeline_action` | 20-frame multimodal window -> current action class | macro-F1 (higher better) | 0.0500 | 0.0148 |
|
| 26 |
+
| 2 | Procedure Step Recognition | `timeline_subtask` | 20-frame multimodal window -> current procedure step | macro-F1 (higher better) | 0.0506 | 0.0281 |
|
| 27 |
+
| 3 | Action Boundary Detection | `transition_detection` | current window with boundary target -> boundary or steady | macro-F1 (higher better) | 0.6118 | 0.5862 |
|
| 28 |
+
| 4 | Next-Action Prediction | `next_action` | current window at time t -> action at t+20 frames | macro-F1 (higher better) | 0.0593 | 0.0419 |
|
| 29 |
+
| 5 | Hand Trajectory Forecasting | `hand_trajectory_forecast` | current multimodal window -> future hand-joint trajectory | MPJPE (lower better) | 0.8647 | 0.1079 |
|
| 30 |
+
| 6 | Contact State Prediction | `contact_prediction` | non-contact, non-caption features -> contact or no contact | macro-F1 (higher better) | 1.0000 | 1.0000 |
|
| 31 |
+
| 7 | Object Relevance Prediction | `object_relevance` | non-caption multimodal features -> relevant object set | micro-F1 (higher better) | 0.1803 | 0.1679 |
|
| 32 |
+
| 8 | Language Grounding | `caption_grounding` | text-like query and candidate windows -> ranked matching moments | MRR (higher better) | 0.0160 | 0.0168 |
|
| 33 |
+
| 9 | Cross-Modal Retrieval | `cross_modal_retrieval` | motion/IMU/pose query; depth/video candidates -> ranked visual windows | MRR (higher better) | 0.2693 | 0.1300 |
|
| 34 |
+
| 10 | Cross-Modal Reconstruction | `modality_reconstruction` | motion, IMU, and camera/pose features -> reconstructed depth/video vector | R2 (higher better) | -0.0153 | -0.0102 |
|
| 35 |
+
| 11 | Temporal Order Verification | `temporal_order` | two adjacent windows plus difference vector -> correct or reversed | F1 (higher better) | 0.5400 | 0.8520 |
|
| 36 |
+
| 12 | Multimodal Synchronization Detection | `misalignment_detection` | motion-side and visual/depth-side feature groups -> aligned or shifted | F1 (higher better) | 0.5052 | 0.7153 |
|
| 37 |
+
| 13 | Long-Horizon Next-Action Forecasting | `long_horizon_next_action` | current 20-frame non-caption multimodal window -> action label five seconds later | macro-F1 (higher better) | 0.0750 | 0.0655 |
|
| 38 |
+
| 14 | Long-Horizon Next-Subtask Forecasting | `next_subtask_forecast` | current 20-frame non-caption multimodal window -> procedure subtask label five seconds later | macro-F1 (higher better) | 0.0455 | 0.0507 |
|
| 39 |
+
| 15 | Interaction Text Prediction | `interaction_text_prediction` | current 20-frame sensor window with caption-text features removed -> raw annotation interaction phrase for the same window | macro-F1 (higher better) | 0.0444 | 0.0381 |
|
| 40 |
+
| 16 | Action-Object Relation Prediction | `action_object_relation` | current 20-frame sensor window with caption-text features removed -> joint action plus active object-set relation | macro-F1 (higher better) | 0.0000 | 0.0000 |
|
| 41 |
+
| 17 | Future Object-Set Forecasting | `object_set_forecast` | current 20-frame sensor window with caption-text features removed -> object set active five seconds later | micro-F1 (higher better) | 0.1694 | 0.1972 |
|
| 42 |
+
| 18 | IMU-to-Hand Pose Reconstruction | `imu_to_hand_pose` | current IMU acceleration/gyroscope feature block only -> current left/right hand joint feature blocks | MAE (lower better) | 0.0420 | 0.0426 |
|
| 43 |
+
| 19 | Camera-View Synchronization Retrieval | `camera_view_sync_retrieval` | fisheye camera-1 feature query projected into fisheye camera-3 feature space -> synchronized held-out camera-3 window | MRR (higher better) | 0.4943 | 0.2409 |
|
| 44 |
+
| 20 | Time-to-Next-Transition Regression | `time_to_transition` | current 20-frame non-caption multimodal window -> frames until the next action-label boundary, capped at 200 frames | MAE frames (lower better) | 10.5374 | 10.5545 |
|
| 45 |
|
| 46 |
## Machine-Readable Copy
|
| 47 |
|
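Several rows in the task table above (Language Grounding, Cross-Modal Retrieval, Camera-View Synchronization Retrieval) report MRR. For readers unfamiliar with the metric, a minimal sketch of mean reciprocal rank follows. The function name, the input layout, and the optimistic tie handling are assumptions, not the suite's actual scoring code.

```python
def mean_reciprocal_rank(score_rows, target_indices):
    """Average of 1/rank of the correct candidate over all queries.

    score_rows[i][j] is the similarity of query i to candidate j (higher is
    better); target_indices[i] is the index of the correct candidate.
    Ties are resolved optimistically: only strictly higher scores outrank
    the target. That tie rule is an assumption of this sketch.
    """
    reciprocal_ranks = []
    for scores, target in zip(score_rows, target_indices):
        target_score = scores[target]
        rank = 1 + sum(1 for score in scores if score > target_score)
        reciprocal_ranks.append(1.0 / rank)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)


# Toy example: three queries, three candidates each.
toy_scores = [
    [0.9, 0.1, 0.0],  # target 0 is ranked first
    [0.2, 0.3, 0.5],  # target 1 is ranked second
    [0.1, 0.8, 0.3],  # target 1 is ranked first
]
toy_targets = [0, 1, 1]
print(mean_reciprocal_rank(toy_scores, toy_targets))
```

An MRR of 1.0 means the correct window is always ranked first, while values near 1/N for N candidates are close to chance, which is a useful reference point when reading the retrieval rows.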
data/artifact_index.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Task Suite Artifact Index",
|
| 3 |
-
"generated_at_utc": "2026-06-
|
| 4 |
"status": "pass",
|
| 5 |
"artifact_count": 228,
|
| 6 |
"missing": [],
|
|
@@ -59,8 +59,8 @@
|
|
| 59 |
"surface": "website_hf",
|
| 60 |
"shows": "Machine-readable first-reader project brief for the website and Hugging Face mirrors.",
|
| 61 |
"exists": true,
|
| 62 |
-
"bytes":
|
| 63 |
-
"sha256": "
|
| 64 |
},
|
| 65 |
{
|
| 66 |
"id": "project_status",
|
|
@@ -70,8 +70,8 @@
|
|
| 70 |
"surface": "repo_hf",
|
| 71 |
"shows": "Gives a compact current-state table for first-pass readers.",
|
| 72 |
"exists": true,
|
| 73 |
-
"bytes":
|
| 74 |
-
"sha256": "
|
| 75 |
},
|
| 76 |
{
|
| 77 |
"id": "project_status_json",
|
|
@@ -81,8 +81,8 @@
|
|
| 81 |
"surface": "website_hf",
|
| 82 |
"shows": "Machine-readable copy of the current project status for website and HF mirrors.",
|
| 83 |
"exists": true,
|
| 84 |
-
"bytes":
|
| 85 |
-
"sha256": "
|
| 86 |
},
|
| 87 |
{
|
| 88 |
"id": "glossary",
|
|
@@ -576,8 +576,8 @@
|
|
| 576 |
"surface": "website_hf",
|
| 577 |
"shows": "Gives a short project path with scope status and public surfaces.",
|
| 578 |
"exists": true,
|
| 579 |
-
"bytes":
|
| 580 |
-
"sha256": "
|
| 581 |
},
|
| 582 |
{
|
| 583 |
"id": "artifact_guide",
|
|
@@ -587,8 +587,8 @@
|
|
| 587 |
"surface": "repo_hf",
|
| 588 |
"shows": "Gives the human-readable map from project scope to data, tasks, platform mirrors, and scale-up status.",
|
| 589 |
"exists": true,
|
| 590 |
-
"bytes":
|
| 591 |
-
"sha256": "
|
| 592 |
},
|
| 593 |
{
|
| 594 |
"id": "official_dataset_card_alignment",
|
|
@@ -632,7 +632,7 @@
|
|
| 632 |
"shows": "Machine-readable source-alignment pass/fail check for repo, website, and HF surfaces.",
|
| 633 |
"exists": true,
|
| 634 |
"bytes": 4432,
|
| 635 |
-
"sha256": "
|
| 636 |
},
|
| 637 |
{
|
| 638 |
"id": "source_alignment_validator",
|
|
@@ -686,8 +686,8 @@
|
|
| 686 |
"surface": "repo_hf",
|
| 687 |
"shows": "Defines the window unit, chronological split, task metrics, leakage controls, and current limitations.",
|
| 688 |
"exists": true,
|
| 689 |
-
"bytes":
|
| 690 |
-
"sha256": "
|
| 691 |
},
|
| 692 |
{
|
| 693 |
"id": "evaluation_protocol_json",
|
|
@@ -697,8 +697,8 @@
|
|
| 697 |
"surface": "website_hf",
|
| 698 |
"shows": "Machine-readable protocol generated from committed task metrics for website and HF mirrors.",
|
| 699 |
"exists": true,
|
| 700 |
-
"bytes":
|
| 701 |
-
"sha256": "
|
| 702 |
},
|
| 703 |
{
|
| 704 |
"id": "evaluation_protocol_builder",
|
|
@@ -708,8 +708,8 @@
|
|
| 708 |
"surface": "repo_hf",
|
| 709 |
"shows": "Regenerates the protocol from committed summary metrics and task artifacts.",
|
| 710 |
"exists": true,
|
| 711 |
-
"bytes":
|
| 712 |
-
"sha256": "
|
| 713 |
},
|
| 714 |
{
|
| 715 |
"id": "task_suite_20",
|
|
@@ -719,8 +719,8 @@
|
|
| 719 |
"surface": "repo_hf",
|
| 720 |
"shows": "Reader-facing table for the single unified public-sample task suite under the same window, split, feature, and baseline contract.",
|
| 721 |
"exists": true,
|
| 722 |
-
"bytes":
|
| 723 |
-
"sha256": "
|
| 724 |
},
|
| 725 |
{
|
| 726 |
"id": "task_suite_20_json",
|
|
@@ -730,8 +730,8 @@
|
|
| 730 |
"surface": "website_hf",
|
| 731 |
"shows": "Machine-readable unified 20-task index for the website, Hugging Face mirrors, and live verification.",
|
| 732 |
"exists": true,
|
| 733 |
-
"bytes":
|
| 734 |
-
"sha256": "
|
| 735 |
},
|
| 736 |
{
|
| 737 |
"id": "task_suite_20_builder",
|
|
@@ -741,8 +741,8 @@
|
|
| 741 |
"surface": "repo_hf",
|
| 742 |
"shows": "Regenerates the unified 20-task JSON and Markdown from the public-sample metrics plus the historical provenance result bundle.",
|
| 743 |
"exists": true,
|
| 744 |
-
"bytes":
|
| 745 |
-
"sha256": "
|
| 746 |
},
|
| 747 |
{
|
| 748 |
"id": "unified_task_model_radar_json",
|
|
@@ -1005,8 +1005,8 @@
|
|
| 1005 |
"surface": "repo_hf",
|
| 1006 |
"shows": "Summarizes the main research lessons from committed metrics and identifies which experiments need held-out episodes.",
|
| 1007 |
"exists": true,
|
| 1008 |
-
"bytes":
|
| 1009 |
-
"sha256": "
|
| 1010 |
},
|
| 1011 |
{
|
| 1012 |
"id": "research_takeaways_json",
|
|
@@ -1016,8 +1016,8 @@
|
|
| 1016 |
"surface": "website_hf",
|
| 1017 |
"shows": "Machine-readable result interpretation for the website, HF cards, and mirror checks.",
|
| 1018 |
"exists": true,
|
| 1019 |
-
"bytes":
|
| 1020 |
-
"sha256": "
|
| 1021 |
},
|
| 1022 |
{
|
| 1023 |
"id": "research_takeaways_builder",
|
|
@@ -1027,8 +1027,8 @@
|
|
| 1027 |
"surface": "repo_hf",
|
| 1028 |
"shows": "Regenerates the research takeaways from committed summary metrics and task result artifacts.",
|
| 1029 |
"exists": true,
|
| 1030 |
-
"bytes":
|
| 1031 |
-
"sha256": "
|
| 1032 |
},
|
| 1033 |
{
|
| 1034 |
"id": "audio_ablation_script",
|
|
@@ -1036,7 +1036,7 @@
|
|
| 1036 |
"path": "scripts/audio_ablation_and_raw_upgrade.py",
|
| 1037 |
"kind": "result_interpretation",
|
| 1038 |
"surface": "repo_hf",
|
| 1039 |
-
"shows": "Measures audio contribution variants across the
|
| 1040 |
"exists": true,
|
| 1041 |
"bytes": 43159,
|
| 1042 |
"sha256": "2444f2e52efb975be931b33d66b7180d53031e1d5e821719122160f92f4540aa"
|
|
@@ -1080,7 +1080,7 @@
|
|
| 1080 |
"path": "docs/assets/charts/audio_ablation_delta.svg",
|
| 1081 |
"kind": "visual_evidence",
|
| 1082 |
"surface": "website_hf",
|
| 1083 |
-
"shows": "Bar chart of measured current-audio primary-metric deltas across the
|
| 1084 |
"exists": true,
|
| 1085 |
"bytes": 4146,
|
| 1086 |
"sha256": "187dbabe01f9ff18841ff61a1e7fbf85bebdd188cc0f248bb5090d64528e7568"
|
|
@@ -1093,8 +1093,8 @@
|
|
| 1093 |
"surface": "repo_hf",
|
| 1094 |
"shows": "Catalogs public figures, charts, modality thumbnails, dimensions, hashes, roles, and source scripts.",
|
| 1095 |
"exists": true,
|
| 1096 |
-
"bytes":
|
| 1097 |
-
"sha256": "
|
| 1098 |
},
|
| 1099 |
{
|
| 1100 |
"id": "figure_index_json",
|
|
@@ -1104,8 +1104,8 @@
|
|
| 1104 |
"surface": "website_hf",
|
| 1105 |
"shows": "Machine-readable visual asset index for website and Hugging Face mirrors.",
|
| 1106 |
"exists": true,
|
| 1107 |
-
"bytes":
|
| 1108 |
-
"sha256": "
|
| 1109 |
},
|
| 1110 |
{
|
| 1111 |
"id": "figure_index_builder",
|
|
@@ -1115,8 +1115,8 @@
|
|
| 1115 |
"surface": "repo_hf",
|
| 1116 |
"shows": "Regenerates visual-asset hashes, dimensions, and source-script provenance.",
|
| 1117 |
"exists": true,
|
| 1118 |
-
"bytes":
|
| 1119 |
-
"sha256": "
|
| 1120 |
},
|
| 1121 |
{
|
| 1122 |
"id": "brand_assets_json",
|
|
@@ -1182,7 +1182,7 @@
|
|
| 1182 |
"shows": "Machine-readable release-check summary for validators, mirrors, and public project surfaces.",
|
| 1183 |
"exists": true,
|
| 1184 |
"bytes": 8640,
|
| 1185 |
-
"sha256": "
|
| 1186 |
},
|
| 1187 |
{
|
| 1188 |
"id": "public_surface_qa",
|
|
@@ -1226,7 +1226,7 @@
|
|
| 1226 |
"volatile": true,
|
| 1227 |
"shows": "Machine-readable report for SEO/social metadata, accessible tab semantics, public links, project links, and clear project presentation.",
|
| 1228 |
"exists": true,
|
| 1229 |
-
"bytes":
|
| 1230 |
"hash_policy": "existence_and_size_only"
|
| 1231 |
},
|
| 1232 |
{
|
|
@@ -1307,7 +1307,7 @@
|
|
| 1307 |
"volatile": true,
|
| 1308 |
"shows": "Records the last live GitHub/HF URL verification after upload.",
|
| 1309 |
"exists": true,
|
| 1310 |
-
"bytes":
|
| 1311 |
"hash_policy": "existence_and_size_only"
|
| 1312 |
},
|
| 1313 |
{
|
|
@@ -1340,8 +1340,8 @@
|
|
| 1340 |
"surface": "website_hf",
|
| 1341 |
"shows": "Machine-readable reproduction steps with expected artifacts and public boundaries.",
|
| 1342 |
"exists": true,
|
| 1343 |
-
"bytes":
|
| 1344 |
-
"sha256": "
|
| 1345 |
},
|
| 1346 |
{
|
| 1347 |
"id": "artifact_index_builder",
|
|
@@ -1351,8 +1351,8 @@
|
|
| 1351 |
"surface": "repo_hf",
|
| 1352 |
"shows": "Generates the selective artifact catalog from local files.",
|
| 1353 |
"exists": true,
|
| 1354 |
-
"bytes":
|
| 1355 |
-
"sha256": "
|
| 1356 |
},
|
| 1357 |
{
|
| 1358 |
"id": "publication_audit",
|
|
@@ -1410,8 +1410,8 @@
|
|
| 1410 |
"surface": "website_hf",
|
| 1411 |
"shows": "Lists public URLs, upstream sources, and machine-readable project metadata.",
|
| 1412 |
"exists": true,
|
| 1413 |
-
"bytes":
|
| 1414 |
-
"sha256": "
|
| 1415 |
},
|
| 1416 |
{
|
| 1417 |
"id": "task_summary",
|
|
@@ -1474,7 +1474,7 @@
|
|
| 1474 |
"path": "results/episode_task_suite/neural_mlp",
|
| 1475 |
"kind": "result_directory",
|
| 1476 |
"surface": "repo_hf_model",
|
| 1477 |
-
"shows": "Stores matching PyTorch MLP results for the
|
| 1478 |
"exists": true,
|
| 1479 |
"file_count": 60,
|
| 1480 |
"bytes": 90609517
|
|
@@ -1485,7 +1485,7 @@
|
|
| 1485 |
"path": "results/episode_task_suite/research_directions/research_direction_taxonomy.json",
|
| 1486 |
"kind": "taxonomy",
|
| 1487 |
"surface": "repo_hf",
|
| 1488 |
-
"shows": "Maps the
|
| 1489 |
"exists": true,
|
| 1490 |
"bytes": 25046,
|
| 1491 |
"sha256": "0e3c442e5eb9057b04b1e8c8fa723dfde6f72e7fae1378d5ea022d93f7d25ca3"
|
|
@@ -1509,8 +1509,8 @@
|
|
| 1509 |
"surface": "repo_hf",
|
| 1510 |
"shows": "Stores the historical result bundle for provenance rows with minimal and neural baselines aligned to the same 20-task window/split setup.",
|
| 1511 |
"exists": true,
|
| 1512 |
-
"bytes":
|
| 1513 |
-
"sha256": "
|
| 1514 |
},
|
| 1515 |
{
|
| 1516 |
"id": "tier2_task_suite_json",
|
|
@@ -1520,8 +1520,8 @@
|
|
| 1520 |
"surface": "website_hf",
|
| 1521 |
"shows": "Machine-readable provenance definitions, setup alignment, metrics, and public source paths; the file name is historical.",
|
| 1522 |
"exists": true,
|
| 1523 |
-
"bytes":
|
| 1524 |
-
"sha256": "
|
| 1525 |
},
|
| 1526 |
{
|
| 1527 |
"id": "tier2_task_suite_chart",
|
|
@@ -1531,8 +1531,8 @@
|
|
| 1531 |
"surface": "website_hf",
|
| 1532 |
"shows": "Visual summary of the historical provenance baseline metrics inside the unified 20-task suite.",
|
| 1533 |
"exists": true,
|
| 1534 |
-
"bytes":
|
| 1535 |
-
"sha256": "
|
| 1536 |
},
|
| 1537 |
{
|
| 1538 |
"id": "tier2_task_suite_builder",
|
|
@@ -1542,8 +1542,8 @@
|
|
| 1542 |
"surface": "repo_hf",
|
| 1543 |
"shows": "Regenerates the historical provenance rows from shared windows plus the local public-sample annotation HDF5; the script name is historical.",
|
| 1544 |
"exists": true,
|
| 1545 |
-
"bytes":
|
| 1546 |
-
"sha256": "
|
| 1547 |
},
|
| 1548 |
{
|
| 1549 |
"id": "task_walkthroughs",
|
|
@@ -1564,8 +1564,8 @@
|
|
| 1564 |
"surface": "website_hf",
|
| 1565 |
"shows": "Presents the task suite and sample modality thumbnails with metrics generated from committed files.",
|
| 1566 |
"exists": true,
|
| 1567 |
-
"bytes":
|
| 1568 |
-
"sha256": "
|
| 1569 |
},
|
| 1570 |
{
|
| 1571 |
"id": "modality_atlas",
|
|
@@ -1672,7 +1672,7 @@
|
|
| 1672 |
"path": "results/omni_finetune/multi_episode_128_task_baselines/BASELINE_ALIGNMENT_REPORT.md",
|
| 1673 |
"kind": "scaleup_status",
|
| 1674 |
"surface": "repo_hf",
|
| 1675 |
-
"shows": "Summarizes same-split simple and neural metadata baselines for the
|
| 1676 |
"exists": true,
|
| 1677 |
"bytes": 2238,
|
| 1678 |
"sha256": "c70440aa502ec569a840159ab7e05b8e7d4ed70e0091ad9a4b2fb3fb0d3803c1"
|
|
@@ -1696,8 +1696,8 @@
|
|
| 1696 |
"surface": "repo_hf",
|
| 1697 |
"shows": "Reader-facing comparison of the single-episode task suite, 128-episode aligned baselines, Qwen3-Omni packages, and Cosmos3 future-window branch.",
|
| 1698 |
"exists": true,
|
| 1699 |
-
"bytes":
|
| 1700 |
-
"sha256": "
|
| 1701 |
},
|
| 1702 |
{
|
| 1703 |
"id": "omni_model_comparison_json",
|
|
@@ -1707,8 +1707,8 @@
|
|
| 1707 |
"surface": "repo_hf",
|
| 1708 |
"shows": "Machine-readable comparison of the current result versions, per-task aligned baselines, verified Qwen3 packages, and Cosmos3 package.",
|
| 1709 |
"exists": true,
|
| 1710 |
-
"bytes":
|
| 1711 |
-
"sha256": "
|
| 1712 |
},
|
| 1713 |
{
|
| 1714 |
"id": "cosmos3_nano_verified_summary",
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Task Suite Artifact Index",
|
| 3 |
+
"generated_at_utc": "2026-06-21T15:19:00+00:00",
|
| 4 |
"status": "pass",
|
| 5 |
"artifact_count": 228,
|
| 6 |
"missing": [],
|
|
|
|
| 59 |
"surface": "website_hf",
|
| 60 |
"shows": "Machine-readable first-reader project brief for the website and Hugging Face mirrors.",
|
| 61 |
"exists": true,
|
| 62 |
+
"bytes": 4032,
|
| 63 |
+
"sha256": "328d601390fdd61c836434e00cfe27670ef5fb96252270975c4ca339f2a51bfa"
|
| 64 |
},
|
| 65 |
{
|
| 66 |
"id": "project_status",
|
|
|
|
| 70 |
"surface": "repo_hf",
|
| 71 |
"shows": "Gives a compact current-state table for first-pass readers.",
|
| 72 |
"exists": true,
|
| 73 |
+
"bytes": 16013,
|
| 74 |
+
"sha256": "5ad142b601ad982ce59620bd7fa50446c8837050b0331b2be4a357280b295c21"
|
| 75 |
},
|
| 76 |
{
|
| 77 |
"id": "project_status_json",
|
|
|
|
| 81 |
"surface": "website_hf",
|
| 82 |
"shows": "Machine-readable copy of the current project status for website and HF mirrors.",
|
| 83 |
"exists": true,
|
| 84 |
+
"bytes": 23232,
|
| 85 |
+
"sha256": "406c48ec858b5f288c7ebef6eefc0ed94dc8bad11bf9221f435b9c8aca547ea3"
|
| 86 |
},
|
| 87 |
{
|
| 88 |
"id": "glossary",
|
|
|
|
| 576 |
"surface": "website_hf",
|
| 577 |
"shows": "Gives a short project path with scope status and public surfaces.",
|
| 578 |
"exists": true,
|
| 579 |
+
"bytes": 10018,
|
| 580 |
+
"sha256": "6b7ae7fe0df1a9e4a12d241a3162540b0cf1ade86803dec8aac68e3dc99bfc66"
|
| 581 |
},
|
| 582 |
{
|
| 583 |
"id": "artifact_guide",
|
|
|
|
| 587 |
"surface": "repo_hf",
|
| 588 |
"shows": "Gives the human-readable map from project scope to data, tasks, platform mirrors, and scale-up status.",
|
| 589 |
"exists": true,
|
| 590 |
+
"bytes": 20601,
|
| 591 |
+
"sha256": "e0e4ad50271ab1d58d2fe97de5b3451a52f034996b54d0ee9499b562b9decbbf"
|
| 592 |
},
|
| 593 |
{
|
| 594 |
"id": "official_dataset_card_alignment",
|
|
|
|
| 632 |
"shows": "Machine-readable source-alignment pass/fail check for repo, website, and HF surfaces.",
|
| 633 |
"exists": true,
|
| 634 |
"bytes": 4432,
|
| 635 |
+
"sha256": "5ab2ea4bfefe9f5bc7854f02b2e1e2b5206766a54447647191828da1a1a2077c"
|
| 636 |
},
|
| 637 |
{
|
| 638 |
"id": "source_alignment_validator",
|
|
|
|
| 686 |
"surface": "repo_hf",
|
| 687 |
"shows": "Defines the window unit, chronological split, task metrics, leakage controls, and current limitations.",
|
| 688 |
"exists": true,
|
| 689 |
+
"bytes": 8905,
|
| 690 |
+
"sha256": "f82e9b9c4a07e95776005968788e7acdaae9e322991113d79432d59057181add"
|
| 691 |
},
|
| 692 |
{
|
| 693 |
"id": "evaluation_protocol_json",
|
|
|
|
| 697 |
"surface": "website_hf",
|
| 698 |
"shows": "Machine-readable protocol generated from committed task metrics for website and HF mirrors.",
|
| 699 |
"exists": true,
|
| 700 |
+
"bytes": 24047,
|
| 701 |
+
"sha256": "d8f61b646a2f3f1e0af901dbdaff310ebfeea90622c93a34b9e35f34be98b896"
|
| 702 |
},
|
| 703 |
{
|
| 704 |
"id": "evaluation_protocol_builder",
|
|
|
|
| 708 |
"surface": "repo_hf",
|
| 709 |
"shows": "Regenerates the protocol from committed summary metrics and task artifacts.",
|
| 710 |
"exists": true,
|
| 711 |
+
"bytes": 19825,
|
| 712 |
+
"sha256": "aa9de1582f8fa79c1850e10e69fb125c0e3c1add433c7ebedc104c2efb42272e"
|
| 713 |
},
|
| 714 |
{
|
| 715 |
"id": "task_suite_20",
|
|
|
|
| 719 |
"surface": "repo_hf",
|
| 720 |
"shows": "Reader-facing table for the single unified public-sample task suite under the same window, split, feature, and baseline contract.",
|
| 721 |
"exists": true,
|
| 722 |
+
"bytes": 4845,
|
| 723 |
+
"sha256": "076a68734f20e2660d1eddba460672c1246951b893494396f1281d6423f3627a"
|
| 724 |
},
|
| 725 |
{
|
| 726 |
"id": "task_suite_20_json",
|
|
|
|
| 730 |
"surface": "website_hf",
|
| 731 |
"shows": "Machine-readable unified 20-task index for the website, Hugging Face mirrors, and live verification.",
|
| 732 |
"exists": true,
|
| 733 |
+
"bytes": 34585,
|
| 734 |
+
"sha256": "75145285cf71bc3bb9a10377a1921b60e85c4546dc8b858102b3c26e94c11a01"
|
| 735 |
},
|
| 736 |
{
|
| 737 |
"id": "task_suite_20_builder",
|
|
|
|
| 741 |
"surface": "repo_hf",
|
| 742 |
"shows": "Regenerates the unified 20-task JSON and Markdown from the public-sample metrics plus the historical provenance result bundle.",
|
| 743 |
"exists": true,
|
| 744 |
+
"bytes": 12157,
|
| 745 |
+
"sha256": "157265b5c025f279ce1eb56c52dd720ce0969b8426d5887030bfa179a3b565e0"
|
| 746 |
},
|
| 747 |
{
|
| 748 |
"id": "unified_task_model_radar_json",
|
|
|
|
| 1005 |
"surface": "repo_hf",
|
| 1006 |
"shows": "Summarizes the main research lessons from committed metrics and identifies which experiments need held-out episodes.",
|
| 1007 |
"exists": true,
|
| 1008 |
+
"bytes": 5175,
|
| 1009 |
+
"sha256": "385d1b77b41c632925bbd27878c334839303462d03a3b9d358326951b1088da8"
|
| 1010 |
},
|
| 1011 |
{
|
| 1012 |
"id": "research_takeaways_json",
|
|
|
|
| 1016 |
"surface": "website_hf",
|
| 1017 |
"shows": "Machine-readable result interpretation for the website, HF cards, and mirror checks.",
|
| 1018 |
"exists": true,
|
| 1019 |
+
"bytes": 7165,
|
| 1020 |
+
"sha256": "f1ddead60f986e3036206bc3c70d4bdda422a8be4761b285eb89c9c49d9832b6"
|
| 1021 |
},
|
| 1022 |
{
|
| 1023 |
"id": "research_takeaways_builder",
|
|
|
|
| 1027 |
"surface": "repo_hf",
|
| 1028 |
"shows": "Regenerates the research takeaways from committed summary metrics and task result artifacts.",
|
| 1029 |
"exists": true,
|
| 1030 |
+
"bytes": 13499,
|
| 1031 |
+
"sha256": "fc749125f9be87ee0db5b66918342da5c0378d6c97fb1acabe9688f920554c39"
|
| 1032 |
},
|
| 1033 |
{
|
| 1034 |
"id": "audio_ablation_script",
|
|
|
|
| 1036 |
"path": "scripts/audio_ablation_and_raw_upgrade.py",
|
| 1037 |
"kind": "result_interpretation",
|
| 1038 |
"surface": "repo_hf",
|
| 1039 |
+
"shows": "Measures audio contribution variants across the walkthrough-backed task contracts.",
|
| 1040 |
"exists": true,
|
| 1041 |
"bytes": 43159,
|
| 1042 |
"sha256": "2444f2e52efb975be931b33d66b7180d53031e1d5e821719122160f92f4540aa"
|
|
|
|
| 1080 |
"path": "docs/assets/charts/audio_ablation_delta.svg",
|
| 1081 |
"kind": "visual_evidence",
|
| 1082 |
"surface": "website_hf",
|
| 1083 |
+
"shows": "Bar chart of measured current-audio primary-metric deltas across the walkthrough-backed tasks.",
|
| 1084 |
"exists": true,
|
| 1085 |
"bytes": 4146,
|
| 1086 |
"sha256": "187dbabe01f9ff18841ff61a1e7fbf85bebdd188cc0f248bb5090d64528e7568"
|
|
|
|
| 1093 |
"surface": "repo_hf",
|
| 1094 |
"shows": "Catalogs public figures, charts, modality thumbnails, dimensions, hashes, roles, and source scripts.",
|
| 1095 |
"exists": true,
|
| 1096 |
+
"bytes": 7027,
|
| 1097 |
+
"sha256": "b7b507c35cd3cba2765586e9703a447c8025c89658c3daa390df67db4211d0fc"
|
| 1098 |
},
|
| 1099 |
{
|
| 1100 |
"id": "figure_index_json",
|
|
|
|
| 1104 |
"surface": "website_hf",
|
| 1105 |
"shows": "Machine-readable visual asset index for website and Hugging Face mirrors.",
|
| 1106 |
"exists": true,
|
| 1107 |
+
"bytes": 19485,
|
| 1108 |
+
"sha256": "4f225bf08f00fbe843999d6bd2b3d5f5d6c17f2ff67e1f6a85eee9094c6bb6a3"
|
| 1109 |
},
|
| 1110 |
{
|
| 1111 |
"id": "figure_index_builder",
|
|
|
|
| 1115 |
"surface": "repo_hf",
|
| 1116 |
"shows": "Regenerates visual-asset hashes, dimensions, and source-script provenance.",
|
| 1117 |
"exists": true,
|
| 1118 |
+
"bytes": 16845,
|
| 1119 |
+
"sha256": "3f91f7f13a3fb08ab57c2f0a6b320102e9d5ae19b102b71499edb5b8fd5a2cec"
|
| 1120 |
},
|
| 1121 |
{
|
| 1122 |
"id": "brand_assets_json",
|
|
|
|
| 1182 |
"shows": "Machine-readable release-check summary for validators, mirrors, and public project surfaces.",
|
| 1183 |
"exists": true,
|
| 1184 |
"bytes": 8640,
|
| 1185 |
+
"sha256": "6e54f6828b8fef97e963a9a56bccc91162b8a632f6897743095e32407fa0db98"
|
| 1186 |
},
|
| 1187 |
{
|
| 1188 |
"id": "public_surface_qa",
|
|
|
|
| 1226 |
"volatile": true,
|
| 1227 |
"shows": "Machine-readable report for SEO/social metadata, accessible tab semantics, public links, project links, and clear project presentation.",
|
| 1228 |
"exists": true,
|
| 1229 |
+
"bytes": 7691,
|
| 1230 |
"hash_policy": "existence_and_size_only"
|
| 1231 |
},
|
| 1232 |
{
|
|
|
|
| 1307 |
"volatile": true,
|
| 1308 |
"shows": "Records the last live GitHub/HF URL verification after upload.",
|
| 1309 |
"exists": true,
|
| 1310 |
+
"bytes": 189990,
|
| 1311 |
"hash_policy": "existence_and_size_only"
|
| 1312 |
},
|
| 1313 |
{
|
|
|
|
| 1340 |
"surface": "website_hf",
|
| 1341 |
"shows": "Machine-readable reproduction steps with expected artifacts and public boundaries.",
|
| 1342 |
"exists": true,
|
| 1343 |
+
"bytes": 6836,
|
| 1344 |
+
"sha256": "3f1e1615c6c0853d21bc14a8eab20af3757ecc443e72dab7744b3c0ec149fa87"
|
| 1345 |
},
|
| 1346 |
{
|
| 1347 |
"id": "artifact_index_builder",
|
|
|
|
| 1351 |
"surface": "repo_hf",
|
| 1352 |
"shows": "Generates the selective artifact catalog from local files.",
|
| 1353 |
"exists": true,
|
| 1354 |
+
"bytes": 68279,
|
| 1355 |
+
"sha256": "69b43ad5d3dc5a6893c4592fa47fff6a7a87691728ec2c61b121ec262d00bf2a"
|
| 1356 |
},
|
| 1357 |
{
|
| 1358 |
"id": "publication_audit",
|
|
|
|
| 1410 |
"surface": "website_hf",
|
| 1411 |
"shows": "Lists public URLs, upstream sources, and machine-readable project metadata.",
|
| 1412 |
"exists": true,
|
| 1413 |
+
"bytes": 5739,
|
| 1414 |
+
"sha256": "d972f30552dd346ec296f88d004c70bf2fb99e92e44ddc8d3a6dad5634f0336d"
|
| 1415 |
},
|
| 1416 |
{
|
| 1417 |
"id": "task_summary",
|
|
|
|
| 1474 |
"path": "results/episode_task_suite/neural_mlp",
|
| 1475 |
"kind": "result_directory",
|
| 1476 |
"surface": "repo_hf_model",
|
| 1477 |
+
"shows": "Stores matching PyTorch MLP results for the walkthrough-backed task contracts.",
|
| 1478 |
"exists": true,
|
| 1479 |
"file_count": 60,
|
| 1480 |
"bytes": 90609517
|
|
|
|
| 1485 |
"path": "results/episode_task_suite/research_directions/research_direction_taxonomy.json",
|
| 1486 |
"kind": "taxonomy",
|
| 1487 |
"surface": "repo_hf",
|
| 1488 |
+
"shows": "Maps the walkthrough-backed tasks to the four Ropedia research directions as direct/proxy/diagnostic.",
|
| 1489 |
"exists": true,
|
| 1490 |
"bytes": 25046,
|
| 1491 |
"sha256": "0e3c442e5eb9057b04b1e8c8fa723dfde6f72e7fae1378d5ea022d93f7d25ca3"
|
|
|
|
| 1509 |
"surface": "repo_hf",
|
| 1510 |
"shows": "Stores the historical result bundle for provenance rows with minimal and neural baselines aligned to the same 20-task window/split setup.",
|
| 1511 |
"exists": true,
|
| 1512 |
+
"bytes": 33575,
|
| 1513 |
+
"sha256": "d6d2f851325a691e77aed6d948f7355b16cf8d81ca35bf115e7309a7b7308efd"
|
| 1514 |
},
|
| 1515 |
{
|
| 1516 |
"id": "tier2_task_suite_json",
|
|
|
|
| 1520 |
"surface": "website_hf",
|
| 1521 |
"shows": "Machine-readable provenance definitions, setup alignment, metrics, and public source paths; the file name is historical.",
|
| 1522 |
"exists": true,
|
| 1523 |
+
"bytes": 33575,
|
| 1524 |
+
"sha256": "d6d2f851325a691e77aed6d948f7355b16cf8d81ca35bf115e7309a7b7308efd"
|
| 1525 |
},
|
| 1526 |
{
|
| 1527 |
"id": "tier2_task_suite_chart",
|
|
|
|
| 1531 |
"surface": "website_hf",
|
| 1532 |
"shows": "Visual summary of the historical provenance baseline metrics inside the unified 20-task suite.",
|
| 1533 |
"exists": true,
|
| 1534 |
+
"bytes": 5453,
|
| 1535 |
+
"sha256": "e9da29c57f42b29a7a05622fee1335089ac2b6fc9692a3b49fa5b753904db9dc"
|
| 1536 |
},
|
| 1537 |
{
|
| 1538 |
"id": "tier2_task_suite_builder",
|
|
|
|
| 1542 |
"surface": "repo_hf",
|
| 1543 |
"shows": "Regenerates the historical provenance rows from shared windows plus the local public-sample annotation HDF5; the script name is historical.",
|
| 1544 |
"exists": true,
|
| 1545 |
+
"bytes": 47155,
|
| 1546 |
+
"sha256": "569f05c1299f5186778ec75280188969fe1a5a76ae8553738fd44fc2faaab195"
|
| 1547 |
},
|
| 1548 |
{
|
| 1549 |
"id": "task_walkthroughs",
|
|
|
|
| 1564 |
"surface": "website_hf",
|
| 1565 |
"shows": "Presents the task suite and sample modality thumbnails with metrics generated from committed files.",
|
| 1566 |
"exists": true,
|
| 1567 |
+
"bytes": 1897278,
|
| 1568 |
+
"sha256": "71b1ab150e952cf902488226c65b3822d8016974f63d111204c1eb1a7745faad"
|
| 1569 |
},
|
| 1570 |
{
|
| 1571 |
"id": "modality_atlas",
|
|
|
|
| 1672 |
"path": "results/omni_finetune/multi_episode_128_task_baselines/BASELINE_ALIGNMENT_REPORT.md",
|
| 1673 |
"kind": "scaleup_status",
|
| 1674 |
"surface": "repo_hf",
|
| 1675 |
+
"shows": "Summarizes same-split simple and neural metadata baselines for the walkthrough-backed task ids, with unsupported markers for tasks that need missing raw 128 feature blocks.",
|
| 1676 |
"exists": true,
|
| 1677 |
"bytes": 2238,
|
| 1678 |
"sha256": "c70440aa502ec569a840159ab7e05b8e7d4ed70e0091ad9a4b2fb3fb0d3803c1"
|
|
|
|
| 1696 |
"surface": "repo_hf",
|
| 1697 |
"shows": "Reader-facing comparison of the single-episode task suite, 128-episode aligned baselines, Qwen3-Omni packages, and Cosmos3 future-window branch.",
|
| 1698 |
"exists": true,
|
| 1699 |
+
"bytes": 15997,
|
| 1700 |
+
"sha256": "c8296c51eb1d67d155b84e3a39f703642d30e855fee7ee7d6ca437966b5c760b"
|
| 1701 |
},
|
| 1702 |
{
|
| 1703 |
"id": "omni_model_comparison_json",
|
|
|
|
| 1707 |
"surface": "repo_hf",
|
| 1708 |
"shows": "Machine-readable comparison of the current result versions, per-task aligned baselines, verified Qwen3 packages, and Cosmos3 package.",
|
| 1709 |
"exists": true,
|
| 1710 |
+
"bytes": 82102,
|
| 1711 |
+
"sha256": "6b246dbdb2685efdc9d0a92bb8c446a89523a1787ebc8a883805b4179e266dd1"
|
| 1712 |
},
|
| 1713 |
{
|
| 1714 |
"id": "cosmos3_nano_verified_summary",
|
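The artifact index entries above record `exists`, `bytes`, and (for non-volatile files) `sha256`, while volatile entries carry `"hash_policy": "existence_and_size_only"`. As a minimal sketch of how one such entry could be checked against local files, assuming each entry has a `path` key relative to some repository root: the function name `check_entry` and the `root` argument are hypothetical and are not taken from the project's own builder or validator scripts.

```python
import hashlib
import tempfile
from pathlib import Path


def check_entry(entry: dict, root: Path) -> list[str]:
    """Return mismatch descriptions for one artifact-index entry (hypothetical helper)."""
    path = root / entry["path"]
    problems = []
    # The recorded "exists" flag should agree with the file system.
    if path.exists() != entry.get("exists", True):
        problems.append("exists flag does not match the file system")
        return problems
    if not path.is_file():
        # Directory entries (those with "file_count") are not hashed in this sketch.
        return problems
    data = path.read_bytes()
    if "bytes" in entry and len(data) != entry["bytes"]:
        problems.append(f"size {len(data)} != recorded {entry['bytes']}")
    # Volatile entries are checked for existence and size only.
    if entry.get("hash_policy") != "existence_and_size_only" and "sha256" in entry:
        if hashlib.sha256(data).hexdigest() != entry["sha256"]:
            problems.append("sha256 mismatch")
    return problems


if __name__ == "__main__":
    # Self-contained demo on a temporary file, not on the real repository.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "demo.json").write_bytes(b"{}")
        entry = {
            "path": "demo.json",
            "exists": True,
            "bytes": 2,
            "sha256": hashlib.sha256(b"{}").hexdigest(),
        }
        print(check_entry(entry, root))
```

Skipping the hash for `existence_and_size_only` entries mirrors the index's own distinction: files such as the live publication status change on every upload, so a pinned digest would go stale immediately.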
data/evaluation_protocol.json
CHANGED
|
@@ -2,7 +2,7 @@
|
|
| 2 |
"title": "Ropedia Xperience-10M Task Suite Evaluation Protocol",
|
| 3 |
"status": "pass",
|
| 4 |
"version": "2026-06-01",
|
| 5 |
-
"generated_at_utc": "2026-06-
|
| 6 |
"source_files": [
|
| 7 |
"docs/data/summary_metrics.json",
|
| 8 |
"results/episode_task_suite/summary_report.json",
|
|
@@ -26,8 +26,8 @@
|
|
| 26 |
"task_suite": {
|
| 27 |
"status": "unified_public_sample_suite",
|
| 28 |
"task_count": 20,
|
| 29 |
-
"
|
| 30 |
-
"
|
| 31 |
"unified_results": "docs/data/task_suite_20.json",
|
| 32 |
"legacy_additional_task_result_path": "docs/data/tier2_task_suite.json",
|
| 33 |
"legacy_path_note": "The tier2_task_suite path is retained for stable links only; it is provenance inside the same 20-task suite."
|
|
@@ -82,7 +82,7 @@
|
|
| 82 |
{
|
| 83 |
"task": "timeline_action",
|
| 84 |
"task_display_name": "Action Recognition",
|
| 85 |
-
"
|
| 86 |
"family": "supervised classification",
|
| 87 |
"unit": "single window",
|
| 88 |
"input": "current 20-frame all-feature window",
|
|
@@ -105,7 +105,7 @@
|
|
| 105 |
{
|
| 106 |
"task": "timeline_subtask",
|
| 107 |
"task_display_name": "Procedure Step Recognition",
|
| 108 |
-
"
|
| 109 |
"family": "supervised classification",
|
| 110 |
"unit": "single window",
|
| 111 |
"input": "current 20-frame all-feature window",
|
|
@@ -128,7 +128,7 @@
|
|
| 128 |
{
|
| 129 |
"task": "transition_detection",
|
| 130 |
"task_display_name": "Action Boundary Detection",
|
| 131 |
-
"
|
| 132 |
"family": "temporal diagnostic",
|
| 133 |
"unit": "single window",
|
| 134 |
"input": "current 20-frame all-feature window",
|
|
@@ -151,7 +151,7 @@
|
|
| 151 |
{
|
| 152 |
"task": "next_action",
|
| 153 |
"task_display_name": "Next-Action Prediction",
|
| 154 |
-
"
|
| 155 |
"family": "short-horizon prediction",
|
| 156 |
"unit": "single window",
|
| 157 |
"input": "current 20-frame all-feature window at time t",
|
|
@@ -174,7 +174,7 @@
|
|
| 174 |
{
|
| 175 |
"task": "hand_trajectory_forecast",
|
| 176 |
"task_display_name": "Hand Trajectory Forecasting",
|
| 177 |
-
"
|
| 178 |
"family": "trajectory regression",
|
| 179 |
"unit": "single window",
|
| 180 |
"input": "current all-feature window",
|
|
@@ -197,7 +197,7 @@
|
|
| 197 |
{
|
| 198 |
"task": "contact_prediction",
|
| 199 |
"task_display_name": "Contact State Prediction",
|
| 200 |
-
"
|
| 201 |
"family": "binary classification",
|
| 202 |
"unit": "single window",
|
| 203 |
"input": "non-contact and non-caption feature blocks",
|
|
@@ -220,7 +220,7 @@
|
|
| 220 |
{
|
| 221 |
"task": "object_relevance",
|
| 222 |
"task_display_name": "Object Relevance Prediction",
|
| 223 |
-
"
|
| 224 |
"family": "multi-label classification",
|
| 225 |
"unit": "single window",
|
| 226 |
"input": "non-caption feature blocks",
|
|
@@ -243,7 +243,7 @@
|
|
| 243 |
{
|
| 244 |
"task": "caption_grounding",
|
| 245 |
"task_display_name": "Language Grounding",
|
| 246 |
-
"
|
| 247 |
"family": "retrieval",
|
| 248 |
"unit": "caption query",
|
| 249 |
"input": "caption object/interaction query plus candidate sensor windows",
|
|
@@ -266,7 +266,7 @@
|
|
| 266 |
{
|
| 267 |
"task": "cross_modal_retrieval",
|
| 268 |
"task_display_name": "Cross-Modal Retrieval",
|
| 269 |
-
"
|
| 270 |
"family": "retrieval",
|
| 271 |
"unit": "sensor query",
|
| 272 |
"input": "motion, IMU, and camera query features",
|
|
@@ -289,7 +289,7 @@
|
|
| 289 |
{
|
| 290 |
"task": "modality_reconstruction",
|
| 291 |
"task_display_name": "Cross-Modal Reconstruction",
|
| 292 |
-
"
|
| 293 |
"family": "cross-modal regression",
|
| 294 |
"unit": "single window",
|
| 295 |
"input": "motion, IMU, and camera features",
|
|
@@ -311,7 +311,7 @@
|
|
| 311 |
{
|
| 312 |
"task": "temporal_order",
|
| 313 |
"task_display_name": "Temporal Order Verification",
|
| 314 |
-
"
|
| 315 |
"family": "pairwise diagnostic",
|
| 316 |
"unit": "adjacent window pair",
|
| 317 |
"input": "two adjacent windows",
|
|
@@ -334,7 +334,7 @@
|
|
| 334 |
{
|
| 335 |
"task": "misalignment_detection",
|
| 336 |
"task_display_name": "Multimodal Synchronization Detection",
|
| 337 |
-
"
|
| 338 |
"family": "pairwise diagnostic",
|
| 339 |
"unit": "paired modality window",
|
| 340 |
"input": "motion side plus visual/depth side",
|
|
@@ -357,7 +357,7 @@
|
|
| 357 |
{
|
| 358 |
"task": "long_horizon_next_action",
|
| 359 |
"task_display_name": "Long-Horizon Next-Action Forecasting",
|
| 360 |
-
"
|
| 361 |
"family": "classification",
|
| 362 |
"unit": "single aligned window",
|
| 363 |
"input": "Current 20-frame non-caption multimodal window.",
|
|
@@ -375,7 +375,7 @@
|
|
| 375 |
{
|
| 376 |
"task": "next_subtask_forecast",
|
| 377 |
"task_display_name": "Long-Horizon Next-Subtask Forecasting",
|
| 378 |
-
"
|
| 379 |
"family": "classification",
|
| 380 |
"unit": "single aligned window",
|
| 381 |
"input": "Current 20-frame non-caption multimodal window.",
|
|
@@ -393,7 +393,7 @@
|
|
| 393 |
{
|
| 394 |
"task": "interaction_text_prediction",
|
| 395 |
"task_display_name": "Interaction Text Prediction",
|
| 396 |
-
"
|
| 397 |
"family": "classification",
|
| 398 |
"unit": "single aligned window",
|
| 399 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
|
@@ -411,7 +411,7 @@
|
|
| 411 |
{
|
| 412 |
"task": "action_object_relation",
|
| 413 |
"task_display_name": "Action-Object Relation Prediction",
|
| 414 |
-
"
|
| 415 |
"family": "classification",
|
| 416 |
"unit": "single aligned window",
|
| 417 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
|
@@ -429,7 +429,7 @@
|
|
| 429 |
{
|
| 430 |
"task": "object_set_forecast",
|
| 431 |
"task_display_name": "Future Object-Set Forecasting",
|
| 432 |
-
"
|
| 433 |
"family": "multi_label",
|
| 434 |
"unit": "single aligned window",
|
| 435 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
|
@@ -447,7 +447,7 @@
|
|
| 447 |
{
|
| 448 |
"task": "imu_to_hand_pose",
|
| 449 |
"task_display_name": "IMU-to-Hand Pose Reconstruction",
|
| 450 |
-
"
|
| 451 |
"family": "regression",
|
| 452 |
"unit": "single aligned window",
|
| 453 |
"input": "Current IMU acceleration/gyroscope feature block only.",
|
|
@@ -465,7 +465,7 @@
|
|
| 465 |
{
|
| 466 |
"task": "camera_view_sync_retrieval",
|
| 467 |
"task_display_name": "Camera-View Synchronization Retrieval",
|
| 468 |
-
"
|
| 469 |
"family": "retrieval",
|
| 470 |
"unit": "held-out query window",
|
| 471 |
"input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
|
|
@@ -483,7 +483,7 @@
|
|
| 483 |
{
|
| 484 |
"task": "time_to_transition",
|
| 485 |
"task_display_name": "Time-to-Next-Transition Regression",
|
| 486 |
-
"
|
| 487 |
"family": "regression",
|
| 488 |
"unit": "single aligned window",
|
| 489 |
"input": "Current 20-frame non-caption multimodal window.",
|
|
|
|
| 2 |
"title": "Ropedia Xperience-10M Task Suite Evaluation Protocol",
|
| 3 |
"status": "pass",
|
| 4 |
"version": "2026-06-01",
|
| 5 |
+
"generated_at_utc": "2026-06-21T15:20:33+00:00",
|
| 6 |
"source_files": [
|
| 7 |
"docs/data/summary_metrics.json",
|
| 8 |
"results/episode_task_suite/summary_report.json",
|
|
|
|
| 26 |
"task_suite": {
|
| 27 |
"status": "unified_public_sample_suite",
|
| 28 |
"task_count": 20,
|
| 29 |
+
"public_framing": "all 20 public-sample task contracts are presented as one suite",
|
| 30 |
+
"legacy_provenance_rows": 8,
|
| 31 |
"unified_results": "docs/data/task_suite_20.json",
|
| 32 |
"legacy_additional_task_result_path": "docs/data/tier2_task_suite.json",
|
| 33 |
"legacy_path_note": "The tier2_task_suite path is retained for stable links only; it is provenance inside the same 20-task suite."
|
|
|
|
| 82 |
{
|
| 83 |
"task": "timeline_action",
|
| 84 |
"task_display_name": "Action Recognition",
|
| 85 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 86 |
"family": "supervised classification",
|
| 87 |
"unit": "single window",
|
| 88 |
"input": "current 20-frame all-feature window",
|
|
|
|
| 105 |
{
|
| 106 |
"task": "timeline_subtask",
|
| 107 |
"task_display_name": "Procedure Step Recognition",
|
| 108 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 109 |
"family": "supervised classification",
|
| 110 |
"unit": "single window",
|
| 111 |
"input": "current 20-frame all-feature window",
|
|
|
|
| 128 |
{
|
| 129 |
"task": "transition_detection",
|
| 130 |
"task_display_name": "Action Boundary Detection",
|
| 131 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 132 |
"family": "temporal diagnostic",
|
| 133 |
"unit": "single window",
|
| 134 |
"input": "current 20-frame all-feature window",
|
|
|
|
| 151 |
{
|
| 152 |
"task": "next_action",
|
| 153 |
"task_display_name": "Next-Action Prediction",
|
| 154 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 155 |
"family": "short-horizon prediction",
|
| 156 |
"unit": "single window",
|
| 157 |
"input": "current 20-frame all-feature window at time t",
|
|
|
|
| 174 |
{
|
| 175 |
"task": "hand_trajectory_forecast",
|
| 176 |
"task_display_name": "Hand Trajectory Forecasting",
|
| 177 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 178 |
"family": "trajectory regression",
|
| 179 |
"unit": "single window",
|
| 180 |
"input": "current all-feature window",
|
|
|
|
| 197 |
{
|
| 198 |
"task": "contact_prediction",
|
| 199 |
"task_display_name": "Contact State Prediction",
|
| 200 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 201 |
"family": "binary classification",
|
| 202 |
"unit": "single window",
|
| 203 |
"input": "non-contact and non-caption feature blocks",
|
|
|
|
| 220 |
{
|
| 221 |
"task": "object_relevance",
|
| 222 |
"task_display_name": "Object Relevance Prediction",
|
| 223 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 224 |
"family": "multi-label classification",
|
| 225 |
"unit": "single window",
|
| 226 |
"input": "non-caption feature blocks",
|
|
|
|
| 243 |
{
|
| 244 |
"task": "caption_grounding",
|
| 245 |
"task_display_name": "Language Grounding",
|
| 246 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 247 |
"family": "retrieval",
|
| 248 |
"unit": "caption query",
|
| 249 |
"input": "caption object/interaction query plus candidate sensor windows",
|
|
|
|
| 266 |
{
|
| 267 |
"task": "cross_modal_retrieval",
|
| 268 |
"task_display_name": "Cross-Modal Retrieval",
|
| 269 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 270 |
"family": "retrieval",
|
| 271 |
"unit": "sensor query",
|
| 272 |
"input": "motion, IMU, and camera query features",
|
|
|
|
| 289 |
{
|
| 290 |
"task": "modality_reconstruction",
|
| 291 |
"task_display_name": "Cross-Modal Reconstruction",
|
| 292 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 293 |
"family": "cross-modal regression",
|
| 294 |
"unit": "single window",
|
| 295 |
"input": "motion, IMU, and camera features",
|
|
|
|
| 311 |
{
|
| 312 |
"task": "temporal_order",
|
| 313 |
"task_display_name": "Temporal Order Verification",
|
| 314 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 315 |
"family": "pairwise diagnostic",
|
| 316 |
"unit": "adjacent window pair",
|
| 317 |
"input": "two adjacent windows",
|
|
|
|
| 334 |
{
|
| 335 |
"task": "misalignment_detection",
|
| 336 |
"task_display_name": "Multimodal Synchronization Detection",
|
| 337 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 338 |
"family": "pairwise diagnostic",
|
| 339 |
"unit": "paired modality window",
|
| 340 |
"input": "motion side plus visual/depth side",
|
|
|
|
| 357 |
{
|
| 358 |
"task": "long_horizon_next_action",
|
| 359 |
"task_display_name": "Long-Horizon Next-Action Forecasting",
|
| 360 |
+
"provenance_source": "historical_result_bundle",
|
| 361 |
"family": "classification",
|
| 362 |
"unit": "single aligned window",
|
| 363 |
"input": "Current 20-frame non-caption multimodal window.",
|
|
|
|
| 375 |
{
|
| 376 |
"task": "next_subtask_forecast",
|
| 377 |
"task_display_name": "Long-Horizon Next-Subtask Forecasting",
|
| 378 |
+
"provenance_source": "historical_result_bundle",
|
| 379 |
"family": "classification",
|
| 380 |
"unit": "single aligned window",
|
| 381 |
"input": "Current 20-frame non-caption multimodal window.",
|
|
|
|
| 393 |
{
|
| 394 |
"task": "interaction_text_prediction",
|
| 395 |
"task_display_name": "Interaction Text Prediction",
|
| 396 |
+
"provenance_source": "historical_result_bundle",
|
| 397 |
"family": "classification",
|
| 398 |
"unit": "single aligned window",
|
| 399 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
|
|
|
| 411 |
{
|
| 412 |
"task": "action_object_relation",
|
| 413 |
"task_display_name": "Action-Object Relation Prediction",
|
| 414 |
+
"provenance_source": "historical_result_bundle",
|
| 415 |
"family": "classification",
|
| 416 |
"unit": "single aligned window",
|
| 417 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
|
|
|
| 429 |
{
|
| 430 |
"task": "object_set_forecast",
|
| 431 |
"task_display_name": "Future Object-Set Forecasting",
|
| 432 |
+
"provenance_source": "historical_result_bundle",
|
| 433 |
"family": "multi_label",
|
| 434 |
"unit": "single aligned window",
|
| 435 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
|
|
|
| 447 |
{
|
| 448 |
"task": "imu_to_hand_pose",
|
| 449 |
"task_display_name": "IMU-to-Hand Pose Reconstruction",
|
| 450 |
+
"provenance_source": "historical_result_bundle",
|
| 451 |
"family": "regression",
|
| 452 |
"unit": "single aligned window",
|
| 453 |
"input": "Current IMU acceleration/gyroscope feature block only.",
|
|
|
|
| 465 |
{
|
| 466 |
"task": "camera_view_sync_retrieval",
|
| 467 |
"task_display_name": "Camera-View Synchronization Retrieval",
|
| 468 |
+
"provenance_source": "historical_result_bundle",
|
| 469 |
"family": "retrieval",
|
| 470 |
"unit": "held-out query window",
|
| 471 |
"input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
|
|
|
|
| 483 |
{
|
| 484 |
"task": "time_to_transition",
|
| 485 |
"task_display_name": "Time-to-Next-Transition Regression",
|
| 486 |
+
"provenance_source": "historical_result_bundle",
|
| 487 |
"family": "regression",
|
| 488 |
"unit": "single aligned window",
|
| 489 |
"input": "Current 20-frame non-caption multimodal window.",
|
data/live_publication_status.json
CHANGED
|
The diff for this file is too large to render.
|
|
|
data/mirror_parity.json
CHANGED
|
The diff for this file is too large to render.
|
|
|
data/omni_model_comparison.json
CHANGED
|
@@ -1,12 +1,12 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
|
| 3 |
-
"generated_at_utc": "2026-06-
|
| 4 |
"status": "pass",
|
| 5 |
"version_count": 3,
|
| 6 |
"model_group_count": 5,
|
| 7 |
"comparison_rule": "Compare only rows with the same scope and target. Single-episode raw-feature metrics, 128-episode metadata baselines, Qwen3 structured JSON metrics, and the two Cosmos3 targets answer different questions: Nano future-window retrieval versus Super structured JSON Reasoner evaluation.",
|
| 8 |
"version_reading_notes": [
|
| 9 |
-
"Version 1 is the public-sample 20-task surface:
|
| 10 |
"Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
|
| 11 |
"The selected-128 model-diagnostic group contains the current Qwen3-Omni LoRA JSON-task row, Cosmos3-Nano future-window compatibility result, Cosmos3-Super Reasoner base-weight JSON-task evaluation, and the separate Cosmos3-Super Forward-Dynamics LoRA adapter artifact."
|
| 12 |
],
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
|
| 3 |
+
"generated_at_utc": "2026-06-21T15:17:00+00:00",
|
| 4 |
"status": "pass",
|
| 5 |
"version_count": 3,
|
| 6 |
"model_group_count": 5,
|
| 7 |
"comparison_rule": "Compare only rows with the same scope and target. Single-episode raw-feature metrics, 128-episode metadata baselines, Qwen3 structured JSON metrics, and the two Cosmos3 targets answer different questions: Nano future-window retrieval versus Super structured JSON Reasoner evaluation.",
|
| 8 |
"version_reading_notes": [
|
| 9 |
+
"Version 1 is the public-sample 20-task surface: unified task heads, historical provenance rows, and the 180-row method-task matrix.",
|
| 10 |
"Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
|
| 11 |
"The selected-128 model-diagnostic group contains the current Qwen3-Omni LoRA JSON-task row, Cosmos3-Nano future-window compatibility result, Cosmos3-Super Reasoner base-weight JSON-task evaluation, and the separate Cosmos3-Super Forward-Dynamics LoRA adapter artifact."
|
| 12 |
],
|
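The `comparison_rule` above says to compare only rows that share the same scope and target. A minimal sketch of that rule follows, assuming each result row is a dict with hypothetical `name`, `scope`, and `target` keys; these key names and the placeholder rows are illustrative and are not taken from `omni_model_comparison.json`.

```python
from collections import defaultdict


def comparable_groups(rows):
    """Group result rows by (scope, target) and keep groups with 2+ rows.

    Rows that end up alone in their group have nothing they can be
    fairly compared against under the same-scope, same-target rule.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["scope"], row["target"])].append(row["name"])
    return {key: names for key, names in groups.items() if len(names) > 1}


# Hypothetical rows: labels are placeholders, not published results.
rows = [
    {"name": "baseline_a", "scope": "single_episode", "target": "raw_feature"},
    {"name": "baseline_b", "scope": "single_episode", "target": "raw_feature"},
    {"name": "model_c", "scope": "selected_128", "target": "structured_json"},
]
print(comparable_groups(rows))
```

Grouping before comparing keeps the rule mechanical: a row without a same-scope, same-target partner simply never appears in a comparison.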
data/project_manifest.json
CHANGED
|
@@ -23,9 +23,8 @@
|
|
| 23 |
"qwen3_omni_json_quality_target_met": true,
|
| 24 |
"qwen3_omni_lora_adapter_repo": "https://huggingface.co/cy0307/ropedia-qwen3-omni-lora-128ep",
|
| 25 |
"task_count": 20,
|
| 26 |
-
"
|
| 27 |
-
"
|
| 28 |
-
"legacy_tasks_13_to_20_result_path": "docs/data/tier2_task_suite.json"
|
| 29 |
},
|
| 30 |
"public_surfaces": {
|
| 31 |
"github_repo": "https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite",
|
|
@@ -96,7 +95,7 @@
|
|
| 96 |
"task_walkthroughs": "docs/data/task_walkthroughs.json",
|
| 97 |
"task_suite_20": "TASK_SUITE_20.md",
|
| 98 |
"task_suite_20_json": "docs/data/task_suite_20.json",
|
| 99 |
-
"
|
| 100 |
},
|
| 101 |
"citation_files": {
|
| 102 |
"citation_cff": "CITATION.cff",
|
|
|
|
| 23 |
"qwen3_omni_json_quality_target_met": true,
|
| 24 |
"qwen3_omni_lora_adapter_repo": "https://huggingface.co/cy0307/ropedia-qwen3-omni-lora-128ep",
|
| 25 |
"task_count": 20,
|
| 26 |
+
"task_surface_framing": "unified_20_task_suite",
|
| 27 |
+
"legacy_provenance_result_path": "docs/data/tier2_task_suite.json"
|
|
|
|
| 28 |
},
|
| 29 |
"public_surfaces": {
|
| 30 |
"github_repo": "https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite",
|
|
|
|
| 95 |
"task_walkthroughs": "docs/data/task_walkthroughs.json",
|
| 96 |
"task_suite_20": "TASK_SUITE_20.md",
|
| 97 |
"task_suite_20_json": "docs/data/task_suite_20.json",
|
| 98 |
+
"historical_provenance_result_bundle": "docs/data/tier2_task_suite.json"
|
| 99 |
},
|
| 100 |
"citation_files": {
|
| 101 |
"citation_cff": "CITATION.cff",
|
data/project_packet.json
CHANGED
|
@@ -15,9 +15,8 @@
|
|
| 15 |
"cosmos3_super_forward_dynamics_lora_status": "The first Cosmos3-Super fine-tuned adapter branch is verified as a forward-dynamics LoRA over camera-pose proxy targets; it reports loss metrics, not JSON action-label accuracy.",
|
| 16 |
"task_suite_enhancement_128_status": "Current no-new-episode enhancement pack recommends multiscale_20s10_40s20_80s40, hierarchical action/subtask targets, label-normalized scoring, and raw-feature shards before adding more episodes.",
|
| 17 |
"task_count": 20,
|
| 18 |
-
"
|
| 19 |
-
"
|
| 20 |
-
"legacy_tasks_13_to_20_result_path": "docs/data/tier2_task_suite.json"
|
| 21 |
},
|
| 22 |
"reading_path": [
|
| 23 |
{
|
|
@@ -110,7 +109,7 @@
|
|
| 110 |
"results/episode_task_suite/neural_mlp/",
|
| 111 |
"docs/data/summary_metrics.json"
|
| 112 |
],
|
| 113 |
-
"readout": "The unified suite has 20 task contracts
|
| 114 |
},
|
| 115 |
{
|
| 116 |
"step": 8,
|
|
|
|
| 15 |
"cosmos3_super_forward_dynamics_lora_status": "The first Cosmos3-Super fine-tuned adapter branch is verified as a forward-dynamics LoRA over camera-pose proxy targets; it reports loss metrics, not JSON action-label accuracy.",
|
| 16 |
"task_suite_enhancement_128_status": "Current no-new-episode enhancement pack recommends multiscale_20s10_40s20_80s40, hierarchical action/subtask targets, label-normalized scoring, and raw-feature shards before adding more episodes.",
|
| 17 |
"task_count": 20,
|
| 18 |
+
"task_surface_framing": "unified_20_task_suite",
|
| 19 |
+
"legacy_provenance_result_path": "docs/data/tier2_task_suite.json"
|
|
|
|
| 20 |
},
|
| 21 |
"reading_path": [
|
| 22 |
{
|
|
|
|
| 109 |
"results/episode_task_suite/neural_mlp/",
|
| 110 |
"docs/data/summary_metrics.json"
|
| 111 |
],
|
| 112 |
+
"readout": "The unified suite has 20 task contracts in one task surface. Walkthrough-backed tasks, aligned minimal/neural result bundles, and historical tier2_task_suite provenance paths are all linked from TASK_SUITE_20.md and docs/data/task_suite_20.json."
|
| 113 |
},
|
| 114 |
{
|
| 115 |
"step": 8,
|
data/project_status.json
CHANGED
|
@@ -62,9 +62,8 @@
|
|
| 62 |
"task_suite_enhancement_128_recommended_export": "multiscale_20s10_40s20_80s40",
|
| 63 |
"task_suite_enhancement_128_estimated_windows": 106095,
|
| 64 |
"task_count": 20,
|
| 65 |
-
"
|
| 66 |
-
"
|
| 67 |
-
"legacy_tasks_13_to_20_result_path": "docs/data/tier2_task_suite.json"
|
| 68 |
},
|
| 69 |
"rows": [
|
| 70 |
{
|
|
@@ -86,7 +85,7 @@
|
|
| 86 |
"results/episode_task_suite/",
|
| 87 |
"results/episode_task_suite/tier2_task_suite/"
|
| 88 |
],
|
| 89 |
-
"readout": "All 20 task contracts
|
| 90 |
},
|
| 91 |
{
|
| 92 |
"area": "180-result method matrix",
|
|
@@ -116,7 +115,7 @@
|
|
| 116 |
"results/audio_ablation/",
|
| 117 |
"docs/data/audio_ablation_summary.json"
|
| 118 |
],
|
| 119 |
-
"readout": "Audio variants improve the primary metric on 6
|
| 120 |
},
|
| 121 |
{
|
| 122 |
"area": "Evaluation protocol",
|
|
@@ -355,7 +354,7 @@
|
|
| 355 |
"The Cosmos3-Nano future-window package is verified as a compatibility adapter result, Cosmos3-Super Reasoner is verified as a base-weight evaluation, and Cosmos3-Super Forward-Dynamics LoRA is verified as the first fine-tuned Super adapter artifact. Cosmos3-Super adapter weights belong in cy0307/ropedia-cosmos3-super-forward-dynamics-lora-128ep; verified_public packages exclude safetensors.",
|
| 356 |
"The current reconstruction task reconstructs feature vectors, not pixel-depth, mesh, NeRF, or Gaussian reconstruction.",
|
| 357 |
"Audio is one of the synchronized source modalities in the current task representation.",
|
| 358 |
-
"The audio ablation report compares audio/no-audio variants across the
|
| 359 |
"Foundation-model selection is explicit: Qwen3-Omni is the structured JSON baseline, Cosmos 3 is the world-model track with Nano compatibility and Super forward-dynamics LoRA results, and policy models such as OpenVLA/openpi/GR00T wait for robot-compatible action-target conversion.",
|
| 360 |
"Future model tracks should be added through the backbone registry and verified package contract, not as one-off result folders with incompatible metrics or publication rules.",
|
| 361 |
"The Xperience Embodied Foundation Model is a future native-pretraining goal, not a completed model or current benchmark."
|
|
|
|
| 62 |
"task_suite_enhancement_128_recommended_export": "multiscale_20s10_40s20_80s40",
|
| 63 |
"task_suite_enhancement_128_estimated_windows": 106095,
|
| 64 |
"task_count": 20,
|
| 65 |
+
"task_surface_framing": "unified_20_task_suite",
|
| 66 |
+
"legacy_provenance_result_path": "docs/data/tier2_task_suite.json"
|
|
|
|
| 67 |
},
|
| 68 |
"rows": [
|
| 69 |
{
|
|
|
|
| 85 |
"results/episode_task_suite/",
|
| 86 |
"results/episode_task_suite/tier2_task_suite/"
|
| 87 |
],
|
| 88 |
+
"readout": "All 20 task contracts are presented together with committed minimal metrics, the same 20-frame windows, 5-frame stride, chronological split, and minimal/neural head pattern. The tier2_task_suite path is historical provenance inside the suite, not a separate public tier."
|
| 89 |
},
|
| 90 |
{
|
| 91 |
"area": "180-result method matrix",
|
|
|
|
| 115 |
"results/audio_ablation/",
|
| 116 |
"docs/data/audio_ablation_summary.json"
|
| 117 |
],
|
| 118 |
+
"readout": "Audio variants improve the primary metric on 6 walkthrough-backed task contracts in this single-episode setting."
|
| 119 |
},
|
| 120 |
{
|
| 121 |
"area": "Evaluation protocol",
|
|
|
|
| 354 |
"The Cosmos3-Nano future-window package is verified as a compatibility adapter result, Cosmos3-Super Reasoner is verified as a base-weight evaluation, and Cosmos3-Super Forward-Dynamics LoRA is verified as the first fine-tuned Super adapter artifact. Cosmos3-Super adapter weights belong in cy0307/ropedia-cosmos3-super-forward-dynamics-lora-128ep; verified_public packages exclude safetensors.",
|
| 355 |
"The current reconstruction task reconstructs feature vectors, not pixel-depth, mesh, NeRF, or Gaussian reconstruction.",
|
| 356 |
"Audio is one of the synchronized source modalities in the current task representation.",
|
| 357 |
+
"The audio ablation report compares audio/no-audio variants across the walkthrough-backed task contracts in results/audio_ablation/.",
|
| 358 |
"Foundation-model selection is explicit: Qwen3-Omni is the structured JSON baseline, Cosmos 3 is the world-model track with Nano compatibility and Super forward-dynamics LoRA results, and policy models such as OpenVLA/openpi/GR00T wait for robot-compatible action-target conversion.",
|
| 359 |
"Future model tracks should be added through the backbone registry and verified package contract, not as one-off result folders with incompatible metrics or publication rules.",
|
| 360 |
"The Xperience Embodied Foundation Model is a future native-pretraining goal, not a completed model or current benchmark."
|
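The readout above refers to 20-frame windows, a 5-frame stride, and a chronological split. The sketch below illustrates how such windows and a 70/30 chronological split could be derived. It is an assumption-laden illustration, not the project's `episode_task_suite.py` logic: the function names, the 100-frame example length, and the rounding at the split point are all hypothetical.

```python
def window_starts(n_frames, window=20, stride=5):
    """Start indices of every full window that fits inside n_frames."""
    return list(range(0, n_frames - window + 1, stride))


def chronological_split(starts, train_fraction=0.7):
    """Earlier windows go to train, later windows to test (no shuffling)."""
    cut = int(len(starts) * train_fraction)
    return starts[:cut], starts[cut:]


# Hypothetical 100-frame episode, used only to show the shapes involved.
starts = window_starts(100)
train, test = chronological_split(starts)
print(len(starts), len(train), len(test))
```

Because the stride is shorter than the window, neighbouring windows overlap, so the last training window and the first test window share frames in this sketch. Whether the real pipeline leaves a gap at the boundary is not stated in the diff.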
data/publication_audit.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-
|
| 4 |
"checks": [
|
| 5 |
{
|
| 6 |
"name": "required_publication_assets_present",
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-21T15:22:42+00:00",
|
| 4 |
"checks": [
|
| 5 |
{
|
| 6 |
"name": "required_publication_assets_present",
|
data/quality_gates.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Release Checks",
|
| 3 |
"status": "pass",
|
| 4 |
-
"generated_at_utc": "2026-06-
|
| 5 |
"rule": "A release is current when the automated reports pass and the live GitHub/Hugging Face mirrors are verified after publishing.",
|
| 6 |
"automated_gates": [
|
| 7 |
{
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Release Checks",
|
| 3 |
"status": "pass",
|
| 4 |
+
"generated_at_utc": "2026-06-21T15:21:42+00:00",
|
| 5 |
"rule": "A release is current when the automated reports pass and the live GitHub/Hugging Face mirrors are verified after publishing.",
|
| 6 |
"automated_gates": [
|
| 7 |
{
|
data/reproducibility_matrix.json
CHANGED
|
@@ -39,7 +39,7 @@
|
|
| 39 |
"id": "original_task_suite",
|
| 40 |
"status": "reproducible",
|
| 41 |
"command": "python scripts/episode_task_suite.py --workspace $WORKSPACE --include-neural",
|
| 42 |
-
"expected": "
|
| 43 |
"boundary": "8,546-dimensional multimodal window contract"
|
| 44 |
},
|
| 45 |
{
|
|
@@ -50,11 +50,11 @@
|
|
| 50 |
"boundary": "single-episode probes, not full research-direction solutions"
|
| 51 |
},
|
| 52 |
{
|
| 53 |
-
"id": "
|
| 54 |
"status": "reproducible",
|
| 55 |
"command": "python scripts/tier2_task_suite.py && python scripts/build_unified_task_suite.py && python scripts/build_unified_task_model_radar.py",
|
| 56 |
-
"expected": "
|
| 57 |
-
"boundary": "requires local public-sample annotation.hdf5 plus HOMIE Toolkit or h5py for
|
| 58 |
},
|
| 59 |
{
|
| 60 |
"id": "source_alignment_audit",
|
|
|
|
| 39 |
"id": "original_task_suite",
|
| 40 |
"status": "reproducible",
|
| 41 |
"command": "python scripts/episode_task_suite.py --workspace $WORKSPACE --include-neural",
|
| 42 |
+
"expected": "walkthrough-backed task metrics, predictions, manifests, and neural_mlp task-head artifacts",
|
| 43 |
"boundary": "8,546-dimensional multimodal window contract"
|
| 44 |
},
|
| 45 |
{
|
|
|
|
| 50 |
"boundary": "single-episode probes, not full research-direction solutions"
|
| 51 |
},
|
| 52 |
{
|
| 53 |
+
"id": "unified_20_task_index",
|
| 54 |
"status": "reproducible",
|
| 55 |
"command": "python scripts/tier2_task_suite.py && python scripts/build_unified_task_suite.py && python scripts/build_unified_task_model_radar.py",
|
| 56 |
+
"expected": "unified 20-task metrics, prediction/rank artifacts, TASK_SUITE_20.md, docs/data/task_suite_20.json, docs/data/tier2_task_suite.json, docs/assets/charts/tier2_task_suite.svg, docs/data/unified_task_model_radar.json, and docs/assets/charts/unified_task_model_radar.svg",
|
| 57 |
+
"boundary": "requires local public-sample annotation.hdf5 plus HOMIE Toolkit or h5py for full public-task regeneration; raw HDF5 and MP4 files are not redistributed"
|
| 58 |
},
|
| 59 |
{
|
| 60 |
"id": "source_alignment_audit",
|
data/research_takeaways.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Research Takeaways",
|
| 3 |
"status": "pass",
|
| 4 |
-
"generated_at_utc": "2026-06-
|
| 5 |
"source_files": [
|
| 6 |
"docs/data/summary_metrics.json",
|
| 7 |
"results/episode_task_suite/summary_report.json",
|
|
@@ -133,7 +133,7 @@
|
|
| 133 |
{
|
| 134 |
"id": "audio_contribution_is_task_specific",
|
| 135 |
"title": "Audio helps some tasks and hurts others on the public sample",
|
| 136 |
-
"readout": "Audio improves the primary metric on 6
|
| 137 |
"evidence": [
|
| 138 |
{
|
| 139 |
"label": "tasks_where_current_audio_improves",
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Research Takeaways",
|
| 3 |
"status": "pass",
|
| 4 |
+
"generated_at_utc": "2026-06-21T15:18:59+00:00",
|
| 5 |
"source_files": [
|
| 6 |
"docs/data/summary_metrics.json",
|
| 7 |
"results/episode_task_suite/summary_report.json",
|
|
|
|
| 133 |
{
|
| 134 |
"id": "audio_contribution_is_task_specific",
|
| 135 |
"title": "Audio helps some tasks and hurts others on the public sample",
|
| 136 |
+
"readout": "Audio improves the primary metric on 6 walkthrough-backed task contracts, while raw log-mel replacement improves over the current handcrafted block on 6 of those contracts. The largest current-audio gain appears in feature reconstruction, not in action classification.",
|
| 137 |
"evidence": [
|
| 138 |
{
|
| 139 |
"label": "tasks_where_current_audio_improves",
|
data/scope_claims_audit.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-
|
| 4 |
"summary": {
|
| 5 |
"qwen3_omni_verified_diagnostic_pilot": true,
|
| 6 |
"dataset_manifest_num_episodes": 119,
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-21T15:23:13+00:00",
|
| 4 |
"summary": {
|
| 5 |
"qwen3_omni_verified_diagnostic_pilot": true,
|
| 6 |
"dataset_manifest_num_episodes": 119,
|
data/single_episode_task_model_radar.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
{
|
| 2 |
"title": "Single-Episode 20-Task Radar",
|
| 3 |
"status": "pass",
|
| 4 |
-
"generated_at_utc": "2026-06-
|
| 5 |
"description": "Minimal and Neural MLP baselines on the one public sample episode, both scored on all 20 task contracts.",
|
| 6 |
"task_count": 20,
|
| 7 |
"method_count": 2,
|
|
@@ -73,7 +73,7 @@
|
|
| 73 |
"label": "Action Recognition",
|
| 74 |
"axis_label": "01 Action Recognition",
|
| 75 |
"short_label": "Action",
|
| 76 |
-
"
|
| 77 |
"metric_key": "macro_f1",
|
| 78 |
"metric_name": "macro-F1",
|
| 79 |
"metric_direction": "higher",
|
|
@@ -107,7 +107,7 @@
|
|
| 107 |
"label": "Procedure Step Recognition",
|
| 108 |
"axis_label": "02 Procedure Step Recognition",
|
| 109 |
"short_label": "Step",
|
| 110 |
-
"
|
| 111 |
"metric_key": "macro_f1",
|
| 112 |
"metric_name": "macro-F1",
|
| 113 |
"metric_direction": "higher",
|
|
@@ -141,7 +141,7 @@
|
|
| 141 |
"label": "Action Boundary Detection",
|
| 142 |
"axis_label": "03 Action Boundary Detection",
|
| 143 |
"short_label": "Boundary",
|
| 144 |
-
"
|
| 145 |
"metric_key": "macro_f1",
|
| 146 |
"metric_name": "macro-F1",
|
| 147 |
"metric_direction": "higher",
|
|
@@ -175,7 +175,7 @@
|
|
| 175 |
"label": "Next-Action Prediction",
|
| 176 |
"axis_label": "04 Next-Action Prediction",
|
| 177 |
"short_label": "Next act",
|
| 178 |
-
"
|
| 179 |
"metric_key": "macro_f1",
|
| 180 |
"metric_name": "macro-F1",
|
| 181 |
"metric_direction": "higher",
|
|
@@ -209,7 +209,7 @@
|
|
| 209 |
"label": "Hand Trajectory Forecasting",
|
| 210 |
"axis_label": "05 Hand Trajectory Forecasting",
|
| 211 |
"short_label": "Hand traj",
|
| 212 |
-
"
|
| 213 |
"metric_key": "mpjpe",
|
| 214 |
"metric_name": "MPJPE",
|
| 215 |
"metric_direction": "lower",
|
|
@@ -243,7 +243,7 @@
|
|
| 243 |
"label": "Contact State Prediction",
|
| 244 |
"axis_label": "06 Contact State Prediction",
|
| 245 |
"short_label": "Contact",
|
| 246 |
-
"
|
| 247 |
"metric_key": "macro_f1",
|
| 248 |
"metric_name": "macro-F1",
|
| 249 |
"metric_direction": "higher",
|
|
@@ -277,7 +277,7 @@
|
|
| 277 |
"label": "Object Relevance Prediction",
|
| 278 |
"axis_label": "07 Object Relevance Prediction",
|
| 279 |
"short_label": "Objects",
|
| 280 |
-
"
|
| 281 |
"metric_key": "micro_f1",
|
| 282 |
"metric_name": "micro-F1",
|
| 283 |
"metric_direction": "higher",
|
|
@@ -311,7 +311,7 @@
|
|
| 311 |
"label": "Language Grounding",
|
| 312 |
"axis_label": "08 Language Grounding",
|
| 313 |
"short_label": "Language",
|
| 314 |
-
"
|
| 315 |
"metric_key": "mrr",
|
| 316 |
"metric_name": "MRR",
|
| 317 |
"metric_direction": "higher",
|
|
@@ -345,7 +345,7 @@
|
|
| 345 |
"label": "Cross-Modal Retrieval",
|
| 346 |
"axis_label": "09 Cross-Modal Retrieval",
|
| 347 |
"short_label": "X-modal",
|
| 348 |
-
"
|
| 349 |
"metric_key": "mrr",
|
| 350 |
"metric_name": "MRR",
|
| 351 |
"metric_direction": "higher",
|
|
@@ -379,7 +379,7 @@
|
|
| 379 |
"label": "Cross-Modal Reconstruction",
|
| 380 |
"axis_label": "10 Cross-Modal Reconstruction",
|
| 381 |
"short_label": "Recon",
|
| 382 |
-
"
|
| 383 |
"metric_key": "r2",
|
| 384 |
"metric_name": "R2",
|
| 385 |
"metric_direction": "higher",
|
|
@@ -413,7 +413,7 @@
|
|
| 413 |
"label": "Temporal Order Verification",
|
| 414 |
"axis_label": "11 Temporal Order Verification",
|
| 415 |
"short_label": "Order",
|
| 416 |
-
"
|
| 417 |
"metric_key": "f1",
|
| 418 |
"metric_name": "F1",
|
| 419 |
"metric_direction": "higher",
|
|
@@ -447,7 +447,7 @@
|
|
| 447 |
"label": "Multimodal Synchronization Detection",
|
| 448 |
"axis_label": "12 Multimodal Synchronization Detection",
|
| 449 |
"short_label": "Sync",
|
| 450 |
-
"
|
| 451 |
"metric_key": "f1",
|
| 452 |
"metric_name": "F1",
|
| 453 |
"metric_direction": "higher",
|
|
@@ -481,7 +481,7 @@
|
|
| 481 |
"label": "Long-Horizon Next-Action Forecasting",
|
| 482 |
"axis_label": "13 Long-Horizon Next-Action Forecasting",
|
| 483 |
"short_label": "Long act",
|
| 484 |
-
"
|
| 485 |
"metric_key": "macro_f1",
|
| 486 |
"metric_name": "macro-F1",
|
| 487 |
"metric_direction": "higher",
|
|
@@ -515,7 +515,7 @@
|
|
| 515 |
"label": "Long-Horizon Next-Subtask Forecasting",
|
| 516 |
"axis_label": "14 Long-Horizon Next-Subtask Forecasting",
|
| 517 |
"short_label": "Long step",
|
| 518 |
-
"
|
| 519 |
"metric_key": "macro_f1",
|
| 520 |
"metric_name": "macro-F1",
|
| 521 |
"metric_direction": "higher",
|
|
@@ -549,7 +549,7 @@
|
|
| 549 |
"label": "Interaction Text Prediction",
|
| 550 |
"axis_label": "15 Interaction Text Prediction",
|
| 551 |
"short_label": "Interact txt",
|
| 552 |
-
"
|
| 553 |
"metric_key": "macro_f1",
|
| 554 |
"metric_name": "macro-F1",
|
| 555 |
"metric_direction": "higher",
|
|
@@ -583,7 +583,7 @@
|
|
| 583 |
"label": "Action-Object Relation Prediction",
|
| 584 |
"axis_label": "16 Action-Object Relation Prediction",
|
| 585 |
"short_label": "Act+obj",
|
| 586 |
-
"
|
| 587 |
"metric_key": "macro_f1",
|
| 588 |
"metric_name": "macro-F1",
|
| 589 |
"metric_direction": "higher",
|
|
@@ -617,7 +617,7 @@
|
|
| 617 |
"label": "Future Object-Set Forecasting",
|
| 618 |
"axis_label": "17 Future Object-Set Forecasting",
|
| 619 |
"short_label": "Future obj",
|
| 620 |
-
"
|
| 621 |
"metric_key": "micro_f1",
|
| 622 |
"metric_name": "micro-F1",
|
| 623 |
"metric_direction": "higher",
|
|
@@ -651,7 +651,7 @@
|
|
| 651 |
"label": "IMU-to-Hand Pose Reconstruction",
|
| 652 |
"axis_label": "18 IMU-to-Hand Pose Reconstruction",
|
| 653 |
"short_label": "IMU->hand",
|
| 654 |
-
"
|
| 655 |
"metric_key": "mae",
|
| 656 |
"metric_name": "MAE",
|
| 657 |
"metric_direction": "lower",
|
|
@@ -685,7 +685,7 @@
|
|
| 685 |
"label": "Camera-View Synchronization Retrieval",
|
| 686 |
"axis_label": "19 Camera-View Synchronization Retrieval",
|
| 687 |
"short_label": "Cam sync",
|
| 688 |
-
"
|
| 689 |
"metric_key": "mrr",
|
| 690 |
"metric_name": "MRR",
|
| 691 |
"metric_direction": "higher",
|
|
@@ -719,7 +719,7 @@
|
|
| 719 |
"label": "Time-to-Next-Transition Regression",
|
| 720 |
"axis_label": "20 Time-to-Next-Transition Regression",
|
| 721 |
"short_label": "Time2bdry",
|
| 722 |
-
"
|
| 723 |
"metric_key": "mae",
|
| 724 |
"metric_name": "MAE frames",
|
| 725 |
"metric_direction": "lower",
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Single-Episode 20-Task Radar",
|
| 3 |
"status": "pass",
|
| 4 |
+
"generated_at_utc": "2026-06-21T15:20:34+00:00",
|
| 5 |
"description": "Minimal and Neural MLP baselines on the one public sample episode, both scored on all 20 task contracts.",
|
| 6 |
"task_count": 20,
|
| 7 |
"method_count": 2,
|
|
|
|
| 73 |
"label": "Action Recognition",
|
| 74 |
"axis_label": "01 Action Recognition",
|
| 75 |
"short_label": "Action",
|
| 76 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 77 |
"metric_key": "macro_f1",
|
| 78 |
"metric_name": "macro-F1",
|
| 79 |
"metric_direction": "higher",
|
|
|
|
| 107 |
"label": "Procedure Step Recognition",
|
| 108 |
"axis_label": "02 Procedure Step Recognition",
|
| 109 |
"short_label": "Step",
|
| 110 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 111 |
"metric_key": "macro_f1",
|
| 112 |
"metric_name": "macro-F1",
|
| 113 |
"metric_direction": "higher",
|
|
|
|
| 141 |
"label": "Action Boundary Detection",
|
| 142 |
"axis_label": "03 Action Boundary Detection",
|
| 143 |
"short_label": "Boundary",
|
| 144 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 145 |
"metric_key": "macro_f1",
|
| 146 |
"metric_name": "macro-F1",
|
| 147 |
"metric_direction": "higher",
|
|
|
|
| 175 |
"label": "Next-Action Prediction",
|
| 176 |
"axis_label": "04 Next-Action Prediction",
|
| 177 |
"short_label": "Next act",
|
| 178 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 179 |
"metric_key": "macro_f1",
|
| 180 |
"metric_name": "macro-F1",
|
| 181 |
"metric_direction": "higher",
|
|
|
|
| 209 |
"label": "Hand Trajectory Forecasting",
|
| 210 |
"axis_label": "05 Hand Trajectory Forecasting",
|
| 211 |
"short_label": "Hand traj",
|
| 212 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 213 |
"metric_key": "mpjpe",
|
| 214 |
"metric_name": "MPJPE",
|
| 215 |
"metric_direction": "lower",
|
|
|
|
| 243 |
"label": "Contact State Prediction",
|
| 244 |
"axis_label": "06 Contact State Prediction",
|
| 245 |
"short_label": "Contact",
|
| 246 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 247 |
"metric_key": "macro_f1",
|
| 248 |
"metric_name": "macro-F1",
|
| 249 |
"metric_direction": "higher",
|
|
|
|
| 277 |
"label": "Object Relevance Prediction",
|
| 278 |
"axis_label": "07 Object Relevance Prediction",
|
| 279 |
"short_label": "Objects",
|
| 280 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 281 |
"metric_key": "micro_f1",
|
| 282 |
"metric_name": "micro-F1",
|
| 283 |
"metric_direction": "higher",
|
|
|
|
| 311 |
"label": "Language Grounding",
|
| 312 |
"axis_label": "08 Language Grounding",
|
| 313 |
"short_label": "Language",
|
| 314 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 315 |
"metric_key": "mrr",
|
| 316 |
"metric_name": "MRR",
|
| 317 |
"metric_direction": "higher",
|
|
|
|
| 345 |
"label": "Cross-Modal Retrieval",
|
| 346 |
"axis_label": "09 Cross-Modal Retrieval",
|
| 347 |
"short_label": "X-modal",
|
| 348 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 349 |
"metric_key": "mrr",
|
| 350 |
"metric_name": "MRR",
|
| 351 |
"metric_direction": "higher",
|
|
|
|
| 379 |
"label": "Cross-Modal Reconstruction",
|
| 380 |
"axis_label": "10 Cross-Modal Reconstruction",
|
| 381 |
"short_label": "Recon",
|
| 382 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 383 |
"metric_key": "r2",
|
| 384 |
"metric_name": "R2",
|
| 385 |
"metric_direction": "higher",
|
|
|
|
| 413 |
"label": "Temporal Order Verification",
|
| 414 |
"axis_label": "11 Temporal Order Verification",
|
| 415 |
"short_label": "Order",
|
| 416 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 417 |
"metric_key": "f1",
|
| 418 |
"metric_name": "F1",
|
| 419 |
"metric_direction": "higher",
|
|
|
|
| 447 |
"label": "Multimodal Synchronization Detection",
|
| 448 |
"axis_label": "12 Multimodal Synchronization Detection",
|
| 449 |
"short_label": "Sync",
|
| 450 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 451 |
"metric_key": "f1",
|
| 452 |
"metric_name": "F1",
|
| 453 |
"metric_direction": "higher",
|
|
|
|
| 481 |
"label": "Long-Horizon Next-Action Forecasting",
|
| 482 |
"axis_label": "13 Long-Horizon Next-Action Forecasting",
|
| 483 |
"short_label": "Long act",
|
| 484 |
+
"provenance_source": "historical_result_bundle",
|
| 485 |
"metric_key": "macro_f1",
|
| 486 |
"metric_name": "macro-F1",
|
| 487 |
"metric_direction": "higher",
|
|
|
|
| 515 |
"label": "Long-Horizon Next-Subtask Forecasting",
|
| 516 |
"axis_label": "14 Long-Horizon Next-Subtask Forecasting",
|
| 517 |
"short_label": "Long step",
|
| 518 |
+
"provenance_source": "historical_result_bundle",
|
| 519 |
"metric_key": "macro_f1",
|
| 520 |
"metric_name": "macro-F1",
|
| 521 |
"metric_direction": "higher",
|
|
|
|
| 549 |
"label": "Interaction Text Prediction",
|
| 550 |
"axis_label": "15 Interaction Text Prediction",
|
| 551 |
"short_label": "Interact txt",
|
| 552 |
+
"provenance_source": "historical_result_bundle",
|
| 553 |
"metric_key": "macro_f1",
|
| 554 |
"metric_name": "macro-F1",
|
| 555 |
"metric_direction": "higher",
|
|
|
|
| 583 |
"label": "Action-Object Relation Prediction",
|
| 584 |
"axis_label": "16 Action-Object Relation Prediction",
|
| 585 |
"short_label": "Act+obj",
|
| 586 |
+
"provenance_source": "historical_result_bundle",
|
| 587 |
"metric_key": "macro_f1",
|
| 588 |
"metric_name": "macro-F1",
|
| 589 |
"metric_direction": "higher",
|
|
|
|
| 617 |
"label": "Future Object-Set Forecasting",
|
| 618 |
"axis_label": "17 Future Object-Set Forecasting",
|
| 619 |
"short_label": "Future obj",
|
| 620 |
+
"provenance_source": "historical_result_bundle",
|
| 621 |
"metric_key": "micro_f1",
|
| 622 |
"metric_name": "micro-F1",
|
| 623 |
"metric_direction": "higher",
|
|
|
|
| 651 |
"label": "IMU-to-Hand Pose Reconstruction",
|
| 652 |
"axis_label": "18 IMU-to-Hand Pose Reconstruction",
|
| 653 |
"short_label": "IMU->hand",
|
| 654 |
+
"provenance_source": "historical_result_bundle",
|
| 655 |
"metric_key": "mae",
|
| 656 |
"metric_name": "MAE",
|
| 657 |
"metric_direction": "lower",
|
|
|
|
| 685 |
"label": "Camera-View Synchronization Retrieval",
|
| 686 |
"axis_label": "19 Camera-View Synchronization Retrieval",
|
| 687 |
"short_label": "Cam sync",
|
| 688 |
+
"provenance_source": "historical_result_bundle",
|
| 689 |
"metric_key": "mrr",
|
| 690 |
"metric_name": "MRR",
|
| 691 |
"metric_direction": "higher",
|
|
|
|
| 719 |
"label": "Time-to-Next-Transition Regression",
|
| 720 |
"axis_label": "20 Time-to-Next-Transition Regression",
|
| 721 |
"short_label": "Time2bdry",
|
| 722 |
+
"provenance_source": "historical_result_bundle",
|
| 723 |
"metric_key": "mae",
|
| 724 |
"metric_name": "MAE frames",
|
| 725 |
"metric_direction": "lower",
|
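Each radar axis above carries a `metric_direction` of `higher` or `lower`, so a raw numeric comparison is only meaningful once the direction is taken into account. A minimal sketch of a direction-aware comparison follows; the helper name and the example numbers are hypothetical and do not come from the radar file.

```python
def is_improvement(candidate, baseline, direction):
    """Return True if candidate beats baseline under the metric direction.

    direction is "higher" (e.g. macro-F1, MRR) or "lower" (e.g. MPJPE, MAE).
    """
    if direction == "higher":
        return candidate > baseline
    if direction == "lower":
        return candidate < baseline
    raise ValueError(f"unknown metric_direction: {direction!r}")


# Placeholder values for illustration only, not reported metrics.
print(is_improvement(0.6, 0.5, "higher"))
print(is_improvement(0.6, 0.5, "lower"))
```

Raising on an unknown direction, rather than defaulting to `higher`, avoids silently misreading error-style metrics such as MAE.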
data/source_alignment_audit.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Source Alignment Note",
|
| 3 |
"status": "pass",
|
| 4 |
-
"generated_at_utc": "2026-06-
|
| 5 |
"alignment_json": "docs/data/xperience10m_dataset_card_alignment.json",
|
| 6 |
"alignment_summary": {
|
| 7 |
"full_dataset_repo": "ropedia-ai/xperience-10m",
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Source Alignment Note",
|
| 3 |
"status": "pass",
|
| 4 |
+
"generated_at_utc": "2026-06-21T15:21:55+00:00",
|
| 5 |
"alignment_json": "docs/data/xperience10m_dataset_card_alignment.json",
|
| 6 |
"alignment_summary": {
|
| 7 |
"full_dataset_repo": "ropedia-ai/xperience-10m",
|
data/task_method_20_gap_audit.json
CHANGED
@@ -1,5 +1,5 @@
 {
-"generated_at_utc": "2026-06-
+"generated_at_utc": "2026-06-21T15:21:42+00:00",
 "immediate_actions": [
 {
 "artifact": "docs/data/task_method_20_gap_audit.json",
data/task_method_20_result_matrix.json
CHANGED
@@ -1,7 +1,7 @@
 {
 "title": "Task Method 20-Result Matrix",
 "status": "pass",
-"generated_at_utc": "2026-06-
+"generated_at_utc": "2026-06-21T15:20:34+00:00",
 "task_count": 20,
 "method_count": 9,
 "method_task_record_count": 180,
data/task_suite_20.json
CHANGED
@@ -1,12 +1,12 @@
 {
 "title": "Ropedia Xperience-10M Unified 20-Task Suite",
 "status": "pass",
-"generated_at_utc": "2026-06-
+"generated_at_utc": "2026-06-21T15:21:12+00:00",
 "task_count": 20,
-"
-"
-"
-"
+"task_count_summary": {
+"total_unified_tasks": 20,
+"public_framing": "all 20 task contracts are presented as one suite",
+"legacy_provenance_rows": 8
 },
 "unification_policy": {
 "public_framing": "The suite is presented as one 20-task benchmark surface. All task contracts share the same window, split, feature, baseline, and leakage-control language.",
@@ -21,7 +21,7 @@
 "window_frames": 20,
 "stride_frames": 5,
 "split_policy": "single_episode_chronological_70_30",
-"
+"raw_hdf5_required_for_full_public_regeneration": true,
 "raw_data_redistributed": false
 },
 "setup_alignment": {
@@ -47,8 +47,8 @@
 "task_id": "timeline_action",
 "task_display_name": "Action Recognition",
 "research_name": "Egocentric Action Recognition",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "supervised",
 "architecture_family": "multiclass classifier",
 "primary_direction": "C. Egocentric Vision & Interaction",
@@ -82,8 +82,8 @@
 "task_id": "timeline_subtask",
 "task_display_name": "Procedure Step Recognition",
 "research_name": "Temporal Subtask Recognition",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "supervised",
 "architecture_family": "multiclass classifier",
 "primary_direction": "C. Egocentric Vision & Interaction",
@@ -117,8 +117,8 @@
 "task_id": "transition_detection",
 "task_display_name": "Action Boundary Detection",
 "research_name": "Temporal Action Segmentation",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "diagnostic",
 "architecture_family": "binary classifier",
 "primary_direction": "C. Egocentric Vision & Interaction",
@@ -152,8 +152,8 @@
 "task_id": "next_action",
 "task_display_name": "Next-Action Prediction",
 "research_name": "Short-Horizon Intention Prediction",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "supervised",
 "architecture_family": "future-label classifier",
 "primary_direction": "C. Egocentric Vision & Interaction",
@@ -187,8 +187,8 @@
 "task_id": "hand_trajectory_forecast",
 "task_display_name": "Hand Trajectory Forecasting",
 "research_name": "3D Hand Motion Forecasting",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "forecast",
 "architecture_family": "continuous regressor",
 "primary_direction": "A. Human Modeling & Motion Understanding",
@@ -220,8 +220,8 @@
 "task_id": "contact_prediction",
 "task_display_name": "Contact State Prediction",
 "research_name": "Human-Object Contact Prediction",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "supervised",
 "architecture_family": "binary classifier",
 "primary_direction": "A. Human Modeling & Motion Understanding",
@@ -255,8 +255,8 @@
 "task_id": "object_relevance",
 "task_display_name": "Object Relevance Prediction",
 "research_name": "Object-Centric Interaction Recognition",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "supervised",
 "architecture_family": "multi-label classifier",
 "primary_direction": "C. Egocentric Vision & Interaction",
@@ -288,8 +288,8 @@
 "task_id": "caption_grounding",
 "task_display_name": "Language Grounding",
 "research_name": "Language-to-Moment Grounding",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "retrieval",
 "architecture_family": "retrieval ranker",
 "primary_direction": "C. Egocentric Vision & Interaction",
@@ -321,8 +321,8 @@
 "task_id": "cross_modal_retrieval",
 "task_display_name": "Cross-Modal Retrieval",
 "research_name": "Multimodal Representation Retrieval",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "retrieval",
 "architecture_family": "two-tower retrieval head",
 "primary_direction": "D. Scene Reconstruction & World Modeling",
@@ -354,8 +354,8 @@
 "task_id": "modality_reconstruction",
 "task_display_name": "Cross-Modal Reconstruction",
 "research_name": "Modality Feature Reconstruction",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "forecast",
 "architecture_family": "feature regressor",
 "primary_direction": "B. 3D/4D Reconstruction & Neural Rendering",
@@ -386,8 +386,8 @@
 "task_id": "temporal_order",
 "task_display_name": "Temporal Order Verification",
 "research_name": "Temporal Order Verification",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "diagnostic",
 "architecture_family": "pairwise classifier",
 "primary_direction": "D. Scene Reconstruction & World Modeling",
@@ -419,8 +419,8 @@
 "task_id": "misalignment_detection",
 "task_display_name": "Multimodal Synchronization Detection",
 "research_name": "Cross-Modal Misalignment Detection",
-"
-"origin_count_label": "
+"provenance_source": "walkthrough_backed_task_contract",
+"origin_count_label": "unified task",
 "family": "diagnostic",
 "architecture_family": "pairwise classifier",
 "primary_direction": "B. 3D/4D Reconstruction & Neural Rendering",
@@ -452,8 +452,8 @@
 "task_id": "long_horizon_next_action",
 "task_display_name": "Long-Horizon Next-Action Forecasting",
 "research_name": "Long-Horizon Next-Action Forecasting",
-"
-"origin_count_label": "
+"provenance_source": "historical_result_bundle",
+"origin_count_label": "unified task",
 "family": "classification",
 "architecture_family": "minimal_softmax",
 "primary_direction": "sample-supported extension",
@@ -487,8 +487,8 @@
 "task_id": "next_subtask_forecast",
 "task_display_name": "Long-Horizon Next-Subtask Forecasting",
 "research_name": "Long-Horizon Next-Subtask Forecasting",
-"
-"origin_count_label": "
+"provenance_source": "historical_result_bundle",
+"origin_count_label": "unified task",
 "family": "classification",
 "architecture_family": "minimal_softmax",
 "primary_direction": "sample-supported extension",
@@ -522,8 +522,8 @@
 "task_id": "interaction_text_prediction",
 "task_display_name": "Interaction Text Prediction",
 "research_name": "Interaction Text Prediction",
-"
-"origin_count_label": "
+"provenance_source": "historical_result_bundle",
+"origin_count_label": "unified task",
 "family": "classification",
 "architecture_family": "minimal_softmax",
 "primary_direction": "sample-supported extension",
@@ -557,8 +557,8 @@
 "task_id": "action_object_relation",
 "task_display_name": "Action-Object Relation Prediction",
 "research_name": "Action-Object Relation Prediction",
-"
-"origin_count_label": "
+"provenance_source": "historical_result_bundle",
+"origin_count_label": "unified task",
 "family": "classification",
 "architecture_family": "minimal_softmax",
 "primary_direction": "sample-supported extension",
@@ -592,8 +592,8 @@
 "task_id": "object_set_forecast",
 "task_display_name": "Future Object-Set Forecasting",
 "research_name": "Future Object-Set Forecasting",
-"
-"origin_count_label": "
+"provenance_source": "historical_result_bundle",
+"origin_count_label": "unified task",
 "family": "multi_label",
 "architecture_family": "minimal_ridge_multilabel",
 "primary_direction": "sample-supported extension",
@@ -625,8 +625,8 @@
 "task_id": "imu_to_hand_pose",
 "task_display_name": "IMU-to-Hand Pose Reconstruction",
 "research_name": "IMU-to-Hand Pose Reconstruction",
-"
-"origin_count_label": "
+"provenance_source": "historical_result_bundle",
+"origin_count_label": "unified task",
 "family": "regression",
 "architecture_family": "minimal_ridge_regression",
 "primary_direction": "sample-supported extension",
@@ -658,8 +658,8 @@
 "task_id": "camera_view_sync_retrieval",
 "task_display_name": "Camera-View Synchronization Retrieval",
 "research_name": "Camera-View Synchronization Retrieval",
-"
-"origin_count_label": "
+"provenance_source": "historical_result_bundle",
+"origin_count_label": "unified task",
 "family": "retrieval",
 "architecture_family": "minimal_ridge_projection_cosine_retrieval",
 "primary_direction": "sample-supported extension",
@@ -690,8 +690,8 @@
 "task_id": "time_to_transition",
 "task_display_name": "Time-to-Next-Transition Regression",
 "research_name": "Time-to-Next-Transition Regression",
-"
-"origin_count_label": "
+"provenance_source": "historical_result_bundle",
+"origin_count_label": "unified task",
 "family": "regression",
 "architecture_family": "minimal_ridge_regression",
 "primary_direction": "sample-supported extension",
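The `data/task_suite_20.json` hunks above record `window_frames: 20`, `stride_frames: 5`, and `split_policy: single_episode_chronological_70_30`. A minimal sketch of what such a windowing and split could look like follows. The helper names, the choice to split by window index, and the absence of a gap between train and test windows are assumptions made for illustration, not the repository's implementation.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def window_starts(n_frames: int, window_frames: int = 20, stride_frames: int = 5) -> List[int]:
    """Start indices of all full windows that fit inside one episode."""
    if window_frames <= 0 or stride_frames <= 0:
        raise ValueError("window_frames and stride_frames must be positive")
    if n_frames < window_frames:
        return []
    return list(range(0, n_frames - window_frames + 1, stride_frames))


def chronological_split(items: Sequence[T], train_percent: int = 70) -> Tuple[List[T], List[T]]:
    """Earliest `train_percent` of items go to train, the rest to test; no shuffling."""
    if not 0 <= train_percent <= 100:
        raise ValueError("train_percent must be between 0 and 100")
    cut = len(items) * train_percent // 100  # integer arithmetic avoids float rounding
    return list(items[:cut]), list(items[cut:])


if __name__ == "__main__":
    starts = window_starts(n_frames=65)
    train, test = chronological_split(starts)
    print(len(starts), train, test)
```

Because the stride (5) is smaller than the window (20), neighbouring windows share frames, so the last training windows and first test windows overlap in this sketch. Whether the actual pipeline drops boundary windows is not stated in the diff.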
data/task_surface_integrity.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "status": "pass",
-"generated_at_utc": "2026-06-
+"generated_at_utc": "2026-06-21T15:21:55+00:00",
 "summary": {
 "original_walkthrough_task_count": 12,
 "expected_original_walkthrough_task_count": 12,
data/tier2_task_suite.json
CHANGED
|
@@ -2,13 +2,12 @@
|
|
| 2 |
"title": "Ropedia Xperience-10M Unified 20-Task Provenance Bundle",
|
| 3 |
"status": "pass",
|
| 4 |
"generated_at_utc": "2026-06-16T06:25:58+00:00",
|
| 5 |
-
"suite_position": "
|
| 6 |
"legacy_path_note": "The tier2_task_suite file and directory names are retained for stable public links; this bundle is provenance inside the unified 20-task suite, not a separate public tier.",
|
| 7 |
-
"
|
| 8 |
-
"
|
| 9 |
-
"
|
| 10 |
-
"
|
| 11 |
-
"tasks_1_to_12_metrics": "docs/data/summary_metrics.json",
|
| 12 |
"unified_protocol": "docs/data/evaluation_protocol.json"
|
| 13 |
},
|
| 14 |
"dataset_scope": {
|
|
@@ -28,9 +27,9 @@
|
|
| 28 |
"raw_data_redistributed": false
|
| 29 |
},
|
| 30 |
"setup_alignment": {
|
| 31 |
-
"
|
| 32 |
-
"
|
| 33 |
-
"
|
| 34 |
"minimal_baselines": "softmax, ridge regression/projection, and ridge multilabel heads",
|
| 35 |
"neural_baselines": "compact one-hidden-layer/two-layer PyTorch MLP heads with the same chronological split",
|
| 36 |
"leakage_policy": "Caption-derived text features are removed whenever the target is a label, object, relation, interaction phrase, or future semantic state."
|
|
@@ -135,7 +134,7 @@
|
|
| 135 |
"status": "pass",
|
| 136 |
"task": "long_horizon_next_action",
|
| 137 |
"task_display_name": "Long-Horizon Next-Action Forecasting",
|
| 138 |
-
"suite_position": "
|
| 139 |
"model_family": "minimal_softmax",
|
| 140 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 141 |
"split": "single_episode_chronological",
|
|
@@ -221,7 +220,7 @@
|
|
| 221 |
"status": "pass",
|
| 222 |
"task": "long_horizon_next_action",
|
| 223 |
"task_display_name": "Long-Horizon Next-Action Forecasting",
|
| 224 |
-
"suite_position": "
|
| 225 |
"model_family": "neural_mlp",
|
| 226 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 227 |
"split": "single_episode_chronological",
|
|
@@ -276,7 +275,7 @@
|
|
| 276 |
"status": "pass",
|
| 277 |
"task": "next_subtask_forecast",
|
| 278 |
"task_display_name": "Long-Horizon Next-Subtask Forecasting",
|
| 279 |
-
"suite_position": "
|
| 280 |
"model_family": "minimal_softmax",
|
| 281 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 282 |
"split": "single_episode_chronological",
|
|
@@ -361,7 +360,7 @@
|
|
| 361 |
"status": "pass",
|
| 362 |
"task": "next_subtask_forecast",
|
| 363 |
"task_display_name": "Long-Horizon Next-Subtask Forecasting",
|
| 364 |
-
"suite_position": "
|
| 365 |
"model_family": "neural_mlp",
|
| 366 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 367 |
"split": "single_episode_chronological",
|
|
@@ -416,7 +415,7 @@
|
|
| 416 |
"status": "pass",
|
| 417 |
"task": "interaction_text_prediction",
|
| 418 |
"task_display_name": "Interaction Text Prediction",
|
| 419 |
-
"suite_position": "
|
| 420 |
"model_family": "minimal_softmax",
|
| 421 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 422 |
"split": "single_episode_chronological",
|
|
@@ -512,7 +511,7 @@
|
|
| 512 |
"status": "pass",
|
| 513 |
"task": "interaction_text_prediction",
|
| 514 |
"task_display_name": "Interaction Text Prediction",
|
| 515 |
-
"suite_position": "
|
| 516 |
"model_family": "neural_mlp",
|
| 517 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 518 |
"split": "single_episode_chronological",
|
|
@@ -567,7 +566,7 @@
|
|
| 567 |
"status": "pass",
|
| 568 |
"task": "action_object_relation",
|
| 569 |
"task_display_name": "Action-Object Relation Prediction",
|
| 570 |
-
"suite_position": "
|
| 571 |
"model_family": "minimal_softmax",
|
| 572 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 573 |
"split": "single_episode_chronological",
|
|
@@ -659,7 +658,7 @@
|
|
| 659 |
"status": "pass",
|
| 660 |
"task": "action_object_relation",
|
| 661 |
"task_display_name": "Action-Object Relation Prediction",
|
| 662 |
-
"suite_position": "
|
| 663 |
"model_family": "neural_mlp",
|
| 664 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 665 |
"split": "single_episode_chronological",
|
|
@@ -713,7 +712,7 @@
|
|
| 713 |
"status": "pass",
|
| 714 |
"task": "object_set_forecast",
|
| 715 |
"task_display_name": "Future Object-Set Forecasting",
|
| 716 |
-
"suite_position": "
|
| 717 |
"model_family": "minimal_ridge_multilabel",
|
| 718 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 719 |
"split": "single_episode_chronological",
|
|
@@ -747,7 +746,7 @@
|
|
| 747 |
"status": "pass",
|
| 748 |
"task": "object_set_forecast",
|
| 749 |
"task_display_name": "Future Object-Set Forecasting",
|
| 750 |
-
"suite_position": "
|
| 751 |
"model_family": "neural_mlp_multilabel",
|
| 752 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 753 |
"split": "single_episode_chronological",
|
|
@@ -795,7 +794,7 @@
|
|
| 795 |
"status": "pass",
|
| 796 |
"task": "imu_to_hand_pose",
|
| 797 |
"task_display_name": "IMU-to-Hand Pose Reconstruction",
|
| 798 |
-
"suite_position": "
|
| 799 |
"model_family": "minimal_ridge_regression",
|
| 800 |
"input": "Current IMU acceleration/gyroscope feature block only.",
|
| 801 |
"split": "single_episode_chronological",
|
|
@@ -814,7 +813,7 @@
|
|
| 814 |
"status": "pass",
|
| 815 |
"task": "imu_to_hand_pose",
|
| 816 |
"task_display_name": "IMU-to-Hand Pose Reconstruction",
|
| 817 |
-
"suite_position": "
|
| 818 |
"model_family": "neural_mlp_regression",
|
| 819 |
"input": "Current IMU acceleration/gyroscope feature block only.",
|
| 820 |
"split": "single_episode_chronological",
|
|
@@ -864,7 +863,7 @@
|
|
| 864 |
"status": "pass",
|
| 865 |
"task": "camera_view_sync_retrieval",
|
| 866 |
"task_display_name": "Camera-View Synchronization Retrieval",
|
| 867 |
-
"suite_position": "
|
| 868 |
"model_family": "minimal_ridge_projection_cosine_retrieval",
|
| 869 |
"input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
|
| 870 |
"split": "single_episode_chronological",
|
|
@@ -885,7 +884,7 @@
|
|
| 885 |
"status": "pass",
|
| 886 |
"task": "camera_view_sync_retrieval",
|
| 887 |
"task_display_name": "Camera-View Synchronization Retrieval",
|
| 888 |
-
"suite_position": "
|
| 889 |
"model_family": "neural_mlp_projection_cosine_retrieval",
|
| 890 |
"input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
|
| 891 |
"split": "single_episode_chronological",
|
|
@@ -934,7 +933,7 @@
|
|
| 934 |
"status": "pass",
|
| 935 |
"task": "time_to_transition",
|
| 936 |
"task_display_name": "Time-to-Next-Transition Regression",
|
| 937 |
-
"suite_position": "
|
| 938 |
"model_family": "minimal_ridge_regression",
|
| 939 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 940 |
"split": "single_episode_chronological",
|
|
@@ -954,7 +953,7 @@
|
|
| 954 |
"status": "pass",
|
| 955 |
"task": "time_to_transition",
|
| 956 |
"task_display_name": "Time-to-Next-Transition Regression",
|
| 957 |
-
"suite_position": "
|
| 958 |
"model_family": "neural_mlp_regression",
|
| 959 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 960 |
"split": "single_episode_chronological",
|
|
|
|
| 2 |
"title": "Ropedia Xperience-10M Unified 20-Task Provenance Bundle",
|
| 3 |
"status": "pass",
|
| 4 |
"generated_at_utc": "2026-06-16T06:25:58+00:00",
|
| 5 |
+
"suite_position": "unified_20_task_provenance",
|
| 6 |
"legacy_path_note": "The tier2_task_suite file and directory names are retained for stable public links; this bundle is provenance inside the unified 20-task suite, not a separate public tier.",
|
| 7 |
+
"unified_task_integration": {
|
| 8 |
+
"total_task_count": 20,
|
| 9 |
+
"legacy_provenance_row_count": 8,
|
| 10 |
+
"shared_metrics": "docs/data/summary_metrics.json",
|
|
|
|
| 11 |
"unified_protocol": "docs/data/evaluation_protocol.json"
|
| 12 |
},
|
| 13 |
"dataset_scope": {
|
|
|
|
| 27 |
"raw_data_redistributed": false
|
| 28 |
},
|
| 29 |
"setup_alignment": {
|
| 30 |
+
"same_window_unit_as_unified_suite": true,
|
| 31 |
+
"same_feature_manifest_as_unified_suite": "results/episode_task_suite/feature_manifest.json",
|
| 32 |
+
"same_shared_tensor_as_unified_suite": "results/episode_task_suite/shared_windows.npz",
|
| 33 |
"minimal_baselines": "softmax, ridge regression/projection, and ridge multilabel heads",
|
| 34 |
"neural_baselines": "compact one-hidden-layer/two-layer PyTorch MLP heads with the same chronological split",
|
| 35 |
"leakage_policy": "Caption-derived text features are removed whenever the target is a label, object, relation, interaction phrase, or future semantic state."
|
|
|
|
| 134 |
"status": "pass",
|
| 135 |
"task": "long_horizon_next_action",
|
| 136 |
"task_display_name": "Long-Horizon Next-Action Forecasting",
|
| 137 |
+
"suite_position": "unified_20_task_provenance",
|
| 138 |
"model_family": "minimal_softmax",
|
| 139 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 140 |
"split": "single_episode_chronological",
|
|
|
|
| 220 |
"status": "pass",
|
| 221 |
"task": "long_horizon_next_action",
|
| 222 |
"task_display_name": "Long-Horizon Next-Action Forecasting",
|
| 223 |
+
"suite_position": "unified_20_task_provenance",
|
| 224 |
"model_family": "neural_mlp",
|
| 225 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 226 |
"split": "single_episode_chronological",
|
|
|
|
| 275 |
"status": "pass",
|
| 276 |
"task": "next_subtask_forecast",
|
| 277 |
"task_display_name": "Long-Horizon Next-Subtask Forecasting",
|
| 278 |
+
"suite_position": "unified_20_task_provenance",
|
| 279 |
"model_family": "minimal_softmax",
|
| 280 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 281 |
"split": "single_episode_chronological",
|
|
|
|
| 360 |
"status": "pass",
|
| 361 |
"task": "next_subtask_forecast",
|
| 362 |
"task_display_name": "Long-Horizon Next-Subtask Forecasting",
|
| 363 |
+
"suite_position": "unified_20_task_provenance",
|
| 364 |
"model_family": "neural_mlp",
|
| 365 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 366 |
"split": "single_episode_chronological",
|
|
|
|
| 415 |
"status": "pass",
|
| 416 |
"task": "interaction_text_prediction",
|
| 417 |
"task_display_name": "Interaction Text Prediction",
|
| 418 |
+
"suite_position": "unified_20_task_provenance",
|
| 419 |
"model_family": "minimal_softmax",
|
| 420 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 421 |
"split": "single_episode_chronological",
|
|
|
|
| 511 |
"status": "pass",
|
| 512 |
"task": "interaction_text_prediction",
|
| 513 |
"task_display_name": "Interaction Text Prediction",
|
| 514 |
+
"suite_position": "unified_20_task_provenance",
|
| 515 |
"model_family": "neural_mlp",
|
| 516 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 517 |
"split": "single_episode_chronological",
|
|
|
|
| 566 |
"status": "pass",
|
| 567 |
"task": "action_object_relation",
|
| 568 |
"task_display_name": "Action-Object Relation Prediction",
|
| 569 |
+
"suite_position": "unified_20_task_provenance",
|
| 570 |
"model_family": "minimal_softmax",
|
| 571 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 572 |
"split": "single_episode_chronological",
|
|
|
|
| 658 |
"status": "pass",
|
| 659 |
"task": "action_object_relation",
|
| 660 |
"task_display_name": "Action-Object Relation Prediction",
|
| 661 |
+
"suite_position": "unified_20_task_provenance",
|
| 662 |
"model_family": "neural_mlp",
|
| 663 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 664 |
"split": "single_episode_chronological",
|
|
|
|
| 712 |
"status": "pass",
|
| 713 |
"task": "object_set_forecast",
|
| 714 |
"task_display_name": "Future Object-Set Forecasting",
|
| 715 |
+
"suite_position": "unified_20_task_provenance",
|
| 716 |
"model_family": "minimal_ridge_multilabel",
|
| 717 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 718 |
"split": "single_episode_chronological",
|
|
|
|
| 746 |
"status": "pass",
|
| 747 |
"task": "object_set_forecast",
|
| 748 |
"task_display_name": "Future Object-Set Forecasting",
|
| 749 |
+
"suite_position": "unified_20_task_provenance",
|
| 750 |
"model_family": "neural_mlp_multilabel",
|
| 751 |
"input": "Current 20-frame sensor window with caption-text features removed.",
|
| 752 |
"split": "single_episode_chronological",
|
|
|
|
| 794 |
"status": "pass",
|
| 795 |
"task": "imu_to_hand_pose",
|
| 796 |
"task_display_name": "IMU-to-Hand Pose Reconstruction",
|
| 797 |
+
"suite_position": "unified_20_task_provenance",
|
| 798 |
"model_family": "minimal_ridge_regression",
|
| 799 |
"input": "Current IMU acceleration/gyroscope feature block only.",
|
| 800 |
"split": "single_episode_chronological",
|
|
|
|
| 813 |
"status": "pass",
|
| 814 |
"task": "imu_to_hand_pose",
|
| 815 |
"task_display_name": "IMU-to-Hand Pose Reconstruction",
|
| 816 |
+
"suite_position": "unified_20_task_provenance",
|
| 817 |
"model_family": "neural_mlp_regression",
|
| 818 |
"input": "Current IMU acceleration/gyroscope feature block only.",
|
| 819 |
"split": "single_episode_chronological",
|
|
|
|
| 863 |
"status": "pass",
|
| 864 |
"task": "camera_view_sync_retrieval",
|
| 865 |
"task_display_name": "Camera-View Synchronization Retrieval",
|
| 866 |
+
"suite_position": "unified_20_task_provenance",
|
| 867 |
"model_family": "minimal_ridge_projection_cosine_retrieval",
|
| 868 |
"input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
|
| 869 |
"split": "single_episode_chronological",
|
|
|
|
| 884 |
"status": "pass",
|
| 885 |
"task": "camera_view_sync_retrieval",
|
| 886 |
"task_display_name": "Camera-View Synchronization Retrieval",
|
| 887 |
+
"suite_position": "unified_20_task_provenance",
|
| 888 |
"model_family": "neural_mlp_projection_cosine_retrieval",
|
| 889 |
"input": "Fisheye camera-1 feature query projected into fisheye camera-3 feature space.",
|
| 890 |
"split": "single_episode_chronological",
|
|
|
|
| 933 |
"status": "pass",
|
| 934 |
"task": "time_to_transition",
|
| 935 |
"task_display_name": "Time-to-Next-Transition Regression",
|
| 936 |
+
"suite_position": "unified_20_task_provenance",
|
| 937 |
"model_family": "minimal_ridge_regression",
|
| 938 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 939 |
"split": "single_episode_chronological",
|
|
|
|
| 953 |
"status": "pass",
|
| 954 |
"task": "time_to_transition",
|
| 955 |
"task_display_name": "Time-to-Next-Transition Regression",
|
| 956 |
+
"suite_position": "unified_20_task_provenance",
|
| 957 |
"model_family": "neural_mlp_regression",
|
| 958 |
"input": "Current 20-frame non-caption multimodal window.",
|
| 959 |
"split": "single_episode_chronological",
|
data/unified_task_model_radar.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
{
|
| 2 |
"title": "Unified 20-Task Model Radar",
|
| 3 |
"status": "pass",
|
| 4 |
-
"generated_at_utc": "2026-06-
|
| 5 |
"task_count": 20,
|
| 6 |
"method_count": 9,
|
| 7 |
"method_task_record_count": 180,
|
|
@@ -235,7 +235,7 @@
|
|
| 235 |
"label": "Action Recognition",
|
| 236 |
"axis_label": "01 Action Recognition",
|
| 237 |
"short_label": "Action",
|
| 238 |
-
"
|
| 239 |
"metric_key": "macro_f1",
|
| 240 |
"metric_name": "macro-F1",
|
| 241 |
"metric_direction": "higher",
|
|
@@ -346,7 +346,7 @@
|
|
| 346 |
"label": "Procedure Step Recognition",
|
| 347 |
"axis_label": "02 Procedure Step Recognition",
|
| 348 |
"short_label": "Step",
|
| 349 |
-
"
|
| 350 |
"metric_key": "macro_f1",
|
| 351 |
"metric_name": "macro-F1",
|
| 352 |
"metric_direction": "higher",
|
|
@@ -457,7 +457,7 @@
|
|
| 457 |
"label": "Action Boundary Detection",
|
| 458 |
"axis_label": "03 Action Boundary Detection",
|
| 459 |
"short_label": "Boundary",
|
| 460 |
-
"
|
| 461 |
"metric_key": "macro_f1",
|
| 462 |
"metric_name": "macro-F1",
|
| 463 |
"metric_direction": "higher",
|
|
@@ -568,7 +568,7 @@
|
|
| 568 |
"label": "Next-Action Prediction",
|
| 569 |
"axis_label": "04 Next-Action Prediction",
|
| 570 |
"short_label": "Next act",
|
| 571 |
-
"
|
| 572 |
"metric_key": "macro_f1",
|
| 573 |
"metric_name": "macro-F1",
|
| 574 |
"metric_direction": "higher",
|
|
@@ -679,7 +679,7 @@
|
|
| 679 |
"label": "Hand Trajectory Forecasting",
|
| 680 |
"axis_label": "05 Hand Trajectory Forecasting",
|
| 681 |
"short_label": "Hand traj",
|
| 682 |
-
"
|
| 683 |
"metric_key": "mpjpe",
|
| 684 |
"metric_name": "MPJPE",
|
| 685 |
"metric_direction": "lower",
|
|
@@ -790,7 +790,7 @@
|
|
| 790 |
"label": "Contact State Prediction",
|
| 791 |
"axis_label": "06 Contact State Prediction",
|
| 792 |
"short_label": "Contact",
|
| 793 |
-
"
|
| 794 |
"metric_key": "macro_f1",
|
| 795 |
"metric_name": "macro-F1",
|
| 796 |
"metric_direction": "higher",
|
|
@@ -901,7 +901,7 @@
|
|
| 901 |
"label": "Object Relevance Prediction",
|
| 902 |
"axis_label": "07 Object Relevance Prediction",
|
| 903 |
"short_label": "Objects",
|
| 904 |
-
"
|
| 905 |
"metric_key": "micro_f1",
|
| 906 |
"metric_name": "micro-F1",
|
| 907 |
"metric_direction": "higher",
|
|
@@ -1012,7 +1012,7 @@
|
|
| 1012 |
"label": "Language Grounding",
|
| 1013 |
"axis_label": "08 Language Grounding",
|
| 1014 |
"short_label": "Language",
|
| 1015 |
-
"
|
| 1016 |
"metric_key": "mrr",
|
| 1017 |
"metric_name": "MRR",
|
| 1018 |
"metric_direction": "higher",
|
|
@@ -1123,7 +1123,7 @@
|
|
| 1123 |
"label": "Cross-Modal Retrieval",
|
| 1124 |
"axis_label": "09 Cross-Modal Retrieval",
|
| 1125 |
"short_label": "X-modal",
|
| 1126 |
-
"
|
| 1127 |
"metric_key": "mrr",
|
| 1128 |
"metric_name": "MRR",
|
| 1129 |
"metric_direction": "higher",
|
|
@@ -1234,7 +1234,7 @@
|
|
| 1234 |
"label": "Cross-Modal Reconstruction",
|
| 1235 |
"axis_label": "10 Cross-Modal Reconstruction",
|
| 1236 |
"short_label": "Recon",
|
| 1237 |
-
"
|
| 1238 |
"metric_key": "r2",
|
| 1239 |
"metric_name": "R2",
|
| 1240 |
"metric_direction": "higher",
|
|
@@ -1345,7 +1345,7 @@
|
|
| 1345 |
"label": "Temporal Order Verification",
|
| 1346 |
"axis_label": "11 Temporal Order Verification",
|
| 1347 |
"short_label": "Order",
|
| 1348 |
-
"
|
| 1349 |
"metric_key": "f1",
|
| 1350 |
"metric_name": "F1",
|
| 1351 |
"metric_direction": "higher",
|
|
@@ -1456,7 +1456,7 @@
|
|
| 1456 |
"label": "Multimodal Synchronization Detection",
|
| 1457 |
"axis_label": "12 Multimodal Synchronization Detection",
|
| 1458 |
"short_label": "Sync",
|
| 1459 |
-
"
|
| 1460 |
"metric_key": "f1",
|
| 1461 |
"metric_name": "F1",
|
| 1462 |
"metric_direction": "higher",
|
|
@@ -1567,7 +1567,7 @@
|
|
| 1567 |
"label": "Long-Horizon Next-Action Forecasting",
|
| 1568 |
"axis_label": "13 Long-Horizon Next-Action Forecasting",
|
| 1569 |
"short_label": "Long act",
|
| 1570 |
-
"
|
| 1571 |
"metric_key": "macro_f1",
|
| 1572 |
"metric_name": "macro-F1",
|
| 1573 |
"metric_direction": "higher",
|
|
@@ -1678,7 +1678,7 @@
|
|
| 1678 |
"label": "Long-Horizon Next-Subtask Forecasting",
|
| 1679 |
"axis_label": "14 Long-Horizon Next-Subtask Forecasting",
|
| 1680 |
"short_label": "Long step",
|
| 1681 |
-
"
|
| 1682 |
"metric_key": "macro_f1",
|
| 1683 |
"metric_name": "macro-F1",
|
| 1684 |
"metric_direction": "higher",
|
|
@@ -1789,7 +1789,7 @@
|
|
| 1789 |
"label": "Interaction Text Prediction",
|
| 1790 |
"axis_label": "15 Interaction Text Prediction",
|
| 1791 |
"short_label": "Interact txt",
|
| 1792 |
-
"
|
| 1793 |
"metric_key": "macro_f1",
|
| 1794 |
"metric_name": "macro-F1",
|
| 1795 |
"metric_direction": "higher",
|
|
@@ -1900,7 +1900,7 @@
|
|
| 1900 |
"label": "Action-Object Relation Prediction",
|
| 1901 |
"axis_label": "16 Action-Object Relation Prediction",
|
| 1902 |
"short_label": "Act+obj",
|
| 1903 |
-
"
|
| 1904 |
"metric_key": "macro_f1",
|
| 1905 |
"metric_name": "macro-F1",
|
| 1906 |
"metric_direction": "higher",
|
|
@@ -2011,7 +2011,7 @@
|
|
| 2011 |
"label": "Future Object-Set Forecasting",
|
| 2012 |
"axis_label": "17 Future Object-Set Forecasting",
|
| 2013 |
"short_label": "Future obj",
|
| 2014 |
-
"
|
| 2015 |
"metric_key": "micro_f1",
|
| 2016 |
"metric_name": "micro-F1",
|
| 2017 |
"metric_direction": "higher",
|
|
@@ -2122,7 +2122,7 @@
|
|
| 2122 |
"label": "IMU-to-Hand Pose Reconstruction",
|
| 2123 |
"axis_label": "18 IMU-to-Hand Pose Reconstruction",
|
| 2124 |
"short_label": "IMU->hand",
|
| 2125 |
-
"
|
| 2126 |
"metric_key": "mae",
|
| 2127 |
"metric_name": "MAE",
|
| 2128 |
"metric_direction": "lower",
|
|
@@ -2233,7 +2233,7 @@
|
|
| 2233 |
"label": "Camera-View Synchronization Retrieval",
|
| 2234 |
"axis_label": "19 Camera-View Synchronization Retrieval",
|
| 2235 |
"short_label": "Cam sync",
|
| 2236 |
-
"
|
| 2237 |
"metric_key": "mrr",
|
| 2238 |
"metric_name": "MRR",
|
| 2239 |
"metric_direction": "higher",
|
|
@@ -2344,7 +2344,7 @@
|
|
| 2344 |
"label": "Time-to-Next-Transition Regression",
|
| 2345 |
"axis_label": "20 Time-to-Next-Transition Regression",
|
| 2346 |
"short_label": "Time2bdry",
|
| 2347 |
-
"
|
| 2348 |
"metric_key": "mae",
|
| 2349 |
"metric_name": "MAE frames",
|
| 2350 |
"metric_direction": "lower",
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Unified 20-Task Model Radar",
|
| 3 |
"status": "pass",
|
| 4 |
+
"generated_at_utc": "2026-06-21T15:20:34+00:00",
|
| 5 |
"task_count": 20,
|
| 6 |
"method_count": 9,
|
| 7 |
"method_task_record_count": 180,
|
|
|
|
| 235 |
"label": "Action Recognition",
|
| 236 |
"axis_label": "01 Action Recognition",
|
| 237 |
"short_label": "Action",
|
| 238 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 239 |
"metric_key": "macro_f1",
|
| 240 |
"metric_name": "macro-F1",
|
| 241 |
"metric_direction": "higher",
|
|
|
|
| 346 |
"label": "Procedure Step Recognition",
|
| 347 |
"axis_label": "02 Procedure Step Recognition",
|
| 348 |
"short_label": "Step",
|
| 349 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 350 |
"metric_key": "macro_f1",
|
| 351 |
"metric_name": "macro-F1",
|
| 352 |
"metric_direction": "higher",
|
|
|
|
| 457 |
"label": "Action Boundary Detection",
|
| 458 |
"axis_label": "03 Action Boundary Detection",
|
| 459 |
"short_label": "Boundary",
|
| 460 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 461 |
"metric_key": "macro_f1",
|
| 462 |
"metric_name": "macro-F1",
|
| 463 |
"metric_direction": "higher",
|
|
|
|
| 568 |
"label": "Next-Action Prediction",
|
| 569 |
"axis_label": "04 Next-Action Prediction",
|
| 570 |
"short_label": "Next act",
|
| 571 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 572 |
"metric_key": "macro_f1",
|
| 573 |
"metric_name": "macro-F1",
|
| 574 |
"metric_direction": "higher",
|
|
|
|
| 679 |
"label": "Hand Trajectory Forecasting",
|
| 680 |
"axis_label": "05 Hand Trajectory Forecasting",
|
| 681 |
"short_label": "Hand traj",
|
| 682 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 683 |
"metric_key": "mpjpe",
|
| 684 |
"metric_name": "MPJPE",
|
| 685 |
"metric_direction": "lower",
|
|
|
|
| 790 |
"label": "Contact State Prediction",
|
| 791 |
"axis_label": "06 Contact State Prediction",
|
| 792 |
"short_label": "Contact",
|
| 793 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 794 |
"metric_key": "macro_f1",
|
| 795 |
"metric_name": "macro-F1",
|
| 796 |
"metric_direction": "higher",
|
|
|
|
| 901 |
"label": "Object Relevance Prediction",
|
| 902 |
"axis_label": "07 Object Relevance Prediction",
|
| 903 |
"short_label": "Objects",
|
| 904 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 905 |
"metric_key": "micro_f1",
|
| 906 |
"metric_name": "micro-F1",
|
| 907 |
"metric_direction": "higher",
|
|
|
|
| 1012 |
"label": "Language Grounding",
|
| 1013 |
"axis_label": "08 Language Grounding",
|
| 1014 |
"short_label": "Language",
|
| 1015 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 1016 |
"metric_key": "mrr",
|
| 1017 |
"metric_name": "MRR",
|
| 1018 |
"metric_direction": "higher",
|
|
|
|
| 1123 |
"label": "Cross-Modal Retrieval",
|
| 1124 |
"axis_label": "09 Cross-Modal Retrieval",
|
| 1125 |
"short_label": "X-modal",
|
| 1126 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 1127 |
"metric_key": "mrr",
|
| 1128 |
"metric_name": "MRR",
|
| 1129 |
"metric_direction": "higher",
|
|
|
|
| 1234 |
"label": "Cross-Modal Reconstruction",
|
| 1235 |
"axis_label": "10 Cross-Modal Reconstruction",
|
| 1236 |
"short_label": "Recon",
|
| 1237 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 1238 |
"metric_key": "r2",
|
| 1239 |
"metric_name": "R2",
|
| 1240 |
"metric_direction": "higher",
|
|
|
|
| 1345 |
"label": "Temporal Order Verification",
|
| 1346 |
"axis_label": "11 Temporal Order Verification",
|
| 1347 |
"short_label": "Order",
|
| 1348 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 1349 |
"metric_key": "f1",
|
| 1350 |
"metric_name": "F1",
|
| 1351 |
"metric_direction": "higher",
|
|
|
|
| 1456 |
"label": "Multimodal Synchronization Detection",
|
| 1457 |
"axis_label": "12 Multimodal Synchronization Detection",
|
| 1458 |
"short_label": "Sync",
|
| 1459 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 1460 |
"metric_key": "f1",
|
| 1461 |
"metric_name": "F1",
|
| 1462 |
"metric_direction": "higher",
|
|
|
|
| 1567 |
"label": "Long-Horizon Next-Action Forecasting",
|
| 1568 |
"axis_label": "13 Long-Horizon Next-Action Forecasting",
|
| 1569 |
"short_label": "Long act",
|
| 1570 |
+
"provenance_source": "historical_result_bundle",
|
| 1571 |
"metric_key": "macro_f1",
|
| 1572 |
"metric_name": "macro-F1",
|
| 1573 |
"metric_direction": "higher",
|
|
|
|
| 1678 |
"label": "Long-Horizon Next-Subtask Forecasting",
|
| 1679 |
"axis_label": "14 Long-Horizon Next-Subtask Forecasting",
|
| 1680 |
"short_label": "Long step",
|
| 1681 |
+
"provenance_source": "historical_result_bundle",
|
| 1682 |
"metric_key": "macro_f1",
|
| 1683 |
"metric_name": "macro-F1",
|
| 1684 |
"metric_direction": "higher",
|
|
|
|
| 1789 |
"label": "Interaction Text Prediction",
|
| 1790 |
"axis_label": "15 Interaction Text Prediction",
|
| 1791 |
"short_label": "Interact txt",
|
| 1792 |
+
"provenance_source": "historical_result_bundle",
|
| 1793 |
"metric_key": "macro_f1",
|
| 1794 |
"metric_name": "macro-F1",
|
| 1795 |
"metric_direction": "higher",
|
|
|
|
| 1900 |
"label": "Action-Object Relation Prediction",
|
| 1901 |
"axis_label": "16 Action-Object Relation Prediction",
|
| 1902 |
"short_label": "Act+obj",
|
| 1903 |
+
"provenance_source": "historical_result_bundle",
|
| 1904 |
"metric_key": "macro_f1",
|
| 1905 |
"metric_name": "macro-F1",
|
| 1906 |
"metric_direction": "higher",
|
|
|
|
| 2011 |
"label": "Future Object-Set Forecasting",
|
| 2012 |
"axis_label": "17 Future Object-Set Forecasting",
|
| 2013 |
"short_label": "Future obj",
|
| 2014 |
+
"provenance_source": "historical_result_bundle",
|
| 2015 |
"metric_key": "micro_f1",
|
| 2016 |
"metric_name": "micro-F1",
|
| 2017 |
"metric_direction": "higher",
|
|
|
|
| 2122 |
"label": "IMU-to-Hand Pose Reconstruction",
|
| 2123 |
"axis_label": "18 IMU-to-Hand Pose Reconstruction",
|
| 2124 |
"short_label": "IMU->hand",
|
| 2125 |
+
"provenance_source": "historical_result_bundle",
|
| 2126 |
"metric_key": "mae",
|
| 2127 |
"metric_name": "MAE",
|
| 2128 |
"metric_direction": "lower",
|
|
|
|
| 2233 |
"label": "Camera-View Synchronization Retrieval",
|
| 2234 |
"axis_label": "19 Camera-View Synchronization Retrieval",
|
| 2235 |
"short_label": "Cam sync",
|
| 2236 |
+
"provenance_source": "historical_result_bundle",
|
| 2237 |
"metric_key": "mrr",
|
| 2238 |
"metric_name": "MRR",
|
| 2239 |
"metric_direction": "higher",
|
|
|
|
| 2344 |
"label": "Time-to-Next-Transition Regression",
|
| 2345 |
"axis_label": "20 Time-to-Next-Transition Regression",
|
| 2346 |
"short_label": "Time2bdry",
|
| 2347 |
+
"provenance_source": "historical_result_bundle",
|
| 2348 |
"metric_key": "mae",
|
| 2349 |
"metric_name": "MAE frames",
|
| 2350 |
"metric_direction": "lower",
|
data/website_integrity.json
CHANGED
|
@@ -1,14 +1,14 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-
|
| 4 |
"docs_root": "docs",
|
| 5 |
"site_base": "/ropedia-xperience-10m-task-suite/",
|
| 6 |
"summary": {
|
| 7 |
"html_pages": 4,
|
| 8 |
-
"local_references":
|
| 9 |
"external_reference_count": 157,
|
| 10 |
"json_files": 55,
|
| 11 |
-
"image_assets_referenced":
|
| 12 |
"failure_count": 0
|
| 13 |
},
|
| 14 |
"failures": {
|
|
@@ -81,7 +81,7 @@
|
|
| 81 |
"status": "pass",
|
| 82 |
"reason": "The project overview should appear before the deeper progress ledger.",
|
| 83 |
"overview_index": 121816,
|
| 84 |
-
"evidence_index":
|
| 85 |
},
|
| 86 |
{
|
| 87 |
"name": "project_status_links_json",
|
|
@@ -161,7 +161,7 @@
|
|
| 161 |
"reason": "The evaluation protocol should appear before the deeper evidence ledger.",
|
| 162 |
"overview_index": 121816,
|
| 163 |
"protocol_index": 163835,
|
| 164 |
-
"evidence_index":
|
| 165 |
},
|
| 166 |
{
|
| 167 |
"name": "evaluation_protocol_links_json",
|
|
@@ -277,8 +277,8 @@
|
|
| 277 |
{
|
| 278 |
"path": "index.html",
|
| 279 |
"id_count": 96,
|
| 280 |
-
"reference_count":
|
| 281 |
-
"image_count":
|
| 282 |
},
|
| 283 |
{
|
| 284 |
"path": "research_roadmap.html",
|
|
@@ -301,7 +301,7 @@
|
|
| 301 |
},
|
| 302 |
{
|
| 303 |
"path": "data/artifact_index.json",
|
| 304 |
-
"bytes":
|
| 305 |
"top_level_type": "dict"
|
| 306 |
},
|
| 307 |
{
|
|
@@ -316,12 +316,12 @@
|
|
| 316 |
},
|
| 317 |
{
|
| 318 |
"path": "data/episode128_task_model_radar.json",
|
| 319 |
-
"bytes":
|
| 320 |
"top_level_type": "dict"
|
| 321 |
},
|
| 322 |
{
|
| 323 |
"path": "data/evaluation_protocol.json",
|
| 324 |
-
"bytes":
|
| 325 |
"top_level_type": "dict"
|
| 326 |
},
|
| 327 |
{
|
|
@@ -331,7 +331,7 @@
|
|
| 331 |
},
|
| 332 |
{
|
| 333 |
"path": "data/figure_index.json",
|
| 334 |
-
"bytes":
|
| 335 |
"top_level_type": "dict"
|
| 336 |
},
|
| 337 |
{
|
|
@@ -351,7 +351,7 @@
|
|
| 351 |
},
|
| 352 |
{
|
| 353 |
"path": "data/live_publication_status.json",
|
| 354 |
-
"bytes":
|
| 355 |
"top_level_type": "dict"
|
| 356 |
},
|
| 357 |
{
|
|
@@ -371,27 +371,27 @@
|
|
| 371 |
},
|
| 372 |
{
|
| 373 |
"path": "data/omni_model_comparison.json",
|
| 374 |
-
"bytes":
|
| 375 |
"top_level_type": "dict"
|
| 376 |
},
|
| 377 |
{
|
| 378 |
"path": "data/project_brief.json",
|
| 379 |
-
"bytes":
|
| 380 |
"top_level_type": "dict"
|
| 381 |
},
|
| 382 |
{
|
| 383 |
"path": "data/project_manifest.json",
|
| 384 |
-
"bytes":
|
| 385 |
"top_level_type": "dict"
|
| 386 |
},
|
| 387 |
{
|
| 388 |
"path": "data/project_packet.json",
|
| 389 |
-
"bytes":
|
| 390 |
"top_level_type": "dict"
|
| 391 |
},
|
| 392 |
{
|
| 393 |
"path": "data/project_status.json",
|
| 394 |
-
"bytes":
|
| 395 |
"top_level_type": "dict"
|
| 396 |
},
|
| 397 |
{
|
|
@@ -401,7 +401,7 @@
|
|
| 401 |
},
|
| 402 |
{
|
| 403 |
"path": "data/public_surface_qa.json",
|
| 404 |
-
"bytes":
|
| 405 |
"top_level_type": "dict"
|
| 406 |
},
|
| 407 |
{
|
|
@@ -441,7 +441,7 @@
|
|
| 441 |
},
|
| 442 |
{
|
| 443 |
"path": "data/reproducibility_matrix.json",
|
| 444 |
-
"bytes":
|
| 445 |
"top_level_type": "dict"
|
| 446 |
},
|
| 447 |
{
|
|
@@ -466,7 +466,7 @@
|
|
| 466 |
},
|
| 467 |
{
|
| 468 |
"path": "data/research_takeaways.json",
|
| 469 |
-
"bytes":
|
| 470 |
"top_level_type": "dict"
|
| 471 |
},
|
| 472 |
{
|
|
@@ -481,7 +481,7 @@
|
|
| 481 |
},
|
| 482 |
{
|
| 483 |
"path": "data/single_episode_task_model_radar.json",
|
| 484 |
-
"bytes":
|
| 485 |
"top_level_type": "dict"
|
| 486 |
},
|
| 487 |
{
|
|
@@ -511,7 +511,7 @@
|
|
| 511 |
},
|
| 512 |
{
|
| 513 |
"path": "data/task_suite_20.json",
|
| 514 |
-
"bytes":
|
| 515 |
"top_level_type": "dict"
|
| 516 |
},
|
| 517 |
{
|
|
@@ -536,7 +536,7 @@
|
|
| 536 |
},
|
| 537 |
{
|
| 538 |
"path": "data/tier2_task_suite.json",
|
| 539 |
-
"bytes":
|
| 540 |
"top_level_type": "dict"
|
| 541 |
},
|
| 542 |
{
|
|
@@ -551,7 +551,7 @@
|
|
| 551 |
},
|
| 552 |
{
|
| 553 |
"path": "data/unified_task_model_radar.json",
|
| 554 |
-
"bytes":
|
| 555 |
"top_level_type": "dict"
|
| 556 |
},
|
| 557 |
{
|
|
@@ -656,13 +656,6 @@
|
|
| 656 |
"format": "SVG",
|
| 657 |
"has_viewbox": true
|
| 658 |
},
|
| 659 |
-
{
|
| 660 |
-
"path": "assets/charts/tier2_task_suite.svg",
|
| 661 |
-
"exists": true,
|
| 662 |
-
"bytes": 5453,
|
| 663 |
-
"format": "SVG",
|
| 664 |
-
"has_viewbox": true
|
| 665 |
-
},
|
| 666 |
{
|
| 667 |
"path": "assets/charts/two_evidence_line_map.svg",
|
| 668 |
"exists": true,
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-21T15:21:58+00:00",
|
| 4 |
"docs_root": "docs",
|
| 5 |
"site_base": "/ropedia-xperience-10m-task-suite/",
|
| 6 |
"summary": {
|
| 7 |
"html_pages": 4,
|
| 8 |
+
"local_references": 256,
|
| 9 |
"external_reference_count": 157,
|
| 10 |
"json_files": 55,
|
| 11 |
+
"image_assets_referenced": 28,
|
| 12 |
"failure_count": 0
|
| 13 |
},
|
| 14 |
"failures": {
|
|
|
|
| 81 |
"status": "pass",
|
| 82 |
"reason": "The project overview should appear before the deeper progress ledger.",
|
| 83 |
"overview_index": 121816,
|
| 84 |
+
"evidence_index": 167655
|
| 85 |
},
|
| 86 |
{
|
| 87 |
"name": "project_status_links_json",
|
|
|
|
| 161 |
"reason": "The evaluation protocol should appear before the deeper evidence ledger.",
|
| 162 |
"overview_index": 121816,
|
| 163 |
"protocol_index": 163835,
|
| 164 |
+
"evidence_index": 167655
|
| 165 |
},
|
| 166 |
{
|
| 167 |
"name": "evaluation_protocol_links_json",
|
|
|
|
| 277 |
{
|
| 278 |
"path": "index.html",
|
| 279 |
"id_count": 96,
|
| 280 |
+
"reference_count": 228,
|
| 281 |
+
"image_count": 34
|
| 282 |
},
|
| 283 |
{
|
| 284 |
"path": "research_roadmap.html",
|
|
|
|
| 301 |
},
|
| 302 |
{
|
| 303 |
"path": "data/artifact_index.json",
|
| 304 |
+
"bytes": 124341,
|
| 305 |
"top_level_type": "dict"
|
| 306 |
},
|
| 307 |
{
|
|
|
|
| 316 |
},
|
| 317 |
{
|
| 318 |
"path": "data/episode128_task_model_radar.json",
|
| 319 |
+
"bytes": 185212,
|
| 320 |
"top_level_type": "dict"
|
| 321 |
},
|
| 322 |
{
|
| 323 |
"path": "data/evaluation_protocol.json",
|
| 324 |
+
"bytes": 24267,
|
| 325 |
"top_level_type": "dict"
|
| 326 |
},
|
| 327 |
{
|
|
|
|
| 331 |
},
|
| 332 |
{
|
| 333 |
"path": "data/figure_index.json",
|
| 334 |
+
"bytes": 19485,
|
| 335 |
"top_level_type": "dict"
|
| 336 |
},
|
| 337 |
{
|
|
|
|
| 351 |
},
|
| 352 |
{
|
| 353 |
"path": "data/live_publication_status.json",
|
| 354 |
+
"bytes": 189990,
|
| 355 |
"top_level_type": "dict"
|
| 356 |
},
|
| 357 |
{
|
|
|
|
| 371 |
},
|
| 372 |
{
|
| 373 |
"path": "data/omni_model_comparison.json",
|
| 374 |
+
"bytes": 82102,
|
| 375 |
"top_level_type": "dict"
|
| 376 |
},
|
| 377 |
{
|
| 378 |
"path": "data/project_brief.json",
|
| 379 |
+
"bytes": 4032,
|
| 380 |
"top_level_type": "dict"
|
| 381 |
},
|
| 382 |
{
|
| 383 |
"path": "data/project_manifest.json",
|
| 384 |
+
"bytes": 5739,
|
| 385 |
"top_level_type": "dict"
|
| 386 |
},
|
| 387 |
{
|
| 388 |
"path": "data/project_packet.json",
|
| 389 |
+
"bytes": 10018,
|
| 390 |
"top_level_type": "dict"
|
| 391 |
},
|
| 392 |
{
|
| 393 |
"path": "data/project_status.json",
|
| 394 |
+
"bytes": 23232,
|
| 395 |
"top_level_type": "dict"
|
| 396 |
},
|
| 397 |
{
|
|
|
|
| 401 |
},
|
| 402 |
{
|
| 403 |
"path": "data/public_surface_qa.json",
|
| 404 |
+
"bytes": 7691,
|
| 405 |
"top_level_type": "dict"
|
| 406 |
},
|
| 407 |
{
|
|
|
|
| 441 |
},
|
| 442 |
{
|
| 443 |
"path": "data/reproducibility_matrix.json",
|
| 444 |
+
"bytes": 6836,
|
| 445 |
"top_level_type": "dict"
|
| 446 |
},
|
| 447 |
{
|
|
|
|
| 466 |
},
|
| 467 |
{
|
| 468 |
"path": "data/research_takeaways.json",
|
| 469 |
+
"bytes": 7165,
|
| 470 |
"top_level_type": "dict"
|
| 471 |
},
|
| 472 |
{
|
|
|
|
| 481 |
},
|
| 482 |
{
|
| 483 |
"path": "data/single_episode_task_model_radar.json",
|
| 484 |
+
"bytes": 51327,
|
| 485 |
"top_level_type": "dict"
|
| 486 |
},
|
| 487 |
{
|
|
|
|
| 511 |
},
|
| 512 |
{
|
| 513 |
"path": "data/task_suite_20.json",
|
| 514 |
+
"bytes": 34805,
|
| 515 |
"top_level_type": "dict"
|
| 516 |
},
|
| 517 |
{
|
|
|
|
| 536 |
},
|
| 537 |
{
|
| 538 |
"path": "data/tier2_task_suite.json",
|
| 539 |
+
"bytes": 33575,
|
| 540 |
"top_level_type": "dict"
|
| 541 |
},
|
| 542 |
{
|
|
|
|
| 551 |
},
|
| 552 |
{
|
| 553 |
"path": "data/unified_task_model_radar.json",
|
| 554 |
+
"bytes": 229035,
|
| 555 |
"top_level_type": "dict"
|
| 556 |
},
|
| 557 |
{
|
|
|
|
| 656 |
"format": "SVG",
|
| 657 |
"has_viewbox": true
|
| 658 |
},
|
| 659 |
{
|
| 660 |
"path": "assets/charts/two_evidence_line_map.svg",
|
| 661 |
"exists": true,
|
index.html
CHANGED
|
@@ -4787,7 +4787,7 @@
|
|
| 4787 |
<article class="artifact"><h3>Split policy</h3><p>Single-episode chronological 70/30 train/test split. This avoids random future-window mixing; cross-episode generalization is measured in the later multi-episode pilot.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/EVALUATION_PROTOCOL.md">protocol document</a></article>
|
| 4788 |
<article class="artifact"><h3>Metric contract</h3><p>All 20 tasks list input, target, primary metric, baseline score, and source artifact path in the unified suite file.</p><a href="data/task_suite_20.json">task_suite_20.json</a></article>
|
| 4789 |
<article class="artifact"><h3>Leakage controls</h3><p>Scalers fit on train windows only; future labels, target-side signals, caption/object labels, and contact labels stay on the target side unless explicitly queried.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/scripts/build_evaluation_protocol.py">builder script</a></article>
|
| 4790 |
-
<article class="artifact"><h3>Audio ablation</h3><p>Audio and no-audio variants are evaluated across the
|
| 4791 |
<article class="artifact"><h3>Foundation track selection</h3><p>Qwen3-Omni is the first trainable baseline, Cosmos 3 is the world-model track with a camera-pose proxy forward-dynamics contract ready for trainer work, policy models wait for robot-compatible action targets, and Xperience-native pretraining remains a later full-corpus goal.</p><a href="data/foundation_model_plan.json">backbone plan</a></article>
|
| 4792 |
<article class="artifact"><h3>Next evaluation stage</h3><p>This public-sample run covers single-episode task development. The selected multi-episode Qwen3-Omni final diagnostic result is verified and meets the JSON-validity target; Cosmos3-Nano has a verified future-window compatibility package; and Cosmos3-Super has a verified base-weight JSON-task evaluation plus a fine-tuned forward-dynamics LoRA branch. The next stage is action/subtask error analysis, stronger model-quality runs, and policy-target conversion.</p><a href="data/omni_model_comparison.json">result comparison</a></article>
|
| 4793 |
<article class="artifact"><h3>128-Episode Task Suite Enhancement Pack</h3><p>Before adding episodes, the suite should try `multiscale_20s10_40s20_80s40`, hierarchical action/subtask targets, label-normalized scoring, and compact raw-feature shards for unsupported tasks.</p><a href="data/task_suite_enhancement_128.json">task_suite_enhancement_128.json</a></article>
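Reviewer note on the hunk above: the "Split policy" and "Leakage controls" cards describe a single-episode chronological 70/30 split with scalers fit on train windows only. The following is a minimal sketch of that idea, assuming windows are already in time order and each window is a flat list of floats. The function names `chronological_split`, `fit_standardizer`, and `apply_standardizer` are hypothetical and are not taken from `scripts/build_evaluation_protocol.py`.

```python
from statistics import mean, pstdev


def chronological_split(windows, train_fraction=0.7):
    """Split time-ordered windows into a leading train part and a trailing test part."""
    cut = round(len(windows) * train_fraction)
    return windows[:cut], windows[cut:]


def fit_standardizer(train_windows):
    """Compute per-feature mean and std from train windows only (no test statistics)."""
    columns = list(zip(*train_windows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant features so scaling never divides by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_standardizer(windows, means, stds):
    """Scale any set of windows with statistics that were fit on train data."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in windows]


# Toy data (assumption): 10 time-ordered windows with 2 features each.
windows = [[float(t), float(2 * t)] for t in range(10)]
train, test = chronological_split(windows)
means, stds = fit_standardizer(train)
train_scaled = apply_standardizer(train, means, stds)
test_scaled = apply_standardizer(test, means, stds)
```

Because the cut is positional rather than random, every test window comes after every train window, which is what prevents future windows from mixing into training.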
|
|
@@ -4824,7 +4824,7 @@
|
|
| 4824 |
<article class="evidence-card">
|
| 4825 |
<span class="status-pill">verified</span>
|
| 4826 |
<h3>Audio contribution is measured task by task</h3>
|
| 4827 |
-
<p>Audio variants improve the primary metric on 6
|
| 4828 |
<div class="evidence-links">
|
| 4829 |
<a href="data/audio_ablation_summary.json">audio summary</a>
|
| 4830 |
<a href="assets/charts/audio_ablation_delta.svg">delta chart</a>
|
|
@@ -5463,7 +5463,7 @@
|
|
| 5463 |
<section id="directions" data-project-tab="directions" role="tabpanel" aria-labelledby="tab-directions" tabindex="-1">
|
| 5464 |
<div class="wrap">
|
| 5465 |
<div class="section-head">
|
| 5466 |
-
<h2>The
|
| 5467 |
<p>Each task is mapped as direct, proxy, or diagnostic evidence for the Ropedia research tracks. The mapping uses two current baselines: minimal interpretable heads and neural MLP heads over the same feature contract.</p>
|
| 5468 |
</div>
|
| 5469 |
<div class="direction-grid">
|
|
@@ -5510,76 +5510,18 @@
|
|
| 5510 |
<div class="wrap">
|
| 5511 |
<div class="section-head">
|
| 5512 |
<h2>Unified 20-task evidence and provenance.</h2>
|
| 5513 |
-
<p>All 20 tasks
|
| 5514 |
-
</div>
|
| 5515 |
-
<img class="chart" src="assets/charts/tier2_task_suite.svg?v=xperience10m-tier2" alt="Historical additional-task provenance chart for the unified Xperience-10M 20-task suite">
|
| 5516 |
-
<div class="extension-grid">
|
| 5517 |
-
<article class="extension-card">
|
| 5518 |
-
<span class="status-pill">Task 13 / forecast</span>
|
| 5519 |
-
<h3>Long-Horizon Next-Action Forecasting</h3>
|
| 5520 |
-
<p><strong>Input:</strong> current non-caption multimodal window.</p>
|
| 5521 |
-
<p><strong>Output:</strong> action label five seconds later.</p>
|
| 5522 |
-
<div class="extension-metrics"><span><strong>0.0750</strong>minimal macro-F1</span><span><strong>0.0655</strong>neural macro-F1</span></div>
|
| 5523 |
-
</article>
|
| 5524 |
-
<article class="extension-card">
|
| 5525 |
-
<span class="status-pill">Task 14 / procedure</span>
|
| 5526 |
-
<h3>Long-Horizon Next-Subtask Forecasting</h3>
|
| 5527 |
-
<p><strong>Input:</strong> current non-caption multimodal window.</p>
|
| 5528 |
-
<p><strong>Output:</strong> procedure subtask five seconds later.</p>
|
| 5529 |
-
<div class="extension-metrics"><span><strong>0.0455</strong>minimal macro-F1</span><span><strong>0.0507</strong>neural macro-F1</span></div>
|
| 5530 |
-
</article>
|
| 5531 |
-
<article class="extension-card">
|
| 5532 |
-
<span class="status-pill">Task 15 / language</span>
|
| 5533 |
-
<h3>Interaction Text Prediction</h3>
|
| 5534 |
-
<p><strong>Input:</strong> current sensor window with caption features removed.</p>
|
| 5535 |
-
<p><strong>Output:</strong> raw annotation interaction phrase.</p>
|
| 5536 |
-
<div class="extension-metrics"><span><strong>0.0444</strong>minimal macro-F1</span><span><strong>0.0381</strong>neural macro-F1</span></div>
|
| 5537 |
-
</article>
|
| 5538 |
-
<article class="extension-card">
|
| 5539 |
-
<span class="status-pill">Task 16 / relation</span>
|
| 5540 |
-
<h3>Action-Object Relation Prediction</h3>
|
| 5541 |
-
<p><strong>Input:</strong> current sensor window with caption features removed.</p>
|
| 5542 |
-
<p><strong>Output:</strong> joint action plus active object-set label.</p>
|
| 5543 |
-
<div class="extension-metrics"><span><strong>0.0000</strong>minimal macro-F1</span><span><strong>0.0000</strong>neural macro-F1</span></div>
|
| 5544 |
-
</article>
|
| 5545 |
-
<article class="extension-card">
|
| 5546 |
-
<span class="status-pill">Task 17 / objects</span>
|
| 5547 |
-
<h3>Future Object-Set Forecasting</h3>
|
| 5548 |
-
<p><strong>Input:</strong> current sensor window with caption features removed.</p>
|
| 5549 |
-
<p><strong>Output:</strong> object set active five seconds later.</p>
|
| 5550 |
-
<div class="extension-metrics"><span><strong>0.1694</strong>minimal micro-F1</span><span><strong>0.1972</strong>neural micro-F1</span></div>
|
| 5551 |
-
</article>
|
| 5552 |
-
<article class="extension-card">
|
| 5553 |
-
<span class="status-pill">Task 18 / sensor bridge</span>
|
| 5554 |
-
<h3>IMU-to-Hand Pose Reconstruction</h3>
|
| 5555 |
-
<p><strong>Input:</strong> IMU acceleration and gyroscope features only.</p>
|
| 5556 |
-
<p><strong>Output:</strong> current left/right hand joint feature blocks.</p>
|
| 5557 |
-
<div class="extension-metrics"><span><strong>0.0420</strong>minimal MAE</span><span><strong>0.0426</strong>neural MAE</span></div>
|
| 5558 |
-
</article>
|
| 5559 |
-
<article class="extension-card">
|
| 5560 |
-
<span class="status-pill">Task 19 / camera sync</span>
|
| 5561 |
-
<h3>Camera-View Synchronization Retrieval</h3>
|
| 5562 |
-
<p><strong>Input:</strong> fisheye camera-1 feature query.</p>
|
| 5563 |
-
<p><strong>Output:</strong> synchronized fisheye camera-3 window rank.</p>
|
| 5564 |
-
<div class="extension-metrics"><span><strong>0.4943</strong>minimal MRR</span><span><strong>0.2409</strong>neural MRR</span></div>
|
| 5565 |
-
</article>
|
| 5566 |
-
<article class="extension-card">
|
| 5567 |
-
<span class="status-pill">Task 20 / timing</span>
|
| 5568 |
-
<h3>Time-to-Next-Transition Regression</h3>
|
| 5569 |
-
<p><strong>Input:</strong> current non-caption multimodal window.</p>
|
| 5570 |
-
<p><strong>Output:</strong> capped frames until the next action boundary.</p>
|
| 5571 |
-
<div class="extension-metrics"><span><strong>10.5374</strong>minimal MAE frames</span><span><strong>10.5545</strong>neural MAE frames</span></div>
|
| 5572 |
-
</article>
|
| 5573 |
</div>
|
| 5574 |
<div class="callout-row">
|
| 5575 |
<div class="callout">
|
| 5576 |
<h3>Unified task artifact package</h3>
|
| 5577 |
-
<p>The public task package has
|
| 5578 |
-
<p><a href="data/task_suite_20.json">Open unified 20-task JSON</a> · <a href="data/
|
| 5579 |
</div>
|
| 5580 |
<div class="callout">
|
| 5581 |
<h3>One setup, one task surface</h3>
|
| 5582 |
<p>Every task uses the same 20-frame window unit, 5-frame stride, 8,546-dimensional feature manifest, chronological split discipline, and minimal/neural comparison pattern unless a task-specific leakage rule removes target-side features.</p>
|
|
|
|
| 5583 |
</div>
|
| 5584 |
</div>
|
| 5585 |
<img class="chart" src="assets/charts/research_direction_extension_tasks.svg?v=xperience10m-ext" alt="Four Xperience-10M research-direction extension probes with minimal and neural metrics">
|
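The retrieval cards in this hunk (for example Task 19, Camera-View Synchronization Retrieval) report MRR. As a reading aid, here is a minimal sketch of how mean reciprocal rank is conventionally computed from 1-based ranks. The function name and the example ranks are hypothetical and are not taken from the repository's scripts.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over queries.

    Each rank is the 1-based position of the correct item in the
    sorted candidate list for one query (assumption of this sketch).
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks for three queries: correct item found at
# positions 1, 2, and 4.
example_ranks = [1, 2, 4]
print(mean_reciprocal_rank(example_ranks))
```

An MRR near 0.5 therefore means the synchronized window lands, on average, around the second position, although the mean of reciprocals is dominated by queries with rank 1.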
|
@@ -5633,7 +5575,7 @@
|
|
| 5633 |
<section id="architectures" data-project-tab="method" role="tabpanel" aria-labelledby="tab-method" tabindex="-1">
|
| 5634 |
<div class="wrap">
|
| 5635 |
<div class="section-head">
|
| 5636 |
-
<h2>The
|
| 5637 |
<p>The diagram separates the shared episode-window representation from the task-specific heads, so the task contracts stay readable before scaling to larger models.</p>
|
| 5638 |
</div>
|
| 5639 |
<img class="architecture-image" src="assets/task_architectures.png?v=xperience10m-nn" alt="Verified minimal and neural architecture diagram for Ropedia Xperience-10M task heads">
|
|
@@ -5732,7 +5674,7 @@
|
|
| 5732 |
<img class="chart" src="assets/charts/cross_modal_retrieval.svg" alt="Cross modal retrieval chart">
|
| 5733 |
<img class="chart" src="assets/charts/episode_task_scores_neural_mlp.svg" alt="Neural MLP task score chart">
|
| 5734 |
<img class="chart" src="assets/charts/episode_task_scores_minimal_vs_neural.svg" alt="Minimal versus neural score chart">
|
| 5735 |
-
<img class="chart" src="assets/charts/audio_ablation_delta.svg" alt="Measured audio delta chart across
|
| 5736 |
</div>
|
| 5737 |
<p class="section-note"><a href="single_episode_explorer.html">Open the single-episode explorer</a> to inspect window-level labels, predictions, modality statistics, object labels, and diagnostic scores. The <a href="data/audio_ablation_summary.json">audio ablation summary</a> records the task-by-task audio contribution.</p>
|
| 5738 |
</div>
|
|
@@ -5861,9 +5803,9 @@
|
|
| 5861 |
<article class="artifact"><h3>Windows table</h3><p>Window start/end frames and aligned action/subtask labels for the public sample episode.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/windows.csv">window table</a></article>
|
| 5862 |
<article class="artifact"><h3>Feature inputs</h3><p>Source map for the current modality inputs used by the task suite.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/feature_manifest.json">feature inputs</a></article>
|
| 5863 |
<article class="artifact"><h3>Neural MLP task results</h3><p>Per-task PyTorch MLP metrics, predictions, histories, and checkpoints for the unified task contracts, with historical result-bundle paths retained for provenance.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/neural_mlp">neural MLP outputs</a></article>
|
| 5864 |
-
<article class="artifact"><h3>Four-direction taxonomy</h3><p>Maps the
|
| 5865 |
<article class="artifact"><h3>Direction extension probes</h3><p>Four coded probes, one per research direction, with minimal and neural metrics plus prediction/rank CSVs.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/research_direction_extensions">extension probe outputs</a></article>
|
| 5866 |
-
<article class="artifact"><h3>Task walkthroughs</h3><p>Case studies for the
|
| 5867 |
<article class="artifact"><h3>Audio ablation and raw upgrade</h3><p>All 72 task/variant rows comparing current audio, no audio, raw audio, replacement, and combined-input settings.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/audio_ablation">audio ablation outputs</a></article>
|
| 5868 |
<article class="artifact"><h3>Single-episode explorer</h3><p>Interactive window-level view of labels, predictions, modality statistics, object labels, and diagnostics.</p><a href="single_episode_explorer.html">open explorer</a></article>
|
| 5869 |
<article class="artifact"><h3>Cross-modal retrieval</h3><p>The strongest self-supervised signal from the single episode.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/cross_modal_retrieval/metrics.json">retrieval metrics</a></article>
|
|
@@ -5917,7 +5859,7 @@
|
|
| 5917 |
<div class="artifact-grid">
|
| 5918 |
<article class="artifact"><h3>Project brief</h3><p>The fastest written overview of the dataset sample, tasks, baselines, and scale-up plan.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/PROJECT_BRIEF.md">brief</a></article>
|
| 5919 |
<article class="artifact"><h3>Glossary</h3><p>Plain-language definitions for the terms most likely to confuse first-time readers and reviewers.</p><a href="data/glossary.json">glossary</a></article>
|
| 5920 |
-
<article class="artifact"><h3>Task walkthroughs</h3><p>Human-readable case studies for the
|
| 5921 |
<article class="artifact"><h3>Task results</h3><p>Minimal and neural-head metrics for the same sample windows and chronological split.</p><a href="data/summary_metrics.json">metrics</a></article>
|
| 5922 |
<article class="artifact"><h3>Visual figures</h3><p>Task-suite map, modality atlas, pipeline diagram, model architecture figure, and Qwen3-Omni LoRA training-flow figure.</p><a href="assets/task_suite_infographic.png">task-suite figure</a></article>
|
| 5923 |
<article class="artifact"><h3>Dataset notes</h3><p>Official dataset links, public sample source, modalities, access boundary, and current project subset.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/XPERIENCE10M_DATASET_CARD_ALIGNMENT.md">dataset notes</a></article>
|
|
|
|
| 4787 |
<article class="artifact"><h3>Split policy</h3><p>Single-episode chronological 70/30 train/test split. This avoids random future-window mixing; cross-episode generalization is measured in the later multi-episode pilot.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/EVALUATION_PROTOCOL.md">protocol document</a></article>
|
| 4788 |
<article class="artifact"><h3>Metric contract</h3><p>All 20 tasks list input, target, primary metric, baseline score, and source artifact path in the unified suite file.</p><a href="data/task_suite_20.json">task_suite_20.json</a></article>
|
| 4789 |
<article class="artifact"><h3>Leakage controls</h3><p>Scalers fit on train windows only; future labels, target-side signals, caption/object labels, and contact labels stay on the target side unless explicitly queried.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/scripts/build_evaluation_protocol.py">builder script</a></article>
|
| 4790 |
+
<article class="artifact"><h3>Audio ablation</h3><p>Audio and no-audio variants are evaluated across the walkthrough-backed task contracts under the same chronological split.</p><a href="data/audio_ablation_summary.json">audio summary</a></article>
|
| 4791 |
<article class="artifact"><h3>Foundation track selection</h3><p>Qwen3-Omni is the first trainable baseline, Cosmos 3 is the world-model track with a camera-pose proxy forward-dynamics contract ready for trainer work, policy models wait for robot-compatible action targets, and Xperience-native pretraining remains a later full-corpus goal.</p><a href="data/foundation_model_plan.json">backbone plan</a></article>
|
| 4792 |
<article class="artifact"><h3>Next evaluation stage</h3><p>This public-sample run covers single-episode task development. The selected multi-episode Qwen3-Omni final diagnostic result is verified and meets the JSON-validity target; Cosmos3-Nano has a verified future-window compatibility package; and Cosmos3-Super has a verified base-weight JSON-task evaluation plus a fine-tuned forward-dynamics LoRA branch. The next stage is action/subtask error analysis, stronger model-quality runs, and policy-target conversion.</p><a href="data/omni_model_comparison.json">result comparison</a></article>
|
| 4793 |
<article class="artifact"><h3>128-Episode Task Suite Enhancement Pack</h3><p>Before adding episodes, the suite should try `multiscale_20s10_40s20_80s40`, hierarchical action/subtask targets, label-normalized scoring, and compact raw-feature shards for unsupported tasks.</p><a href="data/task_suite_enhancement_128.json">task_suite_enhancement_128.json</a></article>
|
|
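The split-policy and leakage-control cards above describe a 20-frame window with a 5-frame stride, a chronological 70/30 split, and scalers fit on train windows only. The following is a minimal sketch of that discipline under stated assumptions: all function names are hypothetical, the standardizer is a plain mean/std scaler chosen for illustration, and the sketch does not add a gap between overlapping windows at the split point. The repository's `scripts/build_evaluation_protocol.py` may differ.

```python
import statistics
from typing import List, Sequence, Tuple


def window_starts(n_frames: int, window: int = 20, stride: int = 5) -> List[int]:
    """Start indices of every full window that fits inside n_frames."""
    if n_frames < window:
        return []
    return list(range(0, n_frames - window + 1, stride))


def chronological_split(n_windows: int, train_fraction: float = 0.7) -> Tuple[range, range]:
    """Earlier windows go to train, later windows to test (no shuffling)."""
    cut = int(n_windows * train_fraction)
    return range(0, cut), range(cut, n_windows)


def fit_standardizer(train_rows: Sequence[Sequence[float]]):
    """Per-column mean and std computed from train rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(c) for c in columns]
    # Guard against constant columns: fall back to 1.0 to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, stds


def apply_standardizer(rows, means, stds):
    """Scale any rows (train or test) with statistics fit on train."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


starts = window_starts(100)
train_idx, test_idx = chronological_split(len(starts))
print(len(starts), list(train_idx)[-1], list(test_idx)[0])
```

Fitting the scaler before the split, or on all windows, would let test-side statistics reach the model, which is the leakage the card rules out.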
|
|
| 4824 |
<article class="evidence-card">
|
| 4825 |
<span class="status-pill">verified</span>
|
| 4826 |
<h3>Audio contribution is measured task by task</h3>
|
| 4827 |
+
<p>Audio variants improve the primary metric on 6 walkthrough-backed task contracts in this single-episode setting.</p>
|
| 4828 |
<div class="evidence-links">
|
| 4829 |
<a href="data/audio_ablation_summary.json">audio summary</a>
|
| 4830 |
<a href="assets/charts/audio_ablation_delta.svg">delta chart</a>
|
|
|
|
| 5463 |
<section id="directions" data-project-tab="directions" role="tabpanel" aria-labelledby="tab-directions" tabindex="-1">
|
| 5464 |
<div class="wrap">
|
| 5465 |
<div class="section-head">
|
| 5466 |
+
<h2>The walkthrough-backed tasks organized into four research directions.</h2>
|
| 5467 |
<p>Each task is mapped as direct, proxy, or diagnostic evidence for the Ropedia research tracks. The mapping uses two current baselines: minimal interpretable heads and neural MLP heads over the same feature contract.</p>
|
| 5468 |
</div>
|
| 5469 |
<div class="direction-grid">
|
|
|
|
| 5510 |
<div class="wrap">
|
| 5511 |
<div class="section-head">
|
| 5512 |
<h2>Unified 20-task evidence and provenance.</h2>
|
| 5513 |
+
<p>All 20 tasks live in the same task table, task-card grid, radar, and 180-record result matrix. Historical result paths are retained only for exact provenance links.</p>
|
| 5514 |
</div>
|
| 5515 |
<div class="callout-row">
|
| 5516 |
<div class="callout">
|
| 5517 |
<h3>Unified task artifact package</h3>
|
| 5518 |
+
<p>The public task package has one 20-task JSON, per-task metrics, prediction/rank files, Markdown summaries, radar charts, and the 180-record method-task matrix.</p>
|
| 5519 |
+
<p><a href="data/task_suite_20.json">Open unified 20-task JSON</a> · <a href="data/task_method_20_result_matrix.json">Open 180-record matrix</a> · <a href="assets/charts/unified_task_model_radar.svg">Open unified radar</a></p>
|
| 5520 |
</div>
|
| 5521 |
<div class="callout">
|
| 5522 |
<h3>One setup, one task surface</h3>
|
| 5523 |
<p>Every task uses the same 20-frame window unit, 5-frame stride, 8,546-dimensional feature manifest, chronological split discipline, and minimal/neural comparison pattern unless a task-specific leakage rule removes target-side features.</p>
|
| 5524 |
+
<p><a href="data/tier2_task_suite.json">Historical provenance JSON</a> and <a href="assets/charts/tier2_task_suite.svg">historical provenance chart</a> remain available for exact source tracing.</p>
|
| 5525 |
</div>
|
| 5526 |
</div>
|
| 5527 |
<img class="chart" src="assets/charts/research_direction_extension_tasks.svg?v=xperience10m-ext" alt="Four Xperience-10M research-direction extension probes with minimal and neural metrics">
|
|
|
|
| 5575 |
<section id="architectures" data-project-tab="method" role="tabpanel" aria-labelledby="tab-method" tabindex="-1">
|
| 5576 |
<div class="wrap">
|
| 5577 |
<div class="section-head">
|
| 5578 |
+
<h2>The baseline task heads share four head families.</h2>
|
| 5579 |
<p>The diagram separates the shared episode-window representation from the task-specific heads, so the task contracts stay readable before scaling to larger models.</p>
|
| 5580 |
</div>
|
| 5581 |
<img class="architecture-image" src="assets/task_architectures.png?v=xperience10m-nn" alt="Verified minimal and neural architecture diagram for Ropedia Xperience-10M task heads">
|
|
|
|
| 5674 |
<img class="chart" src="assets/charts/cross_modal_retrieval.svg" alt="Cross modal retrieval chart">
|
| 5675 |
<img class="chart" src="assets/charts/episode_task_scores_neural_mlp.svg" alt="Neural MLP task score chart">
|
| 5676 |
<img class="chart" src="assets/charts/episode_task_scores_minimal_vs_neural.svg" alt="Minimal versus neural score chart">
|
| 5677 |
+
<img class="chart" src="assets/charts/audio_ablation_delta.svg" alt="Measured audio delta chart across walkthrough-backed task contracts">
|
| 5678 |
</div>
|
| 5679 |
<p class="section-note"><a href="single_episode_explorer.html">Open the single-episode explorer</a> to inspect window-level labels, predictions, modality statistics, object labels, and diagnostic scores. The <a href="data/audio_ablation_summary.json">audio ablation summary</a> records the task-by-task audio contribution.</p>
|
| 5680 |
</div>
|
|
|
|
| 5803 |
<article class="artifact"><h3>Windows table</h3><p>Window start/end frames and aligned action/subtask labels for the public sample episode.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/windows.csv">window table</a></article>
|
| 5804 |
<article class="artifact"><h3>Feature inputs</h3><p>Source map for the current modality inputs used by the task suite.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/feature_manifest.json">feature inputs</a></article>
|
| 5805 |
<article class="artifact"><h3>Neural MLP task results</h3><p>Per-task PyTorch MLP metrics, predictions, histories, and checkpoints for the unified task contracts, with historical result-bundle paths retained for provenance.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/neural_mlp">neural MLP outputs</a></article>
|
| 5806 |
+
<article class="artifact"><h3>Four-direction taxonomy</h3><p>Maps the walkthrough-backed task contracts to the four research tracks: human modeling, 3D/4D reconstruction, egocentric interaction, and world modeling.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/research_directions">research direction outputs</a></article>
|
| 5807 |
<article class="artifact"><h3>Direction extension probes</h3><p>Four coded probes, one per research direction, with minimal and neural metrics plus prediction/rank CSVs.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/research_direction_extensions">extension probe outputs</a></article>
|
| 5808 |
+
<article class="artifact"><h3>Task walkthroughs</h3><p>Case studies for the walkthrough-backed task contracts, including input, intermediate process modules, output, metric, limitation, and task-player data.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/episode_task_suite/task_walkthroughs">walkthrough outputs</a></article>
|
| 5809 |
<article class="artifact"><h3>Audio ablation and raw upgrade</h3><p>All 72 task/variant rows comparing current audio, no audio, raw audio, replacement, and combined-input settings.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/tree/main/results/audio_ablation">audio ablation outputs</a></article>
|
| 5810 |
<article class="artifact"><h3>Single-episode explorer</h3><p>Interactive window-level view of labels, predictions, modality statistics, object labels, and diagnostics.</p><a href="single_episode_explorer.html">open explorer</a></article>
|
| 5811 |
<article class="artifact"><h3>Cross-modal retrieval</h3><p>The strongest self-supervised signal from the single episode.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/episode_task_suite/cross_modal_retrieval/metrics.json">retrieval metrics</a></article>
|
|
|
|
| 5859 |
<div class="artifact-grid">
|
| 5860 |
<article class="artifact"><h3>Project brief</h3><p>The fastest written overview of the dataset sample, tasks, baselines, and scale-up plan.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/PROJECT_BRIEF.md">brief</a></article>
|
| 5861 |
<article class="artifact"><h3>Glossary</h3><p>Plain-language definitions for the terms most likely to confuse first-time readers and reviewers.</p><a href="data/glossary.json">glossary</a></article>
|
| 5862 |
+
<article class="artifact"><h3>Task walkthroughs</h3><p>Human-readable case studies for the walkthrough-backed task contracts, including input, process modules, output, metric, and limitation.</p><a href="data/task_walkthroughs.json">walkthroughs</a></article>
|
| 5863 |
<article class="artifact"><h3>Task results</h3><p>Minimal and neural-head metrics for the same sample windows and chronological split.</p><a href="data/summary_metrics.json">metrics</a></article>
|
| 5864 |
<article class="artifact"><h3>Visual figures</h3><p>Task-suite map, modality atlas, pipeline diagram, model architecture figure, and Qwen3-Omni LoRA training-flow figure.</p><a href="assets/task_suite_infographic.png">task-suite figure</a></article>
|
| 5865 |
<article class="artifact"><h3>Dataset notes</h3><p>Official dataset links, public sample source, modalities, access boundary, and current project subset.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/XPERIENCE10M_DATASET_CARD_ALIGNMENT.md">dataset notes</a></article>
|
metrics/episode128_task_model_radar.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
{
|
| 2 |
"title": "128-Episode 20-Task Radar",
|
| 3 |
"status": "pass",
|
| 4 |
-
"generated_at_utc": "2026-06-
|
| 5 |
"description": "Selected 128-episode metadata/raw baselines plus verified Qwen3-Omni v6, Cosmos3-Super, and Cosmos3-Nano diagnostics. Every method has 20 records; numeric scores appear only where the public artifact produced that task target.",
|
| 6 |
"task_count": 20,
|
| 7 |
"method_count": 7,
|
|
@@ -192,7 +192,7 @@
|
|
| 192 |
"label": "Action Recognition",
|
| 193 |
"axis_label": "01 Action Recognition",
|
| 194 |
"short_label": "Action",
|
| 195 |
-
"
|
| 196 |
"metric_key": "macro_f1",
|
| 197 |
"metric_name": "macro-F1",
|
| 198 |
"metric_direction": "higher",
|
|
@@ -283,7 +283,7 @@
|
|
| 283 |
"label": "Procedure Step Recognition",
|
| 284 |
"axis_label": "02 Procedure Step Recognition",
|
| 285 |
"short_label": "Step",
|
| 286 |
-
"
|
| 287 |
"metric_key": "macro_f1",
|
| 288 |
"metric_name": "macro-F1",
|
| 289 |
"metric_direction": "higher",
|
|
@@ -374,7 +374,7 @@
|
|
| 374 |
"label": "Action Boundary Detection",
|
| 375 |
"axis_label": "03 Action Boundary Detection",
|
| 376 |
"short_label": "Boundary",
|
| 377 |
-
"
|
| 378 |
"metric_key": "macro_f1",
|
| 379 |
"metric_name": "macro-F1",
|
| 380 |
"metric_direction": "higher",
|
|
@@ -465,7 +465,7 @@
|
|
| 465 |
"label": "Next-Action Prediction",
|
| 466 |
"axis_label": "04 Next-Action Prediction",
|
| 467 |
"short_label": "Next act",
|
| 468 |
-
"
|
| 469 |
"metric_key": "macro_f1",
|
| 470 |
"metric_name": "macro-F1",
|
| 471 |
"metric_direction": "higher",
|
|
@@ -556,7 +556,7 @@
|
|
| 556 |
"label": "Hand Trajectory Forecasting",
|
| 557 |
"axis_label": "05 Hand Trajectory Forecasting",
|
| 558 |
"short_label": "Hand traj",
|
| 559 |
-
"
|
| 560 |
"metric_key": "mpjpe",
|
| 561 |
"metric_name": "MPJPE",
|
| 562 |
"metric_direction": "lower",
|
|
@@ -647,7 +647,7 @@
|
|
| 647 |
"label": "Contact State Prediction",
|
| 648 |
"axis_label": "06 Contact State Prediction",
|
| 649 |
"short_label": "Contact",
|
| 650 |
-
"
|
| 651 |
"metric_key": "macro_f1",
|
| 652 |
"metric_name": "macro-F1",
|
| 653 |
"metric_direction": "higher",
|
|
@@ -738,7 +738,7 @@
|
|
| 738 |
"label": "Object Relevance Prediction",
|
| 739 |
"axis_label": "07 Object Relevance Prediction",
|
| 740 |
"short_label": "Objects",
|
| 741 |
-
"
|
| 742 |
"metric_key": "micro_f1",
|
| 743 |
"metric_name": "micro-F1",
|
| 744 |
"metric_direction": "higher",
|
|
@@ -829,7 +829,7 @@
|
|
| 829 |
"label": "Language Grounding",
|
| 830 |
"axis_label": "08 Language Grounding",
|
| 831 |
"short_label": "Language",
|
| 832 |
-
"
|
| 833 |
"metric_key": "mrr",
|
| 834 |
"metric_name": "MRR",
|
| 835 |
"metric_direction": "higher",
|
|
@@ -920,7 +920,7 @@
|
|
| 920 |
"label": "Cross-Modal Retrieval",
|
| 921 |
"axis_label": "09 Cross-Modal Retrieval",
|
| 922 |
"short_label": "X-modal",
|
| 923 |
-
"
|
| 924 |
"metric_key": "mrr",
|
| 925 |
"metric_name": "MRR",
|
| 926 |
"metric_direction": "higher",
|
|
@@ -1011,7 +1011,7 @@
|
|
| 1011 |
"label": "Cross-Modal Reconstruction",
|
| 1012 |
"axis_label": "10 Cross-Modal Reconstruction",
|
| 1013 |
"short_label": "Recon",
|
| 1014 |
-
"
|
| 1015 |
"metric_key": "r2",
|
| 1016 |
"metric_name": "R2",
|
| 1017 |
"metric_direction": "higher",
|
|
@@ -1102,7 +1102,7 @@
|
|
| 1102 |
"label": "Temporal Order Verification",
|
| 1103 |
"axis_label": "11 Temporal Order Verification",
|
| 1104 |
"short_label": "Order",
|
| 1105 |
-
"
|
| 1106 |
"metric_key": "f1",
|
| 1107 |
"metric_name": "F1",
|
| 1108 |
"metric_direction": "higher",
|
|
@@ -1193,7 +1193,7 @@
|
|
| 1193 |
"label": "Multimodal Synchronization Detection",
|
| 1194 |
"axis_label": "12 Multimodal Synchronization Detection",
|
| 1195 |
"short_label": "Sync",
|
| 1196 |
-
"
|
| 1197 |
"metric_key": "f1",
|
| 1198 |
"metric_name": "F1",
|
| 1199 |
"metric_direction": "higher",
|
|
@@ -1284,7 +1284,7 @@
|
|
| 1284 |
"label": "Long-Horizon Next-Action Forecasting",
|
| 1285 |
"axis_label": "13 Long-Horizon Next-Action Forecasting",
|
| 1286 |
"short_label": "Long act",
|
| 1287 |
-
"
|
| 1288 |
"metric_key": "macro_f1",
|
| 1289 |
"metric_name": "macro-F1",
|
| 1290 |
"metric_direction": "higher",
|
|
@@ -1375,7 +1375,7 @@
|
|
| 1375 |
"label": "Long-Horizon Next-Subtask Forecasting",
|
| 1376 |
"axis_label": "14 Long-Horizon Next-Subtask Forecasting",
|
| 1377 |
"short_label": "Long step",
|
| 1378 |
-
"
|
| 1379 |
"metric_key": "macro_f1",
|
| 1380 |
"metric_name": "macro-F1",
|
| 1381 |
"metric_direction": "higher",
|
|
@@ -1466,7 +1466,7 @@
|
|
| 1466 |
"label": "Interaction Text Prediction",
|
| 1467 |
"axis_label": "15 Interaction Text Prediction",
|
| 1468 |
"short_label": "Interact txt",
|
| 1469 |
-
"
|
| 1470 |
"metric_key": "macro_f1",
|
| 1471 |
"metric_name": "macro-F1",
|
| 1472 |
"metric_direction": "higher",
|
|
@@ -1557,7 +1557,7 @@
|
|
| 1557 |
"label": "Action-Object Relation Prediction",
|
| 1558 |
"axis_label": "16 Action-Object Relation Prediction",
|
| 1559 |
"short_label": "Act+obj",
|
| 1560 |
-
"
|
| 1561 |
"metric_key": "macro_f1",
|
| 1562 |
"metric_name": "macro-F1",
|
| 1563 |
"metric_direction": "higher",
|
|
@@ -1648,7 +1648,7 @@
|
|
| 1648 |
"label": "Future Object-Set Forecasting",
|
| 1649 |
"axis_label": "17 Future Object-Set Forecasting",
|
| 1650 |
"short_label": "Future obj",
|
| 1651 |
-
"
|
| 1652 |
"metric_key": "micro_f1",
|
| 1653 |
"metric_name": "micro-F1",
|
| 1654 |
"metric_direction": "higher",
|
|
@@ -1739,7 +1739,7 @@
|
|
| 1739 |
"label": "IMU-to-Hand Pose Reconstruction",
|
| 1740 |
"axis_label": "18 IMU-to-Hand Pose Reconstruction",
|
| 1741 |
"short_label": "IMU->hand",
|
| 1742 |
-
"
|
| 1743 |
"metric_key": "mae",
|
| 1744 |
"metric_name": "MAE",
|
| 1745 |
"metric_direction": "lower",
|
|
@@ -1830,7 +1830,7 @@
|
|
| 1830 |
"label": "Camera-View Synchronization Retrieval",
|
| 1831 |
"axis_label": "19 Camera-View Synchronization Retrieval",
|
| 1832 |
"short_label": "Cam sync",
|
| 1833 |
-
"
|
| 1834 |
"metric_key": "mrr",
|
| 1835 |
"metric_name": "MRR",
|
| 1836 |
"metric_direction": "higher",
|
|
@@ -1921,7 +1921,7 @@
|
|
| 1921 |
"label": "Time-to-Next-Transition Regression",
|
| 1922 |
"axis_label": "20 Time-to-Next-Transition Regression",
|
| 1923 |
"short_label": "Time2bdry",
|
| 1924 |
-
"
|
| 1925 |
"metric_key": "mae",
|
| 1926 |
"metric_name": "MAE frames",
|
| 1927 |
"metric_direction": "lower",
|
|
|
|
| 1 |
{
|
| 2 |
"title": "128-Episode 20-Task Radar",
|
| 3 |
"status": "pass",
|
| 4 |
+
"generated_at_utc": "2026-06-21T15:20:34+00:00",
|
| 5 |
"description": "Selected 128-episode metadata/raw baselines plus verified Qwen3-Omni v6, Cosmos3-Super, and Cosmos3-Nano diagnostics. Every method has 20 records; numeric scores appear only where the public artifact produced that task target.",
|
| 6 |
"task_count": 20,
|
| 7 |
"method_count": 7,
|
|
|
|
| 192 |
"label": "Action Recognition",
|
| 193 |
"axis_label": "01 Action Recognition",
|
| 194 |
"short_label": "Action",
|
| 195 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 196 |
"metric_key": "macro_f1",
|
| 197 |
"metric_name": "macro-F1",
|
| 198 |
"metric_direction": "higher",
|
|
|
|
| 283 |
"label": "Procedure Step Recognition",
|
| 284 |
"axis_label": "02 Procedure Step Recognition",
|
| 285 |
"short_label": "Step",
|
| 286 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 287 |
"metric_key": "macro_f1",
|
| 288 |
"metric_name": "macro-F1",
|
| 289 |
"metric_direction": "higher",
|
|
|
|
| 374 |
"label": "Action Boundary Detection",
|
| 375 |
"axis_label": "03 Action Boundary Detection",
|
| 376 |
"short_label": "Boundary",
|
| 377 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 378 |
"metric_key": "macro_f1",
|
| 379 |
"metric_name": "macro-F1",
|
| 380 |
"metric_direction": "higher",
|
|
|
|
| 465 |
"label": "Next-Action Prediction",
|
| 466 |
"axis_label": "04 Next-Action Prediction",
|
| 467 |
"short_label": "Next act",
|
| 468 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 469 |
"metric_key": "macro_f1",
|
| 470 |
"metric_name": "macro-F1",
|
| 471 |
"metric_direction": "higher",
|
|
|
|
| 556 |
"label": "Hand Trajectory Forecasting",
|
| 557 |
"axis_label": "05 Hand Trajectory Forecasting",
|
| 558 |
"short_label": "Hand traj",
|
| 559 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 560 |
"metric_key": "mpjpe",
|
| 561 |
"metric_name": "MPJPE",
|
| 562 |
"metric_direction": "lower",
|
|
|
|
| 647 |
"label": "Contact State Prediction",
|
| 648 |
"axis_label": "06 Contact State Prediction",
|
| 649 |
"short_label": "Contact",
|
| 650 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 651 |
"metric_key": "macro_f1",
|
| 652 |
"metric_name": "macro-F1",
|
| 653 |
"metric_direction": "higher",
|
|
|
|
| 738 |
"label": "Object Relevance Prediction",
|
| 739 |
"axis_label": "07 Object Relevance Prediction",
|
| 740 |
"short_label": "Objects",
|
| 741 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 742 |
"metric_key": "micro_f1",
|
| 743 |
"metric_name": "micro-F1",
|
| 744 |
"metric_direction": "higher",
|
|
|
|
| 829 |
"label": "Language Grounding",
|
| 830 |
"axis_label": "08 Language Grounding",
|
| 831 |
"short_label": "Language",
|
| 832 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 833 |
"metric_key": "mrr",
|
| 834 |
"metric_name": "MRR",
|
| 835 |
"metric_direction": "higher",
|
|
|
|
| 920 |
"label": "Cross-Modal Retrieval",
|
| 921 |
"axis_label": "09 Cross-Modal Retrieval",
|
| 922 |
"short_label": "X-modal",
|
| 923 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 924 |
"metric_key": "mrr",
|
| 925 |
"metric_name": "MRR",
|
| 926 |
"metric_direction": "higher",
|
|
|
|
| 1011 |
"label": "Cross-Modal Reconstruction",
|
| 1012 |
"axis_label": "10 Cross-Modal Reconstruction",
|
| 1013 |
"short_label": "Recon",
|
| 1014 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 1015 |
"metric_key": "r2",
|
| 1016 |
"metric_name": "R2",
|
| 1017 |
"metric_direction": "higher",
|
|
|
|
| 1102 |
"label": "Temporal Order Verification",
|
| 1103 |
"axis_label": "11 Temporal Order Verification",
|
| 1104 |
"short_label": "Order",
|
| 1105 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 1106 |
"metric_key": "f1",
|
| 1107 |
"metric_name": "F1",
|
| 1108 |
"metric_direction": "higher",
|
|
|
|
| 1193 |
"label": "Multimodal Synchronization Detection",
|
| 1194 |
"axis_label": "12 Multimodal Synchronization Detection",
|
| 1195 |
"short_label": "Sync",
|
| 1196 |
+
"provenance_source": "walkthrough_backed_task_contract",
|
| 1197 |
"metric_key": "f1",
|
| 1198 |
"metric_name": "F1",
|
| 1199 |
"metric_direction": "higher",
|
|
|
|
| 1284 |
"label": "Long-Horizon Next-Action Forecasting",
|
| 1285 |
"axis_label": "13 Long-Horizon Next-Action Forecasting",
|
| 1286 |
"short_label": "Long act",
|
| 1287 |
+
"provenance_source": "historical_result_bundle",
|
| 1288 |
"metric_key": "macro_f1",
|
| 1289 |
"metric_name": "macro-F1",
|
| 1290 |
"metric_direction": "higher",
|
|
|
|
| 1375 |
"label": "Long-Horizon Next-Subtask Forecasting",
|
| 1376 |
"axis_label": "14 Long-Horizon Next-Subtask Forecasting",
|
| 1377 |
"short_label": "Long step",
|
| 1378 |
+
"provenance_source": "historical_result_bundle",
|
| 1379 |
"metric_key": "macro_f1",
|
| 1380 |
"metric_name": "macro-F1",
|
| 1381 |
"metric_direction": "higher",
|
|
|
|
| 1466 |
"label": "Interaction Text Prediction",
|
| 1467 |
"axis_label": "15 Interaction Text Prediction",
|
| 1468 |
"short_label": "Interact txt",
|
| 1469 |
+
"provenance_source": "historical_result_bundle",
|
| 1470 |
"metric_key": "macro_f1",
|
| 1471 |
"metric_name": "macro-F1",
|
| 1472 |
"metric_direction": "higher",
|
|
|
|
| 1557 |
"label": "Action-Object Relation Prediction",
|
| 1558 |
"axis_label": "16 Action-Object Relation Prediction",
|
| 1559 |
"short_label": "Act+obj",
|
| 1560 |
+
"provenance_source": "historical_result_bundle",
|
| 1561 |
"metric_key": "macro_f1",
|
| 1562 |
"metric_name": "macro-F1",
|
| 1563 |
"metric_direction": "higher",
|
|
|
|
| 1648 |
"label": "Future Object-Set Forecasting",
|
| 1649 |
"axis_label": "17 Future Object-Set Forecasting",
|
| 1650 |
"short_label": "Future obj",
|
| 1651 |
+
"provenance_source": "historical_result_bundle",
|
| 1652 |
"metric_key": "micro_f1",
|
| 1653 |
"metric_name": "micro-F1",
|
| 1654 |
"metric_direction": "higher",
|
|
|
|
| 1739 |
"label": "IMU-to-Hand Pose Reconstruction",
|
| 1740 |
"axis_label": "18 IMU-to-Hand Pose Reconstruction",
|
| 1741 |
"short_label": "IMU->hand",
|
| 1742 |
+
"provenance_source": "historical_result_bundle",
|
| 1743 |
"metric_key": "mae",
|
| 1744 |
"metric_name": "MAE",
|
| 1745 |
"metric_direction": "lower",
|
|
|
|
| 1830 |
"label": "Camera-View Synchronization Retrieval",
|
| 1831 |
"axis_label": "19 Camera-View Synchronization Retrieval",
|
| 1832 |
"short_label": "Cam sync",
|
| 1833 |
+
"provenance_source": "historical_result_bundle",
|
| 1834 |
"metric_key": "mrr",
|
| 1835 |
"metric_name": "MRR",
|
| 1836 |
"metric_direction": "higher",
|
|
|
|
| 1921 |
"label": "Time-to-Next-Transition Regression",
|
| 1922 |
"axis_label": "20 Time-to-Next-Transition Regression",
|
| 1923 |
"short_label": "Time2bdry",
|
| 1924 |
+
"provenance_source": "historical_result_bundle",
|
| 1925 |
"metric_key": "mae",
|
| 1926 |
"metric_name": "MAE frames",
|
| 1927 |
"metric_direction": "lower",
|
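Reviewer note: the radar rows above mix `"metric_direction": "higher"` (macro-F1, micro-F1, MRR) with `"metric_direction": "lower"` (MAE, MAE frames), so raw scores cannot be ranked with one comparison operator. A minimal sketch of a direction-aware comparison follows; the helper name `is_better` is hypothetical and not part of the repository.

```python
def is_better(candidate: float, incumbent: float, metric_direction: str) -> bool:
    """Return True when `candidate` beats `incumbent` under the stated direction.

    `metric_direction` mirrors the JSON field of the same name and is expected
    to be either "higher" or "lower". Ties are not counted as improvements.
    """
    if metric_direction == "higher":
        return candidate > incumbent
    if metric_direction == "lower":
        return candidate < incumbent
    raise ValueError(f"unknown metric_direction: {metric_direction!r}")
```

Raising on an unknown direction, rather than defaulting to "higher", keeps a mistyped field from silently inverting a ranking.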
metrics/figure_index.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Figure Index",
|
| 3 |
"status": "pass",
|
| 4 |
-
"generated_at_utc": "2026-06-
|
| 5 |
"scope": "Public figures, diagrams, charts, and derived modality thumbnails. Raw Xperience-10M videos, annotations, RRD files, and Qwen weights are excluded.",
|
| 6 |
"figure_count": 29,
|
| 7 |
"figures": [
|
|
@@ -60,12 +60,12 @@
|
|
| 60 |
"id": "task_suite_infographic",
|
| 61 |
"title": "Original task-suite infographic",
|
| 62 |
"path": "docs/assets/task_suite_infographic.png",
|
| 63 |
-
"role": "Primary visual map of the
|
| 64 |
"source_script": "scripts/render_task_suite_infographic.py",
|
| 65 |
"surface": "README, website, HF Space, artifact dataset, model card",
|
| 66 |
"exists": true,
|
| 67 |
-
"bytes":
|
| 68 |
-
"sha256": "
|
| 69 |
"dimensions": {
|
| 70 |
"format": "PNG",
|
| 71 |
"width": 1800,
|
|
@@ -162,7 +162,7 @@
|
|
| 162 |
"id": "task_architectures",
|
| 163 |
"title": "Minimal and neural task architecture map",
|
| 164 |
"path": "docs/assets/task_architectures.png",
|
| 165 |
-
"role": "Minimal and neural heads for the
|
| 166 |
"source_script": "scripts/render_overview_figures.py",
|
| 167 |
"surface": "README, website, HF artifact dataset, model card",
|
| 168 |
"exists": true,
|
|
@@ -392,8 +392,8 @@
|
|
| 392 |
"source_script": "scripts/tier2_task_suite.py",
|
| 393 |
"surface": "website unified task section, README, HF mirrors",
|
| 394 |
"exists": true,
|
| 395 |
-
"bytes":
|
| 396 |
-
"sha256": "
|
| 397 |
"dimensions": {
|
| 398 |
"format": "SVG",
|
| 399 |
"width": 1440,
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Figure Index",
|
| 3 |
"status": "pass",
|
| 4 |
+
"generated_at_utc": "2026-06-21T15:19:00+00:00",
|
| 5 |
"scope": "Public figures, diagrams, charts, and derived modality thumbnails. Raw Xperience-10M videos, annotations, RRD files, and Qwen weights are excluded.",
|
| 6 |
"figure_count": 29,
|
| 7 |
"figures": [
|
|
|
|
| 60 |
"id": "task_suite_infographic",
|
| 61 |
"title": "Original task-suite infographic",
|
| 62 |
"path": "docs/assets/task_suite_infographic.png",
|
| 63 |
+
"role": "Primary visual map of the walkthrough-backed task families, verified metrics, and sample modalities; the unified public suite is documented as 20 tasks.",
|
| 64 |
"source_script": "scripts/render_task_suite_infographic.py",
|
| 65 |
"surface": "README, website, HF Space, artifact dataset, model card",
|
| 66 |
"exists": true,
|
| 67 |
+
"bytes": 1897278,
|
| 68 |
+
"sha256": "71b1ab150e952cf902488226c65b3822d8016974f63d111204c1eb1a7745faad",
|
| 69 |
"dimensions": {
|
| 70 |
"format": "PNG",
|
| 71 |
"width": 1800,
|
|
|
|
| 162 |
"id": "task_architectures",
|
| 163 |
"title": "Minimal and neural task architecture map",
|
| 164 |
"path": "docs/assets/task_architectures.png",
|
| 165 |
+
"role": "Minimal and neural heads for the walkthrough-backed task contracts and shared feature contracts.",
|
| 166 |
"source_script": "scripts/render_overview_figures.py",
|
| 167 |
"surface": "README, website, HF artifact dataset, model card",
|
| 168 |
"exists": true,
|
|
|
|
| 392 |
"source_script": "scripts/tier2_task_suite.py",
|
| 393 |
"surface": "website unified task section, README, HF mirrors",
|
| 394 |
"exists": true,
|
| 395 |
+
"bytes": 5453,
|
| 396 |
+
"sha256": "e9da29c57f42b29a7a05622fee1335089ac2b6fc9692a3b49fa5b753904db9dc",
|
| 397 |
"dimensions": {
|
| 398 |
"format": "SVG",
|
| 399 |
"width": 1440,
|
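Reviewer note: the figure entries above record `path`, `exists`, `bytes`, and `sha256`, which is enough to re-check a file on disk against the index. A minimal sketch, assuming each entry is a plain dict with those keys and that `path` is relative to a checkout root; the function name `verify_figure_entry` is hypothetical and not taken from the repository's scripts.

```python
import hashlib
from pathlib import Path


def verify_figure_entry(entry: dict, root: Path) -> list[str]:
    """Compare one figure-index entry with the file on disk.

    Returns a list of human-readable problems; an empty list means the
    recorded size and digest match the file.
    """
    problems: list[str] = []
    file_path = root / entry["path"]

    if not file_path.is_file():
        # Only report a missing file when the index claims it exists.
        if entry.get("exists", True):
            problems.append(f"missing file: {entry['path']}")
        return problems

    data = file_path.read_bytes()
    if "bytes" in entry and len(data) != entry["bytes"]:
        problems.append(f"size mismatch: {len(data)} != {entry['bytes']}")
    if "sha256" in entry:
        digest = hashlib.sha256(data).hexdigest()
        if digest != entry["sha256"]:
            problems.append(f"sha256 mismatch: {digest}")
    return problems
```

Checking the byte count before the digest gives a cheaper, more readable failure when a file was truncated or regenerated at a different size.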
metrics/live_publication_status.json
CHANGED
|
The diff for this file is too large to render.
|
|
|
metrics/omni_model_comparison.json
CHANGED
|
@@ -1,12 +1,12 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
|
| 3 |
-
"generated_at_utc": "2026-06-
|
| 4 |
"status": "pass",
|
| 5 |
"version_count": 3,
|
| 6 |
"model_group_count": 5,
|
| 7 |
"comparison_rule": "Compare only rows with the same scope and target. Single-episode raw-feature metrics, 128-episode metadata baselines, Qwen3 structured JSON metrics, and the two Cosmos3 targets answer different questions: Nano future-window retrieval versus Super structured JSON Reasoner evaluation.",
|
| 8 |
"version_reading_notes": [
|
| 9 |
-
"Version 1 is the public-sample 20-task surface:
|
| 10 |
"Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
|
| 11 |
"The selected-128 model-diagnostic group contains the current Qwen3-Omni LoRA JSON-task row, Cosmos3-Nano future-window compatibility result, Cosmos3-Super Reasoner base-weight JSON-task evaluation, and the separate Cosmos3-Super Forward-Dynamics LoRA adapter artifact."
|
| 12 |
],
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
|
| 3 |
+
"generated_at_utc": "2026-06-21T15:17:00+00:00",
|
| 4 |
"status": "pass",
|
| 5 |
"version_count": 3,
|
| 6 |
"model_group_count": 5,
|
| 7 |
"comparison_rule": "Compare only rows with the same scope and target. Single-episode raw-feature metrics, 128-episode metadata baselines, Qwen3 structured JSON metrics, and the two Cosmos3 targets answer different questions: Nano future-window retrieval versus Super structured JSON Reasoner evaluation.",
|
| 8 |
"version_reading_notes": [
|
| 9 |
+
"Version 1 is the public-sample 20-task surface: unified task heads, historical provenance rows, and the 180-row method-task matrix.",
|
| 10 |
"Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
|
| 11 |
"The selected-128 model-diagnostic group contains the current Qwen3-Omni LoRA JSON-task row, Cosmos3-Nano future-window compatibility result, Cosmos3-Super Reasoner base-weight JSON-task evaluation, and the separate Cosmos3-Super Forward-Dynamics LoRA adapter artifact."
|
| 12 |
],
|
metrics/project_brief.json
CHANGED
|
@@ -52,7 +52,7 @@
|
|
| 52 |
"Open EVALUATION_PROTOCOL.md before comparing task scores.",
|
| 53 |
"Use RESEARCH_TAKEAWAYS.md for the current metric interpretation.",
|
| 54 |
"Inspect results/episode_task_suite/feature_manifest.json to understand one model input.",
|
| 55 |
-
"Use TASK_SUITE_20.md and docs/data/task_suite_20.json to read the unified 20-task suite; the historical docs/data/tier2_task_suite.json path
|
| 56 |
"Use docs/data/omni_finetune_verified_result.json for the current multi-episode Qwen3-Omni pilot result."
|
| 57 |
],
|
| 58 |
"scope_boundary": "The public sample is enough to build and verify task definitions, feature contracts, metrics, visualization, and baseline code. The final multi-episode Qwen3-Omni diagnostic result verifies the training loop and strict-JSON output reliability, but does not yet show strong action/subtask model quality.",
|
|
|
|
| 52 |
"Open EVALUATION_PROTOCOL.md before comparing task scores.",
|
| 53 |
"Use RESEARCH_TAKEAWAYS.md for the current metric interpretation.",
|
| 54 |
"Inspect results/episode_task_suite/feature_manifest.json to understand one model input.",
|
| 55 |
+
"Use TASK_SUITE_20.md and docs/data/task_suite_20.json to read the unified 20-task suite; the historical docs/data/tier2_task_suite.json path is retained only for provenance inside that suite.",
|
| 56 |
"Use docs/data/omni_finetune_verified_result.json for the current multi-episode Qwen3-Omni pilot result."
|
| 57 |
],
|
| 58 |
"scope_boundary": "The public sample is enough to build and verify task definitions, feature contracts, metrics, visualization, and baseline code. The final multi-episode Qwen3-Omni diagnostic result verifies the training loop and strict-JSON output reliability, but does not yet show strong action/subtask model quality.",
|
metrics/project_packet.json
CHANGED
|
@@ -15,9 +15,8 @@
|
|
| 15 |
"cosmos3_super_forward_dynamics_lora_status": "The first Cosmos3-Super fine-tuned adapter branch is verified as a forward-dynamics LoRA over camera-pose proxy targets; it reports loss metrics, not JSON action-label accuracy.",
|
| 16 |
"task_suite_enhancement_128_status": "Current no-new-episode enhancement pack recommends multiscale_20s10_40s20_80s40, hierarchical action/subtask targets, label-normalized scoring, and raw-feature shards before adding more episodes.",
|
| 17 |
"task_count": 20,
|
| 18 |
-
"
|
| 19 |
-
"
|
| 20 |
-
"legacy_tasks_13_to_20_result_path": "docs/data/tier2_task_suite.json"
|
| 21 |
},
|
| 22 |
"reading_path": [
|
| 23 |
{
|
|
@@ -110,7 +109,7 @@
|
|
| 110 |
"results/episode_task_suite/neural_mlp/",
|
| 111 |
"docs/data/summary_metrics.json"
|
| 112 |
],
|
| 113 |
-
"readout": "The unified suite has 20 task contracts
|
| 114 |
},
|
| 115 |
{
|
| 116 |
"step": 8,
|
|
|
|
| 15 |
"cosmos3_super_forward_dynamics_lora_status": "The first Cosmos3-Super fine-tuned adapter branch is verified as a forward-dynamics LoRA over camera-pose proxy targets; it reports loss metrics, not JSON action-label accuracy.",
|
| 16 |
"task_suite_enhancement_128_status": "Current no-new-episode enhancement pack recommends multiscale_20s10_40s20_80s40, hierarchical action/subtask targets, label-normalized scoring, and raw-feature shards before adding more episodes.",
|
| 17 |
"task_count": 20,
|
| 18 |
+
"task_surface_framing": "unified_20_task_suite",
|
| 19 |
+
"legacy_provenance_result_path": "docs/data/tier2_task_suite.json"
|
|
|
|
| 20 |
},
|
| 21 |
"reading_path": [
|
| 22 |
{
|
|
|
|
| 109 |
"results/episode_task_suite/neural_mlp/",
|
| 110 |
"docs/data/summary_metrics.json"
|
| 111 |
],
|
| 112 |
+
"readout": "The unified suite has 20 task contracts in one task surface. Walkthrough-backed tasks, aligned minimal/neural result bundles, and historical tier2_task_suite provenance paths are all linked from TASK_SUITE_20.md and docs/data/task_suite_20.json."
|
| 113 |
},
|
| 114 |
{
|
| 115 |
"step": 8,
|
metrics/public_surface_qa.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Public Project Surface",
|
| 3 |
"status": "pass",
|
| 4 |
-
"generated_at_utc": "2026-06-
|
| 5 |
"scope": "Repo README, GitHub Pages HTML, Hugging Face Space card, artifact dataset card, and model card.",
|
| 6 |
"checks": [
|
| 7 |
{
|
|
@@ -33,12 +33,12 @@
|
|
| 33 |
"source_alignment": {
|
| 34 |
"exists": true,
|
| 35 |
"status": "pass",
|
| 36 |
-
"generated_at_utc": "2026-06-
|
| 37 |
},
|
| 38 |
"scale_up_status": {
|
| 39 |
"exists": true,
|
| 40 |
"status": "pass",
|
| 41 |
-
"generated_at_utc": "2026-06-
|
| 42 |
},
|
| 43 |
"publication_package": {
|
| 44 |
"exists": true,
|
|
@@ -48,7 +48,7 @@
|
|
| 48 |
"mirror_parity": {
|
| 49 |
"exists": true,
|
| 50 |
"status": "pass",
|
| 51 |
-
"generated_at_utc": "2026-06-21T14:
|
| 52 |
}
|
| 53 |
},
|
| 54 |
"failures": {}
|
|
@@ -96,7 +96,7 @@
|
|
| 96 |
"reason": "Public copy should consistently present the project as Ropedia Xperience-10M, with the Qwen3-Omni scale-up status.",
|
| 97 |
"marker_counts": {
|
| 98 |
"Ropedia Xperience-10M Task Suite": 20,
|
| 99 |
-
"Xperience-10M":
|
| 100 |
"20-task": 100,
|
| 101 |
"Qwen3-Omni": 245,
|
| 102 |
"128-episode pilot": 1
|
|
@@ -137,11 +137,11 @@
|
|
| 137 |
"data/unified_task_model_radar.json": 21,
|
| 138 |
"data/single_episode_task_model_radar.json": 17,
|
| 139 |
"data/episode128_task_model_radar.json": 16,
|
| 140 |
-
"data/task_method_20_result_matrix.json":
|
| 141 |
"data/task_method_20_gap_audit.json": 23,
|
| 142 |
"data/language_versions.json": 3,
|
| 143 |
"assets/charts/two_evidence_line_map.svg": 5,
|
| 144 |
-
"assets/charts/unified_task_model_radar.svg":
|
| 145 |
"assets/charts/single_episode_task_model_radar.svg": 19,
|
| 146 |
"assets/charts/episode128_task_model_radar.svg": 19,
|
| 147 |
"data/tier2_task_suite.json": 11
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Public Project Surface",
|
| 3 |
"status": "pass",
|
| 4 |
+
"generated_at_utc": "2026-06-21T15:21:42+00:00",
|
| 5 |
"scope": "Repo README, GitHub Pages HTML, Hugging Face Space card, artifact dataset card, and model card.",
|
| 6 |
"checks": [
|
| 7 |
{
|
|
|
|
| 33 |
"source_alignment": {
|
| 34 |
"exists": true,
|
| 35 |
"status": "pass",
|
| 36 |
+
"generated_at_utc": "2026-06-21T14:46:49+00:00"
|
| 37 |
},
|
| 38 |
"scale_up_status": {
|
| 39 |
"exists": true,
|
| 40 |
"status": "pass",
|
| 41 |
+
"generated_at_utc": "2026-06-21T14:47:03+00:00"
|
| 42 |
},
|
| 43 |
"publication_package": {
|
| 44 |
"exists": true,
|
|
|
|
| 48 |
"mirror_parity": {
|
| 49 |
"exists": true,
|
| 50 |
"status": "pass",
|
| 51 |
+
"generated_at_utc": "2026-06-21T14:53:27+00:00"
|
| 52 |
}
|
| 53 |
},
|
| 54 |
"failures": {}
|
|
|
|
| 96 |
"reason": "Public copy should consistently present the project as Ropedia Xperience-10M, with the Qwen3-Omni scale-up status.",
|
| 97 |
"marker_counts": {
|
| 98 |
"Ropedia Xperience-10M Task Suite": 20,
|
| 99 |
+
"Xperience-10M": 166,
|
| 100 |
"20-task": 100,
|
| 101 |
"Qwen3-Omni": 245,
|
| 102 |
"128-episode pilot": 1
|
|
|
|
| 137 |
"data/unified_task_model_radar.json": 21,
|
| 138 |
"data/single_episode_task_model_radar.json": 17,
|
| 139 |
"data/episode128_task_model_radar.json": 16,
|
| 140 |
+
"data/task_method_20_result_matrix.json": 25,
|
| 141 |
"data/task_method_20_gap_audit.json": 23,
|
| 142 |
"data/language_versions.json": 3,
|
| 143 |
"assets/charts/two_evidence_line_map.svg": 5,
|
| 144 |
+
"assets/charts/unified_task_model_radar.svg": 18,
|
| 145 |
"assets/charts/single_episode_task_model_radar.svg": 19,
|
| 146 |
"assets/charts/episode128_task_model_radar.svg": 19,
|
| 147 |
"data/tier2_task_suite.json": 11
|
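Reviewer note: the `marker_counts` block above tallies how often phrases such as "Xperience-10M" or "20-task" appear across the public copy. How the QA script counts them is not shown in this diff; the sketch below is one plausible reading that uses plain substring counts, and the name `count_markers` is hypothetical.

```python
def count_markers(texts: list[str], markers: list[str]) -> dict[str, int]:
    """Count non-overlapping occurrences of each marker across all texts.

    Markers are counted independently, so a short marker that is a substring
    of a longer one (for example "Xperience-10M" inside
    "Ropedia Xperience-10M Task Suite") is counted in both tallies.
    """
    return {marker: sum(text.count(marker) for text in texts) for marker in markers}
```

Under this reading, a count that drops after an edit points to copy that stopped using the expected project name, which matches the stated `reason` for the check.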
metrics/reproducibility_matrix.json
CHANGED
|
@@ -39,7 +39,7 @@
|
|
| 39 |
"id": "original_task_suite",
|
| 40 |
"status": "reproducible",
|
| 41 |
"command": "python scripts/episode_task_suite.py --workspace $WORKSPACE --include-neural",
|
| 42 |
-
"expected": "
|
| 43 |
"boundary": "8,546-dimensional multimodal window contract"
|
| 44 |
},
|
| 45 |
{
|
|
@@ -50,11 +50,11 @@
|
|
| 50 |
"boundary": "single-episode probes, not full research-direction solutions"
|
| 51 |
},
|
| 52 |
{
|
| 53 |
-
"id": "
|
| 54 |
"status": "reproducible",
|
| 55 |
"command": "python scripts/tier2_task_suite.py && python scripts/build_unified_task_suite.py && python scripts/build_unified_task_model_radar.py",
|
| 56 |
-
"expected": "
|
| 57 |
-
"boundary": "requires local public-sample annotation.hdf5 plus HOMIE Toolkit or h5py for
|
| 58 |
},
|
| 59 |
{
|
| 60 |
"id": "source_alignment_audit",
|
|
|
|
| 39 |
"id": "original_task_suite",
|
| 40 |
"status": "reproducible",
|
| 41 |
"command": "python scripts/episode_task_suite.py --workspace $WORKSPACE --include-neural",
|
| 42 |
+
"expected": "walkthrough-backed task metrics, predictions, manifests, and neural_mlp task-head artifacts",
|
| 43 |
"boundary": "8,546-dimensional multimodal window contract"
|
| 44 |
},
|
| 45 |
{
|
|
|
|
| 50 |
"boundary": "single-episode probes, not full research-direction solutions"
|
| 51 |
},
|
| 52 |
{
|
| 53 |
+
"id": "unified_20_task_index",
|
| 54 |
"status": "reproducible",
|
| 55 |
"command": "python scripts/tier2_task_suite.py && python scripts/build_unified_task_suite.py && python scripts/build_unified_task_model_radar.py",
|
| 56 |
+
"expected": "unified 20-task metrics, prediction/rank artifacts, TASK_SUITE_20.md, docs/data/task_suite_20.json, docs/data/tier2_task_suite.json, docs/assets/charts/tier2_task_suite.svg, docs/data/unified_task_model_radar.json, and docs/assets/charts/unified_task_model_radar.svg",
|
| 57 |
+
"boundary": "requires local public-sample annotation.hdf5 plus HOMIE Toolkit or h5py for full public-task regeneration; raw HDF5 and MP4 files are not redistributed"
|
| 58 |
},
|
| 59 |
{
|
| 60 |
"id": "source_alignment_audit",
|
metrics/research_takeaways.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Research Takeaways",
|
| 3 |
"status": "pass",
|
| 4 |
-
"generated_at_utc": "2026-06-
|
| 5 |
"source_files": [
|
| 6 |
"docs/data/summary_metrics.json",
|
| 7 |
"results/episode_task_suite/summary_report.json",
|
|
@@ -133,7 +133,7 @@
|
|
| 133 |
{
|
| 134 |
"id": "audio_contribution_is_task_specific",
|
| 135 |
"title": "Audio helps some tasks and hurts others on the public sample",
|
| 136 |
-
"readout": "Audio improves the primary metric on 6
|
| 137 |
"evidence": [
|
| 138 |
{
|
| 139 |
"label": "tasks_where_current_audio_improves",
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Research Takeaways",
|
| 3 |
"status": "pass",
|
| 4 |
+
"generated_at_utc": "2026-06-21T15:18:59+00:00",
|
| 5 |
"source_files": [
|
| 6 |
"docs/data/summary_metrics.json",
|
| 7 |
"results/episode_task_suite/summary_report.json",
|
|
|
|
| 133 |
{
|
| 134 |
"id": "audio_contribution_is_task_specific",
|
| 135 |
"title": "Audio helps some tasks and hurts others on the public sample",
|
| 136 |
+
"readout": "Audio improves the primary metric on 6 walkthrough-backed task contracts, while raw log-mel replacement improves over the current handcrafted block on 6 of those contracts. The largest current-audio gain appears in feature reconstruction, not in action classification.",
|
| 137 |
"evidence": [
|
| 138 |
{
|
| 139 |
"label": "tasks_where_current_audio_improves",
|
metrics/task_method_20_gap_audit.json
CHANGED
|
@@ -1,5 +1,5 @@
|
|
| 1 |
{
|
| 2 |
-
"generated_at_utc": "2026-06-
|
| 3 |
"immediate_actions": [
|
| 4 |
{
|
| 5 |
"artifact": "docs/data/task_method_20_gap_audit.json",
|
|
|
|
| 1 |
{
|
| 2 |
+
"generated_at_utc": "2026-06-21T15:21:42+00:00",
|
| 3 |
"immediate_actions": [
|
| 4 |
{
|
| 5 |
"artifact": "docs/data/task_method_20_gap_audit.json",
|
metrics/task_surface_integrity.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-
|
| 4 |
"summary": {
|
| 5 |
"original_walkthrough_task_count": 12,
|
| 6 |
"expected_original_walkthrough_task_count": 12,
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-21T15:21:55+00:00",
|
| 4 |
"summary": {
|
| 5 |
"original_walkthrough_task_count": 12,
|
| 6 |
"expected_original_walkthrough_task_count": 12,
|