cy0307 committed · Commit 3a10443 · verified · 1 Parent(s): eeac43c

Publish Ropedia Xperience-10M task baseline cards

Files changed (49)
  1. PROJECT_STATUS.md +6 -3
  2. RESEARCH_ROADMAP.md +10 -0
  3. data/mirror_parity.json +620 -61
  4. data/omni_finetune_verified_result.json +1 -1
  5. data/omni_model_comparison.json +61 -6
  6. data/project_packet.json +1 -1
  7. data/project_status.json +15 -3
  8. data/research_roadmap.json +1 -1
  9. data/research_roadmap_interactive.json +1 -1
  10. data/website_integrity.json +11 -11
  11. docs/assets/charts/episode_task_scores.svg +12 -12
  12. docs/assets/charts/episode_task_scores_minimal_vs_neural.svg +24 -24
  13. docs/assets/charts/episode_task_scores_neural_mlp.svg +12 -12
  14. docs/assets/charts/research_direction_coverage.svg +4 -4
  15. docs/assets/task_architectures.png +2 -2
  16. docs/assets/task_architectures.svg +12 -12
  17. docs/assets/task_suite_infographic.png +2 -2
  18. docs/data/mirror_parity.json +620 -61
  19. docs/data/omni_finetune_verified_result.json +1 -1
  20. docs/data/omni_model_comparison.json +61 -6
  21. docs/data/project_packet.json +1 -1
  22. docs/data/project_status.json +15 -3
  23. docs/data/research_roadmap.json +1 -1
  24. docs/data/research_roadmap_interactive.json +1 -1
  25. docs/data/website_integrity.json +11 -11
  26. docs/index.html +6 -6
  27. metrics/mirror_parity.json +95 -95
  28. metrics/omni_finetune_verified_result.json +1 -1
  29. metrics/omni_model_comparison.json +61 -6
  30. metrics/project_packet.json +1 -1
  31. metrics/project_status.json +15 -3
  32. metrics/research_roadmap.json +1 -1
  33. metrics/research_roadmap_interactive.json +1 -1
  34. metrics/website_integrity.json +11 -11
  35. results/omni_finetune/OMNI_MODEL_COMPARISON.md +7 -5
  36. results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/RUN_REPORT.md +19 -0
  37. results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/packer_summary.json +136 -0
  38. results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/progress.jsonl +3 -0
  39. results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/training_metadata.json +8 -0
  40. results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_local/RUN_REPORT.md +35 -0
  41. results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_local/progress.jsonl +3 -0
  42. results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_local/training_contract_audit.json +78 -0
  43. results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_local/training_metadata.json +47 -0
  44. scripts/omni/audit_cosmos3_super_training_contract.py +406 -0
  45. scripts/omni/build_omni_model_comparison.py +106 -9
  46. scripts/omni/export_cosmos3_camera_pose_targets.py +250 -0
  47. scripts/omni/pack_cosmos3_super_action_batch.py +459 -0
  48. scripts/omni/run_qwen3_omni_v4_4epoch_8gpu.sh +105 -0
  49. scripts/verify_live_publication.py +2 -2
PROJECT_STATUS.md CHANGED
@@ -22,7 +22,8 @@ scale-up readiness; it is not presented as final full-dataset model quality.
22
  | Audio contribution study | Verified | `scripts/audio_ablation_and_raw_upgrade.py`, `results/audio_ablation/`, `docs/data/audio_ablation_summary.json` | Audio variants are compared across all 12 task contracts; audio improves the primary metric on 6 of 12 tasks, and a 588-d audio-window representation improves over the baseline audio variant on 6 of 12 tasks. |
23
  | Research takeaways | Verified | `RESEARCH_TAKEAWAYS.md`, `docs/data/research_takeaways.json`, `scripts/build_research_takeaways.py` | The main result interpretation is generated from committed metrics: chronological class shift, neural gains on dynamics/order/alignment, open retrieval/reconstruction problems, and the need for held-out episodes. |
24
  | Research roadmap | Current | `RESEARCH_ROADMAP.md`, `docs/data/research_roadmap.json` | The roadmap connects public-sample task development to the final verified Qwen3-Omni diagnostic result, same-split baseline alignment, action/subtask error analysis, robustness runs, world/policy branches, and the future Xperience-native pretraining goal. |
25
- | Foundation-model plan | Current | `FOUNDATION_MODEL_PLAN.md`, `docs/data/foundation_model_plan.json` | Qwen3-Omni remains the first trainable held-out LoRA baseline; Cosmos 3 is added as the first world-model/action-generation branch; OpenVLA/openpi/GR00T are policy candidates after action targets are explicit. |
 
26
  | Omni model extension contract | Current | `OMNI_MODEL_EXTENSION_CONTRACT.md`, `configs/omni_backbones/`, `scripts/omni/backbone_registry.py`, `scripts/omni/smoke_test_backbone_packaging.py` | Future model branches must keep the same episode split discipline, held-out metrics, validation gate, public-safe package contract, and explicit forbidden-artifact policy before reporting results. |
27
  | Xperience Embodied Foundation Model | Future goal | `XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md` | A future full-corpus pretraining plan describes target modules, objectives, staged scale-up, hardware ranges, and evaluation for a domain-specific embodied foundation model. |
28
  | Evaluation protocol | Verified | `EVALUATION_PROTOCOL.md`, `docs/data/evaluation_protocol.json`, `scripts/build_evaluation_protocol.py` | Windowing, chronological split, per-task metrics, leakage controls, and current limitations are generated from committed metric artifacts. |
@@ -83,8 +84,10 @@ scale-up readiness; it is not presented as final full-dataset model quality.
83
  - Audio contribution is evaluated across all 12 task contracts in
84
  `results/audio_ablation/`.
85
  - Foundation-model selection is now explicit: Qwen3-Omni is the immediate
86
- trainable pilot, Cosmos 3 is the first world-model branch, and policy models
87
- such as OpenVLA/openpi/GR00T wait for action-target conversion.
 
 
88
  - Future model branches should be added through the backbone registry and
89
  verified package contract, not by creating one-off result folders with
90
  incompatible metrics or publication rules.
 
22
  | Audio contribution study | Verified | `scripts/audio_ablation_and_raw_upgrade.py`, `results/audio_ablation/`, `docs/data/audio_ablation_summary.json` | Audio variants are compared across all 12 task contracts; audio improves the primary metric on 6 of 12 tasks, and a 588-d audio-window representation improves over the baseline audio variant on 6 of 12 tasks. |
23
  | Research takeaways | Verified | `RESEARCH_TAKEAWAYS.md`, `docs/data/research_takeaways.json`, `scripts/build_research_takeaways.py` | The main result interpretation is generated from committed metrics: chronological class shift, neural gains on dynamics/order/alignment, open retrieval/reconstruction problems, and the need for held-out episodes. |
24
  | Research roadmap | Current | `RESEARCH_ROADMAP.md`, `docs/data/research_roadmap.json` | The roadmap connects public-sample task development to the final verified Qwen3-Omni diagnostic result, same-split baseline alignment, action/subtask error analysis, robustness runs, world/policy branches, and the future Xperience-native pretraining goal. |
25
+ | Foundation-model plan | Current | `FOUNDATION_MODEL_PLAN.md`, `docs/data/foundation_model_plan.json` | Qwen3-Omni remains the first trainable held-out LoRA baseline; Cosmos 3 is added as the first world-model/action-generation branch; Cosmos3-Super now has camera-pose proxy action targets that pass the contract audit and a schema-only batch-packer smoke. The current target mode is forward-dynamics, so it supports vision-velocity training under action conditioning, not supervised action-token prediction. OpenVLA/openpi/GR00T are policy candidates after robot-compatible action targets are explicit. |
26
+ | Cosmos3-Super action-target contract | Ready for forward-dynamics trainer implementation | `scripts/omni/export_cosmos3_camera_pose_targets.py`, `scripts/omni/pack_cosmos3_super_action_batch.py`, `results/omni_finetune/xperience10m_cosmos3_camera_pose_targets_20260608/target_manifest.json`, `results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608/training_contract_audit.json`, `results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/packer_summary.json` | The selected 128-episode JSONL is augmented with 3,808/3,808 valid `camera_pose` proxy `cosmos_action_target` records from SLAM pose deltas. The schema-only packer smoke confirms the current `forward_dynamics` target should supervise noisy vision tokens under camera-pose conditioning; it does not supervise `preds_action`. Remaining work is a pipeline-loaded packer check, one-sample forward-dynamics overfit, and a separate policy/inverse target export before claiming action-token prediction. |
27
  | Omni model extension contract | Current | `OMNI_MODEL_EXTENSION_CONTRACT.md`, `configs/omni_backbones/`, `scripts/omni/backbone_registry.py`, `scripts/omni/smoke_test_backbone_packaging.py` | Future model branches must keep the same episode split discipline, held-out metrics, validation gate, public-safe package contract, and explicit forbidden-artifact policy before reporting results. |
28
  | Xperience Embodied Foundation Model | Future goal | `XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md` | A future full-corpus pretraining plan describes target modules, objectives, staged scale-up, hardware ranges, and evaluation for a domain-specific embodied foundation model. |
29
  | Evaluation protocol | Verified | `EVALUATION_PROTOCOL.md`, `docs/data/evaluation_protocol.json`, `scripts/build_evaluation_protocol.py` | Windowing, chronological split, per-task metrics, leakage controls, and current limitations are generated from committed metric artifacts. |
 
84
  - Audio contribution is evaluated across all 12 task contracts in
85
  `results/audio_ablation/`.
86
  - Foundation-model selection is now explicit: Qwen3-Omni is the immediate
87
+ trainable pilot, Cosmos 3 is the first world-model branch, and Cosmos3-Super
88
+ has a camera-pose proxy forward-dynamics contract ready for trainer
89
+ implementation; policy models such as OpenVLA/openpi/GR00T still wait for
90
+ robot-compatible action-target conversion.
91
  - Future model branches should be added through the backbone registry and
92
  verified package contract, not by creating one-off result folders with
93
  incompatible metrics or publication rules.
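The new status row says the `camera_pose` proxy targets come from SLAM pose deltas. The following is a minimal sketch of one way such a delta could be computed, assuming 4x4 camera-to-world poses and a 6-D encoding (translation plus rotation vector). The function names and the encoding are illustrative assumptions, not the contents of `scripts/omni/export_cosmos3_camera_pose_targets.py`.

```python
import math

def invert_rigid(T):
    """Invert a 4x4 rigid transform [R | t] as [R^T | -R^T t]."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    ti = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [ti[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def matmul4(A, B):
    """Plain 4x4 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def pose_delta_action(T_prev, T_next):
    """Hypothetical proxy action: motion from frame t to t+1, in frame t.

    Returns [tx, ty, tz, rx, ry, rz], where (rx, ry, rz) is a rotation vector.
    Assumes small per-frame rotations; angles near pi are not handled.
    """
    D = matmul4(invert_rigid(T_prev), T_next)
    trace = D[0][0] + D[1][1] + D[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    if theta < 1e-8:
        rot = [0.0, 0.0, 0.0]
    else:
        s = theta / (2.0 * math.sin(theta))
        rot = [(D[2][1] - D[1][2]) * s,
               (D[0][2] - D[2][0]) * s,
               (D[1][0] - D[0][1]) * s]
    return [D[0][3], D[1][3], D[2][3]] + rot
```

Expressing the delta in the previous camera frame keeps the target independent of where the SLAM map origin sits, which is one plausible reason to prefer deltas over absolute poses.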
RESEARCH_ROADMAP.md CHANGED
@@ -145,6 +145,16 @@ objectives: audio-visible alignment, future-window prediction,
145
  action-conditioned world modeling, synthetic-data usefulness tests, policy-style
146
  next action, contact, object relevance, and affordance reasoning.
147
 
 
 
 
 
 
 
 
 
 
 
148
  ### 7. Xperience Embodied Foundation Model Pretraining
149
 
150
  This stage is the long-term full-corpus goal. Instead of adapting an existing
 
145
  action-conditioned world modeling, synthetic-data usefulness tests, policy-style
146
  next action, contact, object relevance, and affordance reasoning.
147
 
148
+ Current Cosmos3-Super status: a camera-pose proxy action target export now
149
+ augments all 3,808 selected 128-episode windows and passes the contract audit.
150
+ A schema-only batch-packer smoke confirms the current `forward_dynamics` target
151
+ uses camera-pose actions as conditioning and should supervise noisy vision
152
+ tokens, not `preds_action`. This is a trainer-readiness artifact, not a
153
+ fine-tuned Cosmos weight release. The next Cosmos step is a pipeline-loaded
154
+ packer check and one-sample forward-dynamics overfit before any 96/16/16 Super
155
+ LoRA run; supervised action-token prediction needs a separate policy or
156
+ inverse-dynamics target export.
157
+
158
  ### 7. Xperience Embodied Foundation Model Pretraining
159
 
160
  This stage is the long-term full-corpus goal. Instead of adapting an existing
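The roadmap addition distinguishes a `forward_dynamics` target, where camera-pose actions are conditioning and vision tokens are supervised, from action-token prediction. A minimal sketch of that distinction as a per-token loss mask follows. The modality tags, the `inverse_dynamics` mode name, and the function itself are hypothetical and are not taken from `scripts/omni/pack_cosmos3_super_action_batch.py`.

```python
# Which modality carries the loss in each target mode.
# "forward_dynamics" follows the roadmap text; "inverse_dynamics" is an
# assumed name for the separate export the roadmap says is still needed.
SUPERVISED_MODALITY = {
    "forward_dynamics": "vision",   # actions condition, vision is predicted
    "inverse_dynamics": "action",   # vision conditions, actions are predicted
}

def loss_mask(token_modalities, target_mode):
    """Return 1 for tokens that contribute to the loss, 0 for conditioning."""
    if target_mode not in SUPERVISED_MODALITY:
        raise ValueError(f"unknown target_mode: {target_mode!r}")
    target = SUPERVISED_MODALITY[target_mode]
    return [1 if m == target else 0 for m in token_modalities]
```

Under this sketch a `forward_dynamics` batch never places loss on action tokens, which matches the statement that the current export does not supervise `preds_action`.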
data/mirror_parity.json CHANGED
@@ -1,16 +1,21 @@
1
  {
2
- "status": "pass",
3
- "generated_at_utc": "2026-06-07T15:49:31+00:00",
4
  "hf_root": "hf_publish",
5
  "summary": {
6
  "group_count": 234,
7
- "failure_count": 0,
8
- "failures_by_surface": {}
9
  },
10
  "checks": [
11
  {
12
  "name": "repo_hf_space_artifact_model_data_parity",
13
- "status": "pass"
14
  },
15
  {
16
  "name": "repo_hf_visual_asset_parity",
@@ -18,19 +23,19 @@
18
  },
19
  {
20
  "name": "repo_hf_validator_script_parity",
21
- "status": "pass"
22
  },
23
  {
24
  "name": "repo_hf_website_html_parity",
25
- "status": "pass"
26
  },
27
  {
28
  "name": "repo_hf_diagnostic_result_parity",
29
- "status": "pass"
30
  },
31
  {
32
  "name": "repo_hf_quality_doc_parity",
33
- "status": "pass"
34
  }
35
  ],
36
  "groups": [
@@ -346,12 +351,12 @@
346
  },
347
  {
348
  "name": "data/omni_finetune_verified_result.json",
349
- "status": "pass",
350
  "local": {
351
  "path": "repo:docs/data/omni_finetune_verified_result.json",
352
  "exists": true,
353
- "bytes": 3628,
354
- "sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
355
  },
356
  "mirrors": {
357
  "hf_space": {
@@ -373,16 +378,38 @@
373
  "sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
374
  }
375
  },
376
- "failures": []
377
  },
378
  {
379
  "name": "data/omni_model_comparison.json",
380
- "status": "pass",
381
  "local": {
382
  "path": "repo:docs/data/omni_model_comparison.json",
383
  "exists": true,
384
- "bytes": 48296,
385
- "sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
386
  },
387
  "mirrors": {
388
  "hf_space": {
@@ -404,7 +431,29 @@
404
  "sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
405
  }
406
  },
407
- "failures": []
408
  },
409
  {
410
  "name": "data/project_brief.json",
@@ -470,12 +519,12 @@
470
  },
471
  {
472
  "name": "data/project_packet.json",
473
- "status": "pass",
474
  "local": {
475
  "path": "repo:docs/data/project_packet.json",
476
  "exists": true,
477
- "bytes": 8005,
478
- "sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
479
  },
480
  "mirrors": {
481
  "hf_space": {
@@ -497,16 +546,38 @@
497
  "sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
498
  }
499
  },
500
- "failures": []
501
  },
502
  {
503
  "name": "data/project_status.json",
504
- "status": "pass",
505
  "local": {
506
  "path": "repo:docs/data/project_status.json",
507
  "exists": true,
508
- "bytes": 16455,
509
- "sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
510
  },
511
  "mirrors": {
512
  "hf_space": {
@@ -528,7 +599,29 @@
528
  "sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
529
  }
530
  },
531
- "failures": []
532
  },
533
  {
534
  "name": "data/publication_audit.json",
@@ -687,12 +780,12 @@
687
  },
688
  {
689
  "name": "data/research_roadmap.json",
690
- "status": "pass",
691
  "local": {
692
  "path": "repo:docs/data/research_roadmap.json",
693
  "exists": true,
694
- "bytes": 10133,
695
- "sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
696
  },
697
  "mirrors": {
698
  "hf_space": {
@@ -714,16 +807,38 @@
714
  "sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
715
  }
716
  },
717
- "failures": []
718
  },
719
  {
720
  "name": "data/research_roadmap_interactive.json",
721
- "status": "pass",
722
  "local": {
723
  "path": "repo:docs/data/research_roadmap_interactive.json",
724
  "exists": true,
725
- "bytes": 143560,
726
- "sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
727
  },
728
  "mirrors": {
729
  "hf_space": {
@@ -745,7 +860,29 @@
745
  "sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
746
  }
747
  },
748
- "failures": []
749
  },
750
  {
751
  "name": "data/research_takeaways.json",
@@ -1028,12 +1165,12 @@
1028
  },
1029
  {
1030
  "name": "data/website_integrity.json",
1031
- "status": "pass",
1032
  "local": {
1033
  "path": "repo:docs/data/website_integrity.json",
1034
  "exists": true,
1035
  "bytes": 15375,
1036
- "sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1037
  },
1038
  "mirrors": {
1039
  "hf_space": {
@@ -1055,7 +1192,29 @@
1055
  "sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1056
  }
1057
  },
1058
- "failures": []
1059
  },
1060
  {
1061
  "name": "data/xperience10m_dataset_card_alignment.json",
@@ -1781,12 +1940,12 @@
1781
  },
1782
  {
1783
  "name": "scripts/omni/build_omni_model_comparison.py",
1784
- "status": "pass",
1785
  "local": {
1786
  "path": "repo:scripts/omni/build_omni_model_comparison.py",
1787
  "exists": true,
1788
- "bytes": 30236,
1789
- "sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1790
  },
1791
  "mirrors": {
1792
  "hf_artifacts": {
@@ -1802,7 +1961,22 @@
1802
  "sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1803
  }
1804
  },
1805
- "failures": []
1806
  },
1807
  {
1808
  "name": "scripts/omni/prepare_qwen3_lora_hf_package.py",
@@ -2156,12 +2330,12 @@
2156
  },
2157
  {
2158
  "name": "scripts/verify_live_publication.py",
2159
- "status": "pass",
2160
  "local": {
2161
  "path": "repo:scripts/verify_live_publication.py",
2162
  "exists": true,
2163
- "bytes": 36201,
2164
- "sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2165
  },
2166
  "mirrors": {
2167
  "hf_artifacts": {
@@ -2177,7 +2351,22 @@
2177
  "sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2178
  }
2179
  },
2180
- "failures": []
2181
  },
2182
  {
2183
  "name": "scripts/validate_mirror_parity.py",
@@ -2406,12 +2595,12 @@
2406
  },
2407
  {
2408
  "name": "website/index.html",
2409
- "status": "pass",
2410
  "local": {
2411
  "path": "repo:docs/index.html",
2412
  "exists": true,
2413
- "bytes": 180727,
2414
- "sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2415
  },
2416
  "mirrors": {
2417
  "hf_space": {
@@ -2427,7 +2616,22 @@
2427
  "sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2428
  }
2429
  },
2430
- "failures": []
2431
  },
2432
  {
2433
  "name": "website/research_roadmap.html",
@@ -2692,12 +2896,12 @@
2692
  },
2693
  {
2694
  "name": "results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2695
- "status": "pass",
2696
  "local": {
2697
  "path": "repo:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2698
  "exists": true,
2699
- "bytes": 9231,
2700
- "sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2701
  },
2702
  "mirrors": {
2703
  "hf_space": {
@@ -2719,7 +2923,29 @@
2719
  "sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2720
  }
2721
  },
2722
- "failures": []
2723
  },
2724
  {
2725
  "name": "results/omni_finetune/multi_episode_128_task_baselines/BASELINE_ALIGNMENT_REPORT.md",
@@ -7032,12 +7258,12 @@
7032
  },
7033
  {
7034
  "name": "docs/RESEARCH_ROADMAP.md",
7035
- "status": "pass",
7036
  "local": {
7037
  "path": "repo:RESEARCH_ROADMAP.md",
7038
  "exists": true,
7039
- "bytes": 12233,
7040
- "sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7041
  },
7042
  "mirrors": {
7043
  "hf_space": {
@@ -7059,16 +7285,38 @@
7059
  "sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7060
  }
7061
  },
7062
- "failures": []
7063
  },
7064
  {
7065
  "name": "docs/PROJECT_STATUS.md",
7066
- "status": "pass",
7067
  "local": {
7068
  "path": "repo:PROJECT_STATUS.md",
7069
  "exists": true,
7070
- "bytes": 9926,
7071
- "sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7072
  },
7073
  "mirrors": {
7074
  "hf_space": {
@@ -7090,7 +7338,29 @@
7090
  "sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7091
  }
7092
  },
7093
- "failures": []
7094
  },
7095
  {
7096
  "name": "docs/PUBLIC_SURFACE_QA.md",
@@ -7217,5 +7487,294 @@
7217
  "failures": []
7218
  }
7219
  ],
7220
- "failures": []
7221
  }
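The `hash_mismatch` records in this parity report pair the SHA-256 of the repo copy (`expected_sha256`) with the SHA-256 found on each mirror surface (`actual_sha256`). The following is a minimal sketch of how such records and the `summary` block could be produced. It assumes file contents are already in memory as bytes, and the function names are illustrative, not those of `scripts/validate_mirror_parity.py`.

```python
import hashlib
from collections import Counter

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def check_group(name, local_bytes, mirrors):
    """Compare one repo file against its mirror copies.

    mirrors maps a surface name (e.g. "hf_space") to (path, bytes).
    The output field names follow the report; the structure of the
    inputs is an assumption made for this sketch.
    """
    expected = sha256_hex(local_bytes)
    failures = []
    for surface, (path, data) in mirrors.items():
        actual = sha256_hex(data)
        if actual != expected:
            failures.append({
                "surface": surface,
                "kind": "hash_mismatch",
                "path": path,
                "expected_sha256": expected,
                "actual_sha256": actual,
            })
    return {"name": name,
            "status": "fail" if failures else "pass",
            "failures": failures}

def summarize(groups):
    """Aggregate per-group failures into a summary like the report's."""
    by_surface = Counter(f["surface"] for g in groups for f in g["failures"])
    return {"group_count": len(groups),
            "failure_count": sum(by_surface.values()),
            "failures_by_surface": dict(by_surface)}
```

The per-surface counts in the updated report (11 + 12 + 12 + 1) sum to the reported `failure_count` of 36, which is the invariant `summarize` maintains by construction.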
 
1
  {
2
+ "status": "fail",
3
+ "generated_at_utc": "2026-06-07T17:27:20+00:00",
4
  "hf_root": "hf_publish",
5
  "summary": {
6
  "group_count": 234,
7
+ "failure_count": 36,
8
+ "failures_by_surface": {
9
+ "hf_space": 11,
10
+ "hf_artifacts": 12,
11
+ "hf_model": 12,
12
+ "hf_artifacts_docs": 1
13
+ }
14
  },
15
  "checks": [
16
  {
17
  "name": "repo_hf_space_artifact_model_data_parity",
18
+ "status": "fail"
19
  },
20
  {
21
  "name": "repo_hf_visual_asset_parity",
 
23
  },
24
  {
25
  "name": "repo_hf_validator_script_parity",
26
+ "status": "fail"
27
  },
28
  {
29
  "name": "repo_hf_website_html_parity",
30
+ "status": "fail"
31
  },
32
  {
33
  "name": "repo_hf_diagnostic_result_parity",
34
+ "status": "fail"
35
  },
36
  {
37
  "name": "repo_hf_quality_doc_parity",
38
+ "status": "fail"
39
  }
40
  ],
41
  "groups": [
 
351
  },
352
  {
353
  "name": "data/omni_finetune_verified_result.json",
354
+ "status": "fail",
355
  "local": {
356
  "path": "repo:docs/data/omni_finetune_verified_result.json",
357
  "exists": true,
358
+ "bytes": 3768,
359
+ "sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1"
360
  },
361
  "mirrors": {
362
  "hf_space": {
 
378
  "sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
379
  }
380
  },
381
+ "failures": [
382
+ {
383
+ "surface": "hf_space",
384
+ "kind": "hash_mismatch",
385
+ "path": "hf_space:data/omni_finetune_verified_result.json",
386
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
387
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
388
+ },
389
+ {
390
+ "surface": "hf_artifacts",
391
+ "kind": "hash_mismatch",
392
+ "path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
393
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
394
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
395
+ },
396
+ {
397
+ "surface": "hf_model",
398
+ "kind": "hash_mismatch",
399
+ "path": "hf_model:metrics/omni_finetune_verified_result.json",
400
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
401
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
402
+ }
403
+ ]
404
  },
405
  {
406
  "name": "data/omni_model_comparison.json",
407
+ "status": "fail",
408
  "local": {
409
  "path": "repo:docs/data/omni_model_comparison.json",
410
  "exists": true,
411
+ "bytes": 50422,
412
+ "sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb"
413
  },
414
  "mirrors": {
415
  "hf_space": {
 
431
  "sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
432
  }
433
  },
434
+ "failures": [
435
+ {
436
+ "surface": "hf_space",
437
+ "kind": "hash_mismatch",
438
+ "path": "hf_space:data/omni_model_comparison.json",
439
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
440
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
441
+ },
442
+ {
443
+ "surface": "hf_artifacts",
444
+ "kind": "hash_mismatch",
445
+ "path": "hf_artifacts:docs/data/omni_model_comparison.json",
446
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
447
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
448
+ },
449
+ {
450
+ "surface": "hf_model",
451
+ "kind": "hash_mismatch",
452
+ "path": "hf_model:metrics/omni_model_comparison.json",
453
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
454
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
455
+ }
456
+ ]
457
  },
458
  {
459
  "name": "data/project_brief.json",
 
519
  },
520
  {
521
  "name": "data/project_packet.json",
522
+ "status": "fail",
523
  "local": {
524
  "path": "repo:docs/data/project_packet.json",
525
  "exists": true,
526
+ "bytes": 8098,
527
+ "sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15"
528
  },
529
  "mirrors": {
530
  "hf_space": {
 
546
  "sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
547
  }
548
  },
549
+ "failures": [
550
+ {
551
+ "surface": "hf_space",
552
+ "kind": "hash_mismatch",
553
+ "path": "hf_space:data/project_packet.json",
554
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
555
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
556
+ },
557
+ {
558
+ "surface": "hf_artifacts",
559
+ "kind": "hash_mismatch",
560
+ "path": "hf_artifacts:docs/data/project_packet.json",
561
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
562
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
563
+ },
564
+ {
565
+ "surface": "hf_model",
566
+ "kind": "hash_mismatch",
567
+ "path": "hf_model:metrics/project_packet.json",
568
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
569
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
570
+ }
571
+ ]
572
  },
573
  {
574
  "name": "data/project_status.json",
575
+ "status": "fail",
576
  "local": {
577
  "path": "repo:docs/data/project_status.json",
578
  "exists": true,
579
+ "bytes": 18062,
580
+ "sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8"
581
  },
582
  "mirrors": {
583
  "hf_space": {
 
599
  "sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
600
  }
601
  },
602
+ "failures": [
603
+ {
604
+ "surface": "hf_space",
605
+ "kind": "hash_mismatch",
606
+ "path": "hf_space:data/project_status.json",
607
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
608
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
609
+ },
610
+ {
611
+ "surface": "hf_artifacts",
612
+ "kind": "hash_mismatch",
613
+ "path": "hf_artifacts:docs/data/project_status.json",
614
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
615
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
616
+ },
617
+ {
618
+ "surface": "hf_model",
619
+ "kind": "hash_mismatch",
620
+ "path": "hf_model:metrics/project_status.json",
621
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
622
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
623
+ }
624
+ ]
625
  },
626
  {
627
  "name": "data/publication_audit.json",
 
780
  },
781
  {
782
  "name": "data/research_roadmap.json",
783
+ "status": "fail",
784
  "local": {
785
  "path": "repo:docs/data/research_roadmap.json",
786
  "exists": true,
787
+ "bytes": 10246,
788
+ "sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06"
789
  },
790
  "mirrors": {
791
  "hf_space": {
 
807
  "sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
808
  }
809
  },
810
+ "failures": [
811
+ {
812
+ "surface": "hf_space",
813
+ "kind": "hash_mismatch",
814
+ "path": "hf_space:data/research_roadmap.json",
815
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
816
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
817
+ },
818
+ {
819
+ "surface": "hf_artifacts",
820
+ "kind": "hash_mismatch",
821
+ "path": "hf_artifacts:docs/data/research_roadmap.json",
822
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
823
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
824
+ },
825
+ {
826
+ "surface": "hf_model",
827
+ "kind": "hash_mismatch",
828
+ "path": "hf_model:metrics/research_roadmap.json",
829
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
830
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
831
+ }
832
+ ]
833
  },
834
  {
835
  "name": "data/research_roadmap_interactive.json",
836
+ "status": "fail",
837
  "local": {
838
  "path": "repo:docs/data/research_roadmap_interactive.json",
839
  "exists": true,
840
+ "bytes": 143673,
841
+ "sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6"
842
  },
843
  "mirrors": {
844
  "hf_space": {
 
860
  "sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
861
  }
862
  },
863
+ "failures": [
864
+ {
865
+ "surface": "hf_space",
866
+ "kind": "hash_mismatch",
867
+ "path": "hf_space:data/research_roadmap_interactive.json",
868
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
869
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
870
+ },
871
+ {
872
+ "surface": "hf_artifacts",
873
+ "kind": "hash_mismatch",
874
+ "path": "hf_artifacts:docs/data/research_roadmap_interactive.json",
875
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
876
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
877
+ },
878
+ {
879
+ "surface": "hf_model",
880
+ "kind": "hash_mismatch",
881
+ "path": "hf_model:metrics/research_roadmap_interactive.json",
882
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
883
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
884
+ }
885
+ ]
886
  },
887
  {
888
  "name": "data/research_takeaways.json",
 
1165
  },
1166
  {
1167
  "name": "data/website_integrity.json",
1168
+ "status": "fail",
1169
  "local": {
1170
  "path": "repo:docs/data/website_integrity.json",
1171
  "exists": true,
1172
  "bytes": 15375,
1173
+ "sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828"
1174
  },
1175
  "mirrors": {
1176
  "hf_space": {
 
1192
  "sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1193
  }
1194
  },
1195
+ "failures": [
1196
+ {
1197
+ "surface": "hf_space",
1198
+ "kind": "hash_mismatch",
1199
+ "path": "hf_space:data/website_integrity.json",
1200
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
1201
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1202
+ },
1203
+ {
1204
+ "surface": "hf_artifacts",
1205
+ "kind": "hash_mismatch",
1206
+ "path": "hf_artifacts:docs/data/website_integrity.json",
1207
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
1208
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1209
+ },
1210
+ {
1211
+ "surface": "hf_model",
1212
+ "kind": "hash_mismatch",
1213
+ "path": "hf_model:metrics/website_integrity.json",
1214
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
1215
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1216
+ }
1217
+ ]
1218
  },
1219
  {
1220
  "name": "data/xperience10m_dataset_card_alignment.json",
 
1940
  },
1941
  {
1942
  "name": "scripts/omni/build_omni_model_comparison.py",
1943
+ "status": "fail",
1944
  "local": {
1945
  "path": "repo:scripts/omni/build_omni_model_comparison.py",
1946
  "exists": true,
1947
+ "bytes": 35566,
1948
+ "sha256": "c66d3d9dd32dd16203bb5a832d9bdafb985c44d3b4040cbd58cd08e77a70458a"
1949
  },
1950
  "mirrors": {
1951
  "hf_artifacts": {
 
1961
  "sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1962
  }
1963
  },
1964
+ "failures": [
1965
+ {
1966
+ "surface": "hf_artifacts",
1967
+ "kind": "hash_mismatch",
1968
+ "path": "hf_artifacts:scripts/omni/build_omni_model_comparison.py",
1969
+ "expected_sha256": "c66d3d9dd32dd16203bb5a832d9bdafb985c44d3b4040cbd58cd08e77a70458a",
1970
+ "actual_sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1971
+ },
1972
+ {
1973
+ "surface": "hf_model",
1974
+ "kind": "hash_mismatch",
1975
+ "path": "hf_model:scripts/omni/build_omni_model_comparison.py",
1976
+ "expected_sha256": "c66d3d9dd32dd16203bb5a832d9bdafb985c44d3b4040cbd58cd08e77a70458a",
1977
+ "actual_sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1978
+ }
1979
+ ]
1980
  },
1981
  {
1982
  "name": "scripts/omni/prepare_qwen3_lora_hf_package.py",
 
2330
  },
2331
  {
2332
  "name": "scripts/verify_live_publication.py",
2333
+ "status": "fail",
2334
  "local": {
2335
  "path": "repo:scripts/verify_live_publication.py",
2336
  "exists": true,
2337
+ "bytes": 36285,
2338
+ "sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471"
2339
  },
2340
  "mirrors": {
2341
  "hf_artifacts": {
 
2351
  "sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2352
  }
2353
  },
2354
+ "failures": [
2355
+ {
2356
+ "surface": "hf_artifacts",
2357
+ "kind": "hash_mismatch",
2358
+ "path": "hf_artifacts:scripts/verify_live_publication.py",
2359
+ "expected_sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471",
2360
+ "actual_sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2361
+ },
2362
+ {
2363
+ "surface": "hf_model",
2364
+ "kind": "hash_mismatch",
2365
+ "path": "hf_model:scripts/verify_live_publication.py",
2366
+ "expected_sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471",
2367
+ "actual_sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2368
+ }
2369
+ ]
2370
  },
2371
  {
2372
  "name": "scripts/validate_mirror_parity.py",
 
2595
  },
2596
  {
2597
  "name": "website/index.html",
2598
+ "status": "fail",
2599
  "local": {
2600
  "path": "repo:docs/index.html",
2601
  "exists": true,
2602
+ "bytes": 181095,
2603
+ "sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1"
2604
  },
2605
  "mirrors": {
2606
  "hf_space": {
 
2616
  "sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2617
  }
2618
  },
2619
+ "failures": [
2620
+ {
2621
+ "surface": "hf_space",
2622
+ "kind": "hash_mismatch",
2623
+ "path": "hf_space:index.html",
2624
+ "expected_sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1",
2625
+ "actual_sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2626
+ },
2627
+ {
2628
+ "surface": "hf_artifacts_docs",
2629
+ "kind": "hash_mismatch",
2630
+ "path": "hf_artifacts:docs/index.html",
2631
+ "expected_sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1",
2632
+ "actual_sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2633
+ }
2634
+ ]
2635
  },
2636
  {
2637
  "name": "website/research_roadmap.html",
 
2896
  },
2897
  {
2898
  "name": "results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2899
+ "status": "fail",
2900
  "local": {
2901
  "path": "repo:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2902
  "exists": true,
2903
+ "bytes": 9893,
2904
+ "sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb"
2905
  },
2906
  "mirrors": {
2907
  "hf_space": {
 
2923
  "sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2924
  }
2925
  },
2926
+ "failures": [
2927
+ {
2928
+ "surface": "hf_space",
2929
+ "kind": "hash_mismatch",
2930
+ "path": "hf_space:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2931
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
2932
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2933
+ },
2934
+ {
2935
+ "surface": "hf_artifacts",
2936
+ "kind": "hash_mismatch",
2937
+ "path": "hf_artifacts:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2938
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
2939
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2940
+ },
2941
+ {
2942
+ "surface": "hf_model",
2943
+ "kind": "hash_mismatch",
2944
+ "path": "hf_model:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2945
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
2946
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2947
+ }
2948
+ ]
2949
  },
2950
  {
2951
  "name": "results/omni_finetune/multi_episode_128_task_baselines/BASELINE_ALIGNMENT_REPORT.md",
 
7258
  },
7259
  {
7260
  "name": "docs/RESEARCH_ROADMAP.md",
7261
+ "status": "fail",
7262
  "local": {
7263
  "path": "repo:RESEARCH_ROADMAP.md",
7264
  "exists": true,
7265
+ "bytes": 12874,
7266
+ "sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347"
7267
  },
7268
  "mirrors": {
7269
  "hf_space": {
 
7285
  "sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7286
  }
7287
  },
7288
+ "failures": [
7289
+ {
7290
+ "surface": "hf_space",
7291
+ "kind": "hash_mismatch",
7292
+ "path": "hf_space:RESEARCH_ROADMAP.md",
7293
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7294
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7295
+ },
7296
+ {
7297
+ "surface": "hf_artifacts",
7298
+ "kind": "hash_mismatch",
7299
+ "path": "hf_artifacts:RESEARCH_ROADMAP.md",
7300
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7301
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7302
+ },
7303
+ {
7304
+ "surface": "hf_model",
7305
+ "kind": "hash_mismatch",
7306
+ "path": "hf_model:RESEARCH_ROADMAP.md",
7307
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7308
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7309
+ }
7310
+ ]
7311
  },
7312
  {
7313
  "name": "docs/PROJECT_STATUS.md",
7314
+ "status": "fail",
7315
  "local": {
7316
  "path": "repo:PROJECT_STATUS.md",
7317
  "exists": true,
7318
+ "bytes": 11369,
7319
+ "sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114"
7320
  },
7321
  "mirrors": {
7322
  "hf_space": {
 
7338
  "sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7339
  }
7340
  },
7341
+ "failures": [
7342
+ {
7343
+ "surface": "hf_space",
7344
+ "kind": "hash_mismatch",
7345
+ "path": "hf_space:PROJECT_STATUS.md",
7346
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7347
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7348
+ },
7349
+ {
7350
+ "surface": "hf_artifacts",
7351
+ "kind": "hash_mismatch",
7352
+ "path": "hf_artifacts:PROJECT_STATUS.md",
7353
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7354
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7355
+ },
7356
+ {
7357
+ "surface": "hf_model",
7358
+ "kind": "hash_mismatch",
7359
+ "path": "hf_model:PROJECT_STATUS.md",
7360
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7361
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7362
+ }
7363
+ ]
7364
  },
7365
  {
7366
  "name": "docs/PUBLIC_SURFACE_QA.md",
 
7487
  "failures": []
7488
  }
7489
  ],
7490
+ "failures": [
7491
+ {
7492
+ "group": "data/omni_finetune_verified_result.json",
7493
+ "surface": "hf_space",
7494
+ "kind": "hash_mismatch",
7495
+ "path": "hf_space:data/omni_finetune_verified_result.json",
7496
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
7497
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
7498
+ },
7499
+ {
7500
+ "group": "data/omni_finetune_verified_result.json",
7501
+ "surface": "hf_artifacts",
7502
+ "kind": "hash_mismatch",
7503
+ "path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
7504
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
7505
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
7506
+ },
7507
+ {
7508
+ "group": "data/omni_finetune_verified_result.json",
7509
+ "surface": "hf_model",
7510
+ "kind": "hash_mismatch",
7511
+ "path": "hf_model:metrics/omni_finetune_verified_result.json",
7512
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
7513
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
7514
+ },
7515
+ {
7516
+ "group": "data/omni_model_comparison.json",
7517
+ "surface": "hf_space",
7518
+ "kind": "hash_mismatch",
7519
+ "path": "hf_space:data/omni_model_comparison.json",
7520
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
7521
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
7522
+ },
7523
+ {
7524
+ "group": "data/omni_model_comparison.json",
7525
+ "surface": "hf_artifacts",
7526
+ "kind": "hash_mismatch",
7527
+ "path": "hf_artifacts:docs/data/omni_model_comparison.json",
7528
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
7529
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
7530
+ },
7531
+ {
7532
+ "group": "data/omni_model_comparison.json",
7533
+ "surface": "hf_model",
7534
+ "kind": "hash_mismatch",
7535
+ "path": "hf_model:metrics/omni_model_comparison.json",
7536
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
7537
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
7538
+ },
7539
+ {
7540
+ "group": "data/project_packet.json",
7541
+ "surface": "hf_space",
7542
+ "kind": "hash_mismatch",
7543
+ "path": "hf_space:data/project_packet.json",
7544
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
7545
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
7546
+ },
7547
+ {
7548
+ "group": "data/project_packet.json",
7549
+ "surface": "hf_artifacts",
7550
+ "kind": "hash_mismatch",
7551
+ "path": "hf_artifacts:docs/data/project_packet.json",
7552
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
7553
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
7554
+ },
7555
+ {
7556
+ "group": "data/project_packet.json",
7557
+ "surface": "hf_model",
7558
+ "kind": "hash_mismatch",
7559
+ "path": "hf_model:metrics/project_packet.json",
7560
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
7561
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
7562
+ },
7563
+ {
7564
+ "group": "data/project_status.json",
7565
+ "surface": "hf_space",
7566
+ "kind": "hash_mismatch",
7567
+ "path": "hf_space:data/project_status.json",
7568
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
7569
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
7570
+ },
7571
+ {
7572
+ "group": "data/project_status.json",
7573
+ "surface": "hf_artifacts",
7574
+ "kind": "hash_mismatch",
7575
+ "path": "hf_artifacts:docs/data/project_status.json",
7576
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
7577
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
7578
+ },
7579
+ {
7580
+ "group": "data/project_status.json",
7581
+ "surface": "hf_model",
7582
+ "kind": "hash_mismatch",
7583
+ "path": "hf_model:metrics/project_status.json",
7584
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
7585
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
7586
+ },
7587
+ {
7588
+ "group": "data/research_roadmap.json",
7589
+ "surface": "hf_space",
7590
+ "kind": "hash_mismatch",
7591
+ "path": "hf_space:data/research_roadmap.json",
7592
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
7593
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
7594
+ },
7595
+ {
7596
+ "group": "data/research_roadmap.json",
7597
+ "surface": "hf_artifacts",
7598
+ "kind": "hash_mismatch",
7599
+ "path": "hf_artifacts:docs/data/research_roadmap.json",
7600
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
7601
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
7602
+ },
7603
+ {
7604
+ "group": "data/research_roadmap.json",
7605
+ "surface": "hf_model",
7606
+ "kind": "hash_mismatch",
7607
+ "path": "hf_model:metrics/research_roadmap.json",
7608
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
7609
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
7610
+ },
7611
+ {
7612
+ "group": "data/research_roadmap_interactive.json",
7613
+ "surface": "hf_space",
7614
+ "kind": "hash_mismatch",
7615
+ "path": "hf_space:data/research_roadmap_interactive.json",
7616
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
7617
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
7618
+ },
7619
+ {
7620
+ "group": "data/research_roadmap_interactive.json",
7621
+ "surface": "hf_artifacts",
7622
+ "kind": "hash_mismatch",
7623
+ "path": "hf_artifacts:docs/data/research_roadmap_interactive.json",
7624
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
7625
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
7626
+ },
7627
+ {
7628
+ "group": "data/research_roadmap_interactive.json",
7629
+ "surface": "hf_model",
7630
+ "kind": "hash_mismatch",
7631
+ "path": "hf_model:metrics/research_roadmap_interactive.json",
7632
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
7633
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
7634
+ },
7635
+ {
7636
+ "group": "data/website_integrity.json",
7637
+ "surface": "hf_space",
7638
+ "kind": "hash_mismatch",
7639
+ "path": "hf_space:data/website_integrity.json",
7640
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
7641
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
7642
+ },
7643
+ {
7644
+ "group": "data/website_integrity.json",
7645
+ "surface": "hf_artifacts",
7646
+ "kind": "hash_mismatch",
7647
+ "path": "hf_artifacts:docs/data/website_integrity.json",
7648
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
7649
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
7650
+ },
7651
+ {
7652
+ "group": "data/website_integrity.json",
7653
+ "surface": "hf_model",
7654
+ "kind": "hash_mismatch",
7655
+ "path": "hf_model:metrics/website_integrity.json",
7656
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
7657
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
7658
+ },
7659
+ {
7660
+ "group": "scripts/omni/build_omni_model_comparison.py",
7661
+ "surface": "hf_artifacts",
7662
+ "kind": "hash_mismatch",
7663
+ "path": "hf_artifacts:scripts/omni/build_omni_model_comparison.py",
7664
+ "expected_sha256": "c66d3d9dd32dd16203bb5a832d9bdafb985c44d3b4040cbd58cd08e77a70458a",
7665
+ "actual_sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
7666
+ },
7667
+ {
7668
+ "group": "scripts/omni/build_omni_model_comparison.py",
7669
+ "surface": "hf_model",
7670
+ "kind": "hash_mismatch",
7671
+ "path": "hf_model:scripts/omni/build_omni_model_comparison.py",
7672
+ "expected_sha256": "c66d3d9dd32dd16203bb5a832d9bdafb985c44d3b4040cbd58cd08e77a70458a",
7673
+ "actual_sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
7674
+ },
7675
+ {
7676
+ "group": "scripts/verify_live_publication.py",
7677
+ "surface": "hf_artifacts",
7678
+ "kind": "hash_mismatch",
7679
+ "path": "hf_artifacts:scripts/verify_live_publication.py",
7680
+ "expected_sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471",
7681
+ "actual_sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
7682
+ },
7683
+ {
7684
+ "group": "scripts/verify_live_publication.py",
7685
+ "surface": "hf_model",
7686
+ "kind": "hash_mismatch",
7687
+ "path": "hf_model:scripts/verify_live_publication.py",
7688
+ "expected_sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471",
7689
+ "actual_sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
7690
+ },
7691
+ {
7692
+ "group": "website/index.html",
7693
+ "surface": "hf_space",
7694
+ "kind": "hash_mismatch",
7695
+ "path": "hf_space:index.html",
7696
+ "expected_sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1",
7697
+ "actual_sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
7698
+ },
7699
+ {
7700
+ "group": "website/index.html",
7701
+ "surface": "hf_artifacts_docs",
7702
+ "kind": "hash_mismatch",
7703
+ "path": "hf_artifacts:docs/index.html",
7704
+ "expected_sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1",
7705
+ "actual_sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
7706
+ },
7707
+ {
7708
+ "group": "results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7709
+ "surface": "hf_space",
7710
+ "kind": "hash_mismatch",
7711
+ "path": "hf_space:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7712
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
7713
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
7714
+ },
7715
+ {
7716
+ "group": "results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7717
+ "surface": "hf_artifacts",
7718
+ "kind": "hash_mismatch",
7719
+ "path": "hf_artifacts:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7720
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
7721
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
7722
+ },
7723
+ {
7724
+ "group": "results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7725
+ "surface": "hf_model",
7726
+ "kind": "hash_mismatch",
7727
+ "path": "hf_model:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7728
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
7729
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
7730
+ },
7731
+ {
7732
+ "group": "docs/RESEARCH_ROADMAP.md",
7733
+ "surface": "hf_space",
7734
+ "kind": "hash_mismatch",
7735
+ "path": "hf_space:RESEARCH_ROADMAP.md",
7736
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7737
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7738
+ },
7739
+ {
7740
+ "group": "docs/RESEARCH_ROADMAP.md",
7741
+ "surface": "hf_artifacts",
7742
+ "kind": "hash_mismatch",
7743
+ "path": "hf_artifacts:RESEARCH_ROADMAP.md",
7744
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7745
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7746
+ },
7747
+ {
7748
+ "group": "docs/RESEARCH_ROADMAP.md",
7749
+ "surface": "hf_model",
7750
+ "kind": "hash_mismatch",
7751
+ "path": "hf_model:RESEARCH_ROADMAP.md",
7752
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7753
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7754
+ },
7755
+ {
7756
+ "group": "docs/PROJECT_STATUS.md",
7757
+ "surface": "hf_space",
7758
+ "kind": "hash_mismatch",
7759
+ "path": "hf_space:PROJECT_STATUS.md",
7760
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7761
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7762
+ },
7763
+ {
7764
+ "group": "docs/PROJECT_STATUS.md",
7765
+ "surface": "hf_artifacts",
7766
+ "kind": "hash_mismatch",
7767
+ "path": "hf_artifacts:PROJECT_STATUS.md",
7768
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7769
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7770
+ },
7771
+ {
7772
+ "group": "docs/PROJECT_STATUS.md",
7773
+ "surface": "hf_model",
7774
+ "kind": "hash_mismatch",
7775
+ "path": "hf_model:PROJECT_STATUS.md",
7776
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7777
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7778
+ }
7779
+ ]
7780
  }
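The failure records in the mirror parity diff above all share one shape: a per-group entry compares the local file's SHA-256 against each mirror surface, and the top-level `failures` list repeats every mismatch with a `group` key added. The following is a minimal sketch of that comparison, not the code in `scripts/validate_mirror_parity.py` (which is not shown here). The function names `sha256_hex`, `compare_group`, and `flatten_failures`, the in-memory byte inputs, and the `"pass"` status value for groups without failures are all assumptions made for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def compare_group(name, local_bytes, mirrors):
    """Compare one local file against its mirrors.

    `mirrors` maps a surface name (for example "hf_space") to a
    (path, bytes) tuple. This input shape is hypothetical; only the
    output record fields follow the diff above.
    """
    expected = sha256_hex(local_bytes)
    failures = []
    for surface, (path, mirror_bytes) in mirrors.items():
        actual = sha256_hex(mirror_bytes)
        if actual != expected:
            failures.append(
                {
                    "surface": surface,
                    "kind": "hash_mismatch",
                    "path": path,
                    "expected_sha256": expected,
                    "actual_sha256": actual,
                }
            )
    # "pass" is an assumed status value for groups with no mismatches.
    status = "fail" if failures else "pass"
    return {"name": name, "status": status, "failures": failures}


def flatten_failures(groups):
    """Build the top-level list: each failure prefixed with its group name."""
    return [{"group": g["name"], **f} for g in groups for f in g["failures"]]


# Example: one mirror matches the local bytes, one has drifted.
group = compare_group(
    "docs/PROJECT_STATUS.md",
    b"local contents",
    {
        "hf_space": ("hf_space:PROJECT_STATUS.md", b"local contents"),
        "hf_model": ("hf_model:PROJECT_STATUS.md", b"stale contents"),
    },
)
top_level_failures = flatten_failures([group])
```

Keeping the per-group list and the flattened list derived from the same records means the two views cannot disagree, which matches how every mismatch above appears once under its group and once at the top level.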
data/omni_finetune_verified_result.json CHANGED
@@ -80,7 +80,7 @@
80
  "required_next_steps": [
81
  "Use the v3 strict-label predictions for action/subtask error analysis and unseen-label debugging.",
82
  "Keep the existing Qwen LoRA adapter repository as the weight-bearing artifact; v3 is an evaluation/package refresh over the same adapter, not new weights.",
83
- "Implement the Cosmos3-Super diffusion/action target packer and supervised loss before claiming Cosmos3 fine-tuning.",
84
  "Use sharded Qwen eval for future long held-out passes to improve GPU utilization."
85
  ]
86
  }
 
80
  "required_next_steps": [
81
  "Use the v3 strict-label predictions for action/subtask error analysis and unseen-label debugging.",
82
  "Keep the existing Qwen LoRA adapter repository as the weight-bearing artifact; v3 is an evaluation/package refresh over the same adapter, not new weights.",
83
+ "Implement the Cosmos3-Super pipeline-loaded batch packer and one-sample forward-dynamics overfit before claiming Cosmos3 fine-tuning; camera-pose proxy targets are now exported, contract-audited, and schema-packed, but no Cosmos weights have been updated.",
84
  "Use sharded Qwen eval for future long held-out passes to improve GPU utilization."
85
  ]
86
  }
data/omni_model_comparison.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
3
- "generated_at_utc": "2026-06-07T15:34:51+00:00",
4
  "status": "pass",
5
  "version_count": 3,
6
  "model_group_count": 4,
@@ -8,7 +8,7 @@
8
  "version_reading_notes": [
9
  "Version 1 is the public-sample 12-task harness with minimal and neural heads.",
10
  "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
11
- "Version 3 is the verified model-branch layer: the current final Qwen3-Omni LoRA package is the JSON-task diagnostic result, Cosmos3-Nano is a future-window compatibility result, and Cosmos3-Super Reasoner is a base-weight JSON-task evaluation rather than a new fine-tuned weight release."
12
  ],
13
  "versions": [
14
  {
@@ -1012,7 +1012,62 @@
1012
  "weights_updated": false
1013
  },
1014
  "weights": "none; readiness audit only, no adapter checkpoint",
1015
- "interpretation": "This probe confirms the staged Cosmos3-Super Diffusers/GPU runtime and the same JSON QA dataset are visible, but blocks true fine-tuning until a Cosmos-specific diffusion/action target packer and supervised loss are implemented."
1016
  }
1017
  ],
1018
  "multi_episode_128_runs": [
@@ -1056,7 +1111,7 @@
1056
  "weights_repository": "none for this run: staged base nv-community/Cosmos3-Super weights were evaluated through vLLM; create a separate repo only after new adapter or fine-tuned weights exist"
1057
  }
1058
  ],
1059
- "comparison_note": "Cosmos3-Super is now represented by a verified 448-window held-out Reasoner evaluation on the same JSON task as Qwen3. It uses staged base weights through vLLM, so it is a model-branch diagnostic, not a weight release. The readiness probe records why true Cosmos3-Super fine-tuning is not launched yet."
1060
  }
1061
  ],
1062
  "model_group_reading_notes": [
@@ -1064,10 +1119,10 @@
1064
  "Task-head baselines have both a one-episode public-sample run and a 128-episode same-split metadata/text run.",
1065
  "Qwen3-Omni has a one-episode sensor-adapter smoke test and separate 128-episode LoRA diagnostic packages; only the final 128-episode adapter belongs in the Qwen LoRA model repo.",
1066
  "Cosmos3-Nano has a 128-episode future-window compatibility package.",
1067
- "Cosmos3-Super has a 128-episode base-weight Reasoner evaluation on the JSON task plus a training-readiness probe; create a separate Cosmos model repo only after real Cosmos adapter/fine-tuned weights exist."
1068
  ],
1069
  "pending": [
1070
  "Use the final Qwen3 full-eval package as the current Qwen result; older Qwen package rows remain historical diagnostics for comparison.",
1071
- "Promote Cosmos3 from Nano compatibility and Super base-weight evaluation to true fine-tuning only after a dedicated Cosmos diffusion/action target packer and supervised loss produce new weights."
1072
  ]
1073
  }
 
1
  {
2
  "title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
3
+ "generated_at_utc": "2026-06-07T17:27:36+00:00",
4
  "status": "pass",
5
  "version_count": 3,
6
  "model_group_count": 4,
 
8
  "version_reading_notes": [
9
  "Version 1 is the public-sample 12-task harness with minimal and neural heads.",
10
  "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
11
+ "Version 3 is the verified model-branch layer: the current final Qwen3-Omni LoRA package is the JSON-task diagnostic result, Cosmos3-Nano is a future-window compatibility result, and Cosmos3-Super Reasoner is a base-weight JSON-task evaluation; Cosmos3-Super now has a camera-pose forward-dynamics contract audit and schema-only packer smoke, but no new fine-tuned weight release."
12
  ],
13
  "versions": [
14
  {
 
1012
  "weights_updated": false
1013
  },
1014
  "weights": "none; readiness audit only, no adapter checkpoint",
1015
+ "interpretation": "This probe confirms the staged Cosmos3-Super Diffusers/GPU runtime and the same JSON QA dataset are visible. It predates the camera-pose action-target export, so use the 20260608 contract audit for the current trainer-readiness status."
1016
+ },
1017
+ {
1018
+ "id": "xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608",
1019
+ "title": "Cosmos3-Super Camera-Pose Target Audit",
1020
+ "scope_label": "action target contract",
1021
+ "scope": "selected 128-episode 96/16/16 dataset augmented with camera_pose proxy cosmos_action_target records",
1022
+ "status": "ready_for_forward_dynamics_trainer",
1023
+ "source": "results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608/training_contract_audit.json",
1024
+ "split": "train/val/test by selected episode/session",
1025
+ "counts": {
1026
+ "dataset_samples": 3808,
1027
+ "rows_with_action_target": 3808,
1028
+ "valid_action_targets": 3808,
1029
+ "split_counts": {
1030
+ "train": 2848,
1031
+ "val": 512,
1032
+ "test": 448
1033
+ },
1034
+ "episode_split_counts": {
1035
+ "test": 14,
1036
+ "train": 89,
1037
+ "val": 16
1038
+ }
1039
+ },
1040
+ "primary_metrics": {
1041
+ "domain_name": "camera_pose",
1042
+ "raw_action_dim": 9,
1043
+ "mode": "forward_dynamics",
1044
+ "valid_action_targets": 3808,
1045
+ "weights_updated": false
1046
+ },
1047
+ "weights": "none; action-target contract audit only, no adapter checkpoint",
1048
+ "interpretation": "The selected dataset now has valid Cosmos3 camera_pose forward_dynamics targets for an egocentric camera-motion proxy. These remove the target-schema blocker for action-conditioned world-model training, but they supervise noisy vision tokens rather than preds_action. The remaining work is a pipeline-loaded packer check and one-sample forward-dynamics overfit; action-token prediction needs a separate policy or inverse-dynamics target export."
1049
+ },
1050
+ {
1051
+ "id": "xperience10m_cosmos3_super_action_packer_schema_smoke_20260608",
1052
+ "title": "Cosmos3-Super Action Batch Packer Smoke",
1053
+ "scope_label": "batch packer",
1054
+ "scope": "one selected train row from the camera_pose forward_dynamics augmented JSONL",
1055
+ "status": "pass",
1056
+ "source": "results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/packer_summary.json",
1057
+ "split": "train",
1058
+ "counts": {
1059
+ "samples": 1,
1060
+ "raw_action_rows": 8,
1061
+ "raw_action_dim": 9
1062
+ },
1063
+ "primary_metrics": {
1064
+ "mode": "forward_dynamics",
1065
+ "loss_surface": "vision_velocity_conditioned_on_camera_pose",
1066
+ "pipeline_loaded": false,
1067
+ "weights_updated": false
1068
+ },
1069
+ "weights": "none; schema-only packer smoke, no adapter checkpoint",
1070
+ "interpretation": "The selected row maps to a camera_pose forward_dynamics contract. In the installed Cosmos3 pipeline this uses raw actions as conditioning and supervises noisy vision tokens; it does not supervise preds_action."
1071
  }
1072
  ],
1073
  "multi_episode_128_runs": [
 
1111
  "weights_repository": "none for this run: staged base nv-community/Cosmos3-Super weights were evaluated through vLLM; create a separate repo only after new adapter or fine-tuned weights exist"
1112
  }
1113
  ],
1114
+ "comparison_note": "Cosmos3-Super is now represented by a verified 448-window held-out Reasoner evaluation on the same JSON task as Qwen3. It uses staged base weights through vLLM, so it is a model-branch diagnostic, not a weight release. A camera-pose proxy forward-dynamics target export now passes the contract audit and schema-only packer smoke; true Cosmos3-Super fine-tuning is still not launched until the pipeline-loaded packer check and one-sample overfit exist."
1115
  }
1116
  ],
1117
  "model_group_reading_notes": [
 
1119
  "Task-head baselines have both a one-episode public-sample run and a 128-episode same-split metadata/text run.",
1120
  "Qwen3-Omni has a one-episode sensor-adapter smoke test and separate 128-episode LoRA diagnostic packages; only the final 128-episode adapter belongs in the Qwen LoRA model repo.",
1121
  "Cosmos3-Nano has a 128-episode future-window compatibility package.",
1122
+ "Cosmos3-Super has a 128-episode base-weight Reasoner evaluation on the JSON task plus a camera-pose forward-dynamics contract audit; create a separate Cosmos model repo only after real Cosmos adapter/fine-tuned weights exist."
1123
  ],
1124
  "pending": [
1125
  "Use the final Qwen3 full-eval package as the current Qwen result; older Qwen package rows remain historical diagnostics for comparison.",
1126
+ "Promote Cosmos3 from Nano compatibility, Super base-weight evaluation, and the camera-pose forward-dynamics contract to true fine-tuning only after the pipeline-loaded packer check and one-sample overfit produce new weights."
1127
  ]
1128
  }
data/project_packet.json CHANGED
@@ -41,7 +41,7 @@
41
  "docs/data/scope_claims_audit.json",
42
  "docs/data/website_integrity.json"
43
  ],
44
- "readout": "The project status table and roadmap give the compact current-state summary. Single-episode task engineering, metrics, visualizations, public website integrity, mirror parity, same-split 128-episode baselines, the final selected-episode Qwen3-Omni diagnostic result, the Cosmos3-Nano compatibility package, and the Cosmos3-Super base-weight Reasoner evaluation are implemented; stronger action/subtask and real Cosmos fine-tuned model quality remain follow-ups."
45
  },
46
  {
47
  "step": 2,
 
41
  "docs/data/scope_claims_audit.json",
42
  "docs/data/website_integrity.json"
43
  ],
44
+ "readout": "The project status table and roadmap give the compact current-state summary. Single-episode task engineering, metrics, visualizations, public website integrity, mirror parity, same-split 128-episode baselines, the final selected-episode Qwen3-Omni diagnostic result, the Cosmos3-Nano compatibility package, the Cosmos3-Super base-weight Reasoner evaluation, and the Cosmos3-Super camera-pose forward-dynamics contract audit plus schema-only packer smoke are implemented; stronger action/subtask and real Cosmos fine-tuned model quality remain follow-ups."
45
  },
46
  {
47
  "step": 2,
data/project_status.json CHANGED
@@ -119,7 +119,7 @@
119
  "FOUNDATION_MODEL_PLAN.md",
120
  "docs/data/foundation_model_plan.json"
121
  ],
122
- "readout": "Qwen3-Omni remains the first trainable held-out LoRA baseline; Cosmos 3 is now represented by a verified Cosmos3-Nano future-window compatibility package plus a verified Cosmos3-Super base-weight Reasoner evaluation; OpenVLA/openpi/GR00T are policy candidates after action targets are explicit."
123
  },
124
  {
125
  "area": "Omni model extension contract",
@@ -244,6 +244,18 @@
244
  ],
245
  "readout": "Cosmos3-Super Reasoner now has a public-safe verified 448-window held-out evaluation on the same structured JSON task as Qwen3. It uses staged nv-community/Cosmos3-Super base weights through an 8-GPU vLLM server, not fine-tuned weights: JSON validity 0.5112, action macro-F1 0.0008, transition accuracy 0.3683, contact accuracy 0.3214, and object micro-F1 0.1370."
246
  },
247
  {
248
  "area": "Raw Xperience-10M redistribution",
249
  "status": "not_included",
@@ -276,11 +288,11 @@
276
  "Use docs/data/omni_model_comparison.json to compare both views: the single-episode/128-baseline/model-branch result layers and the model-family grouping for task heads, Qwen3-Omni LoRA, Cosmos3-Nano, and Cosmos3-Super.",
277
  "Use docs/data/omni_finetune_verified_result.json and the latest verified_public final Qwen package for current held-out results.",
278
  "The 128-episode aligned simple/NN baselines use metadata/text features from the derived Qwen JSONL export; they align the split and task ids but do not replace raw-modality baselines for trajectory, retrieval, reconstruction, or misalignment tasks.",
279
- "The Cosmos3-Nano future-window branch is verified as a compatibility adapter result, and Cosmos3-Super Reasoner is verified as a base-weight evaluation; one-episode Cosmos fine-tuning and full Cosmos adapter/diffusion-weight fine-tuning remain pending, so no Cosmos weight repo should be published yet.",
280
  "The current reconstruction task reconstructs feature vectors, not pixel-depth, mesh, NeRF, or Gaussian reconstruction.",
281
  "Audio is one of the synchronized source modalities in the current task representation.",
282
  "The audio ablation report compares audio/no-audio variants across all 12 task contracts in results/audio_ablation/.",
283
- "Foundation-model selection is explicit: Qwen3-Omni is the immediate trainable pilot, Cosmos 3 is the first world-model branch, and policy models such as OpenVLA/openpi/GR00T wait for action-target conversion.",
284
  "Future model branches should be added through the backbone registry and verified package contract, not as one-off result folders with incompatible metrics or publication rules.",
285
  "The Xperience Embodied Foundation Model is a future native-pretraining goal, not a completed model or current benchmark."
286
  ]
 
119
  "FOUNDATION_MODEL_PLAN.md",
120
  "docs/data/foundation_model_plan.json"
121
  ],
122
+ "readout": "Qwen3-Omni remains the first trainable held-out LoRA baseline; Cosmos 3 is now represented by a verified Cosmos3-Nano future-window compatibility package, a verified Cosmos3-Super base-weight Reasoner evaluation, and a Cosmos3-Super camera-pose proxy forward-dynamics contract audit plus schema-only packer smoke. The current target supports vision-velocity training under action conditioning, not supervised action-token prediction; OpenVLA/openpi/GR00T are policy candidates after robot-compatible action targets are explicit."
123
  },
124
  {
125
  "area": "Omni model extension contract",
 
244
  ],
245
  "readout": "Cosmos3-Super Reasoner now has a public-safe verified 448-window held-out evaluation on the same structured JSON task as Qwen3. It uses staged nv-community/Cosmos3-Super base weights through an 8-GPU vLLM server, not fine-tuned weights: JSON validity 0.5112, action macro-F1 0.0008, transition accuracy 0.3683, contact accuracy 0.3214, and object micro-F1 0.1370."
246
  },
247
+ {
248
+ "area": "Cosmos3-Super action-target contract",
249
+ "status": "ready_for_forward_dynamics_trainer_implementation",
250
+ "evidence": [
251
+ "scripts/omni/export_cosmos3_camera_pose_targets.py",
252
+ "scripts/omni/pack_cosmos3_super_action_batch.py",
253
+ "results/omni_finetune/xperience10m_cosmos3_camera_pose_targets_20260608/target_manifest.json",
254
+ "results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608/training_contract_audit.json",
255
+ "results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/packer_summary.json"
256
+ ],
257
+ "readout": "The selected 128-episode JSONL is augmented with 3,808/3,808 valid camera_pose proxy cosmos_action_target records from SLAM pose deltas. The schema-only packer smoke confirms the current forward_dynamics target should supervise noisy vision tokens under camera-pose conditioning; it does not supervise preds_action. Remaining work is a pipeline-loaded packer check, one-sample forward-dynamics overfit, and a separate policy/inverse target export before claiming action-token prediction."
258
+ },
259
  {
260
  "area": "Raw Xperience-10M redistribution",
261
  "status": "not_included",
 
288
  "Use docs/data/omni_model_comparison.json to compare both views: the single-episode/128-baseline/model-branch result layers and the model-family grouping for task heads, Qwen3-Omni LoRA, Cosmos3-Nano, and Cosmos3-Super.",
289
  "Use docs/data/omni_finetune_verified_result.json and the latest verified_public final Qwen package for current held-out results.",
290
  "The 128-episode aligned simple/NN baselines use metadata/text features from the derived Qwen JSONL export; they align the split and task ids but do not replace raw-modality baselines for trajectory, retrieval, reconstruction, or misalignment tasks.",
291
+ "The Cosmos3-Nano future-window branch is verified as a compatibility adapter result, Cosmos3-Super Reasoner is verified as a base-weight evaluation, and Cosmos3-Super camera-pose forward-dynamics targets now pass the contract audit plus a schema-only packer smoke; one-episode Cosmos fine-tuning and full Cosmos adapter/diffusion-weight fine-tuning remain pending, so no Cosmos weight repo should be published yet.",
292
  "The current reconstruction task reconstructs feature vectors, not pixel-depth, mesh, NeRF, or Gaussian reconstruction.",
293
  "Audio is one of the synchronized source modalities in the current task representation.",
294
  "The audio ablation report compares audio/no-audio variants across all 12 task contracts in results/audio_ablation/.",
295
+ "Foundation-model selection is explicit: Qwen3-Omni is the immediate trainable pilot, Cosmos 3 is the first world-model branch, Cosmos3-Super has a camera-pose proxy forward-dynamics contract ready for trainer implementation, and policy models such as OpenVLA/openpi/GR00T wait for robot-compatible action-target conversion.",
296
  "Future model branches should be added through the backbone registry and verified package contract, not as one-off result folders with incompatible metrics or publication rules.",
297
  "The Xperience Embodied Foundation Model is a future native-pretraining goal, not a completed model or current benchmark."
298
  ]
data/research_roadmap.json CHANGED
@@ -133,7 +133,7 @@
133
  "docs/data/foundation_model_plan.json",
134
  "research_roadmap_interactive.json"
135
  ],
136
- "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch; VLA/policy models wait for explicit action targets."
137
  },
138
  {
139
  "id": "robustness_run_64_128_episode",
 
133
  "docs/data/foundation_model_plan.json",
134
  "research_roadmap_interactive.json"
135
  ],
136
+ "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch. Cosmos3-Super now has camera-pose proxy forward-dynamics targets ready for trainer implementation, while VLA/policy models wait for robot-compatible action targets."
137
  },
138
  {
139
  "id": "robustness_run_64_128_episode",
data/research_roadmap_interactive.json CHANGED
@@ -2369,7 +2369,7 @@
2369
  "entry_condition": "The selected episodes are prepared or a 3-8 episode dry run is available for preprocessing checks.",
2370
  "id": "foundation_model_selection_matrix",
2371
  "name": "Foundation-Model Selection Matrix",
2372
- "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch; VLA/policy models wait for explicit action targets.",
2373
  "stage": "omni",
2374
  "status": "next"
2375
  },
 
2369
  "entry_condition": "The selected episodes are prepared or a 3-8 episode dry run is available for preprocessing checks.",
2370
  "id": "foundation_model_selection_matrix",
2371
  "name": "Foundation-Model Selection Matrix",
2372
+ "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch. Cosmos3-Super now has camera-pose proxy forward-dynamics targets ready for trainer implementation, while VLA/policy models wait for robot-compatible action targets.",
2373
  "stage": "omni",
2374
  "status": "next"
2375
  },
data/website_integrity.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "status": "pass",
3
- "generated_at_utc": "2026-06-07T15:47:32+00:00",
4
  "docs_root": "docs",
5
  "site_base": "/ropedia-xperience-10m-task-suite/",
6
  "summary": {
@@ -75,7 +75,7 @@
75
  "status": "pass",
76
  "reason": "The project overview should appear before the deeper progress ledger.",
77
  "overview_index": 67412,
78
- "evidence_index": 90477
79
  },
80
  {
81
  "name": "project_status_links_json",
@@ -153,8 +153,8 @@
153
  "status": "pass",
154
  "reason": "The evaluation protocol should appear before the deeper evidence ledger.",
155
  "overview_index": 67412,
156
- "protocol_index": 87160,
157
- "evidence_index": 90477
158
  },
159
  {
160
  "name": "evaluation_protocol_links_json",
@@ -292,7 +292,7 @@
292
  },
293
  {
294
  "path": "data/mirror_parity.json",
295
- "bytes": 410374,
296
  "top_level_type": "dict"
297
  },
298
  {
@@ -302,12 +302,12 @@
302
  },
303
  {
304
  "path": "data/omni_finetune_verified_result.json",
305
- "bytes": 3628,
306
  "top_level_type": "dict"
307
  },
308
  {
309
  "path": "data/omni_model_comparison.json",
310
- "bytes": 48296,
311
  "top_level_type": "dict"
312
  },
313
  {
@@ -322,12 +322,12 @@
322
  },
323
  {
324
  "path": "data/project_packet.json",
325
- "bytes": 8005,
326
  "top_level_type": "dict"
327
  },
328
  {
329
  "path": "data/project_status.json",
330
- "bytes": 16455,
331
  "top_level_type": "dict"
332
  },
333
  {
@@ -367,12 +367,12 @@
367
  },
368
  {
369
  "path": "data/research_roadmap.json",
370
- "bytes": 10133,
371
  "top_level_type": "dict"
372
  },
373
  {
374
  "path": "data/research_roadmap_interactive.json",
375
- "bytes": 143560,
376
  "top_level_type": "dict"
377
  },
378
  {
 
1
  {
2
  "status": "pass",
3
+ "generated_at_utc": "2026-06-07T17:27:17+00:00",
4
  "docs_root": "docs",
5
  "site_base": "/ropedia-xperience-10m-task-suite/",
6
  "summary": {
 
75
  "status": "pass",
76
  "reason": "The project overview should appear before the deeper progress ledger.",
77
  "overview_index": 67412,
78
+ "evidence_index": 90659
79
  },
80
  {
81
  "name": "project_status_links_json",
 
153
  "status": "pass",
154
  "reason": "The evaluation protocol should appear before the deeper evidence ledger.",
155
  "overview_index": 67412,
156
+ "protocol_index": 87218,
157
+ "evidence_index": 90659
158
  },
159
  {
160
  "name": "evaluation_protocol_links_json",
 
292
  },
293
  {
294
  "path": "data/mirror_parity.json",
295
+ "bytes": 319291,
296
  "top_level_type": "dict"
297
  },
298
  {
 
302
  },
303
  {
304
  "path": "data/omni_finetune_verified_result.json",
305
+ "bytes": 3768,
306
  "top_level_type": "dict"
307
  },
308
  {
309
  "path": "data/omni_model_comparison.json",
310
+ "bytes": 50422,
311
  "top_level_type": "dict"
312
  },
313
  {
 
322
  },
323
  {
324
  "path": "data/project_packet.json",
325
+ "bytes": 8098,
326
  "top_level_type": "dict"
327
  },
328
  {
329
  "path": "data/project_status.json",
330
+ "bytes": 18062,
331
  "top_level_type": "dict"
332
  },
333
  {
 
367
  },
368
  {
369
  "path": "data/research_roadmap.json",
370
+ "bytes": 10246,
371
  "top_level_type": "dict"
372
  },
373
  {
374
  "path": "data/research_roadmap_interactive.json",
375
+ "bytes": 143673,
376
  "top_level_type": "dict"
377
  },
378
  {
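The `website_integrity.json` entries above record a byte count and a `top_level_type` for each published data file. A minimal sketch of how such an entry could be derived, assuming the type is the Python type name of the parsed JSON root. The helper name `describe_json_file` and its `root` parameter are hypothetical and are not taken from the repository's validator script:

```python
import json
from pathlib import Path


def describe_json_file(path, root="."):
    """Return the size in bytes and the JSON root type for one data file.

    Hypothetical helper: the field names mirror the entries in the diff,
    but the real validator may compute them differently.
    """
    full = Path(root) / path
    raw = full.read_bytes()
    # Parse the same bytes that were counted, so both fields describe one read.
    parsed = json.loads(raw.decode("utf-8"))
    return {
        "path": str(path),
        "bytes": len(raw),
        "top_level_type": type(parsed).__name__,
    }
```

Under this assumption, any edit to a tracked JSON file changes its `bytes` value, which would explain why the integrity report is regenerated in the same commit as the data files.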
docs/assets/charts/episode_task_scores.svg CHANGED
docs/assets/charts/episode_task_scores_minimal_vs_neural.svg CHANGED
docs/assets/charts/episode_task_scores_neural_mlp.svg CHANGED
docs/assets/charts/research_direction_coverage.svg CHANGED
docs/assets/task_architectures.png CHANGED

Git LFS Details

  • SHA256: 076c2e463ddce473e9138ac6f3615152d59031d6be2aa5c3d9ae1ace3d3f6c83
  • Pointer size: 131 Bytes
  • Size of remote file: 762 kB

Git LFS Details

  • SHA256: f08b03bc21e194efe382347d74cf89cd6ac65dede51889971dbfc2fb9d1de3c2
  • Pointer size: 131 Bytes
  • Size of remote file: 774 kB
docs/assets/task_architectures.svg CHANGED
docs/assets/task_suite_infographic.png CHANGED

Git LFS Details

  • SHA256: 213d81f49d27e3f2560c79e29a017c017cbe38d8d605815bf3bc87834a1424ae
  • Pointer size: 132 Bytes
  • Size of remote file: 2.61 MB

Git LFS Details

  • SHA256: 1275e2adaef920ecde7c29dc62c8d79d4f13475a0c09bc3baa693f47cdec2e1f
  • Pointer size: 132 Bytes
  • Size of remote file: 1.59 MB
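The mirror parity report in the next file diff lists `hash_mismatch` failures, each with a `surface`, a `path`, an `expected_sha256` taken from the repository copy, and an `actual_sha256` taken from the mirror. A minimal sketch of that comparison, assuming each mirror is a local directory checkout. The function names and the `missing` failure kind are hypothetical; only `hash_mismatch` and the field names appear in the report:

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_mirrors(local_path, mirrors):
    """Compare one repository file against its mirror copies.

    `mirrors` maps a surface name (for example "hf_space") to a pair of
    (report label, filesystem path). Hypothetical structure for illustration.
    """
    expected = sha256_of(local_path)
    failures = []
    for surface, (label, mirror_path) in mirrors.items():
        if not Path(mirror_path).exists():
            # "missing" is an assumed kind; the diff only shows hash_mismatch.
            failures.append({"surface": surface, "kind": "missing", "path": label})
            continue
        actual = sha256_of(mirror_path)
        if actual != expected:
            failures.append({
                "surface": surface,
                "kind": "hash_mismatch",
                "path": label,
                "expected_sha256": expected,
                "actual_sha256": actual,
            })
    return {"status": "fail" if failures else "pass", "failures": failures}
```

Read this way, a `fail` status after a local edit means only that the mirrors have not yet been re-synced with the repository copy. It says nothing about whether the local file itself is valid.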
docs/data/mirror_parity.json CHANGED
@@ -1,16 +1,21 @@
1
  {
2
- "status": "pass",
3
- "generated_at_utc": "2026-06-07T15:49:31+00:00",
4
  "hf_root": "hf_publish",
5
  "summary": {
6
  "group_count": 234,
7
- "failure_count": 0,
8
- "failures_by_surface": {}
9
  },
10
  "checks": [
11
  {
12
  "name": "repo_hf_space_artifact_model_data_parity",
13
- "status": "pass"
14
  },
15
  {
16
  "name": "repo_hf_visual_asset_parity",
@@ -18,19 +23,19 @@
18
  },
19
  {
20
  "name": "repo_hf_validator_script_parity",
21
- "status": "pass"
22
  },
23
  {
24
  "name": "repo_hf_website_html_parity",
25
- "status": "pass"
26
  },
27
  {
28
  "name": "repo_hf_diagnostic_result_parity",
29
- "status": "pass"
30
  },
31
  {
32
  "name": "repo_hf_quality_doc_parity",
33
- "status": "pass"
34
  }
35
  ],
36
  "groups": [
@@ -346,12 +351,12 @@
346
  },
347
  {
348
  "name": "data/omni_finetune_verified_result.json",
349
- "status": "pass",
350
  "local": {
351
  "path": "repo:docs/data/omni_finetune_verified_result.json",
352
  "exists": true,
353
- "bytes": 3628,
354
- "sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
355
  },
356
  "mirrors": {
357
  "hf_space": {
@@ -373,16 +378,38 @@
373
  "sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
374
  }
375
  },
376
- "failures": []
377
  },
378
  {
379
  "name": "data/omni_model_comparison.json",
380
- "status": "pass",
381
  "local": {
382
  "path": "repo:docs/data/omni_model_comparison.json",
383
  "exists": true,
384
- "bytes": 48296,
385
- "sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
386
  },
387
  "mirrors": {
388
  "hf_space": {
@@ -404,7 +431,29 @@
404
  "sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
405
  }
406
  },
407
- "failures": []
408
  },
409
  {
410
  "name": "data/project_brief.json",
@@ -470,12 +519,12 @@
470
  },
471
  {
472
  "name": "data/project_packet.json",
473
- "status": "pass",
474
  "local": {
475
  "path": "repo:docs/data/project_packet.json",
476
  "exists": true,
477
- "bytes": 8005,
478
- "sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
479
  },
480
  "mirrors": {
481
  "hf_space": {
@@ -497,16 +546,38 @@
497
  "sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
498
  }
499
  },
500
- "failures": []
501
  },
502
  {
503
  "name": "data/project_status.json",
504
- "status": "pass",
505
  "local": {
506
  "path": "repo:docs/data/project_status.json",
507
  "exists": true,
508
- "bytes": 16455,
509
- "sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
510
  },
511
  "mirrors": {
512
  "hf_space": {
@@ -528,7 +599,29 @@
528
  "sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
529
  }
530
  },
531
- "failures": []
532
  },
533
  {
534
  "name": "data/publication_audit.json",
@@ -687,12 +780,12 @@
687
  },
688
  {
689
  "name": "data/research_roadmap.json",
690
- "status": "pass",
691
  "local": {
692
  "path": "repo:docs/data/research_roadmap.json",
693
  "exists": true,
694
- "bytes": 10133,
695
- "sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
696
  },
697
  "mirrors": {
698
  "hf_space": {
@@ -714,16 +807,38 @@
714
  "sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
715
  }
716
  },
717
- "failures": []
718
  },
719
  {
720
  "name": "data/research_roadmap_interactive.json",
721
- "status": "pass",
722
  "local": {
723
  "path": "repo:docs/data/research_roadmap_interactive.json",
724
  "exists": true,
725
- "bytes": 143560,
726
- "sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
727
  },
728
  "mirrors": {
729
  "hf_space": {
@@ -745,7 +860,29 @@
745
  "sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
746
  }
747
  },
748
- "failures": []
749
  },
750
  {
751
  "name": "data/research_takeaways.json",
@@ -1028,12 +1165,12 @@
1028
  },
1029
  {
1030
  "name": "data/website_integrity.json",
1031
- "status": "pass",
1032
  "local": {
1033
  "path": "repo:docs/data/website_integrity.json",
1034
  "exists": true,
1035
  "bytes": 15375,
1036
- "sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1037
  },
1038
  "mirrors": {
1039
  "hf_space": {
@@ -1055,7 +1192,29 @@
1055
  "sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1056
  }
1057
  },
1058
- "failures": []
1059
  },
1060
  {
1061
  "name": "data/xperience10m_dataset_card_alignment.json",
@@ -1781,12 +1940,12 @@
1781
  },
1782
  {
1783
  "name": "scripts/omni/build_omni_model_comparison.py",
1784
- "status": "pass",
1785
  "local": {
1786
  "path": "repo:scripts/omni/build_omni_model_comparison.py",
1787
  "exists": true,
1788
- "bytes": 30236,
1789
- "sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1790
  },
1791
  "mirrors": {
1792
  "hf_artifacts": {
@@ -1802,7 +1961,22 @@
1802
  "sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1803
  }
1804
  },
1805
- "failures": []
1806
  },
1807
  {
1808
  "name": "scripts/omni/prepare_qwen3_lora_hf_package.py",
@@ -2156,12 +2330,12 @@
2156
  },
2157
  {
2158
  "name": "scripts/verify_live_publication.py",
2159
- "status": "pass",
2160
  "local": {
2161
  "path": "repo:scripts/verify_live_publication.py",
2162
  "exists": true,
2163
- "bytes": 36201,
2164
- "sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2165
  },
2166
  "mirrors": {
2167
  "hf_artifacts": {
@@ -2177,7 +2351,22 @@
2177
  "sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2178
  }
2179
  },
2180
- "failures": []
2181
  },
2182
  {
2183
  "name": "scripts/validate_mirror_parity.py",
@@ -2406,12 +2595,12 @@
2406
  },
2407
  {
2408
  "name": "website/index.html",
2409
- "status": "pass",
2410
  "local": {
2411
  "path": "repo:docs/index.html",
2412
  "exists": true,
2413
- "bytes": 180727,
2414
- "sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2415
  },
2416
  "mirrors": {
2417
  "hf_space": {
@@ -2427,7 +2616,22 @@
2427
  "sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2428
  }
2429
  },
2430
- "failures": []
2431
  },
2432
  {
2433
  "name": "website/research_roadmap.html",
@@ -2692,12 +2896,12 @@
2692
  },
2693
  {
2694
  "name": "results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2695
- "status": "pass",
2696
  "local": {
2697
  "path": "repo:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2698
  "exists": true,
2699
- "bytes": 9231,
2700
- "sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2701
  },
2702
  "mirrors": {
2703
  "hf_space": {
@@ -2719,7 +2923,29 @@
2719
  "sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2720
  }
2721
  },
2722
- "failures": []
2723
  },
2724
  {
2725
  "name": "results/omni_finetune/multi_episode_128_task_baselines/BASELINE_ALIGNMENT_REPORT.md",
@@ -7032,12 +7258,12 @@
7032
  },
7033
  {
7034
  "name": "docs/RESEARCH_ROADMAP.md",
7035
- "status": "pass",
7036
  "local": {
7037
  "path": "repo:RESEARCH_ROADMAP.md",
7038
  "exists": true,
7039
- "bytes": 12233,
7040
- "sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7041
  },
7042
  "mirrors": {
7043
  "hf_space": {
@@ -7059,16 +7285,38 @@
7059
  "sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7060
  }
7061
  },
7062
- "failures": []
7063
  },
7064
  {
7065
  "name": "docs/PROJECT_STATUS.md",
7066
- "status": "pass",
7067
  "local": {
7068
  "path": "repo:PROJECT_STATUS.md",
7069
  "exists": true,
7070
- "bytes": 9926,
7071
- "sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7072
  },
7073
  "mirrors": {
7074
  "hf_space": {
@@ -7090,7 +7338,29 @@
7090
  "sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7091
  }
7092
  },
7093
- "failures": []
7094
  },
7095
  {
7096
  "name": "docs/PUBLIC_SURFACE_QA.md",
@@ -7217,5 +7487,294 @@
7217
  "failures": []
7218
  }
7219
  ],
7220
- "failures": []
7221
  }
 
1
  {
2
+ "status": "fail",
3
+ "generated_at_utc": "2026-06-07T17:27:20+00:00",
4
  "hf_root": "hf_publish",
5
  "summary": {
6
  "group_count": 234,
7
+ "failure_count": 36,
8
+ "failures_by_surface": {
9
+ "hf_space": 11,
10
+ "hf_artifacts": 12,
11
+ "hf_model": 12,
12
+ "hf_artifacts_docs": 1
13
+ }
14
  },
15
  "checks": [
16
  {
17
  "name": "repo_hf_space_artifact_model_data_parity",
18
+ "status": "fail"
19
  },
20
  {
21
  "name": "repo_hf_visual_asset_parity",
 
23
  },
24
  {
25
  "name": "repo_hf_validator_script_parity",
26
+ "status": "fail"
27
  },
28
  {
29
  "name": "repo_hf_website_html_parity",
30
+ "status": "fail"
31
  },
32
  {
33
  "name": "repo_hf_diagnostic_result_parity",
34
+ "status": "fail"
35
  },
36
  {
37
  "name": "repo_hf_quality_doc_parity",
38
+ "status": "fail"
39
  }
40
  ],
41
  "groups": [
 
351
  },
352
  {
353
  "name": "data/omni_finetune_verified_result.json",
354
+ "status": "fail",
355
  "local": {
356
  "path": "repo:docs/data/omni_finetune_verified_result.json",
357
  "exists": true,
358
+ "bytes": 3768,
359
+ "sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1"
360
  },
361
  "mirrors": {
362
  "hf_space": {
 
378
  "sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
379
  }
380
  },
381
+ "failures": [
382
+ {
383
+ "surface": "hf_space",
384
+ "kind": "hash_mismatch",
385
+ "path": "hf_space:data/omni_finetune_verified_result.json",
386
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
387
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
388
+ },
389
+ {
390
+ "surface": "hf_artifacts",
391
+ "kind": "hash_mismatch",
392
+ "path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
393
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
394
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
395
+ },
396
+ {
397
+ "surface": "hf_model",
398
+ "kind": "hash_mismatch",
399
+ "path": "hf_model:metrics/omni_finetune_verified_result.json",
400
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
401
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
402
+ }
403
+ ]
404
  },
405
  {
406
  "name": "data/omni_model_comparison.json",
407
+ "status": "fail",
408
  "local": {
409
  "path": "repo:docs/data/omni_model_comparison.json",
410
  "exists": true,
411
+ "bytes": 50422,
412
+ "sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb"
413
  },
414
  "mirrors": {
415
  "hf_space": {
 
431
  "sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
432
  }
433
  },
434
+ "failures": [
435
+ {
436
+ "surface": "hf_space",
437
+ "kind": "hash_mismatch",
438
+ "path": "hf_space:data/omni_model_comparison.json",
439
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
440
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
441
+ },
442
+ {
443
+ "surface": "hf_artifacts",
444
+ "kind": "hash_mismatch",
445
+ "path": "hf_artifacts:docs/data/omni_model_comparison.json",
446
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
447
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
448
+ },
449
+ {
450
+ "surface": "hf_model",
451
+ "kind": "hash_mismatch",
452
+ "path": "hf_model:metrics/omni_model_comparison.json",
453
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
454
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
455
+ }
456
+ ]
457
  },
458
  {
459
  "name": "data/project_brief.json",
 
519
  },
520
  {
521
  "name": "data/project_packet.json",
522
+ "status": "fail",
523
  "local": {
524
  "path": "repo:docs/data/project_packet.json",
525
  "exists": true,
526
+ "bytes": 8098,
527
+ "sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15"
528
  },
529
  "mirrors": {
530
  "hf_space": {
 
546
  "sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
547
  }
548
  },
549
+ "failures": [
550
+ {
551
+ "surface": "hf_space",
552
+ "kind": "hash_mismatch",
553
+ "path": "hf_space:data/project_packet.json",
554
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
555
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
556
+ },
557
+ {
558
+ "surface": "hf_artifacts",
559
+ "kind": "hash_mismatch",
560
+ "path": "hf_artifacts:docs/data/project_packet.json",
561
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
562
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
563
+ },
564
+ {
565
+ "surface": "hf_model",
566
+ "kind": "hash_mismatch",
567
+ "path": "hf_model:metrics/project_packet.json",
568
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
569
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
570
+ }
571
+ ]
572
  },
573
  {
574
  "name": "data/project_status.json",
575
+ "status": "fail",
576
  "local": {
577
  "path": "repo:docs/data/project_status.json",
578
  "exists": true,
579
+ "bytes": 18062,
580
+ "sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8"
581
  },
582
  "mirrors": {
583
  "hf_space": {
 
599
  "sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
600
  }
601
  },
602
+ "failures": [
603
+ {
604
+ "surface": "hf_space",
605
+ "kind": "hash_mismatch",
606
+ "path": "hf_space:data/project_status.json",
607
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
608
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
609
+ },
610
+ {
611
+ "surface": "hf_artifacts",
612
+ "kind": "hash_mismatch",
613
+ "path": "hf_artifacts:docs/data/project_status.json",
614
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
615
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
616
+ },
617
+ {
618
+ "surface": "hf_model",
619
+ "kind": "hash_mismatch",
620
+ "path": "hf_model:metrics/project_status.json",
621
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
622
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
623
+ }
624
+ ]
625
  },
626
  {
627
  "name": "data/publication_audit.json",
 
780
  },
781
  {
782
  "name": "data/research_roadmap.json",
783
+ "status": "fail",
784
  "local": {
785
  "path": "repo:docs/data/research_roadmap.json",
786
  "exists": true,
787
+ "bytes": 10246,
788
+ "sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06"
789
  },
790
  "mirrors": {
791
  "hf_space": {
 
807
  "sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
808
  }
809
  },
810
+ "failures": [
811
+ {
812
+ "surface": "hf_space",
813
+ "kind": "hash_mismatch",
814
+ "path": "hf_space:data/research_roadmap.json",
815
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
816
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
817
+ },
818
+ {
819
+ "surface": "hf_artifacts",
820
+ "kind": "hash_mismatch",
821
+ "path": "hf_artifacts:docs/data/research_roadmap.json",
822
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
823
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
824
+ },
825
+ {
826
+ "surface": "hf_model",
827
+ "kind": "hash_mismatch",
828
+ "path": "hf_model:metrics/research_roadmap.json",
829
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
830
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
831
+ }
832
+ ]
833
  },
834
  {
835
  "name": "data/research_roadmap_interactive.json",
836
+ "status": "fail",
837
  "local": {
838
  "path": "repo:docs/data/research_roadmap_interactive.json",
839
  "exists": true,
840
+ "bytes": 143673,
841
+ "sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6"
842
  },
843
  "mirrors": {
844
  "hf_space": {
 
860
  "sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
861
  }
862
  },
863
+ "failures": [
864
+ {
865
+ "surface": "hf_space",
866
+ "kind": "hash_mismatch",
867
+ "path": "hf_space:data/research_roadmap_interactive.json",
868
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
869
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
870
+ },
871
+ {
872
+ "surface": "hf_artifacts",
873
+ "kind": "hash_mismatch",
874
+ "path": "hf_artifacts:docs/data/research_roadmap_interactive.json",
875
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
876
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
877
+ },
878
+ {
879
+ "surface": "hf_model",
880
+ "kind": "hash_mismatch",
881
+ "path": "hf_model:metrics/research_roadmap_interactive.json",
882
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
883
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
884
+ }
885
+ ]
886
  },
887
  {
888
  "name": "data/research_takeaways.json",
 
1165
  },
1166
  {
1167
  "name": "data/website_integrity.json",
1168
+ "status": "fail",
1169
  "local": {
1170
  "path": "repo:docs/data/website_integrity.json",
1171
  "exists": true,
1172
  "bytes": 15375,
1173
+ "sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828"
1174
  },
1175
  "mirrors": {
1176
  "hf_space": {
 
1192
  "sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1193
  }
1194
  },
1195
+ "failures": [
1196
+ {
1197
+ "surface": "hf_space",
1198
+ "kind": "hash_mismatch",
1199
+ "path": "hf_space:data/website_integrity.json",
1200
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
1201
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1202
+ },
1203
+ {
1204
+ "surface": "hf_artifacts",
1205
+ "kind": "hash_mismatch",
1206
+ "path": "hf_artifacts:docs/data/website_integrity.json",
1207
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
1208
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1209
+ },
1210
+ {
1211
+ "surface": "hf_model",
1212
+ "kind": "hash_mismatch",
1213
+ "path": "hf_model:metrics/website_integrity.json",
1214
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
1215
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1216
+ }
1217
+ ]
1218
  },
1219
  {
1220
  "name": "data/xperience10m_dataset_card_alignment.json",
 
1940
  },
1941
  {
1942
  "name": "scripts/omni/build_omni_model_comparison.py",
1943
+ "status": "fail",
1944
  "local": {
1945
  "path": "repo:scripts/omni/build_omni_model_comparison.py",
1946
  "exists": true,
1947
+ "bytes": 35566,
1948
+ "sha256": "c66d3d9dd32dd16203bb5a832d9bdafb985c44d3b4040cbd58cd08e77a70458a"
1949
  },
1950
  "mirrors": {
1951
  "hf_artifacts": {
 
1961
  "sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1962
  }
1963
  },
1964
+ "failures": [
1965
+ {
1966
+ "surface": "hf_artifacts",
1967
+ "kind": "hash_mismatch",
1968
+ "path": "hf_artifacts:scripts/omni/build_omni_model_comparison.py",
1969
+ "expected_sha256": "c66d3d9dd32dd16203bb5a832d9bdafb985c44d3b4040cbd58cd08e77a70458a",
1970
+ "actual_sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1971
+ },
1972
+ {
1973
+ "surface": "hf_model",
1974
+ "kind": "hash_mismatch",
1975
+ "path": "hf_model:scripts/omni/build_omni_model_comparison.py",
1976
+ "expected_sha256": "c66d3d9dd32dd16203bb5a832d9bdafb985c44d3b4040cbd58cd08e77a70458a",
1977
+ "actual_sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1978
+ }
1979
+ ]
1980
  },
1981
  {
1982
  "name": "scripts/omni/prepare_qwen3_lora_hf_package.py",
 
2330
  },
2331
  {
2332
  "name": "scripts/verify_live_publication.py",
2333
+ "status": "fail",
2334
  "local": {
2335
  "path": "repo:scripts/verify_live_publication.py",
2336
  "exists": true,
2337
+ "bytes": 36285,
2338
+ "sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471"
2339
  },
2340
  "mirrors": {
2341
  "hf_artifacts": {
 
2351
  "sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2352
  }
2353
  },
2354
+ "failures": [
2355
+ {
2356
+ "surface": "hf_artifacts",
2357
+ "kind": "hash_mismatch",
2358
+ "path": "hf_artifacts:scripts/verify_live_publication.py",
2359
+ "expected_sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471",
2360
+ "actual_sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2361
+ },
2362
+ {
2363
+ "surface": "hf_model",
2364
+ "kind": "hash_mismatch",
2365
+ "path": "hf_model:scripts/verify_live_publication.py",
2366
+ "expected_sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471",
2367
+ "actual_sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2368
+ }
2369
+ ]
2370
  },
2371
  {
2372
  "name": "scripts/validate_mirror_parity.py",
 
2595
  },
2596
  {
2597
  "name": "website/index.html",
2598
+ "status": "fail",
2599
  "local": {
2600
  "path": "repo:docs/index.html",
2601
  "exists": true,
2602
+ "bytes": 181095,
2603
+ "sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1"
2604
  },
2605
  "mirrors": {
2606
  "hf_space": {
 
2616
  "sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2617
  }
2618
  },
2619
+ "failures": [
2620
+ {
2621
+ "surface": "hf_space",
2622
+ "kind": "hash_mismatch",
2623
+ "path": "hf_space:index.html",
2624
+ "expected_sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1",
2625
+ "actual_sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2626
+ },
2627
+ {
2628
+ "surface": "hf_artifacts_docs",
2629
+ "kind": "hash_mismatch",
2630
+ "path": "hf_artifacts:docs/index.html",
2631
+ "expected_sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1",
2632
+ "actual_sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2633
+ }
2634
+ ]
2635
  },
2636
  {
2637
  "name": "website/research_roadmap.html",
 
2896
  },
2897
  {
2898
  "name": "results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2899
+ "status": "fail",
2900
  "local": {
2901
  "path": "repo:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2902
  "exists": true,
2903
+ "bytes": 9893,
2904
+ "sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb"
2905
  },
2906
  "mirrors": {
2907
  "hf_space": {
 
2923
  "sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2924
  }
2925
  },
2926
+ "failures": [
2927
+ {
2928
+ "surface": "hf_space",
2929
+ "kind": "hash_mismatch",
2930
+ "path": "hf_space:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2931
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
2932
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2933
+ },
2934
+ {
2935
+ "surface": "hf_artifacts",
2936
+ "kind": "hash_mismatch",
2937
+ "path": "hf_artifacts:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2938
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
2939
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2940
+ },
2941
+ {
2942
+ "surface": "hf_model",
2943
+ "kind": "hash_mismatch",
2944
+ "path": "hf_model:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2945
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
2946
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2947
+ }
2948
+ ]
2949
  },
2950
  {
2951
  "name": "results/omni_finetune/multi_episode_128_task_baselines/BASELINE_ALIGNMENT_REPORT.md",
 
7258
  },
7259
  {
7260
  "name": "docs/RESEARCH_ROADMAP.md",
7261
+ "status": "fail",
7262
  "local": {
7263
  "path": "repo:RESEARCH_ROADMAP.md",
7264
  "exists": true,
7265
+ "bytes": 12874,
7266
+ "sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347"
7267
  },
7268
  "mirrors": {
7269
  "hf_space": {
 
7285
  "sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7286
  }
7287
  },
7288
+ "failures": [
7289
+ {
7290
+ "surface": "hf_space",
7291
+ "kind": "hash_mismatch",
7292
+ "path": "hf_space:RESEARCH_ROADMAP.md",
7293
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7294
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7295
+ },
7296
+ {
7297
+ "surface": "hf_artifacts",
7298
+ "kind": "hash_mismatch",
7299
+ "path": "hf_artifacts:RESEARCH_ROADMAP.md",
7300
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7301
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7302
+ },
7303
+ {
7304
+ "surface": "hf_model",
7305
+ "kind": "hash_mismatch",
7306
+ "path": "hf_model:RESEARCH_ROADMAP.md",
7307
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7308
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7309
+ }
7310
+ ]
7311
  },
7312
  {
7313
  "name": "docs/PROJECT_STATUS.md",
7314
+ "status": "fail",
7315
  "local": {
7316
  "path": "repo:PROJECT_STATUS.md",
7317
  "exists": true,
7318
+ "bytes": 11369,
7319
+ "sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114"
7320
  },
7321
  "mirrors": {
7322
  "hf_space": {
 
7338
  "sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7339
  }
7340
  },
7341
+ "failures": [
7342
+ {
7343
+ "surface": "hf_space",
7344
+ "kind": "hash_mismatch",
7345
+ "path": "hf_space:PROJECT_STATUS.md",
7346
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7347
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7348
+ },
7349
+ {
7350
+ "surface": "hf_artifacts",
7351
+ "kind": "hash_mismatch",
7352
+ "path": "hf_artifacts:PROJECT_STATUS.md",
7353
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7354
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7355
+ },
7356
+ {
7357
+ "surface": "hf_model",
7358
+ "kind": "hash_mismatch",
7359
+ "path": "hf_model:PROJECT_STATUS.md",
7360
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7361
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7362
+ }
7363
+ ]
7364
  },
7365
  {
7366
  "name": "docs/PUBLIC_SURFACE_QA.md",
 
7487
  "failures": []
7488
  }
7489
  ],
7490
+ "failures": [
7491
+ {
7492
+ "group": "data/omni_finetune_verified_result.json",
7493
+ "surface": "hf_space",
7494
+ "kind": "hash_mismatch",
7495
+ "path": "hf_space:data/omni_finetune_verified_result.json",
7496
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
7497
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
7498
+ },
7499
+ {
7500
+ "group": "data/omni_finetune_verified_result.json",
7501
+ "surface": "hf_artifacts",
7502
+ "kind": "hash_mismatch",
7503
+ "path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
7504
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
7505
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
7506
+ },
7507
+ {
7508
+ "group": "data/omni_finetune_verified_result.json",
7509
+ "surface": "hf_model",
7510
+ "kind": "hash_mismatch",
7511
+ "path": "hf_model:metrics/omni_finetune_verified_result.json",
7512
+ "expected_sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1",
7513
+ "actual_sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
7514
+ },
7515
+ {
7516
+ "group": "data/omni_model_comparison.json",
7517
+ "surface": "hf_space",
7518
+ "kind": "hash_mismatch",
7519
+ "path": "hf_space:data/omni_model_comparison.json",
7520
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
7521
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
7522
+ },
7523
+ {
7524
+ "group": "data/omni_model_comparison.json",
7525
+ "surface": "hf_artifacts",
7526
+ "kind": "hash_mismatch",
7527
+ "path": "hf_artifacts:docs/data/omni_model_comparison.json",
7528
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
7529
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
7530
+ },
7531
+ {
7532
+ "group": "data/omni_model_comparison.json",
7533
+ "surface": "hf_model",
7534
+ "kind": "hash_mismatch",
7535
+ "path": "hf_model:metrics/omni_model_comparison.json",
7536
+ "expected_sha256": "71d32b81180c9acadcc614dff99256dcc6e560be08f1c6bd1a32487eed704ebb",
7537
+ "actual_sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
7538
+ },
7539
+ {
7540
+ "group": "data/project_packet.json",
7541
+ "surface": "hf_space",
7542
+ "kind": "hash_mismatch",
7543
+ "path": "hf_space:data/project_packet.json",
7544
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
7545
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
7546
+ },
7547
+ {
7548
+ "group": "data/project_packet.json",
7549
+ "surface": "hf_artifacts",
7550
+ "kind": "hash_mismatch",
7551
+ "path": "hf_artifacts:docs/data/project_packet.json",
7552
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
7553
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
7554
+ },
7555
+ {
7556
+ "group": "data/project_packet.json",
7557
+ "surface": "hf_model",
7558
+ "kind": "hash_mismatch",
7559
+ "path": "hf_model:metrics/project_packet.json",
7560
+ "expected_sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15",
7561
+ "actual_sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
7562
+ },
7563
+ {
7564
+ "group": "data/project_status.json",
7565
+ "surface": "hf_space",
7566
+ "kind": "hash_mismatch",
7567
+ "path": "hf_space:data/project_status.json",
7568
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
7569
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
7570
+ },
7571
+ {
7572
+ "group": "data/project_status.json",
7573
+ "surface": "hf_artifacts",
7574
+ "kind": "hash_mismatch",
7575
+ "path": "hf_artifacts:docs/data/project_status.json",
7576
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
7577
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
7578
+ },
7579
+ {
7580
+ "group": "data/project_status.json",
7581
+ "surface": "hf_model",
7582
+ "kind": "hash_mismatch",
7583
+ "path": "hf_model:metrics/project_status.json",
7584
+ "expected_sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8",
7585
+ "actual_sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
7586
+ },
7587
+ {
7588
+ "group": "data/research_roadmap.json",
7589
+ "surface": "hf_space",
7590
+ "kind": "hash_mismatch",
7591
+ "path": "hf_space:data/research_roadmap.json",
7592
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
7593
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
7594
+ },
7595
+ {
7596
+ "group": "data/research_roadmap.json",
7597
+ "surface": "hf_artifacts",
7598
+ "kind": "hash_mismatch",
7599
+ "path": "hf_artifacts:docs/data/research_roadmap.json",
7600
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
7601
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
7602
+ },
7603
+ {
7604
+ "group": "data/research_roadmap.json",
7605
+ "surface": "hf_model",
7606
+ "kind": "hash_mismatch",
7607
+ "path": "hf_model:metrics/research_roadmap.json",
7608
+ "expected_sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06",
7609
+ "actual_sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
7610
+ },
7611
+ {
7612
+ "group": "data/research_roadmap_interactive.json",
7613
+ "surface": "hf_space",
7614
+ "kind": "hash_mismatch",
7615
+ "path": "hf_space:data/research_roadmap_interactive.json",
7616
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
7617
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
7618
+ },
7619
+ {
7620
+ "group": "data/research_roadmap_interactive.json",
7621
+ "surface": "hf_artifacts",
7622
+ "kind": "hash_mismatch",
7623
+ "path": "hf_artifacts:docs/data/research_roadmap_interactive.json",
7624
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
7625
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
7626
+ },
7627
+ {
7628
+ "group": "data/research_roadmap_interactive.json",
7629
+ "surface": "hf_model",
7630
+ "kind": "hash_mismatch",
7631
+ "path": "hf_model:metrics/research_roadmap_interactive.json",
7632
+ "expected_sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6",
7633
+ "actual_sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
7634
+ },
7635
+ {
7636
+ "group": "data/website_integrity.json",
7637
+ "surface": "hf_space",
7638
+ "kind": "hash_mismatch",
7639
+ "path": "hf_space:data/website_integrity.json",
7640
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
7641
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
7642
+ },
7643
+ {
7644
+ "group": "data/website_integrity.json",
7645
+ "surface": "hf_artifacts",
7646
+ "kind": "hash_mismatch",
7647
+ "path": "hf_artifacts:docs/data/website_integrity.json",
7648
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
7649
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
7650
+ },
7651
+ {
7652
+ "group": "data/website_integrity.json",
7653
+ "surface": "hf_model",
7654
+ "kind": "hash_mismatch",
7655
+ "path": "hf_model:metrics/website_integrity.json",
7656
+ "expected_sha256": "31d063e601db5ed64b8156f417f48d4e0474bc2b6c9088d875d3d0e18b6f4828",
7657
+ "actual_sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
7658
+ },
7659
+ {
7660
+ "group": "scripts/omni/build_omni_model_comparison.py",
7661
+ "surface": "hf_artifacts",
7662
+ "kind": "hash_mismatch",
7663
+ "path": "hf_artifacts:scripts/omni/build_omni_model_comparison.py",
7664
+ "expected_sha256": "c66d3d9dd32dd16203bb5a832d9bdafb985c44d3b4040cbd58cd08e77a70458a",
7665
+ "actual_sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
7666
+ },
7667
+ {
7668
+ "group": "scripts/omni/build_omni_model_comparison.py",
7669
+ "surface": "hf_model",
7670
+ "kind": "hash_mismatch",
7671
+ "path": "hf_model:scripts/omni/build_omni_model_comparison.py",
7672
+ "expected_sha256": "c66d3d9dd32dd16203bb5a832d9bdafb985c44d3b4040cbd58cd08e77a70458a",
7673
+ "actual_sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
7674
+ },
7675
+ {
7676
+ "group": "scripts/verify_live_publication.py",
7677
+ "surface": "hf_artifacts",
7678
+ "kind": "hash_mismatch",
7679
+ "path": "hf_artifacts:scripts/verify_live_publication.py",
7680
+ "expected_sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471",
7681
+ "actual_sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
7682
+ },
7683
+ {
7684
+ "group": "scripts/verify_live_publication.py",
7685
+ "surface": "hf_model",
7686
+ "kind": "hash_mismatch",
7687
+ "path": "hf_model:scripts/verify_live_publication.py",
7688
+ "expected_sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471",
7689
+ "actual_sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
7690
+ },
7691
+ {
7692
+ "group": "website/index.html",
7693
+ "surface": "hf_space",
7694
+ "kind": "hash_mismatch",
7695
+ "path": "hf_space:index.html",
7696
+ "expected_sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1",
7697
+ "actual_sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
7698
+ },
7699
+ {
7700
+ "group": "website/index.html",
7701
+ "surface": "hf_artifacts_docs",
7702
+ "kind": "hash_mismatch",
7703
+ "path": "hf_artifacts:docs/index.html",
7704
+ "expected_sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1",
7705
+ "actual_sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
7706
+ },
7707
+ {
7708
+ "group": "results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7709
+ "surface": "hf_space",
7710
+ "kind": "hash_mismatch",
7711
+ "path": "hf_space:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7712
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
7713
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
7714
+ },
7715
+ {
7716
+ "group": "results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7717
+ "surface": "hf_artifacts",
7718
+ "kind": "hash_mismatch",
7719
+ "path": "hf_artifacts:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7720
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
7721
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
7722
+ },
7723
+ {
7724
+ "group": "results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7725
+ "surface": "hf_model",
7726
+ "kind": "hash_mismatch",
7727
+ "path": "hf_model:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
7728
+ "expected_sha256": "fa2129ff8775376674bb4550a6dac629baa9a48a0d49986f6bd33341c4a7bddb",
7729
+ "actual_sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
7730
+ },
7731
+ {
7732
+ "group": "docs/RESEARCH_ROADMAP.md",
7733
+ "surface": "hf_space",
7734
+ "kind": "hash_mismatch",
7735
+ "path": "hf_space:RESEARCH_ROADMAP.md",
7736
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7737
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7738
+ },
7739
+ {
7740
+ "group": "docs/RESEARCH_ROADMAP.md",
7741
+ "surface": "hf_artifacts",
7742
+ "kind": "hash_mismatch",
7743
+ "path": "hf_artifacts:RESEARCH_ROADMAP.md",
7744
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7745
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7746
+ },
7747
+ {
7748
+ "group": "docs/RESEARCH_ROADMAP.md",
7749
+ "surface": "hf_model",
7750
+ "kind": "hash_mismatch",
7751
+ "path": "hf_model:RESEARCH_ROADMAP.md",
7752
+ "expected_sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347",
7753
+ "actual_sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7754
+ },
7755
+ {
7756
+ "group": "docs/PROJECT_STATUS.md",
7757
+ "surface": "hf_space",
7758
+ "kind": "hash_mismatch",
7759
+ "path": "hf_space:PROJECT_STATUS.md",
7760
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7761
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7762
+ },
7763
+ {
7764
+ "group": "docs/PROJECT_STATUS.md",
7765
+ "surface": "hf_artifacts",
7766
+ "kind": "hash_mismatch",
7767
+ "path": "hf_artifacts:PROJECT_STATUS.md",
7768
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7769
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7770
+ },
7771
+ {
7772
+ "group": "docs/PROJECT_STATUS.md",
7773
+ "surface": "hf_model",
7774
+ "kind": "hash_mismatch",
7775
+ "path": "hf_model:PROJECT_STATUS.md",
7776
+ "expected_sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114",
7777
+ "actual_sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7778
+ }
7779
+ ]
7780
  }
docs/data/omni_finetune_verified_result.json CHANGED
@@ -80,7 +80,7 @@
80
  "required_next_steps": [
81
  "Use the v3 strict-label predictions for action/subtask error analysis and unseen-label debugging.",
82
  "Keep the existing Qwen LoRA adapter repository as the weight-bearing artifact; v3 is an evaluation/package refresh over the same adapter, not new weights.",
83
- "Implement the Cosmos3-Super diffusion/action target packer and supervised loss before claiming Cosmos3 fine-tuning.",
84
  "Use sharded Qwen eval for future long held-out passes to improve GPU utilization."
85
  ]
86
  }
 
80
  "required_next_steps": [
81
  "Use the v3 strict-label predictions for action/subtask error analysis and unseen-label debugging.",
82
  "Keep the existing Qwen LoRA adapter repository as the weight-bearing artifact; v3 is an evaluation/package refresh over the same adapter, not new weights.",
83
+ "Implement the Cosmos3-Super pipeline-loaded batch packer and one-sample forward-dynamics overfit before claiming Cosmos3 fine-tuning; camera-pose proxy targets are now exported, contract-audited, and schema-packed, but no Cosmos weights have been updated.",
84
  "Use sharded Qwen eval for future long held-out passes to improve GPU utilization."
85
  ]
86
  }
docs/data/omni_model_comparison.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
3
- "generated_at_utc": "2026-06-07T15:34:51+00:00",
4
  "status": "pass",
5
  "version_count": 3,
6
  "model_group_count": 4,
@@ -8,7 +8,7 @@
8
  "version_reading_notes": [
9
  "Version 1 is the public-sample 12-task harness with minimal and neural heads.",
10
  "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
11
- "Version 3 is the verified model-branch layer: the current final Qwen3-Omni LoRA package is the JSON-task diagnostic result, Cosmos3-Nano is a future-window compatibility result, and Cosmos3-Super Reasoner is a base-weight JSON-task evaluation rather than a new fine-tuned weight release."
12
  ],
13
  "versions": [
14
  {
@@ -1012,7 +1012,62 @@
1012
  "weights_updated": false
1013
  },
1014
  "weights": "none; readiness audit only, no adapter checkpoint",
1015
- "interpretation": "This probe confirms the staged Cosmos3-Super Diffusers/GPU runtime and the same JSON QA dataset are visible, but blocks true fine-tuning until a Cosmos-specific diffusion/action target packer and supervised loss are implemented."
1016
  }
1017
  ],
1018
  "multi_episode_128_runs": [
@@ -1056,7 +1111,7 @@
1056
  "weights_repository": "none for this run: staged base nv-community/Cosmos3-Super weights were evaluated through vLLM; create a separate repo only after new adapter or fine-tuned weights exist"
1057
  }
1058
  ],
1059
- "comparison_note": "Cosmos3-Super is now represented by a verified 448-window held-out Reasoner evaluation on the same JSON task as Qwen3. It uses staged base weights through vLLM, so it is a model-branch diagnostic, not a weight release. The readiness probe records why true Cosmos3-Super fine-tuning is not launched yet."
1060
  }
1061
  ],
1062
  "model_group_reading_notes": [
@@ -1064,10 +1119,10 @@
1064
  "Task-head baselines have both a one-episode public-sample run and a 128-episode same-split metadata/text run.",
1065
  "Qwen3-Omni has a one-episode sensor-adapter smoke test and separate 128-episode LoRA diagnostic packages; only the final 128-episode adapter belongs in the Qwen LoRA model repo.",
1066
  "Cosmos3-Nano has a 128-episode future-window compatibility package.",
1067
- "Cosmos3-Super has a 128-episode base-weight Reasoner evaluation on the JSON task plus a training-readiness probe; create a separate Cosmos model repo only after real Cosmos adapter/fine-tuned weights exist."
1068
  ],
1069
  "pending": [
1070
  "Use the final Qwen3 full-eval package as the current Qwen result; older Qwen package rows remain historical diagnostics for comparison.",
1071
- "Promote Cosmos3 from Nano compatibility and Super base-weight evaluation to true fine-tuning only after a dedicated Cosmos diffusion/action target packer and supervised loss produce new weights."
1072
  ]
1073
  }
 
1
  {
2
  "title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
3
+ "generated_at_utc": "2026-06-07T17:27:36+00:00",
4
  "status": "pass",
5
  "version_count": 3,
6
  "model_group_count": 4,
 
8
  "version_reading_notes": [
9
  "Version 1 is the public-sample 12-task harness with minimal and neural heads.",
10
  "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
11
+ "Version 3 is the verified model-branch layer: the current final Qwen3-Omni LoRA package is the JSON-task diagnostic result, Cosmos3-Nano is a future-window compatibility result, and Cosmos3-Super Reasoner is a base-weight JSON-task evaluation; Cosmos3-Super now has a camera-pose forward-dynamics contract audit and schema-only packer smoke, but no new fine-tuned weight release."
12
  ],
13
  "versions": [
14
  {
 
1012
  "weights_updated": false
1013
  },
1014
  "weights": "none; readiness audit only, no adapter checkpoint",
1015
+ "interpretation": "This probe confirms the staged Cosmos3-Super Diffusers/GPU runtime and the same JSON QA dataset are visible. It predates the camera-pose action-target export, so use the 20260608 contract audit for the current trainer-readiness status."
1016
+ },
1017
+ {
1018
+ "id": "xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608",
1019
+ "title": "Cosmos3-Super Camera-Pose Target Audit",
1020
+ "scope_label": "action target contract",
1021
+ "scope": "selected 128-episode 96/16/16 dataset augmented with camera_pose proxy cosmos_action_target records",
1022
+ "status": "ready_for_forward_dynamics_trainer",
1023
+ "source": "results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608/training_contract_audit.json",
1024
+ "split": "train/val/test by selected episode/session",
1025
+ "counts": {
1026
+ "dataset_samples": 3808,
1027
+ "rows_with_action_target": 3808,
1028
+ "valid_action_targets": 3808,
1029
+ "split_counts": {
1030
+ "train": 2848,
1031
+ "val": 512,
1032
+ "test": 448
1033
+ },
1034
+ "episode_split_counts": {
1035
+ "test": 14,
1036
+ "train": 89,
1037
+ "val": 16
1038
+ }
1039
+ },
1040
+ "primary_metrics": {
1041
+ "domain_name": "camera_pose",
1042
+ "raw_action_dim": 9,
1043
+ "mode": "forward_dynamics",
1044
+ "valid_action_targets": 3808,
1045
+ "weights_updated": false
1046
+ },
1047
+ "weights": "none; action-target contract audit only, no adapter checkpoint",
1048
+ "interpretation": "The selected dataset now has valid Cosmos3 camera_pose forward_dynamics targets for an egocentric camera-motion proxy. These remove the target-schema blocker for action-conditioned world-model training, but they supervise noisy vision tokens rather than preds_action. The remaining work is a pipeline-loaded packer check and one-sample forward-dynamics overfit; action-token prediction needs a separate policy or inverse-dynamics target export."
1049
+ },
1050
+ {
1051
+ "id": "xperience10m_cosmos3_super_action_packer_schema_smoke_20260608",
1052
+ "title": "Cosmos3-Super Action Batch Packer Smoke",
1053
+ "scope_label": "batch packer",
1054
+ "scope": "one selected train row from the camera_pose forward_dynamics augmented JSONL",
1055
+ "status": "pass",
1056
+ "source": "results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/packer_summary.json",
1057
+ "split": "train",
1058
+ "counts": {
1059
+ "samples": 1,
1060
+ "raw_action_rows": 8,
1061
+ "raw_action_dim": 9
1062
+ },
1063
+ "primary_metrics": {
1064
+ "mode": "forward_dynamics",
1065
+ "loss_surface": "vision_velocity_conditioned_on_camera_pose",
1066
+ "pipeline_loaded": false,
1067
+ "weights_updated": false
1068
+ },
1069
+ "weights": "none; schema-only packer smoke, no adapter checkpoint",
1070
+ "interpretation": "The selected row maps to a camera_pose forward_dynamics contract. In the installed Cosmos3 pipeline this uses raw actions as conditioning and supervises noisy vision tokens; it does not supervise preds_action."
1071
  }
1072
  ],
1073
  "multi_episode_128_runs": [
 
1111
  "weights_repository": "none for this run: staged base nv-community/Cosmos3-Super weights were evaluated through vLLM; create a separate repo only after new adapter or fine-tuned weights exist"
1112
  }
1113
  ],
1114
+ "comparison_note": "Cosmos3-Super is now represented by a verified 448-window held-out Reasoner evaluation on the same JSON task as Qwen3. It uses staged base weights through vLLM, so it is a model-branch diagnostic, not a weight release. A camera-pose proxy forward-dynamics target export now passes the contract audit and schema-only packer smoke; true Cosmos3-Super fine-tuning is still not launched until the pipeline-loaded packer check and one-sample overfit exist."
1115
  }
1116
  ],
1117
  "model_group_reading_notes": [
 
1119
  "Task-head baselines have both a one-episode public-sample run and a 128-episode same-split metadata/text run.",
1120
  "Qwen3-Omni has a one-episode sensor-adapter smoke test and separate 128-episode LoRA diagnostic packages; only the final 128-episode adapter belongs in the Qwen LoRA model repo.",
1121
  "Cosmos3-Nano has a 128-episode future-window compatibility package.",
1122
+ "Cosmos3-Super has a 128-episode base-weight Reasoner evaluation on the JSON task plus a camera-pose forward-dynamics contract audit; create a separate Cosmos model repo only after real Cosmos adapter/fine-tuned weights exist."
1123
  ],
1124
  "pending": [
1125
  "Use the final Qwen3 full-eval package as the current Qwen result; older Qwen package rows remain historical diagnostics for comparison.",
1126
+ "Promote Cosmos3 from Nano compatibility, Super base-weight evaluation, and the camera-pose forward-dynamics contract to true fine-tuning only after the pipeline-loaded packer check and one-sample overfit produce new weights."
1127
  ]
1128
  }
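The counts in the camera-pose contract audit entry above can be cross-checked mechanically. What follows is a minimal sketch and not part of this commit: the `audit_counts` literal copies the values shown in the diff, and `check_contract_counts` is a hypothetical helper name. It checks that the window split counts sum to `dataset_samples` and that every row carries a valid action target. It only reports the episode total, because the diff does not state how `episode_split_counts` relates to the 128 selected episodes.

```python
# Values copied from the "counts" block of the camera-pose contract audit entry.
audit_counts = {
    "dataset_samples": 3808,
    "rows_with_action_target": 3808,
    "valid_action_targets": 3808,
    "split_counts": {"train": 2848, "val": 512, "test": 448},
    "episode_split_counts": {"test": 14, "train": 89, "val": 16},
}


def check_contract_counts(counts):
    """Return a list of readable problems; an empty list means the counts agree.

    Hypothetical helper for illustration; the repository's own audit script
    is not shown in this diff and may check different things.
    """
    problems = []

    # Window-level splits should partition the dataset.
    window_total = sum(counts["split_counts"].values())
    if window_total != counts["dataset_samples"]:
        problems.append(
            f"split_counts sum to {window_total}, "
            f"expected {counts['dataset_samples']}"
        )

    # Every sample should carry an action target, and every target should be valid.
    if counts["rows_with_action_target"] != counts["dataset_samples"]:
        problems.append("some rows have no action target")
    if counts["valid_action_targets"] != counts["rows_with_action_target"]:
        problems.append("some action targets are invalid")

    return problems


problems = check_contract_counts(audit_counts)
episode_total = sum(audit_counts["episode_split_counts"].values())

print("problems:", problems)
print("episodes covered by the audit:", episode_total)
```

Returning a list of problems rather than raising on the first mismatch lets a report show every inconsistency at once, which suits an audit-style artifact.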
docs/data/project_packet.json CHANGED
@@ -41,7 +41,7 @@
41
  "docs/data/scope_claims_audit.json",
42
  "docs/data/website_integrity.json"
43
  ],
44
- "readout": "The project status table and roadmap give the compact current-state summary. Single-episode task engineering, metrics, visualizations, public website integrity, mirror parity, same-split 128-episode baselines, the final selected-episode Qwen3-Omni diagnostic result, the Cosmos3-Nano compatibility package, and the Cosmos3-Super base-weight Reasoner evaluation are implemented; stronger action/subtask and real Cosmos fine-tuned model quality remain follow-ups."
45
  },
46
  {
47
  "step": 2,
 
41
  "docs/data/scope_claims_audit.json",
42
  "docs/data/website_integrity.json"
43
  ],
44
+ "readout": "The project status table and roadmap give the compact current-state summary. Single-episode task engineering, metrics, visualizations, public website integrity, mirror parity, same-split 128-episode baselines, the final selected-episode Qwen3-Omni diagnostic result, the Cosmos3-Nano compatibility package, the Cosmos3-Super base-weight Reasoner evaluation, and the Cosmos3-Super camera-pose forward-dynamics contract audit plus schema-only packer smoke are implemented; stronger action/subtask and real Cosmos fine-tuned model quality remain follow-ups."
45
  },
46
  {
47
  "step": 2,
docs/data/project_status.json CHANGED
@@ -119,7 +119,7 @@
119
  "FOUNDATION_MODEL_PLAN.md",
120
  "docs/data/foundation_model_plan.json"
121
  ],
122
- "readout": "Qwen3-Omni remains the first trainable held-out LoRA baseline; Cosmos 3 is now represented by a verified Cosmos3-Nano future-window compatibility package plus a verified Cosmos3-Super base-weight Reasoner evaluation; OpenVLA/openpi/GR00T are policy candidates after action targets are explicit."
123
  },
124
  {
125
  "area": "Omni model extension contract",
@@ -244,6 +244,18 @@
244
  ],
245
  "readout": "Cosmos3-Super Reasoner now has a public-safe verified 448-window held-out evaluation on the same structured JSON task as Qwen3. It uses staged nv-community/Cosmos3-Super base weights through an 8-GPU vLLM server, not fine-tuned weights: JSON validity 0.5112, action macro-F1 0.0008, transition accuracy 0.3683, contact accuracy 0.3214, and object micro-F1 0.1370."
246
  },
247
  {
248
  "area": "Raw Xperience-10M redistribution",
249
  "status": "not_included",
@@ -276,11 +288,11 @@
276
  "Use docs/data/omni_model_comparison.json to compare both views: the single-episode/128-baseline/model-branch result layers and the model-family grouping for task heads, Qwen3-Omni LoRA, Cosmos3-Nano, and Cosmos3-Super.",
277
  "Use docs/data/omni_finetune_verified_result.json and the latest verified_public final Qwen package for current held-out results.",
278
  "The 128-episode aligned simple/NN baselines use metadata/text features from the derived Qwen JSONL export; they align the split and task ids but do not replace raw-modality baselines for trajectory, retrieval, reconstruction, or misalignment tasks.",
279
- "The Cosmos3-Nano future-window branch is verified as a compatibility adapter result, and Cosmos3-Super Reasoner is verified as a base-weight evaluation; one-episode Cosmos fine-tuning and full Cosmos adapter/diffusion-weight fine-tuning remain pending, so no Cosmos weight repo should be published yet.",
280
  "The current reconstruction task reconstructs feature vectors, not pixel-depth, mesh, NeRF, or Gaussian reconstruction.",
281
  "Audio is one of the synchronized source modalities in the current task representation.",
282
  "The audio ablation report compares audio/no-audio variants across all 12 task contracts in results/audio_ablation/.",
283
- "Foundation-model selection is explicit: Qwen3-Omni is the immediate trainable pilot, Cosmos 3 is the first world-model branch, and policy models such as OpenVLA/openpi/GR00T wait for action-target conversion.",
284
  "Future model branches should be added through the backbone registry and verified package contract, not as one-off result folders with incompatible metrics or publication rules.",
285
  "The Xperience Embodied Foundation Model is a future native-pretraining goal, not a completed model or current benchmark."
286
  ]
 
119
  "FOUNDATION_MODEL_PLAN.md",
120
  "docs/data/foundation_model_plan.json"
121
  ],
122
+ "readout": "Qwen3-Omni remains the first trainable held-out LoRA baseline; Cosmos 3 is now represented by a verified Cosmos3-Nano future-window compatibility package, a verified Cosmos3-Super base-weight Reasoner evaluation, and a Cosmos3-Super camera-pose proxy forward-dynamics contract audit plus schema-only packer smoke. The current target supports vision-velocity training under action conditioning, not supervised action-token prediction; OpenVLA/openpi/GR00T are policy candidates after robot-compatible action targets are explicit."
123
  },
124
  {
125
  "area": "Omni model extension contract",
 
244
  ],
245
  "readout": "Cosmos3-Super Reasoner now has a public-safe verified 448-window held-out evaluation on the same structured JSON task as Qwen3. It uses staged nv-community/Cosmos3-Super base weights through an 8-GPU vLLM server, not fine-tuned weights: JSON validity 0.5112, action macro-F1 0.0008, transition accuracy 0.3683, contact accuracy 0.3214, and object micro-F1 0.1370."
246
  },
247
+ {
248
+ "area": "Cosmos3-Super action-target contract",
249
+ "status": "ready_for_forward_dynamics_trainer_implementation",
250
+ "evidence": [
251
+ "scripts/omni/export_cosmos3_camera_pose_targets.py",
252
+ "scripts/omni/pack_cosmos3_super_action_batch.py",
253
+ "results/omni_finetune/xperience10m_cosmos3_camera_pose_targets_20260608/target_manifest.json",
254
+ "results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608/training_contract_audit.json",
255
+ "results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/packer_summary.json"
256
+ ],
257
+ "readout": "The selected 128-episode JSONL is augmented with 3,808/3,808 valid camera_pose proxy cosmos_action_target records from SLAM pose deltas. The schema-only packer smoke confirms the current forward_dynamics target should supervise noisy vision tokens under camera-pose conditioning; it does not supervise preds_action. Remaining work is a pipeline-loaded packer check, one-sample forward-dynamics overfit, and a separate policy/inverse target export before claiming action-token prediction."
258
+ },
259
  {
260
  "area": "Raw Xperience-10M redistribution",
261
  "status": "not_included",
 
288
  "Use docs/data/omni_model_comparison.json to compare both views: the single-episode/128-baseline/model-branch result layers and the model-family grouping for task heads, Qwen3-Omni LoRA, Cosmos3-Nano, and Cosmos3-Super.",
289
  "Use docs/data/omni_finetune_verified_result.json and the latest verified_public final Qwen package for current held-out results.",
290
  "The 128-episode aligned simple/NN baselines use metadata/text features from the derived Qwen JSONL export; they align the split and task ids but do not replace raw-modality baselines for trajectory, retrieval, reconstruction, or misalignment tasks.",
291
+ "The Cosmos3-Nano future-window branch is verified as a compatibility adapter result, Cosmos3-Super Reasoner is verified as a base-weight evaluation, and Cosmos3-Super camera-pose forward-dynamics targets now pass the contract audit plus a schema-only packer smoke; one-episode Cosmos fine-tuning and full Cosmos adapter/diffusion-weight fine-tuning remain pending, so no Cosmos weight repo should be published yet.",
292
  "The current reconstruction task reconstructs feature vectors, not pixel-depth, mesh, NeRF, or Gaussian reconstruction.",
293
  "Audio is one of the synchronized source modalities in the current task representation.",
294
  "The audio ablation report compares audio/no-audio variants across all 12 task contracts in results/audio_ablation/.",
295
+ "Foundation-model selection is explicit: Qwen3-Omni is the immediate trainable pilot, Cosmos 3 is the first world-model branch, Cosmos3-Super has a camera-pose proxy forward-dynamics contract ready for trainer implementation, and policy models such as OpenVLA/openpi/GR00T wait for robot-compatible action-target conversion.",
296
  "Future model branches should be added through the backbone registry and verified package contract, not as one-off result folders with incompatible metrics or publication rules.",
297
  "The Xperience Embodied Foundation Model is a future native-pretraining goal, not a completed model or current benchmark."
298
  ]
docs/data/research_roadmap.json CHANGED
@@ -133,7 +133,7 @@
133
  "docs/data/foundation_model_plan.json",
134
  "research_roadmap_interactive.json"
135
  ],
136
- "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch; VLA/policy models wait for explicit action targets."
137
  },
138
  {
139
  "id": "robustness_run_64_128_episode",
 
133
  "docs/data/foundation_model_plan.json",
134
  "research_roadmap_interactive.json"
135
  ],
136
+ "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch. Cosmos3-Super now has camera-pose proxy forward-dynamics targets ready for trainer implementation, while VLA/policy models wait for robot-compatible action targets."
137
  },
138
  {
139
  "id": "robustness_run_64_128_episode",
docs/data/research_roadmap_interactive.json CHANGED
@@ -2369,7 +2369,7 @@
2369
  "entry_condition": "The selected episodes are prepared or a 3-8 episode dry run is available for preprocessing checks.",
2370
  "id": "foundation_model_selection_matrix",
2371
  "name": "Foundation-Model Selection Matrix",
2372
- "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch; VLA/policy models wait for explicit action targets.",
2373
  "stage": "omni",
2374
  "status": "next"
2375
  },
 
2369
  "entry_condition": "The selected episodes are prepared or a 3-8 episode dry run is available for preprocessing checks.",
2370
  "id": "foundation_model_selection_matrix",
2371
  "name": "Foundation-Model Selection Matrix",
2372
+ "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch. Cosmos3-Super now has camera-pose proxy forward-dynamics targets ready for trainer implementation, while VLA/policy models wait for robot-compatible action targets.",
2373
  "stage": "omni",
2374
  "status": "next"
2375
  },
docs/data/website_integrity.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "status": "pass",
3
- "generated_at_utc": "2026-06-07T15:47:32+00:00",
4
  "docs_root": "docs",
5
  "site_base": "/ropedia-xperience-10m-task-suite/",
6
  "summary": {
@@ -75,7 +75,7 @@
75
  "status": "pass",
76
  "reason": "The project overview should appear before the deeper progress ledger.",
77
  "overview_index": 67412,
78
- "evidence_index": 90477
79
  },
80
  {
81
  "name": "project_status_links_json",
@@ -153,8 +153,8 @@
153
  "status": "pass",
154
  "reason": "The evaluation protocol should appear before the deeper evidence ledger.",
155
  "overview_index": 67412,
156
- "protocol_index": 87160,
157
- "evidence_index": 90477
158
  },
159
  {
160
  "name": "evaluation_protocol_links_json",
@@ -292,7 +292,7 @@
292
  },
293
  {
294
  "path": "data/mirror_parity.json",
295
- "bytes": 410374,
296
  "top_level_type": "dict"
297
  },
298
  {
@@ -302,12 +302,12 @@
302
  },
303
  {
304
  "path": "data/omni_finetune_verified_result.json",
305
- "bytes": 3628,
306
  "top_level_type": "dict"
307
  },
308
  {
309
  "path": "data/omni_model_comparison.json",
310
- "bytes": 48296,
311
  "top_level_type": "dict"
312
  },
313
  {
@@ -322,12 +322,12 @@
322
  },
323
  {
324
  "path": "data/project_packet.json",
325
- "bytes": 8005,
326
  "top_level_type": "dict"
327
  },
328
  {
329
  "path": "data/project_status.json",
330
- "bytes": 16455,
331
  "top_level_type": "dict"
332
  },
333
  {
@@ -367,12 +367,12 @@
367
  },
368
  {
369
  "path": "data/research_roadmap.json",
370
- "bytes": 10133,
371
  "top_level_type": "dict"
372
  },
373
  {
374
  "path": "data/research_roadmap_interactive.json",
375
- "bytes": 143560,
376
  "top_level_type": "dict"
377
  },
378
  {
 
1
  {
2
  "status": "pass",
3
+ "generated_at_utc": "2026-06-07T17:27:17+00:00",
4
  "docs_root": "docs",
5
  "site_base": "/ropedia-xperience-10m-task-suite/",
6
  "summary": {
 
75
  "status": "pass",
76
  "reason": "The project overview should appear before the deeper progress ledger.",
77
  "overview_index": 67412,
78
+ "evidence_index": 90659
79
  },
80
  {
81
  "name": "project_status_links_json",
 
153
  "status": "pass",
154
  "reason": "The evaluation protocol should appear before the deeper evidence ledger.",
155
  "overview_index": 67412,
156
+ "protocol_index": 87218,
157
+ "evidence_index": 90659
158
  },
159
  {
160
  "name": "evaluation_protocol_links_json",
 
292
  },
293
  {
294
  "path": "data/mirror_parity.json",
295
+ "bytes": 319291,
296
  "top_level_type": "dict"
297
  },
298
  {
 
302
  },
303
  {
304
  "path": "data/omni_finetune_verified_result.json",
305
+ "bytes": 3768,
306
  "top_level_type": "dict"
307
  },
308
  {
309
  "path": "data/omni_model_comparison.json",
310
+ "bytes": 50422,
311
  "top_level_type": "dict"
312
  },
313
  {
 
322
  },
323
  {
324
  "path": "data/project_packet.json",
325
+ "bytes": 8098,
326
  "top_level_type": "dict"
327
  },
328
  {
329
  "path": "data/project_status.json",
330
+ "bytes": 18062,
331
  "top_level_type": "dict"
332
  },
333
  {
 
367
  },
368
  {
369
  "path": "data/research_roadmap.json",
370
+ "bytes": 10246,
371
  "top_level_type": "dict"
372
  },
373
  {
374
  "path": "data/research_roadmap_interactive.json",
375
+ "bytes": 143673,
376
  "top_level_type": "dict"
377
  },
378
  {
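The `bytes` and `top_level_type` fields in the file inventory above can be reproduced for any JSON file with the standard library. This is a hedged sketch: `describe_json_file` is a hypothetical name, the script that actually generates `website_integrity.json` is not shown in this diff, and the sketch assumes that `top_level_type` is the Python type name of the parsed document (which would yield `dict` for a JSON object).

```python
import json
import os
import tempfile


def describe_json_file(path):
    """Return the byte size and top-level parsed type name for one JSON file.

    Hypothetical helper; field names mirror the inventory entries in the diff.
    """
    with open(path, "r", encoding="utf-8") as handle:
        payload = json.load(handle)
    return {
        "path": path,
        "bytes": os.path.getsize(path),  # size on disk, not parsed length
        "top_level_type": type(payload).__name__,
    }


# Demonstration on a throwaway file so the sketch runs without the repository.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.json")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write('{"status": "pass"}')
    demo_entry = describe_json_file(demo_path)

print(demo_entry["bytes"], demo_entry["top_level_type"])
```

Because any wording change to a tracked JSON file changes its size, a byte count regenerated this way would be expected to move together with the content edits in the same commit, as the updated `bytes` values in this diff do.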
docs/index.html CHANGED
@@ -2409,7 +2409,7 @@
2409
  <article class="roadmap-card" data-status="next">
2410
  <span class="roadmap-status">next</span>
2411
  <h3>Foundation-Model Selection Matrix</h3>
2412
- <p>Keep Qwen3-Omni as the first trainable held-out pilot, add Cosmos 3 for world modeling, and stage policy candidates after action targets are explicit.</p>
2413
  <div class="roadmap-meta">
2414
  <strong>Entry</strong><p>Completed 128-episode preparation or a smaller 3-8 episode preprocessing dry run.</p>
2415
  <strong>Evidence</strong><p>Foundation model plan, source links, model-specific entry conditions, and evaluation additions.</p>
@@ -2488,8 +2488,8 @@
2488
  <article class="artifact"><h3>Metric contract</h3><p>All 12 tasks list input, target, primary metric, minimal baseline score, and neural MLP score from committed result files.</p><a href="data/summary_metrics.json">summary metrics</a></article>
2489
  <article class="artifact"><h3>Leakage controls</h3><p>Scalers fit on train windows only; future labels, target-side signals, caption/object labels, and contact labels stay on the target side unless explicitly queried.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/scripts/build_evaluation_protocol.py">builder script</a></article>
2490
  <article class="artifact"><h3>Audio ablation</h3><p>Audio and no-audio variants are evaluated across all 12 task contracts under the same chronological split.</p><a href="data/audio_ablation_summary.json">audio summary</a></article>
2491
- <article class="artifact"><h3>Foundation branch selection</h3><p>Qwen3-Omni is the first trainable baseline, Cosmos 3 becomes the world-model branch, policy models wait for explicit action targets, and Xperience-native pretraining remains a later full-corpus goal.</p><a href="data/foundation_model_plan.json">backbone plan</a></article>
2492
- <article class="artifact"><h3>Next evaluation stage</h3><p>This public-sample run covers single-episode task development. The selected multi-episode Qwen3-Omni final diagnostic result is verified and meets the JSON-validity target; Cosmos3-Nano has a verified future-window compatibility package; and Cosmos3-Super has a verified base-weight Reasoner JSON-task evaluation. The next stage is action/subtask error analysis, true Cosmos fine-tuning, and policy-target conversion.</p><a href="data/omni_model_comparison.json">result comparison</a></article>
2493
  <article class="artifact"><h3>Scale-up requirement</h3><p>Future Omni, Cosmos, and policy branches use the same episode split discipline, training metadata, held-out predictions, metrics, run report, and public-safe package gate.</p><a href="data/foundation_model_plan.json">scale-up status</a></article>
2494
  </div>
2495
  </div>
@@ -2542,7 +2542,7 @@
2542
  <article class="evidence-card">
2543
  <span class="status-pill">current plan</span>
2544
  <h3>Foundation backbones are separated by role</h3>
2545
- <p>Qwen3-Omni stays first for held-out LoRA; Cosmos 3 is the world-model branch; OpenVLA/openpi/GR00T are policy candidates after action-space conversion; Xperience-native pretraining is the later full-corpus goal.</p>
2546
  <div class="evidence-links">
2547
  <a href="data/foundation_model_plan.json">foundation model plan</a>
2548
  <a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/FOUNDATION_MODEL_PLAN.md">plan doc</a>
@@ -2552,7 +2552,7 @@
2552
  <article class="evidence-card">
2553
  <span class="status-pill">verified diagnostic</span>
2554
  <h3>Qwen3-Omni and Cosmos3 branches</h3>
2555
- <p>The selected 96/16/16 episode split produced verified Qwen3-Omni packages with 448 held-out test predictions. Cosmos3-Nano has 378 held-out future-window predictions, and Cosmos3-Super Reasoner has 448 held-out base-weight JSON-task predictions.</p>
2556
  <div class="evidence-links">
2557
  <a href="data/omni_model_comparison.json">result comparison</a>
2558
  <a href="data/omni_finetune_verified_result.json">pilot result</a>
@@ -3160,7 +3160,7 @@
3160
  <article class="artifact"><h3>Foundation-model plan</h3><p>Backbone selection matrix covering Qwen3-Omni, Cosmos 3, GR00T, OpenVLA/openpi, Gemini Robotics, Octo, SmolVLA-style policy candidates, and the future Xperience-native pretraining goal.</p><a href="data/foundation_model_plan.json">foundation model plan</a></article>
3161
  <article class="artifact"><h3>Multi-episode data access</h3><p>Public data-access path, selected 128-episode pilot plan, and preparation requirements.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.md">data access</a></article>
3162
  <article class="artifact"><h3>Qwen3-Omni LoRA group</h3><p>Separates the 1-episode sensor-adapter smoke test from the current 128-episode LoRA adapter package and older diagnostics.</p><a href="data/omni_model_comparison.json">Qwen group</a></article>
3163
- <article class="artifact"><h3>Cosmos3 groups</h3><p>Shows the verified Nano future-window compatibility package and the Super base-weight Reasoner JSON-task evaluation; neither is a new fine-tuned Cosmos weight release.</p><a href="data/omni_model_comparison.json">Cosmos groups</a></article>
3164
  <article class="artifact"><h3>Scale-up requirement</h3><p>Future runs need validation tracking, held-out predictions, quality-target reporting, and the same public-safe package gate.</p><a href="data/foundation_model_plan.json">training requirements</a></article>
3165
  <article class="artifact"><h3>Xperience-native pretraining</h3><p>Future plan for a domain-specific embodied foundation model trained from scratch over full-corpus video, audio, geometry, motion, inertial, and language streams.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md">pretraining plan</a></article>
3166
  </div>
 
2409
  <article class="roadmap-card" data-status="next">
2410
  <span class="roadmap-status">next</span>
2411
  <h3>Foundation-Model Selection Matrix</h3>
2412
+ <p>Keep Qwen3-Omni as the first trainable held-out pilot, use Cosmos 3 for world modeling and forward-dynamics trainer development, and stage policy candidates after robot-compatible action targets are explicit.</p>
2413
  <div class="roadmap-meta">
2414
  <strong>Entry</strong><p>Completed 128-episode preparation or a smaller 3-8 episode preprocessing dry run.</p>
2415
  <strong>Evidence</strong><p>Foundation model plan, source links, model-specific entry conditions, and evaluation additions.</p>
 
2488
  <article class="artifact"><h3>Metric contract</h3><p>All 12 tasks list input, target, primary metric, minimal baseline score, and neural MLP score from committed result files.</p><a href="data/summary_metrics.json">summary metrics</a></article>
2489
  <article class="artifact"><h3>Leakage controls</h3><p>Scalers fit on train windows only; future labels, target-side signals, caption/object labels, and contact labels stay on the target side unless explicitly queried.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/scripts/build_evaluation_protocol.py">builder script</a></article>
2490
  <article class="artifact"><h3>Audio ablation</h3><p>Audio and no-audio variants are evaluated across all 12 task contracts under the same chronological split.</p><a href="data/audio_ablation_summary.json">audio summary</a></article>
2491
+ <article class="artifact"><h3>Foundation branch selection</h3><p>Qwen3-Omni is the first trainable baseline, Cosmos 3 becomes the world-model branch with a camera-pose proxy forward-dynamics contract ready for trainer work, policy models wait for robot-compatible action targets, and Xperience-native pretraining remains a later full-corpus goal.</p><a href="data/foundation_model_plan.json">backbone plan</a></article>
2492
+ <article class="artifact"><h3>Next evaluation stage</h3><p>This public-sample run covers single-episode task development. The selected multi-episode Qwen3-Omni final diagnostic result is verified and meets the JSON-validity target; Cosmos3-Nano has a verified future-window compatibility package; and Cosmos3-Super has a verified base-weight JSON-task evaluation plus a camera-pose forward-dynamics contract audit. The next stage is action/subtask error analysis, true Cosmos fine-tuning, and policy-target conversion.</p><a href="data/omni_model_comparison.json">result comparison</a></article>
2493
  <article class="artifact"><h3>Scale-up requirement</h3><p>Future Omni, Cosmos, and policy branches use the same episode split discipline, training metadata, held-out predictions, metrics, run report, and public-safe package gate.</p><a href="data/foundation_model_plan.json">scale-up status</a></article>
2494
  </div>
2495
  </div>
 
2542
  <article class="evidence-card">
2543
  <span class="status-pill">current plan</span>
2544
  <h3>Foundation backbones are separated by role</h3>
2545
+ <p>Qwen3-Omni stays first for held-out LoRA; Cosmos 3 is the world-model branch with camera-pose proxy forward-dynamics targets ready for trainer work; OpenVLA/openpi/GR00T are policy candidates after robot-compatible action conversion; Xperience-native pretraining is the later full-corpus goal.</p>
2546
  <div class="evidence-links">
2547
  <a href="data/foundation_model_plan.json">foundation model plan</a>
2548
  <a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/FOUNDATION_MODEL_PLAN.md">plan doc</a>
 
2552
  <article class="evidence-card">
2553
  <span class="status-pill">verified diagnostic</span>
2554
  <h3>Qwen3-Omni and Cosmos3 branches</h3>
2555
+ <p>The selected 96/16/16 episode split produced verified Qwen3-Omni packages with 448 held-out test predictions. Cosmos3-Nano has 378 held-out future-window predictions, and Cosmos3-Super Reasoner has 448 held-out base-weight JSON-task predictions plus a camera-pose forward-dynamics contract audit.</p>
2556
  <div class="evidence-links">
2557
  <a href="data/omni_model_comparison.json">result comparison</a>
2558
  <a href="data/omni_finetune_verified_result.json">pilot result</a>
 
3160
  <article class="artifact"><h3>Foundation-model plan</h3><p>Backbone selection matrix covering Qwen3-Omni, Cosmos 3, GR00T, OpenVLA/openpi, Gemini Robotics, Octo, SmolVLA-style policy candidates, and the future Xperience-native pretraining goal.</p><a href="data/foundation_model_plan.json">foundation model plan</a></article>
3161
  <article class="artifact"><h3>Multi-episode data access</h3><p>Public data-access path, selected 128-episode pilot plan, and preparation requirements.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/results/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.md">data access</a></article>
3162
  <article class="artifact"><h3>Qwen3-Omni LoRA group</h3><p>Separates the 1-episode sensor-adapter smoke test from the current 128-episode LoRA adapter package and older diagnostics.</p><a href="data/omni_model_comparison.json">Qwen group</a></article>
3163
+ <article class="artifact"><h3>Cosmos3 groups</h3><p>Shows the verified Nano future-window compatibility package, the Super base-weight Reasoner JSON-task evaluation, and the Super camera-pose forward-dynamics contract audit; none is a new fine-tuned Cosmos weight release.</p><a href="data/omni_model_comparison.json">Cosmos groups</a></article>
3164
  <article class="artifact"><h3>Scale-up requirement</h3><p>Future runs need validation tracking, held-out predictions, quality-target reporting, and the same public-safe package gate.</p><a href="data/foundation_model_plan.json">training requirements</a></article>
3165
  <article class="artifact"><h3>Xperience-native pretraining</h3><p>Future plan for a domain-specific embodied foundation model trained from scratch over full-corpus video, audio, geometry, motion, inertial, and language streams.</p><a href="https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/blob/main/XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md">pretraining plan</a></article>
3166
  </div>
metrics/mirror_parity.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "status": "pass",
3
- "generated_at_utc": "2026-06-07T15:49:31+00:00",
4
  "hf_root": "hf_publish",
5
  "summary": {
6
  "group_count": 234,
@@ -350,27 +350,27 @@
350
  "local": {
351
  "path": "repo:docs/data/omni_finetune_verified_result.json",
352
  "exists": true,
353
- "bytes": 3628,
354
- "sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
355
  },
356
  "mirrors": {
357
  "hf_space": {
358
  "path": "hf_space:data/omni_finetune_verified_result.json",
359
  "exists": true,
360
- "bytes": 3628,
361
- "sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
362
  },
363
  "hf_artifacts": {
364
  "path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
365
  "exists": true,
366
- "bytes": 3628,
367
- "sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
368
  },
369
  "hf_model": {
370
  "path": "hf_model:metrics/omni_finetune_verified_result.json",
371
  "exists": true,
372
- "bytes": 3628,
373
- "sha256": "ce28a11876aa33feb1f7b28c977c1d3e708b7d5d8b24b062684d472ba671d004"
374
  }
375
  },
376
  "failures": []
@@ -381,27 +381,27 @@
381
  "local": {
382
  "path": "repo:docs/data/omni_model_comparison.json",
383
  "exists": true,
384
- "bytes": 48296,
385
- "sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
386
  },
387
  "mirrors": {
388
  "hf_space": {
389
  "path": "hf_space:data/omni_model_comparison.json",
390
  "exists": true,
391
- "bytes": 48296,
392
- "sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
393
  },
394
  "hf_artifacts": {
395
  "path": "hf_artifacts:docs/data/omni_model_comparison.json",
396
  "exists": true,
397
- "bytes": 48296,
398
- "sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
399
  },
400
  "hf_model": {
401
  "path": "hf_model:metrics/omni_model_comparison.json",
402
  "exists": true,
403
- "bytes": 48296,
404
- "sha256": "1c968bd58842af9a4e6159c1a8bd171aec08757bb77fce9f04c55030be08357f"
405
  }
406
  },
407
  "failures": []
@@ -474,27 +474,27 @@
474
  "local": {
475
  "path": "repo:docs/data/project_packet.json",
476
  "exists": true,
477
- "bytes": 8005,
478
- "sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
479
  },
480
  "mirrors": {
481
  "hf_space": {
482
  "path": "hf_space:data/project_packet.json",
483
  "exists": true,
484
- "bytes": 8005,
485
- "sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
486
  },
487
  "hf_artifacts": {
488
  "path": "hf_artifacts:docs/data/project_packet.json",
489
  "exists": true,
490
- "bytes": 8005,
491
- "sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
492
  },
493
  "hf_model": {
494
  "path": "hf_model:metrics/project_packet.json",
495
  "exists": true,
496
- "bytes": 8005,
497
- "sha256": "2258fecb80850c745e60cb28733869c49a5182879d9d0461b666a5575e3c1610"
498
  }
499
  },
500
  "failures": []
@@ -505,27 +505,27 @@
505
  "local": {
506
  "path": "repo:docs/data/project_status.json",
507
  "exists": true,
508
- "bytes": 16455,
509
- "sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
510
  },
511
  "mirrors": {
512
  "hf_space": {
513
  "path": "hf_space:data/project_status.json",
514
  "exists": true,
515
- "bytes": 16455,
516
- "sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
517
  },
518
  "hf_artifacts": {
519
  "path": "hf_artifacts:docs/data/project_status.json",
520
  "exists": true,
521
- "bytes": 16455,
522
- "sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
523
  },
524
  "hf_model": {
525
  "path": "hf_model:metrics/project_status.json",
526
  "exists": true,
527
- "bytes": 16455,
528
- "sha256": "3590ee1e09ecf819080a7714ea9629db305e1fd68c99a65f62bb65061c0d766c"
529
  }
530
  },
531
  "failures": []
@@ -691,27 +691,27 @@
691
  "local": {
692
  "path": "repo:docs/data/research_roadmap.json",
693
  "exists": true,
694
- "bytes": 10133,
695
- "sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
696
  },
697
  "mirrors": {
698
  "hf_space": {
699
  "path": "hf_space:data/research_roadmap.json",
700
  "exists": true,
701
- "bytes": 10133,
702
- "sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
703
  },
704
  "hf_artifacts": {
705
  "path": "hf_artifacts:docs/data/research_roadmap.json",
706
  "exists": true,
707
- "bytes": 10133,
708
- "sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
709
  },
710
  "hf_model": {
711
  "path": "hf_model:metrics/research_roadmap.json",
712
  "exists": true,
713
- "bytes": 10133,
714
- "sha256": "45fd3a1bde93654ccfe14f9271928a67b36eb3f166826bfbdbb9c1092ad33bcf"
715
  }
716
  },
717
  "failures": []
@@ -722,27 +722,27 @@
722
  "local": {
723
  "path": "repo:docs/data/research_roadmap_interactive.json",
724
  "exists": true,
725
- "bytes": 143560,
726
- "sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
727
  },
728
  "mirrors": {
729
  "hf_space": {
730
  "path": "hf_space:data/research_roadmap_interactive.json",
731
  "exists": true,
732
- "bytes": 143560,
733
- "sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
734
  },
735
  "hf_artifacts": {
736
  "path": "hf_artifacts:docs/data/research_roadmap_interactive.json",
737
  "exists": true,
738
- "bytes": 143560,
739
- "sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
740
  },
741
  "hf_model": {
742
  "path": "hf_model:metrics/research_roadmap_interactive.json",
743
  "exists": true,
744
- "bytes": 143560,
745
- "sha256": "9198752056a40eb5a7457ded21576862d9954be1f0f4a9e996e935d328ef4062"
746
  }
747
  },
748
  "failures": []
@@ -1033,26 +1033,26 @@
1033
  "path": "repo:docs/data/website_integrity.json",
1034
  "exists": true,
1035
  "bytes": 15375,
1036
- "sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1037
  },
1038
  "mirrors": {
1039
  "hf_space": {
1040
  "path": "hf_space:data/website_integrity.json",
1041
  "exists": true,
1042
  "bytes": 15375,
1043
- "sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1044
  },
1045
  "hf_artifacts": {
1046
  "path": "hf_artifacts:docs/data/website_integrity.json",
1047
  "exists": true,
1048
  "bytes": 15375,
1049
- "sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1050
  },
1051
  "hf_model": {
1052
  "path": "hf_model:metrics/website_integrity.json",
1053
  "exists": true,
1054
  "bytes": 15375,
1055
- "sha256": "449b5525a0fc9ba200e59c6248e5ce963381938ab2c2027e1933db9483622037"
1056
  }
1057
  },
1058
  "failures": []
@@ -1785,21 +1785,21 @@
1785
  "local": {
1786
  "path": "repo:scripts/omni/build_omni_model_comparison.py",
1787
  "exists": true,
1788
- "bytes": 30236,
1789
- "sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1790
  },
1791
  "mirrors": {
1792
  "hf_artifacts": {
1793
  "path": "hf_artifacts:scripts/omni/build_omni_model_comparison.py",
1794
  "exists": true,
1795
- "bytes": 30236,
1796
- "sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1797
  },
1798
  "hf_model": {
1799
  "path": "hf_model:scripts/omni/build_omni_model_comparison.py",
1800
  "exists": true,
1801
- "bytes": 30236,
1802
- "sha256": "207b0bbfbea1cd3d7e6e77e7eafcf231b71c9f6483ffc36889234c7bafbcb1df"
1803
  }
1804
  },
1805
  "failures": []
@@ -2160,21 +2160,21 @@
2160
  "local": {
2161
  "path": "repo:scripts/verify_live_publication.py",
2162
  "exists": true,
2163
- "bytes": 36201,
2164
- "sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2165
  },
2166
  "mirrors": {
2167
  "hf_artifacts": {
2168
  "path": "hf_artifacts:scripts/verify_live_publication.py",
2169
  "exists": true,
2170
- "bytes": 36201,
2171
- "sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2172
  },
2173
  "hf_model": {
2174
  "path": "hf_model:scripts/verify_live_publication.py",
2175
  "exists": true,
2176
- "bytes": 36201,
2177
- "sha256": "76f03885867a8ed7095958a6948cbce81b4958fb74a09df24c24ad7eb5b0d944"
2178
  }
2179
  },
2180
  "failures": []
@@ -2410,21 +2410,21 @@
2410
  "local": {
2411
  "path": "repo:docs/index.html",
2412
  "exists": true,
2413
- "bytes": 180727,
2414
- "sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2415
  },
2416
  "mirrors": {
2417
  "hf_space": {
2418
  "path": "hf_space:index.html",
2419
  "exists": true,
2420
- "bytes": 180727,
2421
- "sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2422
  },
2423
  "hf_artifacts_docs": {
2424
  "path": "hf_artifacts:docs/index.html",
2425
  "exists": true,
2426
- "bytes": 180727,
2427
- "sha256": "a88769e505d5af34674278f282ed1f482cc91dc711ddc0ed894a3fca5d08ff67"
2428
  }
2429
  },
2430
  "failures": []
@@ -2696,27 +2696,27 @@
2696
  "local": {
2697
  "path": "repo:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2698
  "exists": true,
2699
- "bytes": 9231,
2700
- "sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2701
  },
2702
  "mirrors": {
2703
  "hf_space": {
2704
  "path": "hf_space:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2705
  "exists": true,
2706
- "bytes": 9231,
2707
- "sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2708
  },
2709
  "hf_artifacts": {
2710
  "path": "hf_artifacts:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2711
  "exists": true,
2712
- "bytes": 9231,
2713
- "sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2714
  },
2715
  "hf_model": {
2716
  "path": "hf_model:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2717
  "exists": true,
2718
- "bytes": 9231,
2719
- "sha256": "c38d12e138193f7200800d4dd8c149497de2c5f5895299e22fe81285b69fc62d"
2720
  }
2721
  },
2722
  "failures": []
@@ -7036,27 +7036,27 @@
7036
  "local": {
7037
  "path": "repo:RESEARCH_ROADMAP.md",
7038
  "exists": true,
7039
- "bytes": 12233,
7040
- "sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7041
  },
7042
  "mirrors": {
7043
  "hf_space": {
7044
  "path": "hf_space:RESEARCH_ROADMAP.md",
7045
  "exists": true,
7046
- "bytes": 12233,
7047
- "sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7048
  },
7049
  "hf_artifacts": {
7050
  "path": "hf_artifacts:RESEARCH_ROADMAP.md",
7051
  "exists": true,
7052
- "bytes": 12233,
7053
- "sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7054
  },
7055
  "hf_model": {
7056
  "path": "hf_model:RESEARCH_ROADMAP.md",
7057
  "exists": true,
7058
- "bytes": 12233,
7059
- "sha256": "020512aa647cef7d63eccf7bb8dd6cb86f0e5c457f3c0e3d5ef293e7b35a58bf"
7060
  }
7061
  },
7062
  "failures": []
@@ -7067,27 +7067,27 @@
7067
  "local": {
7068
  "path": "repo:PROJECT_STATUS.md",
7069
  "exists": true,
7070
- "bytes": 9926,
7071
- "sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7072
  },
7073
  "mirrors": {
7074
  "hf_space": {
7075
  "path": "hf_space:PROJECT_STATUS.md",
7076
  "exists": true,
7077
- "bytes": 9926,
7078
- "sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7079
  },
7080
  "hf_artifacts": {
7081
  "path": "hf_artifacts:PROJECT_STATUS.md",
7082
  "exists": true,
7083
- "bytes": 9926,
7084
- "sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7085
  },
7086
  "hf_model": {
7087
  "path": "hf_model:PROJECT_STATUS.md",
7088
  "exists": true,
7089
- "bytes": 9926,
7090
- "sha256": "c7dfb7a45f0c1ea435c16d93208a82da4227336e34f56a96d4afa04fce42438c"
7091
  }
7092
  },
7093
  "failures": []
 
1
  {
2
  "status": "pass",
3
+ "generated_at_utc": "2026-06-07T17:31:58+00:00",
4
  "hf_root": "hf_publish",
5
  "summary": {
6
  "group_count": 234,
 
350
  "local": {
351
  "path": "repo:docs/data/omni_finetune_verified_result.json",
352
  "exists": true,
353
+ "bytes": 3768,
354
+ "sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1"
355
  },
356
  "mirrors": {
357
  "hf_space": {
358
  "path": "hf_space:data/omni_finetune_verified_result.json",
359
  "exists": true,
360
+ "bytes": 3768,
361
+ "sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1"
362
  },
363
  "hf_artifacts": {
364
  "path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
365
  "exists": true,
366
+ "bytes": 3768,
367
+ "sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1"
368
  },
369
  "hf_model": {
370
  "path": "hf_model:metrics/omni_finetune_verified_result.json",
371
  "exists": true,
372
+ "bytes": 3768,
373
+ "sha256": "efc1b9c1938f358f44e2cfbc53bb395714217f8e158ecc0e2609a775c670c6e1"
374
  }
375
  },
376
  "failures": []
 
381
  "local": {
382
  "path": "repo:docs/data/omni_model_comparison.json",
383
  "exists": true,
384
+ "bytes": 51589,
385
+ "sha256": "ba400d7c5dadd5fa654f3ba2b202be7f11537c1de7e2abee600ca431de2785a4"
386
  },
387
  "mirrors": {
388
  "hf_space": {
389
  "path": "hf_space:data/omni_model_comparison.json",
390
  "exists": true,
391
+ "bytes": 51589,
392
+ "sha256": "ba400d7c5dadd5fa654f3ba2b202be7f11537c1de7e2abee600ca431de2785a4"
393
  },
394
  "hf_artifacts": {
395
  "path": "hf_artifacts:docs/data/omni_model_comparison.json",
396
  "exists": true,
397
+ "bytes": 51589,
398
+ "sha256": "ba400d7c5dadd5fa654f3ba2b202be7f11537c1de7e2abee600ca431de2785a4"
399
  },
400
  "hf_model": {
401
  "path": "hf_model:metrics/omni_model_comparison.json",
402
  "exists": true,
403
+ "bytes": 51589,
404
+ "sha256": "ba400d7c5dadd5fa654f3ba2b202be7f11537c1de7e2abee600ca431de2785a4"
405
  }
406
  },
407
  "failures": []
 
474
  "local": {
475
  "path": "repo:docs/data/project_packet.json",
476
  "exists": true,
477
+ "bytes": 8098,
478
+ "sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15"
479
  },
480
  "mirrors": {
481
  "hf_space": {
482
  "path": "hf_space:data/project_packet.json",
483
  "exists": true,
484
+ "bytes": 8098,
485
+ "sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15"
486
  },
487
  "hf_artifacts": {
488
  "path": "hf_artifacts:docs/data/project_packet.json",
489
  "exists": true,
490
+ "bytes": 8098,
491
+ "sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15"
492
  },
493
  "hf_model": {
494
  "path": "hf_model:metrics/project_packet.json",
495
  "exists": true,
496
+ "bytes": 8098,
497
+ "sha256": "77cabac65b31db4e0477e20b1e6dfb06572bee42d8f71ac48f9380c0f4d86e15"
498
  }
499
  },
500
  "failures": []
 
505
  "local": {
506
  "path": "repo:docs/data/project_status.json",
507
  "exists": true,
508
+ "bytes": 18062,
509
+ "sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8"
510
  },
511
  "mirrors": {
512
  "hf_space": {
513
  "path": "hf_space:data/project_status.json",
514
  "exists": true,
515
+ "bytes": 18062,
516
+ "sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8"
517
  },
518
  "hf_artifacts": {
519
  "path": "hf_artifacts:docs/data/project_status.json",
520
  "exists": true,
521
+ "bytes": 18062,
522
+ "sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8"
523
  },
524
  "hf_model": {
525
  "path": "hf_model:metrics/project_status.json",
526
  "exists": true,
527
+ "bytes": 18062,
528
+ "sha256": "3f75b0894d215e39f69b4a477c06132eba00d4ed67cf6e39a22716e08ee725b8"
529
  }
530
  },
531
  "failures": []
 
691
  "local": {
692
  "path": "repo:docs/data/research_roadmap.json",
693
  "exists": true,
694
+ "bytes": 10246,
695
+ "sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06"
696
  },
697
  "mirrors": {
698
  "hf_space": {
699
  "path": "hf_space:data/research_roadmap.json",
700
  "exists": true,
701
+ "bytes": 10246,
702
+ "sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06"
703
  },
704
  "hf_artifacts": {
705
  "path": "hf_artifacts:docs/data/research_roadmap.json",
706
  "exists": true,
707
+ "bytes": 10246,
708
+ "sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06"
709
  },
710
  "hf_model": {
711
  "path": "hf_model:metrics/research_roadmap.json",
712
  "exists": true,
713
+ "bytes": 10246,
714
+ "sha256": "d34d763c3e880002f0b5de554b1b3f17b65f2cff24c5bc080ece938d04db2d06"
715
  }
716
  },
717
  "failures": []
 
722
  "local": {
723
  "path": "repo:docs/data/research_roadmap_interactive.json",
724
  "exists": true,
725
+ "bytes": 143673,
726
+ "sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6"
727
  },
728
  "mirrors": {
729
  "hf_space": {
730
  "path": "hf_space:data/research_roadmap_interactive.json",
731
  "exists": true,
732
+ "bytes": 143673,
733
+ "sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6"
734
  },
735
  "hf_artifacts": {
736
  "path": "hf_artifacts:docs/data/research_roadmap_interactive.json",
737
  "exists": true,
738
+ "bytes": 143673,
739
+ "sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6"
740
  },
741
  "hf_model": {
742
  "path": "hf_model:metrics/research_roadmap_interactive.json",
743
  "exists": true,
744
+ "bytes": 143673,
745
+ "sha256": "ad989e7cf78a213543614e23f90d4f03e5f5617b3ec6be43dfcc4b3a22cd6ac6"
746
  }
747
  },
748
  "failures": []
 
1033
  "path": "repo:docs/data/website_integrity.json",
1034
  "exists": true,
1035
  "bytes": 15375,
1036
+ "sha256": "29b9ad18c3c76ebf8d453a77c726f2d56c207ea262d74a8b6d086092020bef94"
1037
  },
1038
  "mirrors": {
1039
  "hf_space": {
1040
  "path": "hf_space:data/website_integrity.json",
1041
  "exists": true,
1042
  "bytes": 15375,
1043
+ "sha256": "29b9ad18c3c76ebf8d453a77c726f2d56c207ea262d74a8b6d086092020bef94"
1044
  },
1045
  "hf_artifacts": {
1046
  "path": "hf_artifacts:docs/data/website_integrity.json",
1047
  "exists": true,
1048
  "bytes": 15375,
1049
+ "sha256": "29b9ad18c3c76ebf8d453a77c726f2d56c207ea262d74a8b6d086092020bef94"
1050
  },
1051
  "hf_model": {
1052
  "path": "hf_model:metrics/website_integrity.json",
1053
  "exists": true,
1054
  "bytes": 15375,
1055
+ "sha256": "29b9ad18c3c76ebf8d453a77c726f2d56c207ea262d74a8b6d086092020bef94"
1056
  }
1057
  },
1058
  "failures": []
 
1785
  "local": {
1786
  "path": "repo:scripts/omni/build_omni_model_comparison.py",
1787
  "exists": true,
1788
+ "bytes": 35577,
1789
+ "sha256": "593fa7179d2ad0ca03aa11652f3273f046468d38447a6f05b0c8f36c4be25889"
1790
  },
1791
  "mirrors": {
1792
  "hf_artifacts": {
1793
  "path": "hf_artifacts:scripts/omni/build_omni_model_comparison.py",
1794
  "exists": true,
1795
+ "bytes": 35577,
1796
+ "sha256": "593fa7179d2ad0ca03aa11652f3273f046468d38447a6f05b0c8f36c4be25889"
1797
  },
1798
  "hf_model": {
1799
  "path": "hf_model:scripts/omni/build_omni_model_comparison.py",
1800
  "exists": true,
1801
+ "bytes": 35577,
1802
+ "sha256": "593fa7179d2ad0ca03aa11652f3273f046468d38447a6f05b0c8f36c4be25889"
1803
  }
1804
  },
1805
  "failures": []
 
2160
  "local": {
2161
  "path": "repo:scripts/verify_live_publication.py",
2162
  "exists": true,
2163
+ "bytes": 36285,
2164
+ "sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471"
2165
  },
2166
  "mirrors": {
2167
  "hf_artifacts": {
2168
  "path": "hf_artifacts:scripts/verify_live_publication.py",
2169
  "exists": true,
2170
+ "bytes": 36285,
2171
+ "sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471"
2172
  },
2173
  "hf_model": {
2174
  "path": "hf_model:scripts/verify_live_publication.py",
2175
  "exists": true,
2176
+ "bytes": 36285,
2177
+ "sha256": "4605124056ca329069b1ec848372dda439258140e0e2aeb449d7bf1929623471"
2178
  }
2179
  },
2180
  "failures": []
 
2410
  "local": {
2411
  "path": "repo:docs/index.html",
2412
  "exists": true,
2413
+ "bytes": 181095,
2414
+ "sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1"
2415
  },
2416
  "mirrors": {
2417
  "hf_space": {
2418
  "path": "hf_space:index.html",
2419
  "exists": true,
2420
+ "bytes": 181095,
2421
+ "sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1"
2422
  },
2423
  "hf_artifacts_docs": {
2424
  "path": "hf_artifacts:docs/index.html",
2425
  "exists": true,
2426
+ "bytes": 181095,
2427
+ "sha256": "856d5f9529fc30adbd995f45df43af0861f5e48b8fbfb14cb4e4313ede097dc1"
2428
  }
2429
  },
2430
  "failures": []
 
2696
  "local": {
2697
  "path": "repo:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2698
  "exists": true,
2699
+ "bytes": 10215,
2700
+ "sha256": "a5de891b2119941e27af8d28fd6d93c53387cc7609dea8fe4fe8e30786e1cc7c"
2701
  },
2702
  "mirrors": {
2703
  "hf_space": {
2704
  "path": "hf_space:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2705
  "exists": true,
2706
+ "bytes": 10215,
2707
+ "sha256": "a5de891b2119941e27af8d28fd6d93c53387cc7609dea8fe4fe8e30786e1cc7c"
2708
  },
2709
  "hf_artifacts": {
2710
  "path": "hf_artifacts:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2711
  "exists": true,
2712
+ "bytes": 10215,
2713
+ "sha256": "a5de891b2119941e27af8d28fd6d93c53387cc7609dea8fe4fe8e30786e1cc7c"
2714
  },
2715
  "hf_model": {
2716
  "path": "hf_model:results/omni_finetune/OMNI_MODEL_COMPARISON.md",
2717
  "exists": true,
2718
+ "bytes": 10215,
2719
+ "sha256": "a5de891b2119941e27af8d28fd6d93c53387cc7609dea8fe4fe8e30786e1cc7c"
2720
  }
2721
  },
2722
  "failures": []
 
7036
  "local": {
7037
  "path": "repo:RESEARCH_ROADMAP.md",
7038
  "exists": true,
7039
+ "bytes": 12874,
7040
+ "sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347"
7041
  },
7042
  "mirrors": {
7043
  "hf_space": {
7044
  "path": "hf_space:RESEARCH_ROADMAP.md",
7045
  "exists": true,
7046
+ "bytes": 12874,
7047
+ "sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347"
7048
  },
7049
  "hf_artifacts": {
7050
  "path": "hf_artifacts:RESEARCH_ROADMAP.md",
7051
  "exists": true,
7052
+ "bytes": 12874,
7053
+ "sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347"
7054
  },
7055
  "hf_model": {
7056
  "path": "hf_model:RESEARCH_ROADMAP.md",
7057
  "exists": true,
7058
+ "bytes": 12874,
7059
+ "sha256": "834317a5b066b46046042be3f0c9ac7d12226a95728bd4a0a5898c3c96044347"
7060
  }
7061
  },
7062
  "failures": []
 
7067
  "local": {
7068
  "path": "repo:PROJECT_STATUS.md",
7069
  "exists": true,
7070
+ "bytes": 11369,
7071
+ "sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114"
7072
  },
7073
  "mirrors": {
7074
  "hf_space": {
7075
  "path": "hf_space:PROJECT_STATUS.md",
7076
  "exists": true,
7077
+ "bytes": 11369,
7078
+ "sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114"
7079
  },
7080
  "hf_artifacts": {
7081
  "path": "hf_artifacts:PROJECT_STATUS.md",
7082
  "exists": true,
7083
+ "bytes": 11369,
7084
+ "sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114"
7085
  },
7086
  "hf_model": {
7087
  "path": "hf_model:PROJECT_STATUS.md",
7088
  "exists": true,
7089
+ "bytes": 11369,
7090
+ "sha256": "9ada29f7e7c8f6203abe2ddde67fcbe35656fa0c299b70d6adbd28053f69d114"
7091
  }
7092
  },
7093
  "failures": []
metrics/omni_finetune_verified_result.json CHANGED
@@ -80,7 +80,7 @@
80
  "required_next_steps": [
81
  "Use the v3 strict-label predictions for action/subtask error analysis and unseen-label debugging.",
82
  "Keep the existing Qwen LoRA adapter repository as the weight-bearing artifact; v3 is an evaluation/package refresh over the same adapter, not new weights.",
83
- "Implement the Cosmos3-Super diffusion/action target packer and supervised loss before claiming Cosmos3 fine-tuning.",
84
  "Use sharded Qwen eval for future long held-out passes to improve GPU utilization."
85
  ]
86
  }
 
80
  "required_next_steps": [
81
  "Use the v3 strict-label predictions for action/subtask error analysis and unseen-label debugging.",
82
  "Keep the existing Qwen LoRA adapter repository as the weight-bearing artifact; v3 is an evaluation/package refresh over the same adapter, not new weights.",
83
+ "Implement the Cosmos3-Super pipeline-loaded batch packer and one-sample forward-dynamics overfit before claiming Cosmos3 fine-tuning; camera-pose proxy targets are now exported, contract-audited, and schema-packed, but no Cosmos weights have been updated.",
84
  "Use sharded Qwen eval for future long held-out passes to improve GPU utilization."
85
  ]
86
  }
metrics/omni_model_comparison.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
3
- "generated_at_utc": "2026-06-07T15:34:51+00:00",
4
  "status": "pass",
5
  "version_count": 3,
6
  "model_group_count": 4,
@@ -8,7 +8,7 @@
8
  "version_reading_notes": [
9
  "Version 1 is the public-sample 12-task harness with minimal and neural heads.",
10
  "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
11
- "Version 3 is the verified model-branch layer: the current final Qwen3-Omni LoRA package is the JSON-task diagnostic result, Cosmos3-Nano is a future-window compatibility result, and Cosmos3-Super Reasoner is a base-weight JSON-task evaluation rather than a new fine-tuned weight release."
12
  ],
13
  "versions": [
14
  {
@@ -1012,7 +1012,62 @@
1012
  "weights_updated": false
1013
  },
1014
  "weights": "none; readiness audit only, no adapter checkpoint",
1015
- "interpretation": "This probe confirms the staged Cosmos3-Super Diffusers/GPU runtime and the same JSON QA dataset are visible, but blocks true fine-tuning until a Cosmos-specific diffusion/action target packer and supervised loss are implemented."
1016
  }
1017
  ],
1018
  "multi_episode_128_runs": [
@@ -1056,7 +1111,7 @@
1056
  "weights_repository": "none for this run: staged base nv-community/Cosmos3-Super weights were evaluated through vLLM; create a separate repo only after new adapter or fine-tuned weights exist"
1057
  }
1058
  ],
1059
- "comparison_note": "Cosmos3-Super is now represented by a verified 448-window held-out Reasoner evaluation on the same JSON task as Qwen3. It uses staged base weights through vLLM, so it is a model-branch diagnostic, not a weight release. The readiness probe records why true Cosmos3-Super fine-tuning is not launched yet."
1060
  }
1061
  ],
1062
  "model_group_reading_notes": [
@@ -1064,10 +1119,10 @@
1064
  "Task-head baselines have both a one-episode public-sample run and a 128-episode same-split metadata/text run.",
1065
  "Qwen3-Omni has a one-episode sensor-adapter smoke test and separate 128-episode LoRA diagnostic packages; only the final 128-episode adapter belongs in the Qwen LoRA model repo.",
1066
  "Cosmos3-Nano has a 128-episode future-window compatibility package.",
1067
- "Cosmos3-Super has a 128-episode base-weight Reasoner evaluation on the JSON task plus a training-readiness probe; create a separate Cosmos model repo only after real Cosmos adapter/fine-tuned weights exist."
1068
  ],
1069
  "pending": [
1070
  "Use the final Qwen3 full-eval package as the current Qwen result; older Qwen package rows remain historical diagnostics for comparison.",
1071
- "Promote Cosmos3 from Nano compatibility and Super base-weight evaluation to true fine-tuning only after a dedicated Cosmos diffusion/action target packer and supervised loss produce new weights."
1072
  ]
1073
  }
 
1
  {
2
  "title": "Ropedia Xperience-10M Current Result Versions and Model Groups",
3
+ "generated_at_utc": "2026-06-07T17:29:16+00:00",
4
  "status": "pass",
5
  "version_count": 3,
6
  "model_group_count": 4,
 
8
  "version_reading_notes": [
9
  "Version 1 is the public-sample 12-task harness with minimal and neural heads.",
10
  "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
11
+ "Version 3 is the verified model-branch layer: the current final Qwen3-Omni LoRA package is the JSON-task diagnostic result, Cosmos3-Nano is a future-window compatibility result, and Cosmos3-Super Reasoner is a base-weight JSON-task evaluation; Cosmos3-Super now has a camera-pose forward-dynamics contract audit and schema-only packer smoke, but no new fine-tuned weight release."
12
  ],
13
  "versions": [
14
  {
 
1012
  "weights_updated": false
1013
  },
1014
  "weights": "none; readiness audit only, no adapter checkpoint",
1015
+ "interpretation": "This probe confirms the staged Cosmos3-Super Diffusers/GPU runtime and the same JSON QA dataset are visible. It predates the camera-pose action-target export, so use the 20260608 contract audit for the current trainer-readiness status."
1016
+ },
1017
+ {
1018
+ "id": "xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608",
1019
+ "title": "Cosmos3-Super Camera-Pose Target Audit",
1020
+ "scope_label": "action target contract",
1021
+ "scope": "selected 128-episode 96/16/16 dataset augmented with camera_pose proxy cosmos_action_target records",
1022
+ "status": "ready_for_forward_dynamics_trainer",
1023
+ "source": "results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608/training_contract_audit.json",
1024
+ "split": "train/val/test by selected episode/session",
1025
+ "counts": {
1026
+ "dataset_samples": 3808,
1027
+ "rows_with_action_target": 3808,
1028
+ "valid_action_targets": 3808,
1029
+ "split_counts": {
1030
+ "train": 2848,
1031
+ "val": 512,
1032
+ "test": 448
1033
+ },
1034
+ "episode_split_counts": {
1035
+ "test": 14,
1036
+ "train": 89,
1037
+ "val": 16
1038
+ }
1039
+ },
1040
+ "primary_metrics": {
1041
+ "domain_name": "camera_pose",
1042
+ "raw_action_dim": 9,
1043
+ "mode": "forward_dynamics",
1044
+ "valid_action_targets": 3808,
1045
+ "weights_updated": false
1046
+ },
1047
+ "weights": "none; action-target contract audit only, no adapter checkpoint",
1048
+ "interpretation": "The selected dataset now has valid Cosmos3 camera_pose forward_dynamics targets for an egocentric camera-motion proxy. These remove the target-schema blocker for action-conditioned world-model training, but they supervise noisy vision tokens rather than preds_action. The remaining work is a pipeline-loaded packer check and one-sample forward-dynamics overfit; action-token prediction needs a separate policy or inverse-dynamics target export."
1049
+ },
1050
+ {
1051
+ "id": "xperience10m_cosmos3_super_action_packer_schema_smoke_20260608",
1052
+ "title": "Cosmos3-Super Action Batch Packer Smoke",
1053
+ "scope_label": "batch packer",
1054
+ "scope": "one selected train row from the camera_pose forward_dynamics augmented JSONL",
1055
+ "status": "pass",
1056
+ "source": "results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/packer_summary.json",
1057
+ "split": "train",
1058
+ "counts": {
1059
+ "samples": 1,
1060
+ "raw_action_rows": 8,
1061
+ "raw_action_dim": 9
1062
+ },
1063
+ "primary_metrics": {
1064
+ "mode": "forward_dynamics",
1065
+ "loss_surface": "vision_velocity_conditioned_on_camera_pose",
1066
+ "pipeline_loaded": false,
1067
+ "weights_updated": false
1068
+ },
1069
+ "weights": "none; schema-only packer smoke, no adapter checkpoint",
1070
+ "interpretation": "The selected row maps to a camera_pose forward_dynamics contract. In the installed Cosmos3 pipeline this uses raw actions as conditioning and supervises noisy vision tokens; it does not supervise preds_action."
1071
  }
1072
  ],
1073
  "multi_episode_128_runs": [
 
1111
  "weights_repository": "none for this run: staged base nv-community/Cosmos3-Super weights were evaluated through vLLM; create a separate repo only after new adapter or fine-tuned weights exist"
1112
  }
1113
  ],
1114
+ "comparison_note": "Cosmos3-Super is now represented by a verified 448-window held-out Reasoner evaluation on the same JSON task as Qwen3. It uses staged base weights through vLLM, so it is a model-branch diagnostic, not a weight release. A camera-pose proxy forward-dynamics target export now passes the contract audit and schema-only packer smoke; true Cosmos3-Super fine-tuning is still not launched until the pipeline-loaded packer check and one-sample overfit exist."
1115
  }
1116
  ],
1117
  "model_group_reading_notes": [
 
1119
  "Task-head baselines have both a one-episode public-sample run and a 128-episode same-split metadata/text run.",
1120
  "Qwen3-Omni has a one-episode sensor-adapter smoke test and separate 128-episode LoRA diagnostic packages; only the final 128-episode adapter belongs in the Qwen LoRA model repo.",
1121
  "Cosmos3-Nano has a 128-episode future-window compatibility package.",
1122
+ "Cosmos3-Super has a 128-episode base-weight Reasoner evaluation on the JSON task plus a camera-pose forward-dynamics contract audit; create a separate Cosmos model repo only after real Cosmos adapter/fine-tuned weights exist."
1123
  ],
1124
  "pending": [
1125
  "Use the final Qwen3 full-eval package as the current Qwen result; older Qwen package rows remain historical diagnostics for comparison.",
1126
+ "Promote Cosmos3 from Nano compatibility, Super base-weight evaluation, and the camera-pose forward-dynamics contract to true fine-tuning only after the pipeline-loaded packer check and one-sample overfit produce new weights."
1127
  ]
1128
  }
metrics/project_packet.json CHANGED
@@ -41,7 +41,7 @@
41
  "docs/data/scope_claims_audit.json",
42
  "docs/data/website_integrity.json"
43
  ],
44
- "readout": "The project status table and roadmap give the compact current-state summary. Single-episode task engineering, metrics, visualizations, public website integrity, mirror parity, same-split 128-episode baselines, the final selected-episode Qwen3-Omni diagnostic result, the Cosmos3-Nano compatibility package, and the Cosmos3-Super base-weight Reasoner evaluation are implemented; stronger action/subtask and real Cosmos fine-tuned model quality remain follow-ups."
45
  },
46
  {
47
  "step": 2,
 
41
  "docs/data/scope_claims_audit.json",
42
  "docs/data/website_integrity.json"
43
  ],
44
+ "readout": "The project status table and roadmap give the compact current-state summary. Single-episode task engineering, metrics, visualizations, public website integrity, mirror parity, same-split 128-episode baselines, the final selected-episode Qwen3-Omni diagnostic result, the Cosmos3-Nano compatibility package, the Cosmos3-Super base-weight Reasoner evaluation, and the Cosmos3-Super camera-pose forward-dynamics contract audit plus schema-only packer smoke are implemented; stronger action/subtask and real Cosmos fine-tuned model quality remain follow-ups."
45
  },
46
  {
47
  "step": 2,
metrics/project_status.json CHANGED
@@ -119,7 +119,7 @@
119
  "FOUNDATION_MODEL_PLAN.md",
120
  "docs/data/foundation_model_plan.json"
121
  ],
122
- "readout": "Qwen3-Omni remains the first trainable held-out LoRA baseline; Cosmos 3 is now represented by a verified Cosmos3-Nano future-window compatibility package plus a verified Cosmos3-Super base-weight Reasoner evaluation; OpenVLA/openpi/GR00T are policy candidates after action targets are explicit."
123
  },
124
  {
125
  "area": "Omni model extension contract",
@@ -244,6 +244,18 @@
244
  ],
245
  "readout": "Cosmos3-Super Reasoner now has a public-safe verified 448-window held-out evaluation on the same structured JSON task as Qwen3. It uses staged nv-community/Cosmos3-Super base weights through an 8-GPU vLLM server, not fine-tuned weights: JSON validity 0.5112, action macro-F1 0.0008, transition accuracy 0.3683, contact accuracy 0.3214, and object micro-F1 0.1370."
246
  },
247
  {
248
  "area": "Raw Xperience-10M redistribution",
249
  "status": "not_included",
@@ -276,11 +288,11 @@
276
  "Use docs/data/omni_model_comparison.json to compare both views: the single-episode/128-baseline/model-branch result layers and the model-family grouping for task heads, Qwen3-Omni LoRA, Cosmos3-Nano, and Cosmos3-Super.",
277
  "Use docs/data/omni_finetune_verified_result.json and the latest verified_public final Qwen package for current held-out results.",
278
  "The 128-episode aligned simple/NN baselines use metadata/text features from the derived Qwen JSONL export; they align the split and task ids but do not replace raw-modality baselines for trajectory, retrieval, reconstruction, or misalignment tasks.",
279
- "The Cosmos3-Nano future-window branch is verified as a compatibility adapter result, and Cosmos3-Super Reasoner is verified as a base-weight evaluation; one-episode Cosmos fine-tuning and full Cosmos adapter/diffusion-weight fine-tuning remain pending, so no Cosmos weight repo should be published yet.",
280
  "The current reconstruction task reconstructs feature vectors, not pixel-depth, mesh, NeRF, or Gaussian reconstruction.",
281
  "Audio is one of the synchronized source modalities in the current task representation.",
282
  "The audio ablation report compares audio/no-audio variants across all 12 task contracts in results/audio_ablation/.",
283
- "Foundation-model selection is explicit: Qwen3-Omni is the immediate trainable pilot, Cosmos 3 is the first world-model branch, and policy models such as OpenVLA/openpi/GR00T wait for action-target conversion.",
284
  "Future model branches should be added through the backbone registry and verified package contract, not as one-off result folders with incompatible metrics or publication rules.",
285
  "The Xperience Embodied Foundation Model is a future native-pretraining goal, not a completed model or current benchmark."
286
  ]
 
119
  "FOUNDATION_MODEL_PLAN.md",
120
  "docs/data/foundation_model_plan.json"
121
  ],
122
+ "readout": "Qwen3-Omni remains the first trainable held-out LoRA baseline; Cosmos 3 is now represented by a verified Cosmos3-Nano future-window compatibility package, a verified Cosmos3-Super base-weight Reasoner evaluation, and a Cosmos3-Super camera-pose proxy forward-dynamics contract audit plus schema-only packer smoke. The current target supports vision-velocity training under action conditioning, not supervised action-token prediction; OpenVLA/openpi/GR00T are policy candidates after robot-compatible action targets are explicit."
123
  },
124
  {
125
  "area": "Omni model extension contract",
 
244
  ],
245
  "readout": "Cosmos3-Super Reasoner now has a public-safe verified 448-window held-out evaluation on the same structured JSON task as Qwen3. It uses staged nv-community/Cosmos3-Super base weights through an 8-GPU vLLM server, not fine-tuned weights: JSON validity 0.5112, action macro-F1 0.0008, transition accuracy 0.3683, contact accuracy 0.3214, and object micro-F1 0.1370."
246
  },
247
+ {
248
+ "area": "Cosmos3-Super action-target contract",
249
+ "status": "ready_for_forward_dynamics_trainer_implementation",
250
+ "evidence": [
251
+ "scripts/omni/export_cosmos3_camera_pose_targets.py",
252
+ "scripts/omni/pack_cosmos3_super_action_batch.py",
253
+ "results/omni_finetune/xperience10m_cosmos3_camera_pose_targets_20260608/target_manifest.json",
254
+ "results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608/training_contract_audit.json",
255
+ "results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/packer_summary.json"
256
+ ],
257
+ "readout": "The selected 128-episode JSONL is augmented with 3,808/3,808 valid camera_pose proxy cosmos_action_target records from SLAM pose deltas. The schema-only packer smoke confirms the current forward_dynamics target should supervise noisy vision tokens under camera-pose conditioning; it does not supervise preds_action. Remaining work is a pipeline-loaded packer check, one-sample forward-dynamics overfit, and a separate policy/inverse target export before claiming action-token prediction."
258
+ },
259
  {
260
  "area": "Raw Xperience-10M redistribution",
261
  "status": "not_included",
 
288
  "Use docs/data/omni_model_comparison.json to compare both views: the single-episode/128-baseline/model-branch result layers and the model-family grouping for task heads, Qwen3-Omni LoRA, Cosmos3-Nano, and Cosmos3-Super.",
289
  "Use docs/data/omni_finetune_verified_result.json and the latest verified_public final Qwen package for current held-out results.",
290
  "The 128-episode aligned simple/NN baselines use metadata/text features from the derived Qwen JSONL export; they align the split and task ids but do not replace raw-modality baselines for trajectory, retrieval, reconstruction, or misalignment tasks.",
291
+ "The Cosmos3-Nano future-window branch is verified as a compatibility adapter result, Cosmos3-Super Reasoner is verified as a base-weight evaluation, and Cosmos3-Super camera-pose forward-dynamics targets now pass the contract audit plus a schema-only packer smoke; one-episode Cosmos fine-tuning and full Cosmos adapter/diffusion-weight fine-tuning remain pending, so no Cosmos weight repo should be published yet.",
292
  "The current reconstruction task reconstructs feature vectors, not pixel-depth, mesh, NeRF, or Gaussian reconstruction.",
293
  "Audio is one of the synchronized source modalities in the current task representation.",
294
  "The audio ablation report compares audio/no-audio variants across all 12 task contracts in results/audio_ablation/.",
295
+ "Foundation-model selection is explicit: Qwen3-Omni is the immediate trainable pilot, Cosmos 3 is the first world-model branch, Cosmos3-Super has a camera-pose proxy forward-dynamics contract ready for trainer implementation, and policy models such as OpenVLA/openpi/GR00T wait for robot-compatible action-target conversion.",
296
  "Future model branches should be added through the backbone registry and verified package contract, not as one-off result folders with incompatible metrics or publication rules.",
297
  "The Xperience Embodied Foundation Model is a future native-pretraining goal, not a completed model or current benchmark."
298
  ]
metrics/research_roadmap.json CHANGED
@@ -133,7 +133,7 @@
133
  "docs/data/foundation_model_plan.json",
134
  "research_roadmap_interactive.json"
135
  ],
136
- "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch; VLA/policy models wait for explicit action targets."
137
  },
138
  {
139
  "id": "robustness_run_64_128_episode",
 
133
  "docs/data/foundation_model_plan.json",
134
  "research_roadmap_interactive.json"
135
  ],
136
+ "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch. Cosmos3-Super now has camera-pose proxy forward-dynamics targets ready for trainer implementation, while VLA/policy models wait for robot-compatible action targets."
137
  },
138
  {
139
  "id": "robustness_run_64_128_episode",
metrics/research_roadmap_interactive.json CHANGED
@@ -2369,7 +2369,7 @@
2369
  "entry_condition": "The selected episodes are prepared or a 3-8 episode dry run is available for preprocessing checks.",
2370
  "id": "foundation_model_selection_matrix",
2371
  "name": "Foundation-Model Selection Matrix",
2372
- "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch; VLA/policy models wait for explicit action targets.",
2373
  "stage": "omni",
2374
  "status": "next"
2375
  },
 
2369
  "entry_condition": "The selected episodes are prepared or a 3-8 episode dry run is available for preprocessing checks.",
2370
  "id": "foundation_model_selection_matrix",
2371
  "name": "Foundation-Model Selection Matrix",
2372
+ "reader_takeaway": "Qwen3-Omni remains the first trainable held-out pilot; Cosmos 3 is the first world-model branch. Cosmos3-Super now has camera-pose proxy forward-dynamics targets ready for trainer implementation, while VLA/policy models wait for robot-compatible action targets.",
2373
  "stage": "omni",
2374
  "status": "next"
2375
  },
metrics/website_integrity.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "status": "pass",
3
- "generated_at_utc": "2026-06-07T15:47:32+00:00",
4
  "docs_root": "docs",
5
  "site_base": "/ropedia-xperience-10m-task-suite/",
6
  "summary": {
@@ -75,7 +75,7 @@
75
  "status": "pass",
76
  "reason": "The project overview should appear before the deeper progress ledger.",
77
  "overview_index": 67412,
78
- "evidence_index": 90477
79
  },
80
  {
81
  "name": "project_status_links_json",
@@ -153,8 +153,8 @@
153
  "status": "pass",
154
  "reason": "The evaluation protocol should appear before the deeper evidence ledger.",
155
  "overview_index": 67412,
156
- "protocol_index": 87160,
157
- "evidence_index": 90477
158
  },
159
  {
160
  "name": "evaluation_protocol_links_json",
@@ -292,7 +292,7 @@
292
  },
293
  {
294
  "path": "data/mirror_parity.json",
295
- "bytes": 410374,
296
  "top_level_type": "dict"
297
  },
298
  {
@@ -302,12 +302,12 @@
302
  },
303
  {
304
  "path": "data/omni_finetune_verified_result.json",
305
- "bytes": 3628,
306
  "top_level_type": "dict"
307
  },
308
  {
309
  "path": "data/omni_model_comparison.json",
310
- "bytes": 48296,
311
  "top_level_type": "dict"
312
  },
313
  {
@@ -322,12 +322,12 @@
322
  },
323
  {
324
  "path": "data/project_packet.json",
325
- "bytes": 8005,
326
  "top_level_type": "dict"
327
  },
328
  {
329
  "path": "data/project_status.json",
330
- "bytes": 16455,
331
  "top_level_type": "dict"
332
  },
333
  {
@@ -367,12 +367,12 @@
367
  },
368
  {
369
  "path": "data/research_roadmap.json",
370
- "bytes": 10133,
371
  "top_level_type": "dict"
372
  },
373
  {
374
  "path": "data/research_roadmap_interactive.json",
375
- "bytes": 143560,
376
  "top_level_type": "dict"
377
  },
378
  {
 
1
  {
2
  "status": "pass",
3
+ "generated_at_utc": "2026-06-07T17:31:44+00:00",
4
  "docs_root": "docs",
5
  "site_base": "/ropedia-xperience-10m-task-suite/",
6
  "summary": {
 
75
  "status": "pass",
76
  "reason": "The project overview should appear before the deeper progress ledger.",
77
  "overview_index": 67412,
78
+ "evidence_index": 90659
79
  },
80
  {
81
  "name": "project_status_links_json",
 
153
  "status": "pass",
154
  "reason": "The evaluation protocol should appear before the deeper evidence ledger.",
155
  "overview_index": 67412,
156
+ "protocol_index": 87218,
157
+ "evidence_index": 90659
158
  },
159
  {
160
  "name": "evaluation_protocol_links_json",
 
292
  },
293
  {
294
  "path": "data/mirror_parity.json",
295
+ "bytes": 345072,
296
  "top_level_type": "dict"
297
  },
298
  {
 
302
  },
303
  {
304
  "path": "data/omni_finetune_verified_result.json",
305
+ "bytes": 3768,
306
  "top_level_type": "dict"
307
  },
308
  {
309
  "path": "data/omni_model_comparison.json",
310
+ "bytes": 51589,
311
  "top_level_type": "dict"
312
  },
313
  {
 
322
  },
323
  {
324
  "path": "data/project_packet.json",
325
+ "bytes": 8098,
326
  "top_level_type": "dict"
327
  },
328
  {
329
  "path": "data/project_status.json",
330
+ "bytes": 18062,
331
  "top_level_type": "dict"
332
  },
333
  {
 
367
  },
368
  {
369
  "path": "data/research_roadmap.json",
370
+ "bytes": 10246,
371
  "top_level_type": "dict"
372
  },
373
  {
374
  "path": "data/research_roadmap_interactive.json",
375
+ "bytes": 143673,
376
  "top_level_type": "dict"
377
  },
378
  {
results/omni_finetune/OMNI_MODEL_COMPARISON.md CHANGED
@@ -1,6 +1,6 @@
1
  # Omni Model Comparison
2
 
3
- Generated: `2026-06-07T15:34:51+00:00`
4
 
5
  Compare only rows with the same scope and target. Single-episode raw-feature metrics, 128-episode metadata baselines, Qwen3 structured JSON metrics, and the two Cosmos3 targets answer different questions: Nano future-window retrieval versus Super structured JSON Reasoner evaluation.
6
 
@@ -16,7 +16,7 @@ Read the three rows this way:
16
 
17
  - Version 1 is the public-sample 12-task harness with minimal and neural heads.
18
  - Version 2 is the selected 128-episode same-split simple/NN baseline alignment.
19
- - Version 3 is the verified model-branch layer: the current final Qwen3-Omni LoRA package is the JSON-task diagnostic result, Cosmos3-Nano is a future-window compatibility result, and Cosmos3-Super Reasoner is a base-weight JSON-task evaluation rather than a new fine-tuned weight release.
20
 
21
  ## Model-Family Grouped View
22
 
@@ -24,7 +24,7 @@ Read the three rows this way:
24
  - Task-head baselines have both a one-episode public-sample run and a 128-episode same-split metadata/text run.
25
  - Qwen3-Omni has a one-episode sensor-adapter smoke test and separate 128-episode LoRA diagnostic packages; only the final 128-episode adapter belongs in the Qwen LoRA model repo.
26
  - Cosmos3-Nano has a 128-episode future-window compatibility package.
27
- - Cosmos3-Super has a 128-episode base-weight Reasoner evaluation on the JSON task plus a training-readiness probe; create a separate Cosmos model repo only after real Cosmos adapter/fine-tuned weights exist.
28
 
29
  ### Minimal and Neural Task Heads
30
 
@@ -64,7 +64,7 @@ The current 128-episode Cosmos result is a public-safe future-window compatibili
64
 
65
  ### Cosmos3-Super Reasoner
66
 
67
- Cosmos3-Super is now represented by a verified 448-window held-out Reasoner evaluation on the same JSON task as Qwen3. It uses staged base weights through vLLM, so it is a model-branch diagnostic, not a weight release. The readiness probe records why true Cosmos3-Super fine-tuning is not launched yet.
68
 
69
  - Weight repo policy: none for this run; staged base weights only, no new fine-tuned weights
70
 
@@ -72,6 +72,8 @@ Cosmos3-Super is now represented by a verified 448-window held-out Reasoner eval
72
  | --- | --- | --- | --- | --- | --- |
73
  | 1 episode | not_run | Cosmos3-Super One-Episode Fine-Tune | | | |
74
  | readiness | blocked_until_trainer_implemented | Cosmos3-Super Training Readiness Probe | 3808 windows/samples | diffusers_runtime_supported=True, chat_sft_supported=False, weights_updated=False | `results/omni_finetune/xperience10m_cosmos3_super_training_readiness_20260607/training_readiness.json` |
 
 
75
  | 128 episode | verified current | Cosmos3-Super Reasoner | 119 episodes, 3808 windows/samples, 448 eval | json_validity_rate=0.5112, action_macro_f1=0.0008, transition_accuracy=0.3683, contact_accuracy=0.3214 | `results/omni_finetune/verified_public/xperience10m_cosmos3_super_reasoner_128ep_test_full_20260607/verified_result_summary.json` |
76
 
77
  ## 128-Episode Task Baselines
@@ -105,4 +107,4 @@ Cosmos3-Super is now represented by a verified 448-window held-out Reasoner eval
105
  ## Pending
106
 
107
  - Use the final Qwen3 full-eval package as the current Qwen result; older Qwen package rows remain historical diagnostics for comparison.
108
- - Promote Cosmos3 from Nano compatibility and Super base-weight evaluation to true fine-tuning only after a dedicated Cosmos diffusion/action target packer and supervised loss produce new weights.
 
1
  # Omni Model Comparison
2
 
3
+ Generated: `2026-06-07T17:29:16+00:00`
4
 
5
  Compare only rows with the same scope and target. Single-episode raw-feature metrics, 128-episode metadata baselines, Qwen3 structured JSON metrics, and the two Cosmos3 targets answer different questions: Nano future-window retrieval versus Super structured JSON Reasoner evaluation.
6
 
 
16
 
17
  - Version 1 is the public-sample 12-task harness with minimal and neural heads.
18
  - Version 2 is the selected 128-episode same-split simple/NN baseline alignment.
19
+ - Version 3 is the verified model-branch layer: the current final Qwen3-Omni LoRA package is the JSON-task diagnostic result, Cosmos3-Nano is a future-window compatibility result, and Cosmos3-Super Reasoner is a base-weight JSON-task evaluation; Cosmos3-Super now has a camera-pose forward-dynamics contract audit and schema-only packer smoke, but no new fine-tuned weight release.
20
 
21
  ## Model-Family Grouped View
22
 
 
24
  - Task-head baselines have both a one-episode public-sample run and a 128-episode same-split metadata/text run.
25
  - Qwen3-Omni has a one-episode sensor-adapter smoke test and separate 128-episode LoRA diagnostic packages; only the final 128-episode adapter belongs in the Qwen LoRA model repo.
26
  - Cosmos3-Nano has a 128-episode future-window compatibility package.
27
+ - Cosmos3-Super has a 128-episode base-weight Reasoner evaluation on the JSON task plus a camera-pose forward-dynamics contract audit; create a separate Cosmos model repo only after real Cosmos adapter/fine-tuned weights exist.
28
 
29
  ### Minimal and Neural Task Heads
30
 
 
64
 
65
  ### Cosmos3-Super Reasoner
66
 
67
+ Cosmos3-Super is now represented by a verified 448-window held-out Reasoner evaluation on the same JSON task as Qwen3. It uses staged base weights through vLLM, so it is a model-branch diagnostic, not a weight release. A camera-pose proxy forward-dynamics target export now passes the contract audit and schema-only packer smoke; true Cosmos3-Super fine-tuning is still not launched until the pipeline-loaded packer check and one-sample overfit exist.
68
 
69
  - Weight repo policy: none for this run; staged base weights only, no new fine-tuned weights
70
 
 
72
  | --- | --- | --- | --- | --- | --- |
73
  | 1 episode | not_run | Cosmos3-Super One-Episode Fine-Tune | | | |
74
  | readiness | blocked_until_trainer_implemented | Cosmos3-Super Training Readiness Probe | 3808 windows/samples | diffusers_runtime_supported=True, chat_sft_supported=False, weights_updated=False | `results/omni_finetune/xperience10m_cosmos3_super_training_readiness_20260607/training_readiness.json` |
75
+ | action target contract | ready_for_forward_dynamics_trainer | Cosmos3-Super Camera-Pose Target Audit | 3808 windows/samples | domain_name=camera_pose, raw_action_dim=9, mode=forward_dynamics, valid_action_targets=3808, weights_updated=False | `results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_camera_pose_20260608/training_contract_audit.json` |
76
+ | batch packer | pass | Cosmos3-Super Action Batch Packer Smoke | 1 windows/samples | mode=forward_dynamics, loss_surface=vision_velocity_conditioned_on_camera_pose, pipeline_loaded=False, weights_updated=False | `results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/packer_summary.json` |
77
  | 128 episode | verified current | Cosmos3-Super Reasoner | 119 episodes, 3808 windows/samples, 448 eval | json_validity_rate=0.5112, action_macro_f1=0.0008, transition_accuracy=0.3683, contact_accuracy=0.3214 | `results/omni_finetune/verified_public/xperience10m_cosmos3_super_reasoner_128ep_test_full_20260607/verified_result_summary.json` |
78
 
79
  ## 128-Episode Task Baselines
 
107
  ## Pending
108
 
109
  - Use the final Qwen3 full-eval package as the current Qwen result; older Qwen package rows remain historical diagnostics for comparison.
110
+ - Promote Cosmos3 from Nano compatibility, Super base-weight evaluation, and the camera-pose forward-dynamics contract to true fine-tuning only after the pipeline-loaded packer check and one-sample overfit produce new weights.
results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/RUN_REPORT.md ADDED
@@ -0,0 +1,19 @@
1
+ # Cosmos3-Super Action Batch Packer
2
+
3
+ - Run id: `xperience10m_cosmos3_super_action_packer_schema_smoke_20260608`
4
+ - Row: `27c9fc42-2bb4-4737-b09c-08d2dd88aed4__ep4:qa:0`
5
+ - Mode: `forward_dynamics`
6
+ - Domain: `camera_pose`
7
+ - Raw action shape: `[8, 9]`
8
+ - Pipeline loaded: `False`
9
+ - Status: `pass`
10
+
11
+ ## Loss Surface
12
+
13
+ - `vision_velocity_conditioned_on_camera_pose`
14
+ - Cosmos3 forward_dynamics consumes raw_actions as conditioning and predicts noisy vision tokens. It does not supervise preds_action for this target mode.
15
+
16
+ ## Next Step
17
+
18
+ - Implement the one-sample overfit with a vision velocity/rectified-flow loss under camera-pose action conditioning.
19
+ - Add a separate policy or inverse-dynamics target export before claiming supervised action-token prediction.
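The one-sample overfit named in the Next Step list above depends on a vision velocity (rectified-flow) loss conditioned on camera-pose actions. The block below is a minimal sketch of that loss shape only, not the Cosmos3 pipeline code. It assumes a flat list of floats stands in for the vision latents, uses the interpolation convention `x_t = (1 - t) * clean + t * noise` with target velocity `noise - clean` (conventions differ between implementations), and the names `rectified_flow_loss` and `predict_velocity` are hypothetical.

```python
from typing import Callable, Sequence

Vector = list[float]
# Hypothetical model interface: (noisy_latent, t, raw_actions) -> predicted velocity.
VelocityFn = Callable[[Vector, float, Sequence[Sequence[float]]], Vector]


def rectified_flow_loss(
    clean_latent: Vector,
    noise: Vector,
    t: float,
    raw_actions: Sequence[Sequence[float]],
    predict_velocity: VelocityFn,
) -> float:
    """Mean squared error between predicted and target velocity at time t."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(clean_latent) != len(noise) or not clean_latent:
        raise ValueError("clean_latent and noise must be non-empty and equal length")

    # Assumed convention: straight-line path from clean latent (t=0) to noise (t=1).
    noisy = [(1.0 - t) * c + t * n for c, n in zip(clean_latent, noise)]
    # The velocity along that straight line is constant in t.
    target = [n - c for c, n in zip(clean_latent, noise)]

    # raw_actions are passed as conditioning only; no action loss is computed.
    predicted = predict_velocity(noisy, t, raw_actions)
    if len(predicted) != len(target):
        raise ValueError("predicted velocity has the wrong length")

    return sum((p - v) ** 2 for p, v in zip(predicted, target)) / len(target)
```

In an overfit check of this kind, the loss on the single selected row would be expected to approach zero as the trainable parameters fit that one sample; the action chunk (shape `[8, 9]` in this run) only conditions the prediction, which matches `action_loss_expected: false` in the packer summary.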
results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/packer_summary.json ADDED
@@ -0,0 +1,136 @@
1
+ {
2
+ "run_id": "xperience10m_cosmos3_super_action_packer_schema_smoke_20260608",
3
+ "run_kind": "cosmos3_super_action_batch_packer",
4
+ "started_at_unix": 1780852840.3893492,
5
+ "finished_at_unix": 1780852842.8621027,
6
+ "elapsed_seconds": 2.4727537631988525,
7
+ "dataset_jsonl": "/home/cy/Ropedia/ropedia-episode-task-suite/results/omni_finetune/xperience10m_cosmos3_camera_pose_targets_20260608/dataset_with_cosmos_actions.jsonl",
8
+ "backbone_config": "/home/cy/Ropedia/ropedia-episode-task-suite/configs/omni_backbones/cosmos3_super_reasoner.json",
9
+ "backbone": {
10
+ "id": "cosmos3_super_reasoner",
11
+ "display_name": "Cosmos3-Super Reasoner",
12
+ "status": "implemented",
13
+ "model_family": "Cosmos3 / physical-world foundation models",
14
+ "default_model_id": "nv-community/Cosmos3-Super",
15
+ "local_model_env": "COSMOS3_SUPER_MODEL_DIR",
16
+ "dataset_contract": "xperience10m_episode_json_qa_v1",
17
+ "training_objective": "zero_shot_structured_episode_understanding_json_qa_via_vllm_reasoner",
18
+ "split_policy": {
19
+ "unit": "episode",
20
+ "default_counts": {
21
+ "train": 96,
22
+ "val": 16,
23
+ "test": 16
24
+ },
25
+ "leakage_guard": "uses the same 96/16/16 selected episode split as the Qwen3-Omni LoRA branch; no Super weights are updated"
26
+ },
27
+ "modalities": {
28
+ "direct_inputs": [
29
+ "multi-camera rendered mosaic video",
30
+ "language prompt and label options"
31
+ ],
32
+ "conditioning_inputs": [
33
+ "prompt-side task schema and episode/window metadata"
34
+ ],
35
+ "targets": [
36
+ "structured action/subtask/contact/transition/object JSON"
37
+ ],
38
+ "excluded_inputs": [
39
+ "visualization.rrd",
40
+ "raw annotation HDF5",
41
+ "audio in the current vLLM Reasoner path"
42
+ ]
43
+ },
44
+ "entrypoints": {
45
+ "selection_manifest": "scripts/omni/build_selection_episode_manifest.py",
46
+ "export": "scripts/omni/parallel_export_qwen3_omni_action_dataset.py",
47
+ "neutral_index": "scripts/omni/export_model_neutral_window_index.py",
48
+ "action_target_export": "scripts/omni/export_cosmos3_camera_pose_targets.py",
49
+ "action_batch_packer": "scripts/omni/pack_cosmos3_super_action_batch.py",
50
+ "train": "",
51
+ "train_contract_audit": "scripts/omni/audit_cosmos3_super_training_contract.py",
52
+ "train_probe": "scripts/omni/probe_cosmos3_super_training_readiness.py",
53
+ "eval": "scripts/omni/eval_cosmos3_super_reasoner.py",
54
+ "launcher": "scripts/omni/run_cosmos3_super_reasoner_eval.sh",
55
+ "validate": "scripts/omni/validate_omni_finetune_run.py"
56
+ },
57
+ "primary_metrics": [
58
+ "json_validity_rate",
59
+ "action_macro_f1",
60
+ "subtask_accuracy",
61
+ "transition_accuracy",
62
+ "next_action_accuracy",
63
+ "contact_accuracy",
64
+ "object_micro_f1",
65
+ "held_out_episode_count"
66
+ ],
67
+ "artifact_contract": {
68
+ "checkpoint_gate": "base_weight_vllm_reasoner_setup_metadata",
69
+ "required_eval_files": [
70
+ "metrics.json",
71
+ "predictions.jsonl",
72
+ "predictions.csv",
73
+ "per_class_metrics.csv",
74
+ "confusion_matrix.csv",
75
+ "server_info.json",
76
+ "RUN_REPORT.md"
77
+ ],
78
+ "required_training_files": [
79
+ "training_metadata.json",
80
+ "progress.jsonl"
81
+ ],
82
+ "public_package_allowed": [
83
+ "metrics",
84
+ "predictions",
85
+ "confusion matrices",
86
+ "run reports",
87
+ "server/model setup metadata",
88
+ "episode and dataset manifests",
89
+ "validation summaries"
90
+ ],
91
+ "public_package_forbidden": [
92
+ "raw MP4",
93
+ "annotation HDF5",
94
+ "Rerun RRD",
95
+ "base-model weights",
96
+ "fine-tuned weights",
97
+ "checkpoints",
98
+ "large archives"
99
+ ]
100
+ },
101
+ "extension_requirements": [
102
+ "This branch evaluates staged Cosmos3-Super Reasoner base weights through vLLM on the 128-episode held-out JSON task; it does not fine-tune or release new Cosmos weights.",
103
+ "Run scripts/omni/probe_cosmos3_super_training_readiness.py before any Cosmos3-Super adapter launch; the probe must have no blockers before train can be filled.",
104
+ "Create a separate Cosmos3-Super adapter/model repository only after a real fine-tuning run produces new adapter or checkpoint weights.",
105
+ "Keep it separate from the Cosmos3-Nano future-window compatibility branch, which answers a different world-model retrieval target."
106
+ ]
107
+ },
108
+ "status": "pass",
109
+ "row_contract": {
110
+ "row_id": "27c9fc42-2bb4-4737-b09c-08d2dd88aed4__ep4:qa:0",
111
+ "episode_id": "27c9fc42-2bb4-4737-b09c-08d2dd88aed4__ep4",
112
+ "split": "train",
113
+ "target_key": "cosmos_action_target",
114
+ "mode": "forward_dynamics",
115
+ "domain_name": "camera_pose",
116
+ "chunk_size": 8,
117
+ "raw_action_dim": 9,
118
+ "raw_actions_shape": [
119
+ 8,
120
+ 9
121
+ ],
122
+ "video_path": "/home/cy/Ropedia/ropedia-episode-task-suite/results/omni_finetune/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_dataset/shards/shard_00/media/27c9fc42-2bb4-4737-b09c-08d2dd88aed4__ep4/27c9fc42-2bb4-4737-b09c-08d2dd88aed4__ep4_w00000_ctx0_119_mosaic.mp4",
123
+ "video_path_exists": true,
124
+ "loss_surface": "vision_velocity_conditioned_on_camera_pose",
125
+ "action_loss_expected": false,
126
+ "interpretation": "Cosmos3 forward_dynamics consumes raw_actions as conditioning and predicts noisy vision tokens. It does not supervise preds_action for this target mode.",
127
+ "issues": []
128
+ },
129
+ "pack_result": {
130
+ "status": "schema_ready_pipeline_not_loaded",
131
+ "pipeline_loaded": false,
132
+ "loss_surface": "vision_velocity_conditioned_on_camera_pose",
133
+ "action_loss_expected": false
134
+ },
135
+ "weights_updated": false
136
+ }
results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/progress.jsonl ADDED
@@ -0,0 +1,3 @@
1
+ {"event": "start", "run_id": "xperience10m_cosmos3_super_action_packer_schema_smoke_20260608", "time": 1780852840.3893492}
2
+ {"event": "row_selected", "row_id": "27c9fc42-2bb4-4737-b09c-08d2dd88aed4__ep4:qa:0", "time": 1780852842.8619707}
3
+ {"event": "complete", "status": "pass", "time": 1780852842.8629975}
results/omni_finetune/xperience10m_cosmos3_super_action_packer_schema_smoke_20260608/training_metadata.json ADDED
@@ -0,0 +1,8 @@
1
+ {
2
+ "run_id": "xperience10m_cosmos3_super_action_packer_schema_smoke_20260608",
3
+ "run_kind": "cosmos3_super_action_batch_packer",
4
+ "weights_updated": false,
5
+ "checkpoint_dir": null,
6
+ "status": "pass",
7
+ "loss_surface": "vision_velocity_conditioned_on_camera_pose"
8
+ }
results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_local/RUN_REPORT.md ADDED
@@ -0,0 +1,35 @@
1
+ # Cosmos3-Super Training Contract Audit
2
+
3
+ - Run id: `xperience10m_cosmos3_super_training_contract_audit_local`
4
+ - Dataset: `/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/results/omni_finetune/dataset.jsonl`
5
+ - Rows: `128`
6
+ - Rows with Cosmos action targets: `0`
7
+ - Valid Cosmos action targets: `0`
8
+ - Status: `blocked_missing_cosmos_action_targets`
9
+ - Weights updated: `False`
10
+
11
+ ## Blockers
12
+
13
+ - dataset has no cosmos_action_target/cosmos3_action_target/action_target records; semantic JSON labels cannot be used as Cosmos continuous action latents
14
+
15
+ ## Required Target Schema
16
+
17
+ ```json
18
+ {
19
+ "cosmos_action_target": {
20
+ "mode": "policy|forward_dynamics|inverse_dynamics",
21
+ "domain_name": "one Cosmos3 embodiment domain supported by CosmosActionCondition",
22
+ "chunk_size": "positive integer action transition count",
23
+ "raw_actions": "required for forward_dynamics; list[list[float]] with shape [T, raw_action_dim]",
24
+ "video": "required for inverse_dynamics, or image/video conditioning for policy and forward_dynamics",
25
+ "resolution_tier": "optional; one of 256, 480, 704, 720",
26
+ "view_point": "optional; ego_view|third_person_view|wrist_view|concat_view"
27
+ }
28
+ }
29
+ ```
30
+
31
+ ## Next Steps
32
+
33
+ - Export Cosmos-native action targets from Xperience annotations or mocap/pose/contact signals into the required cosmos_action_target schema.
34
+ - Implement a one-sample batch packer that calls Cosmos3OmniPipeline.prepare_latents and the static segment helpers, then computes MSE/rectified-flow loss over preds_action for noisy action tokens.
35
+ - Run a one-episode overfit before scheduling a 96/16/16 Super LoRA run; only publish a Cosmos model repo after new adapter/checkpoint weights exist.
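The required target schema in the audit report above can be checked record by record. The following is a minimal sketch derived only from the schema text; it is not the repository's `audit_cosmos3_super_training_contract.py`. The function name `validate_cosmos_action_target` is hypothetical, and the rule that the number of `raw_actions` rows equals `chunk_size` is an assumption suggested by the packer summary (`chunk_size` 8, shape `[8, 9]`), not something the schema states.

```python
from typing import Any

ALLOWED_MODES = {"policy", "forward_dynamics", "inverse_dynamics"}
ALLOWED_RESOLUTION_TIERS = {256, 480, 704, 720}
ALLOWED_VIEW_POINTS = {"ego_view", "third_person_view", "wrist_view", "concat_view"}


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_cosmos_action_target(target: dict[str, Any]) -> list[str]:
    """Return a list of issues; an empty list means the record passed."""
    issues: list[str] = []

    mode = target.get("mode")
    if mode not in ALLOWED_MODES:
        issues.append(f"unsupported mode: {mode!r}")

    domain_name = target.get("domain_name")
    if not isinstance(domain_name, str) or not domain_name:
        issues.append("domain_name must be a non-empty string")

    chunk_size = target.get("chunk_size")
    chunk_ok = isinstance(chunk_size, int) and not isinstance(chunk_size, bool) and chunk_size > 0
    if not chunk_ok:
        issues.append("chunk_size must be a positive integer")

    raw_actions = target.get("raw_actions")
    if raw_actions is None:
        if mode == "forward_dynamics":
            issues.append("raw_actions is required for forward_dynamics")
    else:
        rows_ok = (
            isinstance(raw_actions, list)
            and len(raw_actions) > 0
            and all(isinstance(r, list) and r and all(_is_number(x) for x in r) for r in raw_actions)
        )
        if not rows_ok:
            issues.append("raw_actions must be a non-empty list[list[float]]")
        else:
            if len({len(r) for r in raw_actions}) != 1:
                issues.append("raw_actions rows must share one raw_action_dim")
            # Assumption: T (row count) equals chunk_size.
            if chunk_ok and len(raw_actions) != chunk_size:
                issues.append("raw_actions row count does not match chunk_size")

    if mode == "inverse_dynamics" and not target.get("video"):
        issues.append("video is required for inverse_dynamics")

    tier = target.get("resolution_tier")
    if tier is not None and tier not in ALLOWED_RESOLUTION_TIERS:
        issues.append(f"unsupported resolution_tier: {tier!r}")

    view_point = target.get("view_point")
    if view_point is not None and view_point not in ALLOWED_VIEW_POINTS:
        issues.append(f"unsupported view_point: {view_point!r}")

    return issues
```

A record shaped like the one in the packer smoke (`mode` `forward_dynamics`, `domain_name` `camera_pose`, eight rows of nine floats) would return an empty issue list under these assumed rules, while the 128-row local dataset in this audit has no target records to validate at all, which is why its status is `blocked_missing_cosmos_action_targets`.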
results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_local/progress.jsonl ADDED
@@ -0,0 +1,3 @@
1
+ {"event": "start", "run_id": "xperience10m_cosmos3_super_training_contract_audit_local", "time": 1780849944.267908}
2
+ {"event": "dataset_loaded", "rows": 128, "time": 1780849944.278147}
3
+ {"event": "complete", "status": "blocked_missing_cosmos_action_targets", "time": 1780849944.2802079}
results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_local/training_contract_audit.json ADDED
@@ -0,0 +1,78 @@
1
+ {
2
+ "run_id": "xperience10m_cosmos3_super_training_contract_audit_local",
3
+ "run_kind": "cosmos3_super_training_contract_audit",
4
+ "started_at_unix": 1780849944.267908,
5
+ "finished_at_unix": 1780849944.279339,
6
+ "elapsed_seconds": 0.011430978775024414,
7
+ "workspace": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy",
8
+ "dataset_jsonl": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/results/omni_finetune/dataset.jsonl",
9
+ "sample_limit": 0,
10
+ "backbone_config": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/configs/omni_backbones/cosmos3_super_reasoner.json",
11
+ "backbone": {
12
+ "id": "cosmos3_super_reasoner",
13
+ "display_name": "Cosmos3-Super Reasoner",
14
+ "training_objective": "zero_shot_structured_episode_understanding_json_qa_via_vllm_reasoner"
15
+ },
16
+ "model": {
17
+ "provided": false
18
+ },
19
+ "dataset": {
20
+ "num_rows": 128,
21
+ "split_counts": {
22
+ "train": 128
23
+ },
24
+ "episode_split_counts": {
25
+ "train": 1
26
+ },
27
+ "rows_with_video": 128,
28
+ "missing_json_answer": 0,
29
+ "missing_json_fields": {},
30
+ "rows_with_action_target": 0,
31
+ "valid_action_targets": 0,
32
+ "target_key_counts": {},
33
+ "target_mode_counts": {},
34
+ "target_issue_counts": {},
35
+ "target_issue_examples": []
36
+ },
37
+ "decision": {
38
+ "status": "blocked_missing_cosmos_action_targets",
39
+ "weights_updated": false,
40
+ "blockers": [
41
+ "dataset has no cosmos_action_target/cosmos3_action_target/action_target records; semantic JSON labels cannot be used as Cosmos continuous action latents"
42
+ ],
43
+ "warnings": [
44
+ "model_dir not provided; model action_gen/action_dim could not be verified"
45
+ ],
46
+ "required_target_schema": {
47
+ "cosmos_action_target": {
48
+ "mode": "policy|forward_dynamics|inverse_dynamics",
49
+ "domain_name": "one Cosmos3 embodiment domain supported by CosmosActionCondition",
50
+ "chunk_size": "positive integer action transition count",
51
+ "raw_actions": "required for forward_dynamics; list[list[float]] with shape [T, raw_action_dim]",
52
+ "video": "required for inverse_dynamics, or image/video conditioning for policy and forward_dynamics",
53
+ "resolution_tier": "optional; one of 256, 480, 704, 720",
54
+ "view_point": "optional; ego_view|third_person_view|wrist_view|concat_view"
55
+ }
56
+ },
57
+ "trainer_contract": {
58
+ "diffusers_classes": [
59
+ "Cosmos3OmniPipeline",
60
+ "Cosmos3OmniTransformer",
61
+ "CosmosActionCondition"
62
+ ],
63
+ "packing_helpers": [
64
+ "Cosmos3OmniPipeline.prepare_latents",
65
+ "Cosmos3OmniPipeline._prepare_text_segment",
66
+ "Cosmos3OmniPipeline._prepare_vision_segment",
67
+ "Cosmos3OmniPipeline._prepare_action_segment"
68
+ ],
69
+ "forward_outputs": "Cosmos3OmniTransformer.forward returns (preds_vision, preds_sound, preds_action); action LoRA needs supervised loss against raw continuous action tokens, not JSON strings.",
70
+ "lora_targets": "use checkpoint-declared q_proj_moe_gen,k_proj_moe_gen,v_proj_moe_gen,o_proj_moe_gen unless a new audited config overrides them"
71
+ },
72
+ "next_steps": [
73
+ "Export Cosmos-native action targets from Xperience annotations or mocap/pose/contact signals into the required cosmos_action_target schema.",
74
+ "Implement a one-sample batch packer that calls Cosmos3OmniPipeline.prepare_latents and the static segment helpers, then computes MSE/rectified-flow loss over preds_action for noisy action tokens.",
75
+ "Run a one-episode overfit before scheduling a 96/16/16 Super LoRA run; only publish a Cosmos model repo after new adapter/checkpoint weights exist."
76
+ ]
77
+ }
78
+ }
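
To make `required_target_schema` concrete, here is a sketch of one dataset row that would satisfy it, plus a condensed checker. The checker is a simplified and slightly stricter stand-in for the audit script's validation, not a copy of it. The row id, the `camera_pose` domain value, and the 9-wide action rows are illustrative assumptions.

```python
import math

ACTION_MODES = {"policy", "forward_dynamics", "inverse_dynamics"}


def is_numeric_matrix(value):
    """True for a non-empty rectangular list of lists of finite numbers."""
    if not isinstance(value, list) or not value:
        return False
    width = None
    for row in value:
        if not isinstance(row, list) or not row:
            return False
        if width is None:
            width = len(row)
        elif len(row) != width:
            return False
        if not all(isinstance(x, (int, float)) and math.isfinite(x) for x in row):
            return False
    return True


def check_target(target):
    """Return a list of schema issues; an empty list means the target looks valid."""
    issues = [f"missing field: {name}" for name in ("mode", "domain_name", "chunk_size") if name not in target]
    if issues:
        return issues
    if target["mode"] not in ACTION_MODES:
        issues.append(f"unsupported mode: {target['mode']!r}")
    if not isinstance(target["chunk_size"], int) or target["chunk_size"] < 1:
        issues.append("chunk_size must be a positive integer")
    if target["mode"] == "forward_dynamics" and not is_numeric_matrix(target.get("raw_actions")):
        issues.append("forward_dynamics requires numeric raw_actions shaped [T, raw_action_dim]")
    return issues


# Hypothetical row: two action transitions, each a 9-dim vector.
example_row = {
    "id": "hypothetical-window-0001",
    "split": "train",
    "cosmos_action_target": {
        "mode": "forward_dynamics",
        "domain_name": "camera_pose",  # illustrative value
        "chunk_size": 2,
        "raw_actions": [[0.0] * 9, [0.1] * 9],
        "view_point": "ego_view",
    },
}

good = check_target(example_row["cosmos_action_target"])
bad = check_target({"mode": "forward_dynamics", "domain_name": "camera_pose", "chunk_size": 2})
```

The second call omits `raw_actions`, which the schema marks as required for `forward_dynamics`, so it is the only issue reported for that target.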
results/omni_finetune/xperience10m_cosmos3_super_training_contract_audit_local/training_metadata.json ADDED
@@ -0,0 +1,47 @@
1
+ {
2
+ "run_id": "xperience10m_cosmos3_super_training_contract_audit_local",
3
+ "run_kind": "cosmos3_super_training_contract_audit",
4
+ "weights_updated": false,
5
+ "checkpoint_dir": null,
6
+ "decision": {
7
+ "status": "blocked_missing_cosmos_action_targets",
8
+ "weights_updated": false,
9
+ "blockers": [
10
+ "dataset has no cosmos_action_target/cosmos3_action_target/action_target records; semantic JSON labels cannot be used as Cosmos continuous action latents"
11
+ ],
12
+ "warnings": [
13
+ "model_dir not provided; model action_gen/action_dim could not be verified"
14
+ ],
15
+ "required_target_schema": {
16
+ "cosmos_action_target": {
17
+ "mode": "policy|forward_dynamics|inverse_dynamics",
18
+ "domain_name": "one Cosmos3 embodiment domain supported by CosmosActionCondition",
19
+ "chunk_size": "positive integer action transition count",
20
+ "raw_actions": "required for forward_dynamics; list[list[float]] with shape [T, raw_action_dim]",
21
+ "video": "required for inverse_dynamics, or image/video conditioning for policy and forward_dynamics",
22
+ "resolution_tier": "optional; one of 256, 480, 704, 720",
23
+ "view_point": "optional; ego_view|third_person_view|wrist_view|concat_view"
24
+ }
25
+ },
26
+ "trainer_contract": {
27
+ "diffusers_classes": [
28
+ "Cosmos3OmniPipeline",
29
+ "Cosmos3OmniTransformer",
30
+ "CosmosActionCondition"
31
+ ],
32
+ "packing_helpers": [
33
+ "Cosmos3OmniPipeline.prepare_latents",
34
+ "Cosmos3OmniPipeline._prepare_text_segment",
35
+ "Cosmos3OmniPipeline._prepare_vision_segment",
36
+ "Cosmos3OmniPipeline._prepare_action_segment"
37
+ ],
38
+ "forward_outputs": "Cosmos3OmniTransformer.forward returns (preds_vision, preds_sound, preds_action); action LoRA needs supervised loss against raw continuous action tokens, not JSON strings.",
39
+ "lora_targets": "use checkpoint-declared q_proj_moe_gen,k_proj_moe_gen,v_proj_moe_gen,o_proj_moe_gen unless a new audited config overrides them"
40
+ },
41
+ "next_steps": [
42
+ "Export Cosmos-native action targets from Xperience annotations or mocap/pose/contact signals into the required cosmos_action_target schema.",
43
+ "Implement a one-sample batch packer that calls Cosmos3OmniPipeline.prepare_latents and the static segment helpers, then computes MSE/rectified-flow loss over preds_action for noisy action tokens.",
44
+ "Run a one-episode overfit before scheduling a 96/16/16 Super LoRA run; only publish a Cosmos model repo after new adapter/checkpoint weights exist."
45
+ ]
46
+ }
47
+ }
scripts/omni/audit_cosmos3_super_training_contract.py ADDED
@@ -0,0 +1,406 @@
1
+ #!/usr/bin/env python3
2
+ """Audit whether a dataset can drive real Cosmos3-Super action fine-tuning.
3
+
4
+ The existing Cosmos3-Super Reasoner run evaluates base weights on structured
5
+ JSON QA. A true Cosmos3 Diffusers fine-tune is a different contract: the
6
+ transformer action path predicts continuous embodiment-domain action vectors,
7
+ not semantic JSON labels. This guard makes that distinction explicit and fails
8
+ closed until the exported Xperience-10M windows contain Cosmos-native action
9
+ targets.
10
+ """
11
+
12
+ from __future__ import annotations
13
+
14
+ import argparse
15
+ import json
16
+ import math
17
+ import time
18
+ from collections import Counter
19
+ from pathlib import Path
20
+ from typing import Any
21
+
22
+ from qwen3_omni_dataset_utils import load_jsonl
23
+
24
+
25
+ REQUIRED_JSON_QA_FIELDS = {
26
+ "action",
27
+ "subtask",
28
+ "objects",
29
+ "contact",
30
+ "transition",
31
+ "next_action",
32
+ "evidence_window",
33
+ }
34
+
35
+ ACTION_TARGET_KEYS = (
36
+ "cosmos_action_target",
37
+ "cosmos3_action_target",
38
+ "cosmos_action_condition",
39
+ "action_target",
40
+ )
41
+
42
+ REQUIRED_ACTION_TARGET_FIELDS = {
43
+ "mode",
44
+ "domain_name",
45
+ "chunk_size",
46
+ }
47
+
48
+ ACTION_MODES = {"policy", "forward_dynamics", "inverse_dynamics"}
49
+
50
+ REQUIRED_SCHEMA = {
51
+ "cosmos_action_target": {
52
+ "mode": "policy|forward_dynamics|inverse_dynamics",
53
+ "domain_name": "one Cosmos3 embodiment domain supported by CosmosActionCondition",
54
+ "chunk_size": "positive integer action transition count",
55
+ "raw_actions": "required for forward_dynamics; list[list[float]] with shape [T, raw_action_dim]",
56
+ "video": "required for inverse_dynamics, or image/video conditioning for policy and forward_dynamics",
57
+ "resolution_tier": "optional; one of 256, 480, 704, 720",
58
+ "view_point": "optional; ego_view|third_person_view|wrist_view|concat_view",
59
+ }
60
+ }
61
+
62
+
63
+ def parse_args() -> argparse.Namespace:
64
+ workspace_default = Path(__file__).resolve().parents[2]
65
+ parser = argparse.ArgumentParser(description=__doc__)
66
+ parser.add_argument("--workspace", type=Path, default=workspace_default)
67
+ parser.add_argument("--dataset-jsonl", type=Path, required=True)
68
+ parser.add_argument("--model-dir", type=Path)
69
+ parser.add_argument(
70
+ "--backbone-config",
71
+ type=Path,
72
+ default=workspace_default / "configs" / "omni_backbones" / "cosmos3_super_reasoner.json",
73
+ )
74
+ parser.add_argument("--run-id", default="xperience10m_cosmos3_super_training_contract_audit")
75
+ parser.add_argument("--output-dir", type=Path)
76
+ parser.add_argument("--sample-limit", type=int, default=0)
77
+ parser.add_argument(
78
+ "--require-trainable",
79
+ action="store_true",
80
+ help="Exit non-zero unless the dataset/model contract is ready for a real trainer launch.",
81
+ )
82
+ return parser.parse_args()
83
+
84
+
85
+ def read_json(path: Path | None) -> dict[str, Any]:
86
+ if path is None or not path.exists():
87
+ return {}
88
+ return json.loads(path.read_text(encoding="utf-8"))
89
+
90
+
91
+ def write_json(path: Path, payload: dict[str, Any]) -> None:
92
+ path.parent.mkdir(parents=True, exist_ok=True)
93
+ path.write_text(json.dumps(payload, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
94
+
95
+
96
+ def append_jsonl(path: Path, payload: dict[str, Any]) -> None:
97
+ path.parent.mkdir(parents=True, exist_ok=True)
98
+ with path.open("a", encoding="utf-8") as handle:
99
+ handle.write(json.dumps(payload, sort_keys=True, ensure_ascii=False) + "\n")
100
+
101
+
102
+ def numeric_matrix(value: Any) -> tuple[bool, tuple[int, int] | None]:
103
+ if not isinstance(value, list) or not value:
104
+ return False, None
105
+ width: int | None = None
106
+ for row in value:
107
+ if not isinstance(row, list) or not row:
108
+ return False, None
109
+ if width is None:
110
+ width = len(row)
111
+ elif len(row) != width:
112
+ return False, None
113
+ for item in row:
114
+ if not isinstance(item, (int, float)) or not math.isfinite(float(item)):
115
+ return False, None
116
+ return True, (len(value), int(width or 0))
117
+
118
+
119
+ def find_action_target(row: dict[str, Any]) -> tuple[str | None, dict[str, Any] | None]:
120
+ for key in ACTION_TARGET_KEYS:
121
+ value = row.get(key)
122
+ if isinstance(value, dict):
123
+ return key, value
124
+ return None, None
125
+
126
+
127
+ def media_has_video(row: dict[str, Any]) -> bool:
128
+ media = row.get("media") if isinstance(row.get("media"), dict) else {}
129
+ if media.get("mosaic_video_path") or row.get("primary_video_path"):
130
+ return True
131
+ video_paths = media.get("video_paths")
132
+ return isinstance(video_paths, list) and any(isinstance(item, dict) and item.get("path") for item in video_paths)
133
+
134
+
135
+ def validate_action_target(target: dict[str, Any]) -> list[str]:
136
+ issues: list[str] = []
137
+ missing = sorted(field for field in REQUIRED_ACTION_TARGET_FIELDS if field not in target)
138
+ if missing:
139
+ issues.append(f"missing fields: {missing}")
140
+ return issues
141
+
142
+ mode = str(target.get("mode"))
143
+ if mode not in ACTION_MODES:
144
+ issues.append(f"unsupported mode: {mode!r}")
145
+
146
+ try:
147
+ chunk_size = int(target.get("chunk_size"))
148
+ if chunk_size < 1:
149
+ issues.append("chunk_size must be >= 1")
150
+ except Exception:
151
+ issues.append("chunk_size must be an integer")
152
+ chunk_size = 0
153
+
154
+ if not str(target.get("domain_name") or "").strip():
155
+ issues.append("domain_name is empty")
156
+
157
+ raw_actions = target.get("raw_actions")
158
+ if mode == "forward_dynamics":
159
+ ok, shape = numeric_matrix(raw_actions)
160
+ if not ok:
161
+ issues.append("forward_dynamics requires numeric raw_actions shaped [T, raw_action_dim]")
162
+ elif shape and shape[0] < 1:
163
+ issues.append("raw_actions must include at least one action row")
164
+ elif raw_actions is not None:
165
+ ok, _ = numeric_matrix(raw_actions)
166
+ if not ok:
167
+ issues.append("raw_actions is present but is not a numeric matrix")
168
+
169
+ return issues
170
+
171
+
172
+ def model_summary(model_dir: Path | None) -> dict[str, Any]:
173
+ if model_dir is None:
174
+ return {"provided": False}
175
+ model_dir = model_dir.expanduser().resolve()
176
+ config = read_json(model_dir / "config.json")
177
+ transformer_config = read_json(model_dir / "transformer" / "config.json")
178
+ inner = ((config.get("model") or {}).get("config") or {})
179
+ return {
180
+ "provided": True,
181
+ "path": str(model_dir),
182
+ "exists": model_dir.exists(),
183
+ "model_type": config.get("model_type"),
184
+ "architectures": config.get("architectures"),
185
+ "pipeline_class": read_json(model_dir / "model_index.json").get("_class_name"),
186
+ "transformer_class": transformer_config.get("_class_name"),
187
+ "action_gen": transformer_config.get("action_gen", inner.get("action_gen")),
188
+ "action_dim": transformer_config.get("action_dim", inner.get("action_dim")),
189
+ "lora_enabled_default": inner.get("lora_enabled"),
190
+ "lora_rank_default": inner.get("lora_rank"),
191
+ "lora_alpha_default": inner.get("lora_alpha"),
192
+ "lora_target_modules_default": inner.get("lora_target_modules"),
193
+ "rectified_flow_training_config_keys": sorted(
194
+ ((inner.get("rectified_flow_training_config") or {}).keys())
195
+ ),
196
+ }
197
+
198
+
199
+ def dataset_summary(rows: list[dict[str, Any]]) -> dict[str, Any]:
200
+ split_counts = Counter(str(row.get("split", "unspecified")) for row in rows)
201
+ episodes_by_split: dict[str, set[str]] = {}
202
+ missing_json_answer = 0
203
+ missing_json_fields = Counter()
204
+ rows_with_video = 0
205
+ rows_with_action_target = 0
206
+ valid_action_targets = 0
207
+ target_key_counts = Counter()
208
+ target_mode_counts = Counter()
209
+ target_issue_counts = Counter()
210
+ examples: list[dict[str, Any]] = []
211
+
212
+ for row in rows:
213
+ split = str(row.get("split", "unspecified"))
214
+ episodes_by_split.setdefault(split, set()).add(str(row.get("episode_id", "")))
215
+ answer = row.get("answer_json") if isinstance(row.get("answer_json"), dict) else {}
216
+ if not answer:
217
+ missing_json_answer += 1
218
+ for field in REQUIRED_JSON_QA_FIELDS:
219
+ if field not in answer:
220
+ missing_json_fields[field] += 1
221
+ if media_has_video(row):
222
+ rows_with_video += 1
223
+
224
+ key, target = find_action_target(row)
225
+ if target is None:
226
+ continue
227
+ rows_with_action_target += 1
228
+ target_key_counts[str(key)] += 1
229
+ target_mode_counts[str(target.get("mode", "missing"))] += 1
230
+ issues = validate_action_target(target)
231
+ if issues:
232
+ for issue in issues:
233
+ target_issue_counts[issue] += 1
234
+ if len(examples) < 5:
235
+ examples.append({"id": row.get("id"), "target_key": key, "issues": issues})
236
+ else:
237
+ valid_action_targets += 1
238
+
239
+ return {
240
+ "num_rows": len(rows),
241
+ "split_counts": dict(split_counts),
242
+ "episode_split_counts": {split: len(episodes) for split, episodes in sorted(episodes_by_split.items())},
243
+ "rows_with_video": rows_with_video,
244
+ "missing_json_answer": missing_json_answer,
245
+ "missing_json_fields": dict(missing_json_fields),
246
+ "rows_with_action_target": rows_with_action_target,
247
+ "valid_action_targets": valid_action_targets,
248
+ "target_key_counts": dict(target_key_counts),
249
+ "target_mode_counts": dict(target_mode_counts),
250
+ "target_issue_counts": dict(target_issue_counts),
251
+ "target_issue_examples": examples,
252
+ }
253
+
254
+
255
+ def decide(dataset: dict[str, Any], model: dict[str, Any]) -> dict[str, Any]:
256
+ blockers: list[str] = []
257
+ warnings: list[str] = []
258
+
259
+ if dataset["num_rows"] <= 0:
260
+ blockers.append("dataset has zero rows")
261
+ if dataset["rows_with_video"] <= 0:
262
+ blockers.append("dataset has no video conditioning paths")
263
+ if dataset["missing_json_answer"] or dataset["missing_json_fields"]:
264
+ warnings.append("dataset is not a complete JSON QA export")
265
+
266
+ if model.get("provided"):
267
+ if not model.get("exists"):
268
+ blockers.append(f"model_dir does not exist: {model.get('path')}")
269
+ if model.get("model_type") != "cosmos3_omni":
270
+ warnings.append(f"model_type is not cosmos3_omni: {model.get('model_type')}")
271
+ if model.get("action_gen") is not True:
272
+ blockers.append("Cosmos3 transformer config does not advertise action_gen=True")
273
+ if not model.get("action_dim"):
274
+ blockers.append("Cosmos3 transformer config does not expose action_dim")
275
+ else:
276
+ warnings.append("model_dir not provided; model action_gen/action_dim could not be verified")
277
+
278
+ if dataset["rows_with_action_target"] <= 0:
279
+ blockers.append(
280
+ "dataset has no cosmos_action_target/cosmos3_action_target/action_target records; "
281
+ "semantic JSON labels cannot be used as Cosmos continuous action latents"
282
+ )
283
+ elif dataset["valid_action_targets"] != dataset["rows_with_action_target"]:
284
+ blockers.append(
285
+ "one or more action target records do not satisfy the CosmosActionCondition schema"
286
+ )
287
+
288
+ status = "ready_for_cosmos3_super_action_lora" if not blockers else "blocked_missing_cosmos_action_targets"
289
+ if not blockers and dataset.get("target_mode_counts") == {"forward_dynamics": dataset["rows_with_action_target"]}:
290
+ status = "ready_for_cosmos3_super_forward_dynamics_lora"
291
+ return {
292
+ "status": status,
293
+ "weights_updated": False,
294
+ "blockers": blockers,
295
+ "warnings": warnings,
296
+ "required_target_schema": REQUIRED_SCHEMA,
297
+ "trainer_contract": {
298
+ "diffusers_classes": [
299
+ "Cosmos3OmniPipeline",
300
+ "Cosmos3OmniTransformer",
301
+ "CosmosActionCondition",
302
+ ],
303
+ "packing_helpers": [
304
+ "Cosmos3OmniPipeline.prepare_latents",
305
+ "Cosmos3OmniPipeline._prepare_text_segment",
306
+ "Cosmos3OmniPipeline._prepare_vision_segment",
307
+ "Cosmos3OmniPipeline._prepare_action_segment",
308
+ ],
309
+ "forward_outputs": "Cosmos3OmniTransformer.forward returns (preds_vision, preds_sound, preds_action). The current camera_pose forward_dynamics target uses raw actions as conditioning and should supervise preds_vision; supervised preds_action needs policy or inverse_dynamics targets.",
310
+ "lora_targets": "use checkpoint-declared q_proj_moe_gen,k_proj_moe_gen,v_proj_moe_gen,o_proj_moe_gen unless a new audited config overrides them",
311
+ },
312
+ "next_steps": [
313
+ "Run the one-sample action batch packer that calls Cosmos3OmniPipeline.prepare_latents and the static segment helpers, then records whether the current target supervises vision or action tokens.",
314
+ "For the current camera_pose forward_dynamics target, implement a one-sample overfit with vision velocity/rectified-flow loss under action conditioning; add a policy/inverse target export before claiming supervised action-token prediction.",
315
+ "Run a one-episode overfit before scheduling a 96/16/16 Super LoRA run; only publish a Cosmos model repo after new adapter/checkpoint weights exist.",
316
+ ],
317
+ }
318
+
319
+
320
+ def write_report(path: Path, payload: dict[str, Any]) -> None:
321
+ decision = payload["decision"]
322
+ lines = [
323
+ "# Cosmos3-Super Training Contract Audit",
324
+ "",
325
+ f"- Run id: `{payload['run_id']}`",
326
+ f"- Dataset: `{payload['dataset_jsonl']}`",
327
+ f"- Rows: `{payload['dataset']['num_rows']}`",
328
+ f"- Rows with Cosmos action targets: `{payload['dataset']['rows_with_action_target']}`",
329
+ f"- Valid Cosmos action targets: `{payload['dataset']['valid_action_targets']}`",
330
+ f"- Status: `{decision['status']}`",
331
+ f"- Weights updated: `{decision['weights_updated']}`",
332
+ "",
333
+ "## Blockers",
334
+ "",
335
+ ]
336
+ if decision["blockers"]:
337
+ lines.extend(f"- {item}" for item in decision["blockers"])
338
+ else:
339
+ lines.append("- None")
340
+ lines.extend(["", "## Required Target Schema", "", "```json", json.dumps(REQUIRED_SCHEMA, indent=2), "```", ""])
341
+ lines.extend(["## Next Steps", ""])
342
+ lines.extend(f"- {item}" for item in decision["next_steps"])
343
+ path.write_text("\n".join(lines) + "\n", encoding="utf-8")
344
+
345
+
346
+ def main() -> int:
347
+ args = parse_args()
348
+ args.workspace = args.workspace.expanduser().resolve()
349
+ args.dataset_jsonl = args.dataset_jsonl.expanduser().resolve()
350
+ if args.model_dir is not None:
351
+ args.model_dir = args.model_dir.expanduser().resolve()
352
+ output_dir = args.output_dir or args.workspace / "results" / "omni_finetune" / args.run_id
353
+ output_dir = output_dir.expanduser().resolve()
354
+ output_dir.mkdir(parents=True, exist_ok=True)
355
+ progress_path = output_dir / "progress.jsonl"
356
+
357
+ started = time.time()
358
+ append_jsonl(progress_path, {"event": "start", "time": started, "run_id": args.run_id})
359
+ rows = load_jsonl(args.dataset_jsonl)
360
+ if args.sample_limit > 0:
361
+ rows = rows[: args.sample_limit]
362
+ append_jsonl(progress_path, {"event": "dataset_loaded", "time": time.time(), "rows": len(rows)})
363
+
364
+ dataset = dataset_summary(rows)
365
+ model = model_summary(args.model_dir)
366
+ backbone = read_json(args.backbone_config)
367
+ decision = decide(dataset, model)
368
+ payload = {
369
+ "run_id": args.run_id,
370
+ "run_kind": "cosmos3_super_training_contract_audit",
371
+ "started_at_unix": started,
372
+ "finished_at_unix": time.time(),
373
+ "elapsed_seconds": time.time() - started,
374
+ "workspace": str(args.workspace),
375
+ "dataset_jsonl": str(args.dataset_jsonl),
376
+ "sample_limit": args.sample_limit,
377
+ "backbone_config": str(args.backbone_config),
378
+ "backbone": {
379
+ "id": backbone.get("id"),
380
+ "display_name": backbone.get("display_name"),
381
+ "training_objective": backbone.get("training_objective"),
382
+ },
383
+ "model": model,
384
+ "dataset": dataset,
385
+ "decision": decision,
386
+ }
387
+ write_json(output_dir / "training_contract_audit.json", payload)
388
+ write_json(output_dir / "training_metadata.json", {
389
+ "run_id": args.run_id,
390
+ "run_kind": payload["run_kind"],
391
+ "weights_updated": False,
392
+ "checkpoint_dir": None,
393
+ "decision": decision,
394
+ })
395
+ write_report(output_dir / "RUN_REPORT.md", payload)
396
+ append_jsonl(progress_path, {"event": "complete", "time": time.time(), "status": decision["status"]})
397
+ print(json.dumps({"status": decision["status"], "output_dir": str(output_dir)}, indent=2))
398
+ ready_statuses = {
399
+ "ready_for_cosmos3_super_action_lora",
400
+ "ready_for_cosmos3_super_forward_dynamics_lora",
401
+ }
402
+ return 1 if args.require_trainable and decision["status"] not in ready_statuses else 0
403
+
404
+
405
+ if __name__ == "__main__":
406
+ raise SystemExit(main())
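
The exit-code behavior of `main()` is easy to misread: a blocked audit still exits 0 unless `--require-trainable` is passed. The sketch below isolates that final return expression; `audit_exit_code` is a hypothetical name used only for illustration.

```python
READY_STATUSES = {
    "ready_for_cosmos3_super_action_lora",
    "ready_for_cosmos3_super_forward_dynamics_lora",
}


def audit_exit_code(status, require_trainable):
    """Same logic as the last line of main(): fail only when the gate is requested."""
    return 1 if require_trainable and status not in READY_STATUSES else 0


blocked = "blocked_missing_cosmos_action_targets"
codes = {
    "blocked, flag off": audit_exit_code(blocked, False),
    "blocked, flag on": audit_exit_code(blocked, True),
    "ready, flag on": audit_exit_code("ready_for_cosmos3_super_forward_dynamics_lora", True),
}
```

A launcher that should refuse to start training on an unready dataset would therefore need to pass `--require-trainable` and check the process return code, rather than rely on the default run.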
scripts/omni/build_omni_model_comparison.py CHANGED
@@ -315,8 +315,93 @@ def cosmos3_super_readiness_entry() -> dict[str, Any] | None:
315
  "weights": "none; readiness audit only, no adapter checkpoint",
316
  "interpretation": (
317
  "This probe confirms the staged Cosmos3-Super Diffusers/GPU runtime and "
318
- "the same JSON QA dataset are visible, but blocks true fine-tuning until "
319
- "a Cosmos-specific diffusion/action target packer and supervised loss are implemented."
320
  ),
321
  }
322
 
@@ -344,6 +429,8 @@ def model_grouped_view(versions: list[dict[str, Any]]) -> list[dict[str, Any]]:
344
  cosmos_nano_branches = [branch for branch in branches if branch.get("backbone") == "cosmos_world_model"]
345
  cosmos_super_branches = [branch for branch in branches if branch.get("backbone") == "cosmos3_super_reasoner"]
346
  cosmos_super_readiness = cosmos3_super_readiness_entry()
 
 
347
  if qwen_branches:
348
  current_qwen = max(qwen_branches, key=lambda item: item.get("primary_metrics", {}).get("json_validity_rate") or -1)
349
  for branch in qwen_branches:
@@ -451,13 +538,17 @@ def model_grouped_view(versions: list[dict[str, Any]]) -> list[dict[str, Any]]:
451
  ),
452
  }
453
  ],
454
- "readiness_runs": [cosmos_super_readiness] if cosmos_super_readiness else [],
 
 
455
  "multi_episode_128_runs": cosmos_super_branches,
456
  "comparison_note": (
457
  "Cosmos3-Super is now represented by a verified 448-window held-out "
458
  "Reasoner evaluation on the same JSON task as Qwen3. It uses staged base "
459
  "weights through vLLM, so it is a model-branch diagnostic, not a weight release. "
460
- "The readiness probe records why true Cosmos3-Super fine-tuning is not launched yet."
 
 
461
  ),
462
  },
463
  ]
@@ -481,7 +572,7 @@ def build_report() -> dict[str, Any]:
481
  "version_reading_notes": [
482
  "Version 1 is the public-sample 12-task harness with minimal and neural heads.",
483
  "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
484
- "Version 3 is the verified model-branch layer: the current final Qwen3-Omni LoRA package is the JSON-task diagnostic result, Cosmos3-Nano is a future-window compatibility result, and Cosmos3-Super Reasoner is a base-weight JSON-task evaluation rather than a new fine-tuned weight release.",
485
  ],
486
  "versions": versions,
487
  "model_groups": model_groups,
@@ -490,11 +581,11 @@ def build_report() -> dict[str, Any]:
490
  "Task-head baselines have both a one-episode public-sample run and a 128-episode same-split metadata/text run.",
491
  "Qwen3-Omni has a one-episode sensor-adapter smoke test and separate 128-episode LoRA diagnostic packages; only the final 128-episode adapter belongs in the Qwen LoRA model repo.",
492
  "Cosmos3-Nano has a 128-episode future-window compatibility package.",
493
- "Cosmos3-Super has a 128-episode base-weight Reasoner evaluation on the JSON task plus a training-readiness probe; create a separate Cosmos model repo only after real Cosmos adapter/fine-tuned weights exist.",
494
  ],
495
  "pending": [
496
  "Use the final Qwen3 full-eval package as the current Qwen result; older Qwen package rows remain historical diagnostics for comparison.",
497
- "Promote Cosmos3 from Nano compatibility and Super base-weight evaluation to true fine-tuning only after a dedicated Cosmos diffusion/action target packer and supervised loss produce new weights.",
498
  ],
499
  }
500
 
@@ -512,7 +603,7 @@ def entry_count_text(entry: dict[str, Any]) -> str:
512
  pieces = []
513
  for label, keys in (
514
  ("episodes", ("episodes", "dataset_episodes", "held_out_episode_count")),
515
- ("windows/samples", ("windows", "rows", "dataset_samples", "eval_samples")),
516
  ("eval", ("eval_samples",)),
517
  ):
518
  value = next((counts.get(key) for key in keys if counts.get(key) is not None), None)
@@ -534,6 +625,12 @@ def entry_metric_text(entry: dict[str, Any]) -> str:
534
  "contact_accuracy",
535
  "accuracy",
536
  "macro_f1",
 
 
 
 
 
 
537
  "diffusers_runtime_supported",
538
  "chat_sft_supported",
539
  "weights_updated",
@@ -559,7 +656,7 @@ def append_model_group(lines: list[str], group: dict[str, Any]) -> None:
559
  for entry in group.get("one_episode_runs", []):
560
  rows.append(("1 episode", entry))
561
  for entry in group.get("readiness_runs", []):
562
- rows.append(("readiness", entry))
563
  for entry in group.get("multi_episode_128_runs", []):
564
  rows.append(("128 episode", entry))
565
  for scope, entry in rows:
 
315
  "weights": "none; readiness audit only, no adapter checkpoint",
316
  "interpretation": (
317
  "This probe confirms the staged Cosmos3-Super Diffusers/GPU runtime and "
318
+ "the same JSON QA dataset are visible. It predates the camera-pose action-target "
319
+ "export, so use the 20260608 contract audit for the current trainer-readiness status."
320
+ ),
321
+ }
322
+
323
+
324
+ def cosmos3_super_action_contract_entry() -> dict[str, Any] | None:
325
+ paths = sorted(
326
+ (ROOT / "results/omni_finetune").glob(
327
+ "xperience10m_cosmos3_super_training_contract_audit_*/training_contract_audit.json"
328
+ )
329
+ )
330
+ if not paths:
331
+ return None
332
+ payloads = [(path, load_json(path)) for path in paths]
333
+ path, payload = max(payloads, key=lambda item: item[1].get("finished_at_unix") or 0)
334
+ decision = payload.get("decision", {}) if isinstance(payload.get("decision"), dict) else {}
335
+ dataset = payload.get("dataset", {}) if isinstance(payload.get("dataset"), dict) else {}
336
+ target_modes = dataset.get("target_mode_counts", {}) if isinstance(dataset.get("target_mode_counts"), dict) else {}
337
+ only_forward_dynamics = set(target_modes) == {"forward_dynamics"}
338
+ return {
339
+ "id": payload.get("run_id", path.parent.name),
340
+ "title": "Cosmos3-Super Camera-Pose Target Audit",
341
+ "scope_label": "action target contract",
342
+ "scope": "selected 128-episode 96/16/16 dataset augmented with camera_pose proxy cosmos_action_target records",
343
+ "status": ("ready_for_forward_dynamics_trainer" if only_forward_dynamics else ("ready_for_action_lora_trainer" if decision.get("status") == "ready_for_cosmos3_super_action_lora" else decision.get("status", "unknown"))),
344
+ "source": rel(path),
345
+ "split": "train/val/test by selected episode/session",
346
+ "counts": {
347
+ "dataset_samples": dataset.get("num_rows"),
348
+ "rows_with_action_target": dataset.get("rows_with_action_target"),
349
+ "valid_action_targets": dataset.get("valid_action_targets"),
350
+ "split_counts": dataset.get("split_counts"),
351
+ "episode_split_counts": dataset.get("episode_split_counts"),
352
+ },
353
+ "primary_metrics": {
354
+ "domain_name": "camera_pose",
355
+ "raw_action_dim": 9,
356
+ "mode": next(iter(target_modes), "forward_dynamics"),
357
+ "valid_action_targets": dataset.get("valid_action_targets"),
358
+ "weights_updated": decision.get("weights_updated"),
359
+ },
360
+ "weights": "none; action-target contract audit only, no adapter checkpoint",
361
+ "interpretation": (
362
+ "The selected dataset now has valid Cosmos3 camera_pose forward_dynamics targets "
363
+ "for an egocentric camera-motion proxy. These remove the target-schema blocker "
364
+ "for action-conditioned world-model training, but they supervise noisy vision "
365
+ "tokens rather than preds_action. The remaining work is a pipeline-loaded packer "
366
+ "check and one-sample forward-dynamics overfit; action-token prediction needs a "
367
+ "separate policy or inverse-dynamics target export."
368
+ ),
369
+ }
370
+
371
+
372
+ def cosmos3_super_packer_entry() -> dict[str, Any] | None:
373
+ paths = sorted(
374
+ (ROOT / "results/omni_finetune").glob("xperience10m_cosmos3_super_action_packer_*/packer_summary.json")
375
+ )
376
+ if not paths:
377
+ return None
378
+ payloads = [(path, load_json(path)) for path in paths]
379
+ path, payload = max(payloads, key=lambda item: item[1].get("finished_at_unix") or 0)
380
+ row_contract = payload.get("row_contract", {}) if isinstance(payload.get("row_contract"), dict) else {}
381
+ pack_result = payload.get("pack_result", {}) if isinstance(payload.get("pack_result"), dict) else {}
382
+ return {
383
+ "id": payload.get("run_id", path.parent.name),
384
+ "title": "Cosmos3-Super Action Batch Packer Smoke",
385
+ "scope_label": "batch packer",
386
+ "scope": "one selected train row from the camera_pose forward_dynamics augmented JSONL",
387
+ "status": payload.get("status", "unknown"),
388
+ "source": rel(path),
389
+ "split": row_contract.get("split"),
390
+ "counts": {
391
+ "samples": 1,
392
+ "raw_action_rows": (row_contract.get("raw_actions_shape") or [None, None])[0],
393
+ "raw_action_dim": row_contract.get("raw_action_dim"),
394
+ },
395
+ "primary_metrics": {
396
+ "mode": row_contract.get("mode"),
397
+ "loss_surface": row_contract.get("loss_surface"),
398
+ "pipeline_loaded": pack_result.get("pipeline_loaded"),
399
+ "weights_updated": payload.get("weights_updated"),
400
+ },
401
+ "weights": "none; schema-only packer smoke, no adapter checkpoint",
402
+ "interpretation": (
403
+ "The selected row maps to a camera_pose forward_dynamics contract. In the installed Cosmos3 pipeline this "
404
+ "uses raw actions as conditioning and supervises noisy vision tokens; it does not supervise preds_action."
405
  ),
406
  }
407
 
 
429
  cosmos_nano_branches = [branch for branch in branches if branch.get("backbone") == "cosmos_world_model"]
430
  cosmos_super_branches = [branch for branch in branches if branch.get("backbone") == "cosmos3_super_reasoner"]
431
  cosmos_super_readiness = cosmos3_super_readiness_entry()
432
+ cosmos_super_action_contract = cosmos3_super_action_contract_entry()
433
+ cosmos_super_packer = cosmos3_super_packer_entry()
434
  if qwen_branches:
435
  current_qwen = max(qwen_branches, key=lambda item: item.get("primary_metrics", {}).get("json_validity_rate") or -1)
436
  for branch in qwen_branches:
 
538
  ),
539
  }
540
  ],
541
+ "readiness_runs": [
542
+ entry for entry in (cosmos_super_readiness, cosmos_super_action_contract, cosmos_super_packer) if entry
543
+ ],
544
  "multi_episode_128_runs": cosmos_super_branches,
545
  "comparison_note": (
546
  "Cosmos3-Super is now represented by a verified 448-window held-out "
547
  "Reasoner evaluation on the same JSON task as Qwen3. It uses staged base "
548
  "weights through vLLM, so it is a model-branch diagnostic, not a weight release. "
549
+ "A camera-pose proxy forward-dynamics target export now passes the contract audit "
550
+ "and schema-only packer smoke; true Cosmos3-Super fine-tuning will not be launched "
551
+ "until the pipeline-loaded packer check and one-sample overfit both exist."
552
  ),
553
  },
554
  ]
 
572
  "version_reading_notes": [
573
  "Version 1 is the public-sample 12-task harness with minimal and neural heads.",
574
  "Version 2 is the selected 128-episode same-split simple/NN baseline alignment.",
575
+ "Version 3 is the verified model-branch layer: the current final Qwen3-Omni LoRA package is the JSON-task diagnostic result, Cosmos3-Nano is a future-window compatibility result, and Cosmos3-Super Reasoner is a base-weight JSON-task evaluation; Cosmos3-Super now has a camera-pose forward-dynamics contract audit and schema-only packer smoke, but no new fine-tuned weight release.",
576
  ],
577
  "versions": versions,
578
  "model_groups": model_groups,
 
581
  "Task-head baselines have both a one-episode public-sample run and a 128-episode same-split metadata/text run.",
582
  "Qwen3-Omni has a one-episode sensor-adapter smoke test and separate 128-episode LoRA diagnostic packages; only the final 128-episode adapter belongs in the Qwen LoRA model repo.",
583
  "Cosmos3-Nano has a 128-episode future-window compatibility package.",
584
+ "Cosmos3-Super has a 128-episode base-weight Reasoner evaluation on the JSON task plus a camera-pose forward-dynamics contract audit; create a separate Cosmos model repo only after real Cosmos adapter/fine-tuned weights exist.",
585
  ],
586
  "pending": [
587
  "Use the final Qwen3 full-eval package as the current Qwen result; older Qwen package rows remain historical diagnostics for comparison.",
588
+ "Promote Cosmos3 from Nano compatibility, Super base-weight evaluation, and the camera-pose forward-dynamics contract to true fine-tuning only after the pipeline-loaded packer check and one-sample overfit produce new weights.",
589
  ],
590
  }
591
 
 
603
  pieces = []
604
  for label, keys in (
605
  ("episodes", ("episodes", "dataset_episodes", "held_out_episode_count")),
606
+ ("windows/samples", ("windows", "rows", "dataset_samples", "eval_samples", "samples")),
607
  ("eval", ("eval_samples",)),
608
  ):
609
  value = next((counts.get(key) for key in keys if counts.get(key) is not None), None)
 
625
  "contact_accuracy",
626
  "accuracy",
627
  "macro_f1",
628
+ "domain_name",
629
+ "raw_action_dim",
630
+ "mode",
631
+ "valid_action_targets",
632
+ "loss_surface",
633
+ "pipeline_loaded",
634
  "diffusers_runtime_supported",
635
  "chat_sft_supported",
636
  "weights_updated",
 
656
  for entry in group.get("one_episode_runs", []):
657
  rows.append(("1 episode", entry))
658
  for entry in group.get("readiness_runs", []):
659
+ rows.append((entry.get("scope_label", "readiness"), entry))
660
  for entry in group.get("multi_episode_128_runs", []):
661
  rows.append(("128 episode", entry))
662
  for scope, entry in rows:
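
Reviewer note: `cosmos3_super_packer_entry()` picks the newest `packer_summary.json` by `finished_at_unix`, treating a missing or null timestamp as 0, and `readiness_runs` silently drops entries that came back as `None`. The sketch below illustrates that selection logic in isolation. The helper names `latest_payload` and `readiness_runs` and the toy payloads are hypothetical and are not part of the commit.

```python
from typing import Any, Optional


def latest_payload(payloads: list[tuple[str, dict[str, Any]]]) -> tuple[str, dict[str, Any]]:
    """Return the (path, payload) pair with the largest finished_at_unix.

    A missing or null timestamp is treated as 0, so such runs lose to any
    run that recorded a real finish time.
    """
    return max(payloads, key=lambda item: item[1].get("finished_at_unix") or 0)


def readiness_runs(*entries: Optional[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the readiness entries that actually exist (drop None and {})."""
    return [entry for entry in entries if entry]


# Toy summaries standing in for packer_summary.json files (illustrative only).
runs = [
    ("run_a/packer_summary.json", {"finished_at_unix": 100, "status": "pass"}),
    ("run_b/packer_summary.json", {"finished_at_unix": None, "status": "pass"}),
    ("run_c/packer_summary.json", {"finished_at_unix": 250, "status": "pass"}),
]
path, payload = latest_payload(runs)
kept = readiness_runs({"id": "readiness"}, None, {"id": "packer"})
```

One consequence worth keeping in mind: if every summary lacks `finished_at_unix`, all keys tie at 0 and `max` returns the first path in sorted order, not the most recent run.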
scripts/omni/export_cosmos3_camera_pose_targets.py ADDED
@@ -0,0 +1,250 @@
+ #!/usr/bin/env python3
+ """Augment exported Xperience windows with Cosmos3 camera-pose action targets.
+
+ This does not invent robot-control labels. It converts frame-aligned SLAM poses
+ from `annotation.hdf5` into the Cosmos3-supported `camera_pose` action domain:
+ 9D per-transition vectors with translation delta, rotation delta as a rotation
+ vector, and absolute displacement from the window start. The target is a
+ continuous egocentric-motion proxy suitable for a first Cosmos3 action-packer
+ smoke run; it is intentionally separate from the semantic JSON QA target.
+ """
+
+ from __future__ import annotations
+
+ import argparse
+ import json
+ import math
+ from collections import Counter
+ from pathlib import Path
+ from typing import Any
+
+ import h5py
+ import numpy as np
+
+ from qwen3_omni_dataset_utils import load_jsonl, write_jsonl
+
+
+ RAW_ACTION_DIM = 9
+ DOMAIN_NAME = "camera_pose"
+
+
+ def parse_args() -> argparse.Namespace:
+     workspace_default = Path(__file__).resolve().parents[2]
+     parser = argparse.ArgumentParser(description=__doc__)
+     parser.add_argument("--dataset-jsonl", type=Path, required=True)
+     parser.add_argument("--output-jsonl", type=Path, required=True)
+     parser.add_argument("--output-manifest", type=Path, required=True)
+     parser.add_argument("--chunk-size", type=int, default=8)
+     parser.add_argument("--resolution-tier", type=int, default=480, choices=[256, 480, 704, 720])
+     parser.add_argument("--view-point", default="ego_view")
+     parser.add_argument("--max-records", type=int, default=0)
+     parser.add_argument("--strict", action="store_true")
+     return parser.parse_args()
+
+
+ def read_pose_cache(annotation_path: Path) -> dict[str, np.ndarray]:
+     with h5py.File(annotation_path, "r") as h5:
+         slam = h5["slam"]
+         trans = np.asarray(slam["trans_xyz"], dtype=np.float64)
+         quat = np.asarray(slam["quat_wxyz"], dtype=np.float64)
+         frame_numbers = np.asarray(h5["video"]["frame_number"], dtype=np.int64)
+     return {"trans": trans, "quat": normalize_quat_array(quat), "frame_numbers": frame_numbers}
+
+
+ def normalize_quat_array(quat: np.ndarray) -> np.ndarray:
+     norm = np.linalg.norm(quat, axis=-1, keepdims=True)
+     norm[norm <= 1e-12] = 1.0
+     quat = quat / norm
+     # Keep quaternion sign continuous enough for simple deltas.
+     for idx in range(1, len(quat)):
+         if np.dot(quat[idx - 1], quat[idx]) < 0:
+             quat[idx] *= -1.0
+     return quat
+
+
+ def quat_inverse(q: np.ndarray) -> np.ndarray:
+     return np.asarray([q[0], -q[1], -q[2], -q[3]], dtype=np.float64) / max(float(np.dot(q, q)), 1e-12)
+
+
+ def quat_multiply(a: np.ndarray, b: np.ndarray) -> np.ndarray:
+     aw, ax, ay, az = a
+     bw, bx, by, bz = b
+     return np.asarray(
+         [
+             aw * bw - ax * bx - ay * by - az * bz,
+             aw * bx + ax * bw + ay * bz - az * by,
+             aw * by - ax * bz + ay * bw + az * bx,
+             aw * bz + ax * by - ay * bx + az * bw,
+         ],
+         dtype=np.float64,
+     )
+
+
+ def quat_to_rotvec(q: np.ndarray) -> np.ndarray:
+     q = q / max(float(np.linalg.norm(q)), 1e-12)
+     if q[0] < 0:
+         q = -q
+     w = float(np.clip(q[0], -1.0, 1.0))
+     xyz = q[1:]
+     sin_half = float(np.linalg.norm(xyz))
+     if sin_half < 1e-8:
+         return 2.0 * xyz
+     angle = 2.0 * math.atan2(sin_half, w)
+     if angle > math.pi:
+         angle -= 2.0 * math.pi
+     return xyz / sin_half * angle
+
+
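+ def nearest_index(frame_numbers: np.ndarray, frame: int) -> int:
+     # Clamp frames outside the annotated range to the first/last pose.
+     if frame <= int(frame_numbers[0]):
+         return 0
+     if frame >= int(frame_numbers[-1]):
+         return len(frame_numbers) - 1
+     # searchsorted gives the first index whose frame number is >= frame;
+     # compare it with its left neighbour so the closer pose is returned.
+     right = int(np.searchsorted(frame_numbers, frame, side="left"))
+     left = right - 1
+     if frame - int(frame_numbers[left]) < int(frame_numbers[right]) - frame:
+         return left
+     return right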
+
+
+ def sampled_frame_pairs(start_frame: int, end_frame: int, chunk_size: int) -> list[tuple[int, int]]:
+     if chunk_size < 1:
+         raise ValueError("chunk_size must be >= 1")
+     if end_frame <= start_frame:
+         end_frame = start_frame + chunk_size
+     points = np.linspace(start_frame, end_frame, chunk_size + 1)
+     frames = [int(round(value)) for value in points]
+     pairs: list[tuple[int, int]] = []
+     for left, right in zip(frames[:-1], frames[1:]):
+         if right <= left:
+             right = left + 1
+         pairs.append((left, right))
+     return pairs
+
+
+ def camera_pose_actions(pose: dict[str, np.ndarray], start_frame: int, end_frame: int, chunk_size: int) -> list[list[float]]:
+     trans = pose["trans"]
+     quat = pose["quat"]
+     frame_numbers = pose["frame_numbers"]
+     start_idx = nearest_index(frame_numbers, start_frame)
+     origin = trans[start_idx]
+     rows: list[list[float]] = []
+     for left_frame, right_frame in sampled_frame_pairs(start_frame, end_frame, chunk_size):
+         li = nearest_index(frame_numbers, left_frame)
+         ri = nearest_index(frame_numbers, right_frame)
+         delta_t = trans[ri] - trans[li]
+         delta_q = quat_multiply(quat[ri], quat_inverse(quat[li]))
+         delta_r = quat_to_rotvec(delta_q)
+         displacement = trans[ri] - origin
+         row = np.concatenate([delta_t, delta_r, displacement]).astype(np.float32)
+         if row.shape[0] != RAW_ACTION_DIM:
+             raise AssertionError(row.shape)
+         rows.append([float(value) for value in row])
+     return rows
+
+
+ def media_condition(row: dict[str, Any]) -> dict[str, Any]:
+     media = row.get("media") if isinstance(row.get("media"), dict) else {}
+     return {
+         "mosaic_video_path": media.get("mosaic_video_path"),
+         "video_paths": media.get("video_paths") if isinstance(media.get("video_paths"), list) else [],
+         "context_start_frame": media.get("context_start_frame"),
+         "context_end_frame": media.get("context_end_frame"),
+     }
+
+
+ def augment_rows(rows: list[dict[str, Any]], args: argparse.Namespace) -> tuple[list[dict[str, Any]], dict[str, Any]]:
+     pose_cache: dict[str, dict[str, np.ndarray]] = {}
+     counters = Counter()
+     issues: list[dict[str, Any]] = []
+     augmented: list[dict[str, Any]] = []
+     selected = rows[: args.max_records] if args.max_records > 0 else rows
+
+     for idx, row in enumerate(selected):
+         counters["rows_seen"] += 1
+         episode_path_raw = row.get("episode_path")
+         window = row.get("center_window") if isinstance(row.get("center_window"), dict) else {}
+         if not episode_path_raw or "start_frame" not in window or "end_frame" not in window:
+             counters["rows_skipped_missing_source_fields"] += 1
+             issues.append({"row_index": idx, "id": row.get("id"), "reason": "missing episode_path or center_window"})
+             if args.strict:
+                 raise ValueError(issues[-1])
+             continue
+         annotation_path = Path(str(episode_path_raw)) / "annotation.hdf5"
+         if not annotation_path.exists():
+             counters["rows_skipped_missing_annotation"] += 1
+             issues.append({"row_index": idx, "id": row.get("id"), "reason": f"missing {annotation_path}"})
+             if args.strict:
+                 raise FileNotFoundError(annotation_path)
+             continue
+         key = str(annotation_path)
+         if key not in pose_cache:
+             pose_cache[key] = read_pose_cache(annotation_path)
+         start_frame = int(window["start_frame"])
+         end_frame = int(window["end_frame"])
+         try:
+             raw_actions = camera_pose_actions(pose_cache[key], start_frame, end_frame, args.chunk_size)
+         except Exception as exc:
+             counters["rows_skipped_action_build_error"] += 1
+             issues.append({"row_index": idx, "id": row.get("id"), "reason": repr(exc)})
+             if args.strict:
+                 raise
+             continue
+
+         copied = dict(row)
+         copied["cosmos_action_target"] = {
+             "mode": "forward_dynamics",
+             "domain_name": DOMAIN_NAME,
+             "chunk_size": args.chunk_size,
+             "raw_action_dim": RAW_ACTION_DIM,
+             "raw_actions": raw_actions,
+             "resolution_tier": args.resolution_tier,
+             "view_point": args.view_point,
+             "source": {
+                 "kind": "slam_camera_pose_delta_proxy_v1",
+                 "annotation_hdf5": str(annotation_path),
+                 "frame_range": {"start_frame": start_frame, "end_frame": end_frame},
+                 "fields": [
+                     "slam/trans_xyz delta",
+                     "slam/quat_wxyz delta as rotation vector",
+                     "slam/trans_xyz displacement from window start",
+                 ],
+                 "units": "translation in annotation coordinate units; rotation in radians",
+             },
+             "conditioning": media_condition(row),
+         }
+         augmented.append(copied)
+         counters["rows_augmented"] += 1
+
+     manifest = {
+         "status": "pass" if counters["rows_augmented"] else "fail",
+         "input_dataset_jsonl": str(args.dataset_jsonl),
+         "output_jsonl": str(args.output_jsonl),
+         "domain_name": DOMAIN_NAME,
+         "raw_action_dim": RAW_ACTION_DIM,
+         "chunk_size": args.chunk_size,
+         "resolution_tier": args.resolution_tier,
+         "view_point": args.view_point,
+         "target_kind": "slam_camera_pose_delta_proxy_v1",
+         "counts": dict(counters),
+         "episode_annotation_files_read": len(pose_cache),
+         "issues": issues[:100],
+         "limitations": [
+             "This is an egocentric camera-motion proxy, not a robot gripper or human hand-control action.",
+             "Use it for Cosmos3 action-packer and one-episode overfit smoke tests before claiming model-quality improvement.",
+             "Fit any normalization on train episodes only before a full publishable Cosmos adapter run.",
+         ],
+     }
+     return augmented, manifest
+
+
+ def main() -> int:
+     args = parse_args()
+     rows = load_jsonl(args.dataset_jsonl)
+     augmented, manifest = augment_rows(rows, args)
+     args.output_jsonl.parent.mkdir(parents=True, exist_ok=True)
+     args.output_manifest.parent.mkdir(parents=True, exist_ok=True)
+     write_jsonl(args.output_jsonl, augmented)
+     args.output_manifest.write_text(json.dumps(manifest, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
+     print(json.dumps(manifest, indent=2, ensure_ascii=False))
+     return 0 if manifest["status"] == "pass" else 1
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
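
Reviewer note: the 9D `camera_pose` row built by `camera_pose_actions` is the concatenation of a translation delta, a rotation delta expressed as a rotation vector, and the displacement from the window start. The sketch below reproduces that layout for a single transition using only the standard library, so it can be checked by hand. It assumes unit quaternions in `(w, x, y, z)` order as in `slam/quat_wxyz`; the function `action_row` and the toy poses are illustrative and are not part of the commit.

```python
import math


def quat_conjugate(q):
    """Inverse of a unit quaternion in (w, x, y, z) order."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def quat_multiply(a, b):
    """Hamilton product a * b for (w, x, y, z) quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_to_rotvec(q):
    """Convert a quaternion to an axis-angle rotation vector in radians."""
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)
    if w < 0:  # q and -q encode the same rotation; pick the short way round
        w, x, y, z = -w, -x, -y, -z
    sin_half = math.sqrt(x * x + y * y + z * z)
    if sin_half < 1e-8:  # small-angle limit: rotvec is approximately 2 * xyz
        return (2.0 * x, 2.0 * y, 2.0 * z)
    angle = 2.0 * math.atan2(sin_half, w)
    return (x / sin_half * angle, y / sin_half * angle, z / sin_half * angle)


def action_row(t_left, q_left, t_right, q_right, origin):
    """Build one 9D row: [translation delta, rotation-vector delta, displacement]."""
    delta_t = [r - l for l, r in zip(t_left, t_right)]
    delta_r = quat_to_rotvec(quat_multiply(q_right, quat_conjugate(q_left)))
    displacement = [r - o for r, o in zip(t_right, origin)]
    return [*delta_t, *delta_r, *displacement]


# Toy transition: move 0.1 units along x while yawing 90 degrees about z.
identity = (1.0, 0.0, 0.0, 0.0)
yaw_90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
row = action_row((0.0, 0.0, 0.0), identity, (0.1, 0.0, 0.0), yaw_90, (0.0, 0.0, 0.0))
```

Because the delta is composed as `q_right * q_left^-1`, the rotation vector is expressed in the SLAM world frame, not the camera's body frame. Whether Cosmos3's `camera_pose` domain expects that convention is not stated in this diff and would be worth confirming before a full training run.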
scripts/omni/pack_cosmos3_super_action_batch.py ADDED
@@ -0,0 +1,459 @@
+ #!/usr/bin/env python3
+ """Pack one Cosmos3-Super action-conditioning batch from Xperience windows.
+
+ This is the bridge between the public-safe Xperience JSONL export and a real
+ Cosmos3 Diffusers trainer. It can run in two modes:
+
+ - schema mode: validate the selected row and infer the supervised loss surface
+   without loading the huge model.
+ - pipeline mode: load Cosmos3OmniPipeline and call the installed
+   prepare_latents/_prepare_*_segment helpers to verify tensor shapes and loss
+   indexes for one sample.
+
+ The current camera_pose target export uses mode=forward_dynamics. In the
+ installed Cosmos3 pipeline that mode treats actions as conditioning and
+ supervises noisy vision tokens, not preds_action. Policy/inverse-dynamics action
+ prediction requires a separate target export mode.
+ """
+
+ from __future__ import annotations
+
+ import argparse
+ import json
+ import time
+ from pathlib import Path
+ from typing import Any
+
+ from qwen3_omni_dataset_utils import load_jsonl
+
+
+ ACTION_TARGET_KEYS = (
+     "cosmos_action_target",
+     "cosmos3_action_target",
+     "cosmos_action_condition",
+     "action_target",
+ )
+
+
+ def parse_args() -> argparse.Namespace:
+     workspace_default = Path(__file__).resolve().parents[2]
+     parser = argparse.ArgumentParser(description=__doc__)
+     parser.add_argument("--workspace", type=Path, default=workspace_default)
+     parser.add_argument("--dataset-jsonl", type=Path, required=True)
+     parser.add_argument("--run-id", default="xperience10m_cosmos3_super_action_packer_smoke")
+     parser.add_argument("--output-dir", type=Path)
+     parser.add_argument("--model-dir", type=Path)
+     parser.add_argument(
+         "--backbone-config",
+         type=Path,
+         default=workspace_default / "configs" / "omni_backbones" / "cosmos3_super_reasoner.json",
+     )
+     parser.add_argument("--split", default="train")
+     parser.add_argument("--sample-index", type=int, default=0)
+     parser.add_argument("--sample-id")
+     parser.add_argument("--prompt", default="Predict the embodied future under the provided camera-pose action condition.")
+     parser.add_argument("--negative-prompt")
+     parser.add_argument("--fps", type=float, default=24.0)
+     parser.add_argument("--device", default="cuda")
+     parser.add_argument("--dtype", default="bfloat16", choices=["bfloat16", "float16", "float32"])
+     parser.add_argument("--load-pipeline", action="store_true")
+     parser.add_argument("--local-files-only", action=argparse.BooleanOptionalAction, default=True)
+     parser.add_argument("--require-media-exists", action="store_true")
+     return parser.parse_args()
+
+
+ def dtype_from_name(name: str):
+     import torch
+
+     return {
+         "bfloat16": torch.bfloat16,
+         "float16": torch.float16,
+         "float32": torch.float32,
+     }[name]
+
+
+ def write_json(path: Path, payload: dict[str, Any]) -> None:
+     path.parent.mkdir(parents=True, exist_ok=True)
+     path.write_text(json.dumps(payload, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
+
+
+ def append_jsonl(path: Path, payload: dict[str, Any]) -> None:
+     path.parent.mkdir(parents=True, exist_ok=True)
+     with path.open("a", encoding="utf-8") as handle:
+         handle.write(json.dumps(payload, sort_keys=True, ensure_ascii=False) + "\n")
+
+
+ def read_json(path: Path) -> dict[str, Any]:
+     if not path.exists():
+         return {}
+     return json.loads(path.read_text(encoding="utf-8"))
+
+
+ def find_action_target(row: dict[str, Any]) -> tuple[str | None, dict[str, Any] | None]:
+     for key in ACTION_TARGET_KEYS:
+         value = row.get(key)
+         if isinstance(value, dict):
+             return key, value
+     return None, None
+
+
+ def selected_row(rows: list[dict[str, Any]], args: argparse.Namespace) -> dict[str, Any]:
+     candidates = [row for row in rows if row.get("split") == args.split and find_action_target(row)[1] is not None]
+     if args.sample_id:
+         for row in rows:
+             if row.get("id") == args.sample_id:
+                 return row
+         raise ValueError(f"sample id not found: {args.sample_id}")
+     if not candidates:
+         raise ValueError(f"no rows with action targets found for split={args.split!r}")
+     if args.sample_index < 0 or args.sample_index >= len(candidates):
+         raise ValueError(f"sample-index {args.sample_index} outside 0..{len(candidates)-1}")
+     return candidates[args.sample_index]
+
+
+ def numeric_matrix(value: Any) -> tuple[bool, tuple[int, int] | None]:
+     if not isinstance(value, list) or not value:
+         return False, None
+     width = None
+     for item in value:
+         if not isinstance(item, list) or not item:
+             return False, None
+         width = len(item) if width is None else width
+         if len(item) != width:
+             return False, None
+         for number in item:
+             if not isinstance(number, (int, float)):
+                 return False, None
+     return True, (len(value), int(width or 0))
+
+
+ def media_video_path(row: dict[str, Any], target: dict[str, Any]) -> str | None:
+     conditioning = target.get("conditioning") if isinstance(target.get("conditioning"), dict) else {}
+     media = row.get("media") if isinstance(row.get("media"), dict) else {}
+     for block in (conditioning, media):
+         value = block.get("mosaic_video_path")
+         if value:
+             return str(value)
+     for block in (conditioning, media):
+         paths = block.get("video_paths")
+         if isinstance(paths, list):
+             for item in paths:
+                 if isinstance(item, dict) and item.get("path"):
+                     return str(item["path"])
+     return None
+
+
+ def row_contract(row: dict[str, Any], require_media_exists: bool) -> dict[str, Any]:
+     key, target = find_action_target(row)
+     if target is None:
+         raise ValueError(f"row has no Cosmos action target: {row.get('id')}")
+
+     video_path = media_video_path(row, target)
+     if not video_path:
+         raise ValueError(f"row has no video conditioning path: {row.get('id')}")
+     if require_media_exists and not Path(video_path).exists():
+         raise FileNotFoundError(video_path)
+
+     mode = str(target.get("mode"))
+     domain_name = str(target.get("domain_name"))
+     chunk_size = int(target.get("chunk_size"))
+     raw_actions = target.get("raw_actions")
+     ok, shape = numeric_matrix(raw_actions)
+     raw_action_dim = int(target.get("raw_action_dim") or (shape[1] if shape else 0))
+     issues: list[str] = []
+     if mode not in {"forward_dynamics", "policy", "inverse_dynamics"}:
+         issues.append(f"unsupported mode={mode!r}")
+     if domain_name != "camera_pose":
+         issues.append(f"expected camera_pose target for this export, got {domain_name!r}")
+     if chunk_size < 1:
+         issues.append("chunk_size must be >= 1")
+     if mode == "forward_dynamics":
+         if not ok:
+             issues.append("forward_dynamics requires numeric raw_actions")
+         elif shape and shape[1] != raw_action_dim:
+             issues.append(f"raw_actions width {shape[1]} does not match raw_action_dim {raw_action_dim}")
+
+     if mode == "forward_dynamics":
+         loss_surface = "vision_velocity_conditioned_on_camera_pose"
+         action_loss_expected = False
+         note = (
+             "Cosmos3 forward_dynamics consumes raw_actions as conditioning and predicts noisy vision tokens. "
+             "It does not supervise preds_action for this target mode."
+         )
+     else:
+         loss_surface = "action_velocity"
+         action_loss_expected = True
+         note = (
+             "Cosmos3 policy/inverse_dynamics can expose noisy action tokens, but the current camera-pose export "
+             "does not yet create that target mode."
+         )
+
+     return {
+         "row_id": row.get("id"),
+         "episode_id": row.get("episode_id"),
+         "split": row.get("split"),
+         "target_key": key,
+         "mode": mode,
+         "domain_name": domain_name,
+         "chunk_size": chunk_size,
+         "raw_action_dim": raw_action_dim,
+         "raw_actions_shape": list(shape) if shape else None,
+         "video_path": video_path,
+         "video_path_exists": Path(video_path).exists(),
+         "loss_surface": loss_surface,
+         "action_loss_expected": action_loss_expected,
+         "interpretation": note,
+         "issues": issues,
+     }
+
+
+ def instantiate_action_condition(row: dict[str, Any], contract: dict[str, Any]):
+     import torch
+     from diffusers.pipelines.cosmos.pipeline_cosmos3_omni import CosmosActionCondition
+
+     _, target = find_action_target(row)
+     if target is None:
+         raise ValueError("missing action target")
+     raw_actions = None
+     if target.get("raw_actions") is not None:
+         raw_actions = torch.tensor(target["raw_actions"], dtype=torch.float32)
+     video = [contract["video_path"]]
+     return CosmosActionCondition(
+         mode=contract["mode"],
+         chunk_size=int(contract["chunk_size"]),
+         domain_name=contract["domain_name"],
+         resolution_tier=int(target.get("resolution_tier", 480)),
+         raw_actions=raw_actions,
+         video=video,
+         view_point=str(target.get("view_point", "ego_view")),
+     )
+
+
+ def resolve_action_canvas(pipe, action) -> tuple[int | None, int | None]:
+     try:
+         from diffusers.pipelines.cosmos.pipeline_cosmos3_omni import _ACTION_RESOLUTION_BINS, VideoProcessor
+
+         conditioning_clip = [action.image] if action.image is not None else action.video
+         probe = pipe.video_processor.preprocess_video(conditioning_clip)
+         source_h, source_w = int(probe.shape[-2]), int(probe.shape[-1])
+         resolution_key = str(action.resolution_tier)
+         return VideoProcessor.classify_height_width_bin(source_h, source_w, ratios=_ACTION_RESOLUTION_BINS[resolution_key])
+     except Exception:
+         return None, None
+
+
+ def tokenize_prompt(pipe, args: argparse.Namespace, action, height: int | None, width: int | None) -> list[int]:
+     if hasattr(pipe, "tokenize_prompt"):
+         cond_ids, _ = pipe.tokenize_prompt(
+             args.prompt,
+             args.negative_prompt,
+             num_frames=action.chunk_size + 1,
+             height=height,
+             width=width,
+             fps=args.fps,
+             action_mode=action.mode,
+             action_view_point=action.view_point,
+         )
+         return list(cond_ids)
+     encoded = pipe.tokenizer(args.prompt, add_special_tokens=True)
+     return list(encoded["input_ids"])
+
+
+ def pack_with_pipeline(row: dict[str, Any], contract: dict[str, Any], args: argparse.Namespace) -> dict[str, Any]:
+     import torch
+     from diffusers import Cosmos3OmniPipeline
+
+     if args.model_dir is None:
+         raise ValueError("--model-dir is required with --load-pipeline")
+     dtype = dtype_from_name(args.dtype)
+     pipe = Cosmos3OmniPipeline.from_pretrained(
+         str(args.model_dir),
+         torch_dtype=dtype,
+         local_files_only=args.local_files_only,
+     )
+     pipe.to(args.device)
+     if hasattr(pipe, "set_progress_bar_config"):
+         pipe.set_progress_bar_config(disable=True)
+
+     action = instantiate_action_condition(row, contract)
+     height, width = resolve_action_canvas(pipe, action)
+     input_ids = tokenize_prompt(pipe, args, action, height, width)
+     text_segment = pipe._prepare_text_segment(input_ids, device=args.device)
+     (
+         latents,
+         sound_latents,
+         action_latents,
+         fps_vision,
+         fps_sound,
+         vision_condition_mask,
+         sound_condition_mask,
+         action_condition_mask,
+         action_domain_id,
+         action_image_size,
+         raw_action_dim_resolved,
+         action_condition_frame_indexes,
+     ) = pipe.prepare_latents(
+         num_frames=action.chunk_size + 1,
+         height=height,
+         width=width,
+         fps=args.fps,
+         device=args.device,
+         dtype=dtype,
+         enable_sound=False,
+         action=action,
+     )
+     vision_condition_indexes = torch.nonzero(vision_condition_mask[:, 0, 0] > 0, as_tuple=False).flatten()
+     vision_condition_indexes = [int(idx.item()) for idx in vision_condition_indexes]
+     vision_segment = pipe._prepare_vision_segment(
+         input_vision_tokens=latents,
+         has_image_condition=bool(vision_condition_indexes),
+         mrope_offset=text_segment["vision_start_temporal_offset"],
+         vision_fps=fps_vision,
+         curr=text_segment["und_len"],
+         device=args.device,
+         condition_frame_indexes=vision_condition_indexes,
+     )
+     action_segment = {}
+     if action_latents is not None:
+         action_segment = pipe._prepare_action_segment(
+             input_action_tokens=action_latents,
+             condition_frame_indexes=action_condition_frame_indexes,
+             mrope_offset=text_segment["vision_start_temporal_offset"],
+             action_fps=fps_vision,
+             curr=text_segment["und_len"] + vision_segment["num_vision_tokens"],
+             device=args.device,
+         )
+     action_loss_tokens = int(action_segment.get("action_mse_loss_indexes", torch.tensor([])).numel())
+     vision_loss_tokens = int(vision_segment.get("vision_mse_loss_indexes", torch.tensor([])).numel())
+     status = "pass"
+     if contract["mode"] == "forward_dynamics" and action_loss_tokens != 0:
+         status = "warning_unexpected_action_loss_tokens"
+     elif contract["mode"] != "forward_dynamics" and action_loss_tokens == 0:
+         status = "warning_no_action_loss_tokens"
+
+     return {
+         "status": status,
+         "pipeline_loaded": True,
+         "model_dir": str(args.model_dir),
+         "dtype": args.dtype,
+         "device": args.device,
+         "canvas": {"height": height, "width": width},
+         "text_tokens": int(text_segment["und_len"]),
+         "vision_latents_shape": list(latents.shape),
+         "vision_condition_frames": vision_condition_indexes,
+         "vision_loss_tokens": vision_loss_tokens,
+         "action_latents_shape": list(action_latents.shape) if action_latents is not None else None,
+         "action_condition_frames": list(action_condition_frame_indexes),
+         "action_loss_tokens": action_loss_tokens,
+         "raw_action_dim_resolved": raw_action_dim_resolved,
+         "action_domain_id": action_domain_id.detach().cpu().tolist() if action_domain_id is not None else None,
+         "loss_surface": contract["loss_surface"],
+         "training_readout": (
+             "Use a vision velocity/rectified-flow loss for this forward_dynamics camera_pose target."
+             if contract["mode"] == "forward_dynamics"
+             else "Use an action velocity loss for policy/inverse_dynamics targets."
+         ),
+         "unused_optional": {
+             "sound_latents": sound_latents is not None,
+             "fps_sound": fps_sound,
+             "sound_condition_mask": sound_condition_mask is not None,
+             "action_image_size": list(action_image_size.shape) if hasattr(action_image_size, "shape") else None,
+         },
+     }
+
+
365
+def write_report(path: Path, payload: dict[str, Any]) -> None:
+    contract = payload["row_contract"]
+    pack = payload["pack_result"]
+    lines = [
+        "# Cosmos3-Super Action Batch Packer",
+        "",
+        f"- Run id: `{payload['run_id']}`",
+        f"- Row: `{contract.get('row_id')}`",
+        f"- Mode: `{contract.get('mode')}`",
+        f"- Domain: `{contract.get('domain_name')}`",
+        f"- Raw action shape: `{contract.get('raw_actions_shape')}`",
+        f"- Pipeline loaded: `{pack.get('pipeline_loaded')}`",
+        f"- Status: `{payload['status']}`",
+        "",
+        "## Loss Surface",
+        "",
+        f"- `{contract.get('loss_surface')}`",
+        f"- {contract.get('interpretation')}",
+        "",
+        "## Next Step",
+        "",
+    ]
+    if contract.get("mode") == "forward_dynamics":
+        lines.append("- Implement the one-sample overfit with a vision velocity/rectified-flow loss under camera-pose action conditioning.")
+        lines.append("- Add a separate policy or inverse-dynamics target export before claiming supervised action-token prediction.")
+    else:
+        lines.append("- Implement the one-sample overfit with action velocity loss over noisy action tokens.")
+    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
+
+
+def main() -> int:
+    args = parse_args()
+    args.workspace = args.workspace.expanduser().resolve()
+    args.dataset_jsonl = args.dataset_jsonl.expanduser().resolve()
+    if args.model_dir is not None:
+        args.model_dir = args.model_dir.expanduser().resolve()
+    output_dir = args.output_dir or args.workspace / "results" / "omni_finetune" / args.run_id
+    output_dir = output_dir.expanduser().resolve()
+    progress_path = output_dir / "progress.jsonl"
+    if progress_path.exists():
+        progress_path.unlink()
+
+    started = time.time()
+    append_jsonl(progress_path, {"event": "start", "time": started, "run_id": args.run_id})
+    rows = load_jsonl(args.dataset_jsonl)
+    row = selected_row(rows, args)
+    contract = row_contract(row, require_media_exists=args.require_media_exists)
+    append_jsonl(progress_path, {"event": "row_selected", "time": time.time(), "row_id": contract["row_id"]})
+    if contract["issues"]:
+        pack_result = {"status": "blocked_row_contract", "pipeline_loaded": False, "issues": contract["issues"]}
+    elif args.load_pipeline:
+        pack_result = pack_with_pipeline(row, contract, args)
+    else:
+        pack_result = {
+            "status": "schema_ready_pipeline_not_loaded",
+            "pipeline_loaded": False,
+            "loss_surface": contract["loss_surface"],
+            "action_loss_expected": contract["action_loss_expected"],
+        }
+
+    status = "pass" if not contract["issues"] and not str(pack_result["status"]).startswith("warning") else pack_result["status"]
+    payload = {
+        "run_id": args.run_id,
+        "run_kind": "cosmos3_super_action_batch_packer",
+        "started_at_unix": started,
+        "finished_at_unix": time.time(),
+        "elapsed_seconds": time.time() - started,
+        "dataset_jsonl": str(args.dataset_jsonl),
+        "backbone_config": str(args.backbone_config),
+        "backbone": read_json(args.backbone_config),
+        "status": status,
+        "row_contract": contract,
+        "pack_result": pack_result,
+        "weights_updated": False,
+    }
+    write_json(output_dir / "packer_summary.json", payload)
+    write_json(
+        output_dir / "training_metadata.json",
+        {
+            "run_id": args.run_id,
+            "run_kind": payload["run_kind"],
+            "weights_updated": False,
+            "checkpoint_dir": None,
+            "status": status,
+            "loss_surface": contract["loss_surface"],
+        },
+    )
+    write_report(output_dir / "RUN_REPORT.md", payload)
+    append_jsonl(progress_path, {"event": "complete", "time": time.time(), "status": status})
+    print(json.dumps({"status": status, "output_dir": str(output_dir)}, indent=2))
+    return 0 if status == "pass" else 1
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
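Review note: the packer's status rule and exit code can be checked in isolation. The sketch below restates the loss-token check from `pack_with_pipeline` and the final `return` in `main`. The function names `classify_pack_status` and `exit_code` are hypothetical and do not exist in the script, and `"policy"` is used only as an example of a non-`forward_dynamics` mode.

```python
def classify_pack_status(mode: str, action_loss_tokens: int) -> str:
    """Restate the packer's loss-token check for a single row (hypothetical helper)."""
    # forward_dynamics rows are supervised on vision tokens, so any action
    # loss token is unexpected.
    if mode == "forward_dynamics" and action_loss_tokens != 0:
        return "warning_unexpected_action_loss_tokens"
    # Every other mode is expected to carry at least one action loss token.
    if mode != "forward_dynamics" and action_loss_tokens == 0:
        return "warning_no_action_loss_tokens"
    return "pass"


def exit_code(status: str) -> int:
    """Same mapping as the last line of main(): only 'pass' exits with 0."""
    return 0 if status == "pass" else 1


if __name__ == "__main__":
    cases = [("forward_dynamics", 0), ("forward_dynamics", 4), ("policy", 0), ("policy", 4)]
    for mode, tokens in cases:
        status = classify_pack_status(mode, tokens)
        print(mode, tokens, status, exit_code(status))
```

Note that in `main` a run without `--load-pipeline` reports `pass` even though `pack_result["status"]` is `schema_ready_pipeline_not_loaded`, because only statuses starting with `warning` and non-empty `contract["issues"]` override it.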
scripts/omni/run_qwen3_omni_v4_4epoch_8gpu.sh ADDED
@@ -0,0 +1,105 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# Stronger Qwen3-Omni LoRA continuation over the already exported 128-episode
+# 96/16/16 dataset. This launcher intentionally reuses the sealed split and
+# writes a distinct run id so it cannot overwrite the public v3 diagnostic.
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PROJECT_DIR="${PROJECT_DIR:-$(cd "$SCRIPT_DIR/../.." && pwd)}"
+cd "$PROJECT_DIR"
+
+RUN_ID="${RUN_ID:-xperience10m_qwen3_omni_128ep_structured_json_v4_4epoch_full8gpu_lora}"
+DATASET_JSONL="${DATASET_JSONL:-results/omni_finetune/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_dataset/dataset.jsonl}"
+MODEL_ID="${MODEL_ID:-$HOME/Ropedia/modelscope_models/Qwen__Qwen3-Omni-30B-A3B-Instruct}"
+BACKBONE_CONFIG="${BACKBONE_CONFIG:-configs/omni_backbones/qwen3_omni_lora.json}"
+EPOCHS="${EPOCHS:-4}"
+GRADIENT_ACCUMULATION_STEPS="${GRADIENT_ACCUMULATION_STEPS:-8}"
+MAX_VAL_SAMPLES="${MAX_VAL_SAMPLES:-512}"
+
+RUN_DIR="results/omni_finetune/${RUN_ID}"
+LOG="${RUN_DIR}/train.launch.log"
+STATUS="${RUN_DIR}/launch_status.jsonl"
+mkdir -p "$RUN_DIR"
+
+json_status() {
+  .venv/bin/python - "$STATUS" "$@" <<'PY'
+import json
+import sys
+import time
+
+path = sys.argv[1]
+payload = {"time": time.time()}
+for item in sys.argv[2:]:
+    key, value = item.split("=", 1)
+    if value.isdigit():
+        value = int(value)
+    payload[key] = value
+with open(path, "a", encoding="utf-8") as handle:
+    handle.write(json.dumps(payload, sort_keys=True) + "\n")
+print(json.dumps(payload, sort_keys=True), flush=True)
+PY
+}
+
+if [[ ! -s "$DATASET_JSONL" ]]; then
+  json_status event=blocked_missing_dataset dataset_jsonl="$DATASET_JSONL"
+  exit 2
+fi
+
+if pgrep -af "train_qwen3_omni_lora.py.*--run-id ${RUN_ID}" >/dev/null 2>&1; then
+  json_status event=already_running run_id="$RUN_ID"
+  pgrep -af "train_qwen3_omni_lora.py.*--run-id ${RUN_ID}"
+  exit 0
+fi
+
+if pgrep -af "train_qwen3_omni_lora.py" >/dev/null 2>&1; then
+  json_status event=blocked_other_training run_id="$RUN_ID"
+  pgrep -af "train_qwen3_omni_lora.py"
+  exit 3
+fi
+
+cmd=(
+  .venv/bin/python -m accelerate.commands.launch
+  --num_processes 8
+  --mixed_precision bf16
+  --use_fsdp
+  --fsdp_sharding_strategy FULL_SHARD
+  --fsdp_auto_wrap_policy TRANSFORMER_BASED_WRAP
+  --fsdp_transformer_layer_cls_to_wrap Qwen3OmniMoeThinkerTextDecoderLayer
+  --fsdp_use_orig_params true
+  --fsdp_cpu_ram_efficient_loading true
+  --fsdp_sync_module_states true
+  --fsdp_activation_checkpointing true
+  scripts/omni/train_qwen3_omni_lora.py
+  --dataset-jsonl "$DATASET_JSONL"
+  --model-id "$MODEL_ID"
+  --backbone-config "$BACKBONE_CONFIG"
+  --run-id "$RUN_ID"
+  --train-split train
+  --val-split val
+  --epochs "$EPOCHS"
+  --batch-size 1
+  --gradient-accumulation-steps "$GRADIENT_ACCUMULATION_STEPS"
+  --max-train-samples 0
+  --max-val-samples "$MAX_VAL_SAMPLES"
+  --local-files-only
+  --gradient-checkpointing
+  --progress-every 10
+)
+
+json_status event=launch_start run_id="$RUN_ID" epochs="$EPOCHS" dataset_jsonl="$DATASET_JSONL"
+CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:-0,1,2,3,4,5,6,7}" \
+  PYTORCH_CUDA_ALLOC_CONF="${PYTORCH_CUDA_ALLOC_CONF:-expandable_segments:True}" \
+  nohup "${cmd[@]}" > "$LOG" 2>&1 < /dev/null &
+pid=$!
+sleep 3
+
+if ps -p "$pid" >/dev/null 2>&1; then
+  json_status event=launch_detached run_id="$RUN_ID" pid="$pid" log="$LOG"
+  echo "launched run_id=${RUN_ID} pid=${pid} log=${LOG}"
+  exit 0
+fi
+
+json_status event=launch_failed run_id="$RUN_ID" log="$LOG"
+tail -120 "$LOG" || true
+exit 1
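Review note: the `json_status` helper turns `key=value` arguments into one JSON line per event. The sketch below restates that parsing as a plain function so the behaviour is easy to check. The name `build_status_payload` and the `now` parameter are hypothetical and added only for illustration; the launcher itself runs the equivalent logic inline through a heredoc.

```python
import json
import time
from typing import Optional


def build_status_payload(items: list[str], now: Optional[float] = None) -> dict:
    """Build one launch_status.jsonl record from key=value strings (hypothetical helper)."""
    payload: dict = {"time": time.time() if now is None else now}
    for item in items:
        # Split on the first '=' only, so values such as paths may contain '='.
        key, value = item.split("=", 1)
        # Purely numeric values (for example pid or epochs) are stored as integers.
        payload[key] = int(value) if value.isdigit() else value
    return payload


if __name__ == "__main__":
    record = build_status_payload(["event=launch_detached", "pid=4242", "log=a=b.log"], now=1.0)
    print(json.dumps(record, sort_keys=True))
```

An argument without any `=` raises `ValueError` at the unpacking step, so callers of `json_status` need to pass every field in `key=value` form.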
scripts/verify_live_publication.py CHANGED
@@ -311,7 +311,7 @@ MARKER_CHECKS = [
             "100.00%",
             "omni_model_comparison.json",
             "ropedia-qwen3-omni-lora-128ep",
-            "Cosmos3-Super has a verified base-weight Reasoner JSON-task evaluation",
+            "Cosmos3-Super has a verified base-weight JSON-task evaluation plus a camera-pose forward-dynamics contract audit",
         ],
         "forbidden": [
             "xperience10m-" + "taskfirst-v10",
@@ -340,7 +340,7 @@ MARKER_CHECKS = [
             "100.00%",
             "omni_model_comparison.json",
             "ropedia-qwen3-omni-lora-128ep",
-            "Cosmos3-Super has a verified base-weight Reasoner JSON-task evaluation",
+            "Cosmos3-Super has a verified base-weight JSON-task evaluation plus a camera-pose forward-dynamics contract audit",
         ],
         "forbidden": [
             "xperience10m-" + "taskfirst-v10",