# Artifact Guide
This guide is the human-readable map for the public Ropedia Xperience-10M task
suite artifacts. It is organized around what a reader usually wants to do:
understand the project, inspect the sample episode, compare baselines, read the
task results, follow the Qwen3-Omni scale-up path, and understand the longer
Xperience-native pretraining goal.
## Start Here
| Artifact | Why to open it first |
| --- | --- |
| [`PUBLIC_READER_MAP.md`](PUBLIC_READER_MAP.md) | Chooses the right public surface first: GitHub source, website, HF Space, artifact dataset, baseline model repo, model-branch repos, or release-health files. |
| [`PROJECT_STATUS.md`](PROJECT_STATUS.md) | Gives the fastest current-state table: implemented, being improved, and outside current scope. |
| [`RESEARCH_ROADMAP.md`](RESEARCH_ROADMAP.md) | Shows the roadmap from public-sample task development to multi-episode data preparation, Qwen3-Omni LoRA, robustness runs, model branches, and the future native-pretraining goal. |
| [`FOUNDATION_MODEL_PLAN.md`](FOUNDATION_MODEL_PLAN.md) | Explains which foundation backbones fit which Xperience-10M objective: Qwen3-Omni first, Cosmos 3 for world modeling, and VLA/policy models after action-target conversion. |
| [`OMNI_MODEL_EXTENSION_CONTRACT.md`](OMNI_MODEL_EXTENSION_CONTRACT.md) | Defines the shared manifest, split, evaluation, packaging, and public-safety contract that future Qwen, Cosmos-style, and VLA/policy branches must satisfy. |
| [`ADDITIONAL_DEVELOPMENT_DIRECTIONS.md`](ADDITIONAL_DEVELOPMENT_DIRECTIONS.md) | Records concrete non-backbone development tracks: taxonomy, benchmark protocol, representation learning, skill graphs, affordances, 3D/4D memory, QA, and policy transfer. |
| [`XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md`](XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md) | Describes the future full-corpus Xperience Embodied Foundation Model goal, including modules, objectives, staged scale-up, hardware ranges, and evaluation. |
| [`EVALUATION_PROTOCOL.md`](EVALUATION_PROTOCOL.md) | Defines the task unit, chronological split, metrics, leakage controls, and current limitations. |
| [`REPRODUCIBILITY.md`](REPRODUCIBILITY.md) | Defines public reproduction commands, expected outputs, and non-reproducible boundaries. |
| [`results/audio_ablation/AUDIO_ABLATION_SUMMARY.md`](results/audio_ablation/AUDIO_ABLATION_SUMMARY.md) | Shows measured current-audio and raw log-mel replacement deltas across the original task contracts. |
| [`docs/single_episode_explorer.html`](docs/single_episode_explorer.html) | Gives a static window-level explorer for the public sample episode. |
| [`XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`](XPERIENCE10M_DATASET_CARD_ALIGNMENT.md) | Optional detail for readers who need official dataset and access-term context. |
## Dataset Context
| Artifact | What it shows |
| --- | --- |
| [`XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`](XPERIENCE10M_DATASET_CARD_ALIGNMENT.md) | Human-readable summary of the official gated Xperience-10M dataset, public sample, modalities, access terms, intended uses, and limitations. |
| [`docs/data/xperience10m_dataset_card_alignment.json`](docs/data/xperience10m_dataset_card_alignment.json) | Machine-readable dataset-context bundle for the website and Hub pages. |
| [`SOURCE_ALIGNMENT_AUDIT.md`](SOURCE_ALIGNMENT_AUDIT.md) | Supporting provenance note for maintainers who want to inspect how public dataset descriptions were checked. |
| [`docs/data/source_alignment_audit.json`](docs/data/source_alignment_audit.json) | Machine-readable provenance record for generated project pages. |
| [`scripts/validate_source_alignment.py`](scripts/validate_source_alignment.py) | Maintenance script for refreshing the dataset-context note. |
## Evaluation Protocol
| Artifact | What it shows |
| --- | --- |
| [`EVALUATION_PROTOCOL.md`](EVALUATION_PROTOCOL.md) | Human-readable task protocol: window unit, chronological split, input/target contracts, primary metrics, leakage controls, and current limitations. |
| [`docs/data/evaluation_protocol.json`](docs/data/evaluation_protocol.json) | Machine-readable protocol generated from committed task metrics. |
| [`scripts/build_evaluation_protocol.py`](scripts/build_evaluation_protocol.py) | Regenerates the protocol from `docs/data/summary_metrics.json` and source task artifacts. |
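
The chronological split named in the protocol can be illustrated with a minimal sketch. This is not the code in `scripts/build_evaluation_protocol.py`; the function name `chronological_split`, the 70/15/15 fractions, and the one-window `gap` are assumptions chosen only to show the idea of keeping earlier windows for training and later windows for evaluation.

```python
from typing import List, Tuple


def chronological_split(
    n_windows: int,
    train_frac: float = 0.70,  # assumed fraction, not taken from the protocol
    val_frac: float = 0.15,    # assumed fraction, not taken from the protocol
    gap: int = 0,              # windows dropped at each boundary (assumed leakage control)
) -> Tuple[List[int], List[int], List[int]]:
    """Split time-ordered window indices into train/val/test without shuffling."""
    if n_windows <= 0 or gap < 0:
        raise ValueError("n_windows must be positive and gap non-negative")
    if not (0 < train_frac < 1 and 0 <= val_frac < 1 and train_frac + val_frac < 1):
        raise ValueError("fractions must leave room for a test split")

    train_end = int(n_windows * train_frac)
    val_end = int(n_windows * (train_frac + val_frac))

    train = list(range(0, train_end))
    # Skip `gap` windows after each boundary so neighbouring windows
    # do not sit on both sides of a split.
    val = list(range(min(train_end + gap, val_end), val_end))
    test = list(range(min(val_end + gap, n_windows), n_windows))
    return train, val, test


# 1,161 is the window count reported for the public sample episode.
train, val, test = chronological_split(1161, gap=1)
```

Because indices are never shuffled, every training window precedes every validation window, which in turn precedes every test window.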
## Visual Evidence
| Artifact | What it shows |
| --- | --- |
| [`FIGURE_INDEX.md`](FIGURE_INDEX.md) | Human-readable catalog of public visual assets, dimensions, hashes, roles, and source scripts. |
| [`docs/data/figure_index.json`](docs/data/figure_index.json) | Machine-readable visual asset index mirrored to the website, artifact dataset, and model repo. |
| [`scripts/build_figure_index.py`](scripts/build_figure_index.py) | Regenerates visual-asset hashes, dimensions, and source-script provenance. |
| [`docs/data/brand_assets.json`](docs/data/brand_assets.json) | Machine-readable logo/brand manifest for the website, README, Hugging Face cards, favicon, app icon, and social preview. |
| [`docs/assets/brand/xperience10m-logo-social-card.png`](docs/assets/brand/xperience10m-logo-social-card.png) | Project logo card used by README and Hugging Face cards. |
| [`scripts/build_brand_assets.py`](scripts/build_brand_assets.py) | Regenerates deterministic logo derivatives, favicon variants, app icons, and the social card from the generated logo mark. |
| [`docs/assets/task_suite_infographic.png`](docs/assets/task_suite_infographic.png) | Primary task-suite map with sample modality thumbnails. |
| [`docs/assets/pipeline_diagram.png`](docs/assets/pipeline_diagram.png) | Episode-to-task pipeline overview. |
| [`docs/assets/task_architectures.png`](docs/assets/task_architectures.png) | Minimal and neural task-head architecture map. |
## Data Contract
| Artifact | What it shows |
| --- | --- |
| [`results/episode_task_suite/windows.csv`](results/episode_task_suite/windows.csv) | The sample episode is converted into 1,161 aligned 20-frame windows. |
| [`results/episode_task_suite/feature_manifest.json`](results/episode_task_suite/feature_manifest.json) | The current input vector has 8,546 dimensions with explicit modality-group boundaries, including a 168-d audio group. |
| [`results/episode_task_suite/available_modalities.json`](results/episode_task_suite/available_modalities.json) | The sample modality coverage is recorded, including the current audio-featurization status. |
| [`results/audio_ablation/raw_logmel_fisheye_cam0_sr16000_mels64_fft512_hop160.npz`](results/audio_ablation/raw_logmel_fisheye_cam0_sr16000_mels64_fft512_hop160.npz) | Derived 588-d raw log-mel window features decoded from the local public-sample MP4 audio stream; raw audio itself is not redistributed. |
| [`docs/data/modality_atlas.json`](docs/data/modality_atlas.json) | The responsive website modality cards and derived thumbnail assets are documented without redistributing raw data. |
| [`docs/assets/modalities/`](docs/assets/modalities/) | Small public-sample thumbnails used by the readable modality atlas. |
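
As a rough illustration of what "explicit modality-group boundaries" could mean in practice, the sketch below checks that a set of groups tiles a feature vector without gaps or overlaps. The record layout (`name`, `start`, `end`) and the group positions are hypothetical; the real schema of `feature_manifest.json` is not reproduced here. Only the 8,546 total and the 168-d audio width come from this guide.

```python
from typing import Dict, List


def check_group_boundaries(groups: List[dict], total_dim: int) -> Dict[str, int]:
    """Verify that groups cover [0, total_dim) contiguously; return width per group.

    Each group is a dict with hypothetical keys: name, start, end (end exclusive).
    """
    cursor = 0
    for group in sorted(groups, key=lambda g: g["start"]):
        if group["start"] != cursor:
            raise ValueError(f"gap or overlap before group {group['name']!r}")
        if group["end"] <= group["start"]:
            raise ValueError(f"empty or inverted group {group['name']!r}")
        cursor = group["end"]
    if cursor != total_dim:
        raise ValueError(f"groups cover {cursor} dims, expected {total_dim}")
    return {g["name"]: g["end"] - g["start"] for g in groups}


# Hypothetical layout: the audio group is placed last purely for illustration.
example_groups = [
    {"name": "non_audio_placeholder", "start": 0, "end": 8378},
    {"name": "audio", "start": 8378, "end": 8546},
]
widths = check_group_boundaries(example_groups, total_dim=8546)
```

A check of this kind fails loudly if a group is resized without updating its neighbours, which is the main risk when one modality's featurization changes.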
## Task Evidence
| Artifact | What it shows |
| --- | --- |
| [`TASK_SUITE_20.md`](TASK_SUITE_20.md) | Reader-facing table for the unified 20-task suite. |
| [`docs/data/task_suite_20.json`](docs/data/task_suite_20.json) | Machine-readable unified 20-task suite for the website and Hugging Face mirrors. |
| [`results/episode_task_suite/summary_report.json`](results/episode_task_suite/summary_report.json) | The original task contracts, chronological split, and minimal/neural metrics. |
| [`results/episode_task_suite/neural_mlp/`](results/episode_task_suite/neural_mlp/) | Matching PyTorch MLP heads for the same task contracts and feature windows. |
| [`results/episode_task_suite/research_directions/`](results/episode_task_suite/research_directions/) | Mapping from the unified 20-task suite to the four Ropedia research directions. |
| [`results/episode_task_suite/research_direction_extensions/`](results/episode_task_suite/research_direction_extensions/) | Four additional coded probes, one per research direction. |
| [`results/episode_task_suite/tier2_task_suite/`](results/episode_task_suite/tier2_task_suite/) | Historical result path for tasks 13-20 in the unified 20-task suite. |
| [`results/episode_task_suite/task_walkthroughs/`](results/episode_task_suite/task_walkthroughs/) | Human-readable research names and case studies explaining input, process modules, output, metric, limitation, and the website task-player data. |
| [`results/audio_ablation/audio_ablation_metrics.csv`](results/audio_ablation/audio_ablation_metrics.csv) | All measured audio rows for the original task contracts across six variants, including no-audio, audio-only, alternate-audio-only, representation replacement, and all-input variants. |
| [`results/audio_ablation/audio_delta_summary.csv`](results/audio_ablation/audio_delta_summary.csv) | Compact per-task audio delta table for quick manual inspection. |
| [`scripts/audio_ablation_and_raw_upgrade.py`](scripts/audio_ablation_and_raw_upgrade.py) | Regenerates audio contribution results from real task-suite artifacts plus the local public-sample MP4. |
| [`scripts/validate_task_surface.py`](scripts/validate_task_surface.py) | Fails publication if public task cards drift back to raw artifact ids or lose their thumbnail/player wiring. |
## Reproducibility
| Artifact | What it shows |
| --- | --- |
| [`REPRODUCIBILITY.md`](REPRODUCIBILITY.md) | Public commands, expected outputs, and non-reproducible boundaries are explicit. |
| [`docs/data/reproducibility_matrix.json`](docs/data/reproducibility_matrix.json) | Machine-readable command matrix for the website and Hub pages. |
| [`notes/reproducibility_audit.md`](notes/reproducibility_audit.md) | The last exact metric rebuild reproduced the public-sample metrics and matched committed artifacts. |
## Public Pages
| Surface | Purpose |
| --- | --- |
| [GitHub Pages dashboard](https://chaoyue0307.github.io/ropedia-xperience-10m-task-suite/) | Primary public website and visual research flow. |
| [GitHub Container package](https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite/pkgs/container/ropedia-xperience-10m-task-suite) | Static dashboard image for local browsing with Docker. |
| [Hugging Face Space](https://huggingface.co/spaces/cy0307/ropedia-xperience-10m-task-suite) | Static app mirror for HF users. |
| [HF artifact dataset](https://huggingface.co/datasets/cy0307/ropedia-xperience-10m-task-suite-artifacts) | Derived CSV/JSON/Markdown/figure artifacts without raw Xperience-10M data. |
| [HF baseline model repo](https://huggingface.co/cy0307/ropedia-xperience-10m-task-baselines) | Lightweight minimal and neural task-head model files. |
| [HF collection](https://huggingface.co/collections/cy0307/ropedia-xperience-10m-task-suite) | One grouped landing page for the Space, artifact dataset, and baseline model repo. |

The public pages are meant to be the normal reader path. Supporting maintenance
checks remain in the repo, but they are not required for understanding the
research project.
## Scale-Up Readiness
| Artifact | Current status |
| --- | --- |
| [`results/omni_finetune/DATA_ACCESS_STATUS.md`](results/omni_finetune/DATA_ACCESS_STATUS.md) | Summarizes the data-readiness checks required before a held-out Qwen3-Omni pilot can report metrics. |
| [`results/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.md`](results/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.md) | Documents the public multi-episode access path, selected 128-episode pilot plan, and data requirements. |
| [`docs/data/omni_finetune_verified_result.json`](docs/data/omni_finetune_verified_result.json) | Compact verified summary for the final selected-episode Qwen3-Omni diagnostic result, including split counts, held-out metrics, quality-target status, and adapter repo. |
| [`results/omni_finetune/verified_public/`](results/omni_finetune/verified_public/) | Public-safe verified held-out result packages. These include metrics, predictions, reports, manifests, training metadata, validation summaries, and audit files, but not raw data or weights. |
| [`results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_multiscale_cap96_v6_rank64_lr5e5_full8gpu_lora_eval_test_full/`](results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_multiscale_cap96_v6_rank64_lr5e5_full8gpu_lora_eval_test_full/) | Current verified Qwen3-Omni v6 public package with 4,032 held-out predictions, 99.90% JSON validity, metrics, reports, training metadata, validation summaries, package audit, and v5/v6 comparison support. |
| [`docs/data/qwen3_v5_v6_comparison.json`](docs/data/qwen3_v5_v6_comparison.json) | Machine-readable comparison showing that v6 improves action macro-F1 and contact accuracy versus v5 while v5 remains stronger on JSON validity, subtask, next-action, transition, and object metrics. |
| [`results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_structured_json_v3_strict_label_prompt_reuse_lora_eval_test_full/`](results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_structured_json_v3_strict_label_prompt_reuse_lora_eval_test_full/) | Historical Qwen3-Omni strict-label v3 public package retained for prompt-contract and regression comparison. |
| [`results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_structured_json_v2_reuse_full8gpu_lora_eval_test_full/`](results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_structured_json_v2_reuse_full8gpu_lora_eval_test_full/) | Historical Qwen3-Omni v2 strict-JSON package retained for prompt-contract and regression comparison. |
| [`https://huggingface.co/cy0307/ropedia-qwen3-omni-lora-128ep`](https://huggingface.co/cy0307/ropedia-qwen3-omni-lora-128ep) | Public LoRA adapter weight repository for the final 128-episode Qwen3-Omni diagnostic run; raw Xperience-10M data and base Qwen weights remain excluded. |
| [`results/omni_finetune/QWEN3_FULL_PARAMETER_GATES_20260609.md`](results/omni_finetune/QWEN3_FULL_PARAMETER_GATES_20260609.md) | Full-parameter Qwen3-Omni feasibility-gate summary: 1/8/32/64-step guarded 8-GPU runs passed, the opportunistic 128-step run was preempted for Qwen v5 handoff, and no full checkpoints or weights are published. |
| [`docs/data/qwen3_full_parameter_gates.json`](docs/data/qwen3_full_parameter_gates.json) | Machine-readable full-parameter feasibility evidence and publication policy for the website and Hugging Face mirrors. |
| [`scripts/omni/defer_qwen3_fullparam_after_verified_qwen.sh`](scripts/omni/defer_qwen3_fullparam_after_verified_qwen.sh) | Waits for a verified Qwen held-out package, then launches a bounded 128-step full-parameter feasibility pilot on the same multiscale v5 dataset with no checkpoints or weights saved. |
| [`docs/data/task_method_20_result_matrix.json`](docs/data/task_method_20_result_matrix.json) | Same-split 128-episode simple and neural baselines reported on the unified 20-task axes, aligned to the 96/16/16 Qwen3-Omni split with source/proxy notes. |
| [`results/omni_finetune/multi_episode_128_task_baselines/summary_report.json`](results/omni_finetune/multi_episode_128_task_baselines/summary_report.json) | Machine-readable split counts, run configuration, simple metrics, neural metrics, and unsupported raw-feature markers for the aligned 128-episode baseline suite. |
| [`scripts/omni/run_128_task_baselines.py`](scripts/omni/run_128_task_baselines.py) | Runner for the aligned 128-episode metadata/text baselines; it consumes the derived Qwen JSONL export locally but does not publish raw data, Qwen weights, or LoRA weights. |
| [`scripts/omni/discover_xperience10m_sources.py`](scripts/omni/discover_xperience10m_sources.py) | Discovery gate for valid multi-episode Xperience-10M sources. |
| [`scripts/omni/train_qwen3_omni_lora.py`](scripts/omni/train_qwen3_omni_lora.py) | Training entrypoint for the Qwen3-Omni LoRA pilot after the data gate passes. |
| [`scripts/omni/run_128_fullsplit_parallel_export_8gpu.sh`](scripts/omni/run_128_fullsplit_parallel_export_8gpu.sh) | Full 96/16/16 launcher with parallel export, 8-process LoRA training, validation-sample monitoring, held-out test evaluation, and quality-target reporting. |
| [`scripts/omni/merge_qwen3_omni_eval_shards.py`](scripts/omni/merge_qwen3_omni_eval_shards.py) | Recomputes held-out metrics from deterministic Qwen eval shards and checks missing or duplicate prediction ids. |
| [`scripts/omni/package_verified_omni_result.py`](scripts/omni/package_verified_omni_result.py) | Creates a contract-driven public-safe package from validated held-out fine-tuning outputs without raw data, base weights, adapter/checkpoint weights, full checkpoints, or large archives. |
| [`scripts/omni/audit_verified_omni_package.py`](scripts/omni/audit_verified_omni_package.py) | Audits a verified package before README, website, or Hugging Face updates by checking validation status, required files, primary metrics, held-out evidence, and forbidden file types. |
| [`scripts/omni/analyze_qwen3_omni_errors.py`](scripts/omni/analyze_qwen3_omni_errors.py) | Computes public-safe held-out error-analysis tables from the verified Qwen3-Omni prediction package. |
| [`scripts/omni/build_qwen3_full_parameter_gate_summary.py`](scripts/omni/build_qwen3_full_parameter_gate_summary.py) | Regenerates the full-parameter feasibility-gate Markdown and JSON summaries from run-local evidence. |
| [`scripts/omni/watch_verified_omni_package.py`](scripts/omni/watch_verified_omni_package.py) | Waits for a passing held-out eval validation and then runs the verified public-safe packager automatically. |
| [`OMNI_MODEL_EXTENSION_CONTRACT.md`](OMNI_MODEL_EXTENSION_CONTRACT.md) | Human-readable contract for adding new model families while preserving the same episode split, held-out evaluation, packaging gate, and public-safety boundary. |
| [`configs/omni_backbones/`](configs/omni_backbones/) | Backbone registry for implemented Qwen3-Omni LoRA plus planned Cosmos-style world-model and VLA/policy branches. |
| [`scripts/omni/backbone_registry.py`](scripts/omni/backbone_registry.py) | Validates each backbone contract, required metrics, required files, split policy, and forbidden public package categories. |
| [`scripts/omni/export_model_neutral_window_index.py`](scripts/omni/export_model_neutral_window_index.py) | Converts Qwen JSONL records into a model-neutral window index that future Cosmos-style and policy/VLA exporters can consume. |
| [`scripts/omni/smoke_test_backbone_packaging.py`](scripts/omni/smoke_test_backbone_packaging.py) | Runs synthetic package-contract checks for every configured backbone, including Qwen3-Omni, Cosmos-style world modeling, and VLA/policy branches. |
| [`scripts/omni/scaffold_omni_backbone.py`](scripts/omni/scaffold_omni_backbone.py) | Creates a validated planned-backbone config from an existing contract template so new model branches inherit split, artifact, and publication rules. |
| [`FOUNDATION_MODEL_PLAN.md`](FOUNDATION_MODEL_PLAN.md) | Post-data-gate backbone selection plan: Qwen3-Omni first, Cosmos 3 for world modeling, and OpenVLA/openpi/GR00T for policy/action branches. |
| [`docs/data/foundation_model_plan.json`](docs/data/foundation_model_plan.json) | Machine-readable model-family registry with source links, entry conditions, and evaluation additions. |
| [`ADDITIONAL_DEVELOPMENT_DIRECTIONS.md`](ADDITIONAL_DEVELOPMENT_DIRECTIONS.md) | Concise reader-facing plan for non-backbone tracks that can be built from Xperience-10M data. |
| [`docs/data/additional_development_directions.json`](docs/data/additional_development_directions.json) | Machine-readable copy of the additional directions for website and Hugging Face surfaces. |
| [`XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md`](XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md) | Future full-corpus Xperience-native pretraining plan; not a current model result. |
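
Two of the checks described in the table above, detecting missing or duplicate prediction ids across eval shards and measuring JSON validity, can be sketched as follows. This is an illustrative approximation, not the logic of `scripts/omni/merge_qwen3_omni_eval_shards.py`; the record fields `id` and `output`, both function names, and the rule that a valid output must parse to a JSON object are assumptions.

```python
import json
from collections import Counter
from typing import Dict, Iterable, List


def check_prediction_ids(shards: List[List[dict]], expected_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Report duplicate, missing, and unexpected prediction ids across shards."""
    counts = Counter(record["id"] for shard in shards for record in shard)
    expected = set(expected_ids)
    return {
        "duplicates": sorted(i for i, n in counts.items() if n > 1),
        "missing": sorted(expected - set(counts)),
        "unexpected": sorted(set(counts) - expected),
    }


def json_validity_rate(outputs: List[str]) -> float:
    """Fraction of model outputs that parse as a JSON object (assumed definition)."""
    if not outputs:
        return 0.0
    valid = 0
    for text in outputs:
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            valid += 1
    return valid / len(outputs)


# Toy shards with hypothetical ids and outputs.
shards = [
    [{"id": "w0", "output": '{"action": "pour"}'}, {"id": "w1", "output": "not json"}],
    [{"id": "w1", "output": '{"action": "stir"}'}],
]
report = check_prediction_ids(shards, expected_ids=["w0", "w1", "w2"])
rate = json_validity_rate([r["output"] for shard in shards for r in shard])
```

In the toy data, `w1` appears in two shards and `w2` is never predicted, so both problems surface in `report` before any metric is recomputed.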
## What Is Not Included
The public repo and Hugging Face mirrors do not redistribute raw Xperience-10M
videos, raw `annotation.hdf5`, gated private dataset files, full Qwen weights,
or large full checkpoints. Dataset use remains governed by the official
Ropedia/Xperience-10M terms.
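
A boundary like this can be enforced mechanically before publishing a package. The sketch below is one possible shape for such a check and is not the implementation in `scripts/omni/audit_verified_omni_package.py`; the suffix list, the 50 MiB size limit, and the function name `find_forbidden_files` are assumptions.

```python
from pathlib import Path
from typing import FrozenSet, List, Tuple

# Assumed examples of file types that would indicate raw data or weights.
FORBIDDEN_SUFFIXES = frozenset({".mp4", ".hdf5", ".safetensors", ".pt", ".ckpt"})
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumed size limit for "large" files


def find_forbidden_files(
    package_dir: str,
    forbidden_suffixes: FrozenSet[str] = FORBIDDEN_SUFFIXES,
    max_bytes: int = MAX_FILE_BYTES,
) -> List[Tuple[str, str]]:
    """Return (relative_path, reason) pairs for files that should not be published."""
    root = Path(package_dir)
    violations = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root).as_posix()
        if path.suffix.lower() in forbidden_suffixes:
            violations.append((relative, "forbidden suffix"))
        elif path.stat().st_size > max_bytes:
            violations.append((relative, "exceeds size limit"))
    return violations
```

An empty return value means the directory passed this particular check; it does not by itself establish compliance with the official dataset terms.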