Artifact Guide
This guide is the human-readable map for the public Ropedia Xperience-10M task suite artifacts. It is organized around what a reader usually wants to do: understand the project, inspect the sample episode, compare baselines, read the task results, follow the Qwen3-Omni scale-up path, and understand the longer Xperience-native pretraining goal.
Start Here
| Artifact | Why to open it first |
|---|---|
| PUBLIC_READER_MAP.md | Chooses the right public surface first: GitHub source, website, HF Space, artifact dataset, baseline model repo, model-branch repos, or release-health files. |
| GLOSSARY.md | Defines terms that can be confused across the repo, website, Hugging Face mirrors, result matrix, Qwen/Cosmos branches, and public-safe package checks. |
| PROJECT_STATUS.md | Gives the fastest current-state table: implemented, being improved, and outside current scope. |
| RESEARCH_ROADMAP.md | Shows the roadmap from public-sample task development to multi-episode data preparation, Qwen3-Omni LoRA, robustness runs, model branches, and the future native-pretraining goal. |
| FOUNDATION_MODEL_PLAN.md | Explains which foundation backbones fit which Xperience-10M objective: Qwen3-Omni first, Cosmos 3 for world modeling, and VLA/policy models after action-target conversion. |
| OMNI_MODEL_EXTENSION_CONTRACT.md | Defines the shared manifest, split, evaluation, packaging, and public-safety contract that future Qwen, Cosmos-style, and VLA/policy branches must satisfy. |
| ADDITIONAL_DEVELOPMENT_DIRECTIONS.md | Records concrete non-backbone development tracks: taxonomy, benchmark protocol, representation learning, skill graphs, affordances, 3D/4D memory, QA, and policy transfer. |
| XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md | Describes the future full-corpus Xperience Embodied Foundation Model goal, including modules, objectives, staged scale-up, hardware ranges, and evaluation. |
| EVALUATION_PROTOCOL.md | Defines the task unit, chronological split, metrics, leakage controls, and current limitations. |
| REPRODUCIBILITY.md | Defines public reproduction commands, expected outputs, and unreproducible boundaries. |
| results/audio_ablation/AUDIO_ABLATION_SUMMARY.md | Shows measured current-audio and raw log-mel replacement deltas across the walkthrough-backed task contracts. |
| docs/single_episode_explorer.html | Gives a static window-level explorer for the public sample episode. |
| XPERIENCE10M_DATASET_CARD_ALIGNMENT.md | Optional detail for readers who need official dataset and access-term context. |
Dataset Context
| Artifact | What it shows |
|---|---|
| docs/data/glossary.json | Machine-readable terminology layer for the website and Hugging Face mirrors. |
| XPERIENCE10M_DATASET_CARD_ALIGNMENT.md | Human-readable summary of the official gated Xperience-10M dataset, public sample, modalities, access terms, intended uses, and limitations. |
| docs/data/xperience10m_dataset_card_alignment.json | Machine-readable dataset-context bundle for the website and Hub pages. |
| SOURCE_ALIGNMENT_AUDIT.md | Supporting provenance note for maintainers who want to inspect how public dataset descriptions were checked. |
| docs/data/source_alignment_audit.json | Machine-readable provenance record for generated project pages. |
| scripts/validate_source_alignment.py | Maintenance script for refreshing the dataset-context note. |
Evaluation Protocol
| Artifact | What it shows |
|---|---|
| EVALUATION_PROTOCOL.md | Human-readable task protocol: window unit, chronological split, input/target contracts, primary metrics, leakage controls, and current limitations. |
| docs/data/evaluation_protocol.json | Machine-readable protocol generated from committed task metrics. |
| scripts/build_evaluation_protocol.py | Regenerates the protocol from docs/data/summary_metrics.json and source task artifacts. |
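
As a rough illustration of the chronological split and leakage control named in the protocol, the sketch below splits window indices in time order and drops a few windows at each boundary. The function name, the split fractions, and the `gap` parameter are assumptions made for this example; the actual values are defined in EVALUATION_PROTOCOL.md, not here.

```python
from typing import List, Tuple


def chronological_split(
    n_windows: int,
    train_frac: float = 0.7,   # assumed fraction, for illustration only
    val_frac: float = 0.15,    # assumed fraction, for illustration only
    gap: int = 0,              # assumed boundary gap, in windows
) -> Tuple[List[int], List[int], List[int]]:
    """Split window indices 0..n_windows-1 in time order.

    `gap` windows are dropped at each boundary so that neighbouring
    windows cannot sit on both sides of a split, which is one simple
    form of leakage control.
    """
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")
    train_end = int(n_windows * train_frac)
    val_end = int(n_windows * (train_frac + val_frac))
    train = list(range(0, train_end))
    val = list(range(train_end + gap, val_end))
    test = list(range(val_end + gap, n_windows))
    return train, val, test


# 1,161 is the window count reported for the public sample episode.
train, val, test = chronological_split(1161, gap=2)
print(len(train), len(val), len(test))
```

Every training index precedes every validation index, which in turn precedes every test index, so no shuffling can move a future window into the training set.
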
Visual Evidence
| Artifact | What it shows |
|---|---|
| FIGURE_INDEX.md | Human-readable catalog of public visual assets, dimensions, hashes, roles, and source scripts. |
| docs/data/figure_index.json | Machine-readable visual asset index mirrored to the website, artifact dataset, and model repo. |
| scripts/build_figure_index.py | Regenerates visual-asset hashes, dimensions, and source-script provenance. |
| docs/data/brand_assets.json | Machine-readable logo/brand manifest for the website, README, Hugging Face cards, favicon, app icon, and social preview. |
| docs/assets/brand/xperience10m-logo-social-card.png | Project logo card used by README and Hugging Face cards. |
| scripts/build_brand_assets.py | Regenerates deterministic logo derivatives, favicon variants, app icons, and the social card from the generated logo mark. |
| docs/assets/task_suite_infographic.png | Primary task-suite map with sample modality thumbnails. |
| docs/assets/pipeline_diagram.png | Episode-to-task pipeline overview. |
| docs/assets/task_architectures.png | Minimal and neural task-head architecture map. |
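
The figure index records a hash and pixel dimensions for each visual asset. A minimal sketch of how such an entry could be computed for a PNG using only the standard library follows; the function name `describe_png` and the returned keys are hypothetical and are not taken from scripts/build_figure_index.py.

```python
import hashlib
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def describe_png(data: bytes) -> dict:
    """Return the SHA-256 digest and pixel dimensions of PNG bytes.

    A PNG starts with an 8-byte signature followed by the IHDR chunk,
    whose payload begins with big-endian 32-bit width and height.
    """
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "width": width,
        "height": height,
    }


# Synthetic header for a 4x3 image; only the fields read above are meaningful.
demo_bytes = (
    PNG_SIGNATURE
    + struct.pack(">I", 13)
    + b"IHDR"
    + struct.pack(">II", 4, 3)
    + b"\x08\x02\x00\x00\x00"
)
entry = describe_png(demo_bytes)
print(entry["width"], entry["height"], entry["sha256"][:12])
```

For a real asset, the bytes would come from something like `Path("docs/assets/pipeline_diagram.png").read_bytes()`.
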
Data Contract
| Artifact | What it shows |
|---|---|
| results/episode_task_suite/windows.csv | The sample episode is converted into 1,161 aligned 20-frame windows. |
| results/episode_task_suite/feature_manifest.json | The current input vector has 8,546 dimensions with explicit modality-group boundaries, including a 168-d audio group. |
| results/episode_task_suite/available_modalities.json | The sample modality coverage is recorded, including the current audio-featurization status. |
| results/audio_ablation/raw_logmel_fisheye_cam0_sr16000_mels64_fft512_hop160.npz | Derived 588-d raw log-mel window features decoded from the local public-sample MP4 audio stream; raw audio itself is not redistributed. |
| docs/data/modality_atlas.json | The responsive website modality cards and derived thumbnail assets are documented without redistributing raw data. |
| docs/assets/modalities/ | Small public-sample thumbnails used by the readable modality atlas. |
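
The log-mel artifact name appears to encode its extraction settings. The sketch below parses those tokens and derives the hop and FFT window durations. Reading `sr`, `mels`, `fft`, and `hop` as sample rate, mel-band count, FFT size, and hop length in samples is an assumption based on common naming, and `parse_logmel_name` is a hypothetical helper.

```python
import re
from typing import Dict

NAME = "raw_logmel_fisheye_cam0_sr16000_mels64_fft512_hop160.npz"


def parse_logmel_name(name: str) -> Dict[str, int]:
    """Extract the integer that follows each known token in the file name."""
    params: Dict[str, int] = {}
    for key in ("sr", "mels", "fft", "hop"):
        match = re.search(rf"_{key}(\d+)(?=[_.])", name)
        if match is None:
            raise ValueError(f"missing '{key}' token in {name!r}")
        params[key] = int(match.group(1))
    return params


params = parse_logmel_name(NAME)
# Durations in milliseconds, assuming fft and hop are counted in samples.
hop_ms = 1000 * params["hop"] / params["sr"]
fft_ms = 1000 * params["fft"] / params["sr"]
print(params, hop_ms, fft_ms)
```

Under that reading, the settings correspond to a 10 ms hop and a 32 ms analysis window at 16 kHz.
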
Task Evidence
| Artifact | What it shows |
|---|---|
| TASK_SUITE_20.md | Reader-facing table for the unified 20-task suite. |
| docs/data/task_suite_20.json | Machine-readable unified 20-task suite for the website and Hugging Face mirrors. |
| results/episode_task_suite/summary_report.json | The walkthrough-backed task contracts, chronological split, and minimal/neural metrics. |
| results/episode_task_suite/neural_mlp/ | Matching PyTorch MLP heads for the same task contracts and feature windows. |
| results/episode_task_suite/research_directions/ | Mapping from the unified 20-task suite to the four Ropedia research directions. |
| results/episode_task_suite/research_direction_extensions/ | Four additional coded probes, one per research direction. |
| results/episode_task_suite/tier2_task_suite/ | Historical provenance path inside the unified 20-task suite. |
| results/episode_task_suite/task_walkthroughs/ | Human-readable research names and case studies explaining input, process modules, output, metric, limitation, and the website task-player data. |
| results/audio_ablation/audio_ablation_metrics.csv | All measured audio rows for the walkthrough-backed task contracts across six variants, including no-audio, audio-only, alternate-audio-only, representation replacement, and all-input variants. |
| results/audio_ablation/audio_delta_summary.csv | Compact per-task audio delta table for quick manual inspection. |
| scripts/audio_ablation_and_raw_upgrade.py | Regenerates audio contribution results from real task-suite artifacts plus the local public-sample MP4. |
| scripts/validate_task_surface.py | Fails publication if public task cards drift back to raw artifact ids or lose their thumbnail/player wiring. |
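
A per-task audio delta can be read as the score with audio minus the score without it. The sketch below computes that from long-format rows. The column names (`task`, `variant`, `score`), the variant labels, and the values are synthetic placeholders; the real schema of audio_ablation_metrics.csv may differ.

```python
import csv
import io
from collections import defaultdict
from typing import Dict

# Synthetic placeholder rows; not measured results.
SAMPLE_CSV = """task,variant,score
task_a,no_audio,0.5
task_a,all_inputs,0.75
task_b,no_audio,0.625
task_b,all_inputs,0.5
"""


def audio_deltas(
    csv_text: str,
    with_audio: str = "all_inputs",
    without_audio: str = "no_audio",
) -> Dict[str, float]:
    """Return score(with_audio) - score(without_audio) for each task."""
    scores: Dict[str, Dict[str, float]] = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        scores[row["task"]][row["variant"]] = float(row["score"])
    # Tasks lacking either variant are skipped rather than guessed.
    return {
        task: by_variant[with_audio] - by_variant[without_audio]
        for task, by_variant in scores.items()
        if with_audio in by_variant and without_audio in by_variant
    }


deltas = audio_deltas(SAMPLE_CSV)
print(deltas)
```

A positive delta means the audio-inclusive variant scored higher on that task; a negative delta means adding audio lowered the score.
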
Reproducibility
| Artifact | What it shows |
|---|---|
| REPRODUCIBILITY.md | Public commands, expected outputs, and non-reproducible boundaries are explicit. |
| docs/data/reproducibility_matrix.json | Machine-readable command matrix for the website and Hub pages. |
| notes/reproducibility_audit.md | The last exact metric rebuild reproduced the public-sample metrics and matched committed artifacts. |
Public Pages
| Surface | Purpose |
|---|---|
| GitHub Pages dashboard | Primary public website and visual research flow. |
| GitHub Container package | Static dashboard image for local browsing with Docker. |
| Hugging Face Space | Static app mirror for HF users. |
| HF artifact dataset | Derived CSV/JSON/Markdown/figure artifacts without raw Xperience-10M data. |
| HF baseline model repo | Lightweight minimal and neural task-head model files. |
| HF collection | One grouped landing page for the Space, artifact dataset, and baseline model repo. |
The public pages are meant to be the normal reader path. Supporting maintenance checks remain in the repo, but they are not required for understanding the research project.
Scale-Up Readiness
| Artifact | Current status |
|---|---|
| results/omni_finetune/DATA_ACCESS_STATUS.md | Summarizes the data-readiness checks required before a held-out Qwen3-Omni pilot can report metrics. |
| results/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.md | Documents the public multi-episode access path, selected 128-episode pilot plan, and data requirements. |
| docs/data/omni_finetune_verified_result.json | Compact verified summary for the final selected-episode Qwen3-Omni diagnostic result, including split counts, held-out metrics, quality-target status, and adapter repo. |
| results/omni_finetune/verified_public/ | Public-safe verified held-out result packages. These include metrics, predictions, reports, manifests, training metadata, validation summaries, and audit files, but not raw data or weights. |
| results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_multiscale_cap96_v6_rank64_lr5e5_full8gpu_lora_eval_test_full/ | Current verified Qwen3-Omni v6 public package with 4,032 held-out predictions, 99.90% JSON validity, metrics, reports, training metadata, validation summaries, package audit, and v5/v6 comparison support. |
| docs/data/qwen3_v5_v6_comparison.json | Machine-readable comparison showing that v6 improves action macro-F1 and contact accuracy versus v5 while v5 remains stronger on JSON validity, subtask, next-action, transition, and object metrics. |
| results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_structured_json_v3_strict_label_prompt_reuse_lora_eval_test_full/ | Historical Qwen3-Omni strict-label v3 public package retained for prompt-contract and regression comparison. |
| results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_structured_json_v2_reuse_full8gpu_lora_eval_test_full/ | Historical Qwen3-Omni v2 strict-JSON package retained for prompt-contract and regression comparison. |
| https://huggingface.co/cy0307/ropedia-qwen3-omni-lora-128ep | Public LoRA adapter weight repository for the final 128-episode Qwen3-Omni diagnostic run; raw Xperience-10M data and base Qwen weights remain excluded. |
| results/omni_finetune/QWEN3_FULL_PARAMETER_GATES_20260609.md | Full-parameter Qwen3-Omni feasibility-gate summary: 1/8/32/64-step guarded 8-GPU runs passed, the opportunistic 128-step run was preempted for Qwen v5 handoff, and no full checkpoints or weights are published. |
| docs/data/qwen3_full_parameter_gates.json | Machine-readable full-parameter feasibility evidence and publication policy for the website and Hugging Face mirrors. |
| scripts/omni/defer_qwen3_fullparam_after_verified_qwen.sh | Waits for a verified Qwen held-out package, then launches a bounded 128-step full-parameter feasibility pilot on the same multiscale v5 dataset with no checkpoints or weights saved. |
| docs/data/task_method_20_result_matrix.json | Same-split 128-episode simple and neural baselines reported on the unified 20-task axes, aligned to the 96/16/16 Qwen3-Omni split with source/proxy notes. |
| results/omni_finetune/multi_episode_128_task_baselines/summary_report.json | Machine-readable split counts, run configuration, simple metrics, neural metrics, and unsupported raw-feature markers for the aligned 128-episode baseline suite. |
| scripts/omni/run_128_task_baselines.py | Runner for the aligned 128-episode metadata/text baselines; it consumes the derived Qwen JSONL export locally but does not publish raw data, Qwen weights, or LoRA weights. |
| scripts/omni/discover_xperience10m_sources.py | Discovery gate for valid multi-episode Xperience-10M sources. |
| scripts/omni/train_qwen3_omni_lora.py | Training entrypoint for the Qwen3-Omni LoRA pilot after the data gate passes. |
| scripts/omni/run_128_fullsplit_parallel_export_8gpu.sh | Full 96/16/16 launcher with parallel export, 8-process LoRA training, validation-sample monitoring, held-out test evaluation, and quality-target reporting. |
| scripts/omni/merge_qwen3_omni_eval_shards.py | Recomputes held-out metrics from deterministic Qwen eval shards and checks missing or duplicate prediction ids. |
| scripts/omni/package_verified_omni_result.py | Creates a contract-driven public-safe package from validated held-out fine-tuning outputs without raw data, base weights, adapter/checkpoint weights, full checkpoints, or large archives. |
| scripts/omni/audit_verified_omni_package.py | Audits a verified package before README, website, or Hugging Face updates by checking validation status, required files, primary metrics, held-out evidence, and forbidden file types. |
| scripts/omni/analyze_qwen3_omni_errors.py | Computes public-safe held-out error-analysis tables from the verified Qwen3-Omni prediction package. |
| scripts/omni/build_qwen3_full_parameter_gate_summary.py | Regenerates the full-parameter feasibility-gate Markdown and JSON summaries from run-local evidence. |
| scripts/omni/watch_verified_omni_package.py | Waits for a passing held-out eval validation and then runs the verified public-safe packager automatically. |
| OMNI_MODEL_EXTENSION_CONTRACT.md | Human-readable contract for adding new model families while preserving the same episode split, held-out evaluation, packaging gate, and public-safety boundary. |
| configs/omni_backbones/ | Backbone registry for implemented Qwen3-Omni LoRA plus planned Cosmos-style world-model and VLA/policy branches. |
| scripts/omni/backbone_registry.py | Validates each backbone contract, required metrics, required files, split policy, and forbidden public package categories. |
| scripts/omni/export_model_neutral_window_index.py | Converts Qwen JSONL records into a model-neutral window index that future Cosmos-style and policy/VLA exporters can consume. |
| scripts/omni/smoke_test_backbone_packaging.py | Runs synthetic package-contract checks for every configured backbone, including Qwen3-Omni, Cosmos-style world modeling, and VLA/policy branches. |
| scripts/omni/scaffold_omni_backbone.py | Creates a validated planned-backbone config from an existing contract template so new model branches inherit split, artifact, and publication rules. |
| FOUNDATION_MODEL_PLAN.md | Adds the post-data-gate backbone selection plan: Qwen3-Omni first, Cosmos 3 for world modeling, and OpenVLA/openpi/GR00T for policy/action branches. |
| docs/data/foundation_model_plan.json | Machine-readable model-family registry with source links, entry conditions, and evaluation additions. |
| ADDITIONAL_DEVELOPMENT_DIRECTIONS.md | Concise reader-facing plan for non-backbone tracks that can be built from Xperience-10M data. |
| docs/data/additional_development_directions.json | Machine-readable copy of the additional directions for website and Hugging Face surfaces. |
| XPERIENCE_EMBODIED_FOUNDATION_MODEL_PRETRAINING.md | Future full-corpus Xperience-native pretraining plan; not a current model result. |
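
The shard-merge step in the table checks for missing or duplicate prediction ids, and the verified package reports a JSON validity rate. A minimal sketch of both checks follows. The record shape (`id`, `prediction`), the function names, and the sample records are hypothetical and are not taken from scripts/omni/merge_qwen3_omni_eval_shards.py.

```python
import json
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

Record = Dict[str, str]  # assumed shape: {"id": ..., "prediction": ...}


def merge_shards(
    shards: Iterable[List[Record]], expected_ids: Set[str]
) -> Tuple[List[Record], List[str], List[str]]:
    """Concatenate shards in id order and report missing and duplicate ids."""
    merged = sorted(
        (rec for shard in shards for rec in shard), key=lambda r: r["id"]
    )
    counts = Counter(rec["id"] for rec in merged)
    missing = sorted(expected_ids - set(counts))
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    return merged, missing, duplicates


def json_validity(records: List[Record]) -> float:
    """Fraction of predictions that parse as a JSON object."""
    def is_valid(text: str) -> bool:
        try:
            return isinstance(json.loads(text), dict)
        except json.JSONDecodeError:
            return False

    if not records:
        return 0.0
    return sum(is_valid(r["prediction"]) for r in records) / len(records)


# Synthetic shards: "w1" appears twice and "w2" never appears.
shard_0 = [
    {"id": "w0", "prediction": '{"action": "open"}'},
    {"id": "w1", "prediction": "not json"},
]
shard_1 = [
    {"id": "w1", "prediction": '{"action": "close"}'},
    {"id": "w3", "prediction": '{"action": "pour"}'},
]
merged, missing, duplicates = merge_shards(
    [shard_0, shard_1], {"w0", "w1", "w2", "w3"}
)
validity = json_validity(merged)
print(missing, duplicates, validity)
```

A gate built on this idea would refuse to report metrics unless both `missing` and `duplicates` are empty.
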
What Is Not Included
The public repo and Hugging Face mirrors do not redistribute raw Xperience-10M
videos, raw annotation.hdf5, gated private dataset files, full Qwen weights,
or large full checkpoints. Dataset use remains governed by the official
Ropedia/Xperience-10M terms.
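
One way to enforce an exclusion list like this before publishing is a suffix-based scan of the package file list. The sketch below is illustrative only: the deny-list of suffixes and the function name `find_forbidden` are assumptions, and the project's own audit may classify files differently.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Hypothetical deny-list covering raw media, raw annotations, and weight files.
FORBIDDEN_SUFFIXES = {".hdf5", ".mp4", ".safetensors", ".ckpt", ".pt", ".bin"}


def find_forbidden(paths: Iterable[str]) -> List[str]:
    """Return the paths whose file suffix is on the deny-list, sorted."""
    return sorted(
        p for p in paths
        if PurePosixPath(p).suffix.lower() in FORBIDDEN_SUFFIXES
    )


# Synthetic package listing for illustration.
package = [
    "metrics.json",
    "predictions.jsonl",
    "README.md",
    "adapter_model.safetensors",
    "raw/annotation.hdf5",
]
offending = find_forbidden(package)
print(offending)
```

A suffix check is only a first line of defence; it does not catch a forbidden file that has been renamed or placed inside an archive.
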