Public Reader Map
This project is intentionally evidence-heavy: the GitHub repo, website, and Hugging Face mirrors each expose a different view of the same research package. Use this map to choose the right entry point without losing the full artifact trail.
Fast Paths
| Reader goal | Start here | Then inspect |
|---|---|---|
| Understand the project in one pass | PROJECT_BRIEF.md | PROJECT_STATUS.md, RESEARCH_TAKEAWAYS.md |
| Understand the two evidence lines | TWO_EVIDENCE_LINES.md | docs/data/two_evidence_lines.json, docs/data/two_evidence_line_result_summary.json |
| See the visual public dashboard | GitHub Pages or the HF Space | docs/index.html, docs/data/project_packet.json |
| Decode project terminology | GLOSSARY.md | docs/data/glossary.json, homepage Glossary section |
| Understand the data unit | results/episode_task_suite/windows.csv | results/episode_task_suite/feature_manifest.json, docs/data/raw_sample_files.json |
| Trace the 128-episode split | XPERIENCE10M_128_EPISODE_FEATURE_INDEX.md | docs/data/xperience10m_128_episode_feature_index.json, results/omni_finetune/xperience10m_128_episode_selection.csv |
| Inspect the 20-task benchmark | TASK_SUITE_20.md | docs/data/task_suite_20.json, EVALUATION_PROTOCOL.md |
| Compare current results | RESEARCH_TAKEAWAYS.md | docs/data/task_method_20_result_matrix.json, docs/data/unified_task_model_radar.json |
| Compare 1-episode and 128-episode methods | Homepage radar section | docs/data/single_episode_task_model_radar.json, docs/data/episode128_task_model_radar.json |
| Read Qwen3-Omni v1-v6 correctly | QWEN3_OMNI_RUN_LINEAGE.md | docs/data/qwen3_omni_run_lineage.json, docs/data/qwen3_v5_v6_comparison.json |
| Find all derived artifacts | ARTIFACT_GUIDE.md | HF artifact dataset, docs/data/artifact_index.json |
| Download model weights with their matching results | Hugging Face weights/results repo | manifest.json, analysis/docs/data/task_method_20_result_matrix.json, results/ |
| Reproduce or extend the work | REPRODUCIBILITY.md | QUALITY_GATES.md, scripts/, results/ |
| Understand foundation-model directions | THREE_FOUNDATION_PIPELINES.md | FOUNDATION_MODEL_PLAN.md, docs/data/three_foundation_pipelines.json |
| Check public-release health | PUBLIC_SURFACE_QA.md | docs/data/live_publication_status.json, docs/data/mirror_parity.json |
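
Most of the "Then inspect" entries above are JSON files under docs/data/. Their schemas are not described in this map, so the following is only a minimal sketch for getting a first look at any one of them without assuming its structure. The helper names (`summarize`, `describe_top_level`, `describe_file`) are hypothetical and not part of the repo's scripts.

```python
import json
from pathlib import Path


def summarize(value):
    """Return a short, schema-agnostic description of one JSON value."""
    if isinstance(value, dict):
        return f"object with {len(value)} keys"
    if isinstance(value, list):
        return f"array with {len(value)} items"
    return type(value).__name__


def describe_top_level(data):
    """Describe each top-level key of a parsed JSON document.

    Non-object roots (for example a bare array) are reported under
    the placeholder key "<root>".
    """
    if isinstance(data, dict):
        return {key: summarize(val) for key, val in data.items()}
    return {"<root>": summarize(data)}


def describe_file(path):
    """Load a JSON file from disk and describe its top-level shape."""
    with Path(path).open(encoding="utf-8") as handle:
        return describe_top_level(json.load(handle))
```

From a local checkout, `describe_file("docs/data/task_suite_20.json")` would list that file's top-level keys with a one-line description of each, which is usually enough to decide whether to open the full file.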
Public Surfaces
| Surface | Responsibility | Best use |
|---|---|---|
| GitHub repo | Source of truth for docs, scripts, generated data, validators, and commit history | Auditing implementation and citing exact files |
| GitHub Pages dashboard | Reader-facing visual overview of the dataset sample, tasks, methods, results, directions, and resources | Understanding the project quickly |
| Hugging Face Space | Hub-hosted copy of the dashboard and static app assets | Sharing the visual dashboard from HF |
| HF artifact dataset | Public-safe derived artifacts, reports, metrics, website JSON, and sanitized model result packages | Downloading evidence bundles |
| HF baseline model repo | Baseline weights, metrics, figures, and mirrored task artifacts | Reusing compact baseline outputs |
| HF weights/results repo | Consolidated baseline weights, Qwen3-Omni v6 LoRA, Cosmos3-Super adapter/result artifacts, verified results, analysis files, and file-level manifest | Auditing all public-safe weight-bearing artifacts from one repo |
| Qwen3-Omni and Cosmos3 model repos | Adapter-specific public weights or package cards when a run is verified and publishable | Inspecting Qwen3-Omni and Cosmos3 artifacts |
Evidence Views
- Dataset/source boundary: upstream Xperience-10M links, public sample scope, raw-data exclusion, and derived-file policy.
- Glossary: definitions for evidence line, 20-frame window, compact-proxy score, Qwen v1-v6, Cosmos3 branches, LoRA adapters, and HF surfaces.
- Data contract: 20-frame windows, feature blocks, modality availability, split policy, and leakage controls.
- Task suite: 20 named tasks with inputs, outputs, metrics, baseline artifacts, and walkthroughs.
- Results: Line 1 minimal/neural heads, Line 2 selected-128 aligned baselines, Qwen3-Omni v6 diagnostics, Cosmos diagnostics, radar views, and explicit direct-vs-proxy labels.
- Foundation directions: spatial intelligence, human-video world modeling, and vision-language-action training pipelines.
- Public-release checks: website integrity, source alignment, mirror parity, publication package scan, and live URL/hash verification.
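
The mirror-parity and hash-verification checks listed above can be approximated locally. The sketch below is not the project's validator; it assumes you have two local copies of the same directory (for example a GitHub checkout and a downloaded Hugging Face mirror) and compares them file by file using SHA-256. All function names here are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's POSIX-style relative path to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_manifests(left: dict[str, str], right: dict[str, str]) -> dict[str, list[str]]:
    """Report files present on one side only and files whose digests differ."""
    shared = set(left) & set(right)
    return {
        "only_in_left": sorted(set(left) - set(right)),
        "only_in_right": sorted(set(right) - set(left)),
        "changed": sorted(k for k in shared if left[k] != right[k]),
    }
```

A typical call would be `diff_manifests(build_manifest(Path("repo/docs/data")), build_manifest(Path("hf_mirror/docs/data")))`, where both directory names are placeholders. Three empty lists mean the two copies are byte-identical for every file found.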
Reading Scope
| Topic | Public evidence | Scope note |
|---|---|---|
| Single public-sample task behavior | results/episode_task_suite/, docs/data/task_suite_20.json | Describes one public sample episode, not the full dataset distribution |
| 128-episode method comparison | XPERIENCE10M_128_EPISODE_FEATURE_INDEX.md, docs/data/xperience10m_128_episode_feature_index.json, results/omni_finetune/*128*, docs/data/omni_model_comparison.json | Uses selected held-out episodes and derived public-safe summaries; official raw files remain gated upstream |
| Qwen3-Omni v1-v6 lineage | QWEN3_OMNI_RUN_LINEAGE.md, docs/data/qwen3_omni_run_lineage.json | v1-v4 are pipeline/ablation evidence, v5 is the pinned prior release, and v6 is the current public 20-task Qwen row |
| Foundation-model track quality | Verified Qwen3-Omni and Cosmos3 result packages and model cards | Numeric task scores appear only when a task-specific eval/probe exists |
| Reproducibility | REPRODUCIBILITY.md, QUALITY_GATES.md, release validators | Raw gated Xperience-10M files and full foundation weights are not redistributed |