
Public Reader Map

This project is intentionally evidence-heavy: the GitHub repo, website, and Hugging Face mirrors each expose a different view of the same research package. Use this map to choose the right entry point without losing the full artifact trail.

Fast Paths

| Reader goal | Start here | Then inspect |
| --- | --- | --- |
| Understand the project in one pass | PROJECT_BRIEF.md | PROJECT_STATUS.md, RESEARCH_TAKEAWAYS.md |
| Understand the two evidence lines | TWO_EVIDENCE_LINES.md | docs/data/two_evidence_lines.json, docs/data/two_evidence_line_result_summary.json |
| See the visual public dashboard | GitHub Pages or the HF Space | docs/index.html, docs/data/project_packet.json |
| Decode project terminology | GLOSSARY.md | docs/data/glossary.json, homepage Glossary section |
| Understand the data unit | results/episode_task_suite/windows.csv | results/episode_task_suite/feature_manifest.json, docs/data/raw_sample_files.json |
| Trace the 128-episode split | XPERIENCE10M_128_EPISODE_FEATURE_INDEX.md | docs/data/xperience10m_128_episode_feature_index.json, results/omni_finetune/xperience10m_128_episode_selection.csv |
| Inspect the 20-task benchmark | TASK_SUITE_20.md | docs/data/task_suite_20.json, EVALUATION_PROTOCOL.md |
| Compare current results | RESEARCH_TAKEAWAYS.md | docs/data/task_method_20_result_matrix.json, docs/data/unified_task_model_radar.json |
| Compare 1-episode and 128-episode methods | Homepage radar section | docs/data/single_episode_task_model_radar.json, docs/data/episode128_task_model_radar.json |
| Read Qwen3-Omni v1-v6 correctly | QWEN3_OMNI_RUN_LINEAGE.md | docs/data/qwen3_omni_run_lineage.json, docs/data/qwen3_v5_v6_comparison.json |
| Find all derived artifacts | ARTIFACT_GUIDE.md | HF artifact dataset, docs/data/artifact_index.json |
| Download model weights with their matching results | Hugging Face weights/results repo | manifest.json, analysis/docs/data/task_method_20_result_matrix.json, results/ |
| Reproduce or extend the work | REPRODUCIBILITY.md | QUALITY_GATES.md, scripts/, results/ |
| Understand foundation-model directions | THREE_FOUNDATION_PIPELINES.md | FOUNDATION_MODEL_PLAN.md, docs/data/three_foundation_pipelines.json |
| Check public-release health | PUBLIC_SURFACE_QA.md | docs/data/live_publication_status.json, docs/data/mirror_parity.json |

Public Surfaces

| Surface | Responsibility | Best use |
| --- | --- | --- |
| GitHub repo | Source of truth for docs, scripts, generated data, validators, and commit history | Auditing implementation and citing exact files |
| GitHub Pages dashboard | Reader-facing visual overview of the dataset sample, tasks, methods, results, directions, and resources | Understanding the project quickly |
| Hugging Face Space | Hub-hosted copy of the dashboard and static app assets | Sharing the visual dashboard from HF |
| HF artifact dataset | Public-safe derived artifacts, reports, metrics, website JSON, and sanitized model result packages | Downloading evidence bundles |
| HF baseline model repo | Baseline weights, metrics, figures, and mirrored task artifacts | Reusing compact baseline outputs |
| HF weights/results repo | Consolidated baseline weights, Qwen3-Omni v6 LoRA, Cosmos3-Super adapter/result artifacts, verified results, analysis files, and file-level manifest | Auditing all public-safe weight-bearing artifacts from one repo |
| Qwen3-Omni and Cosmos3 model repos | Adapter-specific public weights or package cards when a run is verified and publishable | Inspecting Qwen3-Omni and Cosmos3 artifacts |

Evidence Views

  1. Dataset/source boundary: upstream Xperience-10M links, public sample scope, raw-data exclusion, and derived-file policy.
  2. Glossary: definitions for evidence line, 20-frame window, compact-proxy score, Qwen v1-v6, Cosmos3 branches, LoRA adapters, and HF surfaces.
  3. Data contract: 20-frame windows, feature blocks, modality availability, split policy, and leakage controls.
  4. Task suite: 20 named tasks with inputs, outputs, metrics, baseline artifacts, and walkthroughs.
  5. Results: Line 1 minimal/neural heads, Line 2 selected-128 aligned baselines, Qwen3-Omni v6 diagnostics, Cosmos diagnostics, radar views, and explicit direct-vs-proxy labels.
  6. Foundation directions: spatial intelligence, human-video world modeling, and vision-language-action training pipelines.
  7. Public-release checks: website integrity, source alignment, mirror parity, publication package scan, and live URL/hash verification.
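
The mirror-parity and hash-verification checks in item 7 can be illustrated with a small sketch. It is not the project's release validator: it assumes two local checkouts of the same files (for example a GitHub clone and a downloaded HF mirror), and the function names `sha256_of` and `parity_report` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 64 KiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parity_report(primary: Path, mirror: Path) -> dict:
    """Compare two directory trees by relative path and content hash.

    Hypothetical helper: the real validators may compare a different
    file set or use a different report shape.
    """
    def index(root: Path) -> dict:
        # Map each file's POSIX-style relative path to its digest.
        return {
            p.relative_to(root).as_posix(): sha256_of(p)
            for p in sorted(root.rglob("*"))
            if p.is_file()
        }

    a, b = index(primary), index(mirror)
    shared = a.keys() & b.keys()
    return {
        "missing_in_mirror": sorted(a.keys() - b.keys()),
        "extra_in_mirror": sorted(b.keys() - a.keys()),
        "hash_mismatch": sorted(k for k in shared if a[k] != b[k]),
    }
```

Calling `parity_report(Path("github_clone/docs/data"), Path("hf_mirror/docs/data"))` on two local copies (both paths are placeholders) returns three sorted lists; all three being empty means the trees match byte for byte.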

Reading Scope

| Topic | Public evidence | Scope note |
| --- | --- | --- |
| Single public-sample task behavior | results/episode_task_suite/, docs/data/task_suite_20.json | Describes one public sample episode, not the full dataset distribution |
| 128-episode method comparison | XPERIENCE10M_128_EPISODE_FEATURE_INDEX.md, docs/data/xperience10m_128_episode_feature_index.json, `results/omni_finetune/*128*`, docs/data/omni_model_comparison.json | Uses selected held-out episodes and derived public-safe summaries; official raw files remain gated upstream |
| Qwen3-Omni v1-v6 lineage | QWEN3_OMNI_RUN_LINEAGE.md, docs/data/qwen3_omni_run_lineage.json | v1-v4 are pipeline/ablation evidence, v5 is the pinned prior release, and v6 is the current public 20-task Qwen row |
| Foundation-model track quality | Verified Qwen3-Omni and Cosmos3 result packages and model cards | Numeric task scores appear only when a task-specific eval/probe exists |
| Reproducibility | REPRODUCIBILITY.md, QUALITY_GATES.md, release validators | Raw gated Xperience-10M files and full foundation weights are not redistributed |
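
The scope notes above rely on 20-frame windows and held-out episodes. The sketch below shows one common way such a split avoids leakage. It is an illustration under stated assumptions, not the documented protocol: it assumes non-overlapping windows and a split made at the episode level, and the names `make_windows`, `assign_split`, and `leaked_episodes` are hypothetical. EVALUATION_PROTOCOL.md remains the reference for the actual policy.

```python
WINDOW = 20  # frames per window, as named in the data contract


def make_windows(n_frames: int, window: int = WINDOW) -> list:
    """Return non-overlapping (start, end) frame ranges.

    Assumption: windows do not overlap and a trailing partial
    window is dropped; the real pipeline may use a stride.
    """
    return [(s, s + window) for s in range(0, n_frames - window + 1, window)]


def assign_split(windows: list, held_out_episodes: set) -> dict:
    """Place every window of an episode on the same side of the split.

    `windows` is a list of (episode_id, start, end) tuples.
    """
    splits = {"train": [], "held_out": []}
    for item in windows:
        side = "held_out" if item[0] in held_out_episodes else "train"
        splits[side].append(item)
    return splits


def leaked_episodes(splits: dict) -> set:
    """Return episode ids that appear on both sides (empty when clean)."""
    train_ids = {item[0] for item in splits["train"]}
    held_ids = {item[0] for item in splits["held_out"]}
    return train_ids & held_ids
```

Splitting by episode rather than by window keeps adjacent, highly correlated windows from the same recording out of both train and held-out sets, which is the usual reason for this kind of control.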