Evidence Contract
This project is organized as a research-development workspace. Every visible project statement should point to a local artifact that a reader can inspect before using the dashboard as a basis for further work.
| Project statement | Current evidence | Status | Current scope |
|---|---|---|---|
| A first-pass reader has a compact current-state summary. | PROJECT_STATUS.md, docs/data/project_status.json |
Verified guide | Summarizes existing evidence and current limitations |
| The staged research path is explicit. | RESEARCH_ROADMAP.md, docs/data/research_roadmap.json |
Current roadmap | Connects public-sample task development to multi-episode staging, Qwen3-Omni LoRA, robustness runs, and larger omni-model extensions |
| The public dataset description is aligned with the official gated Xperience-10M dataset card and public sample card. | XPERIENCE10M_DATASET_CARD_ALIGNMENT.md, docs/data/xperience10m_dataset_card_alignment.json |
Verified description alignment | Summarizes upstream public metadata, API listing facts, sample license/tooling, and card facts; does not grant access or mirror raw data |
| Source facts, sample details, API-listing notes, and project coverage are aligned across repo, website, and HF cards. | SOURCE_ALIGNMENT_AUDIT.md, docs/data/source_alignment_audit.json, scripts/validate_source_alignment.py |
Source alignment recorded | Offline committed-fact report; does not fetch private gated data |
| Public figures are indexed as project evidence. | FIGURE_INDEX.md, docs/data/figure_index.json, scripts/build_figure_index.py |
Verified visual evidence | Derived figures and thumbnails only; does not include raw MP4/HDF5/RRD data |
| The project logo is consistently packaged across public surfaces. | docs/data/brand_assets.json, docs/assets/brand/, scripts/build_brand_assets.py |
Verified brand packaging | Generated presentation assets only; does not contain raw Xperience-10M data or model weights |
| The public Xperience-10M sample has been converted into aligned model windows. | results/episode_task_suite/windows.csv, results/episode_task_suite/shared_windows.npz, results/episode_task_suite/summary_report.json |
Verified for 5,821 frames and 1,161 windows | One public sample episode only |
| The current feature contract is explicit and inspectable. | results/episode_task_suite/feature_manifest.json, results/episode_task_suite/available_modalities.json |
Verified for an 8,546-d feature vector | AAC audio is decoded from fisheye_cam0.mp4 into a 168-d feature block |
| The task evaluation protocol is explicit and generated from committed metrics. | EVALUATION_PROTOCOL.md, docs/data/evaluation_protocol.json, scripts/build_evaluation_protocol.py |
Verified protocol | Defines windows, split, per-task metrics, leakage controls, and current limitations |
| The public sample modalities are inspectable without raw data redistribution. | docs/data/modality_atlas.json, docs/assets/modalities/, website modality atlas |
Verified derived thumbnail atlas | Thumbnails are presentation assets, not a replacement for official raw data access |
| Public task cards stay readable for non-expert readers. | docs/data/task_surface_integrity.json, scripts/validate_task_surface.py, website task cards/player |
Task-surface report | Presentation layer only; it does not add model quality or new data |
| The 12 task heads are real scripts and artifacts, not presentation placeholders. | scripts/episode_task_suite.py, results/episode_task_suite/*/metrics.json, results/episode_task_suite/*/predictions.* |
Verified for all 12 task definitions | Chronological single-episode split, not cross-episode generalization |
| Minimal and neural heads use the same task contracts. | scripts/neural_task_models.py, results/episode_task_suite/neural_mlp/, docs/assets/task_architectures.png |
Verified for 12 minimal heads and 12 neural MLP heads | Small heads only; not a foundation model |
| Four Ropedia research directions are mapped honestly as direct, proxy, or diagnostic evidence. | results/episode_task_suite/research_directions/research_direction_taxonomy.json, docs/data/research_directions.json |
Verified taxonomy | Some directions remain proxy-only |
| Four extra direction probes are coded and evaluated. | results/episode_task_suite/research_direction_extensions/research_direction_extension_results.json, docs/data/research_direction_extensions.json |
Verified single-episode probes | Not full human modeling, neural rendering, intent modeling, or world modeling solutions |
| Qwen3-Omni infrastructure has passed setup checks. | results/omni_finetune/RUN_REPORT.md, results/omni_finetune/dataset_manifest.json, results/omni_finetune/metrics_eval.json |
Setup-stage evidence | One episode, 128 train windows; full metrics require the 32-episode pilot |
| The 32-episode LoRA pilot is waiting on gated data access. | results/omni_finetune/DATA_ACCESS_STATUS.md, results/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.md, results/omni_finetune/source_discovery.json |
Data access pending | Held-out metrics come after the data gate, manifest construction, training, and test evaluation |
Historical 32ep path strings are tracked as setup-file provenance. |
scripts/validate_scope_claims.py, docs/data/scope_claims_audit.json |
Multi-episode pilot status | Old run/path identifiers stay separate from completed 32-episode results |
| Prepared GitHub/Hugging Face mirrors carry matching critical files. | scripts/validate_mirror_parity.py, docs/data/mirror_parity.json |
Mirror parity report | Compares prepared data files, visual assets, website HTML, and validator scripts before upload; live URLs are checked after publishing |
| The public GitHub and Hugging Face bundles are ready to share. | scripts/validate_publication_package.py, docs/data/publication_audit.json |
Public bundle contents | Covers public files, HF bundles, and public-card freshness; ignored local scratch outputs are excluded |
| The public repo, website, and Hugging Face cards present one cohesive research project. | PUBLIC_SURFACE_QA.md, scripts/build_public_surface_qa.py, docs/data/public_surface_qa.json |
Public project surface | Covers SEO/social metadata, accessible tab semantics, public links, project links, and reader-facing copy |
| The public website has validated local references. | scripts/validate_website_integrity.py, docs/data/website_integrity.json |
Website reference report | Covers local links, anchors, JSON data, and referenced images; external URLs are not fetched |
| The rendered website walkthrough has a browser-level interaction check. | RENDERED_SITE_CHECK.md, scripts/build_rendered_site_check.py, docs/data/rendered_site_check.json |
Rendered website check | Covers local page load, tab switch, walkthrough deep link, player controls, and console health |
| The release checks are explicit. | QUALITY_GATES.md, scripts/build_quality_gates.py, docs/data/quality_gates.json |
Release checks | Summarizes packaging and live-mirror checks; cross-episode model quality is measured by later held-out reports |
| The live public mirrors are verified after upload. | scripts/verify_live_publication.py, docs/data/live_publication_status.json |
Live publication report | Fetches public GitHub/HF URLs; it does not validate private training state |
| The core project artifacts are indexed and grouped for fast reading. | ARTIFACT_GUIDE.md, scripts/build_artifact_index.py, docs/data/artifact_index.json |
Verified guide and index | Selective source-of-truth catalog, not a complete inventory of every output file |
| The public reproduction path is documented. | REPRODUCIBILITY.md, docs/data/reproducibility_matrix.json, notes/reproducibility_audit.md |
Verified documentation and prior exact-match check | Publicly reproduces the single-episode pipeline, not the gated 32-episode Qwen3-Omni pilot |
| The project is externally citable and machine-readable. | CITATION.cff, codemeta.json, docs/data/project_manifest.json, LICENSE |
Verified metadata files | Code license does not override original Xperience-10M dataset terms |
| A first-time reader has an explicit project path. | docs/data/project_packet.json, website project path section, README project path |
Verified project packet | Guides inspection across data, tasks, results, and scale-up status |
Reading Order
- Read
PROJECT_STATUS.mdanddocs/data/project_status.jsonfor the fastest current-state decision table. - Read
RESEARCH_ROADMAP.mdanddocs/data/research_roadmap.jsonfor the staged path from public-sample development to multi-episode modeling. - Read
docs/data/project_packet.jsonfor the shortest project path and current scope. - Read
XPERIENCE10M_DATASET_CARD_ALIGNMENT.mdanddocs/data/xperience10m_dataset_card_alignment.jsonto check the official dataset-card wording and how the current repo is scoped against it. - Read
SOURCE_ALIGNMENT_AUDIT.mdanddocs/data/source_alignment_audit.jsonto inspect the same source facts present across repo, website, and HF cards. - Read
FIGURE_INDEX.md,docs/data/figure_index.json, anddocs/data/brand_assets.jsonto inspect public figures, charts, modality thumbnails, logo assets, dimensions, hashes, and source scripts. - Read
EVALUATION_PROTOCOL.mdanddocs/data/evaluation_protocol.jsonto check windowing, split policy, per-task metrics, leakage controls, and current limitations. - Read
ARTIFACT_GUIDE.mdanddocs/data/artifact_index.jsonto see grouped project artifacts, indexed supporting artifacts, sizes, and stable-file hashes. - Read
docs/assets/task_suite_infographic.pnganddocs/data/modality_atlas.jsonfor the high-level map and modality atlas. - Read
REPRODUCIBILITY.mdanddocs/data/reproducibility_matrix.jsonbefore rerunning the public pipeline. - Inspect
results/episode_task_suite/summary_report.jsonfor the task and metric source of truth. - Inspect
results/episode_task_suite/feature_manifest.jsonto see which modalities enter the current feature vector. - Inspect
results/episode_task_suite/neural_mlp/to compare minimal and neural heads under the same splits. - Inspect
docs/data/scope_claims_audit.jsonbefore interpreting historical32epstrings in Qwen3-Omni setup artifacts. - Inspect
docs/data/mirror_parity.jsonbefore assuming the GitHub and Hugging Face mirrors contain the same critical data, visual, HTML, and validator files. - Inspect
results/omni_finetune/DATA_ACCESS_STATUS.mdandresults/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.mdbefore interpreting any Qwen3-Omni artifact. - Inspect
QUALITY_GATES.md,docs/data/quality_gates.json,PUBLIC_SURFACE_QA.md,docs/data/public_surface_qa.json,docs/data/publication_audit.json, anddocs/data/website_integrity.jsonbefore sharing a new public release. - Inspect
CITATION.cff,codemeta.json, andLICENSEbefore reusing or citing the project.