---
license: other
library_name: pytorch
tags:
- robotics
- embodied-ai
- multimodal
- ropedia
- xperience-10m
- baseline
- neural-network
- pytorch
- linear-model
- retrieval
metrics:
- accuracy
- f1
- mean-reciprocal-rank
- mean-squared-error
model-index:
- name: Ropedia Xperience-10M Task Baselines
  results:
  - task:
      type: robotics
      name: Cross-modal retrieval
    dataset:
      type: ropedia-ai/xperience-10m-sample
      name: Xperience-10M public sample episode
    metrics:
    - type: top_5_accuracy
      value: 0.3764
      name: top-5 retrieval accuracy
    - type: mrr
      value: 0.2634
      name: mean reciprocal rank
  - task:
      type: robotics
      name: Transition detection
    dataset:
      type: ropedia-ai/xperience-10m-sample
      name: Xperience-10M public sample episode
    metrics:
    - type: f1
      value: 0.6552
      name: macro-F1
  - task:
      type: robotics
      name: Temporal order
    dataset:
      type: ropedia-ai/xperience-10m-sample
      name: Xperience-10M public sample episode
    metrics:
    - type: f1
      value: 0.8718
      name: neural MLP F1
---
| |
| # Ropedia Xperience-10M Task Baselines |
|
|
| This repo stores the minimal baseline weights, neural MLP task-head checkpoints, |
| and metrics for the 12-task Xperience-10M episode suite, plus four lightweight |
| direction-extension probes. Read it as a model audit, not as a robot |
| foundation model. |
|
|
|  |
|
|
| The source Xperience-10M sample spans video, audio, depth, pose, motion |
| capture, inertial sensing, and language annotation. The committed minimal and |
| neural task heads use the current 8,378-d feature manifest; audio is documented |
| in the figures but is not yet extracted into a model input feature block. |
| The companion dashboard and this model card start with the task-first 12-head |
| map, then mirror the responsive modality atlas metadata in |
| `metrics/modality_atlas.json`, with standalone derived thumbnails in |
| `assets/modalities/`. |
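
To make the idea of a feature manifest concrete, here is a minimal sketch of slicing one flat window vector into named modality blocks. The manifest layout (`blocks`, `name`, `start`, `end`) and the helper `split_by_block` are hypothetical; the real `feature_manifest.json` schema may differ, and the toy vector is 9-d rather than the actual 8,378-d input.

```python
import json

# Hypothetical manifest layout; check the committed feature_manifest.json
# for the real key names and block boundaries.
TOY_MANIFEST = json.loads("""
{
  "blocks": [
    {"name": "video", "start": 0, "end": 4},
    {"name": "depth", "start": 4, "end": 6},
    {"name": "pose",  "start": 6, "end": 9}
  ]
}
""")


def split_by_block(features, manifest):
    """Slice one flat feature vector into named per-modality blocks."""
    blocks = manifest["blocks"]
    # Blocks must tile the vector with no gaps or overlaps.
    expected_start = 0
    for block in blocks:
        if block["start"] != expected_start or block["end"] <= block["start"]:
            raise ValueError(f"block {block['name']!r} does not tile the vector")
        expected_start = block["end"]
    if expected_start != len(features):
        raise ValueError("manifest width does not match the feature vector")
    return {b["name"]: features[b["start"]:b["end"]] for b in blocks}


window = [0.1 * i for i in range(9)]  # toy 9-d window, not the real input
parts = split_by_block(window, TOY_MANIFEST)
print({name: len(values) for name, values in parts.items()})
```

Because audio is not yet featurized, a manifest of this kind would simply have no audio block, which is why the toy example omits one.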
|
|
| The model repo also mirrors the official-source alignment artifact at |
| `metrics/xperience10m_dataset_card_alignment.json` plus |
| `XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`. That file records the official |
| `ropedia-ai/xperience-10m` card scope, gated access, full-scale modalities, |
| episode layout, intended uses, and the claims this small baseline repo does |
| not make. It also records the public sample card (`cc-by-nc-4.0`, HOMIE |
| Toolkit, Rerun 0.29.0 `.rrd` visualization) and the current HF API listing |
| snapshot: 803 session folders and 12,103 episode folders with |
| `annotation.hdf5`, plus the live HF 31.9 TB file-size display. The 31.9 TB |
| display is tracked separately from the official card's about-1PB full-scale |
| storage statement. Those are upstream metadata facts, not local downloads, |
| raw-data redistribution, or model-quality evidence. The source note also |
| preserves the official limited in diversity / showcase-quality disclaimer and |
| excludes identity, surveillance, biometric, sensitive-attribute, and |
| safety-critical uses. |
| The source-alignment audit is mirrored at `SOURCE_ALIGNMENT_AUDIT.md` and |
| `metrics/source_alignment_audit.json`; it validates the same full-dataset, |
| public sample-card, API-listing, and current-project boundary markers across |
| the repo, website, artifact dataset, Space, and this model card. |
|
|
| For first-pass model review, use `REVIEWER_SCORECARD.md` and |
| `metrics/reviewer_scorecard.json`. They state which baseline artifacts are |
| verified, which Omni claims remain data-gated, and which raw data/weights are |
| intentionally excluded. |
| Use `EVALUATION_PROTOCOL.md` and `metrics/evaluation_protocol.json` before |
| reading scores; they define the window unit, chronological split, leakage |
| controls, per-task metrics, and unsupported interpretations. |
| Use `FIGURE_INDEX.md` and `metrics/figure_index.json` to audit the public |
| figures, charts, modality thumbnails, dimensions, stable hashes, and source |
| scripts mirrored into this model repo. |
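
The protocol files above define the actual split. As a rough illustration only, a chronological split over window indices could look like the sketch below. The 70/15/15 fractions, the `gap` parameter, and the function name `chronological_split` are assumptions for this example, not values taken from `EVALUATION_PROTOCOL.md`.

```python
def chronological_split(num_windows, train_frac=0.7, val_frac=0.15, gap=0):
    """Split window indices 0..num_windows-1 by time order, never by shuffling.

    `gap` drops that many windows after each boundary so that overlapping
    windows cannot straddle two splits (one possible leakage control).
    """
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test split")
    train_end = int(num_windows * train_frac)
    val_end = int(num_windows * (train_frac + val_frac))
    train = list(range(0, train_end))
    val = list(range(min(train_end + gap, val_end), val_end))
    test = list(range(min(val_end + gap, num_windows), num_windows))
    return train, val, test


# 1,161 is the window count reported in this card; the fractions are assumed.
train, val, test = chronological_split(1161, gap=2)
print(len(train), len(val), len(test))
```

Every training index precedes every validation index, and every validation index precedes every test index, so no future window can inform a prediction about an earlier one.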
|
|
| The committed heads are intentionally small: |
|
|
| - z-score + linear softmax classifiers, |
| - dual ridge regression/projection heads, |
| - sigmoid multi-label logistic regression, |
| - cosine ranking for retrieval tasks, |
| - z-score + PyTorch MLP heads for all 12 task definitions. |
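
As a sketch of the first head type in the list above, the following pure-Python example applies z-score scaling and a linear softmax layer to one toy window. The function names, toy weights, and toy data are illustrative assumptions; the committed heads store their own weights and scalers in `model.npz`, whose key names are not assumed here.

```python
import math


def fit_zscore(rows):
    """Per-feature mean and standard deviation, fitted on training rows only."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((r[d] - means[d]) ** 2 for r in rows) / n
        stds.append(math.sqrt(var) or 1.0)  # constant feature: avoid dividing by zero
    return means, stds


def zscore(row, means, stds):
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(row, means, stds, weights, biases):
    """weights[c] is the weight vector for class c; returns class probabilities."""
    z = zscore(row, means, stds)
    logits = [sum(w * x for w, x in zip(w_c, z)) + b for w_c, b in zip(weights, biases)]
    return softmax(logits)


train_rows = [[1.0, 10.0], [3.0, 10.0]]   # toy data; second feature is constant
means, stds = fit_zscore(train_rows)
weights = [[1.0, 0.0], [-1.0, 0.0]]       # toy two-class weights
biases = [0.0, 0.0]
probs = predict_proba([3.0, 10.0], means, stds, weights, biases)
print(probs)
```

Fitting the scaler on training rows only is the design point worth noting: statistics computed over validation or test windows would leak information across the split.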
|
|
| The included architecture and suite figures use the same Ropedia-inspired dark |
| visual system as the public dashboard, but the text, dimensions, and metrics |
| are generated from the committed artifacts rather than drawn by hand. |
|
|
| The small heads exist to make every input/output contract auditable before scaling to many episodes. |
|
|
| ## 90-Second Reviewer Path |
|
|
| | Step | Question | Primary artifacts | |
| | --- | --- | --- | |
| | 1 | What is actually claimed? | `REVIEWER_SCORECARD.md`, `metrics/reviewer_scorecard.json`, `EVIDENCE_CONTRACT.md`, `ARTIFACT_GUIDE.md`, `QUALITY_GATES.md`, `FIGURE_INDEX.md`, `metrics/artifact_index.json`, `metrics/figure_index.json`, `metrics/live_publication_status.json`, `metrics/quality_gates.json`, `metrics/mirror_parity.json`, `metrics/scope_claims_audit.json`, `metrics/publication_audit.json`, `metrics/website_integrity.json`, `metrics/project_manifest.json` | |
| | 2 | Are source facts consistently presented? | `SOURCE_ALIGNMENT_AUDIT.md`, `metrics/source_alignment_audit.json`, `scripts/validate_source_alignment.py` | |
| | 3 | How do I reproduce it? | `REPRODUCIBILITY.md`, `metrics/reproducibility_matrix.json`, companion GitHub `notes/reproducibility_audit.md` | |
| | 4 | What is one model input? | `artifacts/episode_task_suite/feature_manifest.json`, `artifacts/episode_task_suite/available_modalities.json`, companion artifact dataset `windows.csv` | |
| | 5 | Are the task results backed by files? | `artifacts/episode_task_suite/summary_report.json`, `artifacts/episode_task_suite/neural_mlp/`, `metrics/summary_metrics.json` | |
| | 6 | What is still pending? | companion GitHub `results/omni_finetune/DATA_BLOCKER_REPORT.md` and `A100_HF_RELAY_STATUS.md` | |
|
|
| - Human-readable artifact guide mirror: `ARTIFACT_GUIDE.md`. |
| - Reviewer scorecard mirror: `REVIEWER_SCORECARD.md` and `metrics/reviewer_scorecard.json`. |
| - Official dataset-card alignment mirror: `XPERIENCE10M_DATASET_CARD_ALIGNMENT.md` and `metrics/xperience10m_dataset_card_alignment.json`. |
| - Source-alignment audit mirror: `SOURCE_ALIGNMENT_AUDIT.md` and `metrics/source_alignment_audit.json`. |
| - Publication quality gates mirror: `QUALITY_GATES.md` and `metrics/quality_gates.json`. |
| - Live publication status mirror: `metrics/live_publication_status.json`. |
| - Machine-readable reviewer packet mirror: `metrics/reviewer_packet.json`. |
| - Source-of-truth artifact index mirror: `metrics/artifact_index.json`. |
| - Source-of-truth figure index mirror: `FIGURE_INDEX.md` and `metrics/figure_index.json`. |
|
|
| ## Evidence Boundary |
|
|
| | Claim layer | Evidence | Boundary | |
| | --- | --- | --- | |
| | Reviewer scorecard | `REVIEWER_SCORECARD.md`, `metrics/reviewer_scorecard.json` | compact verified/data-gated/not-redistributed decision table | |
| | Baseline weights | `artifacts/**/model.npz` | lightweight heads only | |
| | Neural checkpoints | `artifacts/episode_task_suite/neural_mlp/**/model.pt` | same single-episode windows and splits | |
| | Metrics | `artifacts/**/metrics.json`, prediction CSV/NPZ files | debugging and task-contract evidence | |
| | Feature contract | `artifacts/**/feature_manifest.json` | audio documented but not featurized | |
| | Evaluation protocol | `EVALUATION_PROTOCOL.md`, `metrics/evaluation_protocol.json` | windowing, chronological split, leakage controls, and task metrics | |
| | Qwen3-Omni | companion blocker and relay reports | smoke-only until 32 valid episodes are available | |
| | Scope claims guard | `metrics/scope_claims_audit.json` and `scripts/validate_scope_claims.py` | historical `32ep` path strings are provenance, not 32-episode results | |
| | Mirror parity | `metrics/mirror_parity.json` and `scripts/validate_mirror_parity.py` | prepared repo/HF mirrors carry matching critical data, figures, website HTML, and validator files | |
| | Publication hygiene | `metrics/publication_audit.json` and validator script mirror | public bundles contain no raw data, generated caches, heavy archives, token strings, or stale public-card figure references | |
| | Website integrity | `metrics/website_integrity.json` and validator script mirror | local links, anchors, JSON bundles, and referenced images only | |
| | Quality gates | `QUALITY_GATES.md`, `metrics/quality_gates.json`, and `scripts/build_quality_gates.py` | automated release gates plus live post-publish checks | |
| | Live publication | `metrics/live_publication_status.json`, `scripts/verify_live_publication.py` | last public GitHub/HF URL verification after upload | |
| | Official dataset card alignment | `XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`, `metrics/xperience10m_dataset_card_alignment.json` | official source scope, public sample card, HF API listing, gated access, modality coverage, scale, and this repo's single-episode boundary | |
| | Source alignment audit | `SOURCE_ALIGNMENT_AUDIT.md`, `metrics/source_alignment_audit.json`, `scripts/validate_source_alignment.py` | validates full-dataset facts, sample-card facts, API-listing caveats, and public-card boundary markers | |
| | Figure index | `FIGURE_INDEX.md`, `metrics/figure_index.json`, `scripts/build_figure_index.py` | public figures, charts, modality thumbnails, dimensions, hashes, and generation provenance | |
| | Artifact index | `metrics/artifact_index.json` and `scripts/build_artifact_index.py` | compact catalog of the reviewer-critical proof artifacts | |
| | Artifact guide | `ARTIFACT_GUIDE.md` | human-readable map of proof boundary, task evidence, mirrors, and scale-up status | |
| | Reproducibility | `REPRODUCIBILITY.md`, `metrics/reproducibility_matrix.json` | public commands, expected outputs, exact-match audit evidence, and non-reproducible boundaries | |
| | Citation metadata | GitHub `CITATION.cff`, `codemeta.json`, `project_manifest.json`, and `reviewer_packet.json` | code license remains separate from Xperience-10M dataset terms | |
|
|
| ## Qwen3-Omni LoRA Boundary |
|
|
| The companion GitHub repo now includes scripts for an A100-to-H20 |
| Xperience-10M relay and a Qwen3-Omni LoRA pilot path. The current LoRA checkpoint |
| is a technical smoke artifact from one locally available episode and 128 train |
| windows. It is not a full 32-episode result. |
|
|
| The next real model milestone is a 32-episode held-out-episode LoRA pilot after |
| Hugging Face access to `ropedia-ai/xperience-10m` is approved. The staging plan |
| selects 32 complete episodes from 32 different top-level session UUIDs, then |
| transfers them to H20 for manifest building, training, and evaluation. |
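
The selection rule in the staging plan (32 complete episodes, each from a different top-level session UUID) can be sketched as below. The tuple layout `(session_uuid, episode_id, is_complete)` and the function name are hypothetical; the companion repo's staging scripts may define completeness and ordering differently.

```python
def pick_one_episode_per_session(episodes, target=32):
    """Pick the first complete episode from each session, in sorted order.

    episodes: iterable of (session_uuid, episode_id, is_complete) tuples.
    Returns a sorted list of (session_uuid, episode_id) pairs.
    """
    chosen = {}
    for session, episode, is_complete in sorted(episodes):
        if is_complete and session not in chosen:
            chosen[session] = episode
        if len(chosen) == target:
            break
    if len(chosen) < target:
        raise ValueError(f"only {len(chosen)} sessions have a complete episode")
    return sorted(chosen.items())


# Toy listing: 40 sessions with 2 episodes each; in even-numbered sessions
# the first episode is marked incomplete.
listing = [
    (f"s{i:02d}", f"ep{j}", not (i % 2 == 0 and j == 0))
    for i in range(40)
    for j in range(2)
]
picked = pick_one_episode_per_session(listing)
print(len(picked), picked[:2])
```

Taking at most one episode per session keeps the held-out episodes from sharing a recording session with the training episodes, which is the point of a held-out-episode pilot.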
|
|
| ## What To Look At First |
|
|
| | Artifact | Why it is useful | |
| | --- | --- | |
| | `REVIEWER_SCORECARD.md`, `metrics/reviewer_scorecard.json` | gives the compact current decision boundary before reading the full audit trail | |
| | `artifacts/**/model.npz` | stores the exact lightweight weights and scalers | |
| | `artifacts/episode_task_suite/neural_mlp/**/model.pt` | stores the neural MLP checkpoints | |
| | `artifacts/**/metrics.json` | records the committed metric values | |
| | `artifacts/**/feature_manifest.json` | maps feature blocks back to source modalities | |
| | `EVALUATION_PROTOCOL.md`, `metrics/evaluation_protocol.json` | defines task-unit, split, metric, leakage-control, and unsupported-interpretation rules | |
| | `artifacts/episode_task_suite/research_directions/` | maps every task to the four Ropedia research directions with minimal-vs-neural readouts | |
| | `artifacts/episode_task_suite/research_direction_extensions/` | adds one coded extension probe per research direction | |
| | `artifacts/episode_task_suite/task_walkthroughs/` | explains every task with case study, input, process modules, output, and limitation | |
| | `assets/task_architectures.png` | shows the shared pipeline and all 12 heads | |
| | `assets/task_suite_infographic.png` | presents the shared processing contract, 12 heads, verified metrics, and public-sample modality thumbnails | |
| | `assets/modalities/`, `metrics/modality_atlas.json` | responsive modality-card thumbnails and metadata for sample inspection | |
| | `XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`, `metrics/xperience10m_dataset_card_alignment.json` | aligns public wording with the official gated Xperience-10M card, sample card, and HF API metadata | |
| | `SOURCE_ALIGNMENT_AUDIT.md`, `metrics/source_alignment_audit.json` | verifies source facts and boundary markers across GitHub, the website, and HF cards | |
| | `FIGURE_INDEX.md`, `metrics/figure_index.json` | verifies public figures, charts, thumbnails, dimensions, hashes, and source scripts | |
| | `metrics/artifact_index.json` | indexes proof artifacts with existence, size, and stable-file hashes | |
| | `metrics/mirror_parity.json` | verifies prepared repo/HF mirrors have matching critical data, figures, website HTML, and validator files before upload | |
| | `metrics/scope_claims_audit.json` | verifies historical `32ep` smoke-run identifiers are not presented as real 32-episode results | |
| | `QUALITY_GATES.md`, `metrics/quality_gates.json` | summarizes the automated and post-publish release checks | |
| | `metrics/live_publication_status.json` | records the last live public URL verification after upload | |
| | `metrics/publication_audit.json` | records the latest public-bundle hygiene and public-card freshness check | |
| | `metrics/website_integrity.json` | records the latest local website link, anchor, JSON, and image integrity check | |
| | `metrics/project_manifest.json` | mirrors the public URL and citation metadata bundle | |
|
|
| ## Included |
|
|
| - `artifacts/**/model.npz`: minimal baseline weights, scalers, and labels |
| - `artifacts/episode_task_suite/neural_mlp/**/model.pt`: neural MLP task-head checkpoints |
| - `artifacts/episode_task_suite/neural_mlp/**/history.json`: neural training traces |
| - `artifacts/**/metrics.json`: committed metrics |
| - `artifacts/**/feature_manifest.json`: feature block boundaries where relevant |
| - `artifacts/episode_task_suite/research_directions/*.json|*.csv|*.md`: four-track task taxonomy |
| - `artifacts/episode_task_suite/research_direction_extensions/*.json|*.csv|*.md`: four extension-probe metrics and predictions |
| - `artifacts/episode_task_suite/task_walkthroughs/*.json|*.md`: beginner walkthroughs for all 12 tasks |
| - `REVIEWER_SCORECARD.md`, `metrics/reviewer_scorecard.json`: compact current decision table |
| - `scripts/*.py`: training and visualization scripts |
| - `scripts/validate_mirror_parity.py`: prepared mirror parity validator |
| - `scripts/validate_scope_claims.py`: Qwen3-Omni smoke/result claim-boundary validator |
| - `scripts/validate_publication_package.py`: publication hygiene validator |
| - `scripts/validate_website_integrity.py`: website local-reference validator |
| - `notes/*.md`: interpretation and reproducibility notes |
|
|
| The companion artifact dataset repo stores CSV/JSON predictions and dashboard assets: |
|
|
| https://huggingface.co/datasets/cy0307/ropedia-xperience-10m-task-suite-artifacts |
|
|
| The public visual dashboard is here: |
|
|
| https://huggingface.co/spaces/cy0307/ropedia-xperience-10m-task-suite |
|
|
| Direct static app: |
|
|
| https://cy0307-ropedia-xperience-10m-task-suite.static.hf.space/ |
|
|
| The full Hugging Face collection is here: |
|
|
| https://huggingface.co/collections/cy0307/ropedia-xperience-10m-task-suite |
|
|
| ## Minimal and Neural Architecture |
|
|
|  |
|
|
| ## Four Research Directions |
|
|
| The baselines are also grouped by the four Ropedia research tracks: |
|
|
| | Direction | Current status | Baseline evidence | |
| | --- | --- | --- | |
| | A. Human Modeling & Motion Understanding | partially implemented | hand trajectory forecasting improves from `0.8223` to `0.1116` MPJPE with the neural MLP; contact is degenerate in this sample | |
| | B. 3D/4D Reconstruction & Neural Rendering | proxy tasks only | cross-modal retrieval, feature reconstruction, and misalignment are prerequisites, not full neural rendering | |
| | C. Egocentric Vision & Interaction | strongest implemented track | action/subtask/transition/next-action/object/caption tasks plus alignment/order diagnostics | |
| | D. Scene Reconstruction & World Modeling | early proxy tasks | state, object, retrieval, reconstruction, and temporal tasks are first probes before scene graphs or maps | |
|
|
| Primary taxonomy file: |
|
|
| `artifacts/episode_task_suite/research_directions/research_direction_taxonomy.json` |
|
|
| ## Direction-Extension Probe Snapshot |
|
|
| | Direction | Extension task | Minimal | Neural MLP | |
| | --- | --- | ---: | ---: | |
| | A. Human Modeling & Motion Understanding | `body_motion_intensity` | 0.7827 macro-F1 | 0.7986 macro-F1 | |
| | B. 3D/4D Reconstruction & Neural Rendering | `multi_view_consistency_retrieval` | 0.5534 MRR | 0.3469 MRR | |
| | C. Egocentric Vision & Interaction | `action_phase_progress` | 0.3416 MAE | 0.3038 MAE | |
| | D. Scene Reconstruction & World Modeling | `ego_motion_forecast` | 0.1989 MAE | 0.0989 MAE | |
|
|
| These probes reuse the same 1,161-window feature tensor and chronological split |
| style. They are direction-specific diagnostics, not full human-body, neural |
| rendering, intent, or world-model solutions. |
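
Two of the metrics in the probe table, MRR and MAE, along with the top-k accuracy reported in the card metadata, can be sketched in a few lines. This is a generic illustration with toy inputs and hypothetical function names, not the suite's evaluation code.

```python
def mean_reciprocal_rank(ranked_lists, gold_items):
    """Average of 1/rank of the gold item; a missing gold item contributes 0."""
    if not gold_items:
        raise ValueError("no queries to score")
    total = 0.0
    for ranked, gold in zip(ranked_lists, gold_items):
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)
    return total / len(gold_items)


def top_k_accuracy(ranked_lists, gold_items, k=5):
    """Fraction of queries whose gold item appears in the first k results."""
    if not gold_items:
        raise ValueError("no queries to score")
    hits = sum(gold in ranked[:k] for ranked, gold in zip(ranked_lists, gold_items))
    return hits / len(gold_items)


def mean_absolute_error(predicted, target):
    if len(predicted) != len(target) or not target:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


# Toy retrieval results: the gold item "a" is ranked 1st, 2nd, and 3rd.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
gold = ["a", "a", "a"]
mrr = mean_reciprocal_rank(ranked, gold)
top2 = top_k_accuracy(ranked, gold, k=2)
mae = mean_absolute_error([0.2, 0.5], [0.0, 1.0])
print(round(mrr, 4), round(top2, 4), round(mae, 4))
```

MRR and top-k accuracy are higher-is-better, while MAE is lower-is-better, so the columns in the table above should not be compared across rows.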
|
|
| ## Metrics Snapshot |
|
|
| | Task | Neural MLP metric | Minimal metric | |
| | --- | ---: | ---: | |
| | `timeline_action` macro-F1 | 0.0263 | 0.0500 | |
| | `timeline_subtask` macro-F1 | 0.0175 | 0.0495 | |
| | `transition_detection` macro-F1 | 0.6485 | 0.6552 | |
| | `next_action` macro-F1 | 0.0235 | 0.0593 | |
| | `hand_trajectory_forecast` MPJPE, lower is better | 0.1116 | 0.8223 | |
| | `contact_prediction` macro-F1 (degenerate in this sample) | 1.0000 | 1.0000 | |
| | `object_relevance` micro-F1 | 0.1798 | 0.1839 | |
| | `caption_grounding` MRR | 0.0178 | 0.0172 | |
| | `cross_modal_retrieval` MRR | 0.1530 | 0.2634 | |
| | `modality_reconstruction` R² | -0.0102 | -0.0160 | |
| | `temporal_order` F1 | 0.8718 | 0.5487 | |
| | `misalignment_detection` F1 | 0.7335 | 0.4866 | |
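
Macro-F1 appears in several rows above. The sketch below shows a generic macro-F1 computation and two toy cases: an imbalanced one where macro-F1 falls well below accuracy, and a single-class one where a perfect score is trivial. The second case is offered only as one plausible reading of a degenerate label set; whether it matches exactly what happens for `contact_prediction` should be checked against the committed metrics files.

```python
def f1_from_counts(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / len(scores)


# Imbalanced toy case: always predicting "a" gives 75% accuracy,
# but the missed class "b" drags macro-F1 down.
imbalanced = macro_f1(["a", "a", "a", "b"], ["a", "a", "a", "a"])

# Single-class toy case: with only one label present, a constant
# predictor scores a perfect macro-F1.
single_class = macro_f1(["none"] * 4, ["none"] * 4)
print(round(imbalanced, 4), single_class)
```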
|
|
| ## Data Notice |
|
|
| This repo does not redistribute raw Xperience-10M videos or raw `annotation.hdf5`. Download the original sample from Ropedia / Hugging Face and follow the dataset terms: |
|
|
| - https://huggingface.co/datasets/ropedia-ai/xperience-10m-sample |
| - https://ropedia.com/dataset |
|
|
| ## Source |
|
|
| GitHub: |
|
|
| https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite |
|
|
| GitHub Pages: |
|
|
| https://chaoyue0307.github.io/ropedia-xperience-10m-task-suite/ |
|
|