---
license: other
library_name: pytorch
tags:
- robotics
- embodied-ai
- multimodal
- ropedia
- xperience-10m
- baseline
- neural-network
- pytorch
- linear-model
- retrieval
metrics:
- accuracy
- f1
- mean-reciprocal-rank
- mean-squared-error
model-index:
- name: Ropedia Xperience-10M Task Baselines
  results:
  - task:
      type: robotics
      name: Cross-modal retrieval
    dataset:
      type: ropedia-ai/xperience-10m-sample
      name: Xperience-10M public sample episode
    metrics:
    - type: top_5_accuracy
      value: 0.3764
      name: top-5 retrieval accuracy
    - type: mrr
      value: 0.2634
      name: mean reciprocal rank
  - task:
      type: robotics
      name: Transition detection
    dataset:
      type: ropedia-ai/xperience-10m-sample
      name: Xperience-10M public sample episode
    metrics:
    - type: f1
      value: 0.6552
      name: macro-F1
  - task:
      type: robotics
      name: Temporal order
    dataset:
      type: ropedia-ai/xperience-10m-sample
      name: Xperience-10M public sample episode
    metrics:
    - type: f1
      value: 0.8718
      name: neural MLP F1
---
# Ropedia Xperience-10M Task Baselines
This repo stores the minimal baseline weights, neural MLP task-head checkpoints,
and metrics for the 12-task Xperience-10M episode suite, plus four lightweight
direction-extension probes. Read it as a model audit; it is not a robot
foundation model.

The source Xperience-10M sample spans video, audio, depth, pose, motion
capture, inertial sensing, and language annotation. The committed minimal and
neural task heads use the current 8,378-d feature manifest; audio is documented
in the figures but is not yet extracted into a model input feature block.
The companion dashboard and this model card start with the task-first 12-head
map, then mirror the responsive modality atlas metadata in
`metrics/modality_atlas.json`, with standalone derived thumbnails in
`assets/modalities/`.
The model repo also mirrors the official-source alignment artifact at
`metrics/xperience10m_dataset_card_alignment.json` and
`XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`. These files record the official
`ropedia-ai/xperience-10m` card scope, gated access, full-scale modalities,
episode layout, intended uses, and the claims this small baseline repo does
not make. They also record the public sample card (`cc-by-nc-4.0`, HOMIE
Toolkit, Rerun 0.29.0 `.rrd` visualization) and the current HF API listing
snapshot: 803 session folders and 12,103 episode folders with
`annotation.hdf5`, plus the live HF file-size display of 31.9 TB. The 31.9 TB
display is tracked separately from the official card's full-scale storage
statement of about 1 PB. These are upstream metadata facts, not local downloads,
raw-data redistribution, or model-quality evidence. The source note also
preserves the official limited-in-diversity / showcase-quality disclaimer and
excludes identity, surveillance, biometric, sensitive-attribute, and
safety-critical uses.
The source-alignment audit is mirrored at `SOURCE_ALIGNMENT_AUDIT.md` and
`metrics/source_alignment_audit.json`; it validates the same full-dataset,
public sample-card, API-listing, and current-project boundary markers across
the repo, website, artifact dataset, Space, and this model card.
For first-pass model review, use `REVIEWER_SCORECARD.md` and
`metrics/reviewer_scorecard.json`. They state which baseline artifacts are
verified, which Omni claims remain data-gated, and which raw data/weights are
intentionally excluded.
Use `EVALUATION_PROTOCOL.md` and `metrics/evaluation_protocol.json` before
reading scores; they define the window unit, chronological split, leakage
controls, per-task metrics, and unsupported interpretations.
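
As a rough illustration of what a chronological split with a boundary gap can look like, the sketch below cuts time-ordered window indices into contiguous train/validation/test blocks. The function name, the split fractions, and the `gap` parameter are assumptions made for this example; the rules that actually apply are the ones defined in `EVALUATION_PROTOCOL.md`.

```python
def chronological_split(n_windows, train_frac=0.7, val_frac=0.15, gap=0):
    """Split time-ordered window indices into contiguous train/val/test blocks.

    Hypothetical helper: the fractions and the optional `gap` (windows dropped
    after each boundary so overlapping windows cannot straddle two splits) are
    illustrative assumptions, not the repo's committed protocol.
    """
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test block")
    train_end = int(n_windows * train_frac)
    val_end = int(n_windows * (train_frac + val_frac))
    train = list(range(0, train_end))
    val = list(range(min(train_end + gap, val_end), val_end))
    test = list(range(min(val_end + gap, n_windows), n_windows))
    return train, val, test


# 1,161 is the window count quoted elsewhere in this card.
train, val, test = chronological_split(1161, gap=2)
# Every training window precedes every validation window,
# and every validation window precedes every test window.
assert max(train) < min(val) and max(val) < min(test)
```

Because the blocks are contiguous in time, no shuffled neighbour of a test window can end up in training, which is the basic point of splitting chronologically rather than at random.
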
Use `FIGURE_INDEX.md` and `metrics/figure_index.json` to audit the public
figures, charts, modality thumbnails, dimensions, stable hashes, and source
scripts mirrored into this model repo.
The committed heads are intentionally small:
- z-score + linear softmax classifiers,
- dual ridge regression/projection heads,
- sigmoid multi-label logistic regression,
- cosine ranking for retrieval tasks,
- z-score + PyTorch MLP heads for all 12 task definitions.
The included architecture and suite figures use the same Ropedia-inspired dark
visual system as the public dashboard, but the text, dimensions, and metrics
are generated from the committed artifacts rather than drawn by hand.
Their purpose is to make every input/output contract auditable before scaling to many episodes.
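
To make two of the head types listed above concrete, here is a minimal NumPy sketch of a z-score + linear softmax classifier and of cosine ranking for retrieval. It is not the repo's training code: the function names, the toy 6-d features, and the random weights are assumptions for illustration, and the committed heads operate on the 8,378-d feature manifest.

```python
import numpy as np


def zscore(x, mean, std, eps=1e-8):
    """Standardize features with statistics fitted on the training split."""
    return (x - mean) / (std + eps)


def linear_softmax_predict(x, weights, bias):
    """Return class probabilities from a linear layer followed by softmax."""
    logits = x @ weights + bias
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def cosine_rank(queries, gallery):
    """Rank gallery rows for each query by descending cosine similarity."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(q @ g.T), axis=1)


rng = np.random.default_rng(0)
x_train = rng.normal(size=(32, 6))  # toy features, not the real manifest
mean, std = x_train.mean(axis=0), x_train.std(axis=0)
probs = linear_softmax_predict(
    zscore(x_train, mean, std), rng.normal(size=(6, 3)), np.zeros(3)
)
ranking = cosine_rank(x_train[:4], x_train)  # each query should rank itself first
```

Each row of `probs` sums to one, and `ranking[i, 0]` is the gallery index most similar to query `i`. Both heads reduce to a few matrix operations, which keeps them easy to inspect.
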
## 90-Second Reviewer Path
| Step | Question | Primary artifacts |
| --- | --- | --- |
| 1 | What is actually claimed? | `REVIEWER_SCORECARD.md`, `metrics/reviewer_scorecard.json`, `EVIDENCE_CONTRACT.md`, `ARTIFACT_GUIDE.md`, `QUALITY_GATES.md`, `FIGURE_INDEX.md`, `metrics/artifact_index.json`, `metrics/figure_index.json`, `metrics/live_publication_status.json`, `metrics/quality_gates.json`, `metrics/mirror_parity.json`, `metrics/scope_claims_audit.json`, `metrics/publication_audit.json`, `metrics/website_integrity.json`, `metrics/project_manifest.json` |
| 2 | Are source facts consistently presented? | `SOURCE_ALIGNMENT_AUDIT.md`, `metrics/source_alignment_audit.json`, `scripts/validate_source_alignment.py` |
| 3 | How do I reproduce it? | `REPRODUCIBILITY.md`, `metrics/reproducibility_matrix.json`, companion GitHub `notes/reproducibility_audit.md` |
| 4 | What is one model input? | `artifacts/episode_task_suite/feature_manifest.json`, `artifacts/episode_task_suite/available_modalities.json`, companion artifact dataset `windows.csv` |
| 5 | Are the task results backed by files? | `artifacts/episode_task_suite/summary_report.json`, `artifacts/episode_task_suite/neural_mlp/`, `metrics/summary_metrics.json` |
| 6 | What is still pending? | companion GitHub `results/omni_finetune/DATA_BLOCKER_REPORT.md` and `A100_HF_RELAY_STATUS.md` |

Mirrors:
- Human-readable artifact guide: `ARTIFACT_GUIDE.md`.
- Reviewer scorecard: `REVIEWER_SCORECARD.md` and `metrics/reviewer_scorecard.json`.
- Official dataset-card alignment: `XPERIENCE10M_DATASET_CARD_ALIGNMENT.md` and `metrics/xperience10m_dataset_card_alignment.json`.
- Source-alignment audit: `SOURCE_ALIGNMENT_AUDIT.md` and `metrics/source_alignment_audit.json`.
- Publication quality gates: `QUALITY_GATES.md` and `metrics/quality_gates.json`.
- Live publication status: `metrics/live_publication_status.json`.
- Machine-readable reviewer packet: `metrics/reviewer_packet.json`.
- Source-of-truth artifact index: `metrics/artifact_index.json`.
- Source-of-truth figure index: `FIGURE_INDEX.md` and `metrics/figure_index.json`.
## Evidence Boundary
| Claim layer | Evidence | Boundary |
| --- | --- | --- |
| Reviewer scorecard | `REVIEWER_SCORECARD.md`, `metrics/reviewer_scorecard.json` | compact verified/data-gated/not-redistributed decision table |
| Baseline weights | `artifacts/**/model.npz` | lightweight heads only |
| Neural checkpoints | `artifacts/episode_task_suite/neural_mlp/**/model.pt` | same single-episode windows and splits |
| Metrics | `artifacts/**/metrics.json`, prediction CSV/NPZ files | debugging and task-contract evidence |
| Feature contract | `artifacts/**/feature_manifest.json` | audio documented but not featurized |
| Evaluation protocol | `EVALUATION_PROTOCOL.md`, `metrics/evaluation_protocol.json` | windowing, chronological split, leakage controls, and task metrics |
| Qwen3-Omni | companion blocker and relay reports | smoke-only until 32 valid episodes are available |
| Scope claims guard | `metrics/scope_claims_audit.json` and `scripts/validate_scope_claims.py` | historical `32ep` path strings are provenance, not 32-episode results |
| Mirror parity | `metrics/mirror_parity.json` and `scripts/validate_mirror_parity.py` | prepared repo/HF mirrors carry matching critical data, figures, website HTML, and validator files |
| Publication hygiene | `metrics/publication_audit.json` and validator script mirror | public bundles contain no raw data, generated caches, heavy archives, token strings, or stale public-card figure references |
| Website integrity | `metrics/website_integrity.json` and validator script mirror | local links, anchors, JSON bundles, and referenced images only |
| Quality gates | `QUALITY_GATES.md`, `metrics/quality_gates.json`, and `scripts/build_quality_gates.py` | automated release gates plus live post-publish checks |
| Live publication | `metrics/live_publication_status.json`, `scripts/verify_live_publication.py` | last public GitHub/HF URL verification after upload |
| Official dataset card alignment | `XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`, `metrics/xperience10m_dataset_card_alignment.json` | official source scope, public sample card, HF API listing, gated access, modality coverage, scale, and this repo's single-episode boundary |
| Source alignment audit | `SOURCE_ALIGNMENT_AUDIT.md`, `metrics/source_alignment_audit.json`, `scripts/validate_source_alignment.py` | validates full-dataset facts, sample-card facts, API-listing caveats, and public-card boundary markers |
| Figure index | `FIGURE_INDEX.md`, `metrics/figure_index.json`, `scripts/build_figure_index.py` | public figures, charts, modality thumbnails, dimensions, hashes, and generation provenance |
| Artifact index | `metrics/artifact_index.json` and `scripts/build_artifact_index.py` | compact catalog of the reviewer-critical proof artifacts |
| Artifact guide | `ARTIFACT_GUIDE.md` | human-readable map of proof boundary, task evidence, mirrors, and scale-up status |
| Reproducibility | `REPRODUCIBILITY.md`, `metrics/reproducibility_matrix.json` | public commands, expected outputs, exact-match audit evidence, and non-reproducible boundaries |
| Citation metadata | GitHub `CITATION.cff`, `codemeta.json`, `project_manifest.json`, and `reviewer_packet.json` | code license remains separate from Xperience-10M dataset terms |
## Qwen3-Omni LoRA Boundary
The companion GitHub repo now includes scripts for an A100-to-H20
Xperience-10M relay and a Qwen3-Omni LoRA pilot path. The current LoRA checkpoint
is a technical smoke artifact from one locally available episode and 128 train
windows. It is not a full 32-episode result.
The next real model milestone is a 32-episode held-out-episode LoRA pilot after
Hugging Face access to `ropedia-ai/xperience-10m` is approved. The staging plan
selects 32 complete episodes from 32 different top-level session UUIDs, then
transfers them to H20 for manifest building, training, and evaluation.
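
The selection rule in the staging plan (one complete episode from each of 32 distinct sessions) can be sketched as follows. This is only an illustration: the function name and the `session`, `episode`, and `complete` keys are hypothetical, and the sketch does not define what "complete" means for a real episode folder.

```python
def select_one_episode_per_session(episodes, n_sessions=32):
    """Pick one complete episode from each of `n_sessions` distinct sessions.

    `episodes` is an iterable of dicts with hypothetical keys "session",
    "episode", and "complete"; the real listing format may differ.
    """
    chosen = {}
    # Sorting makes the selection deterministic across runs.
    for ep in sorted(episodes, key=lambda e: (e["session"], e["episode"])):
        if ep["complete"] and ep["session"] not in chosen:
            chosen[ep["session"]] = ep
        if len(chosen) == n_sessions:
            break
    if len(chosen) < n_sessions:
        raise ValueError(f"only {len(chosen)} sessions have a complete episode")
    return list(chosen.values())


# Toy listing with made-up identifiers.
listing = [
    {"session": "s1", "episode": "ep1", "complete": False},
    {"session": "s1", "episode": "ep2", "complete": True},
    {"session": "s2", "episode": "ep1", "complete": True},
    {"session": "s2", "episode": "ep2", "complete": True},
]
picked = select_one_episode_per_session(listing, n_sessions=2)
```

Taking at most one episode per session keeps the held-out episodes from sharing a recording session with the training episodes.
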
## What To Look At First
| Artifact | Why it is useful |
| --- | --- |
| `REVIEWER_SCORECARD.md`, `metrics/reviewer_scorecard.json` | gives the compact current decision boundary before reading the full audit trail |
| `artifacts/**/model.npz` | stores the exact lightweight weights and scalers |
| `artifacts/episode_task_suite/neural_mlp/**/model.pt` | stores the neural MLP checkpoints |
| `artifacts/**/metrics.json` | records the committed metric values |
| `artifacts/**/feature_manifest.json` | maps feature blocks back to source modalities |
| `EVALUATION_PROTOCOL.md`, `metrics/evaluation_protocol.json` | defines task-unit, split, metric, leakage-control, and unsupported-interpretation rules |
| `artifacts/episode_task_suite/research_directions/` | maps every task to the four Ropedia research directions with minimal-vs-neural readouts |
| `artifacts/episode_task_suite/research_direction_extensions/` | adds one coded extension probe per research direction |
| `artifacts/episode_task_suite/task_walkthroughs/` | explains every task with case study, input, process modules, output, and limitation |
| `assets/task_architectures.png` | shows the shared pipeline and all 12 heads |
| `assets/task_suite_infographic.png` | presents the shared processing contract, 12 heads, verified metrics, and public-sample modality thumbnails |
| `assets/modalities/`, `metrics/modality_atlas.json` | responsive modality-card thumbnails and metadata for sample inspection |
| `XPERIENCE10M_DATASET_CARD_ALIGNMENT.md`, `metrics/xperience10m_dataset_card_alignment.json` | aligns public wording with the official gated Xperience-10M card, sample card, and HF API metadata |
| `SOURCE_ALIGNMENT_AUDIT.md`, `metrics/source_alignment_audit.json` | verifies source facts and boundary markers across GitHub, the website, and HF cards |
| `FIGURE_INDEX.md`, `metrics/figure_index.json` | verifies public figures, charts, thumbnails, dimensions, hashes, and source scripts |
| `metrics/artifact_index.json` | indexes proof artifacts with existence, size, and stable-file hashes |
| `metrics/mirror_parity.json` | verifies prepared repo/HF mirrors have matching critical data, figures, website HTML, and validator files before upload |
| `metrics/scope_claims_audit.json` | verifies historical `32ep` smoke-run identifiers are not presented as real 32-episode results |
| `QUALITY_GATES.md`, `metrics/quality_gates.json` | summarizes the automated and post-publish release checks |
| `metrics/live_publication_status.json` | records the last live public URL verification after upload |
| `metrics/publication_audit.json` | records the latest public-bundle hygiene and public-card freshness check |
| `metrics/website_integrity.json` | records the latest local website link, anchor, JSON, and image integrity check |
| `metrics/project_manifest.json` | mirrors the public URL and citation metadata bundle |
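
Several rows above refer to existence, size, and stable-hash checks. A minimal standard-library sketch of that kind of check is shown below. The entry keys (`path`, `size`, `sha256`) and the function names are assumptions for illustration; the actual schema of `metrics/artifact_index.json` and the hash algorithm it uses may differ.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large checkpoints are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_entry(root, entry):
    """Compare one index entry against the file on disk.

    `entry` uses hypothetical keys ("path", "size", "sha256"); the real
    index schema may differ.
    """
    path = Path(root) / entry["path"]
    if not path.is_file():
        return "missing"
    if path.stat().st_size != entry["size"]:
        return "size mismatch"
    if sha256_of(path) != entry["sha256"]:
        return "hash mismatch"
    return "ok"


# Self-contained demo on a temporary file rather than a real artifact.
with tempfile.TemporaryDirectory() as root:
    demo = Path(root) / "metrics.json"
    demo.write_text(json.dumps({"f1": 0.6552}))
    entry = {"path": "metrics.json", "size": demo.stat().st_size, "sha256": sha256_of(demo)}
    status_ok = check_entry(root, entry)
    status_bad = check_entry(root, {**entry, "sha256": "0" * 64})
    status_missing = check_entry(root, {**entry, "path": "absent.json"})
```

The size comparison runs before the hash so that obviously wrong files are rejected without reading them in full.
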
## Included
- `artifacts/**/model.npz`: minimal baseline weights, scalers, and labels
- `artifacts/episode_task_suite/neural_mlp/**/model.pt`: neural MLP task-head checkpoints
- `artifacts/episode_task_suite/neural_mlp/**/history.json`: neural training traces
- `artifacts/**/metrics.json`: committed metrics
- `artifacts/**/feature_manifest.json`: feature block boundaries where relevant
- `artifacts/episode_task_suite/research_directions/*.json|*.csv|*.md`: four-track task taxonomy
- `artifacts/episode_task_suite/research_direction_extensions/*.json|*.csv|*.md`: four extension-probe metrics and predictions
- `artifacts/episode_task_suite/task_walkthroughs/*.json|*.md`: beginner walkthroughs for all 12 tasks
- `REVIEWER_SCORECARD.md`, `metrics/reviewer_scorecard.json`: compact current decision table
- `scripts/*.py`: training and visualization scripts
- `scripts/validate_mirror_parity.py`: prepared mirror parity validator
- `scripts/validate_scope_claims.py`: Qwen3-Omni smoke/result claim-boundary validator
- `scripts/validate_publication_package.py`: publication hygiene validator
- `scripts/validate_website_integrity.py`: website local-reference validator
- `notes/*.md`: interpretation and reproducibility notes
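
Before relying on a `model.npz` file, it can help to list what it contains. The sketch below summarizes the arrays in an `.npz` archive using `numpy.load`. The helper name and the array names in the demo (`weights`, `mean`) are made up for illustration, since the key names used by the committed files are not stated here.

```python
import os
import tempfile

import numpy as np


def describe_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz file."""
    # allow_pickle=False refuses object arrays, which is the safer default
    # for files downloaded from a hub.
    with np.load(path, allow_pickle=False) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }


# Demo archive with made-up array names; the committed files may use other keys.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "model.npz")
    np.savez(demo_path, weights=np.zeros((4, 3), dtype=np.float32), mean=np.zeros(4))
    summary = describe_npz(demo_path)
```

If a committed archive stores labels as object arrays, `allow_pickle=False` will raise an error when that array is read, and the caller can then decide whether to trust the file. The `model.pt` checkpoints are PyTorch files and would typically be opened with `torch.load(path, map_location="cpu")` instead.
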
The companion artifact dataset repo stores CSV/JSON predictions and dashboard assets:
https://huggingface.co/datasets/cy0307/ropedia-xperience-10m-task-suite-artifacts
The public visual dashboard is here:
https://huggingface.co/spaces/cy0307/ropedia-xperience-10m-task-suite
Direct static app:
https://cy0307-ropedia-xperience-10m-task-suite.static.hf.space/
The full Hugging Face collection is here:
https://huggingface.co/collections/cy0307/ropedia-xperience-10m-task-suite
## Minimal and Neural Architecture
The shared pipeline and all 12 task heads are shown in `assets/task_architectures.png`.

## Four Research Directions
The baselines are also grouped by the four Ropedia research tracks:
| Direction | Current status | Baseline evidence |
| --- | --- | --- |
| A. Human Modeling & Motion Understanding | partially implemented | hand trajectory forecasting improves from `0.8223` to `0.1116` MPJPE with the neural MLP; contact is degenerate in this sample |
| B. 3D/4D Reconstruction & Neural Rendering | proxy tasks only | cross-modal retrieval, feature reconstruction, and misalignment are prerequisites, not full neural rendering |
| C. Egocentric Vision & Interaction | strongest implemented track | action/subtask/transition/next-action/object/caption tasks plus alignment/order diagnostics |
| D. Scene Reconstruction & World Modeling | early proxy tasks | state, object, retrieval, reconstruction, and temporal tasks are first probes before scene graphs or maps |
Primary taxonomy file:
`artifacts/episode_task_suite/research_directions/research_direction_taxonomy.json`
## Direction-Extension Probe Snapshot
| Direction | Extension task | Minimal | Neural MLP |
| --- | --- | ---: | ---: |
| A. Human Modeling & Motion Understanding | `body_motion_intensity` | 0.7827 macro-F1 | 0.7986 macro-F1 |
| B. 3D/4D Reconstruction & Neural Rendering | `multi_view_consistency_retrieval` | 0.5534 MRR | 0.3469 MRR |
| C. Egocentric Vision & Interaction | `action_phase_progress` | 0.3416 MAE | 0.3038 MAE |
| D. Scene Reconstruction & World Modeling | `ego_motion_forecast` | 0.1989 MAE | 0.0989 MAE |
These probes reuse the same 1,161-window feature tensor and chronological split
style. MAE is lower-is-better; macro-F1 and MRR are higher-is-better. The probes
are direction-specific diagnostics, not full human-body, neural rendering,
intent, or world-model solutions.
## Metrics Snapshot
| Task | Neural MLP metric | Minimal metric |
| --- | ---: | ---: |
| `timeline_action` macro-F1 | 0.0263 | 0.0500 |
| `timeline_subtask` macro-F1 | 0.0175 | 0.0495 |
| `transition_detection` macro-F1 | 0.6485 | 0.6552 |
| `next_action` macro-F1 | 0.0235 | 0.0593 |
| `hand_trajectory_forecast` MPJPE, lower is better | 0.1116 | 0.8223 |
| `contact_prediction` macro-F1 (contact labels are degenerate in this sample) | 1.0000 | 1.0000 |
| `object_relevance` micro-F1 | 0.1798 | 0.1839 |
| `caption_grounding` MRR | 0.0178 | 0.0172 |
| `cross_modal_retrieval` MRR | 0.1530 | 0.2634 |
| `modality_reconstruction` R² | -0.0102 | -0.0160 |
| `temporal_order` F1 | 0.8718 | 0.5487 |
| `misalignment_detection` F1 | 0.7335 | 0.4866 |
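
For readers unfamiliar with the metric names in the table, the following standard-library sketch gives textbook definitions of MRR, macro-F1, and MPJPE. These are generic formulations under common conventions, not the repo's evaluation code; the function names are hypothetical, and details such as which classes are averaged are set by the evaluation protocol.

```python
import math


def mean_reciprocal_rank(ranked_ids, target_ids):
    """MRR over queries; a query whose target is absent contributes 0."""
    total = 0.0
    for ranking, target in zip(ranked_ids, target_ids):
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)
    return total / len(target_ids)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over classes seen in y_true or y_pred."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mpjpe(pred, target):
    """Mean Euclidean distance between predicted and target joint positions."""
    dists = [math.dist(p, t) for p, t in zip(pred, target)]
    return sum(dists) / len(dists)


mrr = mean_reciprocal_rank([[3, 1, 2], [0, 2, 1]], [1, 0])  # (1/2 + 1/1) / 2
f1 = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])                   # (2/3 + 4/5) / 2
err = mpjpe([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 3.0, 4.0), (1.0, 0.0, 0.0)])
single_class = macro_f1([0, 0, 0], [0, 0, 0])
```

Under this definition `single_class` evaluates to 1.0, which shows one way a label set with a single class can produce a perfect macro-F1 without demonstrating any predictive skill.
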
## Data Notice
This repo does not redistribute raw Xperience-10M videos or raw `annotation.hdf5` files. Download the original sample from Ropedia / Hugging Face and follow the dataset terms:
- https://huggingface.co/datasets/ropedia-ai/xperience-10m-sample
- https://ropedia.com/dataset
## Source
GitHub:
https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite
GitHub Pages:
https://chaoyue0307.github.io/ropedia-xperience-10m-task-suite/