license: other
library_name: pytorch
tags:
  - robotics
  - embodied-ai
  - multimodal
  - ropedia
  - xperience-10m
  - baseline
  - neural-network
  - pytorch
  - linear-model
  - retrieval
metrics:
  - accuracy
  - f1
  - mean-reciprocal-rank
  - mean-squared-error
model-index:
  - name: Ropedia Xperience-10M Task Baselines
    results:
      - task:
          type: robotics
          name: Cross-modal retrieval
        dataset:
          type: ropedia-ai/xperience-10m-sample
          name: Xperience-10M public sample episode
        metrics:
          - type: top_5_accuracy
            value: 0.3764
            name: top-5 retrieval accuracy
          - type: mrr
            value: 0.2634
            name: mean reciprocal rank
      - task:
          type: robotics
          name: Transition detection
        dataset:
          type: ropedia-ai/xperience-10m-sample
          name: Xperience-10M public sample episode
        metrics:
          - type: f1
            value: 0.6552
            name: minimal-baseline macro-F1
      - task:
          type: robotics
          name: Temporal order
        dataset:
          type: ropedia-ai/xperience-10m-sample
          name: Xperience-10M public sample episode
        metrics:
          - type: f1
            value: 0.8718
            name: neural MLP F1

Ropedia Xperience-10M Task Baselines

This model repo stores the minimal baseline weights, compact neural MLP task-head checkpoints, metrics, and prediction artifacts for the Ropedia Xperience-10M 12-task public-sample suite. The goal is to make the task contracts and model inputs inspectable before larger multimodal fine-tuning runs. These are lightweight task heads, not a robot foundation model.

The source Xperience-10M sample spans video, audio, depth, pose, motion capture, inertial sensing, and language annotation. The committed minimal and neural task heads use the current 8,378-d feature manifest; audio is documented in the figures but is not yet extracted into a model input feature block.
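As a rough illustration of what a feature-block contract can look like, the sketch below lays named blocks back to back, checks that they tile the full vector without gaps or overlaps, and slices one block out of a feature vector. The schema of feature_manifest.json is not shown in this card, so the `(start, end)` layout, the helper names, and the toy block widths are all assumptions; the committed manifest reports 8,378 dimensions, while the toy example uses 9.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: block name -> (start, end), with end exclusive.
# The real feature_manifest.json schema may differ.
FeatureBlocks = Dict[str, Tuple[int, int]]


def build_blocks(sizes: List[Tuple[str, int]]) -> FeatureBlocks:
    """Lay out named blocks back to back and return their index ranges."""
    blocks: FeatureBlocks = {}
    offset = 0
    for name, width in sizes:
        if width <= 0:
            raise ValueError(f"block {name!r} must have a positive width")
        blocks[name] = (offset, offset + width)
        offset += width
    return blocks


def validate_blocks(blocks: FeatureBlocks, expected_dim: int) -> bool:
    """Return True if the blocks tile [0, expected_dim) with no gaps or overlaps."""
    cursor = 0
    for start, end in sorted(blocks.values()):
        if start != cursor or end <= start:
            return False
        cursor = end
    return cursor == expected_dim


def slice_block(vector: Sequence[float], blocks: FeatureBlocks, name: str) -> List[float]:
    """Extract the sub-vector that belongs to one named block."""
    start, end = blocks[name]
    return list(vector[start:end])


# Toy widths for illustration only; they are not the project's real block sizes.
toy_blocks = build_blocks([("video", 4), ("depth", 3), ("pose", 2)])
toy_vector = [float(i) for i in range(9)]
depth_features = slice_block(toy_vector, toy_blocks, "depth")
```

A check along the lines of `validate_blocks(blocks, 8378)` would be one way to confirm that a manifest and a feature matrix agree before training a task head.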

The companion website HTML, task-first 12-head map, responsive modality atlas, interactive scrub/play storyboard, metrics/brand_assets.json, and scripts/build_brand_assets.py are included so that this model repo stays aligned with the public Space and the artifact dataset. The research takeaways layer (metrics/research_takeaways.json and RESEARCH_TAKEAWAYS.md) is regenerated by scripts/build_research_takeaways.py. Project status and figure metadata are mirrored in metrics/project_status.json and metrics/figure_index.json.

For a short first-reader path, open PROJECT_BRIEF.md or metrics/project_brief.json first, then RESEARCH_ROADMAP.md or metrics/research_roadmap.json, and only then inspect the model artifacts.

Release Artifacts

Artifact	Where to inspect
Project Brief	`PROJECT_BRIEF.md`, `metrics/project_brief.json`
Research Roadmap	`RESEARCH_ROADMAP.md`, `metrics/research_roadmap.json`
Research Takeaways	`RESEARCH_TAKEAWAYS.md`, `metrics/research_takeaways.json`
Multi-episode data status	`results/omni_finetune/DATA_ACCESS_STATUS.md`
Release checks	`QUALITY_GATES.md`, `metrics/quality_gates.json`
Public project surface	`PUBLIC_SURFACE_QA.md`, `metrics/public_surface_qa.json`
Mirror parity	`metrics/mirror_parity.json`
Single-episode explorer	`single_episode_explorer.html`, `metrics/single_episode_explorer.json`

Current Scope

Project layer	Evidence	Current scope
Baseline weights	`artifacts/**/model.npz`	minimal linear/ridge/logistic task heads
Neural checkpoints	`artifacts/episode_task_suite/neural_mlp/**/model.pt`	compact MLP heads over the same windows and split
Metrics	`artifacts/**/metrics.json`, `metrics/summary_metrics.json`	single public-sample chronological split
Feature contract	`artifacts/**/feature_manifest.json`	8,378 current feature dimensions; audio documented, not featurized
Evaluation protocol	`EVALUATION_PROTOCOL.md`, `metrics/evaluation_protocol.json`	window unit, split policy, leakage controls, task metrics
Research roadmap	`RESEARCH_ROADMAP.md`, `metrics/research_roadmap.json`	staged path from public-sample task work to multi-episode and larger omni-model work
Research takeaways	`RESEARCH_TAKEAWAYS.md`, `metrics/research_takeaways.json`	interpretation of committed sample metrics and next held-out stage
Source alignment	`SOURCE_ALIGNMENT_AUDIT.md`, `metrics/source_alignment_audit.json`, `metrics/xperience10m_dataset_card_alignment.json`	official dataset-card facts, public sample-card facts, and current project coverage
Public project surface	`PUBLIC_SURFACE_QA.md`, `metrics/public_surface_qa.json`	repo, website, and Hugging Face card consistency
Task surface	`metrics/task_surface_integrity.json`, `scripts/validate_task_surface.py`	readable task names, modality thumbnails, and walkthrough wiring
Rendered website check	`RENDERED_SITE_CHECK.md`, `metrics/rendered_site_check.json`, `scripts/build_rendered_site_check.py`	browser-level load, tab, walkthrough deep-link, control-click, and console-health check
Single-episode diagnostics	`results/single_episode_diagnostics/`, `single_episode_explorer.html`	window labels, object sets, predictions, feature-block statistics, and diagnostic probes
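The "single public-sample chronological split" and "leakage controls" entries in the table above can be pictured with a minimal sketch: windows are ordered in time, the earliest fraction is used for training, and an optional guard gap of windows is dropped before the test portion so that overlapping windows do not straddle the boundary. The function name, the 0.75 fraction, and the gap size are assumptions for illustration; the actual policy is the one recorded in EVALUATION_PROTOCOL.md.

```python
from typing import List, Tuple


def chronological_split(
    n_windows: int, train_frac: float = 0.75, gap: int = 0
) -> Tuple[List[int], List[int]]:
    """Split time-ordered window indices into an early train set and a late test set.

    `gap` windows immediately after the train set are discarded, which is one
    simple way to keep overlapping windows out of both sides of the split.
    """
    if n_windows <= 0:
        raise ValueError("n_windows must be positive")
    if not 0.0 < train_frac < 1.0:
        raise ValueError("train_frac must be strictly between 0 and 1")
    if gap < 0:
        raise ValueError("gap must be non-negative")

    cut = int(n_windows * train_frac)
    train = list(range(cut))
    test = list(range(min(cut + gap, n_windows), n_windows))
    return train, test


# Illustrative values only: 12 windows, 75% train, one guard window.
train_idx, test_idx = chronological_split(12, train_frac=0.75, gap=1)
```

Because no index is shuffled, every training window precedes every test window, which is the property that distinguishes a chronological split from a random one.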

Metrics Snapshot

These are single-episode chronological-split metrics. They are useful for checking task definitions and input contracts, but they do not measure cross-episode model quality, which requires the later held-out multi-episode evaluation.

Task	Metric	Neural MLP	Minimal
Action Recognition	macro-F1	0.0263	0.0500
Procedure Step Recognition	macro-F1	0.0175	0.0495
Action Boundary Detection	macro-F1	0.6485	0.6552
Next-Action Prediction	macro-F1	0.0235	0.0593
Hand Trajectory Forecasting	MPJPE (lower is better)	0.1116	0.8223
Contact State Prediction	macro-F1	1.0000	1.0000
Object Relevance Prediction	micro-F1	0.1798	0.1839
Language Grounding	MRR	0.0178	0.0172
Cross-Modal Retrieval	MRR	0.1530	0.2634
Cross-Modal Reconstruction	R²	-0.0102	-0.0160
Temporal Order Verification	F1	0.8718	0.5487
Multimodal Synchronization Detection	F1	0.7335	0.4866
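For readers less familiar with the metric names in the table, the sketch below gives plain-Python versions of mean reciprocal rank, top-k accuracy, macro-F1, and R² under their common textbook definitions. These are assumptions about the conventions, not the project's evaluation code; the exact averaging and label handling are defined in EVALUATION_PROTOCOL.md. The last example shows why R² can be negative: predictions that fit worse than the constant mean of the targets give a residual sum of squares larger than the total sum of squares.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank, where rank is the 1-based position of the correct item."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def top_k_accuracy(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct item appears within the top k results."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred), key=str)
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (y_true must not be constant)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy inputs for illustration; none of these numbers come from the project.
toy_mrr = mean_reciprocal_rank([1, 2, 4])
toy_top2 = top_k_accuracy([1, 2, 4], k=2)
toy_macro_f1 = macro_f1(["a", "a", "b"], ["a", "b", "b"])
r2_mean_predictor = r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
r2_worse_than_mean = r2_score([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Under these definitions, macro-F1 weights every class equally regardless of how often it occurs, so a head that misses most rare classes scores low even when its overall accuracy looks reasonable.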

Official Dataset Alignment

The model card mirrors the official-source alignment artifacts at metrics/xperience10m_dataset_card_alignment.json, metrics/source_alignment_audit.json, and XPERIENCE10M_DATASET_CARD_ALIGNMENT.md. Those files record the scope of the official gated ropedia-ai/xperience-10m dataset card, its manually reviewed access, full-scale modality coverage, episode layout, intended uses, and limitations, along with the current project coverage. They also record the public sample card (cc-by-nc-4.0, HOMIE Toolkit, Rerun 0.29.0 .rrd visualization) and the observed HF API listing snapshot: 803 session folders and 12,103 episode folders with annotation.hdf5, plus the live HF file-size display of 31.9 TB. The live file-size display is tracked separately from the official card's statement of about 1 PB of full-scale storage. These are upstream metadata facts; they do not indicate that the data is held locally. The official card also notes that the open dataset is limited in diversity and showcase/production quality.

Included

artifacts/**/model.npz: minimal baseline weights, scalers, and labels
artifacts/episode_task_suite/neural_mlp/**/model.pt: neural MLP task-head checkpoints
artifacts/episode_task_suite/neural_mlp/**/history.json: neural training traces
artifacts/**/metrics.json: committed metrics
artifacts/**/feature_manifest.json: feature block boundaries where relevant
assets/: mirrored figures, modality thumbnails, and brand assets
metrics/: project status, protocol, source-alignment, release, and public-surface JSON files
scripts/: reproduction, visualization, and validation scripts
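The glob patterns in the list above map directly onto a recursive file search. As a minimal sketch, assuming a local clone of this repo, the helper below collects every metrics.json under artifacts/ and returns the parsed JSON keyed by relative path. The function names are hypothetical, nothing is assumed about the JSON schema, and the demo builds a throwaway directory with placeholder content rather than real project metrics.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def collect_metrics(repo_root: str) -> Dict[str, object]:
    """Load every artifacts/**/metrics.json under repo_root, keyed by relative path."""
    root = Path(repo_root)
    found: Dict[str, object] = {}
    for path in sorted((root / "artifacts").glob("**/metrics.json")):
        with path.open("r", encoding="utf-8") as handle:
            found[path.relative_to(root).as_posix()] = json.load(handle)
    return found


def demo() -> List[str]:
    """Build a temporary tree with one placeholder file and list what is found."""
    with tempfile.TemporaryDirectory() as tmp:
        task_dir = Path(tmp) / "artifacts" / "task_a"  # hypothetical task folder
        task_dir.mkdir(parents=True)
        # Placeholder payload; real metrics files will contain other keys.
        (task_dir / "metrics.json").write_text('{"placeholder": true}', encoding="utf-8")
        return list(collect_metrics(tmp).keys())
```

Against a real clone, `collect_metrics(".")` would be the starting point; the same traversal could be pointed at model.npz or feature_manifest.json by changing the pattern.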

Links

Resource	URL
HF Space	https://huggingface.co/spaces/cy0307/ropedia-xperience-10m-task-suite
Artifact dataset	https://huggingface.co/datasets/cy0307/ropedia-xperience-10m-task-suite-artifacts
GitHub repo	https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite
GitHub Pages dashboard	https://chaoyue0307.github.io/ropedia-xperience-10m-task-suite/
Official Xperience-10M dataset	https://huggingface.co/datasets/ropedia-ai/xperience-10m
Public Xperience-10M sample	https://huggingface.co/datasets/ropedia-ai/xperience-10m-sample
Ropedia dataset page	https://ropedia.com/dataset

Dataset use remains governed by the official Ropedia/Xperience-10M terms.