metadata
license: other
library_name: pytorch
tags:
  - robotics
  - embodied-ai
  - multimodal
  - ropedia
  - xperience-10m
  - baseline
  - neural-network
  - pytorch
  - linear-model
  - retrieval
metrics:
  - accuracy
  - f1
  - mean-reciprocal-rank
  - mean-squared-error
model-index:
  - name: Ropedia Xperience-10M Task Baselines
    results:
      - task:
          type: robotics
          name: Cross-modal retrieval
        dataset:
          type: ropedia-ai/xperience-10m-sample
          name: Xperience-10M public sample episode
        metrics:
          - type: top_5_accuracy
            value: 0.3678
            name: top-5 retrieval accuracy
          - type: mrr
            value: 0.2693
            name: mean reciprocal rank
      - task:
          type: robotics
          name: Transition detection
        dataset:
          type: ropedia-ai/xperience-10m-sample
          name: Xperience-10M public sample episode
        metrics:
          - type: f1
            value: 0.6118
            name: macro-F1
      - task:
          type: robotics
          name: Temporal order
        dataset:
          type: ropedia-ai/xperience-10m-sample
          name: Xperience-10M public sample episode
        metrics:
          - type: f1
            value: 0.852
            name: neural MLP F1

Ropedia Xperience-10M Task Baselines

This model repo stores the minimal baseline weights, compact neural MLP task-head checkpoints, metrics, and prediction artifacts for the Ropedia Xperience-10M 12-task public-sample suite. The goal is to make the task contracts and model inputs inspectable before larger multimodal fine-tuning runs. These are lightweight task heads, not a robot foundation model.

Ropedia Xperience-10M Task Suite logo

12-task suite with sample modalities

The source Xperience-10M sample spans video, audio, depth, pose, motion capture, inertial sensing, and language annotation. The committed minimal and neural task heads use the current 8,546-d feature manifest, including a 168-d AAC audio block decoded from fisheye_cam0.mp4.
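A feature manifest of this kind can be read as a list of named column ranges over the 8,546-d vector. The sketch below assumes a hypothetical schema (a `blocks` list with `name`, `start`, and `end` keys plus a `total_dim` field); the real feature_manifest.json layout may differ, so inspect the file before relying on these keys. Only the 8,546 total width, the 168-d audio width, and the block name audio_fisheye_cam0_aac come from this card; the `other_blocks` entry and all offsets are placeholders.

```python
import json

# Hypothetical manifest: the real feature_manifest.json schema may differ.
# "other_blocks" and every offset below are illustrative placeholders.
EXAMPLE_MANIFEST = json.loads("""
{
  "total_dim": 8546,
  "blocks": [
    {"name": "other_blocks", "start": 0, "end": 8378},
    {"name": "audio_fisheye_cam0_aac", "start": 8378, "end": 8546}
  ]
}
""")


def block_slice(manifest, name):
    """Return the half-open column range [start, end) for a named block."""
    for block in manifest["blocks"]:
        if block["name"] == name:
            return slice(block["start"], block["end"])
    raise KeyError(f"no feature block named {name!r}")


def check_manifest(manifest):
    """Verify that the blocks tile [0, total_dim) with no gaps or overlaps."""
    cursor = 0
    for block in sorted(manifest["blocks"], key=lambda b: b["start"]):
        if block["start"] != cursor:
            raise ValueError(f"gap or overlap before {block['name']}")
        cursor = block["end"]
    if cursor != manifest["total_dim"]:
        raise ValueError("blocks do not cover the full feature vector")


check_manifest(EXAMPLE_MANIFEST)
window = [0.0] * EXAMPLE_MANIFEST["total_dim"]  # one placeholder feature window
audio = window[block_slice(EXAMPLE_MANIFEST, "audio_fisheye_cam0_aac")]
print(len(audio))  # 168
```

Checking that the blocks tile the full vector is a cheap way to catch a stale manifest before feeding windows to a task head.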

The companion website HTML, the task-first 12-head map, the responsive modality atlas, the interactive scrub/play storyboard, metrics/brand_assets.json, and scripts/build_brand_assets.py are included so that this model repo stays aligned with the public Space and the artifact dataset. The research takeaways layer (metrics/research_takeaways.json and RESEARCH_TAKEAWAYS.md) is regenerated by scripts/build_research_takeaways.py. Project status and figure metadata are mirrored in metrics/project_status.json and metrics/figure_index.json.

For a short first-reader path, start with PROJECT_BRIEF.md or metrics/project_brief.json, continue with research_roadmap.html, RESEARCH_ROADMAP.md, or metrics/research_roadmap_interactive.json, and only then inspect the model artifacts.

Release Artifacts

| Artifact | Where to inspect |
| --- | --- |
| Project Brief | PROJECT_BRIEF.md, metrics/project_brief.json |
| Research Roadmap | research_roadmap.html, RESEARCH_ROADMAP.md, metrics/research_roadmap.json, metrics/research_roadmap_interactive.json |
| Research Takeaways | RESEARCH_TAKEAWAYS.md, metrics/research_takeaways.json |
| Multi-episode data status | results/omni_finetune/DATA_ACCESS_STATUS.md |
| Release checks | QUALITY_GATES.md, metrics/quality_gates.json |
| Public project surface | PUBLIC_SURFACE_QA.md, metrics/public_surface_qa.json |
| Mirror parity | metrics/mirror_parity.json |
| Single-episode explorer | single_episode_explorer.html, metrics/single_episode_explorer.json |

Current Scope

| Project layer | Evidence | Current scope |
| --- | --- | --- |
| Baseline weights | artifacts/**/model.npz | minimal linear/ridge/logistic task heads |
| Neural checkpoints | artifacts/episode_task_suite/neural_mlp/**/model.pt | compact MLP heads over the same windows and split |
| Metrics | artifacts/**/metrics.json, metrics/summary_metrics.json | single public-sample chronological split |
| Feature contract | artifacts/**/feature_manifest.json | 8,546 current feature dimensions, including audio_fisheye_cam0_aac |
| Evaluation protocol | EVALUATION_PROTOCOL.md, metrics/evaluation_protocol.json | window unit, split policy, leakage controls, task metrics |
| Research roadmap | research_roadmap.html, RESEARCH_ROADMAP.md, metrics/research_roadmap.json, metrics/research_roadmap_interactive.json | interactive and machine-readable path from public-sample task work to multi-episode and larger omni-model work |
| Research takeaways | RESEARCH_TAKEAWAYS.md, metrics/research_takeaways.json | interpretation of committed sample metrics and next held-out stage |
| Source alignment | SOURCE_ALIGNMENT_AUDIT.md, metrics/source_alignment_audit.json, metrics/xperience10m_dataset_card_alignment.json | official dataset-card facts, public sample-card facts, and current project coverage |
| Public project surface | PUBLIC_SURFACE_QA.md, metrics/public_surface_qa.json | repo, website, and Hugging Face card consistency |
| Task surface | metrics/task_surface_integrity.json, scripts/validate_task_surface.py | readable task names, modality thumbnails, and walkthrough wiring |
| Rendered website check | RENDERED_SITE_CHECK.md, metrics/rendered_site_check.json, scripts/build_rendered_site_check.py | browser-level load, tab, walkthrough deep-link, control-click, and console-health check |
| Single-episode diagnostics | results/single_episode_diagnostics/, single_episode_explorer.html | window labels, object sets, predictions, feature-block statistics, and diagnostic probes |
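The single chronological split and the leakage controls listed above can be illustrated with a minimal sketch. It is not the project's split code: the `train_fraction` and `gap` parameters, the window tuples, and the 0.5 s spacing are all assumptions made for illustration, and EVALUATION_PROTOCOL.md remains the authority on the actual policy.

```python
def chronological_split(windows, train_fraction=0.75, gap=0):
    """Split time-ordered windows into train and test without shuffling.

    windows: list of (start_time, payload) tuples.
    gap: number of windows dropped at the boundary so that overlapping
         windows do not straddle train and test (a simple leakage control).
    """
    ordered = sorted(windows, key=lambda w: w[0])
    cut = int(len(ordered) * train_fraction)
    train = ordered[:cut]
    test = ordered[cut + gap:]
    return train, test


# Twelve hypothetical windows spaced 0.5 s apart.
windows = [(t * 0.5, f"window_{t}") for t in range(12)]
train, test = chronological_split(windows, train_fraction=0.75, gap=1)

# Every training window starts before every test window.
assert max(t for t, _ in train) < min(t for t, _ in test)
print(len(train), len(test))  # 9 2
```

Because both halves come from the same episode, a split like this checks the task contract but says little about generalization to unseen episodes.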

Metrics Snapshot

These are single-episode, chronological-split metrics. They are useful for checking task definitions and input contracts, but they do not measure cross-episode model quality, which requires the later held-out multi-episode evaluation.

| Task | Metric | Neural MLP | Minimal |
| --- | --- | --- | --- |
| Action Recognition | macro-F1 | 0.0148 | 0.0500 |
| Procedure Step Recognition | macro-F1 | 0.0281 | 0.0506 |
| Action Boundary Detection | macro-F1 | 0.5862 | 0.6118 |
| Next-Action Prediction | macro-F1 | 0.0419 | 0.0593 |
| Hand Trajectory Forecasting | MPJPE (lower is better) | 0.1079 | 0.8647 |
| Contact State Prediction | macro-F1 | 1.0000 | 1.0000 |
| Object Relevance Prediction | micro-F1 | 0.1679 | 0.1803 |
| Language Grounding | MRR | 0.0168 | 0.0160 |
| Cross-Modal Retrieval | MRR | 0.1300 | 0.2693 |
| Cross-Modal Reconstruction | R² | -0.0102 | -0.0153 |
| Temporal Order Verification | F1 | 0.8520 | 0.5400 |
| Multimodal Synchronization Detection | F1 | 0.7153 | 0.5052 |
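For readers unfamiliar with the retrieval metrics, the following sketch shows one common way to compute mean reciprocal rank and top-k accuracy from per-query candidate scores. The function names, the toy scores, and the pessimistic tie rule are assumptions for illustration; the project's own evaluation scripts may differ in these details, and the numbers below are unrelated to the table.

```python
def target_rank(scores, target):
    """Return the 1-based rank of scores[target]; higher scores rank first.

    Ties are resolved pessimistically: candidates with an equal score are
    counted ahead of the target.
    """
    target_score = scores[target]
    return sum(1 for s in scores if s >= target_score)


def mean_reciprocal_rank(score_rows, targets):
    """Average of 1/rank of the correct candidate over all queries."""
    ranks = [target_rank(row, t) for row, t in zip(score_rows, targets)]
    return sum(1.0 / r for r in ranks) / len(ranks)


def top_k_accuracy(score_rows, targets, k=5):
    """Fraction of queries whose correct candidate ranks within the top k."""
    ranks = [target_rank(row, t) for row, t in zip(score_rows, targets)]
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy example: 3 queries, 4 candidates each (made-up similarity scores).
score_rows = [
    [0.9, 0.1, 0.2, 0.3],  # correct candidate 0 ranks 1st
    [0.2, 0.1, 0.8, 0.5],  # correct candidate 3 ranks 2nd
    [0.1, 0.4, 0.3, 0.2],  # correct candidate 2 ranks 2nd
]
targets = [0, 3, 2]

mrr = mean_reciprocal_rank(score_rows, targets)
print(round(mrr, 4))                               # 0.6667
print(top_k_accuracy(score_rows, targets, k=2))    # 1.0
```

MRR rewards placing the correct item near the top of the list, while top-k accuracy only asks whether it appears within the first k results, so the two can move independently.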

Official Dataset Alignment

The model card mirrors the official-source alignment artifacts: metrics/xperience10m_dataset_card_alignment.json, metrics/source_alignment_audit.json, and XPERIENCE10M_DATASET_CARD_ALIGNMENT.md. These files record what the official gated ropedia-ai/xperience-10m dataset card states about scope, manually reviewed access, full-scale modality coverage, episode layout, intended uses, and limitations, together with the current project coverage.

They also record the public sample card (cc-by-nc-4.0, HOMIE Toolkit, Rerun 0.29.0 .rrd visualization) and the observed HF API listing snapshot: 803 session folders and 12,103 episode folders with annotation.hdf5, plus the live HF file-size display of 31.9 TB. That live display is tracked separately from the official card's statement of about 1 PB of full-scale storage. All of these are upstream metadata facts and do not indicate that the data is held locally. The official card also notes that the open dataset is limited in diversity and in showcase/production quality.

Included

  • artifacts/**/model.npz: minimal baseline weights, scalers, and labels
  • artifacts/episode_task_suite/neural_mlp/**/model.pt: neural MLP task-head checkpoints
  • artifacts/episode_task_suite/neural_mlp/**/history.json: neural training traces
  • artifacts/**/metrics.json: committed metrics
  • artifacts/**/feature_manifest.json: feature block boundaries where relevant
  • assets/: mirrored figures, modality thumbnails, and brand assets
  • metrics/: project status, protocol, source-alignment, release, and public-surface JSON files
  • scripts/: reproduction, visualization, and validation scripts
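The glob patterns in the list above can be used directly to inventory a local clone. The sketch below relies only on `pathlib.Path.glob`, where `**` matches any number of nested directories. The `inventory` helper and the demo directory names (`example_task`, `example_minimal`) are hypothetical; the demo builds a throwaway tree so the code runs without the repository present.

```python
import tempfile
from pathlib import Path

# Patterns copied from the list above; labels are informal.
PATTERNS = {
    "minimal weights": "artifacts/**/model.npz",
    "neural checkpoints": "artifacts/episode_task_suite/neural_mlp/**/model.pt",
    "training traces": "artifacts/episode_task_suite/neural_mlp/**/history.json",
    "metrics": "artifacts/**/metrics.json",
    "feature manifests": "artifacts/**/feature_manifest.json",
}


def inventory(repo_root):
    """Map each artifact label to the sorted relative paths matching it."""
    root = Path(repo_root)
    return {
        label: sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))
        for label, pattern in PATTERNS.items()
    }


# Demo on a temporary tree with hypothetical directory names.
with tempfile.TemporaryDirectory() as tmp:
    demo_files = [
        "artifacts/episode_task_suite/neural_mlp/example_task/model.pt",
        "artifacts/episode_task_suite/neural_mlp/example_task/history.json",
        "artifacts/episode_task_suite/neural_mlp/example_task/metrics.json",
        "artifacts/example_minimal/model.npz",
        "artifacts/example_minimal/metrics.json",
    ]
    for rel in demo_files:
        path = Path(tmp) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    found = inventory(tmp)

for label, paths in found.items():
    print(f"{label}: {len(paths)}")
```

Pointing `inventory` at the root of a real clone instead of the temporary directory gives a quick count of which task heads ship with weights, traces, and metrics.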

Links

Dataset use remains governed by the official Ropedia/Xperience-10M terms.