---
license: other
library_name: numpy
tags:
- robotics
- embodied-ai
- multimodal
- ropedia
- xperience-10m
- baseline
- linear-model
- retrieval
metrics:
- accuracy
- f1
- mean-reciprocal-rank
- mean-squared-error
model-index:
- name: Ropedia Minimal Task Baselines
  results:
  - task:
      type: robotics
      name: Cross-modal retrieval
    dataset:
      type: ropedia-ai/xperience-10m-sample
      name: Xperience-10M public sample episode
    metrics:
    - type: top_5_accuracy
      value: 0.3764
      name: top-5 retrieval accuracy
    - type: mrr
      value: 0.2634
      name: mean reciprocal rank
  - task:
      type: robotics
      name: Transition detection
    dataset:
      type: ropedia-ai/xperience-10m-sample
      name: Xperience-10M public sample episode
    metrics:
    - type: f1
      value: 0.6552
      name: macro-F1
---

# Ropedia Minimal Task Baselines

This repo stores the minimal baseline weights and metrics for the 12-task Ropedia episode suite.

These are intentionally small, transparent baselines:

- z-score + linear softmax classifiers,
- dual ridge regression/projection heads,
- sigmoid multi-label logistic regression,
- cosine ranking for retrieval tasks.

They are not deep robot policies or foundation models. Their purpose is to make every input/output contract auditable before scaling to many episodes.
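The four head types listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in `scripts/*.py`: every function name and parameter here is hypothetical, the ridge fit uses the plain primal closed form without an intercept (what "dual" refers to in this repo is not specified here), and the metric helper assumes each query has exactly one correct candidate.

```python
import numpy as np


def zscore(X, mean, std, eps=1e-8):
    """Standardize features with precomputed statistics (eps avoids division by zero)."""
    return (X - mean) / (std + eps)


def softmax_predict(X, W, b):
    """Linear softmax head: return class probabilities, one row per sample."""
    logits = X @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def sigmoid_multilabel(X, W, b, threshold=0.5):
    """Multi-label logistic head: an independent sigmoid per label, then a threshold."""
    probs = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return probs >= threshold


def ridge_fit(X, Y, alpha=1.0):
    """Primal ridge solution W = (X^T X + alpha I)^{-1} X^T Y (assumption: no intercept)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)


def cosine_rank(queries, candidates, eps=1e-12):
    """Return candidate indices sorted by descending cosine similarity, one row per query."""
    q = queries / (np.linalg.norm(queries, axis=1, keepdims=True) + eps)
    c = candidates / (np.linalg.norm(candidates, axis=1, keepdims=True) + eps)
    return np.argsort(-(q @ c.T), axis=1)


def topk_and_mrr(ranking, true_idx, k=5):
    """Top-k accuracy and mean reciprocal rank, given one correct candidate per query."""
    # 0-based position of the correct candidate within each ranked row.
    pos = np.argmax(ranking == np.asarray(true_idx)[:, None], axis=1)
    return float(np.mean(pos < k)), float(np.mean(1.0 / (pos + 1)))
```

For a retrieval-style task, one plausible flow under these assumptions is to project query features with a ridge-fitted matrix, rank candidates with `cosine_rank`, and score the ranking with `topk_and_mrr`.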
## Included

- `artifacts/**/model.npz`: minimal baseline weights, scalers, and labels
- `artifacts/**/metrics.json`: committed metrics
- `artifacts/**/feature_manifest.json`: feature block boundaries where relevant
- `scripts/*.py`: training and visualization scripts
- `notes/*.md`: interpretation and reproducibility notes

The companion artifact dataset repo stores CSV/JSON predictions and dashboard assets:

https://huggingface.co/datasets/cy0307/ropedia-episode-task-suite-artifacts

The public visual dashboard is here:

https://huggingface.co/spaces/cy0307/ropedia-episode-task-suite

## Minimal Architecture

![Minimal 12-task architecture](assets/task_architectures.svg)

## Metrics Snapshot

| Task | Minimal head | Main metric |
| --- | --- | ---: |
| `timeline_action` | linear softmax | 0.0500 macro-F1 |
| `timeline_subtask` | linear softmax | 0.0495 macro-F1 |
| `transition_detection` | linear softmax | 0.6552 macro-F1 |
| `next_action` | linear softmax | 0.0593 macro-F1 |
| `hand_trajectory_forecast` | ridge regression | 0.8223 MPJPE |
| `contact_prediction` | linear softmax | 1.0000 macro-F1 |
| `object_relevance` | multi-label logistic | 0.1839 micro-F1 |
| `caption_grounding` | ridge + cosine rank | 0.0172 MRR |
| `cross_modal_retrieval` | ridge + cosine rank | 0.3764 top-5 |
| `modality_reconstruction` | ridge regression | -0.0160 R2 |
| `temporal_order` | binary softmax | 0.5487 F1 |
| `misalignment_detection` | binary softmax | 0.4866 F1 |

## Data Notice

This repo does not redistribute raw Ropedia videos or raw `annotation.hdf5`. Download the original sample from Ropedia / Hugging Face and follow the dataset terms:

- https://huggingface.co/datasets/ropedia-ai/xperience-10m-sample
- https://ropedia.com/dataset

## Source

GitHub: https://github.com/ChaoYue0307/ropedia-episode-task-suite
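
## Inspecting Artifacts

The `model.npz` and `metrics.json` files listed under Included can be examined with NumPy and the standard library alone. The sketch below is an assumption-laden helper rather than part of this repo: the function names are hypothetical, and because the array names inside each archive are not documented here, it lists whatever is stored instead of assuming specific keys.

```python
import json
from pathlib import Path

import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in a .npz archive."""
    # allow_pickle=False refuses pickled object arrays, a safe default for downloaded files.
    # If an archive stores labels as object arrays, reading them raises ValueError;
    # only switch to allow_pickle=True for files you trust.
    with np.load(path, allow_pickle=False) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }


def load_metrics(path):
    """Read a metrics.json file into a plain Python object."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

A call such as `summarize_npz("artifacts/<task>/model.npz")` uses a placeholder path; substitute the actual directory from a local checkout, since the subdirectory names under `artifacts/` are not spelled out in this card.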