# Multi-Episode Access Status

Current status: access to the gated full `ropedia-ai/xperience-10m` dataset has
been granted, and a metadata-only Hugging Face audit has been completed. A
selected 128-episode pilot has produced a verified diagnostic Qwen3-Omni LoRA
package with held-out evaluation. The result is useful as a pipeline and
error-analysis baseline, not as a strong final model.

This file records the public data-access status and pilot requirements. It does
not include local-machine aliases, private paths, SSH hosts, or token locations.
## Selection Plan

| Item | Value |
| --- | ---: |
| Dataset | `ropedia-ai/xperience-10m` |
| Minimum pilot gate | 32 complete leaf episodes |
| Strategy | stratified round-robin across top-level session UUIDs |
| Metadata-audited visible complete episodes | 12,102 |
| Metadata-audited complete sessions | 802 |
| Current selected pilot | 128 source-balanced episodes |
| Recommended split | 96 train / 16 val / 16 test |
| Recommended estimated download | 277.71 GiB excluding `visualization.rrd` |
| Representative 32-episode estimate | ~70.5 GiB at median episode size |
| Smallest one-per-session 32-episode estimate | 35.35 GiB |
| Excluded file | `visualization.rrd` |
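The "stratified round-robin across top-level session UUIDs" strategy can be illustrated with a minimal sketch. The function names, the `(session_uuid, episode_id)` input shape, and the slice-based split are assumptions made for illustration; the actual selection script is not shown in this file.

```python
from collections import defaultdict
from itertools import zip_longest


def round_robin_select(episodes, n_select):
    """Pick n_select episodes by cycling over sessions, one episode per turn.

    episodes: iterable of (session_uuid, episode_id) pairs (assumed shape).
    """
    if n_select <= 0:
        return []
    by_session = defaultdict(list)
    for session, episode in episodes:
        by_session[session].append(episode)
    # Sort sessions and episodes so the selection is deterministic.
    queues = [
        [(session, episode) for episode in sorted(eps)]
        for session, eps in sorted(by_session.items())
    ]
    selected = []
    # Each pass of zip_longest takes at most one episode from every session.
    for round_items in zip_longest(*queues):
        for item in round_items:
            if item is None:  # this session has no episodes left
                continue
            selected.append(item)
            if len(selected) == n_select:
                return selected
    return selected


def split_episodes(selected, n_train, n_val, n_test):
    """Slice a selection into train/val/test (hypothetical helper).

    Note: plain slicing does not guarantee session-disjoint splits.
    """
    if n_train + n_val + n_test != len(selected):
        raise ValueError("split sizes must sum to the number of selected episodes")
    train = selected[:n_train]
    val = selected[n_train:n_train + n_val]
    test = selected[n_train + n_val:]
    return train, val, test
```

Under these assumptions, calling `round_robin_select(all_episodes, 128)` followed by `split_episodes(picked, 96, 16, 16)` would reproduce the shape of the plan in the table. Cycling over sessions before taking a second episode from any of them is what keeps the pilot source-balanced.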
## Current Stage

The current Qwen3-Omni artifacts include a verified validation-monitored
diagnostic held-out run: 96/16/16 selected train/val/test episodes, 3,808
exported windows, 2,848 train examples, 512 validation examples, and 448
held-out test predictions from 14 exported test episodes. Training used eight
distributed accelerator processes for one epoch with LoRA rank 16 and recorded a
final train loss of 0.4130 plus a validation loss of 0.0331. The result verifies
the multi-episode pipeline and gives a real error-analysis baseline; it is still
not a strong final model.
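The counts reported above can be cross-checked with a few lines of arithmetic. This is only a consistency sketch over the numbers stated in this file; the variable and function names are hypothetical.

```python
# Window counts as reported in this file.
WINDOW_COUNTS = {"train": 2848, "val": 512, "test": 448}
TOTAL_EXPORTED_WINDOWS = 3808

# Test episodes: selected in the plan vs. actually exported in the run.
SELECTED_TEST_EPISODES = 16
EXPORTED_TEST_EPISODES = 14


def split_fractions(counts):
    """Return each split's share of the total window count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# The three splits should account for every exported window.
assert sum(WINDOW_COUNTS.values()) == TOTAL_EXPORTED_WINDOWS

# Two selected test episodes did not contribute exported windows.
missing_test_episodes = SELECTED_TEST_EPISODES - EXPORTED_TEST_EPISODES

for name, share in split_fractions(WINDOW_COUNTS).items():
    print(f"{name}: {share:.1%} of exported windows")
print(f"test episodes without exported windows: {missing_test_episodes}")
```

The check confirms that the train, validation, and test counts sum to the 3,808 exported windows, and it makes the gap between 16 selected and 14 exported test episodes explicit.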
A stronger model-quality pilot should be claimed only after:

- selected valid episodes are available locally,
- the manifest builder confirms complete held-out episode splits,
- training finishes with recorded metadata and progress logs,
- evaluation runs on held-out test episodes,
- predictions, metrics, confusion matrices, and a run report are committed,
- JSON validity and action/subtask metrics improve beyond the current
  diagnostic baseline.
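Two of the conditions above (committed artifacts and metric improvement) lend themselves to an automated gate. The sketch below is one possible shape for such a check. The artifact file names and metric keys are hypothetical, since this file does not specify the run-directory layout; only the baseline values are taken from the diagnostic metrics reported here.

```python
from pathlib import Path

# Hypothetical artifact names; the real run layout is not documented here.
REQUIRED_ARTIFACTS = (
    "predictions.jsonl",
    "metrics.json",
    "confusion_matrices.json",
    "run_report.md",
)

# Baseline values from the current diagnostic run (keys are hypothetical).
DIAGNOSTIC_BASELINE = {
    "json_validity": 0.8750,
    "action_macro_f1": 0.0027,
    "subtask_accuracy": 0.0067,
}


def missing_artifacts(run_dir, required=REQUIRED_ARTIFACTS):
    """Return the required artifact names that are absent from run_dir."""
    root = Path(run_dir)
    return [name for name in required if not (root / name).is_file()]


def metrics_improved(current, baseline=DIAGNOSTIC_BASELINE):
    """True only if every baseline metric is strictly exceeded."""
    return all(current.get(key, float("-inf")) > value
               for key, value in baseline.items())


def ready_to_claim(run_dir, current_metrics):
    """Combine the artifact check and the metric check into one gate."""
    return not missing_artifacts(run_dir) and metrics_improved(current_metrics)
```

A gate like this covers only the last two bullets. The other conditions (local availability, manifest confirmation, finished training, held-out evaluation) would still need their own checks.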
Current diagnostic metrics:

- JSON validity: 87.50%
- action macro-F1: 0.0027
- subtask accuracy: 0.0067
- transition accuracy: 0.8504
- next-action accuracy: 0.0246
- contact accuracy: 0.6451
- object micro-F1: 0.2230
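For readers unfamiliar with these metrics, here is a minimal pure-Python sketch of how JSON validity and macro-F1 are commonly computed. It is not the project's evaluation code: it assumes that any output parseable by `json.loads` counts as valid and that macro-F1 averages over every label seen in either the references or the predictions. The actual evaluator may define both differently.

```python
import json


def json_validity(outputs):
    """Fraction of model outputs that parse as JSON (assumed definition)."""
    if not outputs:
        return 0.0
    valid = 0
    for text in outputs:
        try:
            json.loads(text)
            valid += 1
        except (json.JSONDecodeError, TypeError):
            pass  # unparseable or non-string output counts as invalid
    return valid / len(outputs)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 over labels in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because macro-F1 weights every label equally, a model that rarely predicts the correct action class scores close to zero even when a binary field such as transition is mostly right. This is consistent with the gap between the action macro-F1 and the transition accuracy listed above. If validity were computed over the 448 test predictions, 87.50% would correspond to 392 parseable outputs.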
The public data access summary is:

`results/omni_finetune/DATA_ACCESS_STATUS.md`

The current metadata-only full dataset audit is:

`results/omni_finetune/FULL_DATASET_METADATA_AUDIT.md`

The current 128-episode source-balanced download plan is:

`results/omni_finetune/XPERIENCE10M_128_EPISODE_SELECTION.md`

The current verified diagnostic package is:

- `docs/data/omni_finetune_verified_result.json`
- `results/omni_finetune/verified_public/`

The older machine-generated source discovery report remains a pre-access local
planning record:

`results/omni_finetune/DATA_BLOCKER_REPORT.md`