<svg xmlns="http://www.w3.org/2000/svg" width="1100" height="556" viewBox="0 0 1100 556">
<rect width="100%" height="100%" fill="#020502"/>
<rect x="18" y="18" width="1064" height="520" rx="18" fill="#050905" stroke="#ccffa0" stroke-opacity="0.25"/>
<text x="32" y="42" font-family="Inter Tight, Arial, sans-serif" font-size="26" font-weight="800" fill="#f4f8ef">Ropedia Xperience-10M Suite: Main Scores</text>
<text x="310" y="532" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#a5afa2">score</text>
<line x1="310.0" y1="60" x2="310.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="310.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">0.00</text>
<line x1="454.0" y1="60" x2="454.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="454.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">0.20</text>
<line x1="598.0" y1="60" x2="598.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="598.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">0.40</text>
<line x1="742.0" y1="60" x2="742.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="742.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">0.60</text>
<line x1="886.0" y1="60" x2="886.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="886.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">0.80</text>
<line x1="1030.0" y1="60" x2="1030.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="1030.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">1.00</text>
<text x="296" y="99" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Action Recognition</text>
<rect x="310" y="83" width="36.0" height="20" rx="4" fill="#ccffa0"/>
<text x="354.0" y="99" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0500</text>
<text x="296" y="133" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Procedure Step Recognition</text>
<rect x="310" y="117" width="36.4" height="20" rx="4" fill="#ffffff"/>
<text x="354.4" y="133" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0506</text>
<text x="296" y="167" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Action Boundary Detection</text>
<rect x="310" y="151" width="440.5" height="20" rx="4" fill="#7ae5c3"/>
<text x="758.5" y="167" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.6118</text>
<text x="296" y="201" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Next-Action Prediction</text>
<rect x="310" y="185" width="42.7" height="20" rx="4" fill="#d8f4a5"/>
<text x="360.7" y="201" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0593</text>
<text x="296" y="235" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Hand Trajectory Forecasting</text>
<rect x="310" y="219" width="0.0" height="20" rx="4" fill="#9bdfff"/>
<text x="318.0" y="235" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0000</text>
<text x="296" y="269" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Contact State Prediction</text>
<rect x="310" y="253" width="720.0" height="20" rx="4" fill="#ff8f7a"/>
<text x="1038.0" y="269" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">1.0000</text>
<text x="296" y="303" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Object Relevance Prediction</text>
<rect x="310" y="287" width="45.6" height="20" rx="4" fill="#ccffa0"/>
<text x="363.6" y="303" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0633</text>
<text x="296" y="337" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Language Grounding</text>
<rect x="310" y="321" width="8.3" height="20" rx="4" fill="#ffffff"/>
<text x="326.3" y="337" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0115</text>
<text x="296" y="371" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Cross-Modal Retrieval</text>
<rect x="310" y="355" width="264.8" height="20" rx="4" fill="#7ae5c3"/>
<text x="582.8" y="371" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.3678</text>
<text x="296" y="405" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Cross-Modal Reconstruction</text>
<rect x="310" y="389" width="0.0" height="20" rx="4" fill="#d8f4a5"/>
<text x="318.0" y="405" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0000</text>
<text x="296" y="439" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Temporal Order Verification</text>
<rect x="310" y="423" width="388.8" height="20" rx="4" fill="#9bdfff"/>
<text x="706.8" y="439" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.5400</text>
<text x="296" y="473" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Multimodal Synchronization Detection</text>
<rect x="310" y="457" width="363.7" height="20" rx="4" fill="#ff8f7a"/>
<text x="681.7" y="473" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.5052</text>
</svg>