<svg xmlns="http://www.w3.org/2000/svg" width="1100" height="556" viewBox="0 0 1100 556">
<rect width="100%" height="100%" fill="#020502"/>
<rect x="18" y="18" width="1064" height="520" rx="18" fill="#050905" stroke="#ccffa0" stroke-opacity="0.25"/>
<text x="32" y="42" font-family="Inter Tight, Arial, sans-serif" font-size="26" font-weight="800" fill="#f4f8ef">Ropedia Xperience-10M Suite: Main Scores</text>
<text x="296" y="526" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#a5afa2">score</text>
<line x1="310.0" y1="60" x2="310.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="310.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">0.00</text>
<line x1="454.0" y1="60" x2="454.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="454.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">0.20</text>
<line x1="598.0" y1="60" x2="598.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="598.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">0.40</text>
<line x1="742.0" y1="60" x2="742.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="742.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">0.60</text>
<line x1="886.0" y1="60" x2="886.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="886.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">0.80</text>
<line x1="1030.0" y1="60" x2="1030.0" y2="506" stroke="#ccffa0" stroke-opacity="0.13" stroke-width="1"/>
<text x="1030.0" y="526" text-anchor="middle" font-family="Space Grotesk, Arial, sans-serif" font-size="12" fill="#a5afa2">1.00</text>
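<!--
  Scale note, inferred from the tick positions above rather than from any
  generator script: score 0.00 sits at x=310 and score 1.00 at x=1030, so
  the axis spans 720 px and, presumably,
      bar width     = 720 * score
      value label x = 310 + bar width + 8
  Worked example: a score of 0.6118 gives 720 * 0.6118 = 440.5 px, and a
  label at x = 310 + 440.5 + 8 = 758.5.
  Rows appear to be spaced 34 px apart, starting at y=83 for the first bar,
  with each text baseline 16 px below the top of its bar.
-->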
<text x="296" y="99" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Action Recognition</text>
<rect x="310" y="83" width="36.0" height="20" rx="4" fill="#ccffa0"/>
<text x="354.0" y="99" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0500</text>
<text x="296" y="133" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Procedure Step Recognition</text>
<rect x="310" y="117" width="36.4" height="20" rx="4" fill="#ffffff"/>
<text x="354.4" y="133" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0506</text>
<text x="296" y="167" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Action Boundary Detection</text>
<rect x="310" y="151" width="440.5" height="20" rx="4" fill="#7ae5c3"/>
<text x="758.5" y="167" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.6118</text>
<text x="296" y="201" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Next-Action Prediction</text>
<rect x="310" y="185" width="42.7" height="20" rx="4" fill="#d8f4a5"/>
<text x="360.7" y="201" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0593</text>
<text x="296" y="235" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Hand Trajectory Forecasting</text>
<rect x="310" y="219" width="0.0" height="20" rx="4" fill="#9bdfff"/>
<text x="318.0" y="235" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0000</text>
<text x="296" y="269" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Contact State Prediction</text>
<rect x="310" y="253" width="720.0" height="20" rx="4" fill="#ff8f7a"/>
<text x="1038.0" y="269" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">1.0000</text>
<text x="296" y="303" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Object Relevance Prediction</text>
<rect x="310" y="287" width="45.6" height="20" rx="4" fill="#ccffa0"/>
<text x="363.6" y="303" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0633</text>
<text x="296" y="337" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Language Grounding</text>
<rect x="310" y="321" width="8.3" height="20" rx="4" fill="#ffffff"/>
<text x="326.3" y="337" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0115</text>
<text x="296" y="371" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Cross-Modal Retrieval</text>
<rect x="310" y="355" width="264.8" height="20" rx="4" fill="#7ae5c3"/>
<text x="582.8" y="371" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.3678</text>
<text x="296" y="405" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Cross-Modal Reconstruction</text>
<rect x="310" y="389" width="0.0" height="20" rx="4" fill="#d8f4a5"/>
<text x="318.0" y="405" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.0000</text>
<text x="296" y="439" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Temporal Order Verification</text>
<rect x="310" y="423" width="388.8" height="20" rx="4" fill="#9bdfff"/>
<text x="706.8" y="439" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.5400</text>
<text x="296" y="473" text-anchor="end" font-family="Space Grotesk, Arial, sans-serif" font-size="14" fill="#dce8d7">Multimodal Synchronization Detection</text>
<rect x="310" y="457" width="363.7" height="20" rx="4" fill="#ff8f7a"/>
<text x="681.7" y="473" font-family="Space Grotesk, Arial, sans-serif" font-size="13" fill="#f4f8ef">0.5052</text>
</svg>