# Xperience-10M Annotation Record Probe Minimal-cost probe. Downloaded only `annotation.hdf5`; no MP4 or `visualization.rrd` files were downloaded. - Repo: `ropedia-ai/xperience-10m` - Probe count: 3 - Raw annotation cache: outside the published repo - Local files only: `False` ## 9cecac72-8874-4b97-9541-18d4858f8e43/ep10/annotation.hdf5 - Downloaded annotation size: 6.38 MiB (6,687,192 bytes) - HDF5 top-level keys: `calibration, caption, depth, full_body_mocap, hand_mocap, imu, metadata, slam, video` - HDF5 dataset count: 65 - Largest first-dimension dataset: `imu/accel_xyz` with first dimension `190` ### Caption JSON Summary | Measure | Value | | --- | --- | | Parse status | ok | | JSON bytes | 1,178 | | Segment count | 1 | | Current-action count | 1 | | Object-frame count | 1 | | Interaction-frame count | 1 | | Sampled-frame count | 1 | | Unique subtasks | 1 | | Unique action labels | 1 | | Unique objects | 3 | | Action labels | ["Arrange items in bin"] | | Objects | ["cardboard box", "hand", "plastic storage bin"] | ### Top Groups | Group | Dataset count | Max first dimension | First-dim histogram top values | | --- | --- | --- | --- | | calibration | 23 | 4 | {"4": 14} | | caption | 1 | 0 | {} | | depth | 5 | 20 | {"20": 2} | | full_body_mocap | 9 | 20 | {"20": 9} | | hand_mocap | 10 | 20 | {"20": 10} | | imu | 4 | 190 | {"190": 3, "20": 1} | | metadata | 6 | 0 | {} | | slam | 4 | 47 | {"20": 3, "47": 1} | | video | 3 | 20 | {"20": 2} | ### Caption / Action / Interaction Related Datasets | Dataset | Shape | Dtype | First dim | Sample values | | --- | --- | --- | --- | --- | | caption | [] | object | None | ["{\"config\": {\"segment_sec\": 20, \"sample_fps\": 0.5, \"total_tokens\": 2047, \"Main Task\": \"Packing items into a plastic bin. The person is placing va... | ## cdc1ae12-a460-48ac-a892-7d314095c4b1/ep23/annotation.hdf5 - Downloaded annotation size: 6.38 MiB (6,687,256 bytes) - HDF5 top-level keys: `calibration, caption, depth, full_body_mocap, hand_mocap, imu, metadata, slam, video` - HDF5 dataset count: 65 - Largest first-dimension dataset: `imu/accel_xyz` with first dimension `188` ### Caption JSON Summary | Measure | Value | | --- | --- | | Parse status | ok | | JSON bytes | 1,051 | | Segment count | 1 | | Current-action count | 1 | | Object-frame count | 1 | | Interaction-frame count | 1 | | Sampled-frame count | 1 | | Unique subtasks | 1 | | Unique action labels | 1 | | Unique objects | 4 | | Action labels | ["Pulling up sock"] | | Objects | ["bathroom floor", "feet", "sock", "toilet"] | ### Top Groups | Group | Dataset count | Max first dimension | First-dim histogram top values | | --- | --- | --- | --- | | calibration | 23 | 4 | {"4": 14} | | caption | 1 | 0 | {} | | depth | 5 | 20 | {"20": 2} | | full_body_mocap | 9 | 20 | {"20": 9} | | hand_mocap | 10 | 20 | {"20": 10} | | imu | 4 | 188 | {"188": 3, "20": 1} | | metadata | 6 | 0 | {} | | slam | 4 | 128 | {"20": 3, "128": 1} | | video | 3 | 20 | {"20": 2} | ### Caption / Action / Interaction Related Datasets | Dataset | Shape | Dtype | First dim | Sample values | | --- | --- | --- | --- | --- | | caption | [] | object | None | ["{\"config\": {\"segment_sec\": 20, \"sample_fps\": 0.5, \"total_tokens\": 2035, \"Main Task\": \"Putting on socks. The person is standing in a bathroom and... | ## 10282b64-a955-461e-9ef9-a1ddf8dc619a/ep5/annotation.hdf5 - Downloaded annotation size: 6.40 MiB (6,706,448 bytes) - HDF5 top-level keys: `calibration, caption, depth, full_body_mocap, hand_mocap, imu, metadata, slam, video` - HDF5 dataset count: 65 - Largest first-dimension dataset: `slam/point_cloud` with first dimension `837` ### Caption JSON Summary | Measure | Value | | --- | --- | | Parse status | ok | | JSON bytes | 1,299 | | Segment count | 1 | | Current-action count | 1 | | Object-frame count | 1 | | Interaction-frame count | 1 | | Sampled-frame count | 1 | | Unique subtasks | 1 | | Unique action labels | 1 | | Unique objects | 4 | | Action labels | ["Walk down retail aisle"] | | Objects | ["person seated", "product packaging", "retail shelf", "shopping bags"] | ### Top Groups | Group | Dataset count | Max first dimension | First-dim histogram top values | | --- | --- | --- | --- | | calibration | 23 | 4 | {"4": 14} | | caption | 1 | 0 | {} | | depth | 5 | 20 | {"20": 2} | | full_body_mocap | 9 | 20 | {"20": 9} | | hand_mocap | 10 | 20 | {"20": 10} | | imu | 4 | 190 | {"190": 3, "20": 1} | | metadata | 6 | 0 | {} | | slam | 4 | 837 | {"20": 3, "837": 1} | | video | 3 | 20 | {"20": 2} | ### Caption / Action / Interaction Related Datasets | Dataset | Shape | Dtype | First dim | Sample values | | --- | --- | --- | --- | --- | | caption | [] | object | None | ["{\"config\": {\"segment_sec\": 20, \"sample_fps\": 0.5, \"total_tokens\": 2060, \"Main Task\": \"walking through a retail store. The video shows a first-pe... |