# Reproducibility Audit

Audit date: 2026-05-30 Asia/Singapore.

Purpose: verify that the committed Ropedia Xperience-10M Task Suite artifacts are
real outputs from the scripts, not placeholder or fabricated metrics.

## Raw Inputs Checked

The audit used the local public sample episode:

```text
data/sample/xperience-10m-sample/
  annotation.hdf5
  fisheye_cam0.mp4
  fisheye_cam1.mp4
  fisheye_cam2.mp4
  fisheye_cam3.mp4
  stereo_left.mp4
  stereo_right.mp4
```

`annotation.hdf5` contains 5,821 aligned frames with depth, hand mocap, body
mocap, IMU, SLAM, calibration, and caption metadata. The video feature cache was
rebuilt from all six video files during the audit.
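A pre-flight check along the following lines could confirm that every listed input is present and non-empty before a re-run. This is a minimal sketch, not part of the audited scripts: the helper name `missing_inputs` is hypothetical, and the file names are simply copied from the listing above.

```python
from pathlib import Path

# File names copied from the sample episode listing above.
EXPECTED_FILES = [
    "annotation.hdf5",
    "fisheye_cam0.mp4",
    "fisheye_cam1.mp4",
    "fisheye_cam2.mp4",
    "fisheye_cam3.mp4",
    "stereo_left.mp4",
    "stereo_right.mp4",
]


def missing_inputs(sample_dir):
    """Return the expected file names that are absent or empty in sample_dir.

    Hypothetical helper for illustration; the audited scripts may perform
    their own checks in a different way.
    """
    root = Path(sample_dir)
    missing = []
    for name in EXPECTED_FILES:
        path = root / name
        # Treat a zero-byte file the same as a missing one.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Calling `missing_inputs("data/sample/xperience-10m-sample")` from the repository root returns an empty list when all seven files are in place.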

## Commands Re-run

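All audit outputs were written outside the repo. The script paths below are
relative, so the commands are assumed to run from the repository root
(`$WORKSPACE`):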

```bash
AUDIT=/private/tmp/ropedia-audit
WORKSPACE=/path/to/Ropedia
ANN=$WORKSPACE/data/sample/xperience-10m-sample/annotation.hdf5
PY=$WORKSPACE/.venv/bin/python

$PY -B scripts/train_min_action_model.py \
  --workspace $WORKSPACE \
  --annotation $ANN \
  --output-dir $AUDIT/min_action_model \
  --target action

$PY -B scripts/train_min_action_model.py \
  --workspace $WORKSPACE \
  --annotation $ANN \
  --output-dir $AUDIT/min_subtask_model \
  --target subtask

$PY -B scripts/train_all_modalities_model.py \
  --workspace $WORKSPACE \
  --annotation $ANN \
  --output-dir $AUDIT/min_all_modalities_action_model \
  --cache-dir $AUDIT/cache \
  --target action

$PY -B scripts/train_all_modalities_model.py \
  --workspace $WORKSPACE \
  --annotation $ANN \
  --output-dir $AUDIT/min_all_modalities_subtask_model \
  --cache-dir $AUDIT/cache \
  --target subtask

$PY -B scripts/episode_task_suite.py \
  --workspace $WORKSPACE \
  --annotation $ANN \
  --output-dir $AUDIT/episode_task_suite \
  --cache-dir $AUDIT/cache
```

## Exact Match Checks

The regenerated files matched the committed files:

```text
min_action_model/metrics.json: MATCH
min_subtask_model/metrics.json: MATCH
min_all_modalities_action_model/metrics.json: MATCH
min_all_modalities_subtask_model/metrics.json: MATCH
episode_task_suite/summary_report.json: MATCH
episode_task_suite/feature_manifest.json: MATCH
episode_task_suite/available_modalities.json: MATCH
```

Every per-task `metrics.json` also matched:

```text
caption_grounding/metrics.json: MATCH
contact_prediction/metrics.json: MATCH
cross_modal_retrieval/metrics.json: MATCH
hand_trajectory_forecast/metrics.json: MATCH
misalignment_detection/metrics.json: MATCH
modality_reconstruction/metrics.json: MATCH
next_action/metrics.json: MATCH
object_relevance/metrics.json: MATCH
temporal_order/metrics.json: MATCH
timeline_action/metrics.json: MATCH
timeline_subtask/metrics.json: MATCH
transition_detection/metrics.json: MATCH
```
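The audit does not record how the `MATCH` verdicts were produced. One way to reproduce such a check is sketched below, assuming a byte-level SHA-256 comparison with a fallback to parsed-JSON equality. The function names, the `JSON-EQUAL` and `DIFF` labels, and the idea of passing a committed-artifacts root directory are all illustrative assumptions.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_json(regenerated, committed):
    """Classify two JSON files as MATCH, JSON-EQUAL, or DIFF.

    MATCH      -> byte-identical files
    JSON-EQUAL -> different bytes, same parsed content (e.g. key order)
    DIFF       -> parsed content differs
    """
    a, b = Path(regenerated), Path(committed)
    if sha256_of(a) == sha256_of(b):
        return "MATCH"
    with a.open() as fa, b.open() as fb:
        if json.load(fa) == json.load(fb):
            return "JSON-EQUAL"
    return "DIFF"


def compare_trees(regenerated_root, committed_root, relative_paths):
    """Compare the same relative paths under two root directories."""
    return {
        rel: compare_json(Path(regenerated_root) / rel, Path(committed_root) / rel)
        for rel in relative_paths
    }
```

With the audit directory as `regenerated_root` and the directory holding the committed artifacts as `committed_root`, passing relative paths such as `min_action_model/metrics.json` yields one verdict per file. Separating `MATCH` from `JSON-EQUAL` keeps a strict byte-level claim distinct from a looser content-level one.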

## Fresh Cache Evidence

The all-modality audit built a fresh feature cache under `$AUDIT/cache`:

```text
depth_n5821_grid8.npz: shape=(5821, 140), nonzero=809107
video_fisheye_cam0_n5821_img32_grid8_hist8.npz: shape=(5821, 98), nonzero=570458
video_fisheye_cam1_n5821_img32_grid8_hist8.npz: shape=(5821, 98), nonzero=570400
video_fisheye_cam2_n5821_img32_grid8_hist8.npz: shape=(5821, 98), nonzero=570458
video_fisheye_cam3_n5821_img32_grid8_hist8.npz: shape=(5821, 98), nonzero=568723
video_stereo_left_n5821_img32_grid8_hist8.npz: shape=(5821, 98), nonzero=570249
video_stereo_right_n5821_img32_grid8_hist8.npz: shape=(5821, 98), nonzero=570430
```

Together with the exact-match checks above, this shows that the committed
metrics are reproducible from the raw sample and that the all-modality pipeline
reads the real depth and video files rather than empty placeholder features.
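The shape and nonzero counts above can be recomputed from the cache files with a few lines of NumPy. The sketch below is illustrative only: the array names stored inside each `.npz` file are not documented here, so it iterates over whatever arrays the archive contains, and the helper name `summarize_npz` is hypothetical.

```python
import numpy as np


def summarize_npz(path):
    """Return (array_name, shape, nonzero_count, density) per array in an .npz.

    Array names inside the cache files are unknown here, so every stored
    array is reported rather than a specific key.
    """
    rows = []
    with np.load(path) as archive:
        for name in archive.files:
            arr = archive[name]
            nonzero = int(np.count_nonzero(arr))
            # Density is the fraction of entries that are nonzero.
            density = nonzero / arr.size if arr.size else 0.0
            rows.append((name, arr.shape, nonzero, density))
    return rows
```

The density column makes the listing easier to interpret. A `(5821, 98)` array holds 5821 × 98 = 570,458 values, so the counts reported for `fisheye_cam0` and `fisheye_cam2` mean every entry is nonzero. The depth cache holds 5821 × 140 = 814,940 values, of which 809,107 (about 99.3%) are nonzero.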

## Caveats

The scripts contain a zero-feature fallback that is used when a video file is
missing. This audit did not take that path: all six videos existed and produced
nonzero features. The repo remains a single-episode learning and
pipeline-validation project; its results are not evidence of cross-episode
generalization.