# Figure Index

This file is generated by `scripts/build_figure_index.py`. It catalogs the public visual assets used by the repo, website, and Hugging Face mirrors.

Current status: **pass**

Public figures, diagrams, charts, and derived modality thumbnails. Raw Xperience-10M videos, annotations, RRD files, and Qwen weights are excluded.

## Figures

| Figure | Path | Size | Source script | Role |
| --- | --- | ---: | --- | --- |
| Project logo mark | `docs/assets/brand/xperience10m-logo-mark-512.png` | 512 x 512 | `scripts/build_brand_assets.py` | Primary X-shaped multimodal camera mark used for the website header, README, HF cards, and brand identity. |
| Project logo social card | `docs/assets/brand/xperience10m-logo-social-card.png` | 1200 x 630 | `scripts/build_brand_assets.py` | Large preview image for README, Hugging Face cards, and Open Graph/Twitter social sharing. |
| Project favicon | `docs/assets/brand/xperience10m-logo-favicon-64.png` | 64 x 64 | `scripts/build_brand_assets.py` | Small dark-tile logo for browser tabs and compact navigation. |
| Original task-suite infographic | `docs/assets/task_suite_infographic.png` | 1800 x 7600 | `scripts/render_task_suite_infographic.py` | Primary visual map of the walkthrough-backed task families, verified metrics, and sample modalities; the unified public suite is documented as 20 tasks. |
| Episode-to-task pipeline diagram | `docs/assets/pipeline_diagram.png` | 1800 x 1120 | `scripts/generate_visualizations.py` | End-to-end data processing and evaluation pipeline overview. |
| Qwen3-Omni LoRA training pipeline | `docs/assets/qwen3_omni_lora_pipeline.png` | 1536 x 1024 | `docs/assets/qwen3_omni_lora_pipeline.prompt.md` | Detailed raw-data-to-adapter flow for staged Xperience-10M Qwen3-Omni LoRA training. |
| Spatial intelligence slide diagram | `docs/assets/foundation-pipelines/spatial-intelligence-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the spatial intelligence pipeline track. |
| Human-video world model slide diagram | `docs/assets/foundation-pipelines/human-video-world-model-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the human-video world-model pipeline track. |
| Vision-language-action slide diagram | `docs/assets/foundation-pipelines/vision-language-action-pipeline.png` | 2560 x 1920 | `scripts/render_foundation_pipeline_diagrams.py` | High-resolution slide diagram for the VLA/action-policy pipeline track. |
| Minimal and neural task architecture map | `docs/assets/task_architectures.png` | 1800 x 2450 | `scripts/render_overview_figures.py` | Minimal and neural heads for the walkthrough-backed task contracts and shared feature contracts. |
| Video modality thumbnail | `docs/assets/modalities/video.jpg` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived thumbnail for synchronized camera streams. |
| Audio modality thumbnail | `docs/assets/modalities/audio.png` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived waveform thumbnail for the MP4 AAC stream. |
| Depth modality thumbnail | `docs/assets/modalities/depth.jpg` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived depth and confidence thumbnail. |
| Pose / SLAM modality thumbnail | `docs/assets/modalities/pose_slam.png` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived camera trajectory and sparse map thumbnail. |
| Motion capture modality thumbnail | `docs/assets/modalities/motion_capture.png` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived body and hand motion-capture thumbnail. |
| Inertial modality thumbnail | `docs/assets/modalities/inertial.png` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived accelerometer and gyroscope trace thumbnail. |
| Language modality thumbnail | `docs/assets/modalities/language.png` | 880 x 520 | `scripts/export_modality_atlas_assets.py` | Derived object-tag and caption thumbnail. |
| Model macro-F1 comparison chart | `docs/assets/charts/model_macro_f1.svg` | 1100 x 284 | `scripts/generate_visualizations.py` | Minimal-vs-neural classification score comparison. |
| Neural MLP task score chart | `docs/assets/charts/episode_task_scores_neural_mlp.svg` | 1100 x 556 | `scripts/generate_visualizations.py` | Neural MLP metric snapshot across the task suite. |
| Minimal-vs-neural task score chart | `docs/assets/charts/episode_task_scores_minimal_vs_neural.svg` | 1100 x 964 | `scripts/generate_visualizations.py` | Side-by-side baseline comparison over the same window contracts. |
| Research direction coverage chart | `docs/assets/charts/research_direction_coverage.svg` | 1180 x 700 | `scripts/generate_visualizations.py` | Four-track coverage map for Ropedia research directions. |
| Research direction extension chart | `docs/assets/charts/research_direction_extension_tasks.svg` | 1420 x 920 | `scripts/generate_visualizations.py` | Four coded extension probes, one per Ropedia research direction. |
| Unified 20-task provenance chart | `docs/assets/charts/tier2_task_suite.svg` | 1440 x 832 | `scripts/tier2_task_suite.py` | Historical provenance rows inside the unified 20-task suite with aligned minimal and neural baseline metrics. |
| Unified 20-task model radar | `docs/assets/charts/unified_task_model_radar.svg` | 2400 x 1900 | `scripts/build_unified_task_model_radar.py` | Grouped small-multiple 20-task radar board for all nine methods, separating single-episode, 128-episode metadata/text, 128-episode raw-feature, and foundation-model rows while preserving task keys and proxy notes. |
| Single-episode 20-task model radar | `docs/assets/charts/single_episode_task_model_radar.svg` | 2400 x 1900 | `scripts/build_unified_task_model_radar.py` | Twenty-axis split radar for the one public-sample episode, comparing Minimal and Neural MLP as two complete 20/20 scored polygons. |
| 128-episode 20-task model radar | `docs/assets/charts/episode128_task_model_radar.svg` | 2400 x 1900 | `scripts/build_unified_task_model_radar.py` | Grouped 20-task radar for selected 128-episode methods: metadata/text baselines, raw-feature simple/NN, Qwen3-Omni, Cosmos3-Super, and Cosmos3-Nano with local legends and proxy notes. |
| Feature block chart | `docs/assets/charts/feature_blocks.svg` | 1100 x 760 | `scripts/generate_visualizations.py` | Feature allocation by modality block. |
| Minimal task score chart | `docs/assets/charts/episode_task_scores.svg` | 1100 x 556 | `scripts/generate_visualizations.py` | Minimal baseline metric snapshot across the task suite. |
| Cross-modal retrieval chart | `docs/assets/charts/cross_modal_retrieval.svg` | 1100 x 284 | `scripts/generate_visualizations.py` | Retrieval behavior chart for the cross-modal task. |

## Use and Scope

- These figures are derived presentation artifacts or small thumbnails.
- The index records file hashes and dimensions for reproducibility checks.
- Raw Xperience-10M MP4/HDF5/RRD files and full model weights are not redistributed.
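
The hash-and-dimension check mentioned above could be sketched roughly as follows. This is a minimal illustration, not the logic of `scripts/build_figure_index.py`: the helper names `sha256_of`, `png_dimensions`, and `check_figure` are hypothetical, SHA-256 is an assumed hash algorithm, and only PNG files are handled (the `.jpg` and `.svg` entries would need a different size reader).

```python
import hashlib
import struct
from pathlib import Path
from typing import List, Optional, Tuple

# Fixed 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def png_dimensions(path: Path) -> Tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG."""
    with path.open("rb") as handle:
        header = handle.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} does not start with a PNG IHDR chunk")
    # Width and height are big-endian unsigned 32-bit integers at bytes 16..23.
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def check_figure(
    path: Path,
    expected_size: Tuple[int, int],
    expected_sha256: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the figure matches."""
    if not path.is_file():
        return [f"missing file: {path}"]
    problems = []
    actual_size = png_dimensions(path)
    if actual_size != expected_size:
        problems.append(f"size mismatch: expected {expected_size}, got {actual_size}")
    if expected_sha256 is not None and sha256_of(path) != expected_sha256:
        problems.append("sha256 mismatch")
    return problems


if __name__ == "__main__":
    # Path and size taken from the first table row; no hash is asserted here.
    logo = Path("docs/assets/brand/xperience10m-logo-mark-512.png")
    print(check_figure(logo, (512, 512)) or "ok")
```

Returning a list of problems rather than raising on the first mismatch lets a caller report every stale figure in one pass before deciding whether the overall status is **pass**.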