FrankenMotion: Part-level Human Motion Generation and Composition
Paper • 2601.10909
Per-body-part TMR-style retrieval encoders used to evaluate motion generation in the CVPR 2026 paper FrankenMotion: Part-level Human Motion Generation and Composition.
Nine independently trained TMR encoders (one per body part + caption + action), each pairing a motion encoder with a text encoder into a shared retrieval latent space:
action/, head/, left_arm/, left_leg/, right_arm/, right_leg/,
sequence_caption/, spine/, trajectory/
├── config.json
└── last_weights/{motion,text}_encoder.pt
stats/{mean,std}.pt # shared motion-feature normaliser stats
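The layout above can be checked after download with a small helper. This is a minimal sketch, not part of the released code: the function names `expected_files` and `missing_files` are hypothetical, and it assumes `stats/` sits at the snapshot root next to the nine encoder directories.

```python
from pathlib import Path
from typing import List

# The nine encoder directories named in the listing above.
PART_DIRS = [
    "action", "head", "left_arm", "left_leg", "right_arm",
    "right_leg", "sequence_caption", "spine", "trajectory",
]


def expected_files(root: str) -> List[Path]:
    """Enumerate the files the listing implies (hypothetical helper)."""
    base = Path(root)
    files: List[Path] = []
    for part in PART_DIRS:
        files.append(base / part / "config.json")
        for encoder in ("motion", "text"):
            files.append(base / part / "last_weights" / f"{encoder}_encoder.pt")
    # Assumption: the shared normaliser stats live at the snapshot root.
    for stat in ("mean", "std"):
        files.append(base / "stats" / f"{stat}.pt")
    return files


def missing_files(root: str) -> List[Path]:
    """Return every expected file that is absent under root."""
    return [path for path in expected_files(root) if not path.is_file()]
```

Calling `missing_files("pretrained/eval_model")` after the download should return an empty list if the snapshot matches this assumed layout.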
These encoders feed the guo and guo+threshold retrieval protocols (R@1, R@3, MM-Dist, FID, Diversity) reported in Table 1 of the paper.
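For readers unfamiliar with the R@k metrics, the sketch below shows a generic recall-at-k computation over a text-to-motion similarity matrix. It is illustrative only and is not the repository's implementation: it does not reproduce the batching or thresholding details of the guo and guo+threshold protocols, and the function name `recall_at_k` is hypothetical.

```python
from typing import Dict, Sequence


def recall_at_k(
    sim: Sequence[Sequence[float]], ks: Sequence[int] = (1, 3)
) -> Dict[int, float]:
    """Generic R@k: sim[i][j] is the similarity of text i to motion j.

    Assumes a square matrix in which motion i is the ground truth for text i.
    """
    n = len(sim)
    hits = {k: 0 for k in ks}
    for i, row in enumerate(sim):
        # Zero-based rank of the ground truth: how many motions score
        # strictly higher. Ties are therefore resolved optimistically.
        rank = sum(1 for score in row if score > row[i])
        for k in ks:
            if rank < k:
                hits[k] += 1
    return {k: hits[k] / n for k in ks}
```

With one such matrix per body-part encoder, the same function would yield a per-part R@1 and R@3 under these simplifying assumptions.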
from huggingface_hub import snapshot_download
snapshot_download(repo_id="Coral79/frankenmotion-eval-model", local_dir="pretrained/eval_model")
Then follow the main repo README: src.eval.paper_table auto-runs both retrieval protocols on CPU and reproduces the paper table.
@inproceedings{li2026frankenmotion,
title={{FrankenMotion}: Part-level Human Motion Generation and Composition},
author={Li, Chuqiao and Xie, Xianghui and Cao, Yong and Geiger, Andreas and Pons-Moll, Gerard},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2026}
}