---
license: mit
pipeline_tag: robotics
tags:
- retargeted-motion
- text-to-motion
- motion-to-text
- motion-prediction
- humanoid
- unitree-g1
- robotics
- motion-generation
---

<h1 align="center" style="font-size: 1.6em;">TEXEDO: Checkpoints</h1>

<p align="center" style="font-size: 1.6em; font-weight: bold;">Test-Time Scaling for Controller-Aware Language-Conditioned Humanoid Motion Generation</p>

This repository hosts the pretrained **checkpoints and runtime assets** for **TEXEDO**, a text-to-motion pipeline for the Unitree G1 humanoid. Given a language prompt, TEXEDO generates multiple candidate motions, decodes them into a 36-dimensional G1 robot motion format, scores them with dynamic and semantic verifiers, and selects the best candidate for deployment.
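
The generate, score, and select loop described above is a best-of-N scheme. A minimal sketch of the selection step follows. It assumes each verifier returns a scalar where higher is better and that the two scores are mixed by a weighted sum. The names `Candidate`, `select_best`, and `semantic_weight` are illustrative and not taken from the TEXEDO code, whose actual scoring rule may differ.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Assumed layout: a motion is a list of T frames, each with 36 channels.
Motion = List[List[float]]


@dataclass
class Candidate:
    motion: Motion
    dynamic_score: float   # physical-plausibility score (higher is better, assumed)
    semantic_score: float  # text-motion match score (higher is better, assumed)

    def combined(self, semantic_weight: float) -> float:
        # Hypothetical combination rule: convex mix of the two verifier scores.
        return ((1.0 - semantic_weight) * self.dynamic_score
                + semantic_weight * self.semantic_score)


def select_best(
    motions: Sequence[Motion],
    prompt: str,
    dynamic_verifier: Callable[[Motion], float],
    semantic_verifier: Callable[[Motion, str], float],
    semantic_weight: float = 0.5,
) -> Candidate:
    """Score every candidate motion and return the highest-scoring one."""
    if not motions:
        raise ValueError("need at least one candidate motion")
    scored = [
        Candidate(m, dynamic_verifier(m), semantic_verifier(m, prompt))
        for m in motions
    ]
    return max(scored, key=lambda c: c.combined(semantic_weight))
```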

- 🌐 **Project page:** https://jianuocao.github.io/TEXEDO/
- 💻 **Code:** https://github.com/JianuoCao/TEXEDO
- 📄 **Paper:** https://arxiv.org/abs/2606.22998
- 📦 **Dataset:** https://huggingface.co/datasets/JianuoCao/TEXEDO

## Contents

| Logical name | What it is | Approx. size |
|---|---|---|
| `fsq_tokenizer` | FSQ motion tokenizer (encoder/decoder + codebook) for 36-dim G1 motion | ~216 MB |
| `fsq_norm_stats` | Per-channel normalization stats for the tokenizer | ~2 KB |
| `generator` | Stage-2 text→motion generator: flan-t5-base fine-tuned on FSQ motion tokens (multi-task) | ~3.2 GB |
| `dynamic_verifier` | Dynamic-feasibility (physical-plausibility) scorer | ~40 MB |
| `dynamic_norm_stats` | Normalization stats paired with the dynamic verifier | ~2 KB |
| `semantic_evaluator` | Text–motion matching evaluator (match net + decomposition + meta) | variable |
| `glove` | GloVe vocab for the semantic text encoder | ~20 MB |
| `g1_robot` | Unitree G1 MuJoCo model (XML + meshes) | ~26 MB |

> The base LM `google/flan-t5-base` is loaded from the public Hub at runtime and is not re-hosted here.

## Usage

The checkpoints are designed to be fetched automatically by the [TEXEDO code](https://github.com/JianuoCao/TEXEDO):

```bash
git clone https://github.com/JianuoCao/TEXEDO.git
cd TEXEDO
conda env create -f environment.yml
conda activate TEXEDO
pip install -e .

# Downloads these checkpoints + runtime assets into ./assets
python scripts/download_assets.py
```

Then run the full generate → score → select → render pipeline:

```bash
python -m pipeline.generate --prompt "a person waves with the right hand" --num-samples 8 --out-dir candidates/
python -m pipeline.score --motion-dir candidates/ --caption "a person waves with the right hand" --output scores.csv
python -m pipeline.select_best_of_n --scores scores.csv --motion-dir candidates/ --copy-best-to best/
python scripts/visualize_csv.py --input-dir best/ --output-dir viz/
```
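
If you prefer to drive the four stages from Python, the commands above can be wrapped as in the sketch below. It only reuses the modules and flags shown in those commands and passes the same text as both `--prompt` and `--caption`. The helper names `build_pipeline_commands` and `run_pipeline` are made up for illustration, and the sketch assumes it is run from the TEXEDO repository root inside the activated environment.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_pipeline_commands(
    prompt: str, num_samples: int = 8, work_dir: str = "."
) -> List[List[str]]:
    """Return the argv lists for generate -> score -> select -> render."""
    root = Path(work_dir)
    candidates = str(root / "candidates")
    best = str(root / "best")
    scores = str(root / "scores.csv")
    viz = str(root / "viz")
    py = sys.executable  # use the interpreter of the active environment
    return [
        [py, "-m", "pipeline.generate", "--prompt", prompt,
         "--num-samples", str(num_samples), "--out-dir", candidates],
        [py, "-m", "pipeline.score", "--motion-dir", candidates,
         "--caption", prompt, "--output", scores],
        [py, "-m", "pipeline.select_best_of_n", "--scores", scores,
         "--motion-dir", candidates, "--copy-best-to", best],
        [py, "scripts/visualize_csv.py", "--input-dir", best,
         "--output-dir", viz],
    ]


def run_pipeline(prompt: str, num_samples: int = 8, work_dir: str = ".") -> None:
    """Run the stages in order, stopping at the first one that fails."""
    for cmd in build_pipeline_commands(prompt, num_samples, work_dir):
        subprocess.run(cmd, check=True)

# Example call (not executed here):
# run_pipeline("a person waves with the right hand")
```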

You can also download a single file directly:

```python
from huggingface_hub import hf_hub_download

ckpt = hf_hub_download(
    repo_id="JianuoCao/TEXEDO-Checkpoint",
    filename="tokenizer/checkpoint_epoch_95.pt",
)
```
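
To fetch a whole subfolder rather than one file, `snapshot_download` from `huggingface_hub` accepts glob filters through `allow_patterns`. The sketch below assumes the tokenizer files live under `tokenizer/`, as the path in the example above suggests. The helper names `subfolder_pattern` and `fetch_subfolder` are illustrative, and docs/MODELS.md remains the authority on the actual layout.

```python
from typing import Optional


def subfolder_pattern(subfolder: str) -> str:
    """Build a glob that matches every file under the given repo subfolder."""
    return f"{subfolder.strip('/')}/*"


def fetch_subfolder(
    subfolder: str,
    local_dir: Optional[str] = None,
    repo_id: str = "JianuoCao/TEXEDO-Checkpoint",
) -> str:
    """Download only the files under `subfolder` and return the local snapshot path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=[subfolder_pattern(subfolder)],
        local_dir=local_dir,  # None keeps files in the default Hub cache
    )

# Example call (needs network access, not executed here):
# tokenizer_dir = fetch_subfolder("tokenizer")
```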

See the repo's [docs/MODELS.md](https://github.com/JianuoCao/TEXEDO/blob/main/docs/MODELS.md) for the full asset manifest and layout.

## Citation

```bibtex
@misc{cao2026texedotesttime,
      title={TEXEDO: Test-Time Scaling for Controller-Aware Language-Conditioned Humanoid Motion Generation},
      author={Jianuo Cao and Yuxin Chen and Yuzhen Song and Masayoshi Tomizuka and Chenran Li and Thomas Tian},
      year={2026},
      eprint={2606.22998},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2606.22998},
}
```

## License

Released under the MIT license. Third-party datasets, pretrained base models, robot assets, and dependencies retain their own licenses and terms of use.