---
license: mit
pipeline_tag: robotics
tags:
- retargeted-motion
- text-to-motion
- motion-to-text
- motion-prediction
- humanoid
- unitree-g1
- robotics
- motion-generation
---
<h1 align="center" style="font-size: 1.6em;">TEXEDO — Checkpoints</h1>
<p align="center" style="font-size: 1.6em; font-weight: bold;">Test-Time Scaling for Controller-Aware Language-Conditioned Humanoid Motion Generation</p>
This repository hosts the pretrained **checkpoints and runtime assets** for **TEXEDO**, a text-to-motion pipeline for the Unitree G1 humanoid. Given a language prompt, TEXEDO generates multiple candidate motions, decodes them into a 36-dimensional G1 robot motion format, scores them with dynamic and semantic verifiers, and selects the best candidate for deployment.
- 🌐 **Project page:** https://jianuocao.github.io/TEXEDO/
- 💻 **Code:** https://github.com/JianuoCao/TEXEDO
- 📄 **Paper:** https://arxiv.org/abs/2606.22998
- 📦 **Dataset:** https://huggingface.co/datasets/JianuoCao/TEXEDO
## Contents
| Logical name | What it is | Approx. size |
|---|---|---|
| `fsq_tokenizer` | FSQ motion tokenizer (encoder/decoder + codebook) for 36-dim G1 motion | ~216 MB |
| `fsq_norm_stats` | Per-channel normalization stats for the tokenizer | ~2 KB |
| `generator` | Stage-2 text→motion generator: flan-t5-base fine-tuned on FSQ motion tokens (multi-task) | ~3.2 GB |
| `dynamic_verifier` | Dynamic-feasibility (physical-plausibility) scorer | ~40 MB |
| `dynamic_norm_stats` | Normalization stats paired with the dynamic verifier | ~2 KB |
| `semantic_evaluator` | Text–motion matching evaluator (match net + decomposition + meta) | variable |
| `glove` | GloVe vocab for the semantic text encoder | ~20 MB |
| `g1_robot` | Unitree G1 MuJoCo model (XML + meshes) | ~26 MB |
> The base LM `google/flan-t5-base` is loaded from the public Hub at runtime and is not re-hosted here.
## Usage
The [TEXEDO code](https://github.com/JianuoCao/TEXEDO) fetches these checkpoints automatically through its download script:
```bash
git clone https://github.com/JianuoCao/TEXEDO.git
cd TEXEDO
conda env create -f environment.yml
conda activate TEXEDO
pip install -e .
# Downloads these checkpoints + runtime assets into ./assets
python scripts/download_assets.py
```
Then run the full generate → score → select → render pipeline:
```bash
python -m pipeline.generate --prompt "a person waves with the right hand" --num-samples 8 --out-dir candidates/
python -m pipeline.score --motion-dir candidates/ --caption "a person waves with the right hand" --output scores.csv
python -m pipeline.select_best_of_n --scores scores.csv --motion-dir candidates/ --copy-best-to best/
python scripts/visualize_csv.py --input-dir best/ --output-dir viz/
```
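The select step is a best-of-N choice over the scored candidates. The sketch below illustrates the idea only: the column names (`motion`, `dynamic_score`, `semantic_score`), the equal-weight sum, and the example rows are all assumptions made for illustration, not the actual `scores.csv` layout or the scoring rule used by `pipeline.select_best_of_n`.

```python
import csv
import io


def select_best(rows, w_dynamic=0.5, w_semantic=0.5):
    """Return the row with the highest weighted sum of the two scores.

    The weighted sum is an assumed combination rule, used here only
    to show how best-of-N selection works.
    """
    def combined(row):
        return (w_dynamic * float(row["dynamic_score"])
                + w_semantic * float(row["semantic_score"]))
    return max(rows, key=combined)


# Made-up scores for three candidates (hypothetical column names).
example_csv = """motion,dynamic_score,semantic_score
cand_0.csv,0.90,0.20
cand_1.csv,0.70,0.80
cand_2.csv,0.40,0.95
"""

rows = list(csv.DictReader(io.StringIO(example_csv)))
best = select_best(rows)
print(best["motion"])  # cand_1.csv
```

With equal weights, the candidate that is reasonably good on both scores wins over candidates that excel on only one. Increasing `--num-samples` gives the selector more candidates to choose from, at the cost of more generation and scoring time.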
You can also download a single file directly:
```python
from huggingface_hub import hf_hub_download
ckpt = hf_hub_download(
    repo_id="JianuoCao/TEXEDO-Checkpoint",
    filename="tokenizer/checkpoint_epoch_95.pt",
)
```
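To fetch a whole subfolder rather than one file, `snapshot_download` from `huggingface_hub` accepts glob patterns through `allow_patterns`. This is a minimal sketch: the helper names `patterns_for` and `fetch_subfolder` are hypothetical, and the `assets` target directory is an assumption that may not match the layout the download script produces.

```python
def patterns_for(subfolder: str) -> list[str]:
    """Build the glob pattern that matches files under one repo subfolder."""
    return [f"{subfolder.strip('/')}/*"]


def fetch_subfolder(subfolder: str, local_dir: str = "assets") -> str:
    """Download one subfolder of the checkpoint repo and return the local path."""
    # Imported lazily so patterns_for() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="JianuoCao/TEXEDO-Checkpoint",
        allow_patterns=patterns_for(subfolder),
        local_dir=local_dir,  # assumed target directory
    )
```

For example, `fetch_subfolder("tokenizer")` would pull only the files under `tokenizer/`, the folder used in the single-file example above.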
See [docs/MODELS.md](https://github.com/JianuoCao/TEXEDO/blob/main/docs/MODELS.md) in the code repository for the full asset manifest and directory layout.
## Citation
```bibtex
@misc{cao2026texedotesttime,
  title={TEXEDO: Test-Time Scaling for Controller-Aware Language-Conditioned Humanoid Motion Generation},
  author={Jianuo Cao and Yuxin Chen and Yuzhen Song and Masayoshi Tomizuka and Chenran Li and Thomas Tian},
  year={2026},
  eprint={2606.22998},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2606.22998},
}
```
## License
Released under the MIT license. Third-party datasets, pretrained base models, robot assets, and dependencies retain their own licenses and terms of use.