TWIST2 is our next-generation humanoid data collection system that combines portability, scalability, and holistic whole-body control.
We collect long-horizon, whole-body, dexterous, egocentric humanoid loco-manipulation data with TWIST2 and learn autonomous skills from the data.
We have fully open-sourced TWIST2, including the training/deployment code, the controller checkpoint, and the hardware design, so you can reproduce the system on your own Unitree G1.
We are also open-sourcing our whole-body humanoid loco-manipulation dataset collected with TWIST2. You are welcome to join our community and contribute your own datasets! See https://twist-data.github.io/ for more details.
TWIST2 is a humanoid data collection system that combines portability, scalability, and holistic whole-body control. With the collected data, we further design a visuomotor humanoid policy learning framework.
We find that egocentric active perception is crucial for long-horizon, dexterous teleoperation. We design an add-on 2-DoF neck that can be easily attached to the G1 and provides egocentric active perception. We also model the neck in MuJoCo for simulation evaluation.
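Below is a minimal sketch of how a 2-DoF (yaw + pitch) neck can be modeled and commanded in MuJoCo; the joint names, ranges, and head-camera placement here are illustrative assumptions, not our released model.

```python
# Minimal sketch of a 2-DoF (yaw + pitch) neck in MuJoCo.
# Joint names, ranges, and the head camera are illustrative assumptions,
# not the released TWIST2 neck model.
import mujoco

NECK_XML = """
<mujoco model="neck_sketch">
  <worldbody>
    <body name="torso" pos="0 0 1.0">
      <geom type="box" size="0.1 0.1 0.1" mass="5"/>
      <body name="neck_yaw_link" pos="0 0 0.15">
        <joint name="neck_yaw" type="hinge" axis="0 0 1" range="-1.57 1.57"/>
        <geom type="cylinder" size="0.02 0.03" mass="0.1"/>
        <body name="neck_pitch_link" pos="0 0 0.05">
          <joint name="neck_pitch" type="hinge" axis="0 1 0" range="-0.8 0.8"/>
          <geom type="box" size="0.05 0.05 0.05" mass="0.3"/>
          <camera name="egocentric" pos="0.06 0 0" xyaxes="0 -1 0 0 0 1"/>
        </body>
      </body>
    </body>
  </worldbody>
  <actuator>
    <position joint="neck_yaw" kp="20"/>
    <position joint="neck_pitch" kp="20"/>
  </actuator>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(NECK_XML)
data = mujoco.MjData(model)

# Command a head pose (yaw, pitch) and step the simulation.
data.ctrl[:] = [0.3, -0.2]
for _ in range(500):
    mujoco.mj_step(model, data)
print("neck joint angles:", data.qpos[:2])
```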
We use a PICO 4 Ultra headset plus 2 PICO Motion Trackers as our portable teleoperation device. Thanks to XRoboToolkit, a single unified PICO application handles both egocentric vision streaming and whole-body pose streaming.
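As a rough sketch of the robot-side receive loop, the client can be thought of as decoding streamed headset/tracker poses and forwarding them to retargeting. This is not the actual XRoboToolkit API; the port, packet format, and field names below are hypothetical.

```python
# Hypothetical sketch of the teleoperation receive loop: the PICO app streams
# headset, controller, and motion-tracker poses; the robot-side client decodes
# them and hands them to retargeting. This is NOT the XRoboToolkit API;
# the port, packet format, and field names are illustrative assumptions.
import json
import socket

PORT = 9870  # assumed port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", PORT))

def handle_pose_packet(packet: dict) -> None:
    # Each pose: position (x, y, z) + orientation quaternion (w, x, y, z).
    head = packet["head"]          # drives the 2-DoF neck
    hands = packet["hands"]        # left/right wrist poses for the arms
    trackers = packet["trackers"]  # body trackers for whole-body retargeting
    # ... feed into motion retargeting (e.g., GMR) to get robot joint targets
    print("head position:", head["pos"])

while True:
    raw, _addr = sock.recvfrom(65536)
    handle_pose_packet(json.loads(raw.decode("utf-8")))
```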
We design a hierarchical visuomotor humanoid policy learning framework, where the low-level tracking policy (System 1) is trained via sim2real RL, and the high-level visuomotor policy (System 2) is trained via imitation learning from TWIST2 data.
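As a rough illustration of how the two systems compose at deployment time, the sketch below runs System 2 at a low rate to replan whole-body joint targets and System 1 at a high rate to track them. The class names, rates, and interfaces are placeholders, not the released code.

```python
# Minimal sketch of the hierarchical System 1 / System 2 loop at deployment.
# Class names, rates, and interfaces are placeholders, not the released code.
import numpy as np

HIGH_LEVEL_HZ = 10  # System 2: visuomotor policy (imitation-learned)
LOW_LEVEL_HZ = 50   # System 1: whole-body tracking policy (sim2real RL)

class VisuomotorPolicy:
    """System 2: maps egocentric image + proprioception to joint targets."""
    def act(self, image: np.ndarray, qpos: np.ndarray) -> np.ndarray:
        return np.zeros_like(qpos)  # placeholder joint-position targets

class TrackingPolicy:
    """System 1: tracks reference joint targets with motor actions."""
    def act(self, qpos: np.ndarray, qvel: np.ndarray,
            target_qpos: np.ndarray) -> np.ndarray:
        return target_qpos  # placeholder: pass targets to PD motor control

def control_loop(robot, high: VisuomotorPolicy, low: TrackingPolicy):
    steps_per_high = LOW_LEVEL_HZ // HIGH_LEVEL_HZ
    target_qpos = robot.qpos()
    step = 0
    while True:
        if step % steps_per_high == 0:
            # System 2 replans whole-body joint targets from vision.
            target_qpos = high.act(robot.egocentric_image(), robot.qpos())
        # System 1 tracks the latest targets at the fast rate.
        robot.apply(low.act(robot.qpos(), robot.qvel(), target_qpos))
        step += 1
```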
We show 1) collecting 128 successful bimanual dexterous pick & place episodes in 15 minutes, and 2) collecting 50 successful mobile pick & place episodes in 15 minutes.
We show long-horizon, dexterous loco-manipulation skills enabled by TWIST2, including 1) folding 3 towels consecutively, 2) finding a cloth and folding it, 3) kicking a soccer ball, 4) transporting objects through a door, 5) picking up a brick from the ground, and more.
We train visuomotor policies (e.g., Diffusion Policy) using TWIST2 data. We show two autonomous skills enabled by the trained policies: 1) Kick-T, where a robot kicks a T-shaped box to the target region, and 2) WB-Dex, where a robot performs dexterous pick & place.
We train a diffusion policy to predict future whole-body joint positions, covering both upper-body and lower-body joints. Below we show the predicted future whole-body joint positions for WB-Dex and Kick-T as ghost trajectories.
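For intuition, here is a schematic DDPM-style inference loop for such an action-chunk diffusion policy: starting from Gaussian noise, it iteratively denoises a chunk of future joint positions conditioned on the observation. The horizon, joint count, noise schedule, and placeholder noise-prediction network are illustrative assumptions, not the trained model.

```python
# Schematic DDPM-style inference for a whole-body diffusion policy.
# Dimensions, schedule, and the noise-prediction network are
# illustrative assumptions, not the trained TWIST2 model.
import torch

HORIZON, NUM_JOINTS, STEPS = 16, 29, 100  # e.g., a 29-DoF G1 configuration

betas = torch.linspace(1e-4, 2e-2, STEPS)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def predict_noise(actions, t, obs_emb):
    """Placeholder for the learned noise-prediction network eps_theta."""
    return torch.zeros_like(actions)

@torch.no_grad()
def sample_action_chunk(obs_emb: torch.Tensor) -> torch.Tensor:
    x = torch.randn(1, HORIZON, NUM_JOINTS)  # start from pure noise
    for t in reversed(range(STEPS)):
        eps = predict_noise(x, torch.tensor([t]), obs_emb)
        a_t, ab_t = alphas[t], alpha_bars[t]
        # Standard DDPM posterior mean update.
        x = (x - (1 - a_t) / torch.sqrt(1 - ab_t) * eps) / torch.sqrt(a_t)
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x  # predicted chunk of future whole-body joint positions

chunk = sample_action_chunk(obs_emb=torch.zeros(1, 512))
print(chunk.shape)  # torch.Size([1, 16, 29])
```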
Our work is built upon the following works: TWIST provides the foundation of the humanoid motion tracking controller, GMR provides the base humanoid motion retargeting framework, and iDP3 provides the base humanoid visuomotor policy learning framework.
Scan the QR code to join our community for humanoid robots.
(Note: since our WeChat group has more than 200 members, it can now only be joined by invitation. Please add me on WeChat if you want to join, and include info like "[TWIST2] [Your Name] [Your Affiliation]".)
We have assembled 4 TWIST2 Necks so far. We look forward to featuring your own TWIST2 Neck here.
@article{ze2025twist2,
  title={TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System},
  author={Yanjie Ze and Siheng Zhao and Weizhuo Wang and Angjoo Kanazawa and Rocky Duan and Pieter Abbeel and Guanya Shi and Jiajun Wu and C. Karen Liu},
  year={2025},
  journal={arXiv preprint arXiv:2511.02832}
}
Website modified from TWIST. Data visualizer modified from DROID.
© 2025 Yanjie Ze