VisualMimic enables generalizable visuomotor skills across time & space, allowing humanoid robots to perform complex loco-manipulation tasks by learning from visual demonstrations.
The system remains robust across different lighting conditions (morning, dusk, evening, midnight) and diverse environments (Hoover Tower, Engineering Building, Robotics Center, Memorial Church).
VisualMimic enables humanoid robots to learn complex loco-manipulation behaviors with whole-body dexterity.
The framework combines motion tracking and generation capabilities to transfer human-like behaviors to humanoid robots,
enabling them to perform tasks that require both locomotion and manipulation skills using hands, feet, shoulders, and other body parts.
VisualMimic achieves this through a novel interface design for high-level/low-level communication and teacher-student distillation for training low-level trackers.
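As a rough illustration of the teacher-student idea only, the sketch below fits a student that sees a reduced observation to imitate a teacher that also sees privileged state. Everything here is an assumption made for illustration: the linear policies, the dimensions, the names `teacher_policy` and `student_policy`, and the use of plain supervised regression onto teacher actions. It is not VisualMimic's architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: the student sees `obs`, the teacher also sees `priv`.
obs_dim, priv_dim, act_dim = 4, 2, 3

# Hypothetical teacher: a fixed linear policy over [obs, priv].
W_teacher = rng.normal(size=(act_dim, obs_dim + priv_dim))

def teacher_policy(obs, priv):
    """Teacher action from full (privileged) state."""
    return np.concatenate([obs, priv], axis=-1) @ W_teacher.T

# Collect a dataset of states and label them with teacher actions.
obs = rng.normal(size=(2000, obs_dim))
priv = 0.1 * rng.normal(size=(2000, priv_dim))
teacher_actions = teacher_policy(obs, priv)

# Student: linear policy over `obs` only, fit by least squares so that
# its actions match the teacher's as closely as possible.
W_student, *_ = np.linalg.lstsq(obs, teacher_actions, rcond=None)

def student_policy(obs):
    """Student action from the reduced observation only."""
    return obs @ W_student

# Imitation error of the student versus a do-nothing baseline.
student_mse = np.mean((student_policy(obs) - teacher_actions) ** 2)
baseline_mse = np.mean(teacher_actions ** 2)
```

The student cannot recover the part of the teacher's behavior that depends only on the privileged inputs, so `student_mse` stays above zero; the comparison with `baseline_mse` just shows how much of the teacher's behavior the reduced observation explains in this toy setup.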
Human motion serves as a unified form factor that provides scalable data for controlling diverse robotic systems, eliminating the need for extensive robot-specific motion collection.
General Motion Retargeting (GMR) retargets human motions to diverse humanoid robots via real-time multi-objective inverse kinematics, jointly solving for rotation and position constraints to preserve rich spatial information from humans.
Built upon mink, GMR enables real-time inference and is utilized in TWIST for real-time whole-body teleoperation.
GMR transfers human motions to diverse humanoid robots in real-time.
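To make "jointly solving for rotation and position constraints" concrete, here is a minimal sketch of weighted, damped least-squares inverse kinematics on a toy planar arm. It is not GMR's or mink's API: the function names, the weights `w_pos` and `w_rot`, the damping value, and the step size are all assumptions chosen for illustration.

```python
import numpy as np

def forward_kinematics(q, link_lengths):
    """End-effector position (x, y) and orientation of a planar chain."""
    angles = np.cumsum(q)
    x = np.sum(link_lengths * np.cos(angles))
    y = np.sum(link_lengths * np.sin(angles))
    return np.array([x, y]), angles[-1]

def jacobian(q, link_lengths):
    """3 x n Jacobian: rows 0-1 are position, row 2 is orientation."""
    angles = np.cumsum(q)
    n = len(q)
    J = np.zeros((3, n))
    for i in range(n):
        # Joint i rotates every link from i onward.
        J[0, i] = -np.sum(link_lengths[i:] * np.sin(angles[i:]))
        J[1, i] = np.sum(link_lengths[i:] * np.cos(angles[i:]))
        J[2, i] = 1.0
    return J

def wrap_angle(a):
    """Map an angle to the interval [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def solve_ik(q0, target_pos, target_theta, link_lengths,
             w_pos=1.0, w_rot=0.5, damping=1e-3, step=0.5, iters=200):
    """Minimize a weighted sum of position and orientation error."""
    q = np.array(q0, dtype=float)
    W = np.diag([w_pos, w_pos, w_rot])  # one weight per task row
    for _ in range(iters):
        pos, theta = forward_kinematics(q, link_lengths)
        err = np.concatenate([target_pos - pos,
                              [wrap_angle(target_theta - theta)]])
        J = jacobian(q, link_lengths)
        # Damped normal equations: (J^T W J + lambda * I) dq = J^T W err
        H = J.T @ W @ J + damping * np.eye(len(q))
        dq = np.linalg.solve(H, J.T @ W @ err)
        q += step * dq
    return q

# Usage: build a reachable target from a known pose, then solve for it.
link_lengths = np.array([1.0, 1.0, 1.0])
target_pos, target_theta = forward_kinematics(
    np.array([0.3, 0.4, -0.2]), link_lengths)
q_sol = solve_ik([0.2, 0.6, -0.4], target_pos, target_theta, link_lengths)
```

Both objectives share a single linear solve per iteration, so when a target cannot be met exactly the weights decide how the residual is split between position and orientation error. A real retargeting pipeline would add more tasks and joint limits on top of this.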
We want humanoids to have the same level of whole-body dexterity as humans.
Imagine a messy kitchen: humans can hold things with both hands while using their feet to move obstacles, such as a basket on the ground, and they can open a door with the side of their body or an elbow. We want humanoids to achieve the same by imitating humans directly.
Unprecedented human-like loco-manipulation abilities on a real humanoid robot.
TWIST, the Teleoperated Whole-Body Imitation System, uses data captured by MoCap devices to precisely track human body movements. Unlike many earlier teleoperation systems, TWIST uses joints across the humanoid robot's entire body to closely replicate human movements while keeping the motions of different limbs coordinated.