WT-UMI: Whole-Body Tactile UMI for Force-Supervised Humanoid Manipulation
* Equal Contribution
We evaluate generalization on three representative contact-rich tasks: pivoting, insertion, and lid closing. The first two test generalization to unseen objects within a trained task class; lid closing additionally tests generalization to an unseen task class.
Rollout Speed: 2x
Whole-body humanoid manipulation of bulky, deformable, and shared-load objects requires distributed contact sensing and explicit force regulation, yet most imitation policies treat contact force only implicitly. Demonstration sources are also incomplete: human demonstrations capture natural contact forces but not executable robot actions, while teleoperation provides action labels but less natural force regulation. We present WT-UMI, a wearable whole-body tactile interface that is worn by human demonstrators and mounted on the humanoid, providing calibrated tactile images and normal-force measurements in both collection modes. A force-supervised planner predicts end-effector pose chunks and contact-force trajectories with a cross-attention force head. A force-conditioned IK target correction module converts these predictions into contact-aware robot targets, while the predicted force drives a tactile-based admittance controller. On five contact-rich tasks spanning deformable objects, bulky rigid objects, and human–humanoid collaboration, WT-UMI improves success rate by XX.X percentage points and reduces contact-region drift by XX% across four policy backbones.
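To make the idea of a force-driven admittance controller concrete, the following is a minimal one-dimensional sketch, not the WT-UMI implementation. It assumes a standard mass-damper-spring admittance law along a single contact normal, with the predicted force as the setpoint and the tactile normal force as the measurement. The names `AdmittanceState` and `admittance_step`, the gains, and the time step are all hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdmittanceState:
    offset: float = 0.0    # displacement along the contact normal, into the object (m)
    velocity: float = 0.0  # rate of change of that displacement (m/s)


def admittance_step(state, f_measured, f_desired,
                    mass=1.0, damping=20.0, stiffness=100.0, dt=0.001):
    """Advance a 1-D admittance law by one time step.

    Assumed model (illustrative, not taken from the paper):
        mass * a + damping * v + stiffness * x = f_desired - f_measured
    A positive force error (too little contact force) pushes the target
    further into the object; a negative error backs it off.
    """
    force_error = f_desired - f_measured
    accel = (force_error
             - damping * state.velocity
             - stiffness * state.offset) / mass
    # Semi-implicit Euler: update velocity first, then position.
    velocity = state.velocity + accel * dt
    offset = state.offset + velocity * dt
    return AdmittanceState(offset, velocity)
```

In a control loop, the returned `offset` would be added to the nominal end-effector target along the contact normal at every step. With a constant force error the offset settles at `force_error / stiffness`, so the stiffness gain sets how far the target is allowed to move per newton of error.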
[Describe your framework / pipeline overview here.]
[Describe the hardware setup — sensors, mounts, calibration, etc.]
| Task / Metric | Pi0 | Psi0 | ViT-FMT | ViT-DiT |
|---|---|---|---|---|
| Pillow Reorientation (%) | XX.X | XX.X | XX.X | XX.X |
| Yoga Ball (%) | XX.X | XX.X | XX.X | XX.X |
| Bucket (%) | XX.X | XX.X | XX.X | XX.X |
| Average Success Rate (%) ↑ | XX.X | XX.X | XX.X | XX.X |
| Motion Smoothness (m/s) ↓ | X.XX | X.XX | X.XX | X.XX |
| Contact Region Off-center Drift (mm) ↓ | X.XX | X.XX | X.XX | X.XX |
| Contact Maintenance Time (s) ↓ | X.XX | X.XX | X.XX | X.XX |
Per-task success rate (%) and aggregate position tracking and contact-region drift on five contact-rich whole-body manipulation tasks, N = 20 trials per cell. Aggregate rows are averaged across all five tasks over successful trials. Best per row in bold.
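The aggregation described in the caption can be sketched as follows. This is an illustrative reading, not the authors' evaluation code: it assumes each trial is a record with a boolean `success` flag and per-trial metric values, and that "averaged across tasks over successful trials" means pooling all successful trials from every task before taking the mean. The function names and the record layout are hypothetical.

```python
from statistics import mean


def success_rate(trials):
    """Percentage of successful trials in one table cell."""
    return 100.0 * sum(t["success"] for t in trials) / len(trials)


def aggregate_over_successes(trials_by_task, metric):
    """Mean of `metric` over successful trials, pooled across all tasks.

    Assumption: failed trials are excluded and successes from all tasks
    are pooled; a per-task mean followed by a mean over tasks would be
    an equally plausible reading of the caption.
    """
    values = [
        t[metric]
        for trials in trials_by_task.values()
        for t in trials
        if t["success"]
    ]
    return mean(values) if values else float("nan")
```

Excluding failed trials keeps the aggregate metrics from being dominated by runs where contact was lost entirely, at the cost of making them conditional on success, so they are best read alongside the success-rate rows.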
@article{citekey,
title={Your Paper Title Goes Here},
author={Last, First and Last, First and Last, First},
journal={Venue / arXiv preprint arXiv:XXXX.XXXXX},
year={YYYY}
}