WT-UMI: Whole-Body Tactile UMI for Force-Supervised Humanoid Manipulation

* Equal Contribution

Whole-Body Manipulation

Pillow Reorientation
Yoga Ball
Bucket
Task 4
Task 5

In-the-Wild Generalization

We evaluate generalization on three representative contact-rich tasks: pivoting, insertion, and lid closing. The first two test generalization to unseen objects within a trained task class; lid closing additionally tests generalization to an unseen task class.

Seen-by-both
Human-only
Unseen-by-both

Rollout Speed: 2x

WT-UMI (Ours)

No Tactile (Baseline)

Robot Only (Baseline)

Teleoperation Data Collection

Abstract

Whole-body humanoid manipulation of bulky, deformable, and shared-load objects requires distributed contact sensing and explicit force regulation, yet most imitation policies treat contact force only implicitly. Demonstration sources are often incomplete: human demonstrations capture natural contact forces but not executable robot actions, while teleoperation provides action labels with less natural force regulation. We present WT-UMI, a wearable whole-body tactile interface worn by human demonstrators and mounted on the humanoid, providing calibrated observations of tactile images and normal forces across both collection modes. A force-supervised planner predicts end-effector pose chunks and contact-force trajectories with a cross-attention force head. A force-conditioned IK target correction module converts these predictions into contact-aware robot targets, while the predicted force drives a tactile-based admittance controller. On five contact-rich tasks spanning deformable objects, bulky rigid objects, and human–humanoid collaboration, WT-UMI improves success rate by XX.X points and reduces contact-region drift by XX% across four policy backbones.

Framework

[Describe your framework / pipeline overview here.]
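As a rough illustration of the tactile-based admittance controller mentioned in the abstract, the sketch below shows one common way a predicted contact force can drive a position offset. Everything in it is an assumption made for illustration: the 1-D model along the contact normal, the class name `Admittance1D`, the gain values, and the linear-spring surface used as a toy environment. It is not the WT-UMI controller.

```python
from dataclasses import dataclass


@dataclass
class Admittance1D:
    """Hypothetical 1-D admittance law along the contact normal."""
    mass: float = 1.0       # virtual mass (kg), assumed value
    damping: float = 40.0   # virtual damping (N*s/m), assumed value
    stiffness: float = 0.0  # virtual stiffness (N/m); 0 gives pure force tracking
    offset: float = 0.0     # commanded displacement along the normal (m)
    velocity: float = 0.0   # rate of change of the offset (m/s)

    def step(self, f_desired: float, f_measured: float, dt: float) -> float:
        # Positive error means contact is too light, so press further in.
        error = f_desired - f_measured
        accel = (error
                 - self.damping * self.velocity
                 - self.stiffness * self.offset) / self.mass
        # Semi-implicit Euler: update velocity first, then position.
        self.velocity += accel * dt
        self.offset += self.velocity * dt
        return self.offset


def simulate(f_desired=10.0, k_env=2000.0, dt=0.002, steps=2000):
    """Toy check against a linear-spring surface (assumed environment)."""
    ctrl = Admittance1D()
    f_measured = 0.0
    for _ in range(steps):
        offset = ctrl.step(f_desired, f_measured, dt)
        # The surface only pushes back while it is compressed.
        f_measured = max(0.0, k_env * offset)
    return f_measured
```

In this toy setup the measured force settles at the desired value because the offset keeps moving until the force error vanishes. In a real system the returned offset would be added to the pose target, and the gains would need tuning against the actual contact stiffness.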

Hardware

[Figure: WT-UMI hardware diagram]

[Describe the hardware setup — sensors, mounts, calibration, etc.]

Results

Task / Metric                          Pi0    Psi0   ViT-FMT  ViT-DiT
Pillow Reorientation (%)               XX.X   XX.X   XX.X     XX.X
Yoga Ball (%)                          XX.X   XX.X   XX.X     XX.X
Bucket (%)                             XX.X   XX.X   XX.X     XX.X
Average Success Rate (%)               XX.X   XX.X   XX.X     XX.X
Motion Smoothness (m/s)                X.XX   X.XX   X.XX     X.XX
Contact-Region Off-Center Drift (mm)   X.XX   X.XX   X.XX     X.XX
Contact Maintenance Time (s)           X.XX   X.XX   X.XX     X.XX

Per-task success rate (%), together with aggregate position tracking and contact-region drift, on five contact-rich whole-body manipulation tasks (N = 20 trials per cell). Aggregate rows are averaged across all five tasks over successful trials only. Best result per row in bold.
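The bookkeeping described in the caption can be sketched as follows. This is a minimal example under stated assumptions: the function names and the trial log are hypothetical and are not data from the paper. Note that with N = 20 trials per cell, each trial moves the success rate by 5 percentage points.

```python
from statistics import mean


def success_rate(outcomes):
    """Percentage of successful trials in one table cell."""
    return 100.0 * sum(outcomes) / len(outcomes)


def mean_over_successes(values, outcomes):
    """Average a per-trial metric over successful trials only.

    Returns None when no trial succeeded, since the average is undefined.
    """
    kept = [v for v, ok in zip(values, outcomes) if ok]
    return mean(kept) if kept else None


# Hypothetical log for one cell (20 trials); illustrative values only.
outcomes = [True] * 15 + [False] * 5
drift_mm = [2.0] * 15 + [9.0] * 5

cell_success = success_rate(outcomes)                  # 15 of 20 trials
cell_drift = mean_over_successes(drift_mm, outcomes)   # failed trials ignored
```

Here `cell_success` is 75.0 and `cell_drift` is 2.0, because the five failed trials do not enter the drift average. Whether the aggregate rows pool trials across tasks or average the per-task means is not stated in the caption, so this sketch covers a single cell only.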

BibTeX

@article{citekey,
  title={Your Paper Title Goes Here},
  author={Last, First and Last, First and Last, First},
  journal={Venue / arXiv preprint arXiv:XXXX.XXXXX},
  year={YYYY}
}