Publication
Learning Sim-Grounded Policies for Bimanual Rope Manipulation from Human Teleoperation Data
Gina Wigginghaus; Tim Missal; Berk Guler; Simon Manschitz; Jan Peters
In: Computing Research Repository eprint Journal (CoRR), Vol. abs/2605.16043, Pages 1-5, arXiv, 2026.
Abstract
Deformable Linear Objects (DLOs) such as ropes
and cables are widely encountered in both household and industrial
applications, yet remain challenging to manipulate due to
their infinite-dimensional configuration space and frequent self-occlusion.
Imitation learning from teleoperation offers a practical
path to bimanual DLO manipulation, but its scalability is
limited by human effort, making the choice of observation space
critical for generalization from small datasets. In this study, we
investigate whether the lack of generalization in egocentric visual
policies for the knot-untangling task stems from the observation
space itself, rather than from the policy architecture or data scale.
We compare two Action Chunking with Transformers policies
trained on the same bimanual teleoperation data: a vision-based
policy conditioned on two egocentric RGB streams from wrist-mounted
cameras, and a state-based policy conditioned on the
DLO’s 3D particle state, extracted from an initial observation
via multi-view fusion and evolved in a particle-based eXtended
Position-Based Dynamics simulation. Evaluated open-loop on an
unseen rope configuration, the state-based policy outperforms
its visual counterpart with a 30.8% reduction in L1 error
when predicting the initial grasp-and-pull action, quantifying the
observability gap between pixels and physics-consistent state, and
pointing toward more data-efficient robot learning for the DLO
manipulation task from limited human demonstrations.
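
As a reading aid for the reported metric, the sketch below shows one common way an open-loop L1 error and a relative reduction between two policies can be computed. The abstract does not state how the error is averaged, so the mean over all timesteps and action dimensions is an assumption, and the function names and toy numbers are hypothetical illustrations, not data or code from the paper.

```python
def mean_l1_error(predicted, target):
    """Mean absolute difference over all timesteps and action dimensions.

    Both arguments are lists of equal-length rows (one row per timestep).
    Averaging over every element is an assumption made for this sketch.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target need the same number of rows")
    total, count = 0.0, 0
    for p_row, t_row in zip(predicted, target):
        if len(p_row) != len(t_row):
            raise ValueError("rows need the same number of action dimensions")
        for p, t in zip(p_row, t_row):
            total += abs(p - t)
            count += 1
    if count == 0:
        raise ValueError("inputs must not be empty")
    return total / count


def relative_reduction(baseline_error, improved_error):
    """Fraction by which improved_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - improved_error) / baseline_error


# Hypothetical toy action chunks (3 timesteps, 2 action dimensions).
target = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
vision_pred = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
state_pred = [[0.25, 0.25], [0.25, 0.25], [0.25, 0.25]]

vision_err = mean_l1_error(vision_pred, target)
state_err = mean_l1_error(state_pred, target)
reduction = relative_reduction(vision_err, state_err)
print(f"relative L1 reduction: {reduction:.1%}")
```

Under this definition, a 30.8% reduction would mean the state-based policy's L1 error is about 0.692 times that of the vision-based policy on the same demonstrations.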
