
Publication

Robust policy updates for stochastic optimal control

Elmar Rueckert; Max Mindt; Jan Peters; Gerhard Neumann
In: 14th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2014), November 18-20, Madrid, Spain, pages 388-393, IEEE, 2014.

Abstract

For controlling high-dimensional robots, most stochastic optimal control algorithms use approximations of the system dynamics and of the cost function (e.g., using linearizations and Taylor expansions). These approximations are typically only locally correct, which might cause instabilities in the greedy policy updates, lead to oscillations, or cause the algorithms to diverge. To overcome these drawbacks, we add a regularization term to the cost function that punishes large policy update steps in the trajectory optimization procedure. We applied this concept to the Approximate Inference Control method (AICO), where the resulting algorithm guarantees convergence for uninformative initial solutions without complex hand-tuning of learning rates. We evaluated our new algorithm on two simulated robotic platforms. A robot arm with five joints was used for reaching multiple targets while keeping the roll angle constant. On the humanoid robot Nao, we show how complex skills like reaching and balancing can be inferred from desired center-of-gravity or end-effector coordinates.
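The regularization described above can be sketched as follows; this is an illustrative reading of the abstract, not the paper's exact formulation. Assuming a per-time-step cost c_t(x_t, u_t), the state trajectory x_t^{(k)} produced by the previous optimization iteration, and a penalty weight α (all symbols are our assumptions for illustration), the regularized cost could take the form

\[
  \tilde{c}_t(x_t, u_t) \;=\; c_t(x_t, u_t) \;+\; \alpha \,\bigl\| x_t - x_t^{(k)} \bigr\|^2 .
\]

In this reading, a large α keeps each update close to the previous trajectory, while α → 0 recovers the original greedy update, so α acts as an implicit step-size control in place of hand-tuned learning rates.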
