Publication

Evaluating the Robustness of HJB Optimal Feedback Control

Michael Lutter; Debora Clever; Boris Belousov; Kim Listmann; Jan Peters

In: International Symposium on Robotics. International Symposium on Robotics (ISR-2020), 52nd, December 9-10, Pages 1-8, VDE, 2020.

Abstract

Developing and tuning a feedback controller, including dynamics compensation, is a challenging but essential part of many control applications. In contrast, describing a control task by a cost function and learning the corresponding optimal controller can simplify controller development. However, the currently popular deep reinforcement learning methods for obtaining these controllers automatically do not achieve good control policies with respect to the required smoothness, generalization, and robustness to parameter uncertainty of the approximate dynamics model. In this paper, we describe HJB optimal control (HJBopt), a different approach that obtains optimal feedback policies by optimization rather than by repeatedly sampling actions. This approach optimizes the residual of the Hamilton-Jacobi-Bellman differential equation on the complete state domain to obtain an optimal value function that directly implies a continuous-time optimal policy on the complete state domain. The experiments show that HJBopt learns a good approximation of the optimal policy and that this approximation exhibits much better smoothness and generalization than the deep reinforcement learning baselines. In addition, we show empirically that these characteristics enable HJBopt to obtain much more robust policies than these baselines.
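
As a reading aid, the residual objective mentioned in the abstract can be sketched in the textbook form of the Hamilton-Jacobi-Bellman equation. This is not taken from the paper: the symbols (dynamics f, running cost c, discount rate ρ, value-function parameters θ), the infinite-horizon discounted setting, and the control-affine dynamics with quadratic action cost in the last line are all assumptions made for illustration.

```latex
% Assumed setting (not stated in the abstract): dynamics \dot{x} = f(x, u),
% running cost c(x, u), discount rate \rho > 0, state domain \mathcal{X}.

% HJB equation satisfied by the optimal value function V^*:
\rho\, V^*(x) = \min_{u} \left[ c(x, u) + \nabla_x V^*(x)^\top f(x, u) \right]

% Residual of a parametrized value function V_\theta and the loss
% minimized over the whole state domain:
r_\theta(x) = \rho\, V_\theta(x) - \min_{u} \left[ c(x, u) + \nabla_x V_\theta(x)^\top f(x, u) \right],
\qquad
\mathcal{L}(\theta) = \int_{\mathcal{X}} r_\theta(x)^2 \, dx

% Further assumption: f(x, u) = a(x) + B(x)\,u and
% c(x, u) = q(x) + \tfrac{1}{2} u^\top R\, u with R \succ 0.
% Setting the gradient with respect to u to zero gives the policy in closed form:
u^*(x) = -R^{-1} B(x)^\top \nabla_x V_\theta(x)
```

Under these assumptions, the last line illustrates the abstract's statement that the value function "directly implies" the policy: once V_θ is known, the action at any state follows from its gradient, with no sampling of actions.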
