Publication
Layered direct policy search for learning hierarchical skills
Felix End; Riad Akrour; Jan Peters; Gerhard Neumann
In: 2017 IEEE International Conference on Robotics and Automation. IEEE International Conference on Robotics and Automation (ICRA-2017), May 29 - June 3, Singapore, Pages 6442-6448, IEEE, 2017.
Abstract
Solutions to real world robotic tasks often require complex behaviors in high dimensional continuous state and action spaces. Reinforcement Learning (RL) is aimed at learning such behaviors but often fails for lack of scalability. To address this issue, Hierarchical RL (HRL) algorithms leverage hierarchical policies to exploit the structure of a task. However, many HRL algorithms rely on task specific knowledge such as a set of predefined sub-policies or sub-goals. In this paper we propose a new HRL algorithm based on information theoretic principles to autonomously uncover a diverse set of sub-policies and their activation policies. Moreover, the learning process mirrors the policys structure and is thus also hierarchical, consisting of a set of independent optimization problems. The hierarchical structure of the learning process allows us to control the learning rate of the sub-policies and the gating individually and add specific information theoretic constraints to each layer to ensure the diversification of the sub-policies. We evaluate our algorithm on two high dimensional continuous tasks and experimentally demonstrate its ability to autonomously discover a rich set of sub-policies.