Publication
Maximum Total Correlation Reinforcement Learning
Bang You; Puze Liu; Huaping Liu; Jan Peters; Oleg Arenz
In: Computing Research Repository (CoRR), Vol. abs/2505.16734, Pages 1-23, arXiv, 2025.
Abstract
Simplicity is a powerful inductive bias. In reinforcement learning, regularization is used for simpler policies, data augmentation for simpler representations, and sparse reward functions for simpler objectives, all with the underlying motivation of increasing generalizability and robustness by focusing on the essentials. Complementary to these techniques, we investigate how to promote simple behavior throughout the episode. To that end, we introduce a modification of the reinforcement learning problem that additionally maximizes the total correlation within the induced trajectories. We propose a practical algorithm that optimizes all models, including policy and state representation, based on a lower-bound approximation. In simulated robot environments, our method naturally generates policies that induce periodic and compressible trajectories, and that exhibit superior robustness to noise and changes in dynamics compared to baseline methods, while also improving performance on the original tasks.
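
For reference, total correlation is a standard information-theoretic quantity: the KL divergence between a joint distribution and the product of its marginals. A minimal sketch of how the trajectory-level objective might look is given below; the trade-off weight \beta, the additive combination with the return, and the restriction to states s_1, ..., s_T are our illustrative assumptions, not notation taken from the paper:

\mathrm{TC}(s_1, \dots, s_T) = \sum_{t=1}^{T} H(s_t) - H(s_1, \dots, s_T) = D_{\mathrm{KL}}\!\left( p(s_1, \dots, s_T) \,\middle\|\, \prod_{t=1}^{T} p(s_t) \right)

\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t} \gamma^{t}\, r(s_t, a_t) \right] + \beta \, \mathrm{TC}(s_1, \dots, s_T)

Intuitively, a high total correlation means each state is highly predictable from the others, which favors the periodic, compressible trajectories reported in the abstract; since the joint entropy term is generally intractable, the paper optimizes a lower-bound approximation instead.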
