Publication
Maximum Total Correlation Reinforcement Learning
Bang You; Puze Liu; Huaping Liu; Jan Peters; Oleg Arenz
In: Computing Research Repository (CoRR), Vol. abs/2505.16734, Pages 1-23, arXiv, 2025.
Abstract
Simplicity is a powerful inductive bias. In reinforcement learning, regularization is used for simpler policies, data augmentation for simpler representations, and sparse reward functions for simpler objectives, all with the underlying motivation of increasing generalizability and robustness by focusing on the essentials. Complementary to these techniques, we investigate how to promote simple behavior throughout the episode. To that end, we introduce a modification of the reinforcement learning problem that additionally maximizes the total correlation within the induced trajectories. We propose a practical algorithm that optimizes all models, including policy and state representation, based on a lower-bound approximation. In simulated robot environments, our method naturally generates policies that induce periodic and compressible trajectories, and that exhibit superior robustness to noise and changes in dynamics compared to baseline methods, while also improving performance on the original tasks.
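
For reference, total correlation is a standard information-theoretic quantity: the KL divergence between a joint distribution and the product of its marginals. A minimal sketch of how the trajectory-level objective might look is given below; the trade-off weight \beta, the additive combination with the return, and the restriction to states s_1, ..., s_T are our illustrative assumptions, not notation taken from the paper:

\mathrm{TC}(s_1, \dots, s_T) = \sum_{t=1}^{T} H(s_t) - H(s_1, \dots, s_T) = D_{\mathrm{KL}}\!\left( p(s_1, \dots, s_T) \,\middle\|\, \prod_{t=1}^{T} p(s_t) \right)

\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t} \gamma^{t}\, r(s_t, a_t) \right] + \beta \, \mathrm{TC}(s_1, \dots, s_T)

Intuitively, a high total correlation means each state is highly predictable from the others, which favors the periodic, compressible trajectories reported in the abstract; since the joint entropy term is generally intractable, the paper optimizes a lower-bound approximation instead.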
