Publication

Can Explainability Metrics Improve Genetic Programming? Lessons from 2048

Lauren Paul; Christina Plump; Bernhard J. Berger; Rolf Drechsler

In: The Genetic and Evolutionary Computation Conference (GECCO Companion). Genetic and Evolutionary Computation Conference (GECCO-2026), July 13-17, San José, Costa Rica, 2026.

Abstract

Genetic programming (GP) produces solutions that, while often less performant than neural networks, are uniquely amenable to analysis. In domains like game-playing, this explainability could reveal why certain policies succeed or fail—but such insights are rarely leveraged to improve the algorithms themselves. Here, we analyze a GP algorithm for 2048, testing whether structural, behavioral, or semantic metrics of evolved policies correlate with performance. While structural metrics showed no predictive power and behavioral features yielded ambiguous results, we identified a semantic feature that correlated with policy quality. Using this insight, we designed a mutation operator that improved performance. Though modest, this improvement suggests that explainability metrics can guide operator design, even when broader explanatory goals remain unmet. More broadly, our work highlights a potential advantage of GP: its solutions may be analyzable in ways that opaque methods like neural networks are not.