
Publication

The Power of Training: How Different Neural Network Setups Influence the Energy Demand

Daniel Geißler; Bo Zhou; Mengxi Liu; Sungho Suh; Paul Lukowicz
In: ARCS 2024: International Conference on Architecture of Computing Systems (ARCS), HPC - Challenges for Sustainable Computing, Springer, 2024.

Abstract

This work offers a heuristic evaluation of how variations in machine learning training regimes and learning paradigms affect the energy consumption of computing, especially on HPC hardware, from a life-cycle-aware perspective. While increasing data availability and innovation in high-performance hardware fuel the training of sophisticated models, they also let energy consumption and carbon emissions fade from attention. The goal of this work is therefore to raise awareness of the energy impact of common training parameters and processes, from learning rate and batch size to knowledge transfer. Multiple setups with different hyperparameter configurations are evaluated on three different hardware systems. Among other results, we found that even with the same model and hardware, reaching the same accuracy with improperly set training hyperparameters consumes up to 5 times the energy of the optimal setup. We also extensively examined the energy-saving benefits of learning paradigms that recycle knowledge through pretraining and share knowledge through multitask training.
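The abstract does not specify how energy was logged, but a minimal sketch of per-run GPU energy measurement across hyperparameter setups might look as follows. It assumes NVIDIA hardware and the pynvml bindings; the `measure_training_energy` helper, `make_training_run`, and the learning-rate/batch-size values are illustrative placeholders, not the authors' actual setup or code.

```python
import time
import threading
import pynvml


def measure_training_energy(train_fn, poll_interval=0.1, gpu_index=0):
    """Run train_fn() while sampling GPU power; return (result, energy in joules)."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)

    samples = []                     # (timestamp, watts) pairs
    stop_event = threading.Event()

    def poll_power():
        while not stop_event.is_set():
            watts = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # mW -> W
            samples.append((time.time(), watts))
            time.sleep(poll_interval)

    poller = threading.Thread(target=poll_power, daemon=True)
    poller.start()
    try:
        result = train_fn()
    finally:
        stop_event.set()
        poller.join()
        pynvml.nvmlShutdown()

    # Integrate sampled power over time (trapezoidal rule) to get energy in joules.
    energy_j = sum(
        0.5 * (p0 + p1) * (t1 - t0)
        for (t0, p0), (t1, p1) in zip(samples, samples[1:])
    )
    return result, energy_j


if __name__ == "__main__":
    # Hypothetical comparison: same model, two hyperparameter configurations.
    def make_training_run(learning_rate, batch_size):
        def train_fn():
            ...  # train the model to the target accuracy with these settings
        return train_fn

    for lr, bs in [(1e-3, 128), (1e-5, 8)]:
        _, joules = measure_training_energy(make_training_run(lr, bs))
        print(f"lr={lr}, batch_size={bs}: {joules / 3600:.1f} Wh")
```

Integrating instantaneous board power over wall-clock time is one simple way to compare configurations that reach the same accuracy at different speeds; whole-system measurements would additionally need CPU, memory, and cooling power.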
