CEGAN: Classification Enhancement Generative Adversarial Networks for unraveling data imbalance problems

Sungho Suh, Haebom Lee, Paul Lukowicz, Yong Oh Lee

In: Neural Networks, Vol. 133, Pages 69-86, Elsevier, 1/2021.


Data imbalance is a frequent and challenging problem in classification. In many real-world datasets the class distribution is imbalanced, and classifiers trained under such conditions are heavily biased toward the majority class. Recently, the potential of generative adversarial networks (GANs) as a data augmentation method for minority data has been studied. In this paper, we propose classification enhancement generative adversarial networks (CEGAN) to enhance the quality of the generated synthetic minority data and, more importantly, to improve prediction accuracy under data-imbalanced conditions. In addition, we propose an ambiguity reduction method that uses the generated synthetic minority data for cases in which multiple similar classes degrade classification accuracy. The proposed method is demonstrated on five benchmark datasets. The results indicate that approximating the real data distribution using CEGAN significantly improves classification performance under data-imbalanced conditions compared with various standard data augmentation methods.
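
The majority-class bias mentioned in the abstract can be illustrated with a minimal sketch. This is not the CEGAN method; it is a toy example under assumed conditions (a hypothetical 95:5 label split and a trivial classifier that always predicts the most frequent label) showing why overall accuracy can look high while the minority class is never recognized. All function and variable names are illustrative.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return a classifier that always predicts the most frequent training label."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    return lambda _x: majority_label


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of samples with true class `label` that were predicted as `label`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in relevant) / len(relevant)


# Hypothetical toy labels: 95 majority samples (class 0), 5 minority samples (class 1).
labels = [0] * 95 + [1] * 5

predict = majority_baseline(labels)
predictions = [predict(None) for _ in labels]  # features are ignored by this baseline

overall_accuracy = accuracy(labels, predictions)
minority_recall = recall(labels, predictions, label=1)

print(overall_accuracy, minority_recall)  # prints: 0.95 0.0
```

In this toy setting the baseline reaches 95% accuracy without identifying a single minority sample, which is the kind of bias that augmenting the minority class is intended to counteract.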

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz