Publication
Improving Medical Image Classification via Representation Fusion and Contrastive Learning
Pervaiz Khan; Andreas Dengel; Sheraz Ahmed
In: International Conference on Digital Image Computing: Techniques and Applications (DICTA 2024). International Conference on Digital Image Computing Techniques and Applications (DICTA-2024), November 27-29, Perth, Australia, IEEE Xplore, 2025.
Abstract
The recent surge in the use of Transformer-based models has advanced the field of automatic medical diagnosis, primarily due to their rich feature representations and their ability to discriminate among different classes. To further enhance feature representations, various approaches utilize domain-specific pre-training on large amounts of unlabelled data, while more recent methods employ vision-language models. In this paper, we present a simple yet effective training method to enhance the model's features using representation fusion and contrastive learning. Specifically, we fuse the representations of the model with those from its copy and then train both the original and fused representations in parallel. Additionally, we employ a contrastive loss that refines representations by pulling the original and fused representations closer while pushing other representations away. We validate the proposed approach on 4 medical imaging datasets from the MedMNIST collection and use DINOv2 as the underlying model. Our approach outperforms its baseline method and 18 existing approaches on 3 out of 4 datasets.
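The abstract does not specify the fusion operator or the exact form of the contrastive loss, so the following is only a minimal, dependency-free sketch of the general idea, not the authors' implementation. It assumes fusion by element-wise averaging and an InfoNCE-style loss over cosine similarities; the names `fuse`, `cosine`, `contrastive_loss`, and the `temperature` parameter are hypothetical choices made for illustration. Classification losses on the original and fused representations, which the abstract describes as trained in parallel, are omitted here.

```python
import math


def fuse(original, copy):
    """Fuse two feature vectors by element-wise averaging.

    Assumption: the abstract does not name the fusion operator;
    averaging is used here purely for illustration.
    """
    return [(a + b) / 2.0 for a, b in zip(original, copy)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(originals, fused, temperature=0.1):
    """InfoNCE-style loss (an assumed form, not taken from the paper).

    For sample i, the pair (originals[i], fused[i]) is the positive;
    fused[j] for j != i act as negatives within the batch.
    """
    total = 0.0
    for i, anchor in enumerate(originals):
        logits = [cosine(anchor, f) / temperature for f in fused]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(originals)


# Toy batch: two samples with 2-dimensional features from the model
# and from its copy (values are made up for illustration).
originals = [[1.0, 0.0], [0.0, 1.0]]
copies = [[0.9, 0.1], [0.1, 0.9]]
fused = [fuse(o, c) for o, c in zip(originals, copies)]
loss = contrastive_loss(originals, fused)
```

In this toy batch, the loss is small when each original representation is paired with its own fused counterpart and grows when the pairing is shuffled, which is the pull-together, push-apart behaviour the abstract describes.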