
Publication

ColDBin: Cold Diffusion for Document Image Binarization

Saifullah Saifullah; Stefan Agne; Andreas Dengel; Sheraz Ahmed
In: Document Analysis and Recognition - ICDAR 2023. The 17th International Conference on Document Analysis and Recognition (ICDAR-2023), August 21-26, San José, California, USA, Springer Nature Switzerland, 8/2023.

Abstract

Document images, when captured in real-world settings, either modern or historical, frequently exhibit various forms of degradation such as ink stains, smudges, faded text, and uneven illumination, which can significantly impede the performance of deep learning-based approaches for document processing. In this paper, we propose a novel end-to-end framework for binarization of degraded document images based on cold diffusion. In particular, our approach involves training a diffusion model with the objective of generating a binarized document image directly from a degraded input image. To the best of the authors’ knowledge, this is the first work that investigates diffusion models for the task of document binarization. In order to assess the effectiveness of our approach, we evaluate it on 9 different benchmark datasets for document binarization. The results of our experiments show that our proposed approach outperforms several existing state-of-the-art approaches, including complex approaches utilizing generative adversarial networks (GANs) and variational auto-encoders (VAEs), on 7 of the datasets, while achieving comparable performance on the remaining 2 datasets. Our findings suggest that diffusion models can be an effective tool for document binarization tasks and pave the way for future research on diffusion models for document image enhancement tasks. The implementation code of our framework is publicly available at: https://github.com/saifullah3396/coldbin.
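
As a rough illustration of the idea in the abstract, the following is a minimal sketch of cold-diffusion-style sampling, not the authors' implementation (see the linked repository for that). It assumes a deterministic degradation that linearly interpolates between a clean binarized image and its degraded counterpart, and uses the general cold diffusion update x_{t-1} = x_t - D(x0_hat, t) + D(x0_hat, t-1). The names `degrade`, `sample`, and `restore`, and the linear schedule, are hypothetical choices made for this example; in practice `restore` would be a trained network.

```python
import numpy as np


def degrade(clean, degraded, t, num_steps):
    """Hypothetical degradation operator D(x0, t).

    Linearly blends the clean image toward the degraded image:
    t = 0 returns `clean`, t = num_steps returns `degraded`.
    """
    alpha = t / num_steps
    return (1.0 - alpha) * clean + alpha * degraded


def sample(degraded, restore, num_steps):
    """Iteratively move from the degraded image toward a clean estimate.

    `restore(x, t)` is assumed to return an estimate of the clean image
    given the intermediate image `x` at step `t`.
    """
    x = degraded.astype(np.float64)
    for t in range(num_steps, 0, -1):
        x0_hat = restore(x, t)
        # Remove the degradation at level t, re-apply it at level t - 1.
        x = (x
             - degrade(x0_hat, degraded, t, num_steps)
             + degrade(x0_hat, degraded, t - 1, num_steps))
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "ground truth": a random binary image.
    clean = (rng.random((8, 8)) > 0.5).astype(np.float64)
    # Toy degraded input: reduced contrast plus additive noise.
    degraded = 0.6 * clean + 0.2 * rng.random((8, 8))
    # An oracle restorer stands in for a trained model (assumption).
    result = sample(degraded, lambda x, t: clean, num_steps=10)
    print(np.allclose(result, clean))
```

With an oracle restorer the loop walks exactly back along the degradation path, which makes the sketch easy to check; with a learned restorer the same update is intended to tolerate imperfect estimates at intermediate steps.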
