Publication
International Conference on Computer Vision, ICCV 2025
Talha Uddin Sheikh; Sankalp Sinha; Shino Sam; Didier Stricker; Muhammad Zeshan Afzal (Eds.)
International Conference on Computer Vision (ICCV-2025), October 19-23, Honolulu, Hawaii, USA, IEEE/CVF, 10/2025.
Abstract
In this paper, we introduce the first dataset specifically designed for zero-shot learning (ZSL) and out-of-distribution (OOD) detection in document images, helping to advance the field of document classification. Traditional document classification systems often struggle with inputs that deviate from their training distributions, a critical challenge for achieving robustness and generalizability. While the RVL-CDIP corpus serves as a standard benchmark due to its extensive scale, it primarily supports in-distribution evaluations. Furthermore, although a small dataset addresses OOD tasks for document images, none simultaneously incorporates elements of both zero-shot and OOD detection. Our dataset, an extension of RVL-CDIP, fills this gap by facilitating both OOD detection and zero-shot image classification. It comprises approximately 38,000 images across ten classes: five overlapping with traditional RVL-CDIP categories and five completely new. This setup enables researchers to rigorously test both OOD detection and ZSL capabilities, providing a robust framework for evaluating the resilience of document classifiers in supervised and zero-shot settings. We present comprehensive benchmarking results, using state-of-the-art models, to demonstrate the utility of our dataset for developing and assessing document classification systems that are both robust and generalizable. The dataset is publicly available.
