Publication
Benchmarking Deep Learning Models for Classification of Book Covers
Adriano Lucieri; Huzaifa Sabir; Muhammad Shoaib Ahmed Siddiqui; Syed Tahseen Raza Rizvi; Brian Kenji Iwana; Seiichi Uchida; Andreas Dengel; Sheraz Ahmed
In: Springer Singapore (Ed.). SN Computer Science, Vol. 1, No. 3, Pages 1-16, Springer, Singapore, 5/2020.
Abstract
Book covers usually provide a good depiction of a book's content and its central idea. The classification of books into their respective genres usually involves subjectivity and contextuality. Book retrieval systems would benefit greatly from an automated framework that is able to classify a book's genre based on an image, specifically for archival documents where digitization of the complete book for the purpose of indexing is an expensive task. While various modalities are available (e.g., cover, title, author, abstract), benchmarking image-based classification systems based on minimal information is a particularly exciting field due to the recent advancements in the domain of image-based deep learning and its applicability. A natural question therefore arises regarding the plausibility of solving the problem of book classification by only utilizing an image of its cover along with the current state-of-the-art deep learning models. To answer this question, this paper makes a three-fold contribution. First, the publicly available book cover dataset comprising 57k book covers belonging to 30 different categories is thoroughly analyzed and corrected. Second, it benchmarks the performance of a battery of state-of-the-art image classification models on the task of book cover classification. Third, it uses explicit attention mechanisms to identify the regions that the network focused on in order to make the prediction. All of our evaluations were performed on a subset of the mentioned public book cover dataset. Analysis of the results revealed the inefficacy of even the most powerful models for solving the classification task. The obtained results make it evident that significant efforts need to be devoted in order to solve this image-based classification task to a satisfactory level.