Publication
The Effect of Border Noise on the Performance of Projection-Based Page Segmentation Methods
Faisal Shafait; Thomas Breuel
In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 33, No. 4, Pages 846-851, IEEE, 4/2011.
Abstract
Projection methods have been used in the analysis of bi-tonal document images for different tasks
like page segmentation and skew correction for over two decades. However, these algorithms are sensitive
to the presence of border noise in document images. Border noise can appear along the page border
due to scanning or photocopying. Over the years, several page segmentation algorithms have been
proposed in the literature. Some of these algorithms have come to widespread use due to their high
accuracy and robustness with respect to border noise. This paper addresses two important questions in
this context: 1) Can existing border noise removal algorithms clean up document images to a degree
required by projection methods to achieve competitive performance? 2) Can projection methods reach
the performance of other state-of-the-art page segmentation algorithms (e.g., Docstrum or Voronoi) for
documents where border noise has successfully been removed? We perform extensive experiments on
the University of Washington (UW-III) dataset with six border noise removal methods. Our results show
that although projection methods can achieve the accuracy of other state-of-the-art algorithms on the
cleaned document images, existing border noise removal techniques cannot clean up documents captured
under a variety of scanning conditions to the degree required to achieve that accuracy.
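
Illustration
The sensitivity to border noise mentioned in the abstract can be seen in a toy example. The sketch below is not the authors' implementation or any of the evaluated algorithms. It is a minimal illustration that assumes a binary image stored as a list of rows (1 = foreground) and uses hypothetical helper names (row_profile, split_at_valleys). It splits a page at empty rows of the horizontal projection profile, then repeats the split after adding a one-pixel black stripe along the left edge.

```python
def row_profile(image):
    """Count foreground (1) pixels in each row of a binary image."""
    return [sum(row) for row in image]


def split_at_valleys(profile, threshold=0):
    """Return (start, end) ranges of consecutive rows whose count exceeds threshold.

    Ranges are half-open: rows start .. end-1 belong to the block.
    """
    blocks, start = [], None
    for i, count in enumerate(profile):
        if count > threshold and start is None:
            start = i                      # a block begins
        elif count <= threshold and start is not None:
            blocks.append((start, i))      # a valley closes the block
            start = None
    if start is not None:                  # block runs to the last row
        blocks.append((start, len(profile)))
    return blocks


# Toy page (assumed data): two text lines separated by blank rows.
clean = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Same page with a black stripe down the left edge, imitating border noise.
noisy = [[1] + row[1:] for row in clean]

clean_blocks = split_at_valleys(row_profile(clean))
noisy_blocks = split_at_valleys(row_profile(noisy))

print(clean_blocks)  # two separate text lines
print(noisy_blocks)  # one merged block: the stripe fills every valley
```

On the clean page the profile has empty rows between the text lines, so the two lines are separated. With the stripe present, every row contains at least one foreground pixel, no valley remains at threshold 0, and the whole page collapses into a single block. Raising the threshold can help in this toy case, but it is a choice of this sketch rather than something the abstract describes.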