Publication
Coupled Snakelets for Curled Text-Line Segmentation from Warped Document Images
Syed Saqib Bukhari; Faisal Shafait; Thomas Breuel
In: International Journal on Document Analysis and Recognition (IJDAR), Vol. 15, Pages 1-16, Springer, 2012.
Abstract
Camera-captured, warped document images usually contain curled text-lines because of distortions caused
by camera perspective view and page curl. Warped document
images can be transformed into planar document images for
improving optical character recognition accuracy and human
readability using monocular dewarping techniques. Curled
text-line segmentation is a crucial initial step for most
monocular dewarping techniques. Existing curled text-line segmentation approaches are sensitive to geometric and
perspective distortions. In this paper, we introduce a novel
curled text-line segmentation algorithm that adapts active
contours (snakes). Our algorithm performs text-line segmentation by estimating pairs of x-lines and baselines. It estimates a
local pair of x-line and baseline on each connected component by jointly tracing top and bottom points of neighboring
connected components; finally, each group of overlapping pairs is treated as a segmented text-line. Our algorithm achieves a curled text-line segmentation accuracy of
above 95% on the DFKI-I (CBDAR 2007 dewarping contest)
dataset, which is significantly better than previously reported
results on this dataset.
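
The final grouping step described in the abstract (each group of overlapping x-line/baseline pairs becomes one text-line) can be illustrated with a minimal sketch. This is not the authors' implementation: the snake-based estimation of the pairs is not shown, and the `Band` representation, the overlap criterion, and the function names `flat_band`, `bands_overlap`, and `group_into_text_lines` are assumptions made only for illustration. Each local pair is assumed to be already sampled at integer x positions, and two pairs are assumed to overlap if their vertical ranges intersect at some shared x position.

```python
from typing import Dict, List, Tuple

# Hypothetical representation of one local (x-line, baseline) pair:
# maps an x coordinate to (y_xline, y_baseline), with y growing downward.
Band = Dict[int, Tuple[float, float]]


def flat_band(x0: int, x1: int, top: float, bottom: float) -> Band:
    """Illustrative helper: a straight band covering columns x0..x1."""
    return {x: (top, bottom) for x in range(x0, x1 + 1)}


def bands_overlap(a: Band, b: Band) -> bool:
    """Assumed criterion: the bands share a column where their
    vertical ranges [x-line, baseline] intersect."""
    for x in a.keys() & b.keys():
        a_top, a_bottom = a[x]
        b_top, b_bottom = b[x]
        if max(a_top, b_top) <= min(a_bottom, b_bottom):
            return True
    return False


def group_into_text_lines(bands: List[Band]) -> List[List[int]]:
    """Group band indices into connected sets of overlapping bands
    using a small union-find structure."""
    parent = list(range(len(bands)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(bands)):
        for j in range(i + 1, len(bands)):
            if bands_overlap(bands[i], bands[j]):
                parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(len(bands)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    bands = [
        flat_band(0, 5, 10.0, 20.0),   # first component, upper line
        flat_band(3, 8, 11.0, 21.0),   # neighboring component, upper line
        flat_band(0, 8, 40.0, 50.0),   # component on a lower line
    ]
    print(group_into_text_lines(bands))
```

The pairwise comparison is quadratic in the number of components, which keeps the sketch short; a real system would likely restrict comparisons to spatially neighboring components.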