Publication
Ontology-Based Information Extraction from Handwritten Documents
Sebastian Ebert; Marcus Liwicki; Andreas Dengel
In: Proceedings of the 12th International Conference on Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR-10), November 16-18, Kolkata, India, Pages 483-488, 2010.
Abstract
In this paper we introduce a new layer for the task
of handwriting recognition. We add semantic information
by means of ontologies. The task of our recognizer
therefore is not only to recognize the ASCII
transcription of the handwritten document, but also to
identify the semantic concepts which appear in the text.
This task is called ontology-based information extraction
(OBIE), which has recently been applied to electronic
documents. OBIE methods first segment the text
into tokens, then identify their values and their corresponding
instances of the ontology, and finally try to
generate new facts based on the text. To the authors’
knowledge, this paper is the first to propose OBIE in the
handwriting recognition literature. In our experiments we
have evaluated the process up to the instantiation step. We
have found that using not only the top alternative, but
also the k-best alternatives increases the performance
of information extraction. Furthermore, the use of an
ontology-based lexicon results in another performance
increase.
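
The instantiation step with k-best alternatives can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon entries, the instance identifiers, the recognizer output, and the function names `instantiate` and `extract` are all hypothetical and chosen only for illustration. The sketch assumes that the recognizer returns a ranked list of alternatives per token and that the ontology-based lexicon maps surface forms to ontology instances.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical ontology-based lexicon: lower-cased surface form -> instance id.
ONTOLOGY_LEXICON: Dict[str, str] = {
    "kolkata": "City:Kolkata",
    "india": "Country:India",
}


def instantiate(alternatives: List[str], k: int) -> Optional[Tuple[str, str]]:
    """Return (surface form, instance) for the best-ranked of the top-k
    alternatives that appears in the lexicon, or None if none matches."""
    for candidate in alternatives[:k]:
        instance = ONTOLOGY_LEXICON.get(candidate.lower())
        if instance is not None:
            return candidate, instance
    return None


def extract(kbest_per_token: List[List[str]], k: int) -> List[Tuple[str, str]]:
    """Run instantiation over every token and keep only the matches."""
    results = []
    for alternatives in kbest_per_token:
        match = instantiate(alternatives, k)
        if match is not None:
            results.append(match)
    return results


# Made-up recognizer output: one ranked list of alternatives per token.
KBEST = [
    ["Kolkate", "Kolkata", "Kolkota"],  # top alternative is misrecognized
    ["in", "is", "im"],
    ["India", "Indie", "Indio"],
]

top1 = extract(KBEST, k=1)  # uses only the top alternative
top3 = extract(KBEST, k=3)  # also considers lower-ranked alternatives
print(top1)
print(top3)
```

In this toy input the first token is misrecognized at rank one, so the top-1 run misses the city while the top-3 run recovers it from the second alternative. This mirrors, in a simplified way, why considering k-best alternatives can help extraction.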