Publication
Preprocessing for Unification Parsing of Spoken Language
Mark-Jan Nederhof
In: Dimitris Christodoulakis (ed.). Proceedings of Natural Language Processing - NLP 2000, June 2-4, Conference on Applied Natural Language Processing, Patras, Greece, pages 118-129, Lecture Notes in Artificial Intelligence, No. 1835, Springer, 2000.
Abstract
Wordgraphs are structures that may be output by speech recognizers. We discuss various methods for turning wordgraphs into smaller structures. One of these methods is novel: it relies on a new kind of determinization of acyclic weighted finite automata that is language-preserving but not fully weight-preserving, and it results in smaller automata than traditional determinization of weighted finite automata does. We present empirical data comparing the respective methods. The methods are relevant for systems in which wordgraphs form the input to very time-consuming kinds of syntactic analysis, such as unification parsing.
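The trade-off named in the abstract, keeping the language while giving up exact weights, can be illustrated with a small sketch. The code below is not the algorithm of the paper, which this record does not describe. It is an ordinary subset construction on an acyclic wordgraph, with the assumption that each merged transition simply keeps the lowest cost among the arcs it replaces. The representation (tuples of source, word, cost, destination) and the function names `determinize` and `path_cost` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def determinize(start, finals, transitions):
    """Subset construction on an acyclic wordgraph.

    transitions: list of (src, word, cost, dst); lower cost is better.
    Assumption for this sketch: a merged arc keeps the minimum cost of
    the arcs it replaces, so the language is preserved but path costs
    become lower bounds rather than exact values.
    """
    # Index outgoing arcs by source state.
    out = defaultdict(list)
    for src, word, cost, dst in transitions:
        out[src].append((word, cost, dst))

    start_set = frozenset([start])
    det = {}  # (subset, word) -> (cost, target subset)
    seen = {start_set}
    agenda = [start_set]
    while agenda:
        subset = agenda.pop()
        # Group all arcs leaving any member of the subset by their word.
        by_word = defaultdict(list)
        for state in subset:
            for word, cost, dst in out[state]:
                by_word[word].append((cost, dst))
        for word, arcs in by_word.items():
            target = frozenset(dst for _, dst in arcs)
            det[(subset, word)] = (min(cost for cost, _ in arcs), target)
            if target not in seen:
                seen.add(target)
                agenda.append(target)

    final_sets = {s for s in seen if s & set(finals)}
    return start_set, final_sets, det


def path_cost(det_automaton, words):
    """Return the cost of a word sequence, or None if it is not accepted."""
    state, final_sets, det = det_automaton
    total = 0.0
    for word in words:
        if (state, word) not in det:
            return None
        cost, state = det[(state, word)]
        total += cost
    return total if state in final_sets else None


# A tiny wordgraph with two competing hypotheses for the first word.
arcs = [
    (0, "a", 1.0, 1),
    (0, "a", 3.0, 2),
    (1, "b", 5.0, 3),
    (2, "b", 1.0, 3),
    (2, "c", 1.0, 3),
]
automaton = determinize(0, {3}, arcs)
print(path_cost(automaton, ["a", "b"]))  # 2.0
print(path_cost(automaton, ["a", "c"]))  # 2.0
print(path_cost(automaton, ["a"]))       # None
```

In the original graph the best path for "a b" costs 4.0 (3.0 + 1.0), while the determinized graph reports 2.0. The accepted word sequences are unchanged, but the costs are only lower bounds, which is one simple way a determinization can be language-preserving without being fully weight-preserving.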