Publication
Integrating Natural Language Processing Components with XML and XSLT
Ulrich Schäfer
ISBN 9783836490276, VDM Verlag Dr. Müller, Saarbrücken, 4/2008.
Abstract
This book describes novel software architectures for the integration of deep and shallow natural language processing (NLP) components in language technology. The generic markup language XML and the XML transformation language XSLT are used to flexibly combine the linguistic markup produced by multiple NLP components. Shallow NLP components such as tokenizers, part-of-speech taggers, named entity recognizers and shallow parsers are combined with a deep parser that operates on grammars written in the spirit of Head-Driven Phrase Structure Grammar (HPSG) theory. The integration paradigm enables synergy, leading to more robust deep parsing with increased coverage. It also constitutes a division of labor: the deep grammar models general, correct language use, while shallow systems are responsible for domain-specific extensions. Applications are presented in question answering, information extraction, natural language understanding, ontologies and the Semantic Web. The book is addressed to software engineers, computational linguists and language technology engineers.
Projects
- HyLAP - Hybrid language processing technologies for a personal associative information access and management application
- QUETAL - A multi-lingual template-based question/answering system for exploring large collections of annotated free text documents
- WHITEBOARD - Multilevel Annotation for Dynamic Free Text Processing