Publication
Prepositions in Applications: A Survey and Introduction to the Special Issue
Timothy Baldwin; Valia Kordoni; Aline Villavicencio
In: Robert Dale (Ed.). Computational Linguistics (CL), Vol. 35, No. 2, Pages 119-149, MIT Press, 2009.
Abstract
Prepositions - as well as prepositional phrases (PPs) and markers of various sorts -
have a mixed history in computational linguistics (CL), as well as related fields such as
artificial intelligence, information retrieval (IR), and computational psycholinguistics:
On the one hand they have been championed as being vital to precise language understanding
(e.g., in information extraction), and on the other they have been ignored
on the grounds of being syntactically promiscuous and semantically vacuous, and
relegated to the ignominious rank of "stop word" (e.g., in text classification and IR).
Although NLP in general has benefitted from advances in those areas where prepositions
have received attention, there are still many issues to be addressed. For example,
in machine translation, generating a preposition (or "case marker" in languages such
as Japanese) incorrectly in the target language can lead to critical semantic divergences
over the source language string. Equivalently, in information retrieval and information
extraction, it would seem desirable to be able to predict that book on NLP and book about
NLP mean largely the same thing, but paranoid about drugs and paranoid on drugs suggest
very different things.
Prepositions are often among the most frequent words in a language. For example,
based on the British National Corpus (BNC; Burnard 2000), four out of the top-ten
most-frequent words in English are prepositions (of, to, in, and for). In terms of both
parsing and generation, therefore, accurate models of preposition usage are essential to
avoid repeatedly making errors. Despite their frequency, however, they are notoriously
difficult to master, even for humans (Chodorow, Tetreault, and Han 2007). For example,
Lindstromberg (2001) estimates that less than 10% of upper-level English as a Second
Language (ESL) students can use and understand prepositions correctly, and Izumi et al.
(2003) reported error rates of English preposition usage by Japanese speakers of up
to 10%.
The purpose of this special issue is to showcase recent research on prepositions
across the spectrum of computational linguistics, focusing on computational syntax and
semantics. More importantly, however, we hope to reignite interest in the systematic
treatment of prepositions in applications. To this end, this article is intended to present
a cross-section view of research on prepositions and their use in NLP applications. We
begin by outlining the syntax of prepositions and its relevance to NLP applications,
focusing on PP attachment and prepositions in multiword expressions (Section 2). Next,
we discuss formal and lexical semantic aspects of prepositions, and again their relevance
to NLP applications (Section 3), and describe instances of applied research where
prepositions have featured prominently (Section 4). Finally, we outline the contributions
of the papers included in this special issue (Section 5) and conclude with a discussion of
research areas relevant to prepositions which we believe are ripe for further exploration
(Section 6).