Publication
Automatized Merging of Italian Lexical Resources
Thierry Declerck; Stefania Racioppa; Karlheinz Mörth
In: Núria Bel; Maria Gavrilidou; Monica Monachini; Valeria Quochi; Laura Rimell (eds.). Proceedings of the LREC 2012 Workshop on Language Resource Merging. International Conference on Language Resources and Evaluation (LREC-12), 8th, located at LREC, May 22, Istanbul, Turkey, ELRA, Paris, 5/2012.
Abstract
In the context of a recently started European project, TrendMiner, there is a need for large lexical coverage of various languages,
including Italian. The lexicon should include morphological, syntactic and semantic information, as well as features for
representing the level of opinion or sentiment that can be expressed by the lexical entries. Since no such ready-to-use lexicon exists yet,
we investigated the possibility of accessing and merging various Italian lexical resources. Our point of departure was the freely available
Morph-it! lexicon, which contains inflected forms with their lemmas and morphological features. We transformed the textual
format of Morph-it! into a database schema in order to support the integration process with other resources. We then considered Italian
lexicon entries available in various versions of Wiktionary for adding further information, such as the origin, uses and senses of the entries.
We explore the need for a standardized representation of lexical resources in order to better integrate the lexical
information from the distinct sources, and we also describe a first conversion of the lexical information into a computational lexicon.
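
As an illustration of the step that moves the textual Morph-it! format into a database schema, the following is a minimal sketch and not the authors' implementation. It assumes a tab-separated layout of form, lemma and a feature tag in which the part of speech precedes a colon; the sample lines, the table name `entry`, its columns, and the helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical sample in an assumed Morph-it!-style layout:
# inflected form, lemma, feature tag (tab-separated).
SAMPLE = (
    "case\tcasa\tNOUN-F:p\n"
    "casa\tcasa\tNOUN-F:s\n"
    "belle\tbello\tADJ:pos+f+p\n"
)


def parse_line(line):
    """Split one line into (form, lemma, pos, features)."""
    form, lemma, tag = line.rstrip("\n").split("\t")
    # Assumption: everything before the first colon is the part of speech.
    pos, _, features = tag.partition(":")
    return form, lemma, pos, features


def build_db(text):
    """Load the parsed lines into an in-memory SQLite table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entry (form TEXT, lemma TEXT, pos TEXT, features TEXT)"
    )
    rows = [parse_line(line) for line in text.splitlines() if line.strip()]
    conn.executemany("INSERT INTO entry VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def forms_of(conn, lemma):
    """Return all inflected forms stored for a lemma, sorted by form."""
    cur = conn.execute(
        "SELECT form, pos, features FROM entry WHERE lemma = ? ORDER BY form",
        (lemma,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_db(SAMPLE)
    for row in forms_of(conn, "casa"):
        print(row)
```

Keying the table on the lemma is one plausible way to make later joins with entries from other resources, such as Wiktionary, straightforward; the actual schema used in the paper may differ.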