Publication
Bootstrapping Relation Extraction from Semantic Seeds
Feiyu Xu
PhD-Thesis, Saarland University, 2007.
Abstract
Information Extraction (IE) is a technology for localizing and classifying pieces of relevant information in unstructured natural language texts and detecting relevant relations among them. This thesis deals with one of the central tasks of IE, i.e., relation extraction. The goal is to provide a general framework that automatically learns mappings between linguistic analyses and target semantic relations, with minimal human intervention. Furthermore, this framework is supposed to support the adaptation to new application domains and new relations with various complexities. The central result is a new approach to relation extraction which is based on a minimally supervised method for automatically learning extraction grammars from a large collection of parsed texts, initialized by some instances of the target relation, called the semantic seed. Due to the semantic seed approach, the framework can accommodate new relation types and domains with minimal effort. It supports relations of different arity as well as their projections. Furthermore, this framework is general enough to employ any linguistic analysis tools that provide the required type and depth of analysis.