Publication
Analysis and Improvement of Minimally Supervised Machine Learning for Relation Extraction
Hans Uszkoreit; Feiyu Xu; Hong Li
In: 14th International Conference on Applications of Natural Language to Information Systems (NLDB-09), June 23-26, Saarbrücken, Germany, Springer, 2009.
Abstract
The main contribution of this paper is a systematic analysis
of a minimally supervised machine learning method for relation extraction
grammars. The method is based on a bootstrapping approach in
which the bootstrapping is triggered by semantic seeds. The starting
point of our analysis is the pattern-learning graph which is a subgraph
of the bipartite graph representing all connections between linguistic patterns
and relation instances exhibited by the data. It is shown that the
performance of such a general learning framework for actual tasks is dependent
on certain properties of the data and on the selection of seeds.
Several experiments have been conducted to gain explanatory insights
into the interaction of these two factors. From the investigation of more
effective seeds and benevolent data we understand how to improve the
learning in less fortunate configurations. A relation extraction method
based only on positive examples cannot avoid all false positives, especially
when the data properties yield a high recall. Therefore, negative
seeds are employed to learn negative patterns, which boost precision.
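
The seed-driven bootstrapping over a bipartite pattern-instance graph described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `bootstrap`, the representation of the data as `(pattern, instance)` pairs, and the rule that any pattern matching a negative seed counts as a negative pattern are all assumptions made for illustration. The sketch also omits pattern ranking and any linguistic analysis.

```python
from collections import defaultdict


def bootstrap(edges, positive_seeds, negative_seeds=frozenset(), max_iterations=10):
    """Toy bootstrapping over a bipartite pattern-instance graph.

    edges: iterable of (pattern, instance) pairs observed in the data.
    Returns the learned patterns and the extracted instances.
    """
    patterns_of = defaultdict(set)   # instance -> patterns that match it
    instances_of = defaultdict(set)  # pattern -> instances it matches
    for pattern, instance in edges:
        patterns_of[instance].add(pattern)
        instances_of[pattern].add(instance)

    # Assumption: a pattern that matches any negative seed is a negative pattern.
    negative_patterns = set()
    for seed in negative_seeds:
        negative_patterns |= patterns_of.get(seed, set())

    known_patterns = set()
    known_instances = set(positive_seeds)
    frontier = set(positive_seeds)

    for _ in range(max_iterations):
        # Step 1: collect patterns connected to the newest instances.
        new_patterns = set()
        for instance in frontier:
            new_patterns |= patterns_of.get(instance, set())
        new_patterns -= known_patterns
        new_patterns -= negative_patterns  # negative patterns are never learned
        if not new_patterns:
            break
        known_patterns |= new_patterns

        # Step 2: collect instances connected to the newly learned patterns.
        new_instances = set()
        for pattern in new_patterns:
            new_instances |= instances_of[pattern]
        new_instances -= known_instances
        new_instances -= set(negative_seeds)
        if not new_instances:
            break
        known_instances |= new_instances
        frontier = new_instances

    return known_patterns, known_instances
```

In this toy setting, the patterns and instances reachable from the seeds form the learned subgraph, so an instance that shares no pattern path with any seed is never found. That is one way to see why results depend on both the connectivity of the data and the choice of seeds. Passing a negative seed blocks every pattern that matches it, which trades some recall for precision.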