Publication
Information Extraction from German Patient Records via Hybrid Parsing and Relation Extraction Strategies
Hans-Ulrich Krieger; Christian Spurk; Hans Uszkoreit; Feiyu Xu; Yi Zhang; Frank Müller; Thomas Tolxdorff
In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC-2014). European Language Resources Association, 2014.
Abstract
In this paper, we report on first attempts at, and findings from,
analyzing German patient records using a hybrid parsing
architecture and a combination of two relation extraction
strategies.
On a practical level, we are interested in the extraction
of concepts and relations among those concepts, a necessary
cornerstone for building medical information systems.
The parsing pipeline consists of a morphological analyzer,
a robust chunk parser adapted to Latin phrases used in
medical diagnosis, a repair rule stage, and a probabilistic
context-free parser that respects the output from the
chunker.
The relation extraction stage is a combination of two systems:
SProUT, a shallow processor that uses hand-written rules to
discover relation instances in local text units, and DARE,
which extracts relation instances from complete sentences using
rules that are learned in a bootstrapping process starting from
semantic seeds.
Two small experiments have been carried out for the parsing
pipeline and the relation extraction stage.
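
The idea of learning extraction rules by bootstrapping from seeds can be illustrated with a toy sketch. This is not DARE's implementation or rule format: the function names, the English example sentences, and the use of plain token sequences between two arguments as "rules" are all assumptions made for illustration. The sketch only shows the general loop, in which known relation instances yield patterns and patterns yield new instances, until nothing new is found.

```python
# Toy bootstrapping loop (hypothetical; not DARE's actual rule representation).
# A "rule" here is simply the token sequence found between two known arguments.

def learn_patterns(sentences, known_pairs):
    """Collect the tokens between the two arguments of any known pair."""
    patterns = set()
    for toks in sentences:
        for a, b in known_pairs:
            if a in toks and b in toks:
                i, j = toks.index(a), toks.index(b)
                if j - i > 1:  # skip adjacent arguments (empty pattern)
                    patterns.add(tuple(toks[i + 1:j]))
    return patterns


def apply_patterns(sentences, patterns):
    """Return (left, right) token pairs surrounding any pattern match."""
    pairs = set()
    for toks in sentences:
        for pat in patterns:
            n = len(pat)
            for k in range(1, len(toks) - n):
                if tuple(toks[k:k + n]) == pat:
                    pairs.add((toks[k - 1], toks[k + n]))
    return pairs


def bootstrap(sentences, seeds, max_rounds=5):
    """Alternate pattern learning and extraction until no new pairs appear."""
    known = set(seeds)
    for _ in range(max_rounds):
        found = apply_patterns(sentences, learn_patterns(sentences, known))
        if found <= known:
            break
        known |= found
    return known


# Invented example data, for illustration only.
SENTENCES = [
    "Ibuprofen was given for fever".split(),
    "Aspirin was given for headache".split(),
    "Aspirin relieves headache".split(),
    "Codeine relieves cough".split(),
]

if __name__ == "__main__":
    for pair in sorted(bootstrap(SENTENCES, {("Ibuprofen", "fever")})):
        print(pair)
```

Starting from the single seed pair, the first round learns one pattern and finds a second pair; that pair then exposes a second pattern in another sentence, which in turn yields a third pair. A real system would also need to score or filter learned rules to limit the spread of wrong extractions, which this sketch omits.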