Publication
A Data-driven Approach for Noise Reduction in Distantly Supervised Biomedical Relation Extraction
Saadullah Amin; Katherine Dunfield; Anna Vechkaeva; Günter Neumann
In: BioNLP 2020 Workshop on Biomedical Natural Language Processing. Workshop on Current Trends in Biomedical Natural Language Processing (BioNLP-2020), located at The 58th Annual Meeting of the Association for Computational Linguistics, July 9, ACL, 2020.
Abstract
Fact triples are a common form of structured knowledge used within the biomedical domain. As the amount of unstructured scientific texts continues to grow, manual annotation of these texts for the task of relation extraction becomes increasingly expensive. Distant supervision offers a viable approach to combat this by quickly producing large amounts of labeled, but considerably noisy, data. We aim to reduce such noise by extending an entity-enriched relation classification BERT model to the problem of multiple instance learning, and defining a simple data encoding scheme that significantly reduces noise, reaching state-of-the-art performance for distantly-supervised biomedical relation extraction. Our approach further encodes knowledge about the direction of relation triples, allowing for increased focus on relation learning by reducing noise and alleviating the need for joint learning with knowledge graph completion.
Projects
DEEPLEE - Deep Learning for End-to-End Applications in Language Technology,
PRECISE4Q - Personalised Medicine by Predictive Modeling in Stroke for better Quality of Life