Publication
Proceedings of TextGraphs-17: Graph-based Methods for Natural Language Processing
Noon Pokaratsiri; Saadullah Amin; Günter Neumann (Eds.)
Held at the 62nd Annual Meeting of the Association for Computational Linguistics (ACL-2024), August 11-16, Bangkok, Thailand, ISBN 979-8-89176-145-2, Association for Computational Linguistics (ACL), 8/2024.
Abstract
A common approach to automatically assigning diagnostic and procedural clinical codes to health records is to solve the task as a multi-label classification problem. Difficulties associated with this task stem from domain knowledge requirements, long document texts, and a large, imbalanced label space that reflects the breadth of, and dependencies between, medical diagnoses and procedures. Decisions in the healthcare domain also need to demonstrate sound reasoning, both when they are correct and when they are erroneous. Existing works address some of these challenges by incorporating external knowledge, which can be encoded into a graph-structured format. Incorporating graph structures on the output label space, or between the input document and output label spaces, has shown promising results in medical code classification. Less attention has been paid to graph-based representations of the input document space. To partially bridge this gap, we represent clinical texts as graph-structured data through the UMLS Metathesaurus; we explore implicit graph representation through pre-trained knowledge graph embeddings and explicit, domain-knowledge-guided encoding of document concepts and relational information through graph neural networks. Our findings highlight the benefits of pre-trained knowledge graph embeddings in understanding the model's attention-based reasoning. In contrast, the transparent domain knowledge guidance offered by graph encoder approaches is overshadowed by performance loss. Our qualitative analysis identifies limitations that contribute to prediction errors.