
Publication

Fine-grained Named Entity Recognition in Legal Documents

Elena Leitner; Georg Rehm; Julián Moreno-Schneider
In: Maribel Acosta; Philippe Cudré-Mauroux; Maria Maleshkova; Tassilo Pellegrini; Harald Sack; York Sure-Vetter (eds.). Semantic Systems. The Power of AI and Knowledge Graphs. Proceedings of the 15th International Conference (SEMANTiCS 2019). ACM International Conference on Semantic Systems (SEMANTiCS-2019), Karlsruhe, Germany, Pages 272-287, Lecture Notes in Computer Science, No. 11702, Springer, 9/2019.

Abstract

This paper describes an approach to Named Entity Recognition (NER) in German-language documents from the legal domain. For this purpose, a dataset consisting of German court decisions was developed. The source texts were manually annotated with 19 semantic classes: person, judge, lawyer, country, city, street, landscape, organization, company, institution, court, brand, law, ordinance, European legal norm, regulation, contract, court decision, and legal literature. The dataset consists of approx. 67,000 sentences and contains 54,000 annotated entities. The 19 fine-grained classes were automatically generalised to seven more coarse-grained classes (person, location, organization, legal norm, case-by-case regulation, court decision, and legal literature). Thus, the dataset includes two annotation variants, i.e., coarse- and fine-grained. For the task of NER, Conditional Random Fields (CRFs) and bidirectional Long-Short Term Memory Networks (BiLSTMs) were applied to the dataset as state-of-the-art models. Three different models were developed for each of these two model families and tested with the coarse- and fine-grained annotations. The BiLSTM models achieve the best performance with a 95.46 F1 score for the fine-grained classes and 95.95 for the coarse-grained ones. The CRF models reach a maximum of 93.23 for the fine-grained classes and 93.22 for the coarse-grained ones. The work presented in this paper was carried out under the umbrella of the European project LYNX that develops a semantic platform that enables the development of various document processing and analysis applications for the legal domain.
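The automatic generalisation from 19 fine-grained to seven coarse-grained classes can be pictured as a simple label lookup. The following Python sketch is illustrative only and is not the authors' implementation. The grouping is an assumption inferred from the order in which the abstract lists the classes. The BIO tag format (`B-`/`I-` prefixes, `O` for non-entities), the English label strings, and the names `COARSE_TO_FINE`, `FINE_TO_COARSE`, and `generalise` are likewise hypothetical.

```python
# Assumed grouping of the 19 fine-grained classes into the 7 coarse-grained
# ones, inferred from the listing order in the abstract (not confirmed there).
COARSE_TO_FINE = {
    "person": ["person", "judge", "lawyer"],
    "location": ["country", "city", "street", "landscape"],
    "organization": ["organization", "company", "institution", "court", "brand"],
    "legal norm": ["law", "ordinance", "European legal norm"],
    "case-by-case regulation": ["regulation", "contract"],
    "court decision": ["court decision"],
    "legal literature": ["legal literature"],
}

# Invert the grouping so each fine-grained label points to its coarse class.
FINE_TO_COARSE = {
    fine: coarse
    for coarse, fines in COARSE_TO_FINE.items()
    for fine in fines
}


def generalise(tags):
    """Map a sequence of BIO tags with fine-grained labels to coarse labels.

    The BIO scheme is an assumption made for this sketch.
    """
    coarse_tags = []
    for tag in tags:
        if tag == "O":
            # Tokens outside any entity stay unchanged.
            coarse_tags.append(tag)
        else:
            # Split only on the first hyphen: "B-judge" -> ("B", "judge").
            prefix, label = tag.split("-", 1)
            coarse_tags.append(f"{prefix}-{FINE_TO_COARSE[label]}")
    return coarse_tags
```

For instance, `generalise(["B-judge", "I-judge", "O", "B-city"])` yields `["B-person", "I-person", "O", "B-location"]`. Because the mapping is many-to-one, the coarse-grained variant can be derived from the fine-grained annotations without any further manual work, which is consistent with the two annotation variants described above.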
