Publication
Ordering Sentences and Paragraphs with Pre-trained Encoder-Decoder Transformers and Pointer Ensembles
Rémi Calizzano; Malte Ostendorff; Georg Rehm
In: Proceedings of the 21st ACM Symposium on Document Engineering (DocEng 2021). ACM Symposium on Document Engineering (DocEng-2021), Limerick, Ireland, Pages 1-9, Association for Computing Machinery, 8/2021.
Abstract
Passage ordering aims to maximise discourse coherence in document generation or document modification tasks such as summarisation or storytelling. This paper extends the passage ordering task from sentences to paragraphs, i.e., passages with multiple sentences. Increasing the passage length increases the task's difficulty. To account for this, we propose the combination of a pre-trained encoder-decoder Transformer model, namely BART, with variations of pointer networks. We empirically evaluate the proposed models for sentence and paragraph ordering. Our best model outperforms previous state-of-the-art methods by 0.057 Kendall's Tau on one of three sentence ordering benchmarks (arXiv, VIST, ROC-Story). For paragraph ordering, we construct two novel datasets from Wikipedia and CNN-DailyMail, on which we achieve 0.67 and 0.47 Kendall's Tau, respectively. The best model variation utilises multiple pointer networks in an ensemble-like fashion. We hypothesise that the use of multiple pointers better reflects the multitude of possible orders of paragraphs in more complex texts. Our code, data, and models are publicly available.
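To make the reported scores easier to interpret, the following is a minimal sketch of Kendall's Tau as it is commonly defined for ordering tasks: one minus twice the fraction of passage pairs that appear in the wrong relative order. It assumes both orderings contain the same passage ids with no ties. The function name `kendall_tau` and the example ids are hypothetical illustrations and are not taken from the authors' released code.

```python
from itertools import combinations


def kendall_tau(predicted_order, gold_order):
    """Kendall's Tau between two orderings of the same passage ids.

    Assumes no ties and identical id sets (an assumption of this sketch).
    Returns 1.0 for identical orders and -1.0 for fully reversed orders.
    """
    n = len(gold_order)
    if n < 2:
        # A single passage (or none) is trivially in the correct order.
        return 1.0
    # Map each passage id to its position in the gold order.
    gold_rank = {pid: i for i, pid in enumerate(gold_order)}
    # Express the predicted order as a sequence of gold positions.
    ranks = [gold_rank[pid] for pid in predicted_order]
    # Count pairs whose relative order disagrees with the gold order.
    inversions = sum(1 for a, b in combinations(ranks, 2) if a > b)
    total_pairs = n * (n - 1) / 2
    return 1.0 - 2.0 * inversions / total_pairs


if __name__ == "__main__":
    gold = [0, 1, 2, 3]
    print(kendall_tau([0, 1, 2, 3], gold))  # identical order
    print(kendall_tau([1, 0, 2, 3], gold))  # one swapped pair out of six
    print(kendall_tau([3, 2, 1, 0], gold))  # fully reversed order
```

Under this definition, a single swapped adjacent pair among four passages gives roughly 0.667, so a score of 0.67 on longer documents still allows a noticeable share of misordered pairs.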