Publication
Aligning Instruction-Tuned LLMs for Event Extraction with Multi-objective Reinforcement Learning
Omar Adjali; Siting Liang; Omair Shahzad Bhatti; Daniel Sonntag
In: Springer (Ed.). Advances in Information Retrieval. 48th European Conference on Information Retrieval (ECIR 2026), March 29 - April 2, Delft, Netherlands. ISBN 978-3-032-21300-6, Springer, Cham, 3/2026.
Abstract
Event extraction (EE) aims to identify event triggers and their corresponding arguments from unstructured text, providing structured knowledge essential for many downstream tasks. Despite the success of instruction-tuned Large Language Models (LLMs), current methods often produce inconsistent formats, semantically drifted outputs, and event types that deviate from predefined schemas. These issues arise partly because supervised fine-tuning relies on static loss functions that fail to reflect task-specific objectives such as schema alignment. To address these limitations, we introduce a reinforcement learning framework based on Group Relative Policy Optimization (GRPO) designed to optimize instruction-tuned LLMs for event and argument extraction. We propose three complementary reward functions: a format reward to enforce syntactic and structural validity, a BM25-based reward to enhance lexical and semantic consistency with the input text, and a task-specific supervision reward that directly aligns optimization with task-level performance. Extensive experiments on three standard EE datasets demonstrate that our approach consistently and significantly improves EE performance over strong baselines.
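To make the three-reward design concrete, the sketch below shows one way such rewards could be composed for GRPO-style training. It is an illustrative reconstruction, not the paper's implementation: the JSON event schema, the single-document BM25 simplification, and the weighting scheme are all assumptions.

```python
import json
from collections import Counter

def format_reward(output: str) -> float:
    """Reward 1.0 only if the output parses as the expected structure.

    Hypothetical schema: a JSON list of events, each a dict with
    "trigger" and "arguments" keys.
    """
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(parsed, list):
        return 0.0
    for event in parsed:
        if not isinstance(event, dict) or "trigger" not in event or "arguments" not in event:
            return 0.0
    return 1.0

def bm25_reward(output: str, source: str, k1: float = 1.5, b: float = 0.75) -> float:
    """BM25-style lexical overlap between output tokens and the source text.

    Simplification: the source sentence is treated as a one-document corpus,
    so the IDF term is constant and dropped; the score is normalized by the
    number of query terms to stay roughly in [0, k1 + 1].
    """
    doc = source.lower().split()
    query = set(output.lower().split())
    doc_len = len(doc)
    avgdl = doc_len  # single-document corpus: average length == doc length
    tf = Counter(doc)
    score = 0.0
    for term in query:
        f = tf.get(term, 0)
        if f == 0:
            continue
        score += (f * (k1 + 1)) / (f + k1 * (1 - b + b * doc_len / avgdl))
    return score / max(len(query), 1)

def combined_reward(output: str, source: str, task_score: float,
                    weights=(0.2, 0.3, 0.5)) -> float:
    """Weighted sum of format, BM25, and task-supervision rewards.

    `task_score` stands in for a task-level metric (e.g. trigger/argument F1
    against gold annotations); the weights here are illustrative.
    """
    return (weights[0] * format_reward(output)
            + weights[1] * bm25_reward(output, source)
            + weights[2] * task_score)
```

In a GRPO loop, `combined_reward` would be evaluated on each sampled completion in a group, and the group-relative advantages derived from these scalar rewards would drive the policy update.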
