Publication
Detecting when Users Disagree with Generated Captions
Omair Shahzad Bhatti; Harshinee Sriram; Abdulrahman Mohamed Selim; Cristina Conati; Michael Barz; Daniel Sonntag
In: Companion Proceedings of the 26th International Conference on Multimodal Interaction. ACM International Conference on Multimodal Interaction (ICMI-2024), November 4, San José, Costa Rica, Pages 195-203, ICMI Companion '24, ISBN 9798400704635, Association for Computing Machinery, New York, NY, USA, 2024.
Abstract
The pervasive integration of artificial intelligence (AI) into daily life has led to a growing interest in AI agents that can learn continuously. Interactive Machine Learning (IML) has emerged as a promising approach to meet this need by involving human experts in the model training process, often through iterative user feedback. However, repeated feedback requests can lead to frustration and reduced trust in the system. Hence, there is increasing interest in refining how these systems interact with users to ensure efficiency without compromising user experience. Our research investigates the potential of eye tracking data as an implicit feedback mechanism for detecting user disagreement with AI-generated captions in image captioning systems. We conducted a study with 30 participants using a simulated captioning interface and gathered their eye movement data as they assessed caption accuracy. The goal of the study was to determine whether eye tracking data can effectively predict user agreement or disagreement, thereby strengthening IML frameworks. Our findings reveal that, while eye tracking shows promise as a valuable feedback source, ensuring consistent and reliable model performance across diverse users remains a challenge.
Projects
MASTER - Mixed reality ecosystem for teaching robotics in manufacturing
No-IDLE - Interactive Deep Learning Enterprise