Publication
User Experience Design for Automatic Credibility Assessment of News Content About COVID-19
Konstantin Schulz; Jens Rauenbusch; Jan Fillies; Lisa Rutenburg; Dimitrios Karvelas; Georg Rehm
In: HCI International 2022 - Late Breaking Papers. 24th International Conference on Human-Computer Interaction (HCII-2022), June 26 - July 1, 2022, Virtual. Springer, 2022.
Abstract
The increasingly rapid spread of information about COVID-19 on the web calls for automatic measures of credibility assessment. If large parts of the population are expected to act responsibly during a pandemic, they need information that can be trusted. In this context, we model the credibility of texts using 25 linguistic phenomena, such as spelling, sentiment and lexical diversity. We integrate these measures into a graphical interface and present two empirical studies to evaluate its usability for credibility assessment of COVID-19 news. The interface prominently features three sub-scores and an aggregated score for a quick overview. In addition, metadata about the concept, authorship and infrastructure of the underlying algorithm is provided explicitly. Our working definition of credibility is operationalized through the terms trustworthiness, understandability, transparency and relevance. Each of them builds on well-established scientific notions and is explained orally or through Likert scales.

In a moderated qualitative interview with six participants, we introduce information transparency for news about COVID-19 as the general goal of a prototypical platform, accessible through an interface in the form of a wireframe. The participants’ answers are transcribed in excerpts. We then triangulate inductive and deductive coding methods to analyze their content. As a result, we identify the rating scale, sub-criteria and algorithm authorship as important predictors of usability.

In a subsequent quantitative online survey, we present a questionnaire with wireframes to 50 crowdworkers. The question formats include Likert scales, multiple choice and open-ended items; in this way, we aim to strike a balance between the known strengths and weaknesses of open and closed questions. The answers reveal a conflict between transparency and conciseness in the interface design: users tend to ask for more information, but do not necessarily make explicit use of it when it is provided. This discrepancy is influenced by the capacity constraints of human working memory. Moreover, a perceived hierarchy of metadata becomes apparent: the authorship of a news text is more important than the authorship of the algorithm used to assess its credibility.

From the first to the second study, we notice improved usability of the aggregated credibility score’s scale. This change is due to the conceptual introduction given before participants see the actual interface, as well as to the simplified binary indicators with direct visual support. Sub-scores need to be handled similarly if they are to contribute meaningfully to the overall credibility assessment. By integrating detailed information about the employed algorithm, we are able to dispel the users’ doubts about its anonymity and possible hidden agendas. However, overall transparency can only be increased if other, more important factors, such as the source of the news article, are provided as well. Knowledge of this interaction enables software designers to build useful prototypes with a strong focus on the most important elements of credibility: the source of both the text and the algorithm, as well as the distribution and composition of the algorithm. All in all, the understandability of our interface was rated as acceptable (78% of responses neutral or positive), while transparency (70%) and relevance (72%) still lag behind. This discrepancy is closely related to the missing article metadata and the lack of more meaningful, visually supported explanations of the credibility sub-scores.
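To make the feature-based modeling described above more concrete, the following Python sketch computes three of the mentioned phenomena (spelling, sentiment, lexical diversity) as sub-scores and averages them into an aggregated score. It is a minimal toy illustration under our own assumptions (tiny word lists, simple heuristics, equal weights), not the implementation, feature set or aggregation scheme used in the paper.

```python
# Hypothetical sketch (not the paper's implementation): derive a few simple
# text features and combine them into sub-scores and an aggregated credibility
# score in [0, 1]. The paper uses 25 linguistic phenomena; the three features,
# the word lists and the equal weighting below are illustrative assumptions.

from dataclasses import dataclass

# Toy resources standing in for real spell-checking and sentiment lexicons.
KNOWN_WORDS = {"the", "virus", "spreads", "quickly", "between", "people",
               "masks", "reduce", "transmission", "according", "to", "studies"}
POSITIVE_WORDS = {"reduce", "recover", "protect"}
NEGATIVE_WORDS = {"dangerous", "deadly", "panic"}


@dataclass
class CredibilityScores:
    spelling: float            # share of tokens found in the lexicon
    sentiment: float           # neutrality: 1.0 = balanced, 0.0 = one-sided
    lexical_diversity: float   # type-token ratio
    aggregated: float          # simple mean of the three sub-scores


def assess(text: str) -> CredibilityScores:
    tokens = [t.strip(".,!?").lower() for t in text.split() if t.strip(".,!?")]
    if not tokens:
        return CredibilityScores(0.0, 0.0, 0.0, 0.0)

    # Spelling sub-score: fraction of tokens recognized by the toy lexicon.
    spelling = sum(t in KNOWN_WORDS for t in tokens) / len(tokens)

    # Sentiment sub-score: neutral wording scores higher than one-sided wording.
    pos = sum(t in POSITIVE_WORDS for t in tokens)
    neg = sum(t in NEGATIVE_WORDS for t in tokens)
    polarity = (pos - neg) / max(pos + neg, 1)   # in [-1, 1]
    sentiment = 1.0 - abs(polarity)

    # Lexical diversity sub-score: type-token ratio.
    lexical_diversity = len(set(tokens)) / len(tokens)

    aggregated = (spelling + sentiment + lexical_diversity) / 3
    return CredibilityScores(spelling, sentiment, lexical_diversity, aggregated)


if __name__ == "__main__":
    print(assess("Masks reduce transmission between people, according to studies."))
```

In an interface like the one evaluated in the studies, the three sub-scores would feed the detailed view, while the aggregated value would back the prominent overview indicator.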
The insights from our studies lead to a better understanding of the amount, sequence and relation of information that needs to be provided in interfaces for credibility assessment. In particular, our integration of software metadata contributes to the more holistic notion of credibility that has become popular in recent years. It also paves the way for a more thoroughly informed interaction between humans and machine-generated assessments, anticipating the users’ doubts and concerns in early stages of the software design process. Finally, we make suggestions for future research, such as proactively documenting credibility-related metadata for Natural Language Processing and Language Technology services and establishing an explicit hierarchical taxonomy of usability predictors for automatic credibility assessment.