Publication
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis (Abstract Reprint)
Lukas Struppek; Dominik Hintersdorf; Felix Friedrich; Manuel Brack; Patrick Schramowski; Kristian Kersting
In: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI 2024, Jeju, South Korea, August 3-9, 2024. International Joint Conference on Artificial Intelligence (IJCAI), ijcai.org, 2024.
Abstract
Models for text-to-image synthesis, such as DALL-E 2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public. These models are capable of producing high-quality images that depict a variety of concepts and styles when conditioned on textual descriptions. However, these models adopt cultural characteristics associated with specific Unicode scripts from their vast amount of training data, which may not be immediately apparent. We show that by simply inserting single non-Latin characters in the textual description, common models reflect cultural biases in their generated images. We analyze this behavior both qualitatively and quantitatively and identify a model's text encoder as the root cause of the phenomenon. Such behavior can be interpreted as a model feature, offering users a simple way to customize the image generation and reflect their own cultural background. Yet, malicious users or service providers may also try to intentionally bias the image generation. One goal might be to create racist stereotypes by replacing Latin characters with similar-looking characters from non-Latin scripts, so-called homoglyphs. To mitigate such unnoticed script attacks, we propose a novel homoglyph unlearning method to fine-tune a text encoder, making it robust against homoglyph manipulations.
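The attack described in the abstract requires nothing beyond a character substitution in the prompt. The following minimal Python sketch illustrates the idea; the homoglyph table and function name are hypothetical choices for illustration, not taken from the authors' code, though the Unicode code points are real:

    # Illustrative homoglyph injection (hypothetical helper; code points are real).
    HOMOGLYPHS = {
        "A": "\u0410",  # Cyrillic Capital Letter A, visually identical to Latin "A"
        "o": "\u043e",  # Cyrillic Small Letter O, visually identical to Latin "o"
        "e": "\u0435",  # Cyrillic Small Letter Ie, visually identical to Latin "e"
    }

    def inject_homoglyph(prompt: str, char: str) -> str:
        """Replace the first occurrence of `char` with its non-Latin look-alike."""
        return prompt.replace(char, HOMOGLYPHS[char], 1)

    clean = "A photo of an actress"
    biased = inject_homoglyph(clean, "o")
    print(clean == biased)                  # False: the strings differ ...
    print(biased)                           # ... yet render identically on screen
    print(biased.encode("unicode_escape"))  # reveals the hidden \u043e

Passing the biased prompt instead of the clean one to a text-to-image pipeline is all it takes to trigger the script-specific bias; to a human reader the two prompts are indistinguishable.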
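The abstract does not spell out the unlearning objective. One plausible realization, sketched below under the assumption of a teacher-student setup in which a frozen copy of the original text encoder supplies clean reference embeddings, aligns homoglyph prompts with their clean counterparts while preserving behavior on clean text. The function and its parameters (student, teacher, tokenize) are assumptions for illustration, not the authors' implementation:

    import torch
    import torch.nn.functional as F

    def homoglyph_unlearning_loss(student, teacher, tokenize,
                                  clean_prompt, homoglyph_prompt):
        # Frozen teacher embedding of the clean prompt serves as the target.
        with torch.no_grad():
            target = teacher(tokenize(clean_prompt))
        # Pull homoglyph prompts toward the clean embedding (remove the bias) ...
        robust = F.mse_loss(student(tokenize(homoglyph_prompt)), target)
        # ... while keeping clean prompts where they were (preserve utility).
        preserve = F.mse_loss(student(tokenize(clean_prompt)), target)
        return robust + preserve

Minimizing such a loss over pairs of clean and homoglyph-manipulated prompts would fine-tune the student encoder so that homoglyphs no longer shift the text embedding, matching the robustness goal stated in the abstract.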
