
Publication

The Importance of Facial Features in Vision-based Sign Language Recognition: Eyes, Mouth or Full Face?

Dinh Nam Pham; Eleftherios Avramidis
In: ACM International Conference on Intelligent Virtual Agents (IVA Adjunct ’25), 9th International Workshop on Sign Language Translation and Avatar Technology (SLTAT 2025), located at IVA 2025, September 16, 2025, Berlin, Germany. ACM, September 2025. ISBN 979-8-4007-1996-7/25/09.

Abstract

Non-manual facial features play a crucial role in sign language communication, yet their importance for automatic sign language recognition (ASLR) remains underexplored. While prior studies have shown that incorporating facial features can improve recognition, related work often relies on hand-crafted feature extraction and rarely goes beyond comparing manual features against the combination of manual and facial features. In this work, we systematically investigate the contribution of distinct facial regions (eyes, mouth, and full face) using two different deep learning models (a CNN-based model and a transformer-based model) trained on an SLR dataset of isolated signs with randomly selected classes. Through quantitative performance evaluation and qualitative saliency map analysis, we find that the mouth is the most important non-manual facial feature, significantly improving accuracy. Our findings highlight the necessity of incorporating facial features in ASLR.
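The paper itself is not reproduced here, but the region-wise comparison the abstract describes can be illustrated with a minimal sketch: given a bounding box for the eyes, mouth, or full face (obtained, for example, from a face landmark detector; this and all names, coordinates, and helper functions below are assumptions for illustration, not the authors' published pipeline), one can mask out everything outside that region before feeding frames to a recognition model.

```python
import numpy as np

def keep_region(frame: np.ndarray, box: tuple[int, int, int, int]) -> np.ndarray:
    """Zero out all pixels outside `box` (x1, y1, x2, y2), so a downstream
    model only sees one facial region. Hypothetical illustration of a
    region-ablation setup, not the authors' actual method."""
    x1, y1, x2, y2 = box
    masked = np.zeros_like(frame)
    masked[y1:y2, x1:x2] = frame[y1:y2, x1:x2]
    return masked

# Hypothetical per-region boxes for a 224x224 face crop; in practice these
# would come from a landmark detector rather than fixed coordinates.
REGIONS = {
    "eyes":      (40, 60, 184, 110),
    "mouth":     (70, 140, 154, 190),
    "full_face": (0, 0, 224, 224),
}

frame = np.random.rand(224, 224, 3).astype(np.float32)  # stand-in for one video frame
ablated = {name: keep_region(frame, box) for name, box in REGIONS.items()}
```

Training the same model on each ablated input stream and comparing accuracies is one straightforward way to quantify how much each facial region contributes, which is the kind of comparison the abstract reports.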

Projects