Publication

Exploring the Potential of Large Language Models in Adaptive Machine Translation for Generic Text and Subtitles

Abdelhadi Soudi; Mohamed Hannani; Kristof Van Laerhoven; Eleftherios Avramidis

In: Pierre Zweigenbaum; Reinhard Rapp; Serge Sharoff (Eds.). Proceedings of the 17th Workshop on Building and Using Comparable Corpora. Workshop on Building and Using Comparable Corpora (BUCC-2024), co-located with LREC 2024, May 20, Torino, Italy, Pages 51-58, ELRA and ICCL, May 2024.

Abstract

This paper investigates the potential of in-context learning for adaptive, real-time machine translation (MT) with Large Language Models (LLMs), applied to subtitles and generic text with fuzzy matches. Using a strategy based on prompt composition and dynamic retrieval of fuzzy matches, we achieve improvements in translation quality over previous work. Unlike static selection, which may not suit every input sentence, our enhanced methodology adapts dynamically to user input. We also show that LLMs and Encoder-Decoder models achieve better results on generic text than on subtitles for the language pairs English-to-Arabic (En→Ar) and English-to-French (En→Fr). Experiments on En→Ar subtitle datasets of different sizes indicate that a larger dataset does not necessarily yield better results. Our experiments on subtitles support results from previous work on generic text: LLMs can adapt through few-shot in-context learning, outperforming Encoder-Decoder MT models, and combining LLMs with Encoder-Decoder models improves translation quality.
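As a rough illustration of the idea of retrieving fuzzy matches per input sentence and composing them into a few-shot prompt, the sketch below may help. It is not the authors' implementation: the similarity measure (Python's `difflib.SequenceMatcher`), the function names `retrieve_fuzzy_matches` and `compose_prompt`, the prompt layout, and the toy translation memory are all assumptions made for this example, and the abstract does not say which retrieval method the paper actually uses.

```python
from difflib import SequenceMatcher


def retrieve_fuzzy_matches(source, memory, k=2):
    """Return the k (source, target) pairs most similar to `source`.

    Similarity is character-based (difflib); this is an assumed stand-in,
    not the retrieval method used in the paper.
    """
    scored = [
        (SequenceMatcher(None, source.lower(), src.lower()).ratio(), src, tgt)
        for src, tgt in memory
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(src, tgt) for _, src, tgt in scored[:k]]


def compose_prompt(source, matches, src_lang="English", tgt_lang="French"):
    """Build a few-shot prompt: retrieved pairs first, then the new sentence."""
    lines = []
    # Reverse so that the closest match sits right before the query.
    for src, tgt in reversed(matches):
        lines.append(f"{src_lang}: {src}")
        lines.append(f"{tgt_lang}: {tgt}")
    lines.append(f"{src_lang}: {source}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)


# Toy translation memory (hypothetical data for illustration only).
memory = [
    ("Where is the train station?", "Où est la gare ?"),
    ("I would like a cup of coffee.", "Je voudrais une tasse de café."),
    ("Where is the bus station?", "Où est la gare routière ?"),
]

query = "Where is the police station?"
matches = retrieve_fuzzy_matches(query, memory, k=2)
prompt = compose_prompt(query, matches)
print(prompt)
```

Because retrieval runs separately for each query, the in-context examples change with the input, which is the contrast with a static, fixed set of examples that the abstract draws. The resulting `prompt` string would then be sent to whichever LLM is being evaluated.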

Projects

Further Links

https://aclanthology.org/2024.bucc-1.6

2024.bucc-1.6-2.pdf (pdf, 361 KB)