Publication
English-Oromo Machine Translation: An Experiment Using a Statistical Approach
Sisay Adugna; Andreas Eisele
In: Mike Rosner, Daniel Tapias (Eds.). Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC'10). International Conference on Language Resources and Evaluation (LREC-2010), May 19-21, Valletta, Malta, Pages 2196-2199, ISBN 2-9517408-6-7, European Language Resources Association (ELRA), 5/2010.
Abstract
This paper deals with translation of English documents to Oromo using statistical methods. Whereas English is the lingua franca of
online information, Oromo, despite its relatively wide distribution within Ethiopia and neighbouring countries like Kenya and Somalia,
is one of the most resource-scarce languages. The paper has two main goals: one is to test how far we can go with the available limited
parallel corpus for the English-Oromo language pair and the applicability of existing Statistical Machine Translation (SMT) systems
on this language pair. The second goal is to analyze the output of the system with the objective of identifying the challenges that need
to be tackled. Since the language is resource-scarce, as mentioned above, we cannot get as many parallel documents as we want for the
experiment. However, using a limited corpus of 20,000 bilingual sentences and 62,300 monolingual sentences, a translation accuracy in
terms of BLEU score of 17.74% was achieved.
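
For readers unfamiliar with the metric, the following is a minimal sketch of how a corpus-level BLEU score can be computed. It is an illustration only, not the authors' evaluation setup: the abstract does not say which scoring tool was used, and the function names `ngrams` and `corpus_bleu`, the whitespace tokenization, the single reference per sentence, and the absence of smoothing are all assumptions made here. The function returns a value between 0 and 1, so a score reported as 17.74% corresponds to 0.1774 on this scale.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU, one reference per hypothesis, no smoothing.

    Assumption: sentences are plain strings tokenized on whitespace.
    """
    matched = [0] * max_n  # clipped n-gram matches, per n-gram order
    total = [0] * max_n    # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(
                min(count, ref_counts[gram]) for gram, count in hyp_counts.items()
            )
            total[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matched) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # The brevity penalty lowers the score of output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    system_output = ["the cat sat on the mat"]
    reference = ["the cat sat on a mat"]
    print(f"BLEU = {100 * corpus_bleu(system_output, reference):.2f}%")
```

In practice, scores are usually computed with an established scorer so that tokenization and smoothing are comparable across papers; the sketch above only shows the arithmetic behind the number.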