Publication
Morphemes and POS tags for n-gram based evaluation metrics
Maja Popović
In: Proceedings of the Sixth Workshop on Statistical Machine Translation (WMT-11), located at EMNLP, July 30-31, Edinburgh, United Kingdom, pages 104-107, Association for Computational Linguistics, 7/2011.
Abstract
We propose the use of morphemes for automatic evaluation of machine
translation output, and systematically investigate a set of F score
and BLEU score based metrics calculated on words, morphemes
and POS tags along with all corresponding combinations.
Correlations between the new metrics and human judgments are
calculated on the data of the third, fourth and fifth shared tasks
of the Statistical Machine Translation Workshop. Machine
translation outputs in five different European languages are used:
English, Spanish, French, German and Czech. The results show that
the F scores which take into account morphemes and POS tags are
the most promising metrics.
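
To illustrate the general idea of an n-gram based F score computed over different token types, here is a minimal sketch. It is not the paper's exact formulation: the arithmetic averaging over n-gram orders, the function names `ngrams` and `ngram_fscore`, and the toy word and morpheme sequences are all assumptions made for illustration. The same function would apply unchanged to a sequence of POS tags.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_fscore(hyp, ref, max_n=4):
    """Arithmetic mean over n = 1..max_n of the n-gram F1 between hyp and ref.

    Assumption: orders are averaged uniformly; an order with no matching
    n-grams (including one too long for the sentence) contributes 0.
    """
    scores = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clipped matches: each n-gram counts at most as often as in the reference.
        matched = sum((hyp_counts & ref_counts).values())
        if matched == 0:
            scores.append(0.0)
            continue
        precision = matched / sum(hyp_counts.values())
        recall = matched / sum(ref_counts.values())
        scores.append(2 * precision * recall / (precision + recall))
    return sum(scores) / len(scores)


# Hypothetical example: the same sentence pair as words and as morphemes.
hyp_words = ["the", "cats", "sat"]
ref_words = ["the", "cat", "sat"]
hyp_morphs = ["the", "cat", "s", "sat"]
ref_morphs = ["the", "cat", "sat"]

word_f = ngram_fscore(hyp_words, ref_words, max_n=2)
morph_f = ngram_fscore(hyp_morphs, ref_morphs, max_n=2)
print(f"word F: {word_f:.3f}, morpheme F: {morph_f:.3f}")
```

In this toy case the morpheme-level score is higher than the word-level score, because the hypothesis shares the stem "cat" with the reference even though the full word form differs.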