Publication
Generating Virtual Parallel Corpus: A Compatibility Centric Method
Jia Xu; Weiwei Sun
In: MT Summit XIII. Machine Translation Summit (MT Summit-11), 13. September 19-23, Xiamen, China, NA, Xiamen, 9/2011.
Abstract
The processing of many natural languages suffers from scarce linguistic resources. We introduce the idea of compatibility to extend training data for machine translation: if translation hypotheses produced by multiple systems are measured as compatible, they are considered reliable predictions. In this way, we generate virtual parallel data per bridge language, and re-compiling on this corpus improves our machine translation quality by more than 30% relative.
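
The compatibility idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how compatibility is measured, so everything here is an assumption made for illustration: the token-level Jaccard overlap, the 0.8 threshold, and the hypothetical names token_overlap and select_virtual_pairs. The sketch also does not model the bridge-language step. It only shows the filtering principle: a source sentence contributes a virtual parallel pair when the hypotheses from several systems agree with each other.

```python
from typing import Dict, List, Tuple


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the token sets of two hypotheses.

    Assumed stand-in for the paper's compatibility measure.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def select_virtual_pairs(
    hypotheses: Dict[str, List[str]],
    threshold: float = 0.8,  # assumed cut-off, not taken from the paper
) -> List[Tuple[str, str]]:
    """Keep (source, translation) pairs whose system outputs agree pairwise."""
    pairs = []
    for source, outputs in hypotheses.items():
        if len(outputs) < 2:
            continue  # agreement needs hypotheses from at least two systems
        compatible = all(
            token_overlap(x, y) >= threshold
            for i, x in enumerate(outputs)
            for y in outputs[i + 1:]
        )
        if compatible:
            # Use the first system's output as the virtual target side.
            pairs.append((source, outputs[0]))
    return pairs


if __name__ == "__main__":
    demo = {
        "source sentence 1": ["the cat sits", "the cat sits"],
        "source sentence 2": ["a dog runs", "the house is big"],
    }
    print(select_virtual_pairs(demo))
```

In this toy input only the first source sentence is kept, because its two hypotheses agree and the second sentence's hypotheses do not. A real system would likely use a softer measure and a tuned threshold.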