Publication

Generating Virtual Parallel Corpus: A Compatibility Centric Method

Jia Xu; Weiwei Sun
In: MT Summit XIII. Machine Translation Summit (MT Summit-11), 13. September 19-23, Xiamen, China, NA, Xiamen, 9/2011.

Abstract

The processing of many natural languages suffers from scarce linguistic resources. We introduce the idea of compatibility to extend training data for machine translation: if the translation hypotheses produced by multiple systems are measured as compatible, they are considered reliable predictions. In this way, we generate virtual parallel data per bridge language, and re-compiling on this corpus improves our machine translation quality by more than 30% relative.
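
The selection idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how compatibility is measured, so a simple token-overlap F1 stands in for it, and the function names, the threshold, and the choice of which hypothesis to keep are hypothetical. The per-bridge-language aspect is not modelled.

```python
from itertools import combinations


def token_f1(a: str, b: str) -> float:
    """Token-overlap F1 between two sentences.

    Stand-in compatibility measure (an assumption, not the paper's metric).
    """
    ta, tb = a.lower().split(), b.lower().split()
    if not ta or not tb:
        return 0.0
    # Count shared tokens with multiplicity.
    remaining = list(tb)
    overlap = 0
    for tok in ta:
        if tok in remaining:
            remaining.remove(tok)
            overlap += 1
    if overlap == 0:
        return 0.0
    precision = overlap / len(ta)
    recall = overlap / len(tb)
    return 2 * precision * recall / (precision + recall)


def select_virtual_pairs(sources, hypotheses_per_system, threshold=0.8):
    """Keep a source sentence only if all systems' hypotheses agree pairwise.

    hypotheses_per_system[k][i] is system k's translation of sources[i].
    The threshold value is hypothetical.
    """
    virtual_corpus = []
    for i, src in enumerate(sources):
        hyps = [system[i] for system in hypotheses_per_system]
        scores = [token_f1(a, b) for a, b in combinations(hyps, 2)]
        if scores and min(scores) >= threshold:
            # Arbitrary choice for the sketch: keep the first system's output.
            virtual_corpus.append((src, hyps[0]))
    return virtual_corpus
```

Under these assumptions, sentence pairs that pass the filter would form the virtual parallel corpus added to the training data; pairs on which the systems disagree are discarded.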