Publication

Natural vs. Synthesized Speech in Spoken Dialog Systems Research - Comparing the Performance of Recognition Results

Tatjana Scheffler; Roland Roller; Florian Kretzschmar; Sebastian Möller; Norbert Reithinger

In: Tim Flingscheidt; Walter Kellermann (eds.). ITG-Fachbericht Sprachkommunikation 2012. ITG-Fachtagung (ITG-2012), September 26-28, Braunschweig, Pages 127-130, ISBN 978-3-8007-3455-9, VDE Verlag, Berlin, 2012.

Abstract

In this paper, we test the effect of using speech synthesis when interacting with a spoken dialog system (SDS). We use a user simulation to connect our speech synthesis to a real, state-of-the-art automatic speech recognition (ASR) component deployed in a working commercial SDS via a standard telephone line. In a series of experiments, we compare human-machine dialogs and their recognition scores with simulated dialogs using synthesis. Our results show that a good text-to-speech synthesis configuration rivals human speech both in recognition scores as well as variability. This makes the speech interface in user simulation quite attractive.