Publication
Expressive Speech Synthesis: Past, Present, and Possible Futures
Marc Schröder
In: Jianhua Tao; Tieniu Tan. Affective Information Processing. Pages 111-126, Springer, London, 2009.
Abstract
Approaches to adding expressivity to synthetic speech have changed considerably over the last 20 years. Early systems, including formant and diphone systems, were built around "explicit control" models; early unit selection systems adopted a "playback" approach. Currently, various approaches are being pursued to increase flexibility in expression while maintaining the quality of state-of-the-art systems, among them a new "implicit control" paradigm in statistical parametric speech synthesis, which provides control over expressivity by combining and interpolating between statistical models trained on different expressive databases. The present chapter provides an overview of past and present approaches and ventures a look into possible future developments.
Projects
- HUMAINE (2004) - Human-Machine Interaction Network on Emotions
- PAVOQUE - PArametrisation of prosody and VOice QUality for concatenative speech synthesis in view of Emotion expression
- SEMAINE - Sustained Emotionally coloured Machine-human Interaction using Nonverbal Expression