Publication
Combining Short-term Cepstral and Long-term Prosodic Features for Automatic Recognition of Speaker Age
Christian Müller; Felix Burkhardt
In: Proceedings of the Interspeech 2007. Conference in the Annual Series of Interspeech Events (INTERSPEECH-07), August 27-31, Antwerp, Belgium, ISCA, 2007.
Abstract
The most successful systems in previous comparative studies
on speaker age recognition used short-term cepstral features
modeled with Gaussian Mixture Models (GMMs) or applied
multiple phone recognizers trained with the data of speakers
of the respective class. Acoustic analyses, however, indicate
that certain features such as pitch extracted from a longer span
of speech correlate clearly with speaker age, although
systems based on those features have been inferior to the
aforementioned approaches. In this paper, three novel systems
combining short-term cepstral features and long-term features
for speaker age recognition are compared to each other. A system
combining GMMs using frame-based MFCCs and Support Vector
Machines (SVMs) using long-term pitch performs best. The results
indicate that the combination of the two feature types is
a promising approach, which corresponds to findings in related
fields like speaker recognition.
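
The abstract does not say how the two subsystems are combined, so the following is only a minimal sketch of one common option, score-level fusion, and not the authors' method. It assumes each subsystem emits one score per age class, z-normalizes each score vector so the GMM log-likelihoods and SVM decision values are on a comparable scale, and takes a weighted sum. The function names, the class labels, the weight, and the example scores are all hypothetical.

```python
import math


def z_normalize(scores):
    """Scale a list of per-class scores to zero mean and unit variance."""
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    std = math.sqrt(var)
    if std == 0.0:
        # All scores identical: this subsystem carries no preference.
        return [0.0 for _ in scores]
    return [(s - mean) / std for s in scores]


def fuse_scores(gmm_scores, svm_scores, weight=0.5):
    """Weighted sum of normalized scores (hypothetical fusion rule).

    weight is the share given to the cepstral (GMM) subsystem;
    1 - weight goes to the prosodic (SVM) subsystem.
    """
    if len(gmm_scores) != len(svm_scores):
        raise ValueError("both subsystems must score the same classes")
    g = z_normalize(gmm_scores)
    s = z_normalize(svm_scores)
    return [weight * a + (1.0 - weight) * b for a, b in zip(g, s)]


def predict_class(fused, labels):
    """Return the label whose fused score is highest."""
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]


# Hypothetical class labels and made-up scores for a single utterance.
AGE_CLASSES = ["children", "youth", "adults", "seniors"]
gmm_scores = [-52.1, -48.3, -47.9, -50.0]  # e.g. average log-likelihoods
svm_scores = [-1.2, 0.9, 0.1, -0.4]        # e.g. SVM decision values

fused = fuse_scores(gmm_scores, svm_scores)
print(predict_class(fused, AGE_CLASSES))
```

In practice the fusion weight would be tuned on held-out data rather than fixed at 0.5, and the per-class scores would come from trained models rather than literals.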