
Publication

A Machine Learning Approach for Resource-efficient and Subject-independent Speech Based Stress Detection

Roswitha Duwenbeck; Elsa Andrea Kirchner
In: 2024 IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES). IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES-2024), December 11-13, Penang, Malaysia, ISBN 979-8-3503-8340-9, IEEE, 2024.

Abstract

The aim of this paper is to identify resource-efficient stress recognition methods based on prosodic speech features. Different machine learning classifiers were tested on a two-class problem to distinguish stressed from non-stressed speakers, using both subject-dependent and subject-independent evaluation methods. Resource efficiency was determined by measuring the virtual RAM usage during prediction, the memory consumption of the stored classifiers, and the classification speed. The best result, a recall of 83.3%, was obtained by the Passive Aggressive and Stochastic Gradient Descent classifiers when a subject-dependent train-test split with balanced data was used. The results worsened by an average of 28.3 percentage points when the subject-independent leave-one-subject-out cross-validation was used, and by 22.2 percentage points when a likewise subject-independent balanced GroupKFold evaluation was used. These effects could not be reduced by applying Principal Component Analysis to the features, whereas including speaker-specific examples in the previously subject-independent training set brought benefits. Adding speaker-specific examples to the leave-one-subject-out training set led to an average improvement of 6.6 percentage points for three examples, 13.2 percentage points for six examples, and 19.7 percentage points for nine additional examples. The peak recall of the train-test split could not be reached even when nine speaker-specific examples were added; the best classifier in this case was the Passive Aggressive Classifier, with a recall of 79.8% and nine additional samples. In terms of resource efficiency, the RAM consumption per prediction hardly varies between classifiers and lies between 4.6659 GB for the Aggregated Mondrian Forest Classifier and 4.7853 GB for the One-Vs-Rest SVM, with the Dummy Classifier excluded. The space required to store a model is smallest for a Decision Tree, at only 0.0034 MB. The fastest classification is achieved by the Aggregated Mondrian Forest Classifier, at 0.0025 seconds. The best performing classifier with respect to resources, the Passive Aggressive Classifier, needed 0.0495 MB of storage, 0.0802 s, and 4.7677 GB of virtual memory per classification.
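To illustrate the subject-independent evaluation protocol described in the abstract, the following sketch shows a leave-one-subject-out evaluation with scikit-learn, together with rough measurements of stored model size and prediction time. The synthetic data, feature dimensionality, classifier settings, and measurement details are placeholders and assumptions, not the authors' actual pipeline or feature set.

```python
# Hedged sketch of a leave-one-subject-out (subject-independent) evaluation.
# X, y and groups are synthetic stand-ins for prosodic features, stress labels
# and speaker IDs; they are NOT the data used in the paper.
import pickle
import time

import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier, SGDClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 20))          # placeholder prosodic feature vectors
y = rng.integers(0, 2, size=120)        # 1 = stressed, 0 = non-stressed
groups = np.repeat(np.arange(12), 10)   # speaker ID per sample

classifiers = {
    "PassiveAggressive": PassiveAggressiveClassifier(max_iter=1000, random_state=0),
    "SGD": SGDClassifier(max_iter=1000, random_state=0),
}

for name, clf in classifiers.items():
    # Each speaker is held out exactly once, so the test speaker is never
    # seen during training. A balanced GroupKFold evaluation would use
    # sklearn.model_selection.GroupKFold analogously.
    recalls = []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
        clf.fit(X[train_idx], y[train_idx])
        recalls.append(
            recall_score(y[test_idx], clf.predict(X[test_idx]), zero_division=0)
        )
    print(f"{name}: LOSO mean recall = {np.mean(recalls):.3f}")

    # Rough resource figures in the spirit of those reported: size of the
    # stored model and time per prediction (measuring virtual RAM usage
    # would additionally require a tool such as psutil).
    model_bytes = len(pickle.dumps(clf))
    start = time.perf_counter()
    clf.predict(X[:1])
    elapsed = time.perf_counter() - start
    print(f"{name}: {model_bytes / 1e6:.4f} MB stored, {elapsed:.4f} s per prediction")
```

Extending the loop to include speaker-specific examples, as studied in the paper, would amount to moving a few samples of the held-out speaker from the test fold into the training fold before fitting.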
