Mood detection from daily conversational speech using denoising autoencoder and LSTM

Bibliographic details
Published in:	Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) pp. 5125-5129
Main authors:	Kun-Yi Huang, Chung-Hsien Wu, Ming-Hsiang Su, Hsiang-Chi Fu
Format:	Conference proceedings
Language:	English
Published:	IEEE, 1 March 2017
Keywords:	denoising autoencoder; Emotion recognition; Hidden Markov models; long short-term memory; Long-term emotion tracking; Mood; mood detection; Predictive models; Speech; Support vector machines
ISSN:	2379-190X
Online access:	Full text

Description
Abstract:	Current studies generally measure emotions with an extended subjective self-report method. Although it is commonly accepted that the speech emotion perceived by the listener is close to the emotion the speaker intends to convey, research has indicated that a mismatch between the two remains. In addition, individuals with different personalities generally express emotions differently. Based on these observations, this study first develops a support vector machine (SVM)-based emotion model to detect perceived emotion from daily conversational speech. Then, a denoising autoencoder (DAE) is used to construct an emotion conversion model that characterizes the relationship between the perceived emotion and the expressed emotion of the subject for a specific personality. Finally, a long short-term memory (LSTM)-based mood model is constructed to model the temporal fluctuation of speech emotions for mood detection. Experimental results show that the proposed method achieved a detection accuracy of 64.5%, a 5.0% improvement over the HMM-based method.
DOI:	10.1109/ICASSP.2017.7953133
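
The three stages named in the abstract (perceived emotion, emotion conversion, mood model) can be pictured as a data flow. The following is only a minimal sketch of that flow, not the authors' implementation: all weights are random and untrained, the SVM stage is replaced by hand-written emotion posteriors, the conversion stage is a plain autoencoder-shaped forward pass with the denoising training omitted, and the class counts (`N_EMOTIONS`, `N_MOODS`) and hidden size are assumptions, since the abstract does not state them.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained sketch is reproducible

N_EMOTIONS = 4  # assumption: the abstract does not state the emotion set
N_HIDDEN = 3    # assumption: hidden size chosen only for illustration
N_MOODS = 2     # assumption: the abstract does not state the mood classes


def rand_matrix(rows, cols):
    """Small random weights; a real model would learn these from data."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(v):
    peak = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


# Stage 2 stand-in: encoder/decoder mapping perceived -> expressed emotion.
W_ENC = rand_matrix(N_HIDDEN, N_EMOTIONS)
W_DEC = rand_matrix(N_EMOTIONS, N_HIDDEN)


def convert_emotion(perceived):
    hidden = [sigmoid(x) for x in matvec(W_ENC, perceived)]
    return softmax(matvec(W_DEC, hidden))


# Stage 3 stand-in: a single LSTM cell (biases omitted) plus a softmax readout.
GATES = {name: rand_matrix(N_HIDDEN, N_EMOTIONS + N_HIDDEN) for name in "ifog"}
W_OUT = rand_matrix(N_MOODS, N_HIDDEN)


def lstm_mood(sequence):
    h = [0.0] * N_HIDDEN  # hidden state
    c = [0.0] * N_HIDDEN  # cell state
    for x in sequence:
        z = x + h  # list concatenation: current input and previous hidden state
        i = [sigmoid(a) for a in matvec(GATES["i"], z)]    # input gate
        f = [sigmoid(a) for a in matvec(GATES["f"], z)]    # forget gate
        o = [sigmoid(a) for a in matvec(GATES["o"], z)]    # output gate
        g = [math.tanh(a) for a in matvec(GATES["g"], z)]  # candidate values
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return softmax(matvec(W_OUT, h))


# Stage 1 stand-in: hypothetical per-utterance emotion posteriors that an
# SVM-based emotion model might output for three consecutive utterances.
perceived_sequence = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.1, 0.6, 0.2],
]
expressed_sequence = [convert_emotion(p) for p in perceived_sequence]
mood_probs = lstm_mood(expressed_sequence)
print(mood_probs)
```

Because nothing here is trained, the printed probabilities carry no meaning; the sketch only shows how per-utterance emotion vectors would pass through a conversion step and then be summarized over time into a mood distribution.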