Emotional speech feature normalization and recognition based on speaker-sensitive feature clustering

Bibliographic Details
Published in: International Journal of Speech Technology, Vol. 19, No. 4, pp. 805-816
Main Authors: Huang, Chengwei; Song, Baolin; Zhao, Li
Format: Journal Article
Language: English
Published: New York: Springer US, 01.12.2016 (Springer Nature B.V.)
ISSN: 1381-2416, 1572-8110
Description
Summary: In this paper we propose a feature normalization method for speaker-independent speech emotion recognition. The performance of a speech emotion classifier largely depends on the training data, and a large number of unknown speakers poses a great challenge. To address this problem, we first extract and analyse 481 basic acoustic features. Second, we use principal component analysis and linear discriminant analysis jointly to construct the speaker-sensitive feature space. Third, we classify the emotional utterances into pseudo-speaker groups in the speaker-sensitive feature space using fuzzy k-means clustering. Finally, we normalize the original basic acoustic features of each utterance based on its group information. To verify our normalization algorithm, we adopt a Gaussian mixture model based classifier for the recognition tests. The experimental results show that our normalization algorithm is effective on our locally collected database, as well as on the eNTERFACE’05 Audio-Visual Emotion Database. The emotional features obtained with our method are robust to speaker changes, and an improved recognition rate is observed.
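
The pipeline outlined in the summary can be illustrated with a minimal, hypothetical Python sketch (using numpy and scikit-learn); it is not the authors' implementation. The PCA dimensionality, the number of pseudo-speaker groups, the hard assignment of each utterance to its highest-membership group, and the per-group z-score normalization are all assumptions made for the example; the full text should be consulted for the exact formulation.

```python
# Hypothetical sketch of the speaker-sensitive normalization pipeline described above.
# Dimensions, the group count, and the per-group z-score step are assumptions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def fuzzy_kmeans(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy k-means (c-means); returns the soft membership matrix U of shape (n, k)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m                                        # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]      # membership-weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
        U = d ** (-2.0 / (m - 1.0))                       # standard c-means membership update
        U /= U.sum(axis=1, keepdims=True)
    return U


def speaker_sensitive_normalize(X, speaker_labels, n_groups=4, n_pca=50):
    """X: basic acoustic features (n_utterances x 481); speaker_labels: training speaker ids."""
    # Speaker-sensitive feature space: PCA followed by LDA on speaker identity.
    Z = PCA(n_components=n_pca).fit_transform(X)
    Z = LinearDiscriminantAnalysis().fit_transform(Z, speaker_labels)
    # Pseudo-speaker groups via fuzzy k-means in the speaker-sensitive space.
    U = fuzzy_kmeans(Z, n_groups)
    group = U.argmax(axis=1)                              # hard assignment, for simplicity
    # Normalize the original features within each pseudo-speaker group (assumed z-score form).
    X_norm = np.empty_like(X, dtype=float)
    for g in range(n_groups):
        idx = group == g
        if not idx.any():
            continue
        mu, sigma = X[idx].mean(axis=0), X[idx].std(axis=0) + 1e-8
        X_norm[idx] = (X[idx] - mu) / sigma
    return X_norm
```

A GMM-based emotion classifier, as mentioned in the summary, would then be trained on the normalized features returned by this sketch.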
DOI: 10.1007/s10772-016-9371-3