Deep Neural Network for Automatic Classification of Pathological Voice Signals

Computer-aided pathological voice detection is efficient for initial screening of pathological voice, and has received high academic and clinical attention. This paper proposes an automatic diagnosis method of pathological voice based on deep neural network (DNN). Other two classification models (su...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of voice Jg. 36; H. 2; S. 288.e15 - 288.e24
Hauptverfasser: Chen, Lili, Chen, Junjiang
Format: Journal Article
Sprache:Englisch
Veröffentlicht: United States Elsevier Inc 01.03.2022
Schlagworte:
ISSN:0892-1997, 1873-4588, 1873-4588
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Computer-aided pathological voice detection is efficient for initial screening of pathological voice, and has received high academic and clinical attention. This paper proposes an automatic diagnosis method of pathological voice based on deep neural network (DNN). Other two classification models (support vector machines and random forests) were used to verify the effectiveness of DNN. In this paper, we extracted 12 Mel frequency cepstral coefficients of each voice sample as row features. The constructed DNN consists a two-layer stacked sparse autoencoders network and a softmax layer. The stacked sparse autoencoders layer can learn high-level features from raw Mel frequency cepstral coefficients features. Then, the softmax layer can diagnose pathological voice according to high-level features. The DNN and the other two comparison models used the same train set and test set for the experiment. Experimental results reveal that the value of sensitivity, specificity, precision, accuracy, and F1 score of the DNN can reach 97.8%, 99.4%, 99.4%, 98.6%, and 98.4%, respectively. The five indexes of DNN classification results are at least 6.2%, 5%, 5.6%, 5.7%, and 6.2% higher than the comparison models (support vector machine and random forest). The proposed DNN can learn advanced features from raw acoustic features, and distinguish pathological voice from healthy voice. To the extent of this preliminary study, future studies can further explore the application of DNN in other experiments and clinical practice
Bibliographie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:0892-1997
1873-4588
1873-4588
DOI:10.1016/j.jvoice.2020.05.029