Deep Neural Network for Automatic Classification of Pathological Voice Signals

Detailed bibliography
Published in:	Journal of Voice, Vol. 36, No. 2, pp. 288.e15-288.e24
Main authors:	Chen, Lili; Chen, Junjiang
Format:	Journal Article
Language:	English
Publication details:	United States, Elsevier Inc, 1 March 2022
Subjects:	Acoustics; Automatic classification; Deep neural network; Humans; Mel frequency cepstral coefficients; Neural Networks, Computer; Otolaryngology; Pathological voice; Stacked sparse autoencoder; Support Vector Machine; Voice
ISSN:	0892-1997, 1873-4588

Description
Summary:	Computer-aided pathological voice detection is efficient for initial screening of pathological voice and has received considerable academic and clinical attention. This paper proposes an automatic method for diagnosing pathological voice based on a deep neural network (DNN). Two other classification models (support vector machine and random forest) were used to verify the effectiveness of the DNN. We extracted 12 Mel frequency cepstral coefficients from each voice sample as raw features. The constructed DNN consists of a two-layer stacked sparse autoencoder network and a softmax layer. The stacked sparse autoencoder layers learn high-level features from the raw Mel frequency cepstral coefficient features, and the softmax layer then diagnoses pathological voice from those high-level features. The DNN and the two comparison models used the same training set and test set in the experiment. Experimental results show that the sensitivity, specificity, precision, accuracy, and F1 score of the DNN reach 97.8%, 99.4%, 99.4%, 98.6%, and 98.4%, respectively. These five indexes are at least 6.2%, 5%, 5.6%, 5.7%, and 6.2% higher, respectively, than those of the comparison models (support vector machine and random forest). The proposed DNN can learn advanced features from raw acoustic features and distinguish pathological voice from healthy voice. Building on this preliminary study, future studies can further explore the application of DNNs in other experiments and in clinical practice.
DOI:	10.1016/j.jvoice.2020.05.029
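
Note on the reported indexes: the summary lists five evaluation indexes (sensitivity, specificity, precision, accuracy, and F1 score) without stating how they were computed. The sketch below shows the standard definitions for a binary pathological-versus-healthy classifier, assuming "pathological" is the positive class. The names ConfusionCounts and screening_metrics are illustrative and do not come from the paper, and the counts in the usage lines are invented purely for demonstration; they are not the study's data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper, not from the paper).

    Assumes the positive class is "pathological voice".
    """
    tp: int  # pathological samples classified as pathological
    fn: int  # pathological samples classified as healthy
    tn: int  # healthy samples classified as healthy
    fp: int  # healthy samples classified as pathological


def screening_metrics(c: ConfusionCounts) -> dict:
    """Return the five indexes named in the summary, as fractions in [0, 1]."""
    total = c.tp + c.fn + c.tn + c.fp
    sensitivity = c.tp / (c.tp + c.fn)   # recall on the pathological class
    specificity = c.tn / (c.tn + c.fp)   # recall on the healthy class
    precision = c.tp / (c.tp + c.fp)     # share of positive calls that are correct
    accuracy = (c.tp + c.tn) / total
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Illustrative counts only (made up for this example, not the study's results).
example = ConfusionCounts(tp=45, fn=5, tn=48, fp=2)
for name, value in screening_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

The function does not guard against empty classes; if a test set has no positive or no negative samples, the corresponding division fails and would need explicit handling.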