Deep Neural Network for Automatic Classification of Pathological Voice Signals

Bibliographic details
Published in: Journal of Voice, Vol. 36, No. 2, pp. 288.e15-288.e24
Main authors: Chen, Lili; Chen, Junjiang
Format: Journal Article
Language: English
Published: United States, Elsevier Inc, 01.03.2022
ISSN: 0892-1997, 1873-4588
Description
Summary: Computer-aided pathological voice detection is efficient for initial screening of pathological voice and has received considerable academic and clinical attention. This paper proposes an automatic method for diagnosing pathological voice based on a deep neural network (DNN). Two other classification models (support vector machines and random forests) were used to verify the effectiveness of the DNN. We extracted 12 Mel frequency cepstral coefficients from each voice sample as raw features. The constructed DNN consists of a two-layer stacked sparse autoencoder network and a softmax layer. The stacked sparse autoencoder layers learn high-level features from the raw Mel frequency cepstral coefficient features, and the softmax layer then diagnoses pathological voice from those high-level features. The DNN and the two comparison models used the same training set and test set. Experimental results show that the sensitivity, specificity, precision, accuracy, and F1 score of the DNN reach 97.8%, 99.4%, 99.4%, 98.6%, and 98.4%, respectively. These five indexes are at least 6.2%, 5%, 5.6%, 5.7%, and 6.2% higher, respectively, than those of the comparison models (support vector machine and random forest). The proposed DNN can learn advanced features from raw acoustic features and distinguish pathological voice from healthy voice. Building on this preliminary study, future studies can further explore the application of DNN in other experiments and clinical practice.
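The five indexes named in the summary are standard binary-classification measures derived from a confusion matrix. The sketch below shows one common way to compute them, assuming "pathological" is the positive class. The `ConfusionCounts` type, the `binary_metrics` function, and the example counts are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical container for confusion-matrix counts (positive = pathological)."""
    tp: int  # pathological voices classified as pathological
    fp: int  # healthy voices classified as pathological
    tn: int  # healthy voices classified as healthy
    fn: int  # pathological voices classified as healthy


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute the five indexes; assumes every denominator is non-zero."""
    sensitivity = c.tp / (c.tp + c.fn)   # also called recall
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; these are not the paper's data.
    example = ConfusionCounts(tp=45, fp=2, tn=48, fn=5)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Applying the same function to each model's predictions on a shared test set would give a like-for-like comparison of the kind the summary describes.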
DOI:10.1016/j.jvoice.2020.05.029