A comparative study of the spectrogram, scalogram, melspectrogram and gammatonegram time-frequency representations for the classification of lung sounds using the ICBHI database based on CNNs

Detailed bibliography
Title: A comparative study of the spectrogram, scalogram, melspectrogram and gammatonegram time-frequency representations for the classification of lung sounds using the ICBHI database based on CNNs
Authors: Neili, Zakaria, Sundaraj, Kenneth
Publisher information: De Gruyter
Publication year: 2022
Collection: Universiti Teknikal Malaysia Melaka (UTeM) Repository
Description: In lung sound classification using deep learning, many studies have considered the use of short-time Fourier transform (STFT) as the most commonly used 2D representation of the input data. Consequently, STFT has been widely used as an analytical tool, but other versions of the representation have also been developed. This study aims to evaluate and compare the performance of the spectrogram, scalogram, melspectrogram and gammatonegram representations, and provide comparative information to users regarding the suitability of these time-frequency (TF) techniques in lung sound classification. Lung sound signals used in this study were obtained from the ICBHI 2017 respiratory sound database. These lung sound recordings were converted into images of spectrogram, scalogram, melspectrogram and gammatonegram TF representations respectively. The four types of images were fed separately into the VGG16, ResNet-50 and AlexNet deep-learning architectures. Network performances were analyzed and compared based on accuracy, precision, recall and F1-score. The results of the analysis on the performance of the four representations using these three commonly used CNN deep-learning networks indicate that the generated gammatonegram and scalogram TF images coupled with ResNet-50 achieved maximum classification accuracies.
Document type: article in journal/newspaper
File description: text
Language: English
Relation: http://eprints.utem.edu.my/id/eprint/26592/2/2022%20ZAKI%20BMT_COMPRESSED.PDF; Neili, Zakaria and Sundaraj, Kenneth (2022) A comparative study of the spectrogram, scalogram, melspectrogram and gammatonegram time-frequency representations for the classification of lung sounds using the ICBHI database based on CNNs. Biomedizinische Technik, 67 (5). pp. 367-390. ISSN 1862-278X
DOI: 10.1515/bmt-2022-0180
Availability: http://eprints.utem.edu.my/id/eprint/26592/
http://eprints.utem.edu.my/id/eprint/26592/2/2022%20ZAKI%20BMT_COMPRESSED.PDF
https://www.degruyter.com/document/doi/10.1515/bmt-2022-0180/html?lang=en
Accession number: edsbas.8FFE1EC0
Database: BASE
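
The abstract states that the networks were compared on accuracy, precision, recall and F1-score. As a minimal sketch of how those four figures relate to one another, the following assumes macro-averaging over classes; the record does not say which averaging the authors used. The function name `classification_metrics` and the class labels in the usage note are hypothetical and chosen only for illustration.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall and F1.

    Macro-averaging is an assumption of this sketch, not something
    stated in the catalogue record.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Per-class counts of true positives, false positives and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

For example, calling `classification_metrics(["normal", "normal", "crackle", "wheeze"], ["normal", "crackle", "crackle", "wheeze"])` scores three of four predictions as correct, so accuracy is 0.75, while the macro-averaged precision and recall each weight the three classes equally regardless of how many recordings belong to each.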