Sensitivity analysis of data augmentation methods on performance of deep learning model for lung sounds classification.

Saved in:
Detailed bibliography
Title: Sensitivity analysis of data augmentation methods on performance of deep learning model for lung sounds classification.
Authors: Wang, Zhaoping; Wang, Yingcui; Sun, Zhiqiang
Source: Scientific Reports; 11/10/2025, Vol. 15 Issue 1, p1-16, 16p
Subjects: DATA augmentation, DEEP learning, SENSITIVITY analysis, STETHOSCOPES, RESPIRATORY organ sounds, SPECTROGRAMS
Abstract: Physicians can distinguish between abnormal and normal lung conditions more easily with the help of an AI stethoscope, whose core is a well-trained deep learning model (DLM) for lung sound classification. Some publicly accessible lung sound datasets suffer from insufficient data and/or imbalanced class distribution, so a DLM trained on them may generalize poorly and overfit; such datasets must therefore be augmented to correct the imbalance. Several data augmentation (DA) methods for audio signals have been proposed and applied to sound event classification or speech recognition. We selected seven popular DA methods and experimentally explored their effects on lung sound classification. VGG-11 was chosen as the baseline model and ICBHI 2017 as the dataset, with the lung sound signals transformed to mel-spectrograms as the training and test samples for VGG-11. The results showed that three spectrogram-based methods, spectrogram flipping, mix-up, and SpecMix, performed relatively better in both the training and test phases. SpecMix achieved an F1 score as high as 78.6% in the training phase, and spectrogram flipping achieved 75.4% in the test phase. ResNet-18 was selected as a second baseline model to rule out model bias, and the results showed that the sensitivities still held across models. [ABSTRACT FROM AUTHOR]
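For orientation, the sketch below shows how a lung sound clip might be converted to a log-mel spectrogram and augmented with two of the methods named in the abstract (spectrogram flipping and mix-up). It is illustrative only; the sampling rate, mel-band count, and Beta parameter are assumptions, not the authors' settings, and it does not reproduce their pipeline.

```python
# Minimal sketch (not from the paper): log-mel spectrogram extraction plus
# spectrogram flipping and mix-up augmentation. All parameters are assumed.
import numpy as np
import librosa

def log_mel(y, sr=4000, n_mels=64):
    """Log-scaled mel spectrogram of a 1-D audio signal."""
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(S, ref=np.max)

def flip_time(spec):
    """Spectrogram flipping: reverse the time axis."""
    return spec[:, ::-1]

def mixup(spec_a, spec_b, label_a, label_b, alpha=0.2):
    """Mix-up: convex combination of two spectrograms and their one-hot labels."""
    lam = np.random.beta(alpha, alpha)
    spec = lam * spec_a + (1.0 - lam) * spec_b
    label = lam * label_a + (1.0 - lam) * label_b
    return spec, label

# Synthetic signals standing in for ICBHI 2017 respiratory cycles.
sr = 4000
a = np.random.randn(sr * 3).astype(np.float32)  # 3-second "recording"
b = np.random.randn(sr * 3).astype(np.float32)
spec_a, spec_b = log_mel(a, sr), log_mel(b, sr)
flipped = flip_time(spec_a)
mixed, mixed_label = mixup(spec_a, spec_b,
                           np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```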
Database: Complementary Index
ISSN: 2045-2322
DOI: 10.1038/s41598-025-23106-8