Detection of breath cycles in pediatric lung sounds via an object detection-based transfer learning method
| Published in: | Biomedical Signal Processing and Control, Volume 105, p. 107693 |
|---|---|
| Main authors: | , , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 1 July 2025 |
| ISSN: | 1746-8094 |
| Summary: | • YOLOv1-based model for pediatric breath cycle detection via transfer learning. • Fine-tuned model achieves an F1 score of 0.824 on a pediatric lung sound dataset. • Utilizes the log Mel spectrogram for effective respiratory sound feature extraction. • Model outperforms the baseline in precision, recall, and average precision. • Facilitates the creation of large-scale annotated lung sound databases for pediatric care.
Auscultation is critical for assessing the respiratory system in children; however, the lack of pediatric lung sound databases impedes the development of automated analysis tools. This study introduces an object detection-based transfer learning method to accurately predict breath cycles in pediatric lung sounds. We utilized a model based on the YOLOv1 architecture, initially pre-trained on an adult lung sound dataset (HF_Lung_v1) and subsequently fine-tuned on a pediatric dataset (SNUCH_Lung). The input feature was the log Mel spectrogram, which effectively captured the relevant frequency and temporal information. The pre-trained model achieved an F1 score of 0.900 ± 0.003 on the HF_Lung_v1 dataset. After fine-tuning, it reached an F1 score of 0.824 ± 0.009 on the SNUCH_Lung dataset, confirming the efficacy of transfer learning. This model outperformed a baseline model trained solely on the SNUCH_Lung dataset without transfer learning. We also explored the impact of segment length, width, and various audio feature extraction techniques; the best results were obtained with 15-second segments, a 2-second width, and the log Mel spectrogram. The model is promising for clinical applications, such as generating large-scale annotated datasets, visualizing and labeling individual breath cycles, and performing correlation analysis with physiological indicators. Future research will focus on expanding the pediatric lung sound database through auto-labeling techniques and integrating the model into stethoscopes for real-time analysis. This study highlights the potential of object detection-based transfer learning in enhancing the accuracy of breath cycle prediction in pediatric lung sounds and advancing pediatric respiratory sound analysis tools. |
|---|---|
| DOI: | 10.1016/j.bspc.2025.107693 |
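
The summary reports F1 scores for detected breath cycles but does not state how predictions are matched to annotations. As a rough illustration only, the sketch below scores predicted breath-cycle intervals against reference intervals using temporal intersection over union (IoU) with greedy one-to-one matching. The 0.5 IoU threshold, the greedy matching rule, the function names, and the example intervals are all assumptions made for this sketch, not details taken from the article.

```python
from typing import List, Tuple

# Hypothetical representation: (onset_s, offset_s) of one breath cycle in a segment.
Interval = Tuple[float, float]


def temporal_iou(a: Interval, b: Interval) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def detection_f1(
    predicted: List[Interval],
    reference: List[Interval],
    iou_threshold: float = 0.5,  # assumed threshold, not from the article
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under greedy one-to-one IoU matching."""
    unmatched = list(reference)
    true_positives = 0
    for pred in predicted:
        # Pick the still-unmatched reference interval that overlaps most.
        best = max(unmatched, key=lambda ref: temporal_iou(pred, ref), default=None)
        if best is not None and temporal_iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each reference can be matched only once
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Made-up intervals inside one 15-second segment, for illustration only.
reference = [(0.5, 2.0), (4.0, 5.5), (8.0, 9.5)]
predicted = [(0.6, 2.1), (4.5, 6.5), (12.0, 13.0)]
precision, recall, f1 = detection_f1(predicted, reference)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

In this example only the first prediction overlaps its reference interval strongly enough to count as a match; the second overlaps too little and the third overlaps nothing. A different matching rule or threshold would give different scores, so figures computed this way are not directly comparable to those reported in the summary.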