Acoustic COVID-19 Detection Using Multiple Instance Learning


Detailed Description

Bibliographic Details
Published in: IEEE Journal of Biomedical and Health Informatics, Vol. 29, No. 1, pp. 620-630
Main Authors: Reiter, Michael; Pernkopf, Franz
Format: Journal Article
Language: English
Published: United States: IEEE, 01.01.2025
ISSN: 2168-2194, 2168-2208
Online Access: Full Text
Description
Summary: In the COVID-19 pandemic, a rigorous testing scheme was crucial. However, tests can be time-consuming and expensive. A machine learning-based diagnostic tool for audio recordings could enable widespread testing at low costs. To achieve comparability between such algorithms, the DiCOVA challenge was created. It is based on the Coswara dataset, offering the recording categories cough, speech, breath, and vowel phonation. Recording durations vary greatly, ranging from one second to over a minute. A base model is pre-trained on random, short time intervals. Subsequently, a Multiple Instance Learning (MIL) model based on self-attention is incorporated to make collective predictions for multiple time segments within each audio recording, taking advantage of longer durations. To compete in the fusion category of the DiCOVA challenge, we utilize a linear regression approach among other fusion methods to combine predictions from the most successful models associated with each sound modality. The application of the MIL approach significantly improves generalizability, leading to an AUC ROC score of 86.6% in the fusion category. By incorporating previously unused data, including the sound modality 'sustained vowel phonation' and patient metadata, we were able to significantly improve our previous results, reaching a score of 92.2%.
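The abstract describes pooling per-segment predictions into one recording-level decision via attention-based MIL. The paper's actual self-attention architecture is not reproduced in this record; below is only a minimal sketch of the general attention-pooling idea (in the style of classic attention-based MIL), with all array shapes, weight matrices, and function names chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def mil_attention_pool(instances, W, w):
    """Attention-based MIL pooling over segment embeddings.

    instances: (n, d) array, one row per short time segment of a recording.
    W: (d, k) and w: (k,) are learned attention parameters (random here).
    Returns the bag (recording-level) embedding (d,) and weights (n,).
    """
    scores = np.tanh(instances @ W) @ w  # (n,) unnormalized attention scores
    alpha = softmax(scores)              # weights over segments, sum to 1
    bag = alpha @ instances              # weighted average of segments
    return bag, alpha

# Toy recording: 5 segments with 8-dimensional embeddings.
segments = rng.normal(size=(5, 8))
W = rng.normal(size=(8, 4))
w = rng.normal(size=4)
bag, alpha = mil_attention_pool(segments, W, w)
```

A classifier head on `bag` then yields one prediction per recording, so long recordings contribute many segments while still producing a single label, which is the property the abstract credits for the improved generalizability.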
DOI:10.1109/JBHI.2024.3474975