Acoustic COVID-19 Detection Using Multiple Instance Learning
| Published in: | IEEE Journal of Biomedical and Health Informatics, Vol. 29, No. 1, pp. 620-630 |
|---|---|
| Main authors: | Reiter, Michael; Pernkopf, Franz |
| Format: | Journal Article |
| Language: | English |
| Publication details: | United States: IEEE, 01.01.2025 |
| ISSN: | 2168-2194, 2168-2208 |
| Abstract | In the COVID-19 pandemic, a rigorous testing scheme was crucial. However, tests can be time-consuming and expensive. A machine learning-based diagnostic tool for audio recordings could enable widespread testing at low cost. To achieve comparability between such algorithms, the DiCOVA challenge was created. It is based on the Coswara dataset, which offers the recording categories cough, speech, breath and vowel phonation. Recording durations vary greatly, ranging from one second to over a minute. A base model is pre-trained on random, short time intervals. Subsequently, a Multiple Instance Learning (MIL) model based on self-attention is incorporated to make collective predictions for multiple time segments within each audio recording, taking advantage of longer durations. To compete in the fusion category of the DiCOVA challenge, we utilize a linear regression approach, among other fusion methods, to combine predictions from the most successful models associated with each sound modality. The application of the MIL approach significantly improves generalizability, leading to an AUC ROC score of 86.6% in the fusion category. By incorporating previously unused data, including the sound modality 'sustained vowel phonation' and patient metadata, we were able to significantly improve on our previous results, reaching a score of 92.2%. |
|---|---|
| Author | Reiter, Michael; Pernkopf, Franz (ORCID: 0000-0002-6356-3367; email: pernkopf@tugraz.at) |
| AuthorAffiliation | Both authors: Christian Doppler Laboratory for Dependable Intelligent Systems in Harsh Environments, Graz University of Technology, Graz, Austria |
| CODEN | IJBHA9 |
| DOI | 10.1109/JBHI.2024.3474975 |
| Discipline | Medicine |
| Genre | orig-research; Research Support, Non-U.S. Gov't; Journal Article |
| GrantInformation | Austrian Federal Ministry of Labour and Economy; Österreichische Nationalstiftung für Forschung, Technologie und Entwicklung (National Foundation for Research, Technology and Development), funder ID 10.13039/100010132; Christian Doppler Forschungsgesellschaft (Christian Doppler Research Association), funder ID 10.13039/501100006012 |
| IsPeerReviewed | true |
| IsScholarly | true |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| PMID | 39365725 |
| SubjectTerms | Algorithms; Annotations; Antigens; audio-based infection prediction; Bioinformatics; Biological system modeling; Costs; coswara; COVID-19; COVID-19 - diagnosis; COVID-19 - physiopathology; crowdsourced datasets; DiCOVA; Feature extraction; Humans; Labeling; Machine Learning; multiple instance learning; Multiple-Instance Learning Algorithms; Pandemics; Predictive models; SARS-CoV-2; Signal Processing, Computer-Assisted |
| URI | https://ieeexplore.ieee.org/document/10706000 https://www.ncbi.nlm.nih.gov/pubmed/39365725 https://www.proquest.com/docview/3113125272 |
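
The abstract describes a MIL model that makes one collective prediction from several time segments of a recording. As a rough illustration of that idea only, the sketch below pools per-segment embeddings into a single recording-level ("bag") embedding using attention weights. It is a generic attention-pooling example, not the authors' self-attention architecture; the function names, the parameters `V` and `w`, and the toy inputs are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(
    instances: Sequence[Vector],
    V: Sequence[Vector],
    w: Vector,
) -> Tuple[List[float], List[float]]:
    """Pool segment embeddings (instances) into one bag embedding.

    Hypothetical parameters: V is a (hidden x dim) matrix and w a vector of
    length hidden. Each segment embedding h_k receives the score
    w^T tanh(V h_k); a softmax over the scores yields attention weights a_k,
    and the bag embedding is sum_k a_k * h_k.
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))
    weights = softmax(scores)
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights


if __name__ == "__main__":
    # Three toy segment embeddings from one hypothetical recording.
    segments = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    bag, weights = attention_mil_pool(segments, V=[[1.0, 1.0]], w=[2.0])
    print("attention weights:", weights)
    print("bag embedding:", bag)
```

In a trained system, `V` and `w` would be learned together with a classifier applied to the bag embedding, so that informative segments receive larger weights; with all-zero parameters the pooling reduces to a plain average over segments.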