Acoustic COVID-19 Detection Using Multiple Instance Learning

Published in: IEEE Journal of Biomedical and Health Informatics, Vol. 29, No. 1, pp. 620-630
Main authors: Reiter, Michael; Pernkopf, Franz
Format: Journal Article
Language: English
Publication details: United States, IEEE, 01.01.2025
ISSN: 2168-2194, 2168-2208
Abstract During the COVID-19 pandemic, a rigorous testing scheme was crucial. However, tests can be time-consuming and expensive. A machine learning-based diagnostic tool for audio recordings could enable widespread testing at low cost. To make such algorithms comparable, the DiCOVA challenge was created. It is based on the Coswara dataset, which offers the recording categories cough, speech, breath, and vowel phonation. Recording durations vary greatly, ranging from one second to over a minute. A base model is pre-trained on random, short time intervals. Subsequently, a Multiple Instance Learning (MIL) model based on self-attention is incorporated to make collective predictions for multiple time segments within each audio recording, taking advantage of longer durations. To compete in the fusion category of the DiCOVA challenge, we use linear regression, among other fusion methods, to combine predictions from the most successful models associated with each sound modality. The MIL approach significantly improves generalizability, leading to an AUC ROC score of 86.6% in the fusion category. By incorporating previously unused data, including the sound modality 'sustained vowel phonation' and patient metadata, we significantly improved our previous results, reaching a score of 92.2%.
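The abstract describes pooling predictions over multiple time segments of one recording with an attention-based MIL model. The following is a minimal sketch of that general idea, not the authors' implementation: it uses the simpler attention pooling in the style of attention-based deep MIL (Ilse et al., 2018) rather than full self-attention, and every name, dimension, and weight in it (`attention_mil_pool`, `predict_bag`, `V`, `w`, `clf_w`, `clf_b`, the random embeddings) is a hypothetical placeholder.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    e = np.exp(shifted)
    return e / e.sum()


def attention_mil_pool(instances, V, w):
    """Pool segment embeddings into one bag embedding.

    instances: (n_segments, d) embeddings, one row per time segment
               (assumed to come from some pre-trained base model).
    V: (d, h) and w: (h,) are hypothetical attention parameters.
    """
    scores = np.tanh(instances @ V) @ w   # one relevance score per segment
    weights = softmax(scores)             # non-negative, sums to 1
    bag = weights @ instances             # attention-weighted average, shape (d,)
    return bag, weights


def predict_bag(instances, V, w, clf_w, clf_b):
    """Return a recording-level probability and the segment weights."""
    bag, weights = attention_mil_pool(instances, V, w)
    logit = bag @ clf_w + clf_b
    prob = 1.0 / (1.0 + np.exp(-logit))   # logistic output in (0, 1)
    return prob, weights


# Illustration with random placeholder values (no real audio involved).
rng = np.random.default_rng(0)
d, h = 8, 4
V = rng.normal(size=(d, h))
w = rng.normal(size=h)
clf_w = rng.normal(size=d)
clf_b = 0.0

short_recording = rng.normal(size=(1, d))    # a single segment
long_recording = rng.normal(size=(12, d))    # twelve segments

p_short, a_short = predict_bag(short_recording, V, w, clf_w, clf_b)
p_long, a_long = predict_bag(long_recording, V, w, clf_w, clf_b)
```

Because the weights are normalized per recording, the same model handles a one-segment and a twelve-segment recording without padding, which is the property that lets longer recordings contribute more evidence.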
Authors
– Reiter, Michael. Christian Doppler Laboratory for Dependable Intelligent Systems in Harsh Environments, Graz University of Technology, Graz, Austria
– Pernkopf, Franz (ORCID: 0000-0002-6356-3367; email: pernkopf@tugraz.at). Christian Doppler Laboratory for Dependable Intelligent Systems in Harsh Environments, Graz University of Technology, Graz, Austria
BackLink https://www.ncbi.nlm.nih.gov/pubmed/39365725 (MEDLINE/PubMed record)
CODEN IJBHA9
ContentType Journal Article
DOI 10.1109/JBHI.2024.3474975
Discipline Medicine
EISSN 2168-2208
EndPage 630
ExternalDocumentID 39365725
10_1109_JBHI_2024_3474975
10706000
Genre orig-research
Research Support, Non-U.S. Gov't
Journal Article
GrantInformation_xml – fundername: Austrian Federal Ministry of Labour and Economy
– fundername: Österreichische Nationalstiftung für Forschung, Technologie und Entwicklung; National Foundation for Research, Technology and Development
  funderid: 10.13039/100010132
– fundername: Christian Doppler Forschungsgesellschaft; Christian Doppler Research Association
  funderid: 10.13039/501100006012
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0002-6356-3367
PMID 39365725
PageCount 11
PublicationDate 2025-01-01
PublicationPlace United States
PublicationTitle IEEE Journal of Biomedical and Health Informatics
PublicationTitleAbbrev JBHI
PublicationTitleAlternate IEEE J Biomed Health Inform
PublicationYear 2025
Publisher IEEE
StartPage 620
SubjectTerms Algorithms
Annotations
Antigens
audio-based infection prediction
Bioinformatics
Biological system modeling
Costs
coswara
COVID-19
COVID-19 - diagnosis
COVID-19 - physiopathology
crowdsourced datasets
DiCOVA
Feature extraction
Humans
Labeling
Machine Learning
multiple instance learning
Multiple-Instance Learning Algorithms
Pandemics
Predictive models
SARS-CoV-2
Signal Processing, Computer-Assisted
Title Acoustic COVID-19 Detection Using Multiple Instance Learning
URI https://ieeexplore.ieee.org/document/10706000
https://www.ncbi.nlm.nih.gov/pubmed/39365725
https://www.proquest.com/docview/3113125272
Volume 29