Deep Learning Anomaly Detection methods to passively detect COVID-19 from Audio


Bibliographic Details
Published in: 2021 IEEE International Conference on Digital Health (ICDH), pp. 114-121
Main Authors: Murthy, Shreesha Narasimha; Agu, Emmanuel
Format: Conference Proceeding
Language: English
Published: IEEE 01.09.2021
Description
Summary: The world has been severely affected by COVID-19, an infectious disease caused by the SARS-CoV-2 coronavirus. COVID-19 incubates in a patient for 7 days before symptoms manifest. Identifying COVID-19 is challenging because its symptoms, such as cough, cold, runny nose, and chills, resemble those of influenza. COVID-19 affects the human speech sub-systems involved in respiration, phonation, and articulation. We propose a deep anomaly detection framework for passive, speech-based detection of COVID-related anomalies in voice samples of COVID-19 affected individuals. The low percentage of positive cases and the extreme imbalance in available COVID audio datasets present a challenge to machine learning classifiers but create an opportunity to use anomaly detection techniques. We investigate COVID detection from audio using various types of deep anomaly detectors and convolutional autoencoders. Contrastive loss methods are also explored to force our models to learn discrepancies between COVID and non-COVID cough data representations. In contrast with prior work that controlled data collection, our work focuses on crowdsourced datasets that are more representative of the general population. In rigorous evaluation, the variational autoencoder with the elliptic envelope as the anomaly detector, applied to Mel filterbank audio representations, performed best with an AUC of 0.65, outperforming the state of the art.
DOI: 10.1109/ICDH52753.2021.00023
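
Illustrative note: the summary describes scoring autoencoder representations with an elliptic envelope and reporting AUC. The following is a minimal sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the latent vectors are synthetic random data standing in for autoencoder outputs, the names `fit_envelope`, `anomaly_score`, and `roc_auc` are hypothetical, and the plain sample mean and covariance are substituted for the robust covariance estimate that elliptic envelope implementations typically use.

```python
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic 8-dimensional vectors stand in for autoencoder
# latent representations of audio; no real COVID data is used here.
healthy_train = rng.normal(0.0, 1.0, size=(500, 8))
healthy_test = rng.normal(0.0, 1.0, size=(200, 8))
covid_test = rng.normal(0.6, 1.2, size=(20, 8))  # rare "anomalous" class


def fit_envelope(x):
    """Fit a Gaussian envelope on normal-class data only.

    Simplification: uses the sample mean and covariance, not a robust
    (minimum covariance determinant) estimate.
    """
    mean = x.mean(axis=0)
    precision = np.linalg.inv(np.cov(x, rowvar=False))
    return mean, precision


def anomaly_score(x, mean, precision):
    """Squared Mahalanobis distance; larger means more anomalous."""
    d = x - mean
    return np.einsum("ij,jk,ik->i", d, precision, d)


def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


mean, precision = fit_envelope(healthy_train)
x_test = np.vstack([healthy_test, covid_test])
y_test = np.concatenate([np.zeros(200, dtype=int), np.ones(20, dtype=int)])
scores = anomaly_score(x_test, mean, precision)
auc = roc_auc(scores, y_test)
print(f"AUC on synthetic data: {auc:.2f}")
```

Fitting the envelope on the majority class alone mirrors the imbalance argument in the summary: the detector never needs many positive examples, only a model of what typical samples look like.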