Single-cell RNA-seq denoising using a deep count autoencoder

Bibliographic details
Published in: Nature Communications, Vol. 10, No. 1, pp. 390 - 14
Main authors: Eraslan, Gökcen; Simon, Lukas M.; Mircea, Maria; Mueller, Nikola S.; Theis, Fabian J.
Format: Journal Article
Language: English
Published: London: Nature Publishing Group UK, 23 January 2019
Nature Publishing Group
Nature Portfolio
ISSN: 2041-1723
Description
Abstract: Single-cell RNA sequencing (scRNA-seq) has enabled researchers to study gene expression at a cellular resolution. However, noise due to amplification and dropout may obstruct analyses, so scalable denoising methods for increasingly large but sparse scRNA-seq data are needed. We propose a deep count autoencoder network (DCA) to denoise scRNA-seq datasets. DCA takes the count distribution, overdispersion and sparsity of the data into account using a negative binomial noise model with or without zero-inflation, and nonlinear gene-gene dependencies are captured. Our method scales linearly with the number of cells and can, therefore, be applied to datasets of millions of cells. We demonstrate that DCA denoising improves a diverse set of typical scRNA-seq data analyses using simulated and real datasets. DCA outperforms existing methods for data imputation in quality and speed, enhancing biological discovery.
Single-cell RNA sequencing is a powerful method to study gene expression, but noise in the data can obstruct analysis. Here the authors develop a denoising method based on a deep count autoencoder network that scales linearly with the number of cells, and therefore is compatible with large data sets.
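The noise model named in the abstract, a negative binomial with optional zero-inflation, can be illustrated with a small likelihood calculation. The sketch below is not the DCA implementation: it assumes the common mean/dispersion parameterization of the negative binomial, uses scalar parameters rather than per-gene, per-cell outputs of a network, and the function names (`nb_log_pmf`, `zinb_log_pmf`, `zinb_nll`) are hypothetical.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial.

    Assumed parameterization: mean mu > 0 and dispersion theta > 0,
    so the variance is mu + mu**2 / theta (overdispersed vs. Poisson).
    """
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated negative binomial.

    pi (0 <= pi < 1) is the assumed probability of an extra "dropout" zero;
    pi = 0 recovers the plain negative binomial.
    """
    log_nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero can come from the dropout component or from the NB itself.
        return math.log(pi + (1.0 - pi) * math.exp(log_nb))
    # Non-zero counts can only come from the NB component.
    return math.log1p(-pi) + log_nb


def zinb_nll(counts, mu, theta, pi):
    """Negative log-likelihood of a list of counts with shared parameters."""
    return -sum(zinb_log_pmf(x, mu, theta, pi) for x in counts)


if __name__ == "__main__":
    toy_counts = [0, 0, 3, 1, 0, 7, 2]  # made-up counts for illustration only
    print(zinb_nll(toy_counts, mu=2.0, theta=2.0, pi=0.3))
```

In a denoising autoencoder of the kind the abstract describes, a loss of this form would be minimized with the parameters predicted by the network rather than fixed by hand; how DCA does this in detail is not stated in this record.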
DOI: 10.1038/s41467-018-07931-2