Scalable high-performance single cell data analysis with BPCells

Bibliographic details
Published in: bioRxiv
Main authors: Parks, Benjamin; Greenleaf, William
Format: Journal Article Paper
Language: English
Published: United States, Cold Spring Harbor Laboratory, 1 April 2025
Edition: 1.1
ISSN: 2692-8205
Description
Abstract: The growth of single-cell datasets to multi-million cell atlases has uncovered major scalability problems for single-cell analysis software. Here, we present BPCells, a package for high-performance single-cell analysis of RNA-seq and ATAC-seq datasets. BPCells uses disk-backed streaming compute algorithms to reduce memory requirements by nearly 70-fold compared to in-memory workflows with little to no loss of execution speed. BPCells also introduces high-performance compressed formats based on bitpacking compression for ATAC-seq fragment files and single-cell sparse matrices. These novel compression algorithms help to accelerate disk-backed analysis by reducing data transfer from disk, while providing the lowest computational overhead of all compression algorithms tested. Using BPCells, we perform normalization and PCA of a 44 million cell dataset on a laptop, demonstrating that BPCells makes working with the largest contemporary single-cell datasets feasible on modest hardware, while leaving headroom on servers for future datasets an order of magnitude larger.
Competing Interest Statement: B.P. declares no competing interests. W.J.G. is a scientific co-founder of Protillion Biosciences, and consultant for Guardant Health, Ultima Genomics, and Nvidia. W.J.G. is an inventor on patents licensed by 10x Genomics.
DOI: 10.1101/2025.03.27.645853