Scalable high-performance single cell data analysis with BPCells

Bibliographic details
Published in: bioRxiv
Main authors: Parks, Benjamin; Greenleaf, William
Format: Journal Article Paper
Language: English
Published: United States, Cold Spring Harbor Laboratory, 01.04.2025
Edition: 1.1
ISSN: 2692-8205
Description
Summary: The growth of single-cell datasets to multi-million cell atlases has uncovered major scalability problems for single-cell analysis software. Here, we present BPCells, a package for high-performance single-cell analysis of RNA-seq and ATAC-seq datasets. BPCells uses disk-backed streaming compute algorithms to reduce memory requirements by nearly 70-fold compared to in-memory workflows with little to no loss of execution speed. BPCells also introduces high-performance compressed formats based on bitpacking compression for ATAC-seq fragment files and single-cell sparse matrices. These novel compression algorithms help to accelerate disk-backed analysis by reducing data transfer from disk, while providing the lowest computational overhead of all compression algorithms tested. Using BPCells, we perform normalization and PCA of a 44 million cell dataset on a laptop, demonstrating that BPCells makes working with the largest contemporary single-cell datasets feasible on modest hardware, while leaving headroom on servers for future datasets an order of magnitude larger.
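The summary mentions bitpacking compression for sparse matrices without describing the format. As a rough illustration of the general idea only (this is not BPCells' actual on-disk layout or API, and the function names `delta_encode`, `pack_bits`, `unpack_bits`, and `delta_decode` are hypothetical), the Python sketch below stores the sorted row indices of one sparse column as differences between neighbours and then packs those small differences at a fixed bit width instead of as full 32-bit integers.

```python
def delta_encode(sorted_values):
    """Replace each sorted non-negative integer by its gap to the previous one."""
    deltas, prev = [], 0
    for v in sorted_values:
        deltas.append(v - prev)
        prev = v
    return deltas


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    values, total = [], 0
    for d in deltas:
        total += d
        values.append(total)
    return values


def pack_bits(values):
    """Pack non-negative integers at the smallest bit width that fits them all.

    Returns (width, count, payload); width and count are needed to unpack.
    """
    if any(v < 0 for v in values):
        raise ValueError("bitpacking here assumes non-negative integers")
    width = max(1, max(values, default=0).bit_length())
    acc = 0
    for i, v in enumerate(values):
        acc |= v << (i * width)  # each value occupies its own width-bit slot
    n_bytes = (len(values) * width + 7) // 8
    return width, len(values), acc.to_bytes(n_bytes, "little")


def unpack_bits(width, count, payload):
    """Invert pack_bits."""
    acc = int.from_bytes(payload, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


if __name__ == "__main__":
    # Hypothetical row indices of the non-zero entries in one matrix column.
    indices = [3, 7, 8, 15, 16, 21, 30, 31]
    width, count, payload = pack_bits(delta_encode(indices))
    print(width, len(payload))
    assert delta_decode(unpack_bits(width, count, payload)) == indices
```

In this toy example the eight indices need 4 bits each after delta encoding, so they fit in 4 bytes rather than the 32 bytes required as 32-bit integers. A production format would typically pack fixed-size chunks with vectorized instructions; the big-integer accumulator used here is chosen only for readability.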
Competing Interest Statement: B.P. declares no competing interests. W.J.G. is a scientific co-founder of Protillion Biosciences, and consultant for Guardant Health, Ultima Genomics, and Nvidia. W.J.G. is an inventor on patents licensed by 10x Genomics.
DOI: 10.1101/2025.03.27.645853