Scalable high-performance single cell data analysis with BPCells
| Published in: | bioRxiv |
|---|---|
| Main authors: | , |
| Format: | Journal Article Paper |
| Language: | English |
| Publication details: | United States: Cold Spring Harbor Laboratory, 01.04.2025 |
| Edition: | 1.1 |
| Subjects: | |
| ISSN: | 2692-8205 |
| Summary: | The growth of single-cell datasets to multi-million cell atlases has uncovered major scalability problems for single-cell analysis software. Here, we present BPCells, a package for high-performance single-cell analysis of RNA-seq and ATAC-seq datasets. BPCells uses disk-backed streaming compute algorithms to reduce memory requirements by nearly 70-fold compared to in-memory workflows with little to no loss of execution speed. BPCells also introduces high-performance compressed formats based on bitpacking compression for ATAC-seq fragment files and single-cell sparse matrices. These novel compression algorithms help to accelerate disk-backed analysis by reducing data transfer from disk, while providing the lowest computational overhead of all compression algorithms tested. Using BPCells, we perform normalization and PCA of a 44 million cell dataset on a laptop, demonstrating that BPCells makes working with the largest contemporary single-cell datasets feasible on modest hardware, while leaving headroom on servers for future datasets an order of magnitude larger. |
|---|---|
| Bibliography: | Competing Interest Statement: B.P. declares no competing interests. W.J.G. is a scientific co-founder of Protillion Biosciences, and consultant for Guardant Health, Ultima Genomics, and Nvidia. W.J.G. is an inventor on patents licensed by 10x Genomics. |
| DOI: | 10.1101/2025.03.27.645853 |