Scalable high-performance single cell data analysis with BPCells
| Published in: | bioRxiv |
|---|---|
| Main authors: | , |
| Format: | Journal Article Paper |
| Language: | English |
| Published: | United States: Cold Spring Harbor Laboratory, 1 April 2025 |
| Edition: | 1.1 |
| Subjects: | |
| ISSN: | 2692-8205 |
| Online access: | Full text |
| Abstract: | The growth of single-cell datasets to multi-million cell atlases has uncovered major scalability problems for single-cell analysis software. Here, we present BPCells, a package for high-performance single-cell analysis of RNA-seq and ATAC-seq datasets. BPCells uses disk-backed streaming compute algorithms to reduce memory requirements by nearly 70-fold compared to in-memory workflows with little to no loss of execution speed. BPCells also introduces high-performance compressed formats based on bitpacking compression for ATAC-seq fragment files and single-cell sparse matrices. These novel compression algorithms help to accelerate disk-backed analysis by reducing data transfer from disk, while providing the lowest computational overhead of all compression algorithms tested. Using BPCells, we perform normalization and PCA of a 44 million cell dataset on a laptop, demonstrating that BPCells makes working with the largest contemporary single-cell datasets feasible on modest hardware, while leaving headroom on servers for future datasets an order of magnitude larger. |
|---|---|
| Bibliography: | Competing Interest Statement: B.P. declares no competing interests. W.J.G. is a scientific co-founder of Protillion Biosciences, and consultant for Guardant Health, Ultima Genomics, and Nvidia. W.J.G. is an inventor on patents licensed by 10x Genomics. |
| DOI: | 10.1101/2025.03.27.645853 |