Benchmarking scRNA-seq copy number variation callers

Bibliographic details
Title: Benchmarking scRNA-seq copy number variation callers
Authors: Schmid, Katharina T; Symeonidi, Aikaterini; Hlushchenko, Dmytro; Richter, Maria L; Tijhuis, Andréa E; Foijer, Floris; Colomé-Tatché, Maria
Source: Nature Communications. 16(1)
Publisher information: Nature Publishing Group, 2025.
Publication year: 2025
Subjects: Benchmarking, Single-Cell Analysis/methods, Humans, DNA Copy Number Variations/genetics, RNA-Seq/methods, Computational Biology/methods, Neoplasms/genetics, Single-Cell Gene Expression Analysis
Description: Copy number variations (CNVs), the gain or loss of genomic regions, are associated with disease, especially cancer. Single cell technologies offer new possibilities to capture within-sample heterogeneity of CNVs and identify subclones relevant for tumor progression and treatment outcome. Several computational tools have been developed to identify CNVs from scRNA-seq data. However, an independent benchmarking of them is lacking. Here, we evaluate six popular methods in their ability to correctly identify ground truth CNVs, euploid cells and subclonal structures in 21 scRNA-seq datasets. We discover dataset-specific factors influencing the performance, including dataset size, the number and type of CNVs in the sample and the choice of the reference dataset. Methods which include allelic information perform more robustly for large droplet-based datasets, but require higher runtime. Furthermore, the methods differ in their additional functionalities. We offer a benchmarking pipeline to identify the optimal method for new datasets, and improve methods' performance.
Document type: Article
Language: English
ISSN: 2041-1723
DOI: 10.1038/s41467-025-62359-9
Access URL: https://research.rug.nl/en/publications/e0d47017-c370-4bf5-abd0-8854923f8733
https://hdl.handle.net/11370/e0d47017-c370-4bf5-abd0-8854923f8733
Rights: CC BY
Accession number: edsair.dris...01423..7d5426b6623dabf70fa5c40868dff043
Database: OpenAIRE