Benchmarking scRNA-seq copy number variation callers

Detailed bibliography
Title: Benchmarking scRNA-seq copy number variation callers
Authors: Schmid, Katharina T; Symeonidi, Aikaterini; Hlushchenko, Dmytro; Richter, Maria L; Tijhuis, Andréa E; Foijer, Floris; Colomé-Tatché, Maria
Source: Nature Communications. 16(1)
Publisher information: Nature Publishing Group, 2025.
Publication year: 2025
Subjects: Benchmarking, Single-Cell Analysis/methods, Humans, DNA Copy Number Variations/genetics, RNA-Seq/methods, Computational Biology/methods, Neoplasms/genetics, Single-Cell Gene Expression Analysis
Description: Copy number variations (CNVs), the gain or loss of genomic regions, are associated with disease, especially cancer. Single cell technologies offer new possibilities to capture within-sample heterogeneity of CNVs and identify subclones relevant for tumor progression and treatment outcome. Several computational tools have been developed to identify CNVs from scRNA-seq data. However, an independent benchmarking of them is lacking. Here, we evaluate six popular methods in their ability to correctly identify ground truth CNVs, euploid cells and subclonal structures in 21 scRNA-seq datasets. We discover dataset-specific factors influencing the performance, including dataset size, the number and type of CNVs in the sample and the choice of the reference dataset. Methods which include allelic information perform more robustly for large droplet-based datasets, but require higher runtime. Furthermore, the methods differ in their additional functionalities. We offer a benchmarking pipeline to identify the optimal method for new datasets, and improve methods' performance.
Document type: Article
Language: English
ISSN: 2041-1723
DOI: 10.1038/s41467-025-62359-9
Access URL: https://research.rug.nl/en/publications/e0d47017-c370-4bf5-abd0-8854923f8733
https://hdl.handle.net/11370/e0d47017-c370-4bf5-abd0-8854923f8733
Rights: CC BY
Accession number: edsair.dris...01423..7d5426b6623dabf70fa5c40868dff043
Database: OpenAIRE