Fast tumor phylogeny regression via tree-structured dual dynamic programming

Bibliographic details
Published in: Bioinformatics (Oxford, England), Volume 41, Issue Supplement_1, pp. i170-i179
Main authors: Schmidt, Henri; Qi, Yuanyuan; Raphael, Benjamin J; El-Kebir, Mohammed
Format: Journal Article
Language: English
Published: England, Oxford Publishing Limited (England), 1 July 2025
Oxford University Press
ISSN: 1367-4803, 1367-4811
Description
Summary: Reconstructing the evolutionary history of tumors from bulk DNA sequencing of multiple tissue samples remains a challenging computational problem, requiring simultaneous deconvolution of the tumor tissue and inference of its evolutionary history. Recently, phylogenetic reconstruction methods have made significant progress by breaking the reconstruction problem into two parts: a regression problem over a fixed topology and a search over tree space. While effective techniques have been developed for the latter search problem, the regression problem remains a bottleneck in both method design and implementation due to the lack of fast, specialized algorithms. Here, we introduce fastppm, a fast tool to solve the perfect phylogeny regression problem via tree-structured dual dynamic programming. fastppm supports arbitrary, separable convex loss functions including the ℓ2, piecewise linear, binomial and beta-binomial loss and provides asymptotic improvements for the ℓ2 and piecewise linear loss over existing algorithms. We find that fastppm empirically outperforms both specialized and general purpose regression algorithms, obtaining 50-450× speedups while providing as accurate solutions as existing approaches. Incorporating fastppm into several phylogeny inference algorithms immediately yields up to 400× speedups, requiring only a small change to the program code of existing software. Finally, fastppm enables analysis of low-coverage bulk DNA sequencing data on both simulated data and in a patient-derived mouse model of colorectal cancer, outperforming state-of-the-art phylogeny inference algorithms in terms of both accuracy and runtime. fastppm is implemented in C++ and available as both a command-line interface and Python library at github.com/elkebir-group/fastppm.git under an MIT license.
Note: Henri Schmidt and Yuanyuan Qi contributed equally.
DOI: 10.1093/bioinformatics/btaf235