Median quartet tree search algorithms using optimal subtree prune and regraft

Gene trees can be different from the species tree due to biological processes and inference errors. One way to obtain a species tree is to find one that maximizes some measure of similarity to a set of gene trees. The number of shared quartets between a potential species tree and gene trees provides...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Algorithms for molecular biology Jg. 19; H. 1; S. 12 - 18
Hauptverfasser: Arasti, Shayesteh, Mirarab, Siavash
Format: Journal Article
Sprache:Englisch
Veröffentlicht: London BioMed Central 13.03.2024
Springer Nature B.V
BMC
Schlagworte:
ISSN:1748-7188, 1748-7188
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Gene trees can be different from the species tree due to biological processes and inference errors. One way to obtain a species tree is to find one that maximizes some measure of similarity to a set of gene trees. The number of shared quartets between a potential species tree and gene trees provides a statistically justifiable score; if maximized properly, it could result in a statistically consistent estimator of the species tree under several statistical models of discordance. However, finding the median quartet score tree, one that maximizes this score, is NP-Hard, motivating several existing heuristic algorithms. These heuristics do not follow the hill-climbing paradigm used extensively in phylogenetics. In this paper, we make theoretical contributions that enable an efficient hill-climbing approach. Specifically, we show that a subtree of size m can be placed optimally on a tree of size n in quasi-linear time with respect to n and (almost) independently of m . This result enables us to perform subtree prune and regraft (SPR) rearrangements as part of a hill-climbing search. We show that this approach can slightly improve upon the results of widely-used methods such as ASTRAL in terms of the optimization score but not necessarily accuracy.
Bibliographie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:1748-7188
1748-7188
DOI:10.1186/s13015-024-00257-3