Maximum parsimony distance on phylogenetic trees: A linear kernel and constant factor approximation algorithm

Bibliographic Details
Title: Maximum parsimony distance on phylogenetic trees: A linear kernel and constant factor approximation algorithm
Authors: Leen Stougie, Mark Jones, Steven Kelk
Contributors: Sagot, Marie-France
Source: Journal of Computer and System Sciences. 117:165-181
Publication Status: Preprint
Publisher Information: Elsevier BV, 2021.
Publication Year: 2021
Subject Terms: Maximum agreement forest, FOS: Computer and information sciences, COMPLEXITY, 0102 computer and information sciences, [INFO] Computer Science [cs], COMPATIBILITY, 01 natural sciences, [SDV] Life Sciences [q-bio], Phylogenetics, Computer Science - Data Structures and Algorithms, Fixed parameter tractability, Data Structures and Algorithms (cs.DS), Maximum parsimony, AGREEMENT FOREST
Description: Maximum parsimony distance is a measure used to quantify the dissimilarity of two unrooted phylogenetic trees. It is NP-hard to compute, and very few positive algorithmic results are known due to its complex combinatorial structure. Here we address this shortcoming by showing that the problem is fixed-parameter tractable. We do this by establishing a linear kernel, i.e., that after applying certain reduction rules the resulting instance has size bounded by a linear function of the distance. As powerful corollaries to this result we prove that the problem permits a polynomial-time constant-factor approximation algorithm; that the treewidth of a natural auxiliary graph structure encountered in phylogenetics is bounded by a function of the distance; and that the distance is within a constant factor of the size of a maximum agreement forest of the two trees, a well-studied object in phylogenetics.
27 pages, 7 figures
Document Type: Article
File Description: application/pdf
Language: English
ISSN: 0022-0000
DOI: 10.1016/j.jcss.2020.10.003
DOI: 10.48550/arxiv.2004.02298
Access URL: http://arxiv.org/abs/2004.02298
https://hdl.handle.net/1871.1/0522f5f0-ab66-4220-84ab-f66b5b6df576
https://doi.org/10.1016/j.jcss.2020.10.003
https://research.vu.nl/en/publications/0522f5f0-ab66-4220-84ab-f66b5b6df576
https://cris.maastrichtuniversity.nl/en/publications/d9294f35-8b17-42ce-a4db-36cabf9c7d63
https://inria.hal.science/hal-03498430v1
https://inria.hal.science/hal-03498430v1/document
https://ir.cwi.nl/pub/30410
https://repository.tudelft.nl/islandora/object/uuid%3A8d5fc924-a45d-4472-a20c-54d511c45632/datastream/OBJ/download
https://research.tudelft.nl/en/publications/maximum-parsimony-distance-on-phylogenetic-trees-a-linear-kernel-
https://www.narcis.nl/publication/RecordID/oai%3Acwi.nl%3A30410
https://research.vu.nl/en/publications/maximum-parsimony-distance-on-phylogenetic-trees-a-linear-kernel-
https://ir.cwi.nl/pub/30410/30410.pdf
http://resolver.tudelft.nl/uuid:8d5fc924-a45d-4472-a20c-54d511c45632
Rights: CC BY
arXiv Non-Exclusive Distribution
Accession Number: edsair.doi.dedup.....677f9b51e4e2d56f977cbdc9757ff034
Database: OpenAIRE