A Note on Probabilistic Models over Strings: The Linear Algebra Approach

Bibliographic Details
Published in:	Bulletin of Mathematical Biology, Vol. 75, no. 12, pp. 2529-2550
Main Author:	Bouchard-Côté, Alexandre
Format:	Journal Article
Language:	English
Published:	New York: Springer US; Springer Nature B.V., 01.12.2013
Subjects:	Algorithms; Bayes Theorem; Cell Biology; Computational Biology; Evolution, Molecular; INDEL Mutation; Life Sciences; Linear algebra; Linear Models; Mathematical and Computational Biology; Mathematical Concepts; Mathematics; Mathematics and Statistics; Models, Genetic; Models, Statistical; Original Article; Phylogeny; Alignment; Graphical models; TKF91; Automata; Phylogenetics; Probabilistic models; Factor graphs; String transducers; Indel
ISSN:	0092-8240, 1522-9602

Description
Summary:	Probabilistic models over strings have played a key role in developing methods that treat indels as phylogenetically informative events. There is an extensive literature on using automata and transducers on phylogenies to do inference on these probabilistic models, in which an important theoretical question is the complexity of computing the normalization of a class of string-valued graphical models. This question has been investigated using tools from combinatorics, dynamic programming, and graph theory, and has practical applications in Bayesian phylogenetics. In this work, we revisit this theoretical question from a different point of view, based on linear algebra. The main contribution is a set of results based on this linear algebra view that facilitate the analysis and design of inference algorithms on string-valued graphical models. As an illustration, we use this method to give a new elementary proof of a known result on the complexity of inference on the “TKF91” model, a well-known probabilistic model over strings. Compared to previous work, our proof technique is easier to extend to other models, since it relies on a novel weak condition, triangular transducers, which is easy to establish in practice. The linear algebra view provides a concise way of describing transducer algorithms and their compositions, and opens the possibility of applying fast linear algebra libraries (for example, GPU-based ones), as well as low-rank matrix approximation methods, to string-valued inference problems.
DOI:	10.1007/s11538-013-9906-6
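
Illustration (not part of the catalog record or the article): the summary describes computing the normalization of a string-valued model with linear algebra. A minimal sketch of the general idea, assuming a toy two-state weighted automaton over the alphabet {a, b}, is given below. The function names, the matrices, and the choice of upper-triangular weights are all hypothetical and chosen for illustration; they are not the paper's construction, and the upper-triangular shape here is only loosely suggestive of the "triangular transducers" condition, whose definition the summary does not state. The sketch relies on the standard identity that the sum of weights over all finite strings is a matrix geometric series, which can be evaluated with a single linear solve.

```python
import itertools
import numpy as np


def normalization(init, transitions, final):
    """Sum the weights of all finite strings under a weighted automaton.

    The weight of a string s_1...s_n is init @ M[s_1] @ ... @ M[s_n] @ final.
    Summing over all strings gives init @ (I + M + M^2 + ...) @ final with
    M the sum of the per-symbol matrices, which equals
    init @ inv(I - M) @ final when the spectral radius of M is below 1.
    """
    total = sum(transitions.values())
    radius = max(abs(np.linalg.eigvals(total)))
    if radius >= 1.0:
        raise ValueError("series diverges: spectral radius >= 1")
    identity = np.eye(total.shape[0])
    # Solve (I - M) x = final rather than forming the inverse explicitly.
    return float(init @ np.linalg.solve(identity - total, final))


def brute_force(init, transitions, final, max_len):
    """Enumerate every string of length <= max_len and add up the weights."""
    z = 0.0
    for n in range(max_len + 1):
        for string in itertools.product(transitions, repeat=n):
            vec = init
            for symbol in string:
                vec = vec @ transitions[symbol]
            z += float(vec @ final)
    return z


# Hypothetical toy model: each row's outgoing weights plus its stopping
# weight sum to 1, so the model is a proper distribution over strings.
init = np.array([1.0, 0.0])
transitions = {
    "a": np.array([[0.2, 0.1], [0.0, 0.3]]),
    "b": np.array([[0.1, 0.2], [0.0, 0.1]]),
}
final = np.array([0.4, 0.6])

z_exact = normalization(init, transitions, final)
z_truncated = brute_force(init, transitions, final, max_len=12)
print(z_exact, z_truncated)
```

Because the toy weights are locally normalized, the exact value should be close to 1, and the truncated enumeration should approach it from below as max_len grows. The enumeration visits exponentially many strings, while the linear solve costs a fixed amount in the number of states, which is the practical appeal of the matrix formulation.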