Efficient ancestry and mutation simulation with msprime 1.0

Detailed description

Bibliographic details
Published in: Genetics (Austin), Vol. 220, No. 3
Main authors: Baumdicker, Franz, Bisschop, Gertjan, Goldstein, Daniel, Gower, Graham, Ragsdale, Aaron P, Tsambos, Georgia, Zhu, Sha, Eldon, Bjarki, Ellerman, E Castedo, Galloway, Jared G, Gladstein, Ariella L, Gorjanc, Gregor, Guo, Bing, Jeffery, Ben, Kretzschmar, Warren W, Lohse, Konrad, Matschiner, Michael, Nelson, Dominic, Pope, Nathaniel S, Quinto-Cortés, Consuelo D, Rodrigues, Murillo F, Saunack, Kumar, Sellinger, Thibaut, Thornton, Kevin, van Kemenade, Hugo, Wohns, Anthony W, Wong, Yan, Gravel, Simon, Kern, Andrew D, Koskela, Jere, Ralph, Peter L, Kelleher, Jerome
Format: Journal Article
Language: English
Published: United States, Oxford University Press, 3 March 2022
ISSN: 1943-2631
Online access: Full text
Description
Summary: Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on the succinct tree sequence data structure and the tskit library. We summarize msprime’s many features, and show that its performance is excellent, often many times faster and more memory efficient than specialized alternatives. These high-performance features have been thoroughly tested and validated, and built using a collaborative, open source development model, which reduces duplication of effort and promotes software quality via community engagement. Since its introduction in 2016, the msprime simulator has grown in popularity and is now one of the most commonly used tools in population genetics. This article marks the 1.0 release of msprime and summarizes the many features it has accumulated through an open source community development model. Despite its generality, msprime’s performance is excellent, in many cases orders of magnitude faster and more memory efficient than more specialized methods.
DOI: 10.1093/genetics/iyab229