Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data

Bibliographic details
Title: Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data
Authors: Koptekin, Dilek; Yapar, Etka; Vural, Kıvılcım Başak; Sağlıcan, Ekin; Altınışık, N. Ezgi; Malaspinas, Anna-Sapfo; Alkan, Can; Somel, Mehmet
Other contributors: Lund University, Faculty of Science, Department of Biology, Sections at the Department of Biology, Biodiversity and Evolution (Originator); Lund University, Faculty of Science, Department of Biology, Research groups at the Department of Biology, Systematic Biology Group (Originator)
Source: Genome Biology. 26
Subject terms: Natural Sciences, Biological Sciences, Genetics and Genomics, Computer and Information Sciences, Computer Sciences
Description: We investigate alternative strategies against reference bias and postmortem damage (PMD) in low-coverage paleogenomes. Compared to alignment to the linear reference genome, we show that masking known polymorphic sites and graph alignment effectively remove reference bias, but only starting from raw read files. We next study approaches to overcome PMD: trimming, rescaling, and our newly developed algorithm, bamRefine (github.com/etkayapar/bamRefine and zenodo.org/records/14234666), which masks reads only at positions possibly affected by PMD. We propose graph alignment coupled with bamRefine as a simple strategy to minimize data loss and bias, and urge the community to publish FASTQ files.
Access URL: https://doi.org/10.1186/s13059-024-03462-w
Database: SwePub
ISSN: 1474-760X
DOI: 10.1186/s13059-024-03462-w
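
Illustrative note: the abstract describes masking reads only at positions possibly affected by PMD. The sketch below shows one way such position-specific masking could work. It is not bamRefine's actual implementation. The function name `mask_pmd_positions`, the `snps` dictionary, and the `window` parameter are hypothetical, and the sketch assumes a double-stranded library, an ungapped alignment, and read bases stored in reference orientation (as in BAM files), so that C-to-T damage appears near the left end and G-to-A damage near the right end.

```python
def mask_pmd_positions(seq, ref_start, snps, window=10):
    """Replace with 'N' the bases where PMD could mimic a SNP allele.

    seq       -- read bases in reference orientation (assumed ungapped)
    ref_start -- 0-based reference coordinate of seq[0]
    snps      -- hypothetical dict: reference position -> set of two alleles
    window    -- number of bases from each read end treated as at risk
    """
    masked = list(seq)
    n = len(seq)
    for i in range(n):
        alleles = snps.get(ref_start + i)
        if alleles is None:
            continue  # not a known variant site: keep the base
        near_left = i < window         # C-to-T damage expected here
        near_right = i >= n - window   # G-to-A damage expected here
        if (near_left and alleles == {"C", "T"}) or (
            near_right and alleles == {"G", "A"}
        ):
            masked[i] = "N"
    return "".join(masked)


# Toy example: only the C/T site near the left end and the G/A site
# near the right end are masked; all other bases are kept.
example_snps = {
    101: {"C", "T"},  # C/T SNP, 1 base from the left end: masked
    102: {"G", "A"},  # G/A SNP near the left end: kept
    103: {"A", "C"},  # transversion: never masked
    105: {"C", "T"},  # C/T SNP in the read interior: kept
    110: {"G", "A"},  # G/A SNP, 1 base from the right end: masked
}
result = mask_pmd_positions("ACGTACGTACGT", 100, example_snps, window=3)
print(result)
```

Unlike trimming, which discards every base within the window, this kind of masking removes a base only when its position and the variant type together make damage a plausible explanation, which matches the abstract's stated aim of minimizing data loss.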