Unified representation of genetic variants

Bibliographic details
Published in: Bioinformatics, Vol. 31, No. 13, pp. 2202-2204
Main authors: Tan, Adrian; Abecasis, Gonçalo R.; Kang, Hyun Min
Format: Journal Article
Language: English
Published: England: Oxford University Press, 1 July 2015
ISSN: 1367-4803, 1367-4811, 1460-2059
Description
Summary: A genetic variant can be represented in the Variant Call Format (VCF) in multiple different ways. Inconsistent representation of variants between variant callers and analyses will magnify discrepancies between them and complicate variant filtering and duplicate removal. We present a software tool, vt normalize, that normalizes the representation of genetic variants in the VCF. We formally define variant normalization as the consistent representation of genetic variants in an unambiguous and concise way and derive a simple general algorithm to enforce it. We demonstrate the inconsistent representation of variants across existing sequence analysis tools and show that our tool facilitates integration of diverse variant types and call sets. Availability and implementation: The source code is available for download at http://github.com/atks/vt. More detailed documentation is available at http://genome.sph.umich.edu/wiki/Variant_Normalization. Contact: hmkang@umich.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
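The summary mentions a simple general normalization algorithm without spelling it out. The following is a minimal sketch of one way such a procedure can work, assuming that "normalized" means left-aligned and parsimonious (no shared trailing bases, and no shared leading base beyond the one padding base an indel requires). The function name `normalize`, its signature, and the example sequence are illustrative assumptions; this is not the source code of vt.

```python
def normalize(pos, alleles, ref_seq):
    """Left-align and trim a variant (illustrative sketch, not vt's code).

    pos     -- 1-based position of the first REF base within ref_seq
    alleles -- [REF, ALT1, ...], all non-empty, at least two distinct
    ref_seq -- reference sequence of the chromosome as a string
    """
    alleles = list(alleles)
    if any(not a for a in alleles) or len(set(alleles)) < 2:
        raise ValueError("need non-empty alleles, at least two distinct")

    # Right-trim while every allele ends with the same base. Whenever
    # trimming empties an allele, pad all alleles with the reference
    # base to the left, which shifts the variant leftwards.
    while len({a[-1] for a in alleles}) == 1:
        if pos == 1 and min(len(a) for a in alleles) == 1:
            break  # nothing to the left to pad with
        alleles = [a[:-1] for a in alleles]
        if any(not a for a in alleles):
            pos -= 1
            alleles = [ref_seq[pos - 1] + a for a in alleles]

    # Left-trim while every allele starts with the same base and
    # would still keep at least one base afterwards.
    while all(len(a) >= 2 for a in alleles) and len({a[0] for a in alleles}) == 1:
        alleles = [a[1:] for a in alleles]
        pos += 1

    return pos, alleles


# Hypothetical reference with a CA repeat; three spellings of the same
# CA deletion collapse to one representation.
ref = "GGGCACACACAGGG"
print(normalize(6, ["CAC", "C"], ref))      # (3, ['GCA', 'G'])
print(normalize(3, ["GCACA", "GCA"], ref))  # (3, ['GCA', 'G'])
print(normalize(2, ["GGCA", "GG"], ref))    # (3, ['GCA', 'G'])
```

In this sketch, records that describe the same deletion inside the repeat end up with identical position and alleles, which is what makes duplicate removal and call-set comparison straightforward.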
Associate Editor: John Hancock
DOI: 10.1093/bioinformatics/btv112