AlienTrimmer: A tool to quickly and accurately trim off multiple short contaminant sequences from high-throughput sequencing reads


Bibliographic details
Published in: Genomics (San Diego, Calif.), Vol. 102, No. 5-6, pp. 500-506
Main authors: Criscuolo, Alexis; Brisse, Sylvain
Format: Journal Article
Language: English
Published: United States, Elsevier Inc., 1 November 2013
ISSN: 0888-7543, 1089-8646
Online access: Full text
Description
Summary: Contaminant oligonucleotide sequences such as primers and adapters can occur at both ends of high-throughput sequencing (HTS) reads. AlienTrimmer was developed to detect and remove such contaminants. Based on the decomposition of specified alien nucleotide sequences into k-mers, AlienTrimmer determines whether such alien k-mers occur at one or at both read ends by using a simple polynomial algorithm. Therefore, AlienTrimmer can process typical HTS single- or paired-end files with millions of reads in several minutes with very low computer resources. Based on the analysis of both simulated and real-case Illumina®, 454™ and Ion Torrent™ read data, we show that AlienTrimmer performs with excellent accuracy and speed in comparison with other trimming tools. The program is freely available at ftp://ftp.pasteur.fr/pub/gensoft/projects/AlienTrimmer/.
• Removal of alien sequences (adapters, primers) from raw reads improves the quality of results from downstream analyses.
• AlienTrimmer detects and removes multiple alien sequences at both ends of sequence reads.
• AlienTrimmer performs accurately and has a fast running time.
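The k-mer idea in the summary can be illustrated with a short Python sketch. This is not the published AlienTrimmer algorithm, whose details are not given in this record. It is a simplified greedy variant that assumes exact k-mer matches. The function names (`kmers`, `build_alien_index`, `trim_read`), the choice of k = 5, and the example sequences are all hypothetical and serve only as illustration.

```python
def kmers(seq, k):
    """Return the set of all k-mers (length-k substrings) of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_alien_index(aliens, k):
    """Decompose every alien sequence into k-mers and pool them in one set."""
    index = set()
    for alien in aliens:
        index |= kmers(alien, k)
    return index


def trim_read(read, alien_index, k):
    """Trim bases covered by alien k-mers from both ends of a read.

    Simplifying assumption: only exact k-mer matches count, and only
    uninterrupted runs of covered bases touching a read end are removed.
    """
    n = len(read)
    if n < k:
        return read  # too short to contain any k-mer

    upper = read.upper()
    covered = [False] * n  # covered[j]: base j lies inside an alien k-mer
    for i in range(n - k + 1):
        if upper[i:i + k] in alien_index:
            for j in range(i, i + k):
                covered[j] = True

    start = 0
    while start < n and covered[start]:
        start += 1  # skip the alien-covered prefix (5' end)
    end = n
    while end > start and covered[end - 1]:
        end -= 1  # skip the alien-covered suffix (3' end)
    return read[start:end]


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGC"      # illustrative alien sequence
    insert = "ACGTTGCAACGGTTCA"    # illustrative genuine read content
    index = build_alien_index([adapter], k=5)
    print(trim_read(insert + adapter, index, k=5))
```

Pooling the alien k-mers in a set makes each lookup constant time on average, so the work per read grows with the read length times k. A practical tool would also have to tolerate sequencing errors inside the contaminant and handle quality scores, both of which this sketch ignores.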
Bibliography: http://dx.doi.org/10.1016/j.ygeno.2013.07.011
DOI: 10.1016/j.ygeno.2013.07.011