Rapid and sensitive detection of genome contamination at scale with FCS-GX

Bibliographic details
Published in: bioRxiv
Main authors: Astashyn, Alexander; Tvedte, Eric S; Sweeney, Deacon; Sapojnikov, Victor; Bouk, Nathan; Joukov, Victor; Mozes, Eyal; Strope, Pooja K; Sylla, Pape M; Wagner, Lukas; Bidwell, Shelby L; Clark, Karen; Davis, Emily W; Smith-White, Brian; Hlavina, Wratko; Pruitt, Kim D; Schneider, Valerie A; Murphy, Terence D
Format: Journal Article Paper
Language: English
Publication details: United States, Cold Spring Harbor Laboratory, 06.06.2023
Edition: 1.1
ISSN: 2692-8205
Description
Summary: Assembled genome sequences are being generated at an exponential rate. Here we present FCS-GX, part of NCBI's Foreign Contamination Screen (FCS) tool suite, optimized to identify and remove contaminant sequences in new genomes. FCS-GX screens most genomes in 0.1-10 minutes. Testing FCS-GX on artificially fragmented genomes demonstrates sensitivity >95% for diverse contaminant species and specificity >99.93%. We used FCS-GX to screen 1.6 million GenBank assemblies and identified 36.8 Gbp of contamination (0.16% of total bases), with half from 161 assemblies. We updated assemblies in NCBI RefSeq to reduce detected contamination to 0.01% of bases. FCS-GX is available at https://github.com/ncbi/fcs/.
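The sensitivity, specificity, and percent-of-bases figures in the summary follow standard definitions, which the short Python sketch below illustrates. It is not part of FCS-GX and does not reproduce the authors' evaluation: the function names are made up for illustration and the classification counts are hypothetical. Only 36.8 Gbp and 0.16% come from the summary, and because both are rounded, the implied total is approximate.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly contaminant sequences that were flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-contaminant sequences that were left unflagged."""
    return true_neg / (true_neg + false_pos)


def implied_total_gbp(contaminant_gbp: float, percent_of_total: float) -> float:
    """Total bases (in Gbp) implied by a contaminant amount and its share."""
    return contaminant_gbp / (percent_of_total / 100.0)


# Hypothetical counts, chosen only to show the formulas at work.
sens = sensitivity(true_pos=960, false_neg=40)
spec = specificity(true_neg=99_950, false_pos=50)

# Figures quoted in the summary: 36.8 Gbp is 0.16% of all screened bases.
total_gbp = implied_total_gbp(36.8, 0.16)

print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
print(f"implied total screened: about {total_gbp:,.0f} Gbp")
```

Under these assumptions, the quoted figures imply that the screened GenBank assemblies total roughly 23,000 Gbp (about 23 Tbp).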
Competing Interest Statement: The authors have declared no competing interest.
DOI: 10.1101/2023.06.02.543519