Rapid and sensitive detection of genome contamination at scale with FCS-GX

Bibliographic Details
Published in: bioRxiv
Main Authors: Astashyn, Alexander; Tvedte, Eric S; Sweeney, Deacon; Sapojnikov, Victor; Bouk, Nathan; Joukov, Victor; Mozes, Eyal; Strope, Pooja K; Sylla, Pape M; Wagner, Lukas; Bidwell, Shelby L; Clark, Karen; Davis, Emily W; Smith-White, Brian; Hlavina, Wratko; Pruitt, Kim D; Schneider, Valerie A; Murphy, Terence D
Format: Journal Article Paper
Language: English
Published: Cold Spring Harbor Laboratory, United States, 06.06.2023
Edition: 1.1
ISSN: 2692-8205
Description
Summary: Assembled genome sequences are being generated at an exponential rate. Here we present FCS-GX, part of NCBI's Foreign Contamination Screen (FCS) tool suite, optimized to identify and remove contaminant sequences in new genomes. FCS-GX screens most genomes in 0.1-10 minutes. Testing FCS-GX on artificially fragmented genomes demonstrates sensitivity >95% for diverse contaminant species and specificity >99.93%. We used FCS-GX to screen 1.6 million GenBank assemblies and identified 36.8 Gbp of contamination (0.16% of total bases), with half from 161 assemblies. We updated assemblies in NCBI RefSeq to reduce detected contamination to 0.01% of bases. FCS-GX is available at https://github.com/ncbi/fcs/.
Competing Interest Statement: The authors have declared no competing interest.
DOI: 10.1101/2023.06.02.543519