Rapid and sensitive detection of genome contamination at scale with FCS-GX

Bibliographic details
Published in: Genome Biology, Vol. 25, No. 1, p. 60
Main authors: Astashyn, Alexander; Tvedte, Eric S.; Sweeney, Deacon; Sapojnikov, Victor; Bouk, Nathan; Joukov, Victor; Mozes, Eyal; Strope, Pooja K.; Sylla, Pape M.; Wagner, Lukas; Bidwell, Shelby L.; Brown, Larissa C.; Clark, Karen; Davis, Emily W.; Smith-White, Brian; Hlavina, Wratko; Pruitt, Kim D.; Schneider, Valerie A.; Murphy, Terence D.
Format: Journal Article
Language: English
Published: London: BioMed Central, 26 February 2024
Springer Nature B.V
BMC
ISSN: 1474-760X, 1474-7596
Description
Summary: Assembled genome sequences are being generated at an exponential rate. Here we present FCS-GX, part of NCBI’s Foreign Contamination Screen (FCS) tool suite, optimized to identify and remove contaminant sequences in new genomes. FCS-GX screens most genomes in 0.1–10 min. Testing FCS-GX on artificially fragmented genomes demonstrates high sensitivity and specificity for diverse contaminant species. We used FCS-GX to screen 1.6 million GenBank assemblies and identified 36.8 Gbp of contamination, comprising 0.16% of total bases, with half from 161 assemblies. We updated assemblies in NCBI RefSeq to reduce detected contamination to 0.01% of bases. FCS-GX is available at https://github.com/ncbi/fcs/ or https://doi.org/10.5281/zenodo.10651084.
DOI: 10.1186/s13059-024-03198-7