NCBI RefSeq: reference sequence standards through 25 years of curation and annotation

Bibliographic Details
Published in: Nucleic Acids Research, Vol. 53, no. D1, pp. D243-D257
Main Authors: Goldfarb, Tamara; Kodali, Vamsi K; Pujar, Shashikant; Brover, Vyacheslav; Robbertse, Barbara; Farrell, Catherine M; Oh, Dong-Ha; Astashyn, Alexander; Ermolaeva, Olga; Haddad, Diana; Hlavina, Wratko; Hoffman, Jinna; Jackson, John D; Joardar, Vinita S; Kristensen, David; Masterson, Patrick; McGarvey, Kelly M; McVeigh, Richard; Mozes, Eyal; Murphy, Michael R; Schafer, Susan S; Souvorov, Alexander; Spurrier, Brett; Strope, Pooja K; Sun, Hanzhen; Vatsan, Anjana R; Wallin, Craig; Webb, David; Brister, J Rodney; Hatcher, Eneida; Kimchi, Avi; Klimke, William; Marchler-Bauer, Aron; Pruitt, Kim D; Thibaud-Nissen, Françoise; Murphy, Terence D
Format: Journal Article
Language: English
Published: England: Oxford University Press, 06.01.2025
ISSN: 0305-1048, 1362-4962
Description
Summary: Reference sequences and annotations serve as the foundation for many lines of research today, from organism and sequence identification to providing a core description of the genes, transcripts and proteins found in an organism's genome. Interpretation of data including transcriptomics, proteomics, sequence variation and comparative analyses based on reference gene annotations informs our understanding of gene function and possible disease mechanisms, leading to new biomedical discoveries. The Reference Sequence (RefSeq) resource created at the National Center for Biotechnology Information (NCBI) leverages both automatic processes and expert curation to create a robust set of reference sequences of genomic, transcript and protein data spanning the tree of life. RefSeq continues to refine its annotation and quality control processes and utilize better quality genomes resulting from advances in sequencing technologies as well as RNA-Seq data to produce high-quality annotated genomes, ortholog predictions across more organisms and other products that are easily accessible through multiple NCBI resources. This report summarizes the current status of the eukaryotic, prokaryotic and viral RefSeq resources, with a focus on eukaryotic annotation, the increase in taxonomic representation and the effect it will have on comparative genomics. The RefSeq resource is publicly accessible at https://www.ncbi.nlm.nih.gov/refseq.
The first two authors should be regarded as Joint First Authors.
DOI: 10.1093/nar/gkae1038