Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation

Detailed bibliography
Published in: Nucleic Acids Research, Volume 46, Issue D1, pp. D221-D228
Main authors: Pujar, Shashikant, O'Leary, Nuala A, Farrell, Catherine M, Loveland, Jane E, Mudge, Jonathan M, Wallin, Craig, Girón, Carlos G, Diekhans, Mark, Barnes, If, Bennett, Ruth, Berry, Andrew E, Cox, Eric, Davidson, Claire, Goldfarb, Tamara, Gonzalez, Jose M, Hunt, Toby, Jackson, John, Joardar, Vinita, Kay, Mike P, Kodali, Vamsi K, Martin, Fergal J, McAndrews, Monica, McGarvey, Kelly M, Murphy, Michael, Rajput, Bhanu, Rangwala, Sanjida H, Riddick, Lillian D, Seal, Ruth L, Suner, Marie-Marthe, Webb, David, Zhu, Sophia, Aken, Bronwen L, Bruford, Elspeth A, Bult, Carol J, Frankish, Adam, Murphy, Terence, Pruitt, Kim D
Format: Journal Article
Language: English
Published: England, Oxford University Press, 4 January 2018
ISSN: 0305-1048, 1362-4962
Abstract: The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community.
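The core idea in the abstract, keeping only coding regions that two independent annotations describe identically, can be illustrated with a small Python sketch. This is a toy model and not the CCDS pipeline itself: the function name consensus_cds, the tuple layout for a coding region, and all transcript names and coordinates are assumptions invented for illustration, and the real process also applies quality assurance checks and manual curation that are not modelled here.

```python
from typing import Dict, List, Tuple

# Assumed toy representation of a coding region:
# (chromosome, strand, ((exon_start, exon_end), ...)).
CDS = Tuple[str, str, Tuple[Tuple[int, int], ...]]


def consensus_cds(set_a: Dict[str, CDS], set_b: Dict[str, CDS]) -> List[CDS]:
    """Return the coding regions that appear with identical coordinates in both sets."""
    shared = set(set_a.values()) & set(set_b.values())
    return sorted(shared)


# Hypothetical annotation sets; names and coordinates are made up.
annotation_a = {
    "txA1": ("chr1", "+", ((100, 200), (300, 400))),
    "txA2": ("chr2", "-", ((50, 150),)),
}
annotation_b = {
    "txB1": ("chr1", "+", ((100, 200), (300, 400))),
    "txB2": ("chr2", "-", ((50, 160),)),  # differs by one exon end, so no match
}

matches = consensus_cds(annotation_a, annotation_b)
for chromosome, strand, exons in matches:
    print(chromosome, strand, exons)
```

In this sketch only the chr1 region is reported, because the chr2 regions differ in a single coordinate; exact agreement is what qualifies a region as a candidate for a shared identifier.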
AuthorAffiliation Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA
University of California Santa Cruz Genomics Institute, Santa Cruz, CA 95064, USA
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
HUGO Gene Nomenclature Committee, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
Author_xml – sequence: 1
  givenname: Shashikant
  surname: Pujar
  fullname: Pujar, Shashikant
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 2
  givenname: Nuala A
  surname: O'Leary
  fullname: O'Leary, Nuala A
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 3
  givenname: Catherine M
  surname: Farrell
  fullname: Farrell, Catherine M
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 4
  givenname: Jane E
  orcidid: 0000-0002-7669-2934
  surname: Loveland
  fullname: Loveland, Jane E
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 5
  givenname: Jonathan M
  surname: Mudge
  fullname: Mudge, Jonathan M
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 6
  givenname: Craig
  surname: Wallin
  fullname: Wallin, Craig
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 7
  givenname: Carlos G
  orcidid: 0000-0002-0935-7271
  surname: Girón
  fullname: Girón, Carlos G
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 8
  givenname: Mark
  surname: Diekhans
  fullname: Diekhans, Mark
  organization: University of California Santa Cruz Genomics Institute, Santa Cruz, CA 95064, USA
– sequence: 9
  givenname: If
  surname: Barnes
  fullname: Barnes, If
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 10
  givenname: Ruth
  surname: Bennett
  fullname: Bennett, Ruth
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 11
  givenname: Andrew E
  surname: Berry
  fullname: Berry, Andrew E
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 12
  givenname: Eric
  surname: Cox
  fullname: Cox, Eric
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 13
  givenname: Claire
  surname: Davidson
  fullname: Davidson, Claire
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 14
  givenname: Tamara
  surname: Goldfarb
  fullname: Goldfarb, Tamara
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 15
  givenname: Jose M
  surname: Gonzalez
  fullname: Gonzalez, Jose M
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 16
  givenname: Toby
  surname: Hunt
  fullname: Hunt, Toby
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 17
  givenname: John
  surname: Jackson
  fullname: Jackson, John
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 18
  givenname: Vinita
  surname: Joardar
  fullname: Joardar, Vinita
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 19
  givenname: Mike P
  surname: Kay
  fullname: Kay, Mike P
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 20
  givenname: Vamsi K
  surname: Kodali
  fullname: Kodali, Vamsi K
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 21
  givenname: Fergal J
  orcidid: 0000-0002-1672-050X
  surname: Martin
  fullname: Martin, Fergal J
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 22
  givenname: Monica
  surname: McAndrews
  fullname: McAndrews, Monica
  organization: Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA
– sequence: 23
  givenname: Kelly M
  surname: McGarvey
  fullname: McGarvey, Kelly M
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 24
  givenname: Michael
  surname: Murphy
  fullname: Murphy, Michael
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 25
  givenname: Bhanu
  surname: Rajput
  fullname: Rajput, Bhanu
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 26
  givenname: Sanjida H
  surname: Rangwala
  fullname: Rangwala, Sanjida H
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 27
  givenname: Lillian D
  surname: Riddick
  fullname: Riddick, Lillian D
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 28
  givenname: Ruth L
  surname: Seal
  fullname: Seal, Ruth L
  organization: HUGO Gene Nomenclature Committee, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 29
  givenname: Marie-Marthe
  orcidid: 0000-0002-0380-7171
  surname: Suner
  fullname: Suner, Marie-Marthe
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 30
  givenname: David
  surname: Webb
  fullname: Webb, David
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 31
  givenname: Sophia
  surname: Zhu
  fullname: Zhu, Sophia
  organization: Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA
– sequence: 32
  givenname: Bronwen L
  surname: Aken
  fullname: Aken, Bronwen L
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 33
  givenname: Elspeth A
  orcidid: 0000-0002-8380-5247
  surname: Bruford
  fullname: Bruford, Elspeth A
  organization: HUGO Gene Nomenclature Committee, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 34
  givenname: Carol J
  surname: Bult
  fullname: Bult, Carol J
  organization: Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA
– sequence: 35
  givenname: Adam
  surname: Frankish
  fullname: Frankish, Adam
  organization: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
– sequence: 36
  givenname: Terence
  surname: Murphy
  fullname: Murphy, Terence
  email: murphyte@ncbi.nlm.nih.gov
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
– sequence: 37
  givenname: Kim D
  surname: Pruitt
  fullname: Pruitt, Kim D
  organization: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
BackLink https://www.ncbi.nlm.nih.gov/pubmed/29126148 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright Published by Oxford University Press on behalf of Nucleic Acids Research 2017. 2018
DOI 10.1093/nar/gkx1031
Discipline Anatomy & Physiology
Chemistry
DocumentTitleAlternate Database issue
EISSN 1362-4962
EndPage D228
ExternalDocumentID PMC5753299
29126148
10_1093_nar_gkx1031
10.1093/nar/gkx1031
Genre Research Support, N.I.H., Intramural
Research Support, Non-U.S. Gov't
Journal Article
Research Support, N.I.H., Extramural
GeographicLocations United States
GrantInformation_xml – fundername: NHGRI NIH HHS
  grantid: U41 HG007234
– fundername: NHGRI NIH HHS
  grantid: U41 HG003345
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue D1
Language English
License This work is written by (a) US Government employee(s) and is in the public domain in the US.
Published by Oxford University Press on behalf of Nucleic Acids Research 2017.
OpenAccessLink http://dx.doi.org/10.1093/nar/gkx1031
PMID 29126148
PublicationDate 2018-01-04
PublicationPlace England
PublicationTitle Nucleic acids research
PublicationTitleAlternate Nucleic Acids Res
PublicationYear 2018
Publisher Oxford University Press
References ( key 20180103185555_B7) 2017; 45
( key 20180103185555_B11) 2014; 7
( key 20180103185555_B8) 2017; 1488
( key 20180103185555_B12) 2015; 2015
( key 20180103185555_B15) 2014; 42
( key 20180103185555_B20) 2017; 27
( key 20180103185555_B3) 2009; 19
( key 20180103185555_B5) 2014; 42
International Nucleotide Sequence Database, C. ( key 20180103185555_B13) 2016; 44
( key 20180103185555_B25) 2003; 100
( key 20180103185555_B26) 2012; 22
( key 20180103185555_B2) 2016; 2016
( key 20180103185555_B1) 2016; 44
( key 20180103185555_B6) 2012; 22
( key 20180103185555_B22) 2017; 27
( key 20180103185555_B23) 2005; 6
( key 20180103185555_B21) 2015; 37
( key 20180103185555_B24) 2008; 5
( key 20180103185555_B16) 2011; 27
( key 20180103185555_B18) 2011; 52
( key 20180103185555_B14) 2017; 1558
( key 20180103185555_B9) 2015; 112
( key 20180103185555_B17) 2016; 213
( key 20180103185555_B10) 2015; 14
( key 20180103185555_B19) 2011; 9
( key 20180103185555_B4) 2012; 2012
References_xml – volume: 14
  start-page: 1880
  year: 2015
  ident: key 20180103185555_B10
  article-title: Most highly expressed protein-coding genes have a single dominant isoform
  publication-title: J. Proteome Res.
  doi: 10.1021/pr501286b
– volume: 1558
  start-page: 41
  year: 2017
  ident: key 20180103185555_B14
  article-title: UniProt Protein Knowledgebase
  publication-title: Methods Mol. Biol.
  doi: 10.1007/978-1-4939-6783-4_2
– volume: 6
  start-page: R44
  year: 2005
  ident: key 20180103185555_B23
  article-title: The Sequence Ontology: a tool for the unification of genome annotations
  publication-title: Genome Biol.
  doi: 10.1186/gb-2005-6-5-r44
– volume: 213
  start-page: 343
  year: 2016
  ident: key 20180103185555_B17
  article-title: TANGO1 and Mia2/cTAGE5 (TALI) cooperate to export bulky pre-chylomicrons/VLDLs from the endoplasmic reticulum
  publication-title: J. Cell Biol.
  doi: 10.1083/jcb.201603072
– volume: 52
  start-page: 1775
  year: 2011
  ident: key 20180103185555_B18
  article-title: Reduced cholesterol and triglycerides in mice with a mutation in Mia2, a liver protein that localizes to ER exit sites
  publication-title: J. Lipid Res.
  doi: 10.1194/jlr.M017277
– volume: 44
  start-page: D733
  year: 2016
  ident: key 20180103185555_B1
  article-title: Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/gkv1189
– volume: 45
  start-page: D619
  year: 2017
  ident: key 20180103185555_B7
  article-title: Genenames.org: the HGNC and VGNC resources in 2017
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/gkw1033
– volume: 2015
  start-page: 626
  year: 2015
  ident: key 20180103185555_B12
  article-title: Whole-exome enrichment with the Agilent SureSelect human all exon platform
  publication-title: Cold Spring Harb. Protoc.
– volume: 42
  start-page: D865
  year: 2014
  ident: key 20180103185555_B5
  article-title: Current status and new features of the Consensus Coding Sequence database
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/gkt1059
– volume: 5
  start-page: 621
  year: 2008
  ident: key 20180103185555_B24
  article-title: Mapping and quantifying mammalian transcriptomes by RNA-Seq
  publication-title: Nat. Methods
  doi: 10.1038/nmeth.1226
SourceID pubmedcentral
proquest
pubmed
crossref
oup
SourceType Open Access Repository
Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage D221
SubjectTerms Animals
Consensus Sequence
Data Curation - methods
Data Curation - standards
Database Issue
Databases, Genetic - standards
Guidelines as Topic
Humans
Mice
Molecular Sequence Annotation
National Library of Medicine (U.S.)
Open Reading Frames
United States
User-Computer Interface
Title Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation
URI https://www.ncbi.nlm.nih.gov/pubmed/29126148
https://www.proquest.com/docview/1963271996
https://pubmed.ncbi.nlm.nih.gov/PMC5753299
Volume 46