DECIMER.ai: an open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications

Bibliographic details
Published in: Nature communications Volume 14; issue 1; p. 5045 - 18
Main authors: Rajan, Kohulan; Brinkhaus, Henning Otto; Agea, M. Isabel; Zielesny, Achim; Steinbeck, Christoph
Format: Journal Article
Language: English
Published: London, Nature Publishing Group UK, 19.08.2023
Nature Publishing Group
Nature Portfolio
ISSN: 2041-1723
Abstract The number of publications describing chemical structures has increased steadily over the last decades. However, the majority of published chemical information is currently not available in machine-readable form in public databases. It remains a challenge to automate the process of information extraction in a way that requires less manual intervention, especially the mining of chemical structure depictions. As an open-source platform that leverages recent advancements in deep learning, computer vision, and natural language processing, DECIMER.ai (Deep lEarning for Chemical IMagE Recognition) strives to automatically segment, classify, and translate chemical structure depictions from the printed literature. The segmentation and classification tools are the only openly available packages of their kind, and the optical chemical structure recognition (OCSR) core application yields outstanding performance on all benchmark datasets. The source code, the trained models and the datasets developed in this work have been published under permissive licences. An instance of the DECIMER web application is available at https://decimer.ai.
Chemical structures are typically published as non-machine-readable images in scientific literature. Here, the authors present DECIMER.ai, an open platform for translating chemical structures in publications into machine-readable representations.
ArticleNumber 5045
Author Zielesny, Achim
Agea, M. Isabel
Rajan, Kohulan
Brinkhaus, Henning Otto
Steinbeck, Christoph
Author_xml – sequence: 1
  givenname: Kohulan
  orcidid: 0000-0003-1066-7792
  surname: Rajan
  fullname: Rajan, Kohulan
  organization: Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena
– sequence: 2
  givenname: Henning Otto
  orcidid: 0000-0002-6664-2183
  surname: Brinkhaus
  fullname: Brinkhaus, Henning Otto
  organization: Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena
– sequence: 3
  givenname: M. Isabel
  orcidid: 0000-0002-3017-7742
  surname: Agea
  fullname: Agea, M. Isabel
  organization: Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague
– sequence: 4
  givenname: Achim
  orcidid: 0000-0003-0722-4229
  surname: Zielesny
  fullname: Zielesny, Achim
  organization: Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences
– sequence: 5
  givenname: Christoph
  orcidid: 0000-0001-6966-0814
  surname: Steinbeck
  fullname: Steinbeck, Christoph
  email: christoph.steinbeck@uni-jena.de
  organization: Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena
ContentType Journal Article
Copyright The Author(s) 2023. corrected publication 2023
The Author(s) 2023. corrected publication 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
2023. Springer Nature Limited.
Springer Nature Limited 2023
DOI 10.1038/s41467-023-40782-0
Discipline Biology
EISSN 2041-1723
EndPage 18
ExternalDocumentID oai_doaj_org_article_8842925cd3054b7f8e7bd34999bfbca2
PMC10439916
10_1038_s41467_023_40782_0
GrantInformation_xml – fundername: Deutsche Forschungsgemeinschaft (German Research Foundation)
  grantid: 239748522
  funderid: https://doi.org/10.13039/501100001659
– fundername: Carl-Zeiss-Stiftung (Carl Zeiss Foundation)
  funderid: https://doi.org/10.13039/100007569
– fundername: Ministerstvo Školství, Mládeže a Tělovýchovy (Ministry of Education, Youth and Sports)
  grantid: LM2023052
  funderid: https://doi.org/10.13039/501100001823
ISICitedReferencesCount 35
ISSN 2041-1723
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
ORCID 0000-0001-6966-0814
0000-0003-0722-4229
0000-0002-3017-7742
0000-0002-6664-2183
0000-0003-1066-7792
PMID 37598180
PQID 2853130938
PQPubID 546298
PageCount 18
PublicationCentury 2000
PublicationDate 2023-08-19
PublicationDateYYYYMMDD 2023-08-19
PublicationDate_xml – month: 08
  year: 2023
  text: 2023-08-19
  day: 19
PublicationDecade 2020
PublicationPlace London
PublicationPlace_xml – name: London
PublicationTitle Nature communications
PublicationTitleAbbrev Nat Commun
PublicationYear 2023
Publisher Nature Publishing Group UK
Nature Publishing Group
Nature Portfolio
– volume: 13
  start-page: 20
  year: 2021
  ident: CR26
  article-title: DECIMER-segmentation: automated extraction of chemical structure depictions from scientific literature
  publication-title: J. Cheminform.
  doi: 10.1186/s13321-021-00496-1
– volume: 9
  year: 2022
  ident: CR45
  article-title: Perovskite- and dye-sensitized solar-cell device databases auto-generated using ChemDataExtractor
  publication-title: Sci. Data
  doi: 10.1038/s41597-022-01355-w
– volume: 143
  start-page: 18820
  year: 2021
  end-page: 18826
  ident: CR5
  article-title: The open reaction database
  publication-title: J. Am. Chem. Soc.
  doi: 10.1021/jacs.1c09820
– volume: 30
  start-page: 302
  year: 1990
  end-page: 307
  ident: CR9
  article-title: Computational perception and recognition of digitized molecular structures
  publication-title: J. Chem. Inf. Model.
– volume: 9
  start-page: 33
  year: 2017
  ident: CR28
  article-title: The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching
  publication-title: J. Cheminform.
  doi: 10.1186/s13321-017-0220-4
– volume: 14
  start-page: 61
  year: 2022
  ident: CR20
  article-title: Review of techniques and models used in optical chemical structure recognition in images and scanned documents
  publication-title: J. Cheminform.
  doi: 10.1186/s13321-022-00642-3
– volume: 14
  start-page: 31
  year: 2022
  ident: CR37
  article-title: RanDepict: random chemical structure depiction generator
  publication-title: J. Cheminform.
  doi: 10.1186/s13321-022-00609-4
– ident: CR66
– ident: CR47
– ident: CR72
– volume: 59
  start-page: 1017
  year: 2019
  end-page: 1029
  ident: CR16
  article-title: Molecular structure extraction from documents using deep learning
  publication-title: J. Chem. Inf. Model.
  doi: 10.1021/acs.jcim.8b00669
– ident: CR14
– ident: CR53
– ident: CR89
– ident: CR30
– volume: 11
  start-page: 37
  year: 1912
  end-page: 50
  ident: CR33
  article-title: The distribution of the flora in the alpine zone.1
  publication-title: New Phytol.
  doi: 10.1111/j.1469-8137.1912.tb05611.x
– volume: 14
  start-page: 34
  year: 2022
  ident: CR31
  article-title: PIKAChU: a Python-based informatics kit for analysing chemical units
  publication-title: J. Cheminform.
  doi: 10.1186/s13321-022-00616-5
– volume: 60
  start-page: 4506
  year: 2020
  end-page: 4517
  ident: CR21
  article-title: ChemGrapher: optical graph recognition of chemical compounds by deep learning
  publication-title: J. Chem. Inf. Model.
  doi: 10.1021/acs.jcim.0c00459
– ident: CR56
– ident: CR86
– volume: 49
  start-page: 780
  year: 2009
  end-page: 787
  ident: CR40
  article-title: CLiDE Pro: the latest generation of CLiDE, a tool for optical chemical structure recognition
  publication-title: J. Chem. Inf. Model.
  doi: 10.1021/ci800449t
– ident: CR63
– ident: CR27
– volume: 47
  start-page: 458
  year: 2005
  end-page: 472
  ident: CR79
  article-title: Estimation of the Youden Index and its associated cutoff point
  publication-title: Biom. J.
  doi: 10.1002/bimj.200410135
– volume: 13
  start-page: 61
  year: 2021
  ident: CR18
  article-title: DECIMER 1.0: deep learning for chemical image recognition using transformers
  publication-title: J. Cheminform.
  doi: 10.1186/s13321-021-00538-8
– volume: 12
  start-page: 10622
  year: 2021
  end-page: 10633
  ident: CR58
  article-title: ChemPix: automated recognition of hand-drawn hydrocarbon structures using deep learning
  publication-title: Chem. Sci.
  doi: 10.1039/D1SC02957F
– ident: CR69
– volume: 1
  start-page: 045024
  year: 2020
  ident: CR54
  article-title: Self-referencing embedded strings (SELFIES): a 100% robust molecular string representation
  publication-title: Mach. Learn. Sci. Technol.
  doi: 10.1088/2632-2153/aba947
– ident: CR48
– ident: CR73
– ident: CR65
– volume: 6
  start-page: e55852
  year: 2020
  ident: CR3
  article-title: NFDI4Chem-towards a national research data infrastructure for chemistry in Germany
  publication-title: Riogrande Odontol.
– ident: CR38
– volume: 32
  start-page: 373
  year: 1992
  end-page: 378
  ident: CR11
  article-title: Kekule: OCR-optical chemical (structure) recognition
  publication-title: J. Chem. Inf. Comput. Sci.
  doi: 10.1021/ci00008a018
– volume: 49
  start-page: D1388
  year: 2021
  end-page: D1395
  ident: CR50
  article-title: PubChem in 2021: new data content and improved web interfaces
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/gkaa971
– ident: CR13
– ident: CR34
– volume: 3
  start-page: 32
  year: 1950
  end-page: 35
  ident: CR78
  article-title: Index for rating diagnostic tests
  publication-title: Cancer
  doi: 10.1002/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3
– volume: 61
  start-page: 4280
  year: 2021
  end-page: 4289
  ident: CR41
  article-title: ChemDataExtractor 2.0: autopopulated ontologies for materials science
  publication-title: J. Chem. Inf. Model.
  doi: 10.1021/acs.jcim.1c00446
– ident: CR59
– ident: CR76
– ident: CR83
– ident: CR62
– volume: 13
  start-page: 2
  year: 2021
  ident: CR75
  article-title: COCONUT online: collection of Open Natural Products database
  publication-title: J. Cheminform.
  doi: 10.1186/s13321-020-00478-9
StartPage 5045
SubjectTerms 631/114/1305
631/114/2164
631/114/2406
639/638/403
639/638/630
Applications programs
Automation
Computer vision
Datasets
Deep learning
Humanities and Social Sciences
Image segmentation
Information processing
Information retrieval
multidisciplinary
Natural language processing
Science
Science (multidisciplinary)
Source code
Translating
Title DECIMER.ai: an open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications
URI https://link.springer.com/article/10.1038/s41467-023-40782-0
https://www.proquest.com/docview/2853130938
https://www.proquest.com/docview/2853946308
https://pubmed.ncbi.nlm.nih.gov/PMC10439916
https://doaj.org/article/8842925cd3054b7f8e7bd34999bfbca2
Volume 14