A review of molecular representation in the age of machine learning

Bibliographic details
Published in: Wiley interdisciplinary reviews. Computational molecular science, Volume 12, Issue 5, pp. e1603 - n/a
Main authors: Wigh, Daniel S., Goodman, Jonathan M., Lapkin, Alexei A.
Format: Journal Article
Language: English
Published: Hoboken, USA Wiley Periodicals, Inc, 1 September 2022
Wiley Subscription Services, Inc
ISSN: 1759-0876, 1759-0884
Abstract: Research in chemistry increasingly requires interdisciplinary work prompted by, among other things, advances in computing, machine learning, and artificial intelligence. Everyone working with molecules, whether chemist or not, needs an understanding of the representation of molecules in a machine‐readable format, as this is central to computational chemistry. Four classes of representations are introduced: string, connection table, feature‐based, and computer‐learned representations. Three of the most significant representations are simplified molecular‐input line‐entry system (SMILES), International Chemical Identifier (InChI), and the MDL molfile, of which SMILES was the first to successfully be used in conjunction with a variational autoencoder (VAE) to yield a continuous representation of molecules. This is noteworthy because a continuous representation allows for efficient navigation of the immensely large chemical space of possible molecules. Since 2018, when the first model of this type was published, considerable effort has been put into developing novel and improved methodologies. Most, if not all, researchers in the community make their work easily accessible on GitHub, though discussion of computation time and domain of applicability is often overlooked. Herein, we present questions for consideration in future work which we believe will make chemical VAEs even more accessible.
This article is categorized under: Data Science > Chemoinformatics
Graphical abstract text: Understanding how to best represent molecules in a machine‐readable format is a key challenge.
Author Goodman, Jonathan M.
Wigh, Daniel S.
Lapkin, Alexei A.
Author_xml – sequence: 1
  givenname: Daniel S.
  orcidid: 0000-0002-0494-643X
  surname: Wigh
  fullname: Wigh, Daniel S.
  organization: University of Cambridge
– sequence: 2
  givenname: Jonathan M.
  orcidid: 0000-0002-8693-9136
  surname: Goodman
  fullname: Goodman, Jonathan M.
  organization: University of Cambridge
– sequence: 3
  givenname: Alexei A.
  orcidid: 0000-0001-7621-0889
  surname: Lapkin
  fullname: Lapkin, Alexei A.
  email: aal35@cam.ac.uk
  organization: University of Cambridge
Cites_doi 10.1021/cen-v030n034.p3523
10.1186/s13321-018-0258-y
10.1021/ci00039a002
10.26434/chemrxiv.14554803
10.1039/C9ME00039A
10.1021/acscentsci.8b00357
10.1016/j.neucom.2021.04.039
10.1021/acs.jcim.5b00543
10.1021/ci010132r
10.1007/978-3-030-30493-5_79
10.1093/nar/gkv352
10.1021/acs.chemrev.8b00588
10.1021/ci3001925
10.1021/jm0707727
10.1021/ed100697w
10.1021/ja00051a040
10.1088/2632-2153/aba947
10.1088/2632-2153/ac09d6
10.1021/ci900507g
10.1002/chem.201605499
10.1021/ci050400b
10.1186/s13321-021-00512-4
10.1016/S0003-2670(01)83100-7
10.1021/ci3001277
10.1038/s41586-021-03213-y
10.1021/acs.jcim.6b00601
10.1021/acs.jcim.8b00626
10.1021/ci00034a005
10.1186/s13321-016-0160-4
10.26434/chemrxiv.7097960.v1
10.1186/1758-2946-4-22
10.1016/j.ymeth.2014.08.005
10.1126/science.aar5169
10.1371/journal.pcbi.1002380
10.1021/ci600238j
10.1021/ci00054a007
10.1021/acscentsci.7b00512
10.1038/s41598-017-17299-w
10.1186/s13321-021-00517-z
10.1021/ci00007a012
10.1021/ci200488k
10.1021/ci970429i
10.1039/C9SC01844A
10.1002/wcms.36
10.1093/nar/gky1075
10.1002/cmdc.200800178
10.1039/D1SC00231G
10.1021/c160030a007
10.1021/acs.jcim.7b00616
10.1109/5.726791
10.1021/acscentsci.7b00572
10.1021/ci00062a008
10.1021/acs.jcim.0c00675
10.1016/j.drudis.2020.12.009
10.1038/sdata.2014.22
10.1021/ci100050t
10.1021/ci960109j
10.1039/C8SC04175J
10.1021/ci00047a024
10.1038/s41586-021-03819-2
10.1126/science.166.3902.178
10.1021/acs.accounts.0c00745
10.1021/acs.jcim.9b00286
10.1016/j.drudis.2020.11.037
10.1093/nar/gkw1074
10.1021/ci300116p
10.1016/j.drudis.2019.02.013
10.1021/ci00057a005
10.1021/acscentsci.9b00576
10.1093/nar/gkaa971
10.1186/s13321-015-0068-4
10.1021/acs.molpharmaceut.7b01134
10.1186/s13321-015-0057-7
10.1002/9783527618279.ch38
10.1039/D0CP00305K
10.1021/jacs.1c09820
10.1021/ci100384d
ContentType Journal Article
Copyright 2022 The Authors. published by Wiley Periodicals LLC.
2022. This article is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Copyright_xml – notice: 2022 The Authors. published by Wiley Periodicals LLC.
– notice: 2022. This article is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1002/wcms.1603
Discipline Chemistry
EISSN 1759-0884
EndPage n/a
ExternalDocumentID 10_1002_wcms_1603
WCMS1603
Genre reviewArticle
GrantInformation_xml – fundername: UCB
– fundername: Engineering and Physical Sciences Research Council
  funderid: EP/S024220/1
ISICitedReferencesCount 230
ISSN 1759-0876
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 5
Language English
License Attribution
LinkModel DirectLink
Notes Funding information
Engineering and Physical Sciences Research Council, Grant/Award Number: EP/S024220/1; UCB
ORCID 0000-0002-8693-9136
0000-0002-0494-643X
0000-0001-7621-0889
OpenAccessLink https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fwcms.1603
PageCount 19
PublicationCentury 2000
PublicationDate September/October 2022
2022-09-00
20220901
PublicationDateYYYYMMDD 2022-09-01
PublicationDate_xml – month: 09
  year: 2022
  text: September/October 2022
PublicationDecade 2020
PublicationPlace Hoboken, USA
PublicationPlace_xml – name: Hoboken, USA
– name: Hoboken
PublicationTitle Wiley interdisciplinary reviews. Computational molecular science
PublicationYear 2022
Publisher Wiley Periodicals, Inc
Wiley Subscription Services, Inc
Publisher_xml – name: Wiley Periodicals, Inc
– name: Wiley Subscription Services, Inc
References_xml – year: 2011
– volume: 10
  start-page: 4
  year: 2018
  article-title: Mordred: a molecular descriptor calculator
  publication-title: J Chem
– volume: 23
  start-page: 1
  year: 2021
  end-page: 21
  article-title: Using molecular embeddings in QSAR modeling: does it make a difference?
  publication-title: Brief Bioinform
– volume: 60
  start-page: 6065
  year: 2020
  end-page: 73
  article-title: ZINC20—a free ultralarge‐scale chemical database for ligand discovery
  publication-title: J Chem Inf Model
– volume: 15
  start-page: 4378
  year: 2018
  end-page: 85
  article-title: 3D molecular representations based on the wave transform for convolutional neural networks
  publication-title: Mol Pharm
– volume: 51
  start-page: 739
  year: 2011
  end-page: 53
  article-title: Chemical name to structure: OPSIN, an open source solution
  publication-title: J Chem Inf Model
– volume: 52
  start-page: 1745
  year: 2012
  end-page: 56
  article-title: Mining electronic laboratory notebooks: analysis, retrosynthesis, and reaction based enumeration
  publication-title: J Chem Inf Model
– year: 2018
  article-title: DeepSMILES: An adaptation of SMILES for use in machine‐learning of chemical structures
  publication-title: ChemRxiv
– volume: 1
  start-page: 557
  year: 2011
  end-page: 79
  article-title: Representation of chemical structures
  publication-title: WIREs Comput Mol Sci
– volume: 23
  start-page: 93
  year: 1983
  end-page: 102
  article-title: The CAS ONLINE search system. 1. General system design and selection, generation, and use of search screens
  publication-title: J Chem Inf Comput Sci
– year: 2021
– volume: 59
  start-page: 1136
  year: 2019
  end-page: 46
  article-title: De novo molecule design by translating from reduced graphs to SMILES
  publication-title: J Chem Inf Model
– volume: 30
  start-page: 3523
  year: 1952
  end-page: 6
  article-title: Thw (sic) Wiswesser line formula notation
  publication-title: Chem Eng News Arch
– volume: 27
  start-page: 74
  year: 1987
  end-page: 82
  article-title: DARC system: notions of defined and generic substructures. Filiation and coding of FREL substructure (SS) classes
  publication-title: J Chem Inf Comput Sci
– volume: 7
  start-page: 23
  year: 2015
  article-title: InChI, the IUPAC international chemical identifier
  publication-title: J Chem
– volume: 50
  start-page: 742
  year: 2010
  end-page: 54
  article-title: Extended‐connectivity fingerprints
  publication-title: J Chem Inf Model
– volume: 26
  start-page: 511
  year: 2021
  end-page: 24
  article-title: Artificial intelligence in drug discovery: what is realistic, what are illusions? Part 1: ways to make an impact, and why we are not there yet
  publication-title: Drug Discov Today
– volume: 114
  start-page: 10024
  year: 1992
  end-page: 35
  article-title: UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations
  publication-title: J Am Chem Soc
– volume: 45
  start-page: D945
  year: 2017
  end-page: 54
  article-title: The ChEMBL database in 2017
  publication-title: Nucleic Acids Res
– volume: 12
  start-page: 7079
  year: 2021
  end-page: 90
  article-title: Beyond generative models: superfast traversal, optimization, novelty, exploration and discovery (STONED) algorithm for molecules using SELFIES
  publication-title: Chem Sci
– year: 2018
– volume: 55
  start-page: 2111
  year: 2015
  end-page: 20
  article-title: Get your atoms in order—an open‐source implementation of a novel and robust molecular canonicalization algorithm
  publication-title: J Chem Inf Model
– volume: 119
  start-page: 6561
  year: 2019
  end-page: 94
  article-title: Computational ligand descriptors for catalyst design
  publication-title: Chem Rev
– year: 2014
– volume: 1
  year: 2014
  article-title: Quantum chemistry structures and properties of 134 kilo molecules
  publication-title: Sci Data
– volume: 47
  start-page: 1734
  year: 2007
  end-page: 46
  article-title: Algorithm for advanced canonical coding of planar chemical structures that considers stereochemical and symmetric information
  publication-title: J Chem Inf Model
– volume: 4
  start-page: 22
  year: 2012
  article-title: Towards a universal SMILES representation—a standard method to generate canonical SMILES based on the InChI
  publication-title: J Chem
– volume: 37
  start-page: 71
  year: 1997
  end-page: 9
  article-title: SYBYL Line Notation (SLN): a versatile language for chemical structure representation
  publication-title: J Chem Inf Comput Sci
– volume: 22
  start-page: 88
  year: 1982
  end-page: 93
  article-title: How the WLN began in 1949 and how it might be in 1999
  publication-title: J Chem Inf Comput Sci
– volume: 43
  start-page: W612
  year: 2015
  end-page: 20
  article-title: ChEMBL web services: streamlining access to drug discovery data and utilities
  publication-title: Nucleic Acids Res
– volume: 3
  start-page: 1503
  year: 2008
  end-page: 7
  article-title: On the art of compiling and using “drug‐like” chemical fragment spaces
  publication-title: ChemMedChem
– volume: 51
  start-page: 3149
  year: 2011
  end-page: 57
  article-title: Accurate specification of molecular structures: the case for zero‐order bonds and explicit hydrogen counting
  publication-title: J Chem Inf Model
– volume: 8
  start-page: 50
  year: 2016
  article-title: Jmol SMILES and Jmol SMARTS: specifications and applications
  publication-title: J Chem
– volume: 86
  start-page: 2278
  year: 1998
  end-page: 324
  article-title: Gradient‐based learning applied to document recognition
  publication-title: Proc IEEE
– volume: 71
  start-page: 58
  year: 2015
  end-page: 63
  article-title: Molecular fingerprint similarity search in virtual screening
  publication-title: Methods
– year: 2008
– volume: 13
  start-page: 40
  year: 2021
  article-title: InChI version 1.06: now more than 99.99% reliable
  publication-title: J Chem
– volume: 5
  start-page: 1572
  year: 2019
  end-page: 83
  article-title: Molecular transformer: a model for uncertainty‐calibrated chemical reaction prediction
  publication-title: ACS Cent Sci
– volume: 24
  start-page: 1148
  year: 2019
  end-page: 56
  article-title: The next level in chemical space navigation: going far beyond enumerable compound libraries
  publication-title: Drug Discov Today
– volume: 7
  start-page: 16991
  year: 2017
  article-title: A universal 3D voxel descriptor for solid‐state material informatics with deep convolutional neural networks
  publication-title: Sci Rep
– volume: 26
  start-page: 1040
  year: 2021
  end-page: 52
StartPage e1603
SubjectTerms Accessibility
Artificial intelligence
chemoinformatics
Computation
Computational chemistry
Computer applications
fingerprints
Learning algorithms
Machine learning
molecular representation
Navigation
Representations
variational autoencoder
Title A review of molecular representation in the age of machine learning
URI https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fwcms.1603
https://www.proquest.com/docview/2711617987
Volume 12