Hypershape Recognition: A General Framework for Moment-Based Molecular Similarity

Bibliographic Details
Title: Hypershape Recognition: A General Framework for Moment-Based Molecular Similarity
Authors: Marcello Costamagna, Marco Foscato, David Grellscheid, Vidar R. Jensen
Publication Year: 2025
Collection: The University of Auckland: Figshare
Subject Terms: Biophysics, Medicine, Genetics, Molecular Biology, Pharmacology, Biotechnology, Cancer, Space Science, Mathematical Sciences not elsewhere classified, Chemical Sciences not elsewhere classified, reference system consisting, provided numerical values, generate distance distributions, different oxidation states, chemical representations annotated, molecular similarity assessments, based similarity method, based similarity assessment, principal component analysis, N-dimensional objects, features containing information, introduce hypershape recognition, N-dimensional coordinates, hsr similarity scores, similarity scores, hypershape recognition, principal components, cartesian coordinates, hypershape
Description: Due to the widespread use of molecular similarity assessments in drug design, numerous methods for calculating similarity scores of organic molecules have been developed. When applied to other types of molecules, such as inorganic and organometallic compounds, these methods face significant challenges. To overcome these challenges, we here introduce Hypershape Recognition (HSR), a versatile framework for moment-based similarity assessment of three-dimensional (3D) chemical representations annotated with atomic features. In a default, general-purpose implementation of the framework, features describing the atomic number, the isotope (the number of neutrons), and the formal charge of each atom are combined with its Cartesian coordinates to form the N-dimensional objects, termed hypershapes, that are compared. The hypershapes may account for any atomic features, including any user-provided numerical values, which makes HSR the first moment-based similarity method to do so. Thus, the HSR framework can be tailored for specific applications not handled by other moment-based molecular similarity methods, such as distinguishing between isotopologues or between transition-metal complexes with different oxidation states. Moreover, each hypershape is placed in a reference system consisting of its own principal components (PCs), derived from principal component analysis (PCA) of the centered N-dimensional coordinates and features of the hypershape. Because the reference points used to generate distance distributions and their moments are located on the PCs instead of on atoms, HSR similarity scores are continuous across geometry fluctuations. The PC-based reference system also enables HSR to distinguish between enantiomers. HSR is available as open source at https://github.com/denoptim-project/HSR.
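The moment-based comparison described in the abstract can be illustrated with a minimal NumPy sketch. It is not the API of the HSR package, and all function names are hypothetical. It rests on several assumptions not stated in the record: features are used as raw, unscaled numbers; one reference point is placed on each PC at the projection of the atom lying farthest along that PC, plus one at the centroid; three moments (mean, standard deviation, cube root of the third central moment) summarize each distance distribution; and the score is the inverse of one plus the mean absolute fingerprint difference, as in earlier shape-recognition methods. The sketch makes no attempt to distinguish enantiomers.

```python
import numpy as np

def build_hypershape(coords, features):
    """Join 3D coordinates and per-atom features into an (atoms, N) array."""
    coords = np.asarray(coords, dtype=float)
    features = np.asarray(features, dtype=float)
    return np.hstack([coords, features])

def fingerprint(hypershape):
    """Moments of atom distances to reference points on the principal components."""
    centered = hypershape - hypershape.mean(axis=0)
    # PCA: eigenvectors of the covariance matrix, sorted by decreasing variance.
    cov = centered.T @ centered / len(centered)
    eigvals, eigvecs = np.linalg.eigh(cov)
    axes = eigvecs[:, np.argsort(eigvals)[::-1]].T  # rows are PCs

    # Assumed placement: the centroid, plus one point per PC located at the
    # projection of the atom that lies farthest along that PC. The product
    # proj[k] * axis does not depend on the arbitrary sign of the eigenvector.
    refs = [np.zeros(centered.shape[1])]
    for axis in axes:
        proj = centered @ axis
        k = np.argmax(np.abs(proj))
        refs.append(proj[k] * axis)

    moments = []
    for ref in refs:
        d = np.linalg.norm(centered - ref, axis=1)
        mean = d.mean()
        moments.extend([mean, d.std(), np.cbrt(np.mean((d - mean) ** 3))])
    return np.array(moments)

def similarity(fp_a, fp_b):
    """Score in (0, 1]; 1 means identical fingerprints (assumed scoring formula)."""
    return 1.0 / (1.0 + np.mean(np.abs(fp_a - fp_b)))

# Usage: columns of `features` are atomic number, neutron count, formal charge.
coords = [[0.0, 0.0, 0.0], [1.2, 0.1, 0.0], [-0.5, 1.0, 0.3],
          [-0.4, -0.9, 0.8], [2.0, -0.7, 0.5]]
features = [[6, 6, 0], [8, 8, 0], [1, 0, 0], [1, 0, 0], [7, 7, 0]]
fp = fingerprint(build_hypershape(coords, features))
score = similarity(fp, fp)
```

Because the reference points are derived from the PCs of the centered hypershape rather than from individual atoms, rigidly rotating the 3D coordinates leaves the fingerprint unchanged in this sketch, while changing a feature such as a formal charge alters it.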
Document Type: article in journal/newspaper
Language: unknown
DOI: 10.1021/acs.jcim.5c00555.s001
Availability: https://doi.org/10.1021/acs.jcim.5c00555.s001
https://figshare.com/articles/journal_contribution/Hypershape_Recognition_A_General_Framework_for_Moment-Based_Molecular_Similarity/29291557
Rights: CC BY-NC 4.0
Accession Number: edsbas.BDAF35FB
Database: BASE