Database diversity assessment: New ideas, concepts, and tools.

Bibliographic details
Title: Database diversity assessment: New ideas, concepts, and tools.
Authors: Nilakantan, Ramaswamy; Bauman, Norman; Haraki, Kevin
Source: Journal of Computer-Aided Molecular Design; Sep 1997, Vol. 11 Issue 5, p447-452, 6p
Abstract: We present some new ideas for characterizing and comparing large chemical databases. The comparison of the contents of large databases is not trivial since it implies pairwise comparison of hundreds of thousands of compounds. We have developed methods for categorizing compounds into groups or series based on their ring-system content, using precalculated structure-based hashcodes. Two large databases can then be compared by simply comparing their hashcode tables. Furthermore, the number of distinct ring-system combinations can be used as an indicator of database diversity. We also present an independent technique for diversity assessment called the 'saturation diversity' approach. This method is based on picking as many mutually dissimilar compounds as possible from a database or a subset thereof. We show that both methods yield similar results. Since the two methods measure very different properties, this probably says more about the properties of the databases studied than about the methods. [ABSTRACT FROM AUTHOR]
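The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the hashcodes are computed or which dissimilarity measure is used, so everything here is an assumption. Ring systems are given as plain text labels, the hashcode is a truncated SHA-1 of the sorted labels, similarity is the Tanimoto coefficient on hypothetical feature sets, and the picking step is a simple greedy pass with an arbitrary threshold of 0.5. All function names and sample data are hypothetical.

```python
import hashlib


def ring_combo_hash(ring_systems):
    """Hash an order-independent combination of ring-system labels (assumed scheme)."""
    key = "|".join(sorted(set(ring_systems)))
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


def hash_table(database):
    """Map each ring-combination hashcode to the compound IDs that share it."""
    table = {}
    for compound_id, ring_systems in database.items():
        table.setdefault(ring_combo_hash(ring_systems), []).append(compound_id)
    return table


def compare_tables(table_a, table_b):
    """Compare two databases by their hashcode keys only, with no pairwise compound comparison."""
    keys_a, keys_b = set(table_a), set(table_b)
    return {
        "shared": len(keys_a & keys_b),
        "only_a": len(keys_a - keys_b),
        "only_b": len(keys_b - keys_a),
    }


def tanimoto(a, b):
    """Tanimoto similarity of two feature sets (assumed measure)."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def saturation_pick(features, threshold=0.5):
    """Greedily keep each compound whose similarity to every earlier pick is below the threshold."""
    picked = []
    for compound_id in sorted(features):
        if all(tanimoto(features[compound_id], features[p]) < threshold for p in picked):
            picked.append(compound_id)
    return picked


# Hypothetical toy data: compound ID -> ring-system labels.
db_a = {"a1": ["benzene"], "a2": ["benzene", "pyridine"],
        "a3": ["pyridine", "benzene"], "a4": ["indole"]}
db_b = {"b1": ["benzene"], "b2": ["piperidine"]}

table_a, table_b = hash_table(db_a), hash_table(db_b)
comparison = compare_tables(table_a, table_b)
diversity_a = len(table_a)  # number of distinct ring-system combinations in db_a

# Hypothetical toy data: compound ID -> set of structural feature indices.
features = {"c1": {1, 2, 3}, "c2": {1, 2, 3, 4}, "c3": {7, 8}, "c4": {8, 9}}
picks = saturation_pick(features)
```

In this sketch the count of distinct hashcodes (`diversity_a`) and the number of compounds returned by `saturation_pick` play the role of the two diversity indicators the abstract compares. The greedy pass depends on the order in which compounds are visited, so it gives only a lower bound on the largest mutually dissimilar set.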
Database: Complementary Index