Adversarial Threshold Neural Computer for Molecular de Novo Design

Bibliographic Details
Published in: Molecular Pharmaceutics, Vol. 15, no. 10, p. 4386
Main Authors: Putin, Evgeny, Asadulaev, Arip, Vanhaelen, Quentin, Ivanenkov, Yan, Aladinskaya, Anastasia V, Aliper, Alex, Zhavoronkov, Alex
Format: Journal Article
Language: English
Published: United States 01.10.2018
ISSN: 1543-8392
Description
Summary: In this article, we propose the deep neural network Adversarial Threshold Neural Computer (ATNC). The ATNC model is intended for the de novo design of novel small-molecule organic structures. The model is based on generative adversarial network architecture and reinforcement learning. ATNC uses a Differentiable Neural Computer as a generator and has a new specific block, called adversarial threshold (AT). AT acts as a filter between the agent (generator) and the environment (discriminator + objective reward functions). Furthermore, to generate more diverse molecules, we introduce a new objective reward function named Internal Diversity Clustering (IDC). In this work, ATNC is tested and compared with the ORGANIC model. Both models were trained on the SMILES string representation of the molecules, using four objective functions (internal similarity, Muegge druglikeness filter, presence or absence of sp3-rich fragments, and IDC). The SMILES representations of 15K druglike molecules from the ChemDiv collection were used as a training data set. For the different functions, ATNC outperforms ORGANIC. Combined with the IDC, ATNC generates 72% valid and 77% unique SMILES strings, while ORGANIC generates only 7% valid and 86% unique SMILES strings. For each set of molecules generated by ATNC and ORGANIC, we analyzed distributions of four molecular descriptors (number of atoms, molecular weight, logP, and TPSA) and calculated five chemical statistical features (internal diversity, number of unique heterocycles, number of clusters, number of singletons, and number of compounds that did not pass medicinal chemistry filters). Analysis of key molecular descriptors and chemical statistical features demonstrated that the molecules generated by ATNC had better druglikeness properties. We also performed in vitro validation of the molecules generated by ATNC; results indicated that ATNC is an effective method for producing hit compounds.
DOI: 10.1021/acs.molpharmaceut.7b01137
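
The summary reports the share of valid and unique SMILES strings and an internal diversity statistic for each generated set. The sketch below illustrates how such figures might be computed; it is not the authors' code. Several things here are assumptions that the record does not state: the function names, the choice to measure uniqueness among valid strings only, the definition of internal diversity as one minus the mean pairwise Tanimoto similarity, and the representation of fingerprints as sets of integer bit indices. The `toy_is_valid` check only tests parenthesis balance and stands in for a real SMILES parser from a cheminformatics toolkit.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence, Tuple


def toy_is_valid(smiles: str) -> bool:
    """Hypothetical stand-in for a SMILES parser: checks parenthesis balance only."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return bool(smiles) and depth == 0


def generation_rates(
    smiles: Sequence[str], is_valid: Callable[[str], bool]
) -> Tuple[float, float]:
    """Return (valid fraction of all strings, unique fraction of valid strings).

    Assumption: uniqueness is measured among valid strings; the record does
    not say which denominator the paper uses.
    """
    if not smiles:
        return 0.0, 0.0
    valid: List[str] = [s for s in smiles if is_valid(s)]
    valid_rate = len(valid) / len(smiles)
    unique_rate = len(set(valid)) / len(valid) if valid else 0.0
    return valid_rate, unique_rate


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints given as bit-index sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def internal_diversity(fingerprints: Sequence[FrozenSet[int]]) -> float:
    """Assumed definition: 1 minus the mean pairwise Tanimoto similarity."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    mean_similarity = sum(tanimoto(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity


if __name__ == "__main__":
    generated = ["CCO", "CCO", "C(C", "c1ccccc1"]
    print(generation_rates(generated, toy_is_valid))
    print(internal_diversity([frozenset({1, 2, 3}), frozenset({2, 3, 4})]))
```

In practice the validity predicate would be replaced by a parser that rejects chemically malformed strings, and the fingerprints would come from the same toolkit.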