MolEM: a unified generative framework for molecular graphs and sequential orders
| Published in: | Briefings in Bioinformatics, Volume 26, Issue 2 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | England; Oxford University Press; 4 March 2025; Oxford Publishing Limited (England) |
| ISSN: | 1467-5463, 1477-4054 |
| Summary: | Structure-based drug design aims to generate molecules that fill the cavity of the protein pocket with high binding affinity. Many contemporary studies employ sequential generative models. Their standard training method is to sequentialize molecular graphs into ordered sequences and then maximize the likelihood of the resulting sequences. However, the exact likelihood is computationally intractable because it involves a sum over all possible sequential orders: molecular graphs lack an inherent order, and the number of orders is factorial in the graph size. To avoid the intractable full space of factorially many orders, existing works pre-define a fixed node ordering scheme, such as depth-first search, to sequentialize the 3D molecular graphs. In these cases, the training objectives are loose lower bounds on the exact likelihoods, which is suboptimal for generation. To address these challenges, we propose a unified generative framework named MolEM to learn the 3D molecular graphs and the corresponding sequential orders jointly. We derive a tight lower bound on the likelihood and maximize it via a variational expectation-maximization algorithm, opening a new line of research in learning-based ordering schemes for 3D molecular graph generation. In addition, we are the first to incorporate the molecular docking method QuickVina 2 to manipulate the binding poses, leading to accurate and flexible ligand conformations. Experimental results demonstrate that MolEM significantly outperforms baseline models in generating molecules with high binding affinities and realistic structures. Our approach efficiently approximates the true marginal graph likelihood and identifies reasonable orderings for 3D molecular graphs, aligning well with relevant chemical priors. |
|---|---|
| DOI: | 10.1093/bib/bbaf094 |
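
The summary's argument about orderings can be illustrated with a toy calculation. The sketch below is not MolEM's model or code: `toy_log_joint` is a hypothetical stand-in for the log joint probability of a graph and one node order, and all names are assumptions made for illustration. It enumerates every order of a four-node graph, computes the exact log marginal likelihood as a sum over orders, and compares it with three lower bounds: a single fixed order, a uniform distribution over orders, and the exact posterior over orders, for which the bound becomes tight.

```python
import itertools
import math


def toy_log_joint(order):
    """Hypothetical log p(G, order); an arbitrary deterministic score, not a real model."""
    penalty = sum((i + 1) * abs(node - i) for i, node in enumerate(order))
    return -5.0 - penalty / 4.0


def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def elbo(log_joint, q):
    """Evidence lower bound: sum over orders of q * (log p(G, order) - log q)."""
    return sum(
        prob * (log_joint[order] - math.log(prob))
        for order, prob in q.items()
        if prob > 0.0
    )


num_nodes = 4
orders = list(itertools.permutations(range(num_nodes)))  # n! candidate orders
log_joint = {order: toy_log_joint(order) for order in orders}

# Exact marginal likelihood: sum over all orders (only feasible for tiny graphs).
log_marginal = log_sum_exp(list(log_joint.values()))

# Fixed ordering scheme: a point mass on one order gives a loose bound.
fixed_bound = log_joint[orders[0]]

# A uniform distribution over orders is also a valid, generally loose, bound.
uniform_q = {order: 1.0 / len(orders) for order in orders}
uniform_bound = elbo(log_joint, uniform_q)

# With q equal to the exact posterior over orders, the bound is tight.
posterior_q = {order: math.exp(log_joint[order] - log_marginal) for order in orders}
tight_bound = elbo(log_joint, posterior_q)

print(f"orders: {len(orders)}")
print(f"log marginal:  {log_marginal:.4f}")
print(f"fixed order:   {fixed_bound:.4f}")
print(f"uniform q:     {uniform_bound:.4f}")
print(f"posterior q:   {tight_bound:.4f}")
```

For realistic molecules the enumeration is impossible, which is why a learned distribution over orders, fitted by alternating updates in an expectation-maximization style, is attractive: the closer that distribution is to the posterior over orders, the smaller the gap to the exact likelihood.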