Uncertainty quantification enables reliable deep learning for protein-ligand binding affinity prediction.
| Title: | Uncertainty quantification enables reliable deep learning for protein-ligand binding affinity prediction. |
|---|---|
| Authors: | Rayka M; Department of Physical and Computational Chemistry, Shahid Beheshti University, Tehran, 1983969411, Iran. miladrayka93@gmail.com., Naghavi SS; Department of Physical and Computational Chemistry, Shahid Beheshti University, Tehran, 1983969411, Iran. s_naghavi@sbu.ac.ir. |
| Source: | Scientific reports [Sci Rep] 2025 Dec 04; Vol. 15 (1), pp. 43156. Date of Electronic Publication: 2025 Dec 04. |
| Publication Type: | Journal Article |
| Language: | English |
| Journal Info: | Publisher: Nature Publishing Group; Country of Publication: England; NLM ID: 101563288; Publication Model: Electronic; Cited Medium: Internet; ISSN: 2045-2322 (Electronic); Linking ISSN: 20452322; NLM ISO Abbreviation: Sci Rep; Subsets: MEDLINE |
| Imprint Name(s): | Original Publication: London : Nature Publishing Group, copyright 2011- |
| MeSH Terms: | Deep Learning*; Proteins*/metabolism; Proteins*/chemistry; Ligands; Uncertainty; Protein Binding; Neural Networks, Computer; Bayes Theorem; Algorithms; Drug Design |
| Abstract: | Deep learning (DL) algorithms have increasingly been applied to predict protein-ligand binding affinity, a critical step in drug design. Yet, many models still struggle to generalize to unseen data, and when coupled with the absence of confidence estimates for predictions, they hinder effective decision-making. To address these challenges, we thoroughly compare five uncertainty quantification methods: Deep Ensemble, Monte Carlo Dropout, Laplace approximation, Bayes by Backprop, and Evidential Neural Networks. Notably, Bayes by Backprop, applied for the first time in this study area, offers a novel and promising approach to uncertainty quantification. To ensure unbiased training and validation, we leverage the Leak-Proof PDBBind dataset and rigorously evaluate performance across multiple external test sets. Our results reveal that a feed-forward neural network (FFNN) using extended connectivity interaction features (ECIF) as a protein-ligand representation, paired with the Bayes by Backprop method, achieves superior predictive performance and highly reliable uncertainty quantification. Notably, Bayes by Backprop demonstrated balanced performance across multiple evaluation metrics, particularly excelling in calibration without needing additional recalibration. Our findings not only advance the state of uncertainty quantification in deep learning models for binding affinity prediction but also open avenues for more reliable, calibrated, and reproducible applications in drug discovery and active learning-driven model development. (© 2025. The Author(s).) |
| Grant Information: | 4028502 Iran National Science Foundation |
| Contributed Indexing: | Keywords: Bayesian neural network; Deep learning; Drug discovery; Feature engineering; Protein-ligand binding affinity; Uncertainty quantification |
| Substance Nomenclature: | 0 (Ligands) 0 (Proteins) |
| Entry Date(s): | Date Created: 20251204 Date Completed: 20251204 Latest Revision: 20251207 |
| Update Code: | 20251207 |
| PubMed Central ID: | PMC12678490 |
| DOI: | 10.1038/s41598-025-27167-7 |
| PMID: | 41345415 |
| Database: | MEDLINE |