DeepTargetClass: a web-based platform for predicting protein target classes of small molecules.
| Title: | DeepTargetClass: a web-based platform for predicting protein target classes of small molecules. |
|---|---|
| Authors: | Ouassaf M; LMCE Laboratory, Group of Computational and Medicinal Chemistry, University of Biskra, 07000, Biskra, Algeria. nouassaf@univ-biskra.dz., Alhatlani BY; Unit of Scientific Research, Applied College, Qassim University, 52571, Buraydah, Saudi Arabia. balhatlani@qu.edu.sa. |
| Source: | Journal of computer-aided molecular design [J Comput Aided Mol Des] 2025 Dec 03; Vol. 40 (1), pp. 11. Date of Electronic Publication: 2025 Dec 03. |
| Publication Type: | Journal Article |
| Language: | English |
| Journal Info: | Publisher: Springer; Country of Publication: Netherlands; NLM ID: 8710425; Publication Model: Electronic; Cited Medium: Internet; ISSN: 1573-4951 (Electronic); Linking ISSN: 0920654X; NLM ISO Abbreviation: J Comput Aided Mol Des; Subsets: MEDLINE |
| Imprint Name(s): | Publication: Amsterdam : Springer; Original Publication: Leiden, The Netherlands : ESCOM, [c1987- |
| MeSH Terms: | Drug Discovery*/methods ; Small Molecule Libraries*/chemistry ; Small Molecule Libraries*/pharmacology ; Deep Learning* ; Proteins*/chemistry ; Internet ; Receptors, G-Protein-Coupled/chemistry ; Receptors, G-Protein-Coupled/antagonists & inhibitors ; Ligands ; Humans ; Receptors, Cytoplasmic and Nuclear/chemistry ; Receptors, Cytoplasmic and Nuclear/antagonists & inhibitors |
| Abstract: | Competing Interests: Declarations. Competing interests: The authors declare no competing interests. Informed consent: Not applicable. Institutional review board statement: Not applicable. The identification of protein target classes is a key step in drug discovery, as it enables prioritization of screening campaigns and supports target-based drug repurposing. In this study, we developed a deep-learning pipeline based on a multilayer perceptron (MLP) trained on 15,804 curated compounds representing four major pharmacological target classes: G protein-coupled receptors (GPCRs), kinases, nuclear receptors, and transporters. Using extended connectivity fingerprints (ECFP4) as molecular descriptors, the model achieved 96% accuracy in internal cross-validation and 87% accuracy on an external test set, demonstrating performance comparable to ensemble classifiers such as Random Forest, XGBoost, and LightGBM. Class-specific F1 scores confirmed robust and balanced predictions across GPCR, kinase, nuclear receptor, and transporter categories. Model interpretability was addressed using SHAP values, which highlighted pharmacophore-like substructures consistent with known ligand-target interactions. Application to reference drugs further validated predictive utility, with correct assignment of most compounds to their canonical protein target class. The final MLP model was deployed as a user-friendly web application to facilitate accessible protein class prediction for novel compounds. Overall, this work presents a reliable and interpretable computational framework to support target-class-based drug discovery and repositioning. (© 2025. The Author(s), under exclusive licence to Springer Nature Switzerland AG.) |
| Contributed Indexing: | Keywords: Deep learning; Drug discovery; ECFP4 fingerprints; Interpretable AI; Protein target class; SHAP |
| Substance Nomenclature: | 0 (Small Molecule Libraries) 0 (Receptors, G-Protein-Coupled) 0 (Ligands) 0 (Proteins) 0 (Receptors, Cytoplasmic and Nuclear) |
| Entry Date(s): | Date Created: 20251202 Date Completed: 20251202 Latest Revision: 20251202 |
| Update Code: | 20251203 |
| DOI: | 10.1007/s10822-025-00717-x |
| PMID: | 41331195 |
| Database: | MEDLINE |
| ISSN: | 1573-4951 |
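
The abstract reports overall accuracy and class-specific F1 scores across the four target classes. As a minimal sketch of how such metrics can be computed for a four-class predictor, the example below uses only the Python standard library. The helper names `accuracy` and `per_class_f1` and the toy labels are assumptions made for illustration; they are not the authors' code or data.

```python
from collections import Counter

# The four target classes named in the abstract.
CLASSES = ["GPCR", "kinase", "nuclear receptor", "transporter"]


def accuracy(y_true, y_pred):
    """Fraction of compounds whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred, classes=CLASSES):
    """One-vs-rest F1 per class: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


# Hypothetical toy labels, NOT results from the paper.
TOY_TRUE = ["GPCR", "GPCR", "kinase", "kinase",
            "nuclear receptor", "transporter", "transporter", "transporter"]
TOY_PRED = ["GPCR", "kinase", "kinase", "kinase",
            "nuclear receptor", "transporter", "GPCR", "transporter"]

if __name__ == "__main__":
    print("accuracy:", accuracy(TOY_TRUE, TOY_PRED))
    for name, score in per_class_f1(TOY_TRUE, TOY_PRED).items():
        print(f"F1[{name}] = {score:.2f}")
```

Reporting F1 per class alongside accuracy shows whether a classifier is balanced across classes, since overall accuracy alone can hide weak performance on a smaller class.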