DeepTargetClass: a web-based platform for predicting protein target classes of small molecules.

Bibliographic details
Title: DeepTargetClass: a web-based platform for predicting protein target classes of small molecules.
Authors: Ouassaf M (LMCE Laboratory, Group of Computational and Medicinal Chemistry, University of Biskra, 07000, Biskra, Algeria; nouassaf@univ-biskra.dz); Alhatlani BY (Unit of Scientific Research, Applied College, Qassim University, 52571, Buraydah, Saudi Arabia; balhatlani@qu.edu.sa)
Source: Journal of computer-aided molecular design [J Comput Aided Mol Des] 2025 Dec 03; Vol. 40 (1), pp. 11. Date of Electronic Publication: 2025 Dec 03.
Publication type: Journal Article
Language: English
Journal info: Publisher: Springer; Country of Publication: Netherlands; NLM ID: 8710425; Publication Model: Electronic; Cited Medium: Internet; ISSN: 1573-4951 (Electronic); Linking ISSN: 0920654X; NLM ISO Abbreviation: J Comput Aided Mol Des; Subsets: MEDLINE
Imprint Name(s): Publication: Amsterdam : Springer
Original Publication: Leiden, The Netherlands : ESCOM, [c1987-
MeSH terms: Drug Discovery*/methods ; Small Molecule Libraries*/chemistry ; Small Molecule Libraries*/pharmacology ; Deep Learning* ; Proteins*/chemistry ; Internet ; Receptors, G-Protein-Coupled/chemistry ; Receptors, G-Protein-Coupled/antagonists & inhibitors ; Ligands ; Humans ; Receptors, Cytoplasmic and Nuclear/chemistry ; Receptors, Cytoplasmic and Nuclear/antagonists & inhibitors
Declarations: Competing interests: The authors declare no competing interests. Informed consent: Not applicable. Institutional review board statement: Not applicable.
Abstract: The identification of protein target classes is a key step in drug discovery, as it enables prioritization of screening campaigns and supports target-based drug repurposing. In this study, we developed a deep-learning pipeline based on a multilayer perceptron (MLP) trained on 15,804 curated compounds representing four major pharmacological target classes: G protein-coupled receptors (GPCRs), kinases, nuclear receptors, and transporters. Using extended connectivity fingerprints (ECFP4) as molecular descriptors, the model achieved 96% accuracy in internal cross-validation and 87% accuracy on an external test set, demonstrating performance comparable to ensemble classifiers such as Random Forest, XGBoost, and LightGBM. Class-specific F1 scores confirmed robust and balanced predictions across GPCR, kinase, nuclear receptor, and transporter categories. Model interpretability was addressed using SHAP values, which highlighted pharmacophore-like substructures consistent with known ligand-target interactions. Application to reference drugs further validated predictive utility, with correct assignment of most compounds to their canonical protein target class. The final MLP model was deployed as a user-friendly web application to facilitate accessible protein class prediction for novel compounds. Overall, this work presents a reliable and interpretable computational framework to support target-class-based drug discovery and repositioning.
(© 2025. The Author(s), under exclusive licence to Springer Nature Switzerland AG.)
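Note on the class-specific F1 scores mentioned in the abstract: the record does not say how the authors computed them, so the following is only a minimal sketch of the standard one-vs-rest definition, F1 = 2TP / (2TP + FP + FN), applied to the four target classes. The function name `per_class_f1` and the toy labels are hypothetical and purely illustrative; they are not data or code from the paper.

```python
from collections import Counter

# The four target classes named in the abstract.
CLASSES = ["GPCR", "kinase", "nuclear receptor", "transporter"]


def per_class_f1(y_true, y_pred, classes=CLASSES):
    """Return {class: F1}, counting TP/FP/FN one-vs-rest for each class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = {}
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Convention assumed here: F1 is 0.0 when a class never appears.
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


# Toy labels (illustrative only, NOT results from the study).
y_true = ["GPCR", "GPCR", "kinase", "kinase", "nuclear receptor", "transporter"]
y_pred = ["GPCR", "kinase", "kinase", "kinase", "nuclear receptor", "GPCR"]

scores = per_class_f1(y_true, y_pred)
macro_f1 = sum(scores.values()) / len(scores)  # unweighted mean over classes
```

Comparing the per-class values with their unweighted (macro) mean is one common way to check whether performance is balanced across classes, since a single weak class pulls the macro average down regardless of its size.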
Contributed Indexing: Keywords: Deep learning; Drug discovery; ECFP4 fingerprints; Interpretable AI; Protein target class; SHAP
Substance Nomenclature: 0 (Small Molecule Libraries)
0 (Receptors, G-Protein-Coupled)
0 (Ligands)
0 (Proteins)
0 (Receptors, Cytoplasmic and Nuclear)
Entry Date(s): Date Created: 20251202 Date Completed: 20251202 Latest Revision: 20251202
Update Code: 20251203
DOI: 10.1007/s10822-025-00717-x
PMID: 41331195
Database: MEDLINE