Virtual screening using machine learning techniques and ensemble docking-based molecular descriptors ; Cribado virtual utilizando técnicas de aprendizaje de máquina y descriptores moleculares basados en acoplamiento molecular en conglomerado

Bibliographic details
Title: Virtual screening using machine learning techniques and ensemble docking-based molecular descriptors ; Cribado virtual utilizando técnicas de aprendizaje de máquina y descriptores moleculares basados en acoplamiento molecular en conglomerado
Author: Joel Ricci López
Other contributors: SERGIO ANDRES AGUILA PUENTES, Carlos Alberto Brizuela Rodríguez
Publisher: CICESE
Publication year: 2023
Keywords: info:eu-repo/classification/Autor/Structure-based virtual screening, molecular docking, machine learning, molecular descriptors, molecular dynamics, drug discovery, info:eu-repo/classification/Autor/Cribado virtual basado en estructura, docking molecular, aprendizaje de máquina, descriptores moleculares, dinámica molecular, descubrimiento de fármacos, info:eu-repo/classification/cti/7, info:eu-repo/classification/cti/33, info:eu-repo/classification/cti/3312, info:eu-repo/classification/cti/331208
Description: Structure-based virtual screening (SBVS) is a key component of early-stage drug discovery and development. As a computational technique, SBVS allows the evaluation of a vast number of molecules (ligands) by simulating their binding interactions with a protein target (receptor). However, traditional SBVS methods do not account for the flexibility of the receptor, which negatively affects SBVS performance. To address this limitation, ensemble docking (ED) is often used to incorporate protein flexibility into SBVS campaigns. ED involves running molecular docking simulations on multiple conformations of the same receptor. Although ED has been successfully applied in previous studies, there are still challenges in determining the best strategies for aggregating ED results and identifying the receptor conformations with the greatest virtual screening utility. In the present study, we proposed using machine learning (ML) as an alternative to traditional consensus strategies to aggregate the ensemble docking results of four proteins: CDK2, EGFR, FXa, and HSP90. Specifically, ensemble docking scores were used as molecular descriptors derived from the predicted receptor-ligand binding. Subsequently, these molecular descriptors were used to develop ML classifiers trained to identify true-binder molecules. Results showed that ML classifiers achieved statistically significantly higher SBVS performance than traditional strategies. Additionally, we investigated whether the composition of the protein conformational ensemble affects the ensemble docking performance, to gain insights into how to select the best set of protein conformations before the docking phase. In this regard, we compared the performance of traditional strategies (single-conformation docking and consensus strategies) against the ML approach using both crystallographic and molecular dynamics-derived (MD-derived) conformations. We conducted this analysis using the CDK2 protein as a case study. Our results suggest that, contrary to traditional strategies, the performance of the ...
Publication type: doctoral or postdoctoral thesis
File description: application/pdf
Language: English
Relation: citation:Ricci López, J. 2023. Virtual screening using machine learning techniques and ensemble docking-based molecular descriptors. PhD Thesis in Sciences. Centro de Investigación Científica y de Educación Superior de Ensenada, Baja California. 179 pp.; http://cicese.repositorioinstitucional.mx/jspui/handle/1007/3900
Availability: http://cicese.repositorioinstitucional.mx/jspui/handle/1007/3900
Rights: info:eu-repo/semantics/openAccess ; http://creativecommons.org/licenses/by/4.0
Document code: edsbas.75D518B8
Database: BASE
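Illustrative sketch (not part of the catalogued record or the thesis): the abstract describes using per-conformation docking scores as molecular descriptors for an ML classifier, as an alternative to consensus aggregation. The snippet below is a minimal sketch of that idea under loud assumptions: the docking scores are synthetic, the planted signal in conformations 0 and 3 is hypothetical, the two consensus rules (lowest score, mean score) are common examples rather than the thesis's exact strategies, and the hand-written logistic regression is only a stand-in for whatever classifiers the author used. All function names are made up for illustration.

```python
import math
import random

def roc_auc(scores, labels):
    """Chance that a random active outranks a random decoy (higher score = more active)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=300):
    """Batch gradient-descent logistic regression (stand-in for any classifier)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def make_ligand(rng, is_active, n_conf=5):
    """Synthetic docking scores (lower = better) for one ligand, one per conformation.
    Hypothetical signal: actives score 1.5 units better in conformations 0 and 3 only."""
    row = [rng.gauss(-7.0, 1.0) for _ in range(n_conf)]
    if is_active:
        for j in (0, 3):
            row[j] -= 1.5
    return row

rng = random.Random(0)
actives = [make_ligand(rng, True) for _ in range(60)]
decoys = [make_ligand(rng, False) for _ in range(140)]

# Stratified 70/30 split so both classes appear in train and test.
X_train, y_train = actives[:42] + decoys[:98], [1] * 42 + [0] * 98
X_test, y_test = actives[42:] + decoys[98:], [1] * 18 + [0] * 42

# Center each descriptor (conformation) using training-set means only.
means = [sum(col) / len(col) for col in zip(*X_train)]
def center(X):
    return [[x - m for x, m in zip(row, means)] for row in X]

# ML aggregation: each conformation's docking score is one feature.
w, b = fit_logistic(center(X_train), y_train)
ml_scores = [b + sum(wj * xj for wj, xj in zip(w, row)) for row in center(X_test)]

# Consensus aggregation: negate so that a better (lower) docking score ranks higher.
auc_min = roc_auc([-min(row) for row in X_test], y_test)
auc_mean = roc_auc([-sum(row) / len(row) for row in X_test], y_test)
auc_ml = roc_auc(ml_scores, y_test)

print(f"consensus (lowest score): AUC = {auc_min:.3f}")
print(f"consensus (mean score):   AUC = {auc_mean:.3f}")
print(f"logistic regression:      AUC = {auc_ml:.3f}")
```

The design point is only that a classifier can weight conformations differently, whereas a fixed consensus rule treats them all alike; the printed AUC values come from synthetic data and say nothing about the results reported in the thesis.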