Comprehensive ensemble in QSAR prediction for drug discovery



Bibliographic details
Published in:	BMC Bioinformatics, Vol. 20, No. 1, p. 521 - 12
Main authors:	Kwon, Sunyoung, Bae, Ho, Jo, Jeonghee, Yoon, Sungroh
Format:	Journal Article
Language:	English
Published:	London BioMed Central 26.10.2019 BMC
Subjects:	Algorithms; Bioinformatics; Biomedical and Life Sciences; Computational Biology/Bioinformatics; Computer Appl. in Life Sciences; Drug Discovery - methods; Drug-prediction; Ensemble-learning; Life Sciences; Machine Learning; Machine Learning and Artificial Intelligence in Bioinformatics; Meta-learning; Methodology; Methodology Article; Microarrays; Quantitative Structure-Activity Relationship
ISSN:	1471-2105

Description
Summary:	Background: Quantitative structure-activity relationship (QSAR) modeling is a computational method for revealing relationships between the structural properties of chemical compounds and their biological activities. QSAR modeling is essential for drug discovery, but it has many constraints. Ensemble-based machine learning approaches have been used to overcome these constraints and obtain reliable predictions. Ensemble learning builds a set of diversified models and combines them. However, the most prevalent approach, random forest, and other ensemble approaches in QSAR prediction limit their model diversity to a single subject. Results: The proposed ensemble method consistently outperformed 13 individual models on 19 bioassay datasets and demonstrated superiority over other ensemble approaches that are limited to a single subject. The comprehensive ensemble method is publicly available at http://data.snu.ac.kr/QSAR/ . Conclusions: We propose a comprehensive ensemble method that builds multi-subject diversified models and combines them through second-level meta-learning. In addition, we propose an end-to-end neural network-based individual classifier that can automatically extract sequential features from a simplified molecular-input line-entry system (SMILES) string. The proposed individual model did not show impressive results as a single model, but it was considered the most important predictor when combined, according to the interpretation of the meta-learning.
DOI:	10.1186/s12859-019-3135-4
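
The summary describes combining diversified first-level models through second-level meta-learning. The sketch below is only a rough illustration of that general idea (stacking on out-of-fold predictions) and is not the authors' implementation. Everything in it is an assumption made for illustration: the two toy feature views `view_a` and `view_b` stand in for different molecular representations, the nearest-centroid scorer stands in for the individual classifiers, and a plain logistic regression stands in for the meta-learner.

```python
import math

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_centroid(X, y):
    """Hypothetical first-level learner: scores a sample by distance to class centroids."""
    dim = len(X[0])
    def centroid(label):
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    mu_pos, mu_neg = centroid(1), centroid(0)
    def predict(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, mu_pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, mu_neg))
        return sigmoid(d_neg - d_pos)  # closer to the active centroid -> higher score
    return predict

def out_of_fold_scores(X, y, k):
    """Score each sample with a model trained on the other folds only."""
    n = len(X)
    scores = [0.0] * n
    for fold in range(k):
        # Samples are assigned to folds in (inactive, active) pairs so every fold has both classes.
        held = [i for i in range(n) if (i // 2) % k == fold]
        train = [i for i in range(n) if (i // 2) % k != fold]
        model = fit_centroid([X[i] for i in train], [y[i] for i in train])
        for i in held:
            scores[i] = model(X[i])
    return scores

def fit_meta(Z, y, lr=0.5, epochs=2000):
    """Second-level meta-learner: logistic regression trained by batch gradient descent."""
    w = [0.0] * (len(Z[0]) + 1)  # last entry is the bias
    n = len(Z)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for z, t in zip(Z, y):
            err = sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + w[-1]) - t
            for j, zj in enumerate(z):
                grad[j] += err * zj
            grad[-1] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

# Toy, deterministic data (not real bioassay data): label 1 = active, 0 = inactive.
n = 40
y = [i % 2 for i in range(n)]
view_a = [[y[i] + 0.1 * ((i // 2) % 5 - 2)] for i in range(n)]        # strongly informative view
view_b = [[0.5 * y[i] + 0.1 * ((i // 2) % 7 - 3)] for i in range(n)]  # weaker, overlapping view

# First level: one model per view, evaluated out-of-fold to avoid leaking labels.
Z = list(zip(out_of_fold_scores(view_a, y, k=4), out_of_fold_scores(view_b, y, k=4)))

# Second level: learn how to combine the first-level scores.
weights = fit_meta(Z, y)
probs = [sigmoid(weights[0] * a + weights[1] * b + weights[2]) for a, b in Z]
accuracy = sum((p >= 0.5) == (t == 1) for p, t in zip(probs, y)) / n
print("meta-learner weights:", weights)
print("training accuracy of the stacked model:", accuracy)
```

Training the meta-learner on out-of-fold scores, rather than on scores from models that have already seen the same samples, is the usual way to keep the second level from simply rewarding overfitted first-level models.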