DBP-GAPred: An intelligent method for prediction of DNA-binding proteins types by enhanced evolutionary profile features with ensemble learning


Bibliographic details
Published in: Journal of bioinformatics and computational biology, Vol. 19, No. 4, p. 2150018
Main authors: Barukab, Omar; Ali, Farman; Khan, Sher Afzal
Format: Journal Article
Language: English
Published: Singapore, 1 August 2021
ISSN: 1757-6334
Description
Abstract: DNA-binding proteins (DBPs) play an influential role in diverse biological activities such as DNA replication, slicing, repair, and transcription. Some DBPs are indispensable for understanding many types of human cancer (e.g. lung, breast, and liver cancer) and chronic diseases (e.g. AIDS/HIV, asthma), while others are involved in the design of antibiotics, steroids, and anti-inflammatory drugs. These crucial processes are closely related to DBP types. DBPs are categorized into single-stranded DNA-binding proteins (ssDBPs) and double-stranded DNA-binding proteins (dsDBPs). A few computational predictors have been reported for discriminating ssDBPs from dsDBPs. However, due to the limitations of the existing methods, an intelligent computational system is still highly desirable. In this work, features are extracted from protein sequences by extending the notions of dipeptide composition (DPC), evolutionary difference formula (EDF), and K-separated bigram (KSB) to the position-specific scoring matrix (PSSM). The highly intrinsic information was encoded by a compression approach, the discrete cosine transform (DCT), and the model was trained with a support vector machine (SVM). The prediction performance was further boosted by a genetic algorithm (GA) ensemble strategy. The novel predictor (DBP-GAPred) achieved 1.89%, 0.28%, and 6.63% higher accuracies on jackknife, 10-fold, and independent dataset tests, respectively, than the best existing predictor. These outcomes confirm the superiority of our method over the existing predictors.
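As a rough illustration of the feature pipeline the abstract describes, the sketch below computes K-separated bigram features from a PSSM and then compresses them with a discrete cosine transform. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `ksb_pssm` and `dct2_1d`, the normalization by the number of position pairs, the use of a 1-D DCT-II on the flattened feature vector, and the number of retained coefficients are all illustrative choices not given in the abstract. The EDF features, the SVM, and the GA ensemble are not shown.

```python
import math


def ksb_pssm(pssm, k=1, normalize=True):
    """K-separated bigram features from an L x W PSSM (W is 20 for proteins).

    Entry (i, j) accumulates pssm[t][i] * pssm[t + k][j] over all valid t.
    With k=1 and normalize=True this is the commonly used DPC-PSSM form
    (an assumption here; the abstract does not give the formula).
    """
    length = len(pssm)
    if length <= k:
        raise ValueError("sequence must be longer than the gap k")
    width = len(pssm[0])
    feats = [0.0] * (width * width)
    for t in range(length - k):
        row_a, row_b = pssm[t], pssm[t + k]
        for i in range(width):
            a = row_a[i]
            for j in range(width):
                feats[i * width + j] += a * row_b[j]
    if normalize:
        feats = [v / (length - k) for v in feats]
    return feats


def dct2_1d(x, keep=None):
    """Unnormalized 1-D DCT-II, returning only the first `keep` coefficients."""
    n = len(x)
    keep = n if keep is None else min(keep, n)
    return [
        sum(x[m] * math.cos(math.pi * (m + 0.5) * c / n) for m in range(n))
        for c in range(keep)
    ]


if __name__ == "__main__":
    # Toy 30 x 20 "PSSM" with made-up scores, only to exercise the functions.
    toy = [[((r * 7 + c * 3) % 11) - 5 for c in range(20)] for r in range(30)]
    features = ksb_pssm(toy, k=2)          # 400 bigram features
    compressed = dct2_1d(features, keep=100)  # hypothetical cut-off of 100
    print(len(features), len(compressed))
```

Keeping only the leading DCT coefficients is what makes the transform act as compression: the low-frequency terms carry most of the signal energy, so the vector handed to a classifier can be much shorter than the raw 400 features.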
DOI: 10.1142/S0219720021500189