A novel method based on new adaptive LVQ neural network for predicting protein–protein interactions from protein sequences

Bibliographic Details
Published in:	Journal of Theoretical Biology, Vol. 336, pp. 231–239
Main Authors:	Yousef, Abdulaziz, Moghadam Charkari, Nasrollah
Format:	Journal Article
Language:	English
Published:	England Elsevier Ltd 07.11.2013
Subjects:	Algorithms; Amino Acid Sequence; amino acid sequences; amino acids; Classifier combination; data collection; Databases, Protein; Feature extraction; Helicobacter pylori; Helicobacter pylori - metabolism; Humans; learning; Learning vector quantization neural network; neural networks; Neural Networks (Computer); new methods; physicochemical properties; prediction; Principal Component Analysis; Principal components analysis; Protein Interaction Mapping - methods; protein-protein interactions; proteins; Protein–protein interaction prediction; Saccharomyces cerevisiae - metabolism
ISSN:	0022-5193, 1095-8541

Description
Summary:	Protein–protein interaction (PPI) data are among the most important for understanding cellular processes. Many methods have been proposed to predict PPIs; those that rely only on protein sequences as prior knowledge are the most universal. In this paper, a sequence-based, fast, and adaptive PPI prediction method is introduced to assign a pair of proteins to an interaction class (yes or no). First, to improve the representation of the sequences, twelve physicochemical properties of amino acids are used by different representation methods to transform the sequences of protein pairs into different feature vectors. Then, to speed up learning and reduce the effect of noisy PPI data, principal component analysis (PCA) is applied as the feature extraction algorithm. Finally, a new adaptive Learning Vector Quantization (LVQ) predictor is designed to handle both balanced and imbalanced datasets. Accuracies of 93.88%, 90.03%, and 89.72% were obtained on the S. cerevisiae, H. pylori, and independent datasets, respectively. The results of various experiments indicate the efficiency and validity of the method.
• We introduce a sequence-based, fast, and adaptive PPI prediction method.
• The proposed method consists of four layers (multiple feature representation, feature extraction PCA, multi-NALVQ classifier, and combination layer).
• Increasing the number of amino acid physicochemical properties used in the representation method increases the information extracted from protein sequences.
• The proposed method works as a one-class classification system when the dataset is imbalanced.
• The algorithm predicts PPIs with high accuracy and efficiency.
Bibliography:	http://dx.doi.org/10.1016/j.jtbi.2013.07.001
DOI:	10.1016/j.jtbi.2013.07.001
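
The summary describes a pipeline that encodes protein pairs from physicochemical properties and classifies them with an LVQ predictor. The Python sketch below illustrates that general idea only, under loud assumptions: it uses plain LVQ1 rather than the paper's adaptive NALVQ, a single property (the Kyte-Doolittle hydropathy scale) instead of twelve, and a made-up two-number descriptor per protein. It omits the PCA and classifier-combination layers. The function names (`encode_protein`, `encode_pair`, `train_lvq1`, `predict`) and the toy sequences are hypothetical and are not taken from the paper.

```python
import random

# Kyte-Doolittle hydropathy scale: one physicochemical property, used here
# as a stand-in for the twelve properties mentioned in the summary.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_protein(seq):
    """Hypothetical descriptor: [mean hydropathy, fraction of hydrophobic residues]."""
    values = [HYDROPATHY[aa] for aa in seq.upper() if aa in HYDROPATHY]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    mean = sum(values) / len(values)
    hydrophobic_fraction = sum(v > 0 for v in values) / len(values)
    return [mean, hydrophobic_fraction]


def encode_pair(seq_a, seq_b):
    """Concatenate the two protein descriptors into one pair feature vector."""
    return encode_protein(seq_a) + encode_protein(seq_b)


def _sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def _nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda i: _sq_dist(prototypes[i][0], x))


def train_lvq1(samples, labels, epochs=30, lr=0.1, seed=0):
    """Plain LVQ1 with one prototype per class (not the paper's adaptive variant)."""
    rng = random.Random(seed)
    # Initialise each class prototype from the first sample seen for that class.
    prototypes, seen = [], set()
    for x, y in zip(samples, labels):
        if y not in seen:
            seen.add(y)
            prototypes.append((list(x), y))
    order = list(range(len(samples)))
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            w, w_label = prototypes[_nearest(prototypes, x)]
            # Attract the winner if its label matches, repel it otherwise.
            sign = 1.0 if w_label == y else -1.0
            for d in range(len(w)):
                w[d] += sign * rate * (x[d] - w[d])
    return prototypes


def predict(prototypes, x):
    """Return the label of the nearest prototype."""
    return prototypes[_nearest(prototypes, x)][1]


# Toy, invented sequences and labels purely to exercise the code;
# they are not real PPI data.
toy_pairs = [("IIVVLL", "LLVVAA"), ("VVLLII", "AALLVV"),
             ("DDEEKK", "RRNNQQ"), ("KKEEDD", "QQNNRR")]
toy_labels = [1, 1, 0, 0]
toy_features = [encode_pair(a, b) for a, b in toy_pairs]
toy_model = train_lvq1(toy_features, toy_labels)
```

A trained model is queried with `predict(toy_model, encode_pair(seq_a, seq_b))`, which returns the class label of the closest prototype. One prototype per class keeps the sketch short; a practical LVQ setup would usually use several prototypes per class and normalised features.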