A novel method based on new adaptive LVQ neural network for predicting protein–protein interactions from protein sequences
| Published in: | Journal of theoretical biology Vol. 336; pp. 231 - 239 |
|---|---|
| Main Authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | England, Elsevier Ltd, 07.11.2013 |
| Subjects: | |
| ISSN: | 0022-5193, 1095-8541 |
| Summary: | Protein–protein interaction (PPI) data are among the most important resources for understanding cellular processes. Many methods have been proposed to predict PPIs. However, methods that use only protein sequences as prior knowledge are more universally applicable. In this paper, a sequence-based, fast, and adaptive PPI prediction method is introduced to assign a pair of proteins to an interaction class (yes, no). First, to improve the representation of the sequences, twelve physicochemical properties of amino acids are used by different representation methods to transform the sequences of protein pairs into different feature vectors. Then, to speed up the learning process and reduce the effect of noisy PPI data, principal component analysis (PCA) is applied as the feature extraction algorithm. Finally, a new and adaptive Learning Vector Quantization (LVQ) predictor is designed to handle both balanced and imbalanced datasets. Accuracies of 93.88%, 90.03%, and 89.72% were obtained on the S. cerevisiae, H. pylori, and independent datasets, respectively. The results of various experiments indicate the efficiency and validity of the method. • We introduce a sequence-based, fast, and adaptive PPI prediction method. • The proposed method consists of four layers (multiple feature representation, feature extraction by PCA, multi-NALVQ classifier, and combination layer). • Increasing the number of amino acid physicochemical properties used in the representation method increases the information extracted from protein sequences. • The proposed method works as a one-class classification system when the dataset is imbalanced. • The algorithm predicts PPIs with high accuracy and efficiency. |
|---|---|
| Bibliography: | http://dx.doi.org/10.1016/j.jtbi.2013.07.001 |
| DOI: | 10.1016/j.jtbi.2013.07.001 |
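
The summary describes a pipeline of feature extraction by PCA followed by an LVQ classifier. The sketch below illustrates that general idea only, under loud assumptions: it uses the textbook LVQ1 update rule with one prototype per class, not the paper's adaptive NALVQ predictor, and it runs on synthetic Gaussian vectors rather than physicochemical encodings of real protein pairs. All function names (`make_toy_data`, `pca_fit_transform`, `lvq1_train`, `lvq_predict`) and parameter values are hypothetical choices for illustration.

```python
import numpy as np


def make_toy_data(n_per_class=100, n_features=20, seed=0):
    """Synthetic stand-in for pair feature vectors (assumption, not real PPI data)."""
    rng = np.random.default_rng(seed)
    x_neg = rng.normal(0.0, 1.0, size=(n_per_class, n_features))  # class 0: "no"
    x_pos = rng.normal(2.0, 1.0, size=(n_per_class, n_features))  # class 1: "yes"
    X = np.vstack([x_neg, x_pos])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


def pca_fit_transform(X, n_components):
    """Project the rows of X onto the top principal components, computed by SVD."""
    mean = X.mean(axis=0)
    centered = X - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components


def lvq1_train(X, y, n_epochs=30, lr=0.1, seed=0):
    """Textbook LVQ1: one prototype per class, initialized at the class mean."""
    rng = np.random.default_rng(seed)
    proto_labels = np.unique(y)
    prototypes = np.array([X[y == c].mean(axis=0) for c in proto_labels])
    for epoch in range(n_epochs):
        alpha = lr * (1.0 - epoch / n_epochs)  # linearly decaying learning rate
        for i in rng.permutation(len(X)):
            dists = np.linalg.norm(prototypes - X[i], axis=1)
            winner = np.argmin(dists)
            # Attract the winning prototype if its label matches, repel otherwise.
            sign = 1.0 if proto_labels[winner] == y[i] else -1.0
            prototypes[winner] += sign * alpha * (X[i] - prototypes[winner])
    return prototypes, proto_labels


def lvq_predict(X, prototypes, proto_labels):
    """Assign each row of X the label of its nearest prototype."""
    dists = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    return proto_labels[np.argmin(dists, axis=1)]


if __name__ == "__main__":
    X, y = make_toy_data()
    Z, _, _ = pca_fit_transform(X, n_components=5)
    prototypes, proto_labels = lvq1_train(Z, y)
    accuracy = (lvq_predict(Z, prototypes, proto_labels) == y).mean()
    print(f"training accuracy on toy data: {accuracy:.3f}")
```

The accuracy printed by this toy run says nothing about the figures reported in the summary; it only shows how the PCA projection feeds the prototype-based classifier.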