Using pseudo amino acid composition to predict transmembrane regions in protein: cellular automata and Lempel-Ziv complexity

Bibliographic Details
Published in:Amino acids Vol. 34; no. 1; pp. 111 - 117
Main Authors: Diao, Y, Ma, D, Wen, Z, Yin, J, Xiang, J, Li, M
Format: Journal Article
Language:English
Published: Vienna: Springer-Verlag, 2008
Springer-Verlag
Springer Nature B.V
ISSN:0939-4451, 1438-2199
Description
Summary:Transmembrane (TM) proteins represent about 20-30% of the protein sequences in higher eukaryotes and play important roles across a range of cellular functions. Knowledge of the topology of these proteins often provides crucial hints about their function. Because experimental structure determination of TM proteins is difficult, theoretical prediction methods are highly preferred for identifying the topology of newly found TM proteins from their primary sequences, which is useful in both basic research and drug discovery. In this paper, cellular automata and Lempel-Ziv complexity are introduced to predict the TM regions of integral membrane proteins, including both α-helical and β-barrel membrane proteins. The approach is based on the concept of pseudo amino acid composition (PseAA), which can incorporate sequence-order information of a protein sequence and so remarkably enhance the power of discrete models (Chou, K. C., Proteins: Structure, Function, and Genetics, 2001, 43: 246-255), and it is validated by the jackknife test. The results are quite promising and indicate that the approach may be a useful high-throughput tool in the post-genomic era. The source code and dataset are available for academic users at liml@scu.edu.cn.
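Note: the summary names Lempel-Ziv complexity without defining it. As a rough illustration only, the sketch below counts the components of the exhaustive-history parsing described by Lempel and Ziv (1976) for an arbitrary string. This record does not say which variant the paper uses, how sequences are encoded beforehand, or how the value enters the PseAA feature vector, so the function name `lz76_complexity` and the example sequence are assumptions made for illustration, not the authors' code.

```python
def lz76_complexity(seq: str) -> int:
    """Count components in a Lempel-Ziv (1976) style parsing of seq.

    Illustrative sketch only; not the implementation used in the paper.
    Each component is the shortest substring starting at the current
    position that cannot be copied from the text seen so far
    (overlap with the component itself is allowed, minus its last symbol).
    """
    n = len(seq)
    pos = 0          # start of the component being built
    components = 0
    while pos < n:
        length = 1
        # Grow the candidate while it can still be reproduced from
        # everything before its final symbol.
        while pos + length <= n and seq[pos:pos + length] in seq[:pos + length - 1]:
            length += 1
        components += 1
        pos += length
    return components


if __name__ == "__main__":
    # Hypothetical short peptide, used only to show the call.
    print(lz76_complexity("MKTLLLAVAVVAFLLL"))
```

A low count means the sequence is largely built from repeats of earlier fragments; a higher count means more novel fragments appear along the chain.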
Bibliography:http://dx.doi.org/10.1007/s00726-007-0550-z
DOI:10.1007/s00726-007-0550-z