Using pseudo amino acid composition to predict transmembrane regions in protein: cellular automata and Lempel-Ziv complexity

Bibliographic details
Published in: Amino Acids, Vol. 34, No. 1, pp. 111-117
Main authors: Diao, Y; Ma, D; Wen, Z; Yin, J; Xiang, J; Li, M
Format: Journal Article
Language: English
Published: Vienna: Springer-Verlag, 2008
Springer Nature B.V
ISSN: 0939-4451, 1438-2199
Description
Summary: Transmembrane (TM) proteins represent about 20-30% of the protein sequences in higher eukaryotes and play important roles across a range of cellular functions. Knowledge of the topology of these proteins often provides crucial hints about their function. Because experimental structure determination of TM proteins is difficult, theoretical methods that predict the topology of newly found TM proteins from their primary sequences are highly valuable in both basic research and drug discovery. This paper builds on the concept of pseudo amino acid composition (PseAA), which incorporates sequence-order information of a protein sequence and thereby remarkably enhances the power of discrete models (Chou, K. C., Proteins: Structure, Function, and Genetics, 2001, 43: 246-255). Cellular automata and Lempel-Ziv complexity are introduced to predict the TM regions of integral membrane proteins, including both α-helical and β-barrel membrane proteins, and the predictions are validated by the jackknife test. The results are quite promising, which indicates that the approach might become a useful high-throughput tool in the post-genomic era. The source code and dataset are available to academic users from liml@scu.edu.cn.
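For readers unfamiliar with Lempel-Ziv complexity, the following is a minimal sketch of the standard 1976 Lempel-Ziv exhaustive-parsing measure applied to a plain string. This record does not say how the paper encodes sequences or combines the measure with cellular automata and PseAA, so the function name `lempel_ziv_complexity` and the direct use of a raw sequence string are assumptions made only for illustration.

```python
def lempel_ziv_complexity(sequence: str) -> int:
    """Count the phrases in the LZ76 exhaustive parsing of `sequence`.

    Illustrative sketch only; not the implementation used in the paper.
    """
    n = len(sequence)
    count = 0
    start = 0
    while start < n:
        length = 1
        # Grow the current phrase while it can still be copied from the
        # text that precedes its last symbol (overlapping copies allowed).
        while (start + length <= n
               and sequence[start:start + length] in sequence[:start + length - 1]):
            length += 1
        count += 1       # one phrase completed (or the sequence ran out)
        start += length  # the next phrase begins right after this one
    return count


# Textbook binary example: parses as 0 | 001 | 10 | 100 | 1000 | 101
print(lempel_ziv_complexity("0001101001000101"))  # prints 6
```

A highly repetitive string such as "AAAA" parses into only two phrases, whereas a more varied sequence of the same length yields more, which is why the measure can serve as a simple numerical descriptor of sequence regularity.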
Bibliography: http://dx.doi.org/10.1007/s00726-007-0550-z
DOI: 10.1007/s00726-007-0550-z