Three-stage prediction of protein β-sheets by neural networks, alignments and graph algorithms

Published in: Bioinformatics, Volume 21, Issue suppl-1, pp. i75-i84
Main authors: Cheng, Jianlin; Baldi, Pierre
Format: Journal Article
Language: English
Published: England, Oxford University Press, 1 June 2005
ISSN: 1367-4803, 1460-2059
Abstract
Motivation: Protein β-sheets play a fundamental role in protein structure, function, evolution and bioengineering. Accurate prediction and assembly of protein β-sheets, however, remains challenging because protein β-sheets require formation of hydrogen bonds between linearly distant residues. Previous approaches for predicting β-sheet topological features, such as β-strand alignments, in general have not exploited the global covariation and constraints characteristic of β-sheet architectures.
Results: We propose a modular approach to the problem of predicting/assembling protein β-sheets in a chain by integrating both local and global constraints in three steps. The first step uses recursive neural networks to predict pairing probabilities for all pairs of interstrand β-residues from profile, secondary structure and solvent accessibility information. The second step applies dynamic programming techniques to these probabilities to derive binding pseudoenergies and optimal alignments between all pairs of β-strands. Finally, the third step uses graph matching algorithms to predict the β-sheet architecture of the protein by optimizing the global pseudoenergy while enforcing strong global β-strand pairing constraints. The approach is evaluated using cross-validation methods on a large non-homologous dataset and yields significant improvements over previous methods.
Availability: http://www.igb.uci.edu/servers/psss.html
Contact: pfbaldi@ics.uci.edu
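The second and third steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the residue pairing probabilities from the first step are already available as a square matrix `P`, scores only ungapped strand alignments by summing the probabilities of aligned residue pairs, and replaces the paper's graph matching with a simple greedy selection in which each strand may take at most two partners. The names `best_alignment`, `assemble`, `max_partners` and `min_score` are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def best_alignment(P, s, t):
    """Best ungapped alignment of strands s and t (lists of residue indices).

    The score is the sum of P[i][j] over aligned residue pairs, a stand-in
    for the paper's binding pseudoenergy (assumption, not the paper's formula).
    Returns (score, orientation, offset); orientation is None if no pair scores.
    """
    best = (0.0, None, 0)
    for orientation, t_ordered in (("parallel", t), ("antiparallel", t[::-1])):
        # Slide t along s; offset is the position in s facing t_ordered[0].
        for offset in range(-(len(t_ordered) - 1), len(s)):
            score = 0.0
            for k, j in enumerate(t_ordered):
                pos = offset + k
                if 0 <= pos < len(s):
                    score += P[s[pos]][j]
            if score > best[0]:
                best = (score, orientation, offset)
    return best


def assemble(P, strands, max_partners=2, min_score=0.0):
    """Greedily pick strand pairs by descending score.

    Enforces one global constraint: a strand has at most max_partners partners.
    """
    scored = []
    for a, b in combinations(range(len(strands)), 2):
        score, orientation, _ = best_alignment(P, strands[a], strands[b])
        scored.append((score, a, b, orientation))
    scored.sort(reverse=True)

    partners = [0] * len(strands)
    chosen = []
    for score, a, b, orientation in scored:
        if score <= min_score:
            break  # remaining candidates have no support in P
        if partners[a] < max_partners and partners[b] < max_partners:
            chosen.append((a, b, orientation, score))
            partners[a] += 1
            partners[b] += 1
    return chosen


# Toy input (made up): 12 residues, three strands of three residues each.
n = 12
P = [[0.0] * n for _ in range(n)]
for i, j, p in [(0, 7, 0.9), (1, 6, 0.9), (2, 5, 0.9),      # strand 0 with 1
                (5, 9, 0.8), (6, 10, 0.8), (7, 11, 0.8)]:   # strand 1 with 2
    P[i][j] = P[j][i] = p
strands = [[0, 1, 2], [5, 6, 7], [9, 10, 11]]
sheet = assemble(P, strands)
```

In this toy case `sheet` pairs strand 0 with strand 1 and strand 1 with strand 2, while the unsupported pair of strands 0 and 2 is left out. The greedy pass is only one way to respect the partner limit; the abstract itself says only that graph matching algorithms optimize the global pseudoenergy under pairing constraints.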
Author affiliations:
Cheng, Jianlin: Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California Irvine, CA 92697, USA
Baldi, Pierre: Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California Irvine, CA 92697, USA
PubMed record: https://www.ncbi.nlm.nih.gov/pubmed/15961501
DOI: 10.1093/bioinformatics/bti1004
Genre: Journal Article; Research Support, U.S. Gov't, Non-P.H.S.; Research Support, N.I.H., Extramural
Grant: NLM NIH HHS, LM-07442-01
Subject terms: Algorithms; Computational Biology - methods; Humans; Hydrogen Bonding; Models, Chemical; Models, Molecular; Nerve Net; Protein Conformation; Protein Structure, Secondary; Proteins - chemistry; ROC Curve; X-Ray Diffraction