Direct coupling analysis and the attention mechanism

Bibliographic details
Published in:	BMC Bioinformatics, Volume 26, Issue 1, p. 41 - 21
Main authors:	Caredda, Francesco; Pagnani, Andrea
Format:	Journal Article
Language:	English
Published:	England: BioMed Central Ltd, 6 February 2025; Springer Nature B.V; BMC
Subjects:	Accuracy; Algorithms; Amino acid sequence; Amino acids; Architecture; Attention mechanism; Cellular structure; Complexity; Computational Biology - methods; Computational linguistics; Coupling; Direct coupling analysis; Enzymatic activity; Enzymes; Evolution; Information processing; Language processing; Machine learning; Methods; Models, Molecular; Natural language interfaces; Protein Conformation; Protein families; Protein Folding; Protein structure; Protein structure prediction; Protein transport; Proteins; Proteins - chemistry; Structure-function relationships; Terminology; Transformer; Italy
ISSN:	1471-2105

Description
Summary:	Proteins are involved in nearly all cellular functions, encompassing roles in transport, signaling, enzymatic activity, and more. Their functionalities crucially depend on their complex three-dimensional arrangement. For this reason, being able to predict their structure from the amino acid sequence has been and still is a phenomenal computational challenge that the introduction of AlphaFold solved with unprecedented accuracy. However, the inherent complexity of AlphaFold's architectures makes it challenging to understand the rules that ultimately shape the protein's predicted structure. This study investigates a single-layer unsupervised model based on the attention mechanism. More precisely, we explore a Direct Coupling Analysis (DCA) method that mimics the attention mechanism of several popular Transformer architectures, such as AlphaFold itself. The model's parameters, notably fewer than those in standard DCA-based algorithms, can be directly used for extracting structural determinants such as the contact map of the protein family under study. Additionally, the functional form of the energy function of the model enables us to deploy a multi-family learning strategy, allowing us to effectively integrate information across multiple protein families, whereas standard DCA algorithms are typically limited to single protein families. Finally, we implemented a generative version of the model using an autoregressive architecture, capable of efficiently generating new proteins in silico.
DOI:	10.1186/s12859-025-06062-y