Deep Generative Models of LDLR Protein Structure to Predict Variant Pathogenicity

Bibliographic details
Published in: Journal of Lipid Research, vol. 64, no. 12, p. 100455
Main authors: James, Jose K.; Norland, Kristjan; Johar, Angad S.; Kullo, Iftikhar J.
Format: Journal Article
Language: English
Published: United States, Elsevier Inc, 1 December 2023
American Society for Biochemistry and Molecular Biology
Elsevier
ISSN: 0022-2275, 1539-7262
Description
Summary: The complex structure and function of the low-density lipoprotein receptor (LDLR) make classification of protein-coding missense variants challenging. Deep generative models, including the evolutionary model of variant effect (EVE), evolutionary scale modeling (ESM), and AlphaFold 2 (AF2), have enabled significant progress in the prediction of protein structure and function. ESM and EVE directly estimate the likelihood of a variant sequence but are purely data-driven and challenging to interpret. AF2 predicts LDLR structures, with variant effects modeled explicitly by estimating changes in stability. We tested the effectiveness of these models for predicting variant pathogenicity compared to established methods. AF2 produced two distinct conformations based on a novel hinge mechanism. Within ESM's hidden space, benign and pathogenic variants had different distributions. In EVE, these distributions were similar. EVE and ESM were comparable to PolyPhen-2, SIFT, REVEL, and PrimateAI for predicting binary classifications in ClinVar. However, both were more strongly correlated with experimental measures of LDL uptake. AF2 performed poorly in these tasks. Using the UK Biobank to compare association with clinical phenotypes, ESM and EVE were more strongly associated with serum LDL-C than PolyPhen-2. ESM was able to identify variants with more extreme LDL-C levels than EVE and had a significantly stronger association with atherosclerotic cardiovascular disease. In conclusion, AF2-predicted LDLR structures do not accurately model variant pathogenicity. ESM and EVE are competitive with prior scoring methods for prediction based on binary classification in ClinVar but are superior based on correlations with experimental assays and clinical phenotypes.
DOI: 10.1016/j.jlr.2023.100455
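
Illustrative note: the abstract above compares predictors in two ways, by binary classification of ClinVar labels and by correlation with a continuous experimental measure (LDL uptake). The record does not state which statistics the authors used, so the following is only a minimal sketch under the assumption that such a comparison could use the area under the ROC curve (AUROC) and Spearman rank correlation. The function names, the convention that a higher score means "more pathogenic", and the label coding (1 = pathogenic, 0 = benign) are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via the Mann-Whitney U statistic.

    Assumed convention (not from the source): label 1 = pathogenic,
    label 0 = benign, higher score = more likely pathogenic.
    """
    ranks = average_ranks(scores)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5
```

Under these assumptions, a predictor would be scored once with `auroc(scores, clinvar_labels)` and once with `spearman(scores, ldl_uptake)`. If higher scores mean more damaging variants, a good predictor would be expected to show a negative correlation with LDL uptake.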