BBK (Branch and Bound Over K): A Provable and Efficient Ensemble-Based Protein Design Algorithm to Optimize Stability and Binding Affinity Over Large Sequence Spaces

Computational protein design (CPD) algorithms that compute binding affinity, K , search for sequences with an energetically favorable free energy of binding. Recent work shows that three principles improve the biological accuracy of CPD: ensemble-based design, continuous flexibility of backbone and...

Full description

Saved in:
Bibliographic Details
Published in:Journal of computational biology Vol. 25; no. 7; p. 726
Main Authors: Ojewole, Adegoke A, Jou, Jonathan D, Fowler, Vance G, Donald, Bruce R
Format: Journal Article
Language:English
Published: United States 01.07.2018
Subjects:
ISSN:1557-8666, 1557-8666
Online Access:Get more information
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Computational protein design (CPD) algorithms that compute binding affinity, K , search for sequences with an energetically favorable free energy of binding. Recent work shows that three principles improve the biological accuracy of CPD: ensemble-based design, continuous flexibility of backbone and side-chain conformations, and provable guarantees of accuracy with respect to the input. However, previous methods that use all three design principles are single-sequence (SS) algorithms, which are very costly: linear in the number of sequences and thus exponential in the number of simultaneously mutable residues. To address this computational challenge, we introduce BBK*, a new CPD algorithm whose key innovation is the multisequence (MS) bound: BBK* efficiently computes a single provable upper bound to approximate K for a combinatorial number of sequences, and avoids SS computation for all provably suboptimal sequences. Thus, to our knowledge, BBK* is the first provable, ensemble-based CPD algorithm to run in time sublinear in the number of sequences. Computational experiments on 204 protein design problems show that BBK* finds the tightest binding sequences while approximating K for up to 10 -fold fewer sequences than the previous state-of-the-art algorithms, which require exhaustive enumeration of sequences. Furthermore, for 51 protein-ligand design problems, BBK* provably approximates K up to 1982-fold faster than the previous state-of-the-art iMinDEE/[Formula: see text]/[Formula: see text] algorithm. Therefore, BBK* not only accelerates protein designs that are possible with previous provable algorithms, but also efficiently performs designs that are too large for previous methods.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:1557-8666
1557-8666
DOI:10.1089/cmb.2017.0267