A Unified Dynamic Programming Framework for the Analysis of Interacting Nucleic Acid Strands: Enhanced Models, Scalability, and Speed
| Published in: | ACS Synthetic Biology, vol. 9, no. 10, p. 2665 |
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | 16 October 2020 |
| ISSN: | 2161-5063 |
| Abstract | Dynamic programming algorithms within the NUPACK software suite enable analysis of nucleic acid sequences over complex and test tube ensembles containing arbitrary numbers of interacting strand species, serving the needs of researchers in molecular programming, nucleic acid nanotechnology, synthetic biology, and across the life sciences. Here, to enhance the underlying physical model, ensure scalability for large calculations, and achieve dramatic speedups when calculating diverse physical quantities over complex and test tube ensembles, we introduce a unified dynamic programming framework that combines three ingredients: (1) recursions that specify the dependencies between subproblems and incorporate the details of the structural ensemble and the free energy model, (2) evaluation algebras that define the mathematical form of each subproblem, (3) operation orders that specify the computational trajectory through the dependency graph of subproblems. The physical model is enhanced using new recursions that operate over the complex ensemble including coaxial and dangle stacking subensembles. The recursions are coded generically and then compiled with a quantity-specific evaluation algebra and operation order to generate an executable for each physical quantity: partition function, equilibrium base-pairing probabilities, MFE energy and proxy structure, suboptimal proxy structures, and Boltzmann sampled structures. For large complexes (e.g., 30 000 nt), scalability is achieved for partition function calculations using an overflow-safe evaluation algebra, and for equilibrium base-pairing probabilities using a backtrack-free operation order. A new blockwise operation order that treats subcomplex blocks for the complex species in a test tube ensemble enables dramatic speedups (e.g., 20–120×) using vectorization and caching. With these performance enhancements, equilibrium analysis of substantial test tube ensembles can be performed in ≤ 1 min on a single computational core (e.g., partition function and equilibrium concentration for all complex species of up to six strands formed from two strand species of 300 nt each, or for all complex species of up to two strands formed from 80 strand species of 100 nt each). A new sampling algorithm simultaneously samples multiple structures from the complex ensemble to yield speedups of an order of magnitude or more as the number of structures increases above ≈10³. These advances are available within the NUPACK 4.0 code base (www.nupack.org) which can be flexibly scripted using the all-new NUPACK Python module. |
| Authors | Fornace, Mark E; Porubsky, Nicholas J; Pierce, Niles A |
| DOI | 10.1021/acssynbio.9b00523 |
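
The abstract's idea of pairing one generic recursion with interchangeable evaluation algebras can be illustrated with a small sketch. The code below is not NUPACK's recursion set or API: it assumes a toy Nussinov-style model in which every base pair contributes a fixed free energy (in units of kT), and all names (`Algebra`, `evaluate`, `SUM_PRODUCT`, `MIN_SUM`, `LOG_SUM`) are hypothetical. The same recursion yields a partition function under a sum-product algebra, a minimum free energy under a min-sum algebra, and a log partition function under a log-space algebra that avoids floating-point overflow.

```python
import math
from dataclasses import dataclass
from functools import lru_cache
from typing import Callable


@dataclass(frozen=True)
class Algebra:
    """Hypothetical evaluation algebra: how subproblem values are combined."""
    zero: float                             # identity for plus
    one: float                              # identity for times
    plus: Callable[[float, float], float]   # combines alternative cases
    times: Callable[[float, float], float]  # combines independent parts
    lift: Callable[[float], float]          # maps a free energy (kT) to a value


def logaddexp(a: float, b: float) -> float:
    """Compute log(exp(a) + exp(b)) without forming large exponentials."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


# Partition function: sum of Boltzmann factors exp(-G).
SUM_PRODUCT = Algebra(0.0, 1.0, lambda a, b: a + b, lambda a, b: a * b,
                      lambda g: math.exp(-g))
# Minimum free energy: minimum over structures of the summed energies.
MIN_SUM = Algebra(math.inf, 0.0, min, lambda a, b: a + b, lambda g: g)
# Log partition function: the same sum, carried out in log space.
LOG_SUM = Algebra(-math.inf, 0.0, logaddexp, lambda a, b: a + b,
                  lambda g: -g)

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def evaluate(seq: str, alg: Algebra, pair_energy: float = -1.0,
             min_loop: int = 3) -> float:
    """Evaluate the toy recursion over all unpseudoknotted structures of seq."""

    @lru_cache(maxsize=None)
    def q(i: int, j: int) -> float:
        # Empty or single-nucleotide subsequence: only the unpaired structure.
        if j - i < 1:
            return alg.one
        # Case 1: nucleotide j is unpaired.
        total = q(i, j - 1)
        # Case 2: j pairs with k, leaving at least min_loop bases in between.
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                term = alg.times(alg.times(q(i, k - 1), q(k + 1, j - 1)),
                                 alg.lift(pair_energy))
                total = alg.plus(total, term)
        return total

    return q(0, len(seq) - 1)


if __name__ == "__main__":
    seq = "GGGAAACCC"
    print("partition function:", evaluate(seq, SUM_PRODUCT))
    print("minimum free energy (kT):", evaluate(seq, MIN_SUM))
    print("log partition function:", evaluate(seq, LOG_SUM))
```

In this sketch the memoized recursion plays the role of a very simple operation order; an implementation aimed at long sequences would more plausibly fill the table iteratively by increasing subsequence length. The log-space algebra is only one way to avoid overflow and is not claimed to be the overflow-safe algebra used in the paper.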