Multithread Multistring Burrows-Wheeler Transform and Longest Common Prefix Array
Indexing huge collections of strings, such as those produced by the widespread sequencing technologies, heavily relies on multistring generalizations of the Burrows-Wheeler transform (BWT) and the longest common prefix (LCP) array, since efficiently solving both problems is an essential ingredient of...
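To make the two structures named in the abstract concrete, here is a minimal sketch that builds the multistring BWT and LCP array directly from their definitions, by sorting every suffix of the collection. It is not the multithread divide-and-conquer algorithm of the article, whose details this record does not give. The function names, the `(0, idx)` / `(1, c)` encoding of end-markers and letters, the printed `$` symbol, and the convention `LCP[0] = 0` are all assumptions made for illustration.

```python
def common_prefix_len(a, b):
    """Length of the longest common prefix of two encoded suffixes."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def multistring_bwt_lcp(strings):
    """Naive multistring BWT and LCP array of a collection of strings.

    Assumption: string number idx ends with its own end-marker, and the
    end-markers are ordered by idx and sort before every letter.
    """
    suffixes = []
    for idx, s in enumerate(strings):
        # Letters become (1, c); the end-marker of string idx becomes (0, idx).
        encoded = [(1, c) for c in s] + [(0, idx)]
        for start in range(len(encoded)):
            suffixes.append((encoded[start:], idx, start))

    # Sort all suffixes of all strings lexicographically.
    # Keys are pairwise distinct because every suffix ends in a unique marker.
    suffixes.sort(key=lambda item: item[0])

    bwt = []
    lcp = []
    previous = None
    for suffix, idx, start in suffixes:
        # BWT symbol: the character preceding the suffix in its own string;
        # a suffix that is a whole string is preceded by an end-marker.
        bwt.append(strings[idx][start - 1] if start > 0 else "$")
        # LCP value: common prefix with the previous suffix in sorted order.
        lcp.append(0 if previous is None else common_prefix_len(previous, suffix))
        previous = suffix
    return "".join(bwt), lcp


if __name__ == "__main__":
    print(multistring_bwt_lcp(["banana"]))
    print(multistring_bwt_lcp(["ab", "b"]))
```

Sorting whole suffixes costs far more time and memory than practical constructions do, so this sketch only pins down what the output looks like on small inputs.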
| Published in: | Journal of computational biology Volume 26; Issue 9; p. 948 |
|---|---|
| Main authors: | Bonizzoni, Paola; Della Vedova, Gianluca; Pirola, Yuri; Previtali, Marco; Rizzi, Raffaella |
| Format: | Journal Article |
| Language: | English |
| Published: | United States, 1 September 2019 |
| Subjects: | |
| ISSN: | 1557-8666 |
| Abstract | Indexing huge collections of strings, such as those produced by the widespread sequencing technologies, heavily relies on multistring generalizations of the Burrows-Wheeler transform (BWT) and the longest common prefix (LCP) array, since efficiently solving both problems is an essential ingredient of several algorithms on a collection of strings, such as those for genome assembly. In this article, we explore a multithread computational strategy for building the BWT and LCP array. Our algorithm applies a divide-and-conquer approach that leads to parallel computation of the multistring BWT and LCP array. |
|---|---|
| Author | Bonizzoni, Paola; Previtali, Marco; Rizzi, Raffaella; Della Vedova, Gianluca; Pirola, Yuri |
| Author_xml | – sequence: 1 givenname: Paola surname: Bonizzoni fullname: Bonizzoni, Paola organization: Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Milan, Italy – sequence: 2 givenname: Gianluca surname: Della Vedova fullname: Della Vedova, Gianluca organization: Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Milan, Italy – sequence: 3 givenname: Yuri surname: Pirola fullname: Pirola, Yuri organization: Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Milan, Italy – sequence: 4 givenname: Marco surname: Previtali fullname: Previtali, Marco organization: Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Milan, Italy – sequence: 5 givenname: Raffaella surname: Rizzi fullname: Rizzi, Raffaella organization: Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Milan, Italy |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/31140836 (View this record in MEDLINE/PubMed) |
| CitedBy_id | crossref_primary_10_1186_s13015_023_00232_4 crossref_primary_10_1016_j_tcs_2020_11_041 crossref_primary_10_1016_j_tcs_2019_11_001 crossref_primary_10_1007_s00236_024_00467_7 crossref_primary_10_1093_bioinformatics_btae333 crossref_primary_10_1186_s12859_020_03628_w |
| ContentType | Journal Article |
| DBID | NPM 7X8 |
| DOI | 10.1089/cmb.2018.0230 |
| DatabaseName | PubMed; MEDLINE - Academic |
| DatabaseTitle | PubMed; MEDLINE - Academic |
| DatabaseTitleList | MEDLINE - Academic |
| Database_xml | – sequence: 1 dbid: NPM name: PubMed url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: 7X8 name: MEDLINE - Academic url: https://search.proquest.com/medline sourceTypes: Aggregation Database |
| DeliveryMethod | no_fulltext_linktorsrc |
| Discipline | Biology; Mathematics |
| EISSN | 1557-8666 |
| ExternalDocumentID | 31140836 |
| Genre | Journal Article |
| GroupedDBID | --- 0R~ 1-M 29K 34G 39C 4.4 53G 5GY ABBKN ABEFU ACGFO ADBBV AENEX AFOSN AI. ALMA_UNASSIGNED_HOLDINGS BAWUL BNQNF CAG COF CS3 D-I DIK DU5 EBS EJD F5P IAO IER IGS IHR IM4 ISR ITC MV1 NPM NQHIM O9- OK1 P2P R.V RIG RML RMSOB RNS TN5 TR2 UE5 VH1 7X8 SCNPE |
| ID | FETCH-LOGICAL-c328t-346ce4c3d46b8ffecbe525359d39f40bb8fc3ebbf33753341bbab13f958230fe2 |
| IEDL.DBID | 7X8 |
| ISICitedReferencesCount | 11 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000469491200001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 1557-8666 |
| IngestDate | Thu Sep 04 15:43:12 EDT 2025 Thu Jan 02 22:59:24 EST 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 9 |
| Keywords | parallel algorithms; Burrows–Wheeler transform; multithreading; longest common prefix array |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-c328t-346ce4c3d46b8ffecbe525359d39f40bb8fc3ebbf33753341bbab13f958230fe2 |
| PMID | 31140836 |
| PQID | 2231852483 |
| PQPubID | 23479 |
| ParticipantIDs | proquest_miscellaneous_2231852483 pubmed_primary_31140836 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-09-00 20190901 |
| PublicationDateYYYYMMDD | 2019-09-01 |
| PublicationDate_xml | – month: 09 year: 2019 text: 2019-09-00 |
| PublicationDecade | 2010 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States |
| PublicationTitle | Journal of computational biology |
| PublicationTitleAlternate | J Comput Biol |
| PublicationYear | 2019 |
| SSID | ssj0013607 |
| Score | 2.3152618 |
| Snippet | Indexing huge collections of strings, such as those produced by the widespread sequencing technologies, heavily relies on multistring generalizations of the... |
| SourceID | proquest pubmed |
| SourceType | Aggregation Database; Index Database |
| StartPage | 948 |
| Title | Multithread Multistring Burrows-Wheeler Transform and Longest Common Prefix Array |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/31140836 https://www.proquest.com/docview/2231852483 |
| Volume | 26 |
| WOSCitedRecordID | wos000469491200001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| inHoldings | 1 |
| link | http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwpV1LS8NAEB7UKujBR33VFyt4jabdTbJ7EhWLh7ZUqNJbyb7Eg5vaWLH_3tkkVS-C4CWEwLJhdme-b3deAGeJpNrGIkZF4mHAEiNQ50wUcG6VFKmUcRHy_9hJej0-HIp-deGWV2GVc5tYGGqdKX9HfoEw5vN8GaeX49fAd43y3tWqhcYi1ChSGa-YyfCHFyEu0qURMtESI0-vamyGXFyoF-njuvi55-C_s8sCZdob__2_TViv-CW5KjfEFiwYV4eVsuPkrA5r3a8yrfk23Jfpt7icqSbFu-_i4Z7I9dSXZswDNNUISxMymNNbkjpNOpnzXinik0syR_oIss8fOOUkne3AQ_t2cHMXVD0WAkVb_C2gLFaGKapZLLmPIJEmakU0EpoKy0KJHxU1UlpKE5-125QylU1qReQ9dNa0dmHJZc7sA8GTig2tjsJYS6aYRJxD9oO2FGcykTINOJ1LboR72DsmUmeyaT76ll0D9krxj8ZlsY0RxQObr6B98IfRh7CKa1qFgB1BzaIGm2NYVu8ov8lJsTnw2et3PwGPI8TK |
| linkProvider | ProQuest |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Multithread+Multistring+Burrows-Wheeler+Transform+and+Longest+Common+Prefix+Array&rft.jtitle=Journal+of+computational+biology&rft.au=Bonizzoni%2C+Paola&rft.au=Della+Vedova%2C+Gianluca&rft.au=Pirola%2C+Yuri&rft.au=Previtali%2C+Marco&rft.date=2019-09-01&rft.eissn=1557-8666&rft.volume=26&rft.issue=9&rft.spage=948&rft_id=info:doi/10.1089%2Fcmb.2018.0230&rft_id=info%3Apmid%2F31140836&rft_id=info%3Apmid%2F31140836&rft.externalDocID=31140836 |