TSUNAMI: A GPU Implementation of the WFA Algorithm
| Published in: | 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 150-161 |
|---|---|
| Main authors: | Gerometta, Giulia; Zeni, Alberto; Santambrogio, Marco D. |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 21 October 2023 |
| Abstract | Pairwise sequence alignment is a fundamental step in the genome assembly pipeline; it is the most time-consuming step and the bottleneck in many bioinformatics applications. Exact pairwise alignment methods such as Smith-Waterman and Needleman-Wunsch often cannot deliver the performance these tools require because of their quadratic time complexity. Furthermore, given the increasing computational cost of analyzing third-generation sequences, the community is moving towards different alignment methods and hardware-accelerated solutions to overcome the limitations of these algorithms. In this scenario, we present TSUNAMI, a highly optimized GPU implementation of the WaveFront Alignment (WFA) algorithm. TSUNAMI exploits the massive parallelism of GPUs to accelerate WFA, a novel alignment methodology that exploits homologous regions between the target sequences. By doing so, we reduce both time and space complexity in our GPU implementation. Our results show that TSUNAMI achieves speedups of up to 4512.28× over the multi-threaded state-of-the-art software implementation running on an Intel Xeon Silver 4208 with 16 threads in total. We also compared our design with all the recently released hardware-accelerated solutions in the state of the art, observing speedups of up to 14.81× with respect to the best-performing hardware-accelerated implementation in the literature and reaching up to 42604.98 Giga Cell Updates Per Second (GCUPS) in our best configuration. TSUNAMI also supports aligning long sequences with high error rates, which makes our implementation much more useful in real-world scenarios. Finally, to demonstrate the efficiency of our design, we evaluate TSUNAMI using the Berkeley Roofline model and show that our implementation is near-optimal on the NVIDIA Tesla H100. |
|---|---|
| Author | Zeni, Alberto; Santambrogio, Marco D.; Gerometta, Giulia |
| Author_xml | – sequence: 1 givenname: Giulia surname: Gerometta fullname: Gerometta, Giulia email: giulia.gerometta@mail.polimi.it organization: Informatica e Bioingegneria, Politecnico di Milano,Dipartimento di Elettronica,Milan,Italy – sequence: 2 givenname: Alberto surname: Zeni fullname: Zeni, Alberto email: alberto.zeni@polimi.it organization: Informatica e Bioingegneria, Politecnico di Milano,Dipartimento di Elettronica,Milan,Italy – sequence: 3 givenname: Marco D. surname: Santambrogio fullname: Santambrogio, Marco D. email: marco.santambrogio@polimi.it organization: Informatica e Bioingegneria, Politecnico di Milano,Dipartimento di Elettronica,Milan,Italy |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/PACT58117.2023.00021 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9798350342543 |
| EndPage | 161 |
| ExternalDocumentID | 10364572 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IL ACM ALMA_UNASSIGNED_HOLDINGS APO CBEJK LHSKQ RIE RIL |
| ID | FETCH-LOGICAL-a286t-6359e640d41d4516bce8b947868e2dfdcbb386c8835541fdf29510a119c519903 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 2 |
| IngestDate | Wed Aug 27 02:24:17 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-a286t-6359e640d41d4516bce8b947868e2dfdcbb386c8835541fdf29510a119c519903 |
| PageCount | 12 |
| ParticipantIDs | ieee_primary_10364572 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-Oct.-21 |
| PublicationDateYYYYMMDD | 2023-10-21 |
| PublicationDate_xml | – month: 10 year: 2023 text: 2023-Oct.-21 day: 21 |
| PublicationDecade | 2020 |
| PublicationTitle | 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT) |
| PublicationTitleAbbrev | PACT |
| PublicationYear | 2023 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssib057737293 |
| Score | 2.2494173 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 150 |
| SubjectTerms | Genome Alignment; GPU; Graphics processing units; HPC; Pipelines; Rendering (computer graphics); Silver; Software; Software algorithms; Tsunami; WFA |
| Title | TSUNAMI: A GPU Implementation of the WFA Algorithm |
| URI | https://ieeexplore.ieee.org/document/10364572 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
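
The abstract describes WFA as an alignment method that exploits homologous (matching) regions between the sequences. As a rough illustration of that idea only, the following is a minimal CPU sketch in Python. It assumes unit edit costs (mismatch, insertion, and deletion each cost 1) rather than the gap-affine scoring usually associated with WFA, and it is not TSUNAMI's GPU code. The names `wfa_edit_distance` and `gcups` are hypothetical, and the GCUPS helper assumes the conventional definition: equivalent dynamic-programming cells (pattern length × text length per alignment) divided by elapsed time, in units of 10^9.

```python
def wfa_edit_distance(pattern: str, text: str) -> int:
    """Edit distance via a wavefront scheme (unit costs; illustrative sketch)."""
    n, m = len(pattern), len(text)
    final_k = m - n  # diagonal (k = h - v) that contains the end cell (n, m)

    def extend(k: int, h: int) -> int:
        # Slide along diagonal k for free while characters match.
        v = h - k
        while v < n and h < m and pattern[v] == text[h]:
            v += 1
            h += 1
        return h

    # wavefront maps diagonal k -> furthest text offset h reached so far.
    wavefront = {0: extend(0, 0)}
    score = 0
    while wavefront.get(final_k, -1) < m:
        score += 1
        nxt = {}
        # Only diagonals that intersect the n x m matrix are considered.
        for k in range(max(-score, -n), min(score, m) + 1):
            candidates = []
            if k in wavefront:
                candidates.append(wavefront[k] + 1)      # mismatch
            if k - 1 in wavefront:
                candidates.append(wavefront[k - 1] + 1)  # insertion (consume text)
            if k + 1 in wavefront:
                candidates.append(wavefront[k + 1])      # deletion (consume pattern)
            if not candidates:
                continue
            # Clamp to the matrix boundary, then extend over matching characters.
            h = min(max(candidates), m, n + k)
            nxt[k] = extend(k, h)
        wavefront = nxt
    return score


def gcups(pattern_len: int, text_len: int, num_alignments: int, seconds: float) -> float:
    """Giga cell updates per second, counting equivalent DP cells (assumed definition)."""
    return pattern_len * text_len * num_alignments / (seconds * 1e9)
```

The amount of work grows with the alignment score rather than with the full matrix size, which is why similar sequences are cheap to align. Within one score step each diagonal is computed from the previous wavefront only, so the diagonals are independent of one another; that independence is the kind of parallelism a GPU implementation can exploit, although how TSUNAMI maps it to hardware is not stated in this record.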