BLESS: Bandwidth and Locality Enhanced SMEM Seeding Acceleration for DNA Sequencing

Bibliographic details
Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pp. 582-596
Main authors: Han, Seunghee; Moon, Seungjae; Suh, Teokkyu; Heo, JaeHoon; Kim, Joo-Young
Format: Conference paper
Language: English
Published: IEEE, 29 June 2024
Abstract In an era marked by the pervasive spread of harmful viruses like COVID-19, the importance of DNA sequencing has grown significantly, given its crucial role in devising effective countermeasures. The seeding process, which aims to find the locations of super-maximal exact matches (SMEMs) between the DNA samples and the reference genome for comparative analysis, has emerged as a major bottleneck due to its memory-intensive characteristics. The learned index approach, which uses a machine learning model to partially predict the location of the exact matches, has effectively reduced the number of memory accesses. However, the lack of locality in the current indexing structure and the runtime randomness of the seeding workload have constrained memory bandwidth usage and limited further performance gains. In this paper, we propose BLESS, a bandwidth and locality enhanced SMEM seeding accelerator for learned-index-based DNA sequence alignment. BLESS is the first domain-specific seeding accelerator to maximize the potential hardware advantage of the learned index approach. We introduce the coarse-fine (CF) block data structure, a novel memory mapping of seeding parameters that exploits spatial locality and increases effective bandwidth usage for any memory type, including high bandwidth memory (HBM). We also develop the guaranteed search range update (GSRU) algorithm, a method that exploits caching in the search procedure to enable temporal locality and data reuse. Utilizing the CF block and the GSRU algorithm, we develop a multi-core seeding accelerator using HBM with context switching and runtime scheduling for maximum core and memory bandwidth utilization. With these improvements, BLESS achieves 35.65× and 15.49× speedups over the state-of-the-art seeding systems BWA-MEME and ERT-ASIC, respectively, in raw system performance.
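The abstract's description of a learned index (a model predicts roughly where a match sits, and a short search finishes the job) can be illustrated with a minimal generic sketch. This is not the BLESS or BWA-MEME index: the function names, the single linear model, and the use of small integers as stand-ins for encoded sequence keys are all assumptions made for illustration.

```python
import bisect


def build_linear_index(sorted_keys):
    """Fit position ~ slope * key + intercept over a sorted key list.

    Returns (slope, intercept, max_err), where max_err is the largest
    distance between a key's true position and its predicted position.
    Hypothetical helper: a real learned index uses richer models.
    """
    n = len(sorted_keys)
    lo, hi = sorted_keys[0], sorted_keys[-1]
    slope = (n - 1) / (hi - lo) if hi != lo else 0.0
    intercept = -slope * lo
    max_err = max(
        abs(i - round(slope * k + intercept))
        for i, k in enumerate(sorted_keys)
    )
    return slope, intercept, max_err


def lookup(sorted_keys, model, key):
    """Return the index of key, or None if it is absent.

    The model predicts a position; only the window of +/- max_err
    entries around that prediction is searched, which is what limits
    the number of memory accesses.
    """
    slope, intercept, max_err = model
    n = len(sorted_keys)
    guess = round(slope * key + intercept)
    start = min(max(0, guess - max_err), n)
    end = max(start, min(n, guess + max_err + 1))
    i = bisect.bisect_left(sorted_keys, key, start, end)
    if i < n and sorted_keys[i] == key:
        return i
    return None


# Illustrative data only: integers standing in for encoded sequence keys.
keys = [3, 8, 15, 16, 42, 77, 90, 120]
model = build_linear_index(keys)
print(lookup(keys, model, 42))
print(lookup(keys, model, 50))
```

The search window shrinks as the model's worst-case error shrinks, so a more accurate model touches fewer entries per query. The locality and bandwidth problems the abstract describes arise because successive queries still land in unrelated regions of the index.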
Author Suh, Teokkyu
Heo, JaeHoon
Kim, Joo-Young
Han, Seunghee
Moon, Seungjae
Author_xml – sequence: 1
  givenname: Seunghee
  surname: Han
  fullname: Han, Seunghee
  email: shhan1755@kaist.ac.kr
  organization: KAIST,Daejeon,South Korea
– sequence: 2
  givenname: Seungjae
  surname: Moon
  fullname: Moon, Seungjae
  email: sjaemoon@kaist.ac.kr
  organization: KAIST,Daejeon,South Korea
– sequence: 3
  givenname: Teokkyu
  surname: Suh
  fullname: Suh, Teokkyu
  email: ejrrb102@kaist.ac.kr
  organization: KAIST,Daejeon,South Korea
– sequence: 4
  givenname: JaeHoon
  surname: Heo
  fullname: Heo, JaeHoon
  email: kd01050@kaist.ac.kr
  organization: KAIST,Daejeon,South Korea
– sequence: 5
  givenname: Joo-Young
  surname: Kim
  fullname: Kim, Joo-Young
  email: jooyoung1203@kaist.ac.kr
  organization: KAIST,Daejeon,South Korea
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/ISCA59077.2024.00049
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 9798350326581
EndPage 596
ExternalDocumentID 10609692
Genre orig-research
GroupedDBID 6IE
6IH
ACM
ALMA_UNASSIGNED_HOLDINGS
CBEJK
RIE
RIO
ID FETCH-LOGICAL-a258t-c16fb78bb86bb38751d4c5a2ee725075c972e15ee201f06c1be5dafa1dffb96f3
IEDL.DBID RIE
ISICitedReferencesCount 2
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001290320700039&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 03:06:27 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a258t-c16fb78bb86bb38751d4c5a2ee725075c972e15ee201f06c1be5dafa1dffb96f3
PageCount 15
ParticipantIDs ieee_primary_10609692
PublicationCentury 2000
PublicationDate 2024-June-29
PublicationDateYYYYMMDD 2024-06-29
PublicationDate_xml – month: 06
  year: 2024
  text: 2024-June-29
  day: 29
PublicationDecade 2020
PublicationTitle 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA)
PublicationTitleAbbrev ISCA
PublicationYear 2024
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssib060973785
Score 2.3073175
SourceID ieee
SourceType Publisher
StartPage 582
SubjectTerms Bandwidth
Data Locality
DNA
DNA Sequencing
Genomics
Hardware Accelerator
Memory management
Read Alignment
Runtime
Sequential analysis
SMEM Seeding
Software-Hardware Co-design
Switches
System performance
Title BLESS: Bandwidth and Locality Enhanced SMEM Seeding Acceleration for DNA Sequencing
URI https://ieeexplore.ieee.org/document/10609692
WOSCitedRecordID wos001290320700039&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE