Seed-and-Vote based In-Memory Accelerator for DNA Read Mapping


Detailed description

Bibliographic details
Published in: Digest of technical papers - IEEE/ACM International Conference on Computer-Aided Design, pp. 1-9
Main authors: Laguna, Ann Franchesca, Gamaarachchi, Hasindu, Yin, Xunzhao, Niemier, Michael, Parameswaran, Sri, Hu, X. Sharon
Format: Conference proceeding
Language: English
Published: Association for Computing Machinery, 2 November 2020
ISSN:1558-2434
Online access: Full text
Abstract Genome analysis is becoming more important in forensic science, medicine, and history. Sequencing technologies such as High Throughput Sequencing (HTS) and Third Generation Sequencing (TGS) have greatly accelerated genome sequencing. However, genome read mapping remains significantly slower than sequencing. Because read mapping processes an enormous amount of data, the speed of data transfer between memory and the processing unit limits execution speed. In-memory computing can help address this memory-bandwidth bottleneck by minimizing data transfers. Ternary Content Addressable Memories (TCAMs) have been used in accelerators for seed-and-extend, a popular read mapping approach, because of their fast search capability. Seed-and-vote, another read mapping approach, is faster than seed-and-extend but is less accurate on very short reads. Since sequencing technology is moving to longer reads, the seed-and-vote approach is becoming more viable. We propose a genome read mapping accelerator that uses approximate TCAM to execute the Fast Seed and Vote algorithm (FSVA) that can map both short and long reads. We achieved 400X acceleration compared to the seed-and-extend approach BWA-MEM on a CPU, and 115X acceleration with a 30X energy improvement compared to a state-of-the-art in-memory accelerator using the seed-and-extend approach, at 98.75% accuracy for 100bp reads.
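The seed-and-vote idea named in the abstract can be illustrated with a minimal software sketch. This is not the paper's FSVA or its TCAM hardware: it assumes a generic seed-and-vote scheme in which exact dictionary lookups stand in for the approximate TCAM search. The function names (`build_index`, `seed_and_vote`), the parameters (`k`, `step`, `min_votes`) and the toy reference string are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def seed_and_vote(read, index, k, step=1, min_votes=2):
    """Return (best_start, votes) for the read, or None if too few seeds agree.

    Each seed (a k-mer taken from the read) that is found in the reference
    votes for the read start position implied by that hit. The candidate
    start with the most votes wins. Hypothetical sketch, exact matching only.
    """
    votes = Counter()
    for offset in range(0, len(read) - k + 1, step):
        seed = read[offset:offset + k]
        for ref_pos in index.get(seed, ()):
            # A seed at read offset `offset` hitting the reference at
            # `ref_pos` implies the read starts at `ref_pos - offset`.
            votes[ref_pos - offset] += 1
    if not votes:
        return None
    start, count = votes.most_common(1)[0]
    return (start, count) if count >= min_votes else None


# Toy data (assumed for illustration only).
reference = "ACGTTGCATGACCGTAGGCTAACGTTAGC"
index = build_index(reference, k=4)

exact_read = "GACCGTAGGCTA"      # copied from the reference at position 9
mismatch_read = "GACCGAAGGCTA"   # same read with one substituted base

print(seed_and_vote(exact_read, index, k=4))
print(seed_and_vote(mismatch_read, index, k=4))
```

In this sketch the read with a single substitution still maps to the same start position, because the seeds that do not overlap the mismatch continue to vote for it. No extension or alignment step is run, which is the usual reason given for seed-and-vote being faster than seed-and-extend, and longer reads supply more seeds and therefore more votes.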
Author Yin, Xunzhao
Gamaarachchi, Hasindu
Hu, X. Sharon
Laguna, Ann Franchesca
Parameswaran, Sri
Niemier, Michael
Author_xml – sequence: 1
  givenname: Ann Franchesca
  surname: Laguna
  fullname: Laguna, Ann Franchesca
  email: alaguna@nd.edu
  organization: University of Notre Dame
– sequence: 2
  givenname: Hasindu
  surname: Gamaarachchi
  fullname: Gamaarachchi, Hasindu
  email: hasindu@unsw.edu.au
  organization: University of New South Wales
– sequence: 3
  givenname: Xunzhao
  surname: Yin
  fullname: Yin, Xunzhao
  email: xyin1@zju.edu.cn
  organization: Zhejiang University
– sequence: 4
  givenname: Michael
  surname: Niemier
  fullname: Niemier, Michael
  email: mniemier@nd.edu
  organization: University of Notre Dame
– sequence: 5
  givenname: Sri
  surname: Parameswaran
  fullname: Parameswaran, Sri
  email: sri.parameswaran@unsw.edu.au
  organization: University of New South Wales
– sequence: 6
  givenname: X. Sharon
  surname: Hu
  fullname: Hu, X. Sharon
  email: shu@nd.edu
  organization: University of Notre Dame
ContentType Conference Proceeding
DOI 10.1145/3400302.3415651
Discipline Engineering
EISBN 9781665423243
1665423242
EISSN 1558-2434
EndPage 9
ExternalDocumentID 9256758
Genre orig-research
GrantInformation_xml – fundername: DARPA
  funderid: 10.13039/100000185
ISICitedReferencesCount 33
IngestDate Wed Aug 27 02:28:32 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 9
ParticipantIDs ieee_primary_9256758
PublicationCentury 2000
PublicationDate 2020-Nov.-2
PublicationDateYYYYMMDD 2020-11-02
PublicationDate_xml – month: 11
  year: 2020
  text: 2020-Nov.-2
  day: 02
PublicationDecade 2020
PublicationTitle Digest of technical papers - IEEE/ACM International Conference on Computer-Aided Design
PublicationTitleAbbrev ICCAD
PublicationYear 2020
Publisher Association for Computing Machinery
Publisher_xml – name: Association for Computing Machinery
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Acceleration
Approximation algorithms
bioinformatics
content-addressable storage
data transfer
DNA
DNA read mapping
Encoding
execution speed
fast seed and vote algorithm
Ferroelectric FET
forensic science
FSVA
genome analysis
genome read mapping accelerator
genome sequencing
genomics
Heuristic algorithms
high throughput sequencing
in-memory accelerator
in-memory computing
memory-bandwidth bottleneck
molecular biophysics
parallel processing
processing unit
read mapping approach
seed-and-extend
Seed-and-Vote
seed-and-vote based in-memory accelerator
Sequential analysis
ternary content addressable memories
third generation sequencing
Title Seed-and-Vote based In-Memory Accelerator for DNA Read Mapping
URI https://ieeexplore.ieee.org/document/9256758