Co-design Hardware and Algorithm for Vector Search

Bibliographic Details
Published in: International Conference for High Performance Computing, Networking, Storage and Analysis (Online), pp. 1-16
Main Authors: Jiang, Wenqi; Li, Shigang; Zhu, Yu; de Fine Licht, Johannes; He, Zhenhao; Shi, Runbin; Renggli, Cedric; Zhang, Shuai; Rekatsinas, Theodoros; Hoefler, Torsten; Alonso, Gustavo
Format: Conference Proceeding
Language: English
Published: ACM 11.11.2023
Subjects:
ISSN:2167-4337
Abstract Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by evaluating vector similarities between encoded query texts and web documents. As performance demands for vector search systems surge, accelerated hardware offers a promising solution in the post-Moore's Law era. We introduce FANNS, an end-to-end and scalable vector search framework on FPGAs. Given a user-provided recall requirement on a dataset and a hardware resource budget, FANNS automatically co-designs hardware and algorithm, subsequently generating the corresponding accelerator. The framework also supports scale-out by incorporating a hardware TCP/IP stack in the accelerator. FANNS attains up to 23.0× and 37.2× speedup compared to FPGA and CPU baselines, respectively, and demonstrates superior scalability to GPUs, achieving 5.5× and 7.6× speedup in median and 95th percentile (P95) latency within an eight-accelerator configuration. The remarkable performance of FANNS lays a robust groundwork for future FPGA integration in data centers and AI supercomputers.
Author Li, Shigang
Alonso, Gustavo
Hoefler, Torsten
Renggli, Cedric
Zhu, Yu
Zhang, Shuai
Rekatsinas, Theodoros
Jiang, Wenqi
de Fine Licht, Johannes
He, Zhenhao
Shi, Runbin
Author_xml – sequence: 1
  givenname: Wenqi
  surname: Jiang
  fullname: Jiang, Wenqi
  email: wenqi.jiang@inf.ethz.ch
  organization: Systems Group, ETH Zurich
– sequence: 2
  givenname: Shigang
  surname: Li
  fullname: Li, Shigang
  organization: SPCL, ETH Zurich
– sequence: 3
  givenname: Yu
  surname: Zhu
  fullname: Zhu, Yu
  organization: Systems Group, ETH Zurich
– sequence: 4
  givenname: Johannes
  surname: de Fine Licht
  fullname: de Fine Licht, Johannes
  organization: SPCL, ETH Zurich
– sequence: 5
  givenname: Zhenhao
  surname: He
  fullname: He, Zhenhao
  organization: Systems Group, ETH Zurich
– sequence: 6
  givenname: Runbin
  surname: Shi
  fullname: Shi, Runbin
  organization: Systems Group, ETH Zurich
– sequence: 7
  givenname: Cedric
  surname: Renggli
  fullname: Renggli, Cedric
  organization: Systems Group, ETH Zurich
– sequence: 8
  givenname: Shuai
  surname: Zhang
  fullname: Zhang, Shuai
  organization: Systems Group, ETH Zurich
– sequence: 9
  givenname: Theodoros
  surname: Rekatsinas
  fullname: Rekatsinas, Theodoros
  organization: Systems Group, ETH Zurich
– sequence: 10
  givenname: Torsten
  surname: Hoefler
  fullname: Hoefler, Torsten
  organization: SPCL, ETH Zurich
– sequence: 11
  givenname: Gustavo
  surname: Alonso
  fullname: Alonso, Gustavo
  organization: Systems Group, ETH Zurich
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1145/3581784.3607045
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9798400701092
EISSN 2167-4337
EndPage 16
ExternalDocumentID 10485141
Genre orig-research
GroupedDBID 6IE
6IF
6IH
6IK
6IL
6IN
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
IPLJI
OCL
RIE
RIL
ID FETCH-LOGICAL-a346t-87e8b6a1a374538dbc93241ea54f3afd47be3c6d3bf1eac723bcecdf32c7fc9a3
IEDL.DBID RIE
ISICitedReferencesCount 8
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001461755900074&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 02:09:33 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a346t-87e8b6a1a374538dbc93241ea54f3afd47be3c6d3bf1eac723bcecdf32c7fc9a3
PageCount 16
ParticipantIDs ieee_primary_10485141
PublicationCentury 2000
PublicationDate 2023-Nov.-11
PublicationDateYYYYMMDD 2023-11-11
PublicationDate_xml – month: 11
  year: 2023
  text: 2023-Nov.-11
  day: 11
PublicationDecade 2020
PublicationTitle International Conference for High Performance Computing, Networking, Storage and Analysis (Online)
PublicationTitleAbbrev SC
PublicationYear 2023
Publisher ACM
Publisher_xml – name: ACM
SSID ssj0003204180
ssib053141430
Score 1.980188
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Approximate nearest neighbor search
Clustering algorithms
Data centers
FPGA
Hardware
hardware acceleration
Search engines
TCPIP
Training
Vectors
Title Co-design Hardware and Algorithm for Vector Search
URI https://ieeexplore.ieee.org/document/10485141
WOSCitedRecordID wos001461755900074&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint