Co-design Hardware and Algorithm for Vector Search
| Published in: | International Conference for High Performance Computing, Networking, Storage and Analysis (Online) pp. 1 - 16 |
|---|---|
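| Main Authors: | Jiang, Wenqi; Li, Shigang; Zhu, Yu; de Fine Licht, Johannes; He, Zhenhao; Shi, Runbin; Renggli, Cedric; Zhang, Shuai; Rekatsinas, Theodoros; Hoefler, Torsten; Alonso, Gustavo |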
| Format: | Conference Proceeding |
| Language: | English |
| Published: | ACM, 11.11.2023 |
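| Subjects: | Approximate nearest neighbor search; Clustering algorithms; Data centers; FPGA; Hardware; hardware acceleration; Search engines; TCPIP; Training; Vectors |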
| ISSN: | 2167-4337 |
| Abstract | Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by evaluating vector similarities between encoded query texts and web documents. As performance demands for vector search systems surge, accelerated hardware offers a promising solution in the post-Moore's Law era. We introduce FANNS, an end-to-end and scalable vector search framework on FPGAs. Given a user-provided recall requirement on a dataset and a hardware resource budget, FANNS automatically co-designs hardware and algorithm, subsequently generating the corresponding accelerator. The framework also supports scale-out by incorporating a hardware TCP/IP stack in the accelerator. FANNS attains up to 23.0× and 37.2× speedup compared to FPGA and CPU baselines, respectively, and demonstrates superior scalability to GPUs, achieving 5.5× and 7.6× speedup in median and 95th percentile (P95) latency within an eight-accelerator configuration. The remarkable performance of FANNS lays a robust groundwork for future FPGA integration in data centers and AI supercomputers. |
|---|---|
| Author | Li, Shigang; Alonso, Gustavo; Hoefler, Torsten; Renggli, Cedric; Zhu, Yu; Zhang, Shuai; Rekatsinas, Theodoros; Jiang, Wenqi; de Fine Licht, Johannes; He, Zhenhao; Shi, Runbin |
| Author_xml | – sequence: 1 givenname: Wenqi surname: Jiang fullname: Jiang, Wenqi email: wenqi.jiang@inf.ethz.ch organization: Systems Group, ETH Zurich – sequence: 2 givenname: Shigang surname: Li fullname: Li, Shigang organization: SPCL, ETH Zurich – sequence: 3 givenname: Yu surname: Zhu fullname: Zhu, Yu organization: Systems Group, ETH Zurich – sequence: 4 givenname: Johannes surname: de Fine Licht fullname: de Fine Licht, Johannes organization: SPCL, ETH Zurich – sequence: 5 givenname: Zhenhao surname: He fullname: He, Zhenhao organization: Systems Group, ETH Zurich – sequence: 6 givenname: Runbin surname: Shi fullname: Shi, Runbin organization: Systems Group, ETH Zurich – sequence: 7 givenname: Cedric surname: Renggli fullname: Renggli, Cedric organization: Systems Group, ETH Zurich – sequence: 8 givenname: Shuai surname: Zhang fullname: Zhang, Shuai organization: Systems Group, ETH Zurich – sequence: 9 givenname: Theodoros surname: Rekatsinas fullname: Rekatsinas, Theodoros organization: Systems Group, ETH Zurich – sequence: 10 givenname: Torsten surname: Hoefler fullname: Hoefler, Torsten organization: SPCL, ETH Zurich – sequence: 11 givenname: Gustavo surname: Alonso fullname: Alonso, Gustavo organization: Systems Group, ETH Zurich |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1145/3581784.3607045 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 9798400701092 |
| EISSN | 2167-4337 |
| EndPage | 16 |
| ExternalDocumentID | 10485141 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IF 6IH 6IK 6IL 6IN AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI OCL RIE RIL |
| ID | FETCH-LOGICAL-a346t-87e8b6a1a374538dbc93241ea54f3afd47be3c6d3bf1eac723bcecdf32c7fc9a3 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 8 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001461755900074&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Wed Aug 27 02:09:33 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-a346t-87e8b6a1a374538dbc93241ea54f3afd47be3c6d3bf1eac723bcecdf32c7fc9a3 |
| PageCount | 16 |
| ParticipantIDs | ieee_primary_10485141 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-Nov.-11 |
| PublicationDateYYYYMMDD | 2023-11-11 |
| PublicationDate_xml | – month: 11 year: 2023 text: 2023-Nov.-11 day: 11 |
| PublicationDecade | 2020 |
| PublicationTitle | International Conference for High Performance Computing, Networking, Storage and Analysis (Online) |
| PublicationTitleAbbrev | SC |
| PublicationYear | 2023 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| SSID | ssj0003204180 ssib053141430 |
| Score | 1.980188 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 1 |
| SubjectTerms | Approximate nearest neighbor search; Clustering algorithms; Data centers; FPGA; Hardware; hardware acceleration; Search engines; TCPIP; Training; Vectors |
| Title | Co-design Hardware and Algorithm for Vector Search |
| URI | https://ieeexplore.ieee.org/document/10485141 |
| WOSCitedRecordID | wos001461755900074&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |