HAL: Hardware-assisted Load Balancing for Energy-efficient SNIC-Host Cooperative Computing
| Published in: | 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) pp. 613 - 627 |
|---|---|
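| Main Authors: | Huang, Jinghan; Lou, Jiaqi; Vanavasam, Srikar; Kong, Xinhao; Ji, Houxiang; Jeong, Ipoom; Zhuo, Danyang; Lee, Eun Kyung; Kim, Nam Sung |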
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 29.06.2024 |
| Subjects: | Energy efficiency; Load management; Memory management; Servers; Tail; Throughput |
| Abstract | A typical SmartNIC (SNIC) integrates a processor comprising Arm CPU and accelerators with a conventional NIC. The processor is designed to energy-efficiently execute network functions frequently used by datacenter applications. With such a processor, the SNIC has promised to notably improve the system-wide energy efficiency of datacenter servers. Nevertheless, the latest trend of integrating accelerators into server CPUs for these functions sparks a question on the SNIC processor's superiority over a host processor (i.e., server CPU with accelerators) in system-wide energy efficiency, especially under given tail latency constraints. Answering this question, we first take an Intel Xeon processor, integrated with various accelerators (e.g., QuickAssist Technology), as a host processor, and then compare it to an NVIDIA BlueField-2 SNIC processor. This uncovers that (1) the host accelerator, coupled with a more powerful memory subsystem, can outperform the SNIC accelerator, and (2) the SNIC processor can improve system-wide energy efficiency only at low packet rates for most functions under tail latency constraints. To provide high system-wide energy efficiency without compromising tail latency at any packet rates, we propose HAL, consisting of a hardware-based load balancer and an intelligent load balancing policy implemented inside the SNIC. When HAL determines that the SNIC processor cannot efficiently process a given function beyond a specific packet rate, it limits the rate of packets to the SNIC processor and lets the host processor handle the excess. We implement a HAL-enabled SNIC with a commodity FPGA and a BlueField-2 SNIC, plug it into a commodity server, and run 10 popular network functions. Our evaluation shows that HAL can improve the system-wide energy efficiency and throughput of the server running these functions by 31% and 10%, respectively, without notably increasing the tail latency. |
|---|---|
| Author | Lou, Jiaqi; Zhuo, Danyang; Ji, Houxiang; Lee, Eun Kyung; Jeong, Ipoom; Kong, Xinhao; Huang, Jinghan; Kim, Nam Sung; Vanavasam, Srikar |
| Author_xml | – sequence: 1 givenname: Jinghan surname: Huang fullname: Huang, Jinghan organization: University of Illinois Urbana-Champaign – sequence: 2 givenname: Jiaqi surname: Lou fullname: Lou, Jiaqi organization: University of Illinois Urbana-Champaign – sequence: 3 givenname: Srikar surname: Vanavasam fullname: Vanavasam, Srikar organization: University of Illinois Urbana-Champaign – sequence: 4 givenname: Xinhao surname: Kong fullname: Kong, Xinhao organization: Duke University – sequence: 5 givenname: Houxiang surname: Ji fullname: Ji, Houxiang organization: University of Illinois Urbana-Champaign – sequence: 6 givenname: Ipoom surname: Jeong fullname: Jeong, Ipoom organization: University of Illinois Urbana-Champaign – sequence: 7 givenname: Danyang surname: Zhuo fullname: Zhuo, Danyang organization: Duke University – sequence: 8 givenname: Eun Kyung surname: Lee fullname: Lee, Eun Kyung organization: IBM Research – sequence: 9 givenname: Nam Sung surname: Kim fullname: Kim, Nam Sung organization: University of Illinois Urbana-Champaign |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IH CBEJK RIE RIO |
| DOI | 10.1109/ISCA59077.2024.00051 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Xplore; IEEE Proceedings Order Plans (POP) 1998-present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Xplore url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9798350326581 |
| EndPage | 627 |
| ExternalDocumentID | 10609665 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IH ACM ALMA_UNASSIGNED_HOLDINGS CBEJK RIE RIO |
| ID | FETCH-LOGICAL-a258t-d270ac7b48f86a66c30808dfc1cfefc3c74e81a0e937a9dd6bf7ce63724945293 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 2 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001290320700041&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Wed Aug 27 02:34:59 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-a258t-d270ac7b48f86a66c30808dfc1cfefc3c74e81a0e937a9dd6bf7ce63724945293 |
| PageCount | 15 |
| ParticipantIDs | ieee_primary_10609665 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-June-29 |
| PublicationDateYYYYMMDD | 2024-06-29 |
| PublicationDate_xml | – month: 06 year: 2024 text: 2024-June-29 day: 29 |
| PublicationDecade | 2020 |
| PublicationTitle | 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) |
| PublicationTitleAbbrev | ISCA |
| PublicationYear | 2024 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssib060973785 |
| Score | 2.28092 |
| Snippet | A typical SmartNIC (SNIC) integrates a processor comprising Arm CPU and accelerators with a conventional NIC. The processor is designed to energy-efficiently... |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 613 |
| SubjectTerms | Energy efficiency; Load management; Memory management; Servers; Tail; Throughput |
| Title | HAL: Hardware-assisted Load Balancing for Energy-efficient SNIC-Host Cooperative Computing |
| URI | https://ieeexplore.ieee.org/document/10609665 |
| WOSCitedRecordID | wos001290320700041&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| link | http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1NSwMxEA22ePCkYsVvcvAaTZNukvVWl5YWSilUoXgpaTIBL93SbvXvO8m26sWDt7BkCUwS5r3MvBlC7gGkdFIBE9oLJCg-tnkRwAx0QHLIOpqH1GxCj8dmNssnO7F60sIAQEo-g4c4TLF8X7ptfCrDG64QcausQRpaq1qstT88Ktad0SbbyePaPH8cTotuhuRPIw0UsUg2j-HIX01Ukg_pH_9z9RPS-lHj0cm3nzklB7A8I2-D7uiJxrj7p10DQwgc98vTUWk9fY75ig4nU4SktJfkfQxSsQhcgk7Hw4INyk1Fi7JcQV37m9b9HfCnFnnt916KAdv1SWBWZKZiXmhunV50TDDKKuUkwkDjg2u7AMFJpztg2pYDQhGbe68WQTtQUiP1inFXeU6ay3IJF4TiTIuX3GZBInXiwnrDF4ELp4NT6MguSSsaZr6qS2HM9za5-uP7NTmKto-5VSK_Ic1qvYVbcug-qvfN-i5t4Bd3UZxj |
| linkProvider | IEEE |
| linkToHtml | http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1NSwMxEA1aBT2pWPHbHLxG02Q3Sb3VxdJiXQqtULyUNJmAl27ph_59J9utevHgLSxZApmEmcnMe4-QWwApnVTAhPYCExQfZV4EMAMJSA5ponkoxSZ0npvRqNmvwOolFgYAyuYzuIvDspbvC7eKT2V4wxVG3CrdJjtROquCa22Oj4rMM9qkFUCuwZv33UHWSjH905gIikiTzWNB8peMSulF2gf_XP-Q1H_weLT_7WmOyBZMj8lbp9V7oLHy_mnnwDAIjhbztFdYTx9jx6LDyRSDUvpUAvwYlHQRuAQd5N2MdYrFkmZFMYM1-zddKzzgT3Xy2n4aZh1WKSUwK1KzZF5obp2eJCYYZZVyEgNB44NruADBSacTMA3LAYMR2_ReTYJ2oKTG5CtWXuUJqU2LKZwSijMtXnObBonJExfWGz4JXDgdnEJXdkbqcWPGszUZxnizJ-d_fL8he53hS2_c6-bPF2Q_2iF2WonmJakt5yu4IrvuY_m-mF-XxvwCzTGfrA |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2024+ACM%2FIEEE+51st+Annual+International+Symposium+on+Computer+Architecture+%28ISCA%29&rft.atitle=HAL%3A+Hardware-assisted+Load+Balancing+for+Energy-efficient+SNIC-Host+Cooperative+Computing&rft.au=Huang%2C+Jinghan&rft.au=Lou%2C+Jiaqi&rft.au=Vanavasam%2C+Srikar&rft.au=Kong%2C+Xinhao&rft.date=2024-06-29&rft.pub=IEEE&rft.spage=613&rft.epage=627&rft_id=info:doi/10.1109%2FISCA59077.2024.00051&rft.externalDocID=10609665 |
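
The abstract states that when the SNIC processor cannot efficiently process a function beyond a specific packet rate, HAL limits the rate of packets sent to the SNIC processor and lets the host processor handle the excess. The Python sketch below illustrates that idea in software only. It is not the paper's hardware load balancer: the token-bucket mechanism, the class `RateLimitedSteering`, the helper `split`, and the parameters `snic_rate_pps` and `burst_pkts` are all hypothetical names and choices made for illustration, and the rate values in the usage note are invented rather than taken from the paper.

```python
from dataclasses import dataclass, field

US_PER_S = 1_000_000  # microseconds per second; also the credit cost of one packet


@dataclass
class RateLimitedSteering:
    """Token-bucket sketch: steer packets to the SNIC up to a rate cap, else to the host.

    All names and parameters here are assumptions for illustration only.
    Integer arithmetic is used so the accounting is exact.
    """
    snic_rate_pps: int            # assumed cap on packets/s sent to the SNIC processor
    burst_pkts: int = 1           # assumed bucket depth, in packets
    credit: int = field(init=False)
    last_us: int = field(init=False, default=0)

    def __post_init__(self) -> None:
        # Start with a full bucket so the first packet can go to the SNIC.
        self.credit = self.burst_pkts * US_PER_S

    def steer(self, now_us: int) -> str:
        """Return 'snic' or 'host' for a packet arriving at time now_us (non-decreasing)."""
        elapsed = now_us - self.last_us
        self.last_us = now_us
        # Accrue credit for the elapsed time, capped at the bucket depth.
        cap = self.burst_pkts * US_PER_S
        self.credit = min(cap, self.credit + elapsed * self.snic_rate_pps)
        if self.credit >= US_PER_S:
            self.credit -= US_PER_S   # spend one packet's worth of credit
            return "snic"
        return "host"                 # excess beyond the cap goes to the host


def split(arrival_times_us, snic_rate_pps, burst_pkts=1):
    """Count how many packets in a trace are steered to each processor."""
    steering = RateLimitedSteering(snic_rate_pps=snic_rate_pps, burst_pkts=burst_pkts)
    counts = {"snic": 0, "host": 0}
    for t in arrival_times_us:
        counts[steering.steer(t)] += 1
    return counts
```

As a usage example, feeding one second of evenly spaced arrivals at 1,000 packets/s with a 250 packets/s cap, `split([i * 1000 for i in range(1000)], 250)`, steers 250 packets to the SNIC and 750 to the host, while a trace that stays below the cap is handled entirely by the SNIC. How the cap itself would be chosen per function is the job of the load balancing policy the abstract mentions and is not modeled here.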