HAL: Hardware-assisted Load Balancing for Energy-efficient SNIC-Host Cooperative Computing

Bibliographic Details
Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pp. 613-627
Main Authors: Huang, Jinghan, Lou, Jiaqi, Vanavasam, Srikar, Kong, Xinhao, Ji, Houxiang, Jeong, Ipoom, Zhuo, Danyang, Lee, Eun Kyung, Kim, Nam Sung
Format: Conference Proceeding
Language: English
Published: IEEE 29.06.2024
Abstract A typical SmartNIC (SNIC) integrates a processor comprising an Arm CPU and accelerators with a conventional NIC. The processor is designed to energy-efficiently execute network functions frequently used by datacenter applications. With such a processor, the SNIC has promised to notably improve the system-wide energy efficiency of datacenter servers. Nevertheless, the latest trend of integrating accelerators into server CPUs for these functions raises the question of whether the SNIC processor is superior to a host processor (i.e., server CPU with accelerators) in system-wide energy efficiency, especially under given tail latency constraints. To answer this question, we first take an Intel Xeon processor, integrated with various accelerators (e.g., QuickAssist Technology), as a host processor, and then compare it to an NVIDIA BlueField-2 SNIC processor. This comparison reveals that (1) the host accelerator, coupled with a more powerful memory subsystem, can outperform the SNIC accelerator, and (2) the SNIC processor can improve system-wide energy efficiency only at low packet rates for most functions under tail latency constraints. To provide high system-wide energy efficiency without compromising tail latency at any packet rate, we propose HAL, consisting of a hardware-based load balancer and an intelligent load balancing policy implemented inside the SNIC. When HAL determines that the SNIC processor cannot efficiently process a given function beyond a specific packet rate, it limits the rate of packets to the SNIC processor and lets the host processor handle the excess. We implement a HAL-enabled SNIC with a commodity FPGA and a BlueField-2 SNIC, plug it into a commodity server, and run 10 popular network functions. Our evaluation shows that HAL can improve the system-wide energy efficiency and throughput of the server running these functions by 31% and 10%, respectively, without notably increasing the tail latency.
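The policy sentence in the abstract (cap the packet rate sent to the SNIC processor and let the host take the excess) can be illustrated with a small software sketch. This is only an illustration under stated assumptions: HAL itself is a hardware load balancer, and the abstract does not say how the per-function rate threshold is chosen. The class name `RateCapSplitter`, the helper `split_counts`, the token-bucket mechanism, and the cap values are all hypothetical and are not taken from the paper.

```python
class RateCapSplitter:
    """Route packets to the SNIC up to a rate cap; spill the excess to the host.

    Hypothetical token-bucket sketch using integer arithmetic only:
    credit is measured in (packets * microseconds), so one packet
    costs 1_000_000 credit units at a cap expressed in packets/second.
    """

    COST = 1_000_000  # credit needed to send one packet to the SNIC

    def __init__(self, snic_cap_pps: int, burst_packets: int = 1):
        self.snic_cap_pps = snic_cap_pps             # assumed per-function cap
        self.max_credit = burst_packets * self.COST  # assumed bucket depth
        self.credit = self.max_credit
        self.last_us = 0

    def route(self, now_us: int) -> str:
        # Refill credit for the elapsed time, capped at the bucket depth.
        elapsed = max(0, now_us - self.last_us)
        self.credit = min(self.max_credit,
                          self.credit + elapsed * self.snic_cap_pps)
        self.last_us = now_us
        if self.credit >= self.COST:
            self.credit -= self.COST
            return "snic"   # within the cap: SNIC processor handles it
        return "host"       # over the cap: host processor handles the excess


def split_counts(offered_pps: int, cap_pps: int) -> dict:
    """Feed one second of evenly spaced packets through the splitter."""
    splitter = RateCapSplitter(snic_cap_pps=cap_pps)
    counts = {"snic": 0, "host": 0}
    for i in range(offered_pps):
        arrival_us = i * 1_000_000 // offered_pps
        counts[splitter.route(arrival_us)] += 1
    return counts
```

In this sketch, an offered load of 1000 packets/s against a cap of 250 packets/s sends 250 packets to the SNIC and 750 to the host over one second, while an offered load below the cap goes entirely to the SNIC. A real design would also need per-function caps and a way to set them, which the abstract leaves to the full paper.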
Author affiliations (in author order)
  1. Huang, Jinghan: University of Illinois Urbana-Champaign
  2. Lou, Jiaqi: University of Illinois Urbana-Champaign
  3. Vanavasam, Srikar: University of Illinois Urbana-Champaign
  4. Kong, Xinhao: Duke University
  5. Ji, Houxiang: University of Illinois Urbana-Champaign
  6. Jeong, Ipoom: University of Illinois Urbana-Champaign
  7. Zhuo, Danyang: Duke University
  8. Lee, Eun Kyung: IBM Research
  9. Kim, Nam Sung: University of Illinois Urbana-Champaign
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ISCA59077.2024.00051
EISBN 9798350326581
EndPage 627
ExternalDocumentID 10609665
Genre orig-research
ISICitedReferencesCount 2
IsPeerReviewed false
IsScholarly true
Language English
PageCount 15
PublicationCentury 2000
PublicationDate 2024-June-29
PublicationTitle 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA)
PublicationTitleAbbrev ISCA
PublicationYear 2024
Publisher IEEE
StartPage 613
SubjectTerms Energy efficiency
Load management
Memory management
Servers
Tail
Throughput
Title HAL: Hardware-assisted Load Balancing for Energy-efficient SNIC-Host Cooperative Computing
URI https://ieeexplore.ieee.org/document/10609665