Thermal Model Identification of Computing Nodes in High-Performance Computing Systems

Bibliographic Details
Published in: IEEE Transactions on Industrial Electronics (1982), Vol. 67, no. 9, pp. 7778-7788
Main Authors: Diversi, Roberto; Bartolini, Andrea; Benini, Luca
Format: Journal Article
Language: English
Published: New York, IEEE, 01.09.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0278-0046, 1557-9948
Abstract Thermal-aware design and online optimization of the cooling effort are becoming increasingly important in current and future high-performance computing (HPC) systems. Developing such techniques effectively requires distributed, compact models of the system's thermal behavior. System identification algorithms make it possible to extract these models directly from the thermal response of the target device. This article proposes a novel thermal identification approach for real, in-production HPC systems. The approach can extract thermal models from a computing node whose temperature measurements are affected by quantization noise and which operates in free-cooling mode, with variable ambient temperature. It also makes it possible to identify the physical floorplan of the CPU dies in supercomputing nodes. The effectiveness of the proposed methodology has been tested on a node of the CINECA Galileo Tier-1 supercomputer system.
Author Diversi, Roberto
Bartolini, Andrea
Benini, Luca
Author_xml – sequence: 1
  givenname: Roberto
  orcidid: 0000-0002-4033-4019
  surname: Diversi
  fullname: Diversi, Roberto
  email: roberto.diversi@unibo.it
  organization: Department of Electrical, Electronic and Information Engineering (DEI), University of Bologna, Bologna, Italy
– sequence: 2
  givenname: Andrea
  orcidid: 0000-0002-1148-2450
  surname: Bartolini
  fullname: Bartolini, Andrea
  email: a.bartolini@unibo.it
  organization: Department of Electrical, Electronic and Information Engineering (DEI), University of Bologna, Bologna, Italy
– sequence: 3
  givenname: Luca
  orcidid: 0000-0001-8068-3806
  surname: Benini
  fullname: Benini, Luca
  email: luca.benini@unibo.it
  organization: Department of Electrical, Electronic and Information Engineering (DEI), University of Bologna, Bologna, Italy
CODEN ITIED6
CitedBy_id crossref_primary_10_1016_j_ifacol_2021_11_262
crossref_primary_10_1155_2022_9153885
crossref_primary_10_1109_JIOT_2021_3125885
crossref_primary_10_1109_TCAD_2022_3158832
crossref_primary_10_1155_2022_3994848
crossref_primary_10_1109_TCAD_2022_3157685
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
DBID 97E
RIA
RIE
AAYXX
CITATION
7SP
8FD
L7M
DOI 10.1109/TIE.2019.2945277
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE/IET Electronic Library (IEL) (UW System Shared)
CrossRef
Electronics & Communications Abstracts
Technology Research Database
Advanced Technologies Database with Aerospace
DatabaseTitle CrossRef
Technology Research Database
Advanced Technologies Database with Aerospace
Electronics & Communications Abstracts
DatabaseTitleList Technology Research Database

Database_xml – sequence: 1
  dbid: RIE
  name: IEEE/IET Electronic Library (IEL) (UW System Shared)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 1557-9948
EndPage 7788
ExternalDocumentID 10_1109_TIE_2019_2945277
8863115
Genre orig-research
GrantInformation_xml – fundername: EU FETHPC
  grantid: 671623
– fundername: EU ERC
  grantid: 291125
ID FETCH-LOGICAL-c333t-9b867eada838977a41ee67e6bb2f0ab2dccdfd8d6e498a14e8826fdf8dba0f353
IEDL.DBID RIE
ISICitedReferencesCount 6
ISSN 0278-0046
IngestDate Mon Jun 30 10:20:09 EDT 2025
Sat Nov 29 01:31:45 EST 2025
Tue Nov 18 21:18:39 EST 2025
Wed Aug 27 02:39:16 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 9
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c333t-9b867eada838977a41ee67e6bb2f0ab2dccdfd8d6e498a14e8826fdf8dba0f353
ORCID 0000-0001-8068-3806
0000-0002-4033-4019
0000-0002-1148-2450
OpenAccessLink https://ieeexplore.ieee.org/document/8863115
PQID 2401124194
PQPubID 85464
PageCount 11
ParticipantIDs crossref_primary_10_1109_TIE_2019_2945277
crossref_citationtrail_10_1109_TIE_2019_2945277
ieee_primary_8863115
proquest_journals_2401124194
PublicationCentury 2000
PublicationDate 2020-09-01
PublicationDateYYYYMMDD 2020-09-01
PublicationDate_xml – month: 09
  year: 2020
  text: 2020-09-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on industrial electronics (1982)
PublicationTitleAbbrev TIE
PublicationYear 2020
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SSID ssj0014515
Score 2.3684866
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 7778
SubjectTerms Algorithms
Ambient temperature
Autoregressive processes
Computation
Computational modeling
Design optimization
Floorplans
High performance computing
High-performance computing (HPC) systems
Nodes
supercomputing nodes
System effectiveness
System identification
Temperature measurement
Temperature sensors
Thermal analysis
thermal modeling
Thermal noise
Thermal response
Thermodynamic properties
Title Thermal Model Identification of Computing Nodes in High-Performance Computing Systems
URI https://ieeexplore.ieee.org/document/8863115
https://www.proquest.com/docview/2401124194
Volume 67
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE/IET Electronic Library (IEL) (UW System Shared)
  customDbUrl:
  eissn: 1557-9948
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0014515
  issn: 0278-0046
  databaseCode: RIE
  dateStart: 19820101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE