HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization

Bibliographic Details
Published in: Proceedings / IEEE International Conference on Cluster Computing, pp. 381-393
Main Authors: Dorier, Matthieu, Egele, Romain, Balaprakash, Prasanna, Koo, Jaehoon, Madireddy, Sandeep, Ramesh, Srinivasan, Malony, Allen D., Ross, Rob
Format: Conference Proceeding
Language: English
Published: IEEE 01.09.2022
ISSN: 2168-9253
Abstract Distributed data storage services tailored to specific applications have grown popular in the high-performance computing (HPC) community as a way to address I/O and storage challenges. These services offer a variety of specific interfaces, semantics, and data representations. They also expose many tuning parameters, making it difficult for their users to find the best configuration for a given workload and platform. To address this issue, we develop a novel variational-autoencoder-guided asynchronous Bayesian optimization method to tune HPC storage service parameters. Our approach uses transfer learning to leverage prior tuning results and uses a dynamically updated surrogate model to explore the large parameter search space in a systematic way. We implement our approach within the DeepHyper open-source framework and apply it to the autotuning of a high-energy physics workflow on Argonne's Theta supercomputer. We show that our transfer-learning approach enables a more than 40x search speedup over random search, compared with a 2.5x to 10x speedup when not using transfer learning. Additionally, we show that our approach is on par with state-of-the-art autotuning frameworks in speed and outperforms them in resource utilization and parallelization capabilities.
Author Dorier, Matthieu
Balaprakash, Prasanna
Madireddy, Sandeep
Ross, Rob
Ramesh, Srinivasan
Koo, Jaehoon
Egele, Romain
Malony, Allen D.
Author_xml – sequence: 1
  givenname: Matthieu
  surname: Dorier
  fullname: Dorier, Matthieu
  email: mdorier@anl.gov
  organization: Argonne National Laboratory,Lemont,IL
– sequence: 2
  givenname: Romain
  surname: Egele
  fullname: Egele, Romain
  email: romain.egele@universite-paris-saclay.fr
  organization: Argonne National Laboratory,Lemont,IL
– sequence: 3
  givenname: Prasanna
  surname: Balaprakash
  fullname: Balaprakash, Prasanna
  email: pbalapra@anl.gov
  organization: Argonne National Laboratory,Lemont,IL
– sequence: 4
  givenname: Jaehoon
  surname: Koo
  fullname: Koo, Jaehoon
  email: jkoo@anl.gov
  organization: Argonne National Laboratory,Lemont,IL
– sequence: 5
  givenname: Sandeep
  surname: Madireddy
  fullname: Madireddy, Sandeep
  email: smadireddy@anl.gov
  organization: Argonne National Laboratory,Lemont,IL
– sequence: 6
  givenname: Srinivasan
  surname: Ramesh
  fullname: Ramesh, Srinivasan
  email: sramesh@cs.uoregon.edu
  organization: University of Oregon,Eugene,OR
– sequence: 7
  givenname: Allen D.
  surname: Malony
  fullname: Malony, Allen D.
  email: malony@cs.uoregon.edu
  organization: University of Oregon,Eugene,OR
– sequence: 8
  givenname: Rob
  surname: Ross
  fullname: Ross, Rob
  email: rross@anl.gov
  organization: Argonne National Laboratory,Lemont,IL
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/CLUSTER51413.2022.00049
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Xplore
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Xplore
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 1665498560
9781665498562
EISSN 2168-9253
EndPage 393
ExternalDocumentID 9912727
Genre orig-research
IEDL.DBID RIE
ISICitedReferencesCount 9
IngestDate Wed Aug 27 02:18:29 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 13
ParticipantIDs ieee_primary_9912727
PublicationCentury 2000
PublicationDate 2022-Sept.
PublicationDateYYYYMMDD 2022-09-01
PublicationDate_xml – month: 09
  year: 2022
  text: 2022-Sept.
PublicationDecade 2020
PublicationTitle Proceedings / IEEE International Conference on Cluster Computing
PublicationTitleAbbrev CLUSTER
PublicationYear 2022
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0037306
Score 2.2716343
SourceID ieee
SourceType Publisher
StartPage 381
SubjectTerms Autotuning
Bayes methods
Bayesian Optimization
DeepHyper
HPC
I/O
Mochi
Probability distribution
Resource management
Semantics
Storage
Supercomputers
Systematics
Transfer learning
Title HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization
URI https://ieeexplore.ieee.org/document/9912727
WOSCitedRecordID wos000920273100034&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1LTwIxEG6AePCECsZ3evDoynYf7faIROVgkAQx3Ei3Mw0cZA3smvDvbXcXjIkXb02btEkf8-p88xFyK5gCn6eJp2NHYcYh9FTqR16gIGWaIySJLskmxGiUzGZy3CB3eywMIpbJZ3jvmuVfPmS6cKGynrVlAqtvm6QpBK-wWjupG9qbyuv8LebL3uBlOrH2oDUHWGi9wKAqyyl_caiUKuSp_b_Fj0j3B4tHx3stc0wauDoh7R0ZA63fZocshuMBnVgP2goIWksA2i_yLC9c6IOWuQH03brGdfjPK0ddHUuw83jPxRIQaH-zXWlXMDcrNvRBbdGBLOmrFSwfNWKzS6ZPj2-DoVfTKHiLIBS5hzqSaCLQsZJpopjUAkOhuAlUCCh5bCQDZkdDw5UAHxlPjRKpAR_At-bAKWmtshWeERomIJMoNoAmjjBQqWG-MSlanQbKOrjnpOM2bv5ZVcqY13t28Xf3JTl0J1NlbF2RVr4u8Joc6K98uVnflMf7DVnRqq4
linkProvider IEEE
linkToHtml http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1LTwIxEG4QTfSECsa3PXh0ZbvvHpGIGBFJAMONdDvTwMFdA7sm_HvbZcGYePHWtEmb9DGvzjcfIbchE2AHcWRJ31CYBeBaIrY9yxEQMxkgRJEsyCbCfj-aTPigQu62WBhELJLP8N40i798SGVuQmVNbcs4Wt_ukF3DnFWitTZy19V3NSgzuJjNm-3eeKgtQm0QMFf7gc66MCf_xaJSKJFO7X_LH5LGDxqPDrZ65ohUMDkmtQ0dAy1fZ53MuoM2HWofWosIWsoA2sqzNMtN8IMW2QH0XTvHZQDQKkZNJUvQ81hP-RwQaGu5SqQpmZvmS_ogVmhglvRNi5aPErPZIOPO46jdtUoiBWvmuGFmofQ4Kg-kL3gcCcZliG4oAuUIF5AHvuIMmB51VSBCsJEFsRJhrMAGsLVBcEKqSZrgKaFuBDzyfAWofA8dEStmKxWj1mogtIt7Rupm46af61oZ03LPzv_uviH73dFrb9p77r9ckANzSuv8rUtSzRY5XpE9-ZXNl4vr4qi_AWc0rfc
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=proceeding&rft.title=Proceedings+%2F+IEEE+International+Conference+on+Cluster+Computing&rft.atitle=HPC+Storage+Service+Autotuning+Using+Variational-+Autoencoder+-Guided+Asynchronous+Bayesian+Optimization&rft.au=Dorier%2C+Matthieu&rft.au=Egele%2C+Romain&rft.au=Balaprakash%2C+Prasanna&rft.au=Koo%2C+Jaehoon&rft.date=2022-09-01&rft.pub=IEEE&rft.eissn=2168-9253&rft.spage=381&rft.epage=393&rft_id=info:doi/10.1109%2FCLUSTER51413.2022.00049&rft.externalDocID=9912727