An Optimal Algorithm for Extreme Scale Job Launching


Detailed bibliography
Published in: IEEE ... International Conference on Trust, Security and Privacy in Computing and Communications (Online), pp. 1115-1122
Main authors: Goehner, Joshua D., Groves, Taylor L., Arnold, Dorian C., Ahn, Dong H., Lee, Gregory L.
Format: Conference paper
Language: English
Published: IEEE, 1 July 2013
ISSN: 2324-898X
Abstract All distributed software systems execute a bootstrapping phase upon instantiation. During this phase, the composite processes of the system are deployed onto a set of computational nodes, and initialization information is disseminated among these processes. However, with the growing trend toward high-end systems with very large numbers of compute cores, the bootstrapping phase is increasingly becoming a bottleneck. This presents significant challenges to several key elements of extreme-scale machines: both the usefulness of interactive run-time tools and the efficiency of newly emerging computational models, such as many-task computing and uncertainty quantification runs, increasingly suffer from inefficient bootstrapping. In this paper, we propose a novel algorithm that determines an optimal bootstrapping strategy. Our algorithm is based on a process launch performance model and finds the optimal strategy for a specified set of nodes. We prove that our process-launching strategy is optimal and compare it empirically with other standard strategies. Lastly, we show that our algorithm can decrease bootstrapping time in a real software system by up to 50%.
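The abstract contrasts launch strategies but this record does not give the paper's performance model or algorithm. The following is only a toy sketch of why strategy choice matters, not the authors' method. It assumes a hypothetical cost model in which a parent needs `launch_cost` seconds per child it starts, and a child needs a further `startup_delay` seconds before it is running and can launch others. The function names and both parameters are illustrative assumptions.

```python
import heapq


def flat_launch_time(n, launch_cost, startup_delay):
    """Time for a single root to start n processes one after another.

    Toy model (an assumption, not taken from the paper): each launch
    occupies the root for launch_cost, and the last child is running
    startup_delay after its launch was issued.
    """
    if n == 0:
        return 0.0
    return n * launch_cost + startup_delay


def greedy_tree_launch_time(n, launch_cost, startup_delay):
    """Time to start n processes when every running process helps launch.

    Greedy rule: the next launch is always issued by whichever launcher
    becomes free earliest. Under the same toy cost model as above.
    """
    if n == 0:
        return 0.0
    free_at = [0.0]  # min-heap of times at which a launcher is free
    finish = 0.0
    for _ in range(n):
        start = heapq.heappop(free_at)
        issued = start + launch_cost      # parent is busy until here
        ready = issued + startup_delay    # child is running from here
        finish = max(finish, ready)
        heapq.heappush(free_at, issued)   # parent can launch again
        heapq.heappush(free_at, ready)    # child joins the launchers
    return finish


if __name__ == "__main__":
    for n in (7, 100, 1000):
        flat = flat_launch_time(n, 1.0, 0.5)
        tree = greedy_tree_launch_time(n, 1.0, 0.5)
        print(f"n={n}: flat={flat:.1f}s tree={tree:.1f}s")
```

With `launch_cost=1.0` and `startup_delay=0.0`, starting 7 processes takes 7.0 time units from a single root and 3.0 with the greedy tree, since the number of running processes doubles each step. A real launcher would also have to account for heterogeneous nodes and network topology, which this sketch ignores.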
Author Arnold, Dorian C.
Groves, Taylor L.
Lee, Gregory L.
Ahn, Dong H.
Goehner, Joshua D.
Author_xml – sequence: 1
  givenname: Joshua D.
  surname: Goehner
  fullname: Goehner, Joshua D.
  email: josh.goehner@roguewave.com
  organization: Rogue Wave Software, Inc., Natick, MA, USA
– sequence: 2
  givenname: Taylor L.
  surname: Groves
  fullname: Groves, Taylor L.
  email: tgroves@cs.unm.edu
  organization: Univ. of New Mexico, Albuquerque, NM, USA
– sequence: 3
  givenname: Dorian C.
  surname: Arnold
  fullname: Arnold, Dorian C.
  email: darnold@cs.unm.edu
  organization: Univ. of New Mexico, Albuquerque, NM, USA
– sequence: 4
  givenname: Dong H.
  surname: Ahn
  fullname: Ahn, Dong H.
  email: ahn1@llnl.gov
  organization: Lawrence Livermore Nat. Lab., Livermore, CA, USA
– sequence: 5
  givenname: Gregory L.
  surname: Lee
  fullname: Lee, Gregory L.
  email: lee218@llnl.gov
  organization: Lawrence Livermore Nat. Lab., Livermore, CA, USA
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/TrustCom.2013.135
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Xplore Digital Library
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Xplore Digital Library
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9780769550220
0769550223
EndPage 1122
ExternalDocumentID 6680956
Genre orig-research
GroupedDBID 6IE
6IF
6IL
6IN
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
OCL
RIE
RIL
ID FETCH-LOGICAL-i218t-77aa8d8c2ecb1aad75867fff846e73e2da9524f4d3d9fb0a14cb27ebcc179ee43
IEDL.DBID RIE
ISICitedReferencesCount 0
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000332856700141&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 2324-898X
IngestDate Wed Aug 27 03:55:36 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i218t-77aa8d8c2ecb1aad75867fff846e73e2da9524f4d3d9fb0a14cb27ebcc179ee43
OpenAccessLink https://www.osti.gov/biblio/1119938
PageCount 8
ParticipantIDs ieee_primary_6680956
PublicationCentury 2000
PublicationDate 2013-July
PublicationDateYYYYMMDD 2013-07-01
PublicationDate_xml – month: 07
  year: 2013
  text: 2013-July
PublicationDecade 2010
PublicationTitle IEEE ... International Conference on Trust, Security and Privacy in Computing and Communications (Online)
PublicationTitleAbbrev trustcom
PublicationYear 2013
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssib026763068
ssj0003204185
Score 1.5121849
SourceID ieee
SourceType Publisher
StartPage 1115
SubjectTerms bootstrapping
Computational modeling
Data models
Greedy algorithms
job launching
large scale systems software
Mathematical model
resource and job management
Software
Software algorithms
Topology
Title An Optimal Algorithm for Extreme Scale Job Launching
URI https://ieeexplore.ieee.org/document/6680956
WOSCitedRecordID wos000332856700141&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint