Distributed scheduling and data sharing in late-binding overlays


Bibliographic details
Published in: 2014 International Conference on High Performance Computing & Simulation (HPCS), pp. 129-136
Main authors: Delgado Peris, Antonio; Hernandez, Jose M.; Huedo, Eduardo
Format: Conference proceeding
Language: English
Published: IEEE, 1 July 2014
ISBN: 9781479953127, 1479953121
Online access: Full text
Abstract Pull-based late-binding overlays are used in some of today's largest computational grids. Job agents are submitted to resources with the duty of retrieving real workload from a central queue at runtime. This helps overcome the problems of these complex environments: heterogeneity, imprecise status information and relatively high failure rates. In addition, the late job assignment allows dynamic adaptation to changes in grid conditions or user priorities. However, as the scale grows, the central assignment queue may become a bottleneck for the whole system. This article presents a distributed scheduling architecture for late-binding overlays, which addresses this issue by letting execution nodes build a distributed hash table and delegating job matching and assignment to them. This reduces the load on the central server and makes the system much more scalable and robust. Scalability makes fine-grained scheduling possible and enables new functionalities, like the implementation of a distributed data cache on the execution nodes, which helps alleviate the commonly congested grid storage services.
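The abstract says that execution nodes build a distributed hash table and take over job matching and assignment, and that this also enables a distributed data cache on those nodes. The record gives no further detail, so the following is only a minimal sketch of the general idea, assuming a plain consistent-hashing ring (a common DHT building block) rather than the authors' actual protocol. All names in it (`HashRing`, `assign_jobs`, the node names and the file keys) are hypothetical and chosen for illustration.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on a 160-bit identifier ring (SHA-1)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hashing ring: a key belongs to its clockwise successor node."""

    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions of the nodes
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        pos = ring_position(node)
        if pos not in self._owners:
            bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def remove_node(self, node: str) -> None:
        pos = ring_position(node)
        if self._owners.get(pos) == node:
            del self._owners[pos]
            self._positions.remove(pos)

    def lookup(self, key: str) -> str:
        """Return the node responsible for the given key."""
        if not self._positions:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_left(self._positions, ring_position(key))
        # Wrap around to the first node when the key hashes past the last one.
        return self._owners[self._positions[idx % len(self._positions)]]


def assign_jobs(ring: HashRing, jobs):
    """Group job ids by the node that owns each job's input-file key.

    Assumption for this sketch: a job is matched to the node that would
    cache its input file, so jobs sharing an input land on the same node.
    """
    plan = {}
    for job_id, input_file in jobs:
        plan.setdefault(ring.lookup(input_file), []).append(job_id)
    return plan


# Hypothetical nodes and jobs, for illustration only.
ring = HashRing(["node-a", "node-b", "node-c"])
jobs = [
    ("job-1", "dataset/file-001"),
    ("job-2", "dataset/file-002"),
    ("job-3", "dataset/file-001"),  # same input as job-1
]
plan = assign_jobs(ring, jobs)
```

In this sketch, jobs that read the same input are routed to the same node, which is what would let a node-local cache be reused. When a node leaves the ring, only the keys it owned move to another node, so no central queue has to reassign the rest.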
Author Delgado Peris, Antonio
Hernandez, Jose M.
Huedo, Eduardo
Author_xml – sequence: 1
  givenname: Antonio
  surname: Delgado Peris
  fullname: Delgado Peris, Antonio
  email: antonio.delgadoperis@ciemat.es
  organization: Sci. Comput. Unit, CIEMAT, Madrid, Spain
– sequence: 2
  givenname: Jose M.
  surname: Hernandez
  fullname: Hernandez, Jose M.
  email: jose.hernandez@ciemat.es
  organization: Sci. Comput. Unit, CIEMAT, Madrid, Spain
– sequence: 3
  givenname: Eduardo
  surname: Huedo
  fullname: Huedo, Eduardo
  email: ehuedo@fdi.ucm.es
  organization: Fac. de Inf., Univ. Complutense de Madrid (UCM), Madrid, Spain
ContentType Conference Proceeding
DOI 10.1109/HPCSim.2014.6903678
EISBN 1479953113
9781479953134
147995313X
9781479953110
EndPage 136
ExternalDocumentID 6903678
Genre orig-research
ISBN 9781479953127
1479953121
IngestDate Wed Jun 26 19:23:49 EDT 2024
IsPeerReviewed false
IsScholarly false
Language English
PageCount 8
PublicationCentury 2000
PublicationDate 20140701
PublicationDateYYYYMMDD 2014-07-01
PublicationDate_xml – month: 07
  year: 2014
  text: 20140701
  day: 01
PublicationDecade 2010
PublicationTitle 2014 International Conference on High Performance Computing & Simulation (HPCS)
PublicationTitleAbbrev HPCSim
PublicationYear 2014
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 129
SubjectTerms Distributed algorithms
Grid and Cluster Computing
Measurement
Peer-to-Peer Architectures and Networks
Peer-to-peer computing
Reliable Parallel and Distributed Algorithms
Scalable Computing
Title Distributed scheduling and data sharing in late-binding overlays
URI https://ieeexplore.ieee.org/document/6903678