Reducing Fragmentation on 3D Torus-Based HPC Systems Using Packing-Based Job Scheduling and Job Placement Reconfiguration


Detailed bibliography
Published in: 2017 16th International Symposium on Parallel and Distributed Computing (ISPDC), pp. 34-43
Main authors: Li, Kangkang; Malawski, Maciej; Nabrzyski, Jarek
Format: Conference paper
Language: English
Publication details: IEEE, 2017-07-01
Subjects:
Abstract We address the topology-aware job scheduling and placement problems on 3D torus-based high performance computing (HPC) systems, with the objective of reducing system fragmentation. In our previous work, we proposed a job placement algorithm based on a local migration process, which aims to reduce the internal fragmentation caused by using a convex prism shape for job allocation. However, HPC systems are prone to external fragmentation as well. Hence, in this paper, we focus mainly on reducing the external fragmentation introduced by the job scheduling and placement processes. First, from the job scheduling perspective, we propose a packing-based job scheduling strategy, which reduces external fragmentation by using the First Come First Served (FCFS) + backfilling strategy. Second, we review the migration-based job placement algorithm from our previous work. Third, to reduce the external fragmentation resulting from running jobs scattered across the system, we propose a job placement reconfiguration algorithm, which uses a global migration process to rearrange the placement of the running jobs across the system. Both local and global migration are emulated virtual processes in the off-line scenario and incur no migration overhead. In the on-line scenario, however, migration is a real process and leads to a migration delay. Therefore, we propose a buffer-based on-line scheduling model, which helps to avoid the delay of local migration. The evaluation results validate the efficiency of our approach in reducing system fragmentation and improving system utilization.
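To make the notion of internal fragmentation in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a job is allocated the smallest axis-aligned cuboid (one kind of convex prism) that fits inside the torus dimensions, and counts the nodes in that cuboid that the job does not need. The function names `smallest_prism` and `internal_fragmentation` are hypothetical and do not come from the paper.

```python
from itertools import product


def smallest_prism(nodes_needed, torus_dims):
    """Return the (x, y, z) cuboid with the fewest nodes that still holds the job.

    Assumption for illustration only: allocation shapes are axis-aligned
    cuboids no larger than the torus in any dimension.
    """
    best = None  # (volume, shape) of the best candidate seen so far
    ranges = [range(1, d + 1) for d in torus_dims]
    for shape in product(*ranges):
        volume = shape[0] * shape[1] * shape[2]
        if volume < nodes_needed:
            continue  # too small to hold the job
        if best is None or volume < best[0]:
            best = (volume, shape)
    if best is None:
        raise ValueError("job does not fit in the torus")
    return best[1]


def internal_fragmentation(nodes_needed, torus_dims):
    """Count nodes inside the allocated cuboid that the job leaves idle."""
    x, y, z = smallest_prism(nodes_needed, torus_dims)
    return x * y * z - nodes_needed
```

Under these assumptions, a 13-node job on a 4 x 4 x 4 torus cannot use a cuboid of 13, 14, or 15 nodes (none factor into sides of at most 4), so it receives 16 nodes and leaves 3 idle. Idle nodes of this kind are the waste that the local migration process described in the abstract aims to reduce.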
Author Kangkang Li
Malawski, Maciej
Nabrzyski, Jarek
Author_xml – sequence: 1
  givenname: Kangkang
  surname: Li
  fullname: Kangkang Li
  email: kli3@nd.edu
  organization: Dept. of Comput. Sci. & Eng., Univ. of Notre Dame, Notre Dame, IN, USA
– sequence: 2
  givenname: Maciej
  surname: Malawski
  fullname: Malawski, Maciej
  email: malawski@agh.edu.pl
  organization: Dept. of Comput. Sci., AGH Univ. of Sci. & Technol., Krakow, Poland
– sequence: 3
  givenname: Jarek
  surname: Nabrzyski
  fullname: Nabrzyski, Jarek
  email: naber@nd.edu
  organization: Dept. of Comput. Sci. & Eng., Univ. of Notre Dame, Notre Dame, IN, USA
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/ISPDC.2017.11
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE/IET Electronic Library (IEL) (UW System Shared)
IEEE Proceedings Order Plans (POP All) 1998-Present
EISBN 9781538608623
1538608626
EndPage 43
ExternalDocumentID 8121617
Genre orig-research
ISICitedReferencesCount 1
IsPeerReviewed false
IsScholarly false
Language English
PublicationCentury 2000
PublicationDate 2017-July
PublicationDateYYYYMMDD 2017-07-01
PublicationDate_xml – month: 07
  year: 2017
  text: 2017-July
PublicationDecade 2010
PublicationTitle 2017 16th International Symposium on Parallel and Distributed Computing (ISPDC)
PublicationTitleAbbrev ISPDC
PublicationYear 2017
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 34
SubjectTerms Delays
job placement
Layout
migration
off-line
on-line
packing-based job scheduling
Processor scheduling
Resource management
Schedules
Scheduling
Shape
Topology-aware
Title Reducing Fragmentation on 3D Torus-Based HPC Systems Using Packing-Based Job Scheduling and Job Placement Reconfiguration
URI https://ieeexplore.ieee.org/document/8121617