Reducing Fragmentation on 3D Torus-Based HPC Systems Using Packing-Based Job Scheduling and Job Placement Reconfiguration
We address the topology-aware job scheduling and placement problems on 3D torus-based high performance computing systems, with the objective of reducing system fragmentation. In our previous work, we proposed a job placement algorithm based on a local migration process, which aims at reducing the in...
| Published in: | 2017 16th International Symposium on Parallel and Distributed Computing (ISPDC), pp. 34-43 |
|---|---|
| Main authors: | Kangkang Li, Maciej Malawski, Jarek Nabrzyski |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 01.07.2017 |
| Abstract | We address the topology-aware job scheduling and placement problems on 3D torus-based high performance computing (HPC) systems, with the objective of reducing system fragmentation. In our previous work, we proposed a job placement algorithm based on a local migration process, which aims to reduce the internal fragmentation caused by using a convex prism shape for job allocation. However, HPC systems are prone to external fragmentation as well. Hence, in this paper, we mainly aim to reduce the external fragmentation introduced by the job scheduling and placement processes. First, from the job scheduling perspective, we propose a packing-based job scheduling strategy, which reduces the external fragmentation by using the First Come First Served (FCFS) + backfilling strategy. Second, we review the migration-based job placement algorithm from our previous work. Third, to reduce the external fragmentation that results from running jobs being scattered across the system, we propose a job placement reconfiguration algorithm, which uses a global migration process to rearrange the placement of running jobs across the system. In the off-line scenario, both local and global migration are emulated virtual processes with no migration overhead. In the on-line scenario, however, migration is a real process and incurs a migration delay. Therefore, we propose a buffer-based on-line scheduling model, which helps to avoid the delay of local migration. The evaluation results validate the efficiency of our approach in reducing system fragmentation and improving system utilization. |
|---|---|
| Author | Li, Kangkang; Malawski, Maciej; Nabrzyski, Jarek |
| Author_xml | – sequence: 1 surname: Kangkang Li fullname: Kangkang Li email: kli3@nd.edu organization: Dept. of Comput. Sci. & Eng., Univ. of Notre Dame, Notre Dame, IN, USA – sequence: 2 givenname: Maciej surname: Malawski fullname: Malawski, Maciej email: malawski@agh.edu.pl organization: Dept. of Comput. Sci., AGH Univ. of Sci. & Technol., Krakow, Poland – sequence: 3 givenname: Jarek surname: Nabrzyski fullname: Nabrzyski, Jarek email: naber@nd.edu organization: Dept. of Comput. Sci. & Eng., Univ. of Notre Dame, Notre Dame, IN, USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/ISPDC.2017.11 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE/IET Electronic Library (IEL) (UW System Shared); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE/IET Electronic Library (IEL) (UW System Shared) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9781538608623 1538608626 |
| EndPage | 43 |
| ExternalDocumentID | 8121617 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IL CBEJK RIE RIL |
| ID | FETCH-LOGICAL-i175t-7da08cba360057ab4f667da37d5a404582bbd3abb124c2dced8d1125ad9f75e3 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 1 |
| IngestDate | Thu Jun 29 18:37:35 EDT 2023 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i175t-7da08cba360057ab4f667da37d5a404582bbd3abb124c2dced8d1125ad9f75e3 |
| PageCount | 10 |
| ParticipantIDs | ieee_primary_8121617 |
| PublicationCentury | 2000 |
| PublicationDate | 2017-July |
| PublicationDateYYYYMMDD | 2017-07-01 |
| PublicationDate_xml | – month: 07 year: 2017 text: 2017-July |
| PublicationDecade | 2010 |
| PublicationTitle | 2017 16th International Symposium on Parallel and Distributed Computing (ISPDC) |
| PublicationTitleAbbrev | ISPDC |
| PublicationYear | 2017 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| Score | 1.658849 |
| Snippet | We address the topology-aware job scheduling and placement problems on 3D torus-based high performance computing systems, with the objective of reducing system... |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 34 |
| SubjectTerms | Delays; job placement; Layout; migration; off-line; on-line; packing-based job scheduling; Processor scheduling; Resource management; Schedules; Scheduling; Shape; Topology-aware |
| Title | Reducing Fragmentation on 3D Torus-Based HPC Systems Using Packing-Based Job Scheduling and Job Placement Reconfiguration |
| URI | https://ieeexplore.ieee.org/document/8121617 |
| WOSCitedRecordID | wos000419859700005 |