A multiprocessor scheduling algorithm for low overhead fault-tolerance
We propose a new scheduling algorithm for achieving fault tolerance in multiprocessor systems. The new algorithm partitions a parallel program into subsets of tasks based on some characteristics of a task graph. Then for each subset, the algorithm duplicates and schedules its tasks successively. App...
| Published in: | Proceedings - Symposium on Reliable Distributed Systems, pp. 186-194 |
|---|---|
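| Main authors: | Hashimoto, K.; Tsuchiya, T.; Kikuno, T. |
| Format: | Conference paper; Journal article |
| Language: | English |
| Published: | IEEE, 01.01.1998 |
| Subjects: | Fault tolerance; Laplace equations; Processor scheduling; Scheduling algorithm |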
| ISBN: | 0818692189, 9780818692185 |
| ISSN: | 1060-9857 |
| Abstract | We propose a new scheduling algorithm for achieving fault tolerance in multiprocessor systems. The new algorithm partitions a parallel program into subsets of tasks based on some characteristics of a task graph. Then for each subset, the algorithm duplicates and schedules its tasks successively. Applying the proposed algorithm to three kinds of practical task graphs (Gaussian elimination, Laplace equation solver and LU decomposition), we conduct simulations. Experimental results show that fault tolerance can be achieved at the cost of a small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm. |
|---|---|
| Author | Tsuchiya, T.; Hashimoto, K.; Kikuno, T. |
| Author_xml | 1. Hashimoto, K. (Dept. of Inf. & Math. Sci., Osaka Univ., Japan); 2. Tsuchiya, T.; 3. Kikuno, T. |
| ContentType | Conference Proceeding; Journal Article |
| DBID | 6IE 6IH CBEJK RIE RIO 7SC 8FD JQ2 L7M L~C L~D |
| DOI | 10.1109/RELDIS.1998.740493 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present; Computer and Information Systems Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Academic; Computer and Information Systems Abstracts Professional |
| DatabaseTitle | Computer and Information Systems Abstracts; Technology Research Database; Computer and Information Systems Abstracts – Academic; Advanced Technologies Database with Aerospace; ProQuest Computer Science Collection; Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Computer and Information Systems Abstracts |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EndPage | 194 |
| ExternalDocumentID | 740493 |
| ISBN | 0818692189 9780818692185 |
| ISICitedReferencesCount | 2 |
| ISSN | 1060-9857 |
| IngestDate | Fri Sep 05 14:32:37 EDT 2025 Tue Aug 26 17:50:38 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| Notes | SourceType-Scholarly Journals-2 ObjectType-Feature-2 ObjectType-Conference Paper-1 content type line 23 SourceType-Conference Papers & Proceedings-1 ObjectType-Article-3 |
| PQID | 26639795 |
| PQPubID | 23500 |
| PageCount | 9 |
| ParticipantIDs | proquest_miscellaneous_26639795 ieee_primary_740493 |
| PublicationCentury | 1900 |
| PublicationDate | 1998-01-01 |
| PublicationDateYYYYMMDD | 1998-01-01 |
| PublicationDate_xml | – month: 01 year: 1998 text: 1998-01-01 day: 01 |
| PublicationDecade | 1990 |
| PublicationTitle | Proceedings - Symposium on Reliable Distributed Systems |
| PublicationTitleAbbrev | RELDIS |
| PublicationYear | 1998 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0020387 ssj0000507064 |
| Score | 1.4613771 |
| Snippet | We propose a new scheduling algorithm for achieving fault tolerance in multiprocessor systems. The new algorithm partitions a parallel program into subsets of... In this paper, we propose a new scheduling algorithm for achieving fault-tolerance in multiprocessor systems. The new algorithm partitions a parallel program... |
| SourceID | proquest ieee |
| SourceType | Aggregation Database; Publisher |
| StartPage | 186 |
| SubjectTerms | Fault tolerance; Laplace equations; Processor scheduling; Scheduling algorithm |
| Title | A multiprocessor scheduling algorithm for low overhead fault-tolerance |
| URI | https://ieeexplore.ieee.org/document/740493 https://www.proquest.com/docview/26639795 |
| hasFullText | 1 |
| inHoldings | 1 |
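
The abstract describes the approach only in outline: partition the task graph into subsets, then duplicate and schedule the tasks of each subset in turn. The record does not say which task-graph characteristics drive the partition, so the following is only a rough sketch under stated assumptions, not the authors' algorithm. It assumes subsets are formed by depth level in the task graph, that each task gets exactly two copies on distinct processors, and that a copy may start only after all copies of its predecessors have finished. The names `levels` and `schedule` and their parameters are hypothetical.

```python
from collections import defaultdict


def levels(tasks, deps):
    """Partition tasks into subsets by depth in the task graph.

    ASSUMPTION: depth level is used as the partition criterion purely for
    illustration; the catalog record does not state the real criterion.
    """
    depth = {}

    def depth_of(task):
        if task not in depth:
            preds = deps.get(task, ())
            depth[task] = 1 + max((depth_of(p) for p in preds), default=-1)
        return depth[task]

    groups = defaultdict(list)
    for task in tasks:
        groups[depth_of(task)].append(task)
    return [groups[k] for k in sorted(groups)]


def schedule(tasks, deps, cost, n_procs):
    """Return {task: [(proc, start, end), (proc, start, end)]}.

    Every task is placed twice, on two different processors, so one
    processor failure leaves at least one copy of each task intact.
    """
    assert n_procs >= 2, "duplication needs at least two processors"
    free = [0] * n_procs  # time at which each processor becomes idle
    placed = {}
    for subset in levels(tasks, deps):  # handle the subsets one after another
        for task in subset:
            # Conservative rule: wait for every copy of every predecessor.
            ready = max(
                (end for p in deps.get(task, ()) for (_, _, end) in placed[p]),
                default=0,
            )
            copies, used = [], set()
            for _ in range(2):
                # Choose the unused processor with the earliest start time.
                proc = min(
                    (q for q in range(n_procs) if q not in used),
                    key=lambda q: max(free[q], ready),
                )
                start = max(free[proc], ready)
                end = start + cost[task]
                free[proc] = end
                used.add(proc)
                copies.append((proc, start, end))
            placed[task] = copies
    return placed
```

Calling `schedule(tasks, deps, cost, n_procs)` with a list of task names, a mapping from each task to its predecessors, a per-task cost mapping, and a processor count returns the two placements chosen for every task. The sketch covers placement only; it does not model failure detection or recovery, and it makes no attempt to reproduce the time-redundancy results reported in the abstract.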

