How to recover efficiently and asynchronously when optimism fails
We propose a new algorithm for recovering asynchronously from failures in a distributed computation. Our algorithm is based on two novel concepts: a fault-tolerant vector clock to maintain causality information in spite of failures, and a history mechanism to detect orphan states and obsolete message...
| Published in: | Proceedings of the 16th International Conference on Distributed Computing Systems, pp. 108-115 |
|---|---|
| Main authors: | Damani, O.P.; Garg, V.K. |
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 1996 |
| Subjects: | |
| ISBN: | 9780818673993, 0818673990 |
| Online access: | Full text |
| Abstract | We propose a new algorithm for recovering asynchronously from failures in a distributed computation. Our algorithm is based on two novel concepts: a fault-tolerant vector clock to maintain causality information in spite of failures, and a history mechanism to detect orphan states and obsolete messages. These two mechanisms, together with checkpointing and message logging, are used to restore the system to a consistent state after a failure of one or more processes. Our algorithm is completely asynchronous. It handles multiple failures, does not assume any message ordering, causes the minimum amount of rollback, and restores the maximum recoverable state with low overhead. Earlier optimistic protocols lack one or more of the above properties. |
|---|---|
| Author | Damani, O.P.; Garg, V.K. |
| Author_xml | 1. Damani, O.P. (Dept. of Comput. Sci., Texas Univ., Austin, TX, USA); 2. Garg, V.K. |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ICDCS.1996.507907 |
| Discipline | Computer Science |
| EndPage | 115 |
| ExternalDocumentID | 507907 |
| ISBN | 9780818673993 0818673990 |
| ISICitedReferencesCount | 35 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 8 |
| PublicationCentury | 1900 |
| PublicationDate | 19960000 |
| PublicationDateYYYYMMDD | 1996-01-01 |
| PublicationDate_xml | – year: 1996 text: 19960000 |
| PublicationDecade | 1990 |
| PublicationTitle | Proceedings of the 16th International Conference on Distributed Computing Systems |
| PublicationTitleAbbrev | ICDCS |
| PublicationYear | 1996 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 108 |
| SubjectTerms | Checkpointing; Clocks; Costs; Distributed computing; Fault detection; Fault tolerance; History; Protocols |
| Title | How to recover efficiently and asynchronously when optimism fails |
| URI | https://ieeexplore.ieee.org/document/507907 |
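
The abstract names a fault-tolerant vector clock but this record does not give its definition. As a rough illustration only, the Python sketch below assumes each clock entry is a (version, timestamp) pair compared lexicographically, with the version incremented when a process restarts after a failure. The names `Entry`, `FTVectorClock`, `restart`, and `happened_before` are hypothetical, and the update rules are one plausible reading of the idea, not the authors' published algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Entry:
    # Assumption: entries order lexicographically as (version, timestamp).
    version: int = 0    # bumped each time the owning process restarts
    timestamp: int = 0  # bumped on local events within one version


class FTVectorClock:
    """Illustrative vector clock whose entries survive restarts (a sketch)."""

    def __init__(self, n: int, pid: int) -> None:
        self.pid = pid
        self.entries = [Entry() for _ in range(n)]

    def snapshot(self) -> tuple:
        return tuple(self.entries)

    def tick(self) -> None:
        own = self.entries[self.pid]
        self.entries[self.pid] = Entry(own.version, own.timestamp + 1)

    def send(self) -> tuple:
        # Advance the local entry, then stamp the outgoing message.
        self.tick()
        return self.snapshot()

    def receive(self, stamp: tuple) -> None:
        # Component-wise maximum, then advance the local entry.
        self.entries = [max(a, b) for a, b in zip(self.entries, stamp)]
        self.tick()

    def restart(self) -> None:
        # Assumption: after a failure, start a new version with timestamp 0.
        own = self.entries[self.pid]
        self.entries[self.pid] = Entry(own.version + 1, 0)


def happened_before(a: tuple, b: tuple) -> bool:
    """True if stamp a is component-wise <= stamp b and they differ."""
    return a != b and all(x <= y for x, y in zip(a, b))
```

Under these assumptions, a process that restarts gets an entry that compares greater than any entry from its earlier version, so stamps taken before and after a failure remain comparable even though the timestamp was reset.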

