Nicaea: A Byzantine Fault Tolerant Consensus Under Unpredictable Message Delivery Failures for Parallel and Distributed Computing

Detailed description

Bibliographic details
Published in: IEEE Transactions on Computers, vol. 74, no. 3, pp. 915-928
Main authors: Jing, Guanlin; Zou, Yifei; Xu, Minghui; Zhang, Yanqiang; Yu, Dongxiao; Shan, Zhiguang; Cheng, Xiuzhen; Ranjan, Rajiv
Format: Journal Article
Language: English
Published: IEEE, 1 March 2025
ISSN:0018-9340, 1557-9956
Description
Summary: Byzantine fault-tolerant (BFT) consensus is a critical problem in parallel and distributed computing systems, particularly with potential adversaries. Most prior work on BFT consensus assumes reliable message delivery and tolerates arbitrary failures of up to $\frac{n}{3}$ nodes out of $n$ total nodes. However, many systems face unpredictable message delivery failures. This paper investigates the impact of unpredictable message delivery failures on the BFT consensus problem. We propose Nicaea, a novel protocol enabling consensus among loyal nodes when the number of Byzantine nodes is below a new threshold, given by $\frac{(2-\rho)(1-\rho)^{2n-2}-1}{(2-\rho)(1-\rho)^{2n-2}+1}\,n$, where $\rho$ denotes the message failure rate. Theoretical proofs and experimental results validate Nicaea's Byzantine resilience. Our findings reveal a fundamental trade-off: as message delivery instability increases, a system's tolerance to Byzantine failures decreases. The well-known $\frac{n}{3}$ threshold under reliable message delivery is a special case of our generalized threshold when $\rho=0$. To the best of our knowledge, this work presents the first quantitative characterization of unpredictable message delivery failures' impact on Byzantine fault tolerance in parallel and distributed computing.
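
As an illustration of the threshold stated in the summary, the following minimal Python sketch evaluates the formula for a given number of nodes and message failure rate. The function name `byzantine_threshold`, its input checks, and the sample values of `n` and `rho` are assumptions made for this example; they do not come from the paper, and the sketch does not implement the Nicaea protocol itself.

```python
def byzantine_threshold(n: int, rho: float) -> float:
    """Evaluate the threshold from the summary:

        ((2 - rho) * (1 - rho)**(2n - 2) - 1)
        ------------------------------------- * n
        ((2 - rho) * (1 - rho)**(2n - 2) + 1)

    n   : total number of nodes
    rho : message failure rate, taken here to lie in [0, 1] (an assumption)
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    a = (2.0 - rho) * (1.0 - rho) ** (2 * n - 2)
    return (a - 1.0) / (a + 1.0) * n


if __name__ == "__main__":
    n = 10  # hypothetical system size, chosen only for illustration
    for rho in (0.0, 0.01, 0.02, 0.05):
        print(f"rho={rho:.2f}  threshold={byzantine_threshold(n, rho):.3f}")
```

With `rho = 0` the factor `(2 - rho) * (1 - rho)**(2n - 2)` equals 2, so the expression reduces to `n / 3`, matching the special case named in the summary. For larger `rho` the value shrinks and can become non-positive, which under this formula would mean that no Byzantine node can be tolerated.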
DOI:10.1109/TC.2024.3506856