Integrating GraphSAGE and Mamba for Self-Supervised Spatio-Temporal Fault Detection in Microservice Systems

Bibliographic details
Published in: Proceedings - International Symposium on Software Reliability Engineering, pp. 323-334
Main authors: Zhang, Shenglin; Li, Yingke; Tang, Jianjin; Zhao, Chenyu; Gu, Wenwei; Sun, Yongqian; Pei, Dan
Format: Conference paper
Language: English
Published: IEEE, 21 October 2025
ISSN: 2332-6549
Description
Summary: Monitoring and fault detection in microservice systems are crucial for ensuring service stability. However, most existing methods either rely heavily on labeled data or fail to model complex spatiotemporal dependencies across services. To address these limitations, we propose ChronoSage, a spatiotemporal fault detection framework that integrates GraphSAGE and Mamba for unified graph-stream-based modeling. GraphSAGE captures the evolving topological structures by aggregating neighborhood features, while Mamba efficiently models long-range temporal dependencies through a selective state-space mechanism. We adopt a self-supervised training strategy to reduce label dependence and enhance generalization. Experiments on two real-world datasets demonstrate that ChronoSage achieves superior accuracy and efficiency compared to state-of-the-art baselines, such as ART and Eadro. The results validate ChronoSage's ability to support system-level fault detection in dynamic microservice environments, achieving an F1-score of 0.872 on D1 and 0.972 on D2, surpassing all compared methods.
DOI: 10.1109/ISSRE66568.2025.00041
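
Illustrative sketch: the summary names two building blocks, neighborhood aggregation (GraphSAGE) and a selective state-space recurrence (Mamba). The Python below is a minimal toy version of each idea, not the ChronoSage implementation. The function names `mean_aggregate` and `selective_scan`, the parameters `a` and `w_delta`, the example service graph, and the absence of learned weights are all assumptions made for illustration.

```python
import math


def mean_aggregate(features, neighbors):
    """One GraphSAGE-style step with a mean aggregator (no learned weights).

    features:  dict mapping node -> list of floats (hypothetical service metrics)
    neighbors: dict mapping node -> list of adjacent nodes
    Returns each node's own features concatenated with the mean of its
    neighbors' features (zeros when a node has no neighbors).
    """
    out = {}
    for node, own in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            agg = [
                sum(features[n][i] for n in nbrs) / len(nbrs)
                for i in range(len(own))
            ]
        else:
            agg = [0.0] * len(own)
        out[node] = own + agg  # concatenate self and neighborhood summaries
    return out


def selective_scan(xs, a=-1.0, w_delta=1.0):
    """Toy scalar state-space recurrence with an input-dependent step size.

    The step size delta is a softplus of the input, so each input decides
    how much past state is kept. This is an assumed simplification of the
    selective mechanism, reduced to one dimension with fixed parameters.
    """
    h = 0.0
    ys = []
    for x in xs:
        delta = math.log(1.0 + math.exp(w_delta * x))  # softplus, always > 0
        a_bar = math.exp(delta * a)  # discretized decay, in (0, 1) for a < 0
        h = a_bar * h + delta * x    # state update (Euler-style input term)
        ys.append(h)
    return ys


# Hypothetical three-service call graph with two metrics per service.
features = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": []}
embedded = mean_aggregate(features, neighbors)
print(embedded["a"])  # [1.0, 2.0, 4.0, 5.0]
print(selective_scan([0.5, 1.0, 0.0]))
```

In a real model, both steps would carry trained weight matrices and operate on tensors. The sketch only shows the data flow: spatial mixing across the service graph first, then a temporal recurrence over each service's metric stream.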