Integrating GraphSAGE and Mamba for Self-Supervised Spatio-Temporal Fault Detection in Microservice Systems

Bibliographic Details
Published in: Proceedings - International Symposium on Software Reliability Engineering, pp. 323-334
Main Authors: Zhang, Shenglin, Li, Yingke, Tang, Jianjin, Zhao, Chenyu, Gu, Wenwei, Sun, Yongqian, Pei, Dan
Format: Conference Proceeding
Language: English
Published: IEEE 21.10.2025
ISSN: 2332-6549
Description
Summary: Monitoring and fault detection in microservice systems are crucial for ensuring service stability. However, most existing methods either rely heavily on labeled data or fail to model complex spatiotemporal dependencies across services. To address these limitations, we propose ChronoSage, a spatiotemporal fault detection framework that integrates GraphSAGE and Mamba for unified graph-stream-based modeling. GraphSAGE captures the evolving topological structures by aggregating neighborhood features, while Mamba efficiently models long-range temporal dependencies through a selective state-space mechanism. We adopt a self-supervised training strategy to reduce label dependence and enhance generalization. Experiments on two real-world datasets, D1 and D2, demonstrate that ChronoSage achieves superior accuracy and efficiency compared to state-of-the-art baselines such as ART and Eadro. ChronoSage achieves an F1-score of 0.872 on D1 and 0.972 on D2, surpassing all compared methods, which validates its ability to support system-level fault detection in dynamic microservice environments.
DOI: 10.1109/ISSRE66568.2025.00041
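
Illustrative note: the summary says GraphSAGE works "by aggregating neighborhood features". The sketch below shows what one such aggregation step could look like on a toy service graph. It is not the authors' implementation. The service names, the two metrics per service, and the function `mean_aggregate` are all hypothetical, and the learned weight matrix and nonlinearity that a real GraphSAGE layer applies after concatenation are left out for brevity.

```python
from typing import Dict, List


def mean_aggregate(
    features: Dict[str, List[float]],
    neighbors: Dict[str, List[str]],
) -> Dict[str, List[float]]:
    """One GraphSAGE-style step without learned weights.

    Each node's output is its own feature vector concatenated with the
    element-wise mean of its neighbors' feature vectors.
    """
    out: Dict[str, List[float]] = {}
    for node, own in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            # Element-wise mean over the neighbors' feature vectors.
            mean = [
                sum(features[n][i] for n in nbrs) / len(nbrs)
                for i in range(len(own))
            ]
        else:
            # Nodes without neighbors get a zero vector (an assumption).
            mean = [0.0] * len(own)
        out[node] = own + mean
    return out


# Hypothetical call graph; features are [latency_ms, error_count].
features = {
    "gateway": [120.0, 1.0],
    "orders": [80.0, 2.0],
    "payments": [200.0, 6.0],
}
neighbors = {
    "gateway": ["orders", "payments"],
    "orders": ["payments"],
    "payments": [],
}

embeddings = mean_aggregate(features, neighbors)
print(embeddings["gateway"])
```

In this toy setup the `gateway` vector picks up the elevated error count of its downstream `payments` service, which is the kind of cross-service signal a graph-based detector can exploit.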