DeepTraLog: Trace-Log Combined Microservice Anomaly Detection through Graph-based Deep Learning

A microservice system in industry is usually a large-scale dis-tributed system consisting of dozens to thousands of services run-ning in different machines. An anomaly of the system often can be reflected in traces and logs, which record inter-service interactions and intra-service behaviors respect...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE) s. 623 - 634
Hlavní autoři: Zhang, Chenxi, Peng, Xin, Sha, Chaofeng, Zhang, Ke, Fu, Zhenqing, Wu, Xiya, Lin, Qingwei, Zhang, Dongmei
Médium: Konferenční příspěvek
Jazyk:angličtina
Vydáno: ACM 01.05.2022
Témata:
ISSN:1558-1225
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:A microservice system in industry is usually a large-scale dis-tributed system consisting of dozens to thousands of services run-ning in different machines. An anomaly of the system often can be reflected in traces and logs, which record inter-service interactions and intra-service behaviors respectively. Existing trace anomaly detection approaches treat a trace as a sequence of service invocations. They ignore the complex structure of a trace brought by its invocation hierarchy and parallel/asynchronous invocations. On the other hand, existing log anomaly detection approaches treat a log as a sequence of events and cannot handle microservice logs that are distributed in a large number of services with complex interactions. In this paper, we propose DeepTraLog, a deep learning based microservice anomaly detection approach. DeepTraLog uses a unified graph representation to describe the complex structure of a trace together with log events embedded in the structure. Based on the graph representation, DeepTraLog trains a GGNNs based deep SVDD model by combing traces and logs and detects anom-alies in new traces and the corresponding logs. Evaluation on a microservice benchmark shows that DeepTraLog achieves a high precision (0.93) and recall (0.97), outperforming state-of-the-art trace/log anomaly detection approaches with an average increase of 0.37 in F1-score. It also validates the efficiency of DeepTraLog, the contribution of the unified graph representation, and the impact of the configurations of some key parameters.
ISSN:1558-1225
DOI:10.1145/3510003.3510180