Efficient and Scalable Algorithms for Inferring Likely Invariants in Distributed Systems

Bibliographic Details
Published in:	IEEE transactions on knowledge and data engineering Vol. 19; no. 11; pp. 1508 - 1523
Main Authors:	Guofei Jiang, Haifeng Chen, Yoshihira, K.
Format:	Journal Article
Language:	English
Published:	New York, NY IEEE 01.11.2007 IEEE Computer Society The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Algorithms Algorithms for data and knowledge management Analysis of Algorithms and Problem Complexity Applied sciences Approximation Complexity Computation Computational complexity Computer networks Computer science; control theory; systems Computerized monitoring Costs Data mining Data processing. List processing. Character string processing Distributed Systems Exact sciences and technology Fault detection Fluid flow measurement Hardware Information systems. Data bases Invariants Large-scale systems Management information systems Memory organisation. Data processing Monitoring Software Studies System Management Systems management Time series analysis Volume measurement Invariant data management Decision support system Complex system Time series Distributed system Randomized algorithm Computational complexity Search algorithm monitoring data Surveillance Storage management Time management Distributed systems randomized algorithms Algorithm complexity Log file Monitoring Large scale system invariants
ISSN:	1041-4347, 1558-2191

Description
Summary:	Distributed systems generate a large amount of monitoring data, such as log files, to track their operational status. However, it is hard to correlate such monitoring data effectively across distributed systems and over the observation time for system management. In previous work, we proposed a concept named flow intensity to measure the intensity with which internal monitoring data reacts to the volume of user requests. We calculated flow intensity measurements from monitoring data and proposed an algorithm to automatically search for constant relationships between flow intensities measured at various points across distributed systems. If such relationships hold all the time, we regard them as invariants of the underlying systems. Invariants can be used to characterize complex systems and support various system management tasks. However, the computational complexity of the previous invariant search algorithm is high, so it may not scale well to large systems with thousands of measurements. In this paper, we propose two efficient but approximate algorithms for inferring invariants in large-scale systems. The computational complexity of the new randomized algorithms is significantly reduced, and experimental results from a real system are included to demonstrate the accuracy and efficiency of the new algorithms.
DOI:	10.1109/TKDE.2007.190648