LS-Decomposition for Robust Recovery of Sensory Big Data

Bibliographic details
Published in: IEEE Transactions on Big Data, vol. 4, no. 4, pp. 542-555
Main authors: Liu, Xiao-Yang; Wang, Xiaodong
Format: Journal Article
Language: English
Publication details: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.12.2018
ISSN:2332-7790, 2372-2096
Description
Summary: The emerging Internet of Things (IoT) systems are fueling an exponential explosion of sensory data. The major challenge to effective implementation of IoT systems is the presence of massive missing data entries, measurement noise, and anomaly readings, which motivates us to investigate the robust recovery of sensory big data. In this paper, we propose an LS-decomposition approach that decomposes a sensory reading matrix as the superposition of a Low-rank matrix and a Sparse anomaly matrix. First, based on data sets from three representative real-world IoT projects, i.e., the IntelLab project (indoor environment), the GreenOrbs project (mountain environment), and the NBDC-CTD project (ocean environment), we observe that anomaly readings are ubiquitous and cannot be ignored. Second, we prove that the convex surrogate of the LS-decomposition problem guarantees bounded recovery error under proper conditions. Third, we propose an accelerated proximal gradient algorithm that converges to the optimal solution at a rate that is inversely proportional to the square of the number of iterations. Evaluations on the above three data sets show that the proposed scheme achieves (relative) recovery error $\leq 0.05$ for missing data rate $\leq 50$ percent and almost exact recovery for missing data rate $\leq 40$ percent, while previous methods have (relative) recovery error $0.04 \sim 0.15$ even at only 10 percent missing data rate.
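The decomposition described in the summary can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the usual convex surrogate for a low-rank plus sparse split (nuclear norm on the low-rank part, entrywise l1 norm on the sparse part, squared error on the observed entries only) and a generic FISTA-style accelerated proximal gradient loop. The function names, the parameters `mu` and `lam`, and the convention that missing entries are stored as 0 with a separate 0/1 `mask` are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Entrywise shrinkage: the proximal operator of tau * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(x, tau):
    """Shrink singular values by tau: the proximal operator of tau * ||.||_*."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt


def ls_decompose(m, mask, mu, lam, n_iter=200):
    """Hypothetical helper: split readings m into low-rank L plus sparse S.

    Assumed objective (not taken from the paper):
        0.5 * ||mask * (m - L - S)||_F^2 + mu * ||L||_* + mu * lam * ||S||_1
    m must hold finite values everywhere (e.g. 0 where data is missing);
    mask is True/1 where an entry was observed.
    """
    low = np.zeros_like(m, dtype=float)
    sparse = np.zeros_like(m, dtype=float)
    y_low, y_sparse = low.copy(), sparse.copy()
    t = 1.0
    step = 0.5  # 1 / Lipschitz constant; the joint gradient in (L, S) has constant 2
    for _ in range(n_iter):
        # The smooth term has the same gradient with respect to L and S.
        residual = mask * (y_low + y_sparse - m)
        new_low = singular_value_threshold(y_low - step * residual, step * mu)
        new_sparse = soft_threshold(y_sparse - step * residual, step * mu * lam)
        # Momentum update (FISTA-style extrapolation).
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next
        y_low = new_low + beta * (new_low - low)
        y_sparse = new_sparse + beta * (new_sparse - sparse)
        low, sparse, t = new_low, new_sparse, t_next
    return low, sparse
```

A call such as `low, sparse = ls_decompose(observed, mask, mu=0.1, lam=0.2)` returns the estimated low-rank readings and the estimated anomalies. The values of `mu` and `lam` here are placeholders; suitable choices depend on the data and are not given in this record.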
DOI:10.1109/TBDATA.2017.2763170