MOSES: A Streaming Algorithm for Linear Dimensionality Reduction

This paper introduces Memory-limited Online Subspace Estimation Scheme (MOSES) for both estimating the principal components of streaming data and reducing its dimension. More specifically, in various applications such as sensor networks, the data vectors are presented sequentially to a user who has...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:IEEE transactions on pattern analysis and machine intelligence Ročník 42; číslo 11; s. 2901 - 2911
Hlavní autoři: Eftekhari, Armin, Hauser, Raphael A., Grammenos, Andreas
Médium: Journal Article
Jazyk:angličtina
Vydáno: United States IEEE 01.11.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Témata:
ISSN:0162-8828, 1939-3539, 2160-9292, 1939-3539
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:This paper introduces Memory-limited Online Subspace Estimation Scheme (MOSES) for both estimating the principal components of streaming data and reducing its dimension. More specifically, in various applications such as sensor networks, the data vectors are presented sequentially to a user who has limited storage and processing time available. Applied to such problems, MOSES can provide a running estimate of leading principal components of the data that has arrived so far and also reduce its dimension. MOSES generalises the popular incremental Singular Vale Decomposition (iSVD) to handle thin blocks of data, rather than just vectors. This minor generalisation in part allows us to complement MOSES with a comprehensive statistical analysis, thus providing the first theoretically-sound variant of iSVD, which has been lacking despite the empirical success of this method. This generalisation also enables us to concretely interpret MOSES as an approximate solver for the underlying non-convex optimisation program. We find that MOSES consistently surpasses the state of the art in our numerical experiments with both synthetic and real-world datasets, while being computationally inexpensive.
Bibliografie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:0162-8828
1939-3539
2160-9292
1939-3539
DOI:10.1109/TPAMI.2019.2919597