rs-Sparse principal component analysis: A mixed integer nonlinear programming approach with VNS
| Published in: | Computers & Operations Research, Volume 52, Part B, pp. 349-354 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York; Elsevier Ltd; 1 December 2014; Pergamon Press Inc |
| Subjects: | |
| ISSN: | 0305-0548, 1873-765X |
| Summary: | Principal component analysis is a popular dimensionality reduction technique in data analysis that aims to project a given dataset, with minimum error, onto a subspace of lower dimension. To improve interpretability, different variants of the method have been proposed in the literature in which sparsity is sought in addition to error minimization. In this paper we formulate, as a mixed integer nonlinear program, the problem of finding a subspace with a sparse basis that minimizes the sum of squared distances between the points and their projections. Contrary to other attempts in the literature, our model lets the user fix the level of sparseness of the resulting basis vectors. Variable neighborhood search is proposed to solve the resulting problem. Our numerical experience on test sets shows that our procedure outperforms benchmark methods in the literature in terms of both sparsity and error. |
|---|---|
| DOI: | 10.1016/j.cor.2013.04.012 |
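
The summary describes minimizing the sum of squared distances between points and their projections under a user-fixed sparsity level, solved by variable neighborhood search. The sketch below illustrates that idea in a much simplified setting and is not the paper's formulation or algorithm: it handles a single basis vector only, and the names `projection_error`, `local_search`, `vns_sparse_pc`, and the parameters `s`, `k_max`, and `n_iter` are all hypothetical. It assumes the rows of `X` are already centered, swaps variables in and out of the support as its neighborhood structure, and uses the fact that, for a fixed support, the best unit vector is the leading eigenvector of the Gram matrix restricted to that support.

```python
import numpy as np


def projection_error(X, support):
    """Sum of squared distances from the rows of X to their projections
    onto the best single direction whose nonzeros lie in `support`."""
    idx = sorted(support)
    gram = X[:, idx].T @ X[:, idx]           # Gram matrix restricted to the support
    captured = np.linalg.eigvalsh(gram)[-1]  # largest eigenvalue = energy captured
    return float(np.sum(X ** 2) - captured)


def local_search(X, support):
    """First-improvement descent over single swaps (one variable out, one in)."""
    n_vars = X.shape[1]
    best = projection_error(X, support)
    improved = True
    while improved:
        improved = False
        for out in sorted(support):
            for inn in range(n_vars):
                if inn in support:
                    continue
                cand = (support - {out}) | {inn}
                err = projection_error(X, cand)
                if err < best - 1e-12:
                    support, best, improved = cand, err, True
                    break
            if improved:
                break
    return support, best


def vns_sparse_pc(X, s, k_max=3, n_iter=30, seed=0):
    """Basic VNS over supports of size exactly s (illustrative only)."""
    rng = np.random.default_rng(seed)
    n_vars = X.shape[1]
    support = set(rng.choice(n_vars, size=s, replace=False).tolist())
    support, best = local_search(X, support)
    for _ in range(n_iter):
        k = 1
        while k <= min(k_max, s, n_vars - s):
            # Shaking: exchange k random variables between support and complement.
            inside = rng.choice(sorted(support), size=k, replace=False).tolist()
            pool = [j for j in range(n_vars) if j not in support]
            outside = rng.choice(pool, size=k, replace=False).tolist()
            cand = (support - set(inside)) | set(outside)
            cand, err = local_search(X, cand)
            if err < best - 1e-12:
                support, best, k = cand, err, 1  # improvement: restart at k = 1
            else:
                k += 1                           # no improvement: widen neighborhood
    return sorted(support), best
```

Calling `vns_sparse_pc(X, s=2)` on a centered data matrix returns the selected variable indices and the corresponding projection error. Extending the sketch to several basis vectors would require handling the interaction between their supports, which this simplified version does not attempt.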