Exact and Approximation Algorithms for Sparse Principal Component Analysis

Gespeichert in:
Bibliographische Detailangaben
Titel: Exact and Approximation Algorithms for Sparse Principal Component Analysis
Autoren: Yongchun Li, Weijun Xie
Quelle: INFORMS Journal on Computing. 37:582-602
Verlagsinformationen: Institute for Operations Research and the Management Sciences (INFORMS), 2025.
Publikationsjahr: 2025
Schlagwörter: 0211 other engineering and technologies, 02 engineering and technology, 0101 mathematics, 01 natural sciences
Beschreibung: Sparse principal component analysis (SPCA) is designed to enhance the interpretability of traditional principal component analysis by optimally selecting a subset of features that comprise the first principal component. Given the NP-hard nature of SPCA, most current approaches resort to approximate solutions, typically achieved through tractable semidefinite programs or heuristic methods. To solve SPCA to optimality, we propose two exact mixed-integer semidefinite programs (MISDPs) and an arbitrarily equivalent mixed-integer linear program. The MISDPs allow us to design an effective branch-and-cut algorithm with closed-form cuts that do not need to solve dual problems. For the proposed mixed-integer formulations, we further derive the theoretical optimality gaps of their continuous relaxations. Besides, we apply the greedy and local search algorithms to solving SPCA and derive their first-known approximation ratios. Our numerical experiments reveal that the exact methods we developed can efficiently find optimal solutions for data sets containing hundreds of features. Furthermore, our approximation algorithms demonstrate both scalability and near-optimal performance when benchmarked on larger data sets, specifically those with thousands of features. History: Accepted by Andrea Lodi, Area Editor for Design & Analysis of Algorithms—Discrete. Funding: This research was supported in part by the Division of Civil, Mechanical and Manufacturing Innovation [Grant 224614], the Division of Computing and Communication Foundations [Grant 2246417], and the Office of Naval Research [Grant N00014-24-1-2066]. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2022.0372 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2022.0372 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
Publikationsart: Article
Sprache: English
ISSN: 1526-5528
1091-9856
DOI: 10.1287/ijoc.2022.0372
Dokumentencode: edsair.doi...........1ee2f532070adb42d9c287cc121c15d4
Datenbank: OpenAIRE
Beschreibung
Abstract:Sparse principal component analysis (SPCA) is designed to enhance the interpretability of traditional principal component analysis by optimally selecting a subset of features that comprise the first principal component. Given the NP-hard nature of SPCA, most current approaches resort to approximate solutions, typically achieved through tractable semidefinite programs or heuristic methods. To solve SPCA to optimality, we propose two exact mixed-integer semidefinite programs (MISDPs) and an arbitrarily equivalent mixed-integer linear program. The MISDPs allow us to design an effective branch-and-cut algorithm with closed-form cuts that do not need to solve dual problems. For the proposed mixed-integer formulations, we further derive the theoretical optimality gaps of their continuous relaxations. Besides, we apply the greedy and local search algorithms to solving SPCA and derive their first-known approximation ratios. Our numerical experiments reveal that the exact methods we developed can efficiently find optimal solutions for data sets containing hundreds of features. Furthermore, our approximation algorithms demonstrate both scalability and near-optimal performance when benchmarked on larger data sets, specifically those with thousands of features. History: Accepted by Andrea Lodi, Area Editor for Design & Analysis of Algorithms—Discrete. Funding: This research was supported in part by the Division of Civil, Mechanical and Manufacturing Innovation [Grant 224614], the Division of Computing and Communication Foundations [Grant 2246417], and the Office of Naval Research [Grant N00014-24-1-2066]. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2022.0372 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2022.0372 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
ISSN:15265528
10919856
DOI:10.1287/ijoc.2022.0372