Feature selection in linear support vector machines via a hard cardinality constraint: A scalable conic decomposition approach

Bibliographic details
Title: Feature selection in linear support vector machines via a hard cardinality constraint: A scalable conic decomposition approach
Authors: Immanuel Bomze, Federico D’Onofrio, Laura Palagi, Bo Peng
Source: European Journal of Operational Research.
Publisher information: Elsevier BV, 2025.
Publication year: 2025
Subjects: Interpretable Machine Learning, Support Vector Machines, Semidefinite Programming, Mixed Integer Quadratic Programming
Description: In this paper, we study the embedded feature selection problem in linear Support Vector Machines (SVMs), in which a cardinality constraint is employed, leading to an interpretable classification model. The problem is NP-hard due to the presence of the cardinality constraint, even though the original linear SVM amounts to a problem solvable in polynomial time. To handle the hard problem, we first introduce two mixed-integer formulations for which novel semidefinite relaxations are proposed. Exploiting the sparsity pattern of the relaxations, we decompose the problems and obtain equivalent relaxations in a much smaller cone, making the conic approaches scalable. To make the best use of the decomposed relaxations, we propose heuristics that exploit the information in their optimal solutions. Moreover, an exact procedure is proposed by solving a sequence of mixed-integer decomposed semidefinite optimization problems. Numerical results on classical benchmarking datasets are reported, showing the efficiency and effectiveness of our approach.
Document type: Article
Language: English
ISSN: 0377-2217
DOI: 10.1016/j.ejor.2025.03.017
Access URL: https://hdl.handle.net/11573/1736235
https://doi.org/10.1016/j.ejor.2025.03.017
Rights: CC BY
Accession number: edsair.doi.dedup.....3e1aafb8df1753f7cd18fb32f9c48e64
Database: OpenAIRE
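
Note on the problem studied: the abstract refers to a linear SVM with a hard cardinality constraint and to mixed-integer formulations of it, but this record does not reproduce them. As a rough orientation only, the textbook soft-margin linear SVM with a cardinality budget, together with one common big-M mixed-integer reformulation, can be written as below. All symbols (training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, weights $w \in \mathbb{R}^n$, bias $b$, slacks $\xi$, penalty $C > 0$, budget $B$, bound $M$, indicators $z$) are assumptions introduced here for illustration and may differ from the two formulations actually proposed in the paper.

```latex
% Cardinality-constrained soft-margin linear SVM (generic textbook form, assumed notation)
\begin{align*}
\min_{w,\,b,\,\xi} \quad & \tfrac{1}{2}\,\|w\|_2^2 + C \sum_{i=1}^{m} \xi_i \\
\text{s.t.} \quad & y_i \,( w^\top x_i + b ) \ge 1 - \xi_i, && i = 1,\dots,m, \\
                  & \xi_i \ge 0,                             && i = 1,\dots,m, \\
                  & \|w\|_0 \le B .
\end{align*}

% One common big-M mixed-integer reformulation of the constraint \|w\|_0 \le B:
% z_j = 0 forces w_j = 0, and at most B indicators may be switched on.
\begin{align*}
 & -M z_j \le w_j \le M z_j, && j = 1,\dots,n, \\
 & \sum_{j=1}^{n} z_j \le B, \\
 & z_j \in \{0,1\},          && j = 1,\dots,n .
\end{align*}
```

Here $\|w\|_0$ counts the nonzero entries of $w$, so at most $B$ features enter the classifier. The big-M form is equivalent to the cardinality constraint only if $M$ is a valid upper bound on $|w_j|$ at some optimal solution. Without the constraint on $\|w\|_0$ the problem is a convex quadratic program, which is the polynomially solvable case the abstract mentions.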