Unsupervised feature selection using orthogonal encoder-decoder factorization

Bibliographic details
Published in: Information Sciences, vol. 663, p. 120277
Main authors: Mozafari, Maryam; Seyedi, Seyed Amjad; Pir Mohammadiani, Rojiar; Akhlaghian Tab, Fardin
Format: Journal Article
Language: English
Published: Elsevier Inc, 1 March 2024
ISSN:0020-0255, 1872-6291
Description
Abstract: Unsupervised feature selection (UFS) is a fundamental task in machine learning and data analysis, aimed at identifying a subset of non-redundant and relevant features from a high-dimensional dataset. Embedded methods integrate feature selection into model training, resulting in more efficient and interpretable models. Current embedded UFS methods rely primarily on self-representation or pseudo-supervised feature selection approaches to address redundant and irrelevant features, respectively. However, research that fuses these two approaches is lacking. This paper proposes the Orthogonal Encoder-Decoder factorization for unsupervised Feature Selection (OEDFS) model, which combines the strengths of the self-representation and pseudo-supervised approaches. The method draws inspiration from the self-representation properties of autoencoder architectures and uses encoder and decoder factorizations to simulate a pseudo-supervised feature selection approach. To further enhance the part-based characteristics of the factorization, orthogonality constraints and local structure preservation restrictions are incorporated into the objective function. The optimization process is based on the multiplicative update rule, ensuring efficient convergence. To assess the effectiveness of the proposed method, comprehensive experiments are conducted on 14 datasets, and the results are compared with those of eight state-of-the-art methods. The experimental results demonstrate the superior performance of the proposed approach in terms of UFS efficiency.
DOI:10.1016/j.ins.2024.120277
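
Illustrative sketch (not the authors' OEDFS algorithm): the abstract describes an encoder-decoder factorization optimized with multiplicative updates, but this record gives neither the objective nor the update rules. The Python/NumPy example below is therefore a simplified stand-in under stated assumptions: it minimizes only the reconstruction error ||X - XWH||_F^2 with nonnegative W (encoder) and H (decoder), omits the orthogonality and local-structure terms, and ranks features by the row norms of W, a scoring rule that is common in embedded feature selection but assumed here. The function name and all parameters are hypothetical.

```python
import numpy as np

def encoder_decoder_scores(X, k, n_iter=200, eps=1e-10, seed=0):
    """Toy factorization X ~ X @ W @ H with nonnegative W and H.

    Simplified illustration only: no orthogonality constraint and no
    local structure preservation term. X must be nonnegative.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.random((d, k))   # encoder: features -> k latent parts
    H = rng.random((k, d))   # decoder: latent parts -> features
    G = X.T @ X              # d x d Gram matrix, nonnegative when X is
    losses = []
    for _ in range(n_iter):
        # Multiplicative updates: ratio of the negative to the positive
        # part of the gradient, which keeps W and H nonnegative.
        W *= (G @ H.T) / (G @ W @ (H @ H.T) + eps)
        WtG = W.T @ G
        H *= WtG / (WtG @ W @ H + eps)
        losses.append(np.linalg.norm(X - X @ W @ H) ** 2)
    # Assumed scoring rule: a feature matters if its encoder row is large.
    scores = np.linalg.norm(W, axis=1)
    return scores, losses

# Usage on synthetic nonnegative data (60 samples, 12 features).
rng = np.random.default_rng(42)
X = rng.random((60, 12))
scores, losses = encoder_decoder_scores(X, k=4)
top = np.argsort(scores)[::-1][:5]   # indices of the 5 highest-scoring features
```

The multiplicative form is chosen here because it needs no step size and preserves nonnegativity automatically; a faithful implementation would have to add the constraint terms from the paper's actual objective.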