cPCA++: An efficient method for contrastive feature learning

Published in: Pattern recognition, Vol. 124, p. 108378
Main authors: Salloum, Ronald; Kuo, C.-C. Jay
Format: Journal Article
Language: English
Published: Elsevier Ltd, 1 April 2022
ISSN:0031-3203, 1873-5142
Description
Summary: In this work, we propose a new data visualization and clustering technique for discovering discriminative structures in high-dimensional data. This technique, referred to as cPCA++, is motivated by the fact that the interesting features of a “target” dataset may be obscured by high-variance components during traditional PCA. By analyzing what is referred to as a “background” dataset (i.e., one that exhibits the high-variance principal components but not the interesting structures), our technique is capable of efficiently highlighting the structures that are unique to the “target” dataset. Similar to another recently proposed algorithm called “contrastive PCA” (cPCA), the proposed cPCA++ method identifies important dataset-specific patterns that are not detected by traditional PCA in a wide variety of settings. However, unlike cPCA, the proposed cPCA++ method does not require a parameter sweep, and as a result, it is significantly more efficient.
Several experiments were conducted in order to compare the proposed method to state-of-the-art methods. These experiments show that the proposed method achieves performance that is similar to or better than that of the other methods, while being more efficient.
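The summary does not spell out the formulation, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that "highlighting structures unique to the target" can be approximated by finding directions that maximize the ratio of target variance to background variance, which reduces to a single generalized eigenvalue problem and therefore needs no swept contrast parameter. The function names (`covariance`, `ratio_contrastive_directions`), the regularizer `eps`, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np

def covariance(X):
    """Sample covariance of the rows of X (samples x features)."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / (len(X) - 1)

def ratio_contrastive_directions(target, background, k=2, eps=1e-6):
    """Top-k directions w maximizing (w' Rf w) / (w' Rb w).

    Assumption for this sketch: Rf and Rb are the target and background
    covariances, and eps * I keeps Rb positive definite.
    """
    Rf = covariance(target)
    Rb = covariance(background) + eps * np.eye(target.shape[1])
    L = np.linalg.cholesky(Rb)            # Rb = L @ L.T
    Linv = np.linalg.inv(L)
    M = Linv @ Rf @ Linv.T                # symmetric, same eigenvalues as inv(Rb) @ Rf
    evals, evecs = np.linalg.eigh(M)      # eigenvalues in ascending order
    top = evecs[:, ::-1][:, :k]           # k largest variance ratios
    W = Linv.T @ top                      # map back to the original feature space
    return W / np.linalg.norm(W, axis=0)  # unit-length columns

# Synthetic data: feature 0 is a high-variance nuisance shared by both sets,
# feature 1 carries two clusters that exist only in the target set.
rng = np.random.default_rng(0)
n = 500
labels = rng.integers(0, 2, size=n)
target = np.column_stack([
    rng.normal(0, 10, n),
    np.where(labels == 1, 3.0, -3.0) + rng.normal(0, 0.5, n),
    rng.normal(0, 1, n),
])
background = np.column_stack([
    rng.normal(0, 10, n),
    rng.normal(0, 0.5, n),
    rng.normal(0, 1, n),
])

W = ratio_contrastive_directions(target, background, k=1)
pca_top = np.linalg.eigh(covariance(target))[1][:, -1]  # ordinary PCA, top component

print("ordinary PCA direction:  ", np.round(np.abs(pca_top), 2))
print("ratio-based direction:   ", np.round(np.abs(W[:, 0]), 2))
```

In this toy setting, ordinary PCA on the target alone is dominated by the shared nuisance feature, while the ratio-based direction lines up with the feature that separates the clusters. A cPCA-style approach would instead, as commonly formulated, take eigenvectors of the target covariance minus a multiple of the background covariance and repeat that for several values of the multiplier, which is the sweep the summary says cPCA++ avoids.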
DOI:10.1016/j.patcog.2021.108378