A randomized exponential canonical correlation analysis method for data analysis and dimensionality reduction


Bibliographic details
Published in: Applied Numerical Mathematics, Vol. 164, pp. 101-124
Main authors: Wu, Gang; Li, Fei
Format: Journal Article
Language: English
Published: Elsevier B.V., 1 June 2021
ISSN:0168-9274, 1873-5460
Description
Summary: Canonical correlation analysis (CCA) is a well-known data analysis method that has been used successfully in many areas. CCA extracts meaningful information from a pair of data sets by seeking pairs of linear combinations of the two sets of variables with maximum correlation. Mathematically, CCA reduces to solving a large-scale generalized eigenvalue problem. However, when the dimension of the data sets is much larger than the number of samples, CCA may suffer from the small-sample-size (SSS) problem and from over-fitting. To overcome these difficulties, regularization is often applied, but it is difficult to choose the optimal parameter in advance. In this work, we propose an Exponential Canonical Correlation Analysis (ECCA) method based on the matrix exponential, which is parameter-free and can overcome the over-fitting and SSS problems fundamentally. However, the computational overhead of the ECCA method is very high in practice. Based on the randomized singular value decomposition (RSVD), we then propose a Randomized Exponential Canonical Correlation Analysis (RECCA) method for data analysis and dimensionality reduction. Theoretical results are given to justify this randomized method and to establish the relationship between RECCA and ECCA. Numerical experiments on real-world, high-dimensional and large-sample data sets illustrate the superiority of the proposed algorithms over many state-of-the-art CCA algorithms.
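
The summary names the randomized singular value decomposition (RSVD) as the building block that reduces the cost of ECCA. As a rough illustration of that building block only, the following is a minimal sketch of a generic randomized SVD (Gaussian range finder with a few power iterations). It is not the authors' RECCA algorithm; the function name `randomized_svd` and the parameters `oversample` and `n_power_iter` are assumptions chosen for this example.

```python
import numpy as np


def randomized_svd(A, rank, oversample=10, n_power_iter=2, seed=0):
    """Approximate the top-`rank` SVD of A with a random range finder.

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Sample a few extra directions beyond the target rank for accuracy.
    k = min(rank + oversample, min(m, n))

    # Step 1: sketch the range of A with a Gaussian test matrix.
    Omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ Omega)

    # Step 2: power iterations, re-orthonormalizing each time for stability.
    for _ in range(n_power_iter):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)

    # Step 3: project A onto the small subspace and take an exact SVD there.
    B = Q.T @ A                      # shape (k, n), cheap to decompose
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small                  # lift left singular vectors back

    return U[:, :rank], s[:rank], Vt[:rank, :]


if __name__ == "__main__":
    # Synthetic matrix of exact rank 5, used only to exercise the sketch.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
    U, s, Vt = randomized_svd(A, rank=5)
    rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
    print(f"relative reconstruction error: {rel_err:.2e}")
```

The appeal of this approach is that the exact SVD is applied only to the small projected matrix `B`, so the cost is dominated by a handful of matrix products with `A`, which is what makes it attractive for high-dimensional data.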
DOI:10.1016/j.apnum.2020.09.013