Clustering via finite mixture of multivariate factor analysis regression models with clustered predictors

Bibliographic details
Title: Clustering via finite mixture of multivariate factor analysis regression models with clustered predictors
Authors: Xiaoke Qin, Wangshu Tu, Francesca Martella, Sanjeena Dang Subedi
Publisher information: 2023.
Publication year: 2023
Subjects: mixture models, AECM algorithm, factor analyzers, clustering
Description: Mixture models represent a powerful statistical tool for clustering observations, which is an essential task in many fields, such as machine learning, data analysis, and pattern recognition. The multivariate factor analysis regression model (MFARM) can be used to explore the relationship between the observations and predictors, especially when the predictor matrices are of high dimension or contain multicollinearity. Disadvantages of MFARMs are generally related to the potential difficulty of interpreting the resulting factors. Here, we propose a finite mixture of MFARMs for clustering both observations and predictors that similarly predict the responses. In particular, by replacing the factor loading matrix with a binary row-stochastic matrix in the factor analyzer structure, the predictors that similarly predict the responses can be clustered into groups such that a predictor is only associated with one of the factors. An alternating expectation-conditional maximization algorithm is used for parameter estimation. Application of the proposed approach to both simulated and real datasets is presented and discussed.
Document type: Conference object
Language: English
Access URL: https://hdl.handle.net/11573/1717031
Accession number: edsair.od......3686..ac9d62aefdbf4e77c22b5cb75f9cebb1
Database: OpenAIRE
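Illustrative note: the abstract describes replacing the factor loading matrix with a binary row-stochastic matrix, so that each predictor is associated with exactly one factor. The sketch below only illustrates what such a matrix looks like; it is not the authors' implementation or their estimation procedure. The function name `binary_row_stochastic`, the example labels, and the choice of NumPy are assumptions made for illustration.

```python
import numpy as np


def binary_row_stochastic(labels, n_factors):
    """Build a p x q matrix of 0s and 1s with exactly one 1 per row.

    labels[j] is the (hypothetical) index of the factor that predictor j
    is assigned to; every row therefore sums to 1.
    """
    labels = np.asarray(labels)
    loading = np.zeros((labels.size, n_factors), dtype=int)
    loading[np.arange(labels.size), labels] = 1
    return loading


# Hypothetical assignment of 5 predictors to 2 factors (illustrative only).
labels = [0, 0, 1, 1, 0]
Z = binary_row_stochastic(labels, n_factors=2)

# Predictors sharing a column form one predictor cluster.
clusters = {k: np.flatnonzero(Z[:, k]).tolist() for k in range(Z.shape[1])}
```

Under these assumptions, reading the columns of `Z` recovers the predictor groups, which is the sense in which a predictor is only associated with one of the factors.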