Clustering via finite mixture of multivariate factor analysis regression models with clustered predictors

Bibliographic details
Title: Clustering via finite mixture of multivariate factor analysis regression models with clustered predictors
Authors: Xiaoke Qin, Wangshu Tu, Francesca Martella, Sanjeena Dang Subedi
Publisher information: 2023.
Publication year: 2023
Subjects: mixture models, AECM algorithm, factor analyzers, clustering
Description: Mixture models represent a powerful statistical tool for clustering observations, which is an essential task in many fields, such as machine learning, data analysis, and pattern recognition. The multivariate factor analysis regression model (MFARM) can be used to explore the relationship between the observations and predictors, especially when the predictor matrices are of high dimension or contain multicollinearity. Disadvantages of MFARMs are generally related to the potential difficulty of interpreting the resulting factors. Here, we propose a finite mixture of MFARMs for clustering both observations and predictors that similarly predict the responses. In particular, by replacing the factor loading matrix with a binary row-stochastic matrix in the factor analyzer structure, the predictors that similarly predict the responses can be clustered into groups such that a predictor is only associated with one of the factors. An alternating expectation-conditional maximization algorithm is used for parameter estimation. Application of the proposed approach to both simulated and real datasets is presented and discussed.
Document type: Conference object
Language: English
Access URL: https://hdl.handle.net/11573/1717031
Accession number: edsair.od......3686..ac9d62aefdbf4e77c22b5cb75f9cebb1
Database: OpenAIRE
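
To illustrate the binary row-stochastic matrix mentioned in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes NumPy; the helper name `binary_row_stochastic`, the variable names, and the example labels are hypothetical. Each row corresponds to a predictor and each column to a factor, so a single 1 per row means the predictor is associated with exactly one factor.

```python
import numpy as np


def binary_row_stochastic(labels, n_factors):
    """Build a p x q 0/1 matrix with exactly one 1 per row.

    Hypothetical helper for illustration: labels[j] is the index of the
    factor (predictor group) that predictor j is assigned to.
    """
    labels = np.asarray(labels)
    if labels.min() < 0 or labels.max() >= n_factors:
        raise ValueError("labels must lie in 0..n_factors-1")
    loading = np.zeros((labels.size, n_factors), dtype=int)
    # Put a single 1 in each row, in the column of the assigned factor.
    loading[np.arange(labels.size), labels] = 1
    return loading


# Five predictors assigned to two factors (illustrative labels, not from the paper).
labels = [0, 0, 1, 1, 0]
V = binary_row_stochastic(labels, n_factors=2)

row_sums = V.sum(axis=1)     # row-stochastic: every row sums to 1
group_sizes = V.sum(axis=0)  # number of predictors in each group
```

In this sketch `row_sums` is all ones, which is the row-stochastic property, and `group_sizes` counts how many predictors fall into each group.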