A Block EM Algorithm for Multivariate Skew Normal and Skew t-Mixture Models

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 29, No. 11, pp. 5581-5591
Main Authors: Lee, Sharon X., Leemaqz, Kaleb L., McLachlan, Geoffrey J.
Format: Journal Article
Language: English
Published: IEEE, 1 November 2018
ISSN: 2162-237X, 2162-2388
Description
Summary: Finite mixtures of skew distributions provide a flexible tool for modeling heterogeneous data with asymmetric distributional features. However, parameter estimation via the Expectation-Maximization (EM) algorithm can become very time-consuming due to the complicated expressions involved in the E-step that are numerically expensive to evaluate. While parallelizing the EM algorithm can offer considerable speedup in time performance, current implementations focus almost exclusively on distributed platforms. In this paper, we consider instead the most typical operating environment for users of mixture models: a standalone multicore machine and the R programming environment. We develop a block implementation of the EM algorithm that allows the calculations in the E- and M-steps to be spread across a number of threads. We focus on the fitting of finite mixtures of multivariate skew normal and skew t distributions, and show that both the E- and M-steps in the EM algorithm can be modified to allow the data to be split into blocks. Our approach is easy to implement and provides immediate benefits to users of multicore machines. Experiments were conducted on two real data sets to demonstrate the effectiveness of the proposed approach.
DOI: 10.1109/TNNLS.2018.2805317
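
Illustrative sketch (not part of the catalog record and not the authors' implementation): the summary describes splitting the data into blocks so that the E- and M-step calculations can be spread across threads. The Python sketch below shows that general idea under simplifying assumptions: it uses univariate normal components as a stand-in for the multivariate skew normal and skew t components, and the names `block_e_step` and `block_em` are hypothetical. Each block returns partial sums, and the M-step needs only their totals. In standard Python the threads here show the structure rather than a real speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def block_e_step(block, weights, means, variances):
    """E-step on one block: return partial sums and the block log-likelihood.

    Assumes every point has non-zero mixture density (no underflow guard).
    """
    g = len(weights)
    s0 = [0.0] * g  # sum of responsibilities
    s1 = [0.0] * g  # sum of responsibility * x
    s2 = [0.0] * g  # sum of responsibility * x^2
    loglik = 0.0
    for x in block:
        dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(g)]
        total = sum(dens)
        loglik += math.log(total)
        for k in range(g):
            r = dens[k] / total
            s0[k] += r
            s1[k] += r * x
            s2[k] += r * x * x
    return s0, s1, s2, loglik


def block_em(data, weights, means, variances, n_blocks=4, n_iter=50):
    """Fit a normal mixture by EM, with the E-step computed block by block."""
    size = math.ceil(len(data) / n_blocks)
    blocks = [data[i:i + size] for i in range(0, len(data), size)]
    g = len(weights)
    loglik = float("-inf")
    with ThreadPoolExecutor(max_workers=n_blocks) as pool:
        for _ in range(n_iter):
            # One E-step task per block, using the current parameter values.
            results = list(
                pool.map(lambda b: block_e_step(b, weights, means, variances), blocks)
            )
            # Combine the per-block partial sums.
            s0 = [sum(r[0][k] for r in results) for k in range(g)]
            s1 = [sum(r[1][k] for r in results) for k in range(g)]
            s2 = [sum(r[2][k] for r in results) for k in range(g)]
            # Log-likelihood under the parameters used in this E-step.
            loglik = sum(r[3] for r in results)
            # M-step from the combined sums; variance floored for stability.
            weights = [s0[k] / len(data) for k in range(g)]
            means = [s1[k] / s0[k] for k in range(g)]
            variances = [max(s2[k] / s0[k] - means[k] ** 2, 1e-6) for k in range(g)]
    return weights, means, variances, loglik
```

Because the partial sums simply add up, changing `n_blocks` should change the estimates only by floating-point rounding. The skew models in the paper involve more complicated E-step expressions, which this sketch does not attempt to reproduce.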