L1/2-regularized nonnegative matrix factorization for HMM-based sequence representation learning

Detailed bibliography
Published in: Expert Systems with Applications, Vol. 300, Article 130378
Main Authors: Cheng, Lingfang; Chen, Lifei; Zheng, Ping
Format: Journal Article
Language: English
Published: Elsevier Ltd, 05.03.2026
ISSN:0957-4174
Description
Summary: Symbolic sequence representation plays a pivotal role in many resource-constrained expert systems. Recently, Hidden Markov Model (HMM)-based methods have received extensive interest due to their ability to capture underlying structural features with interpretability, especially in representation-learning applications on small sequence sets. However, the performance of existing methods is constrained by the optimization of the hidden states. In this paper, a novel joint optimization method is proposed whose objective is to minimize the data-reconstruction loss based on the HMM state transitions while maximizing the between-state scatter associated with the state emission probability distributions. We propose to measure this scatter by a new L1/2-norm defined on the state emissions, and formulate the representation-learning problem as an L1/2-regularized symmetric nonnegative matrix tri-factorization problem. An efficient matrix factorization algorithm is then derived with a rigorous convergence proof. The proposed method is evaluated experimentally on widely adopted sequence sets, and the results demonstrate its effectiveness and efficiency.
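The abstract does not reproduce the paper's tri-factorization algorithm, but the core ingredient it names, an L1/2 sparsity penalty inside a multiplicative-update NMF scheme, can be illustrated generically. The sketch below is NOT the authors' method: it minimizes ||X − WH||² + λ·Σ√W over nonnegative W, H with the standard multiplicative updates for an L1/2 penalty; all function and parameter names are illustrative assumptions.

```python
import numpy as np

def l12_nmf(X, rank, lam=0.05, n_iter=200, eps=1e-9, seed=0):
    """Illustrative NMF with an L1/2 sparsity penalty on W.

    Minimizes ||X - W H||_F^2 + lam * sum(sqrt(W)) subject to W, H >= 0,
    using multiplicative updates; the L1/2 term adds (lam/2) * W^{-1/2}
    to the denominator of the W update, shrinking small entries toward 0.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive init keeps the multiplicative updates well-defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Unpenalized update for H.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        # L1/2-penalized update for W (gradient of lam * sum(W**0.5)
        # contributes the 0.5 * lam * W**-0.5 term).
        W *= (X @ H.T) / (W @ H @ H.T + 0.5 * lam * W ** -0.5 + eps)
    return W, H
```

Because the updates are multiplicative on positive initial values, nonnegativity is preserved automatically; raising `lam` trades reconstruction accuracy for sparser factors, which is the role the L1/2 regularizer plays in the paper's between-state scatter objective.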
DOI:10.1016/j.eswa.2025.130378