L1/2-regularized nonnegative matrix factorization for HMM-based sequence representation learning

Bibliographic Details
Published in: Expert Systems with Applications, Vol. 300, p. 130378
Main Authors: Cheng, Lingfang; Chen, Lifei; Zheng, Ping
Format: Journal Article
Language: English
Published: Elsevier Ltd, 05.03.2026
ISSN: 0957-4174
Description
Summary: Symbolic sequence representation plays a pivotal role in many resource-constrained expert systems. Recently, Hidden Markov Model (HMM)-based methods have attracted extensive interest for their ability to capture underlying structural features while remaining interpretable, especially for representation learning on small sequence sets. However, the performance of existing methods is constrained by how the hidden states are optimized. In this paper, a novel joint optimization method is proposed whose objective is to minimize the data-reconstruction loss based on the HMM state transitions while maximizing the between-state scatter associated with the state emission probability distributions. We propose to measure this scatter with a new L1/2-norm defined on the state emissions, and we formulate the representation learning problem as an L1/2-regularized symmetric nonnegative matrix tri-factorization problem. An efficient matrix factorization algorithm is then derived, with a rigorous convergence proof. The proposed method is evaluated experimentally on widely adopted sequence sets, and the results demonstrate its effectiveness and efficiency.
DOI: 10.1016/j.eswa.2025.130378
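
The summary above names the technique only at a high level. As a rough illustration of the general idea, the following is a minimal sketch of L1/2-regularized symmetric nonnegative matrix tri-factorization with multiplicative updates, assuming a generic objective ||A - H S H^T||_F^2 + lam * sum_ij sqrt(H_ij). The paper's actual regularizer is defined on the HMM state emission distributions and its algorithm carries a convergence proof; neither is reproduced here, and all names in this sketch are hypothetical.

```python
import numpy as np

def l12_symmetric_tri_nmf(A, k, lam=0.1, n_iter=500, eps=1e-10, seed=0):
    """Hypothetical sketch: multiplicative updates for
        min_{H>=0, S>=0}  ||A - H S H^T||_F^2 + lam * sum_ij sqrt(H_ij)
    where A is a symmetric nonnegative (n x n) matrix, H is (n x k),
    and S is a symmetric nonnegative (k x k) mixing matrix.
    This generic sketch is NOT the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k)) + eps
    S = rng.random((k, k))
    S = (S + S.T) / 2 + eps                      # symmetric initialization

    for _ in range(n_iter):
        # S update (standard tri-factorization step); it preserves the
        # symmetry of S because numerator and denominator are symmetric.
        HtH = H.T @ H
        S *= (H.T @ A @ H) / (HtH @ S @ HtH + eps)

        # H update with the usual square-root damping for symmetric factors.
        # The L1/2 penalty contributes (lam/2) * H^{-1/2} to the positive
        # part of the gradient, i.e. (lam/8) * H^{-1/2} once the factor of 4
        # from the reconstruction term is divided out.
        HS = H @ S
        num = A @ HS
        den = HS @ (H.T @ HS) + (lam / 8.0) / np.sqrt(H + eps) + eps
        H *= np.sqrt(num / den)

    return H, S

# Toy usage on a random symmetric similarity matrix.
rng = np.random.default_rng(1)
X = rng.random((30, 30))
A = (X + X.T) / 2
H, S = l12_symmetric_tri_nmf(A, k=4, lam=0.05)
print("reconstruction error:", np.linalg.norm(A - H @ S @ H.T))
```

Because every update is multiplicative and the factors are initialized strictly positive, H and S stay nonnegative throughout, which is the usual rationale for this style of NMF solver; the L1/2 term enters only through the denominator, shrinking small entries of H toward zero and thereby encouraging the sparse, well-separated state structure the abstract describes.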