A Hybrid Framework for Real-Time Data Drift and Anomaly Identification Using Hierarchical Temporal Memory and Statistical Tests
| Title: | A Hybrid Framework for Real-Time Data Drift and Anomaly Identification Using Hierarchical Temporal Memory and Statistical Tests |
|---|---|
| Authors: | Subhadip Bandyopadhyay, Joy Bose, Sujoy Roy Chowdhury |
| Source: | International Journal of Mathematical, Engineering and Management Sciences, Vol 10, Iss 3, Pp 777-796 (2025) |
| Publication Status: | Preprint |
| Publisher information: | Ram Arti Publishers, 2025. |
| Year of publication: | 2025 |
| Topics: | telecom network monitoring, FOS: Computer and information sciences, Technology, Computer Science - Machine Learning, real-time anomaly detection, I.2.6, I.2.7, G.3, 62M10, 62P30, 68T07, data drift detection, H.2.8, H.3.3, Machine Learning (cs.LG), hybrid machine learning models, ai powered data drift detection, QA1-939, hierarchical temporal memory (htm), time series, Mathematics, sequential probability ratio test (sprt), streaming data analysis |
| Description: | Data drift refers to the phenomenon where the generating model behind the data changes over time. Due to data drift, any model built on past training data becomes less relevant and less accurate over time. Thus, detecting and controlling for data drift is critical in machine learning models. Hierarchical Temporal Memory (HTM) is a machine learning model developed by Jeff Hawkins, inspired by how the human brain processes information. It is a biologically inspired model of memory, similar in structure to the neocortex, whose performance is claimed to be comparable to state-of-the-art models in detecting anomalies in time series data. Another benefit of HTMs is their independence from training and testing cycles: all learning takes place online with streaming data. In the sequential learning paradigm, the Sequential Probability Ratio Test (SPRT) offers unique benefits for online learning and inference. This paper proposes a novel hybrid framework combining HTM and SPRT for real-time data drift detection and anomaly identification. Unlike existing data drift methods, our approach eliminates frequent retraining and ensures low false positive rates. HTMs currently work with one-dimensional (univariate) data. In a second study, we also propose an application of HTM in a multidimensional supervised scenario for anomaly detection by combining the outputs of multiple HTM columns, one for each data dimension, through a neural network. Experimental evaluations demonstrate that the proposed method outperforms conventional drift detection techniques such as the Kolmogorov-Smirnov (KS) test, Wasserstein distance, and Population Stability Index (PSI) in terms of accuracy, adaptability, and computational efficiency. Our experiments also provide insights into optimizing hyperparameters for real-time deployment in domains such as telecom. |
| Document type: | Article |
| Language: | English |
| ISSN: | 2455-7749 |
| DOI: | 10.33889/ijmems.2025.10.3.039 |
| DOI: | 10.48550/arxiv.2504.18599 |
| Access URL: | http://arxiv.org/abs/2504.18599 https://doaj.org/article/64d306aa2fe347eb8fa0d5c9f0c2af63 https://doi.org/10.33889/IJMEMS.2025.10.3.039 |
| Rights: | CC BY |
| Accession number: | edsair.doi.dedup.....b08e1fdfe1537e8ce3c16077f5521873 |
| Database: | OpenAIRE |
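
The abstract names the Sequential Probability Ratio Test (SPRT) as the statistical component of the framework but, being a summary, does not say how it is applied. As a rough illustration only, the sketch below shows the textbook Wald SPRT for a shift in the mean of a Gaussian stream with known standard deviation. It is not the authors' method: the function name `sprt_gaussian_mean`, the Gaussian assumption, the parameters `mu0`, `mu1`, `sigma`, and the default error rates are all assumptions made for this example.

```python
import math
from typing import Iterable, Tuple


def sprt_gaussian_mean(
    samples: Iterable[float],
    mu0: float,
    mu1: float,
    sigma: float,
    alpha: float = 0.05,
    beta: float = 0.05,
) -> Tuple[str, int]:
    """Wald SPRT for H0: mean == mu0 versus H1: mean == mu1 (hypothetical helper).

    Assumes independent Gaussian samples with known standard deviation sigma.
    alpha is the target false positive rate, beta the target false negative rate.
    Returns (decision, number_of_samples_consumed).
    """
    # Wald's approximate decision thresholds on the log-likelihood ratio.
    upper = math.log((1 - beta) / alpha)   # cross this: accept H1 (drift)
    lower = math.log(beta / (1 - alpha))   # cross this: accept H0 (no drift)

    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment for one Gaussian observation.
        llr += (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2)
        if llr >= upper:
            return "drift", n
        if llr <= lower:
            return "no_drift", n
    # The stream ended before either threshold was crossed.
    return "undecided", n
```

The test consumes one observation at a time and stops as soon as the accumulated evidence crosses a threshold, which is why it suits streaming data: no fixed sample size has to be chosen in advance.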