Detecting Outliers Using Modified Recursive PCA Algorithm For Dynamic Streaming Data

Bibliographic details
Published in: Mendel (Brno, Czech Republic), Volume 29, Issue 2, pp. 237-244
Main authors: Dani, Yasi; Gunawan, Agus Yodi; Khodra, Masayu Leylia; Indratno, Sapto Wahyu
Format: Journal Article
Language: English
Published: Brno University of Technology, 20 December 2023
ISSN: 1803-3814, 2571-3701
Description
Summary: Outlier analysis has been widely studied and has produced many methods. However, methods for detecting outliers in dynamically streaming batch data (online learning) are still rare. This research proposes a novel online algorithm to detect outliers in such data. Data points are processed by a modified recursive PCA that sequentially predicts the model parameters: the eigenvalues and eigenvectors of the statistical detection model are recursively updated using approximate values obtained by perturbation methods. More specifically, the recursive eigenstructure is derived from the covariance matrix using the first-order perturbation technique. The Mahalanobis distance is then used as the outlier score. The algorithm's performance is evaluated using several metrics, namely accuracy, precision, recall, F1-score, AUC-PR, and execution time. Results show that the proposed online outlier detection is computationally efficient in time and that its effectiveness is comparable to that of offline outlier detection via classical PCA.
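Illustration: the summary names three ingredients, namely a recursively updated covariance eigenstructure, a first-order perturbation approximation, and a Mahalanobis-distance score. The Python sketch below shows one generic way these pieces can fit together. It is not the authors' algorithm: the function names (fit_batch, recursive_update, mahalanobis_sq), the population-covariance update rule, and the assumption of distinct eigenvalues are all choices made here for illustration, and no detection threshold is included because the record does not state one.

```python
import numpy as np

def fit_batch(X):
    """Initial (offline) fit: mean and eigen-decomposition of the covariance."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False, bias=True)   # population covariance (divide by n)
    eigvals, eigvecs = np.linalg.eigh(cov)     # ascending eigenvalues, column eigenvectors
    return len(X), mean, eigvals, eigvecs

def mahalanobis_sq(x, mean, eigvals, eigvecs):
    """Squared Mahalanobis distance, computed in the eigenvector basis."""
    proj = eigvecs.T @ (x - mean)              # coordinates along each eigenvector
    return float(np.sum(proj**2 / eigvals))

def recursive_update(x, n, mean, eigvals, eigvecs):
    """Fold one new point into the model without a fresh eigen-decomposition.

    The population covariance obeys C_new = a*C + b*d d^T with d = x - mean,
    a = n/(n+1), b = n/(n+1)^2. The term b*d d^T is treated as a first-order
    perturbation of a*C. Assumes the eigenvalues are distinct.
    """
    d = x - mean
    a = n / (n + 1.0)
    b = n / (n + 1.0) ** 2
    lam = a * eigvals                          # exact eigenvalues of a*C
    z = eigvecs.T @ d                          # z[i] = v_i . d
    new_vals = lam + b * z**2                  # lambda_i + v_i^T E v_i
    gap = lam[None, :] - lam[:, None]          # gap[j, i] = lam_i - lam_j
    np.fill_diagonal(gap, np.inf)              # drop the j == i term
    coef = b * np.outer(z, z) / gap            # weight of v_j in the correction to v_i
    new_vecs = eigvecs + eigvecs @ coef        # v_i + sum_{j != i} coef[j, i] * v_j
    new_vecs /= np.linalg.norm(new_vecs, axis=0)   # renormalize each column
    new_mean = mean + d / (n + 1.0)
    return n + 1, new_mean, new_vals, new_vecs

# Usage on synthetic data (made up here, not from the paper)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([3.0, 2.0, 1.0])
n, mean, eigvals, eigvecs = fit_batch(X)

x_new = np.array([1.0, -0.5, 0.3])
score = mahalanobis_sq(x_new, mean, eigvals, eigvecs)   # score first, then update
n, mean, eigvals, eigvecs = recursive_update(x_new, n, mean, eigvals, eigvecs)
```

The first-order approximation is only accurate when the perturbation is small relative to the gaps between eigenvalues, so in a sketch like this a very distant point or nearly equal eigenvalues would degrade the update.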
DOI: 10.13164/mendel.2023.2.237