Graph autoencoder-based unsupervised outlier detection

Bibliographic details
Published in: Information Sciences, Vol. 608, pp. 532-550
Main authors: Du, Xusheng, Yu, Jiong, Chu, Zheng, Jin, Lina, Chen, Jiaying
Format: Journal Article
Language: English
Publication details: Elsevier Inc, 1 August 2022
ISSN: 0020-0255, 1872-6291
Description
Summary: Outlier detection technologies play an important role in various application domains. Most existing outlier detection algorithms have difficulty detecting outliers that are mixed within normal object regions or around dense clusters. To address this problem, we propose a novel graph neural network structure called the graph autoencoder (GAE), which is capable of handling the task of outlier detection in Euclidean structured data. The GAE performs feature value propagation in the form of a neural network, changing the distribution pattern of the original dataset so that outliers with low deviation can be detected accurately. The method first converts the Euclidean structured dataset into a graph using the graph generation module, then inputs the dataset together with its corresponding graph into the GAE for training, and finally identifies as outliers the top-n objects that are difficult to reconstruct in the output layer of the GAE. The results of comparing eight state-of-the-art algorithms on eight real-world datasets showed that GAE achieved the highest area under the receiver operating characteristic curve (ROC AUC) on six datasets. Compared with the autoencoder-based outlier detection algorithm, the proposed method improved the AUC by 16.9% on average across the eight datasets.
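The three stages named in the summary (graph generation, feature propagation with reconstruction, top-n selection) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The record does not say how the graph is generated or how the GAE is built, so the k-nearest-neighbour graph, the row-normalised averaging step, and the closed-form low-rank linear reconstruction (used here in place of a trained neural autoencoder) are all assumptions. The function names and parameters (`knn_graph`, `propagate`, `top_n_outliers`, `k`, `n_out`, `latent_dim`) are hypothetical as well.

```python
import numpy as np


def knn_graph(X, k):
    """Assumed graph generation step: symmetric k-NN adjacency with self-loops."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)              # never pick a point as its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]      # indices of the k closest points per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    A = np.maximum(A, A.T)                    # make the graph undirected
    return A + np.eye(n)                      # self-loops keep each point's own features


def propagate(A, X):
    """Assumed propagation rule: average features over each node's neighbourhood."""
    return (A / A.sum(axis=1, keepdims=True)) @ X


def top_n_outliers(X, k=5, n_out=3, latent_dim=1):
    """Return indices of the n_out hardest-to-reconstruct objects and all scores."""
    H = propagate(knn_graph(X, k), X)
    mu = H.mean(axis=0)
    # Stand-in for a trained encoder/decoder: project the propagated features
    # onto their top latent_dim principal directions and map back.
    _, _, Vt = np.linalg.svd(H - mu, full_matrices=False)
    V = Vt[:latent_dim].T
    X_rec = (H - mu) @ V @ V.T + mu
    # Assumed score: squared error between the original features and the output.
    scores = ((X - X_rec) ** 2).sum(axis=1)
    return np.argsort(scores)[::-1][:n_out], scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 3))
    idx, scores = top_n_outliers(X, k=4, n_out=5, latent_dim=2)
    print("flagged objects:", idx)
```

In this sketch the neighbourhood averaging pulls each object toward its neighbours before reconstruction, which is one plausible reading of "changes the distribution pattern of the original dataset". A faithful reproduction would replace the SVD step with the trained network described in the full paper.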
DOI:10.1016/j.ins.2022.06.039