Graph autoencoder-based unsupervised outlier detection

Bibliographic details
Published in: Information Sciences, Vol. 608, pp. 532-550
Main authors: Du, Xusheng; Yu, Jiong; Chu, Zheng; Jin, Lina; Chen, Jiaying
Format: Journal Article
Language: English
Published: Elsevier Inc, 1 August 2022
ISSN: 0020-0255, 1872-6291
Description
Summary: Outlier detection technologies play an important role in various application domains. Most existing outlier detection algorithms have difficulty detecting outliers that are mixed within normal object regions or located around dense clusters. To address this problem, we propose a novel graph neural network structure, the graph autoencoder (GAE), that can detect outliers in Euclidean structured data. The GAE propagates feature values through a neural network, which changes the distribution pattern of the original dataset and allows outliers with low deviation to be detected accurately. The method first converts the Euclidean structured dataset into a graph using the graph generation module, then inputs the dataset together with its corresponding graph into the GAE for training, and finally labels as outliers the top-n objects that are most difficult to reconstruct in the output layer of the GAE. In a comparison against eight state-of-the-art algorithms on eight real-world datasets, GAE achieved the highest area under the receiver operating characteristic curve (ROC AUC) on six datasets. Compared with the autoencoder-based outlier detection algorithm, the proposed method improved the AUC by 16.9% on average across the eight datasets.
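The three steps named in the summary (graph generation, training on the data together with its graph, ranking objects by reconstruction difficulty) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the summary does not specify the graph generation module or the network layout, so the k-nearest-neighbour graph, the row-normalised adjacency with self-loops, the single tanh hidden layer, the full-batch gradient descent, and the function names `knn_graph`, `gae_scores` and `top_n_outliers` are all assumptions made for illustration.

```python
import numpy as np


def knn_graph(X, k):
    """Assumed graph generation step: row-normalised kNN adjacency with self-loops."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # squared distances
    np.fill_diagonal(d2, np.inf)               # a point is not its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]       # indices of the k nearest points
    A = np.eye(n)                              # self-loops
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return A / A.sum(axis=1, keepdims=True)    # each row sums to 1


def gae_scores(X, k=5, hidden=2, epochs=500, lr=0.05, seed=0):
    """Train a tiny graph autoencoder and return per-object reconstruction errors.

    Hypothetical architecture: Y = P tanh(P X W1) W2, where P is the
    propagation matrix returned by knn_graph.
    """
    rng = np.random.default_rng(seed)
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # standardise features
    P = knn_graph(X, k)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, hidden))
    W2 = rng.normal(scale=0.1, size=(hidden, d))
    PX = P @ X                                 # first feature propagation
    for _ in range(epochs):
        H = np.tanh(PX @ W1)                   # encoder
        PH = P @ H                             # second feature propagation
        E = PH @ W2 - X                        # reconstruction residual
        gY = 2.0 * E / n                       # gradient of sum(E**2) / n
        gW2 = PH.T @ gY
        gH = P.T @ (gY @ W2.T)
        gW1 = PX.T @ (gH * (1.0 - H ** 2))     # back through tanh
        W1 -= lr * gW1
        W2 -= lr * gW2
    recon = (P @ np.tanh(PX @ W1)) @ W2
    return ((recon - X) ** 2).sum(axis=1)


def top_n_outliers(scores, n):
    """Indices of the n objects that are hardest to reconstruct."""
    return np.argsort(scores)[::-1][:n]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = np.vstack([rng.normal(size=(40, 2)), [[10.0, 10.0]]])
    print(top_n_outliers(gae_scores(data), 3))
```

Because each object's features are averaged with those of its neighbours before and after encoding, an object whose values differ from its neighbourhood is reconstructed poorly, which is the property the top-n ranking relies on. The values of `k`, `hidden`, `epochs` and `lr` are arbitrary defaults chosen for this sketch, not settings reported in the record.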
DOI: 10.1016/j.ins.2022.06.039