Knowledge-integrated autoencoder model

Bibliographic Details
Published in:Expert systems with applications Vol. 252; p. 124108
Main Authors: Lazebnik, Teddy, Simon-Keren, Liron
Format: Journal Article
Language:English
Published: Elsevier Ltd 15.10.2024
ISSN:0957-4174, 1873-6793
Description
Summary:Data encoding is a common and central operation in most data analysis tasks. The performance of models downstream in the computational process depends heavily on the quality of the data encoding. One of the most powerful ways to encode data is the neural network AutoEncoder (AE) architecture. However, AE developers cannot easily influence the resulting embedding space, because the architecture is usually treated as a black-box technique. As a result, the embedding space is uncontrollable and does not necessarily possess the properties desired for downstream tasks. This paper introduces a novel approach for developing AE models that can integrate external knowledge sources into the learning process, possibly leading to more accurate results. The proposed Knowledge-integrated AutoEncoder (KiAE) model can leverage domain-specific information to ensure that the desired distance and neighborhood properties between samples are preserved in the embedding space. The proposed model is evaluated on three large-scale datasets from three scientific fields and is compared to nine existing encoding models. The results demonstrate that the KiAE model effectively captures the underlying structures and relationships between the input data and external knowledge, meaning that it generates a more useful representation. Consequently, it outperforms the other models in terms of reconstruction accuracy.
• To control the latent space, knowledge of relative distances can be integrated using a distance matrix.
• The KiAE method can capture desired properties even on a small dataset (<200 samples).
• The knowledge matrix does not need to be fully specified for the dataset, thanks to the use of meta-regression.
DOI:10.1016/j.eswa.2024.124108
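
The summary describes steering the latent space by integrating knowledge of relative distances through a distance matrix. The following is a minimal sketch of what such an objective could look like, not the formulation used in the paper. The function names, the squared-error form of the distance penalty, the `weight` parameter, and the use of `None` for unknown matrix entries are all assumptions made for illustration. In particular, simply skipping unknown entries is a stand-in for, and not an implementation of, the meta-regression mentioned in the summary.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reconstruction_loss(inputs, reconstructions):
    """Mean squared error between inputs and their reconstructions."""
    total, count = 0.0, 0
    for x, x_hat in zip(inputs, reconstructions):
        for xi, xhi in zip(x, x_hat):
            total += (xi - xhi) ** 2
            count += 1
    return total / count


def knowledge_loss(embeddings, knowledge):
    """Mean squared gap between latent distances and target distances.

    `knowledge[i][j]` is the desired distance between samples i and j,
    or None when no external knowledge is available for that pair
    (hypothetical convention; such pairs are skipped).
    """
    total, count = 0.0, 0
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            target = knowledge[i][j]
            if target is None:
                continue
            gap = euclidean(embeddings[i], embeddings[j]) - target
            total += gap ** 2
            count += 1
    return total / count if count else 0.0


def combined_loss(inputs, reconstructions, embeddings, knowledge, weight=1.0):
    """Reconstruction error plus a weighted distance-preservation penalty."""
    return (reconstruction_loss(inputs, reconstructions)
            + weight * knowledge_loss(embeddings, knowledge))


if __name__ == "__main__":
    inputs = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
    reconstructions = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
    embeddings = [[0.0], [3.0], [10.0]]  # one-dimensional latent codes
    knowledge = [
        [0.0, 5.0, None],  # distance between samples 0 and 2 is unknown
        [5.0, 0.0, 5.0],
        [None, 5.0, 0.0],
    ]
    print(combined_loss(inputs, reconstructions, embeddings, knowledge, weight=0.5))
```

In a real autoencoder the embeddings and reconstructions would come from the encoder and decoder networks, and the combined quantity would be minimized by gradient descent; the `weight` term then trades reconstruction accuracy against how strictly the latent distances follow the external knowledge.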