Self-supervised autoencoders for clustering and classification

Bibliographic details
Published in: Evolving Systems, Vol. 11, No. 3, pp. 453-466
Main authors: Nousi, Paraskevi; Tefas, Anastasios
Format: Journal Article
Language: English
Published: Berlin/Heidelberg, Springer Berlin Heidelberg, 01.09.2020
ISSN:1868-6478, 1868-6486
Description
Summary: Clustering techniques aim at finding meaningful groups of data samples which exhibit similarity with regard to a set of characteristics, typically measured in terms of pairwise distances. Due to the so-called curse of dimensionality, i.e., the observation that high-dimensional spaces are unsuited for measuring distances, distance-based clustering techniques such as the classic k-means algorithm fail to uncover meaningful clusters in high-dimensional spaces. Thus, dimensionality reduction techniques can be used to greatly improve the performance of such clustering methods. In this work, we study Autoencoders as Deep Learning tools for dimensionality reduction, and combine them with k-means clustering to learn low-dimensional representations which improve the clustering performance by enhancing intra-cluster relationships and suppressing inter-cluster ones, in a self-supervised manner. In the supervised paradigm, distance-based classifiers may also greatly benefit from robust dimensionality reduction techniques. The proposed method is evaluated via multiple experiments on datasets of handwritten digits, various objects and faces, and is shown to improve external cluster quality measuring criteria. A fully supervised counterpart is also evaluated on two face recognition datasets, and is shown to improve the performance of various lightweight classifiers, allowing their use in real-time applications on devices with limited computational resources, such as Unmanned Aerial Vehicles (UAVs).
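The summary's remark that high-dimensional spaces are unsuited for measuring distances can be illustrated with a small experiment. The sketch below is not the authors' method and is not taken from the paper; it is only an assumption-laden illustration in which points are drawn uniformly from the unit cube and the hypothetical helper `relative_contrast` reports how far apart the nearest and farthest neighbours of a query point are, relative to the nearest one. As the dimension grows, this contrast tends to shrink, which is the effect that motivates reducing dimensionality before running a distance-based method such as k-means.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for the Euclidean distances from one
    random query point to n_points random points in the unit cube [0, 1]^dim.

    Illustrative helper only: the uniform data, the sample size and the
    seed are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the contrast for a few dimensions to see the trend.
    for dim in (2, 10, 100, 1000):
        print(f"dim={dim:5d}  relative contrast={relative_contrast(dim):.3f}")
```

When the contrast is close to zero, nearest and farthest neighbours are almost equally distant, so distance-based cluster assignments carry little information.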
DOI:10.1007/s12530-018-9235-y