AutoConf: Automated Configuration of Unsupervised Learning Systems Using Metamorphic Testing and Bayesian Optimization

Bibliographic details
Published in:	IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 1326-1338
Main authors:	Shar, Lwin Khin; Goknil, Arda; Husom, Erik Johannes; Sen, Sagar; Tun, Yan Naing; Kim, Kisub
Format:	Conference paper
Language:	English
Published:	IEEE, 11 September 2023
Subjects:	Bayes methods; Clustering algorithms; Complexity theory; Manuals; Measurement; Optimization; Unsupervised learning
ISSN:	2643-1572

Description
Summary:	Unsupervised learning systems using clustering have gained significant attention for numerous applications due to their unique ability to discover patterns and structures in large unlabeled datasets. However, their effectiveness highly depends on their configuration, which requires domain-specific expertise and often involves numerous manual trials. Specifically, selecting appropriate algorithms and hyperparameters adds to the complexity of the configuration process. In this paper, we propose, apply, and assess an automated approach (AutoConf) for configuring unsupervised learning systems using clustering, leveraging metamorphic testing and Bayesian optimization. Metamorphic testing is utilized to verify the configurations of unsupervised learning systems by applying a series of input transformations. We use Bayesian optimization guided by metamorphic-testing output to automatically identify the optimal configuration. The approach aims to streamline the configuration process and enhance the effectiveness of unsupervised learning systems. It has been evaluated through experiments on six datasets from three domains for anomaly detection. The evaluation results show that our approach can find configurations outperforming the baseline approaches as they achieved a recall of 0.89 and a precision of 0.84 (on average).
DOI:	10.1109/ASE56229.2023.00094