Ensemble Masked Autoencoder and Gaussian Mixture Models for Cardinality Estimation

Bibliographic Details
Published in: 2023 5th International Conference on Data-driven Optimization of Complex Systems (DOCS), pp. 1-7
Main Authors: Chen, Yimeng; Jia, Devin; Lei, Ying; Li, Bo
Format: Conference Proceeding
Language: English
Published: IEEE 22.09.2023
Description
Summary: In recent years, deep learning models have been employed for cardinality estimation in databases. These models fall into query-driven and data-driven approaches. Query-driven methods rely on training queries and lack generalization ability, while data-driven methods infer cardinalities by learning the joint distribution of the data in the database, independent of specific queries. Some data-driven methods use deep autoregressive models, but they are tied to a fixed attribute order during training and require large embedding matrices for continuous attributes. To overcome these challenges, we propose MAGMM, a data-driven method that combines a masked autoencoder with Gaussian mixture models. Experimental results demonstrate that MAGMM performs well in terms of estimation accuracy and inference time, providing a robust solution for cardinality estimation in databases.
DOI: 10.1109/DOCS60977.2023.10294693
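
Illustrative sketch: The summary notes that data-driven estimators answer queries from a learned data distribution, and that Gaussian mixture models can represent continuous attributes. The Python sketch below shows only that general idea for a single continuous column and a range predicate. It is not the MAGMM model from the paper; the function names, the mixture parameters, and the row count are all hypothetical and chosen for illustration.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def gmm_range_selectivity(components, low, high):
    """Probability mass a Gaussian mixture assigns to the interval [low, high].

    components: list of (weight, mean, std) tuples whose weights sum to 1.
    """
    return sum(
        weight * (normal_cdf(high, mean, std) - normal_cdf(low, mean, std))
        for weight, mean, std in components
    )


def estimate_cardinality(components, low, high, row_count):
    """Estimated number of rows with low <= attribute <= high."""
    return gmm_range_selectivity(components, low, high) * row_count


# Hypothetical mixture for one continuous column (assumed, not from the paper).
components = [(0.6, 30.0, 5.0), (0.4, 60.0, 10.0)]

# Estimate for a predicate such as: WHERE attr BETWEEN 25 AND 35
estimate = estimate_cardinality(components, 25.0, 35.0, row_count=10_000)
print(f"estimated rows: {estimate:.0f}")
```

In this sketch the selectivity is the mixture's probability mass inside the predicate range, scaled by the table's row count. A multi-attribute estimator such as the one described in the summary would additionally need to model dependencies between columns, which this single-column example does not attempt.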