Ensemble Masked Autoencoder and Gaussian Mixture Models for Cardinality Estimation

Bibliographic Details
Published in: 2023 5th International Conference on Data-driven Optimization of Complex Systems (DOCS), pp. 1-7
Main Authors: Chen, Yimeng; Jia, Devin; Lei, Ying; Li, Bo
Format: Conference Proceeding
Language: English
Published: IEEE 22.09.2023
Abstract: In recent years, deep learning models have been applied to cardinality estimation in databases. These models fall into query-driven and data-driven approaches. Query-driven methods rely on training queries and generalize poorly, while data-driven methods infer cardinalities by learning the joint distribution of the data in the database, independent of specific queries. Some data-driven methods use deep autoregressive models, but these fix the attribute order during training and require large embedding matrices for continuous attributes. To overcome these challenges, we propose MAGMM, a data-driven method that combines a masked autoencoder with Gaussian mixture models. Experimental results show that MAGMM performs well in both estimation accuracy and inference time, providing a robust solution for cardinality estimation in databases.
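Illustrative note: the abstract says Gaussian mixture models handle continuous attributes but gives no implementation details. The sketch below is therefore not MAGMM itself. It only shows the general idea of estimating the cardinality of a range predicate on a single continuous column from an already fitted one-dimensional Gaussian mixture, without an embedding matrix. All function names and parameters are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def gmm_range_selectivity(low, high, weights, means, stds):
    """Probability mass a 1-D Gaussian mixture assigns to [low, high].

    Assumption: weights sum to 1 and the mixture was fitted to the
    column beforehand (fitting is out of scope for this sketch).
    """
    return sum(
        w * (normal_cdf(high, m, s) - normal_cdf(low, m, s))
        for w, m, s in zip(weights, means, stds)
    )


def estimate_cardinality(n_rows, low, high, weights, means, stds):
    """Estimated row count for the predicate low <= column <= high."""
    return n_rows * gmm_range_selectivity(low, high, weights, means, stds)


if __name__ == "__main__":
    # Hypothetical two-component mixture over a numeric column.
    weights, means, stds = [0.3, 0.7], [10.0, 50.0], [2.0, 5.0]
    print(estimate_cardinality(100_000, 45.0, 55.0, weights, means, stds))
```

The estimate is the table size multiplied by the mixture's probability mass inside the queried range, so the model stores only a few parameters per component regardless of how many distinct values the column holds.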
Author_xml – sequence: 1
  givenname: Yimeng
  surname: Chen
  fullname: Chen, Yimeng
  email: 202132803176@stu.hebut.edu.cn
  organization: School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
– sequence: 2
  givenname: Devin
  surname: Jia
  fullname: Jia, Devin
  email: jiadx@inspur.com
  organization: Shandong Inspur Database Technology Co., Ltd., Tianjin, China
– sequence: 3
  givenname: Ying
  surname: Lei
  fullname: Lei, Ying
  email: leiying.hebut@gmail.com
  organization: School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
– sequence: 4
  givenname: Bo
  surname: Li
  fullname: Li, Bo
  email: 394078839@qq.com
  organization: Tianjin Jizhong Technology Co., Ltd., Tianjin, China
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/DOCS60977.2023.10294693
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 9798350393521
EndPage 7
ExternalDocumentID 10294693
Genre orig-research
ID FETCH-LOGICAL-i119t-c554236a3e7e760f88f3c1bccf19923c7ff45cc27ea4b2091e42f74c8e6d5b1e3
IEDL.DBID RIE
IngestDate Wed Jan 10 09:28:11 EST 2024
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i119t-c554236a3e7e760f88f3c1bccf19923c7ff45cc27ea4b2091e42f74c8e6d5b1e3
PageCount 7
ParticipantIDs ieee_primary_10294693
PublicationCentury 2000
PublicationDate 2023-Sept.-22
PublicationDateYYYYMMDD 2023-09-22
PublicationDate_xml – month: 09
  year: 2023
  text: 2023-Sept.-22
  day: 22
PublicationDecade 2020
PublicationTitle 2023 5th International Conference on Data-driven Optimization of Complex Systems (DOCS)
PublicationTitleAbbrev DOCS
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
Score 1.845043
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms cardinality estimation
Complex systems
data-driven
Deep learning
Estimation
Gaussian mixture model
masked autoencoder
Mixture models
Training
Title Ensemble Masked Autoencoder and Gaussian Mixture Models for Cardinality Estimation
URI https://ieeexplore.ieee.org/document/10294693