Exploring Fine-Grained In-Memory Database Performance for Modern CPUs

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, Vol. 34, No. 6, pp. 1-16
Main Authors: Liu, Zhuan; Han, Ruichen; Zhang, Yansong; Zhang, Yu; Tang, Xi; Deng, Gang; Zhong, Tao; Dementiev, Roman; Lu, Yunfei; Que, Mingjian
Format: Journal Article
Language: English
Published: New York: IEEE, 1 June 2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1045-9219, 1558-2183
Online Access: Full text
Abstract: Modern CPUs keep integrating more cores and larger caches, which benefits in-memory databases by improving parallel processing power and cache locality. State-of-the-art CPUs follow diverse architectures and roadmaps: large core count and large cache size (AMD x86), moderate core count and cache size (Intel x86), and large core count with moderate cache size (ARM). Exploring the performance characteristics of in-memory databases on these different CPU architectures is therefore important for in-memory database design and optimization. In this paper, we develop a fine-grained in-memory database benchmark that evaluates the performance of each operator on different CPUs to explore how CPU hardware architecture influences performance. Contrary to the well-known conclusion that more cores and larger caches yield higher performance, we find that the cache micro-architecture plays an important role that can run counter to core count and cache size: a shared monolithic L3 cache of moderate size beats a large disaggregated L3 cache. The experiments also show that predicting operator performance on different CPUs is difficult because of the diverse CPU architectures and cache micro-architectures, and that no implementation of an operator is uniformly fast or slow: each shows interleaved regions of strong and weak performance, influenced by the CPU hardware architecture. Intel x86 CPUs represent a cache-centric processor design, while AMD x86 and ARM CPUs represent a computing-centric processor design. OLAP benchmark experiments with SSB show that OmniSciDB and OLAP Accelerator, which use a vector-wise processing model, perform better on Intel x86 CPUs than on AMD x86 CPUs, whereas the JIT-compilation-based Hyper favors AMD x86 CPUs over Intel x86 CPUs. CPU roadmaps that increase core counts or improve cache locality should be considered in in-memory database algorithm design and platform selection.
Author_xml – sequence: 1
  givenname: Zhuan
  orcidid: 0000-0001-7269-0280
  surname: Liu
  fullname: Liu, Zhuan
  organization: Intel Corp, China
– sequence: 2
  givenname: Ruichen
  orcidid: 0009-0008-8561-3979
  surname: Han
  fullname: Han, Ruichen
  organization: School of Information, Renmin University of China, Beijing, China
– sequence: 3
  givenname: Yansong
  orcidid: 0000-0003-2198-0987
  surname: Zhang
  fullname: Zhang, Yansong
  organization: School of Information, Renmin University of China, Beijing, China
– sequence: 4
  givenname: Yu
  orcidid: 0009-0005-8048-9612
  surname: Zhang
  fullname: Zhang, Yu
  organization: National Satellite Meteorological Center of China, Beijing, China
– sequence: 5
  givenname: Xi
  orcidid: 0009-0006-7235-5099
  surname: Tang
  fullname: Tang, Xi
  organization: Intel Corp, China
– sequence: 6
  givenname: Gang
  orcidid: 0009-0004-4655-505X
  surname: Deng
  fullname: Deng, Gang
  organization: Intel Corp, China
– sequence: 7
  givenname: Tao
  orcidid: 0009-0008-8327-8345
  surname: Zhong
  fullname: Zhong, Tao
  organization: Intel Corp, China
– sequence: 8
  givenname: Roman
  orcidid: 0009-0009-9183-2673
  surname: Dementiev
  fullname: Dementiev, Roman
  organization: Intel Corp, China
– sequence: 9
  givenname: Yunfei
  orcidid: 0009-0004-5324-7493
  surname: Lu
  fullname: Lu, Yunfei
  organization: Huawei Technologies Co., Ltd, Hangzhou, China
– sequence: 10
  givenname: Mingjian
  orcidid: 0009-0009-8351-515X
  surname: Que
  fullname: Que, Mingjian
  organization: Huawei Technologies Co., Ltd, Hangzhou, China
CODEN ITDSEO
CitedBy_id crossref_primary_10_1109_TPEL_2025_3531414
crossref_primary_10_1016_j_future_2024_02_006
crossref_primary_10_1109_JSTARS_2025_3573026
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
DOI 10.1109/TPDS.2023.3262782
Discipline Engineering
Computer Science
EISSN 1558-2183
EndPage 16
ExternalDocumentID 10_1109_TPDS_2023_3262782
10086665
Genre orig-research
ISICitedReferencesCount 2
ISSN 1045-9219
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
PageCount 16
PublicationCentury 2000
PublicationDate 2023-06-01
PublicationDateYYYYMMDD 2023-06-01
PublicationDate_xml – month: 06
  year: 2023
  text: 2023-06-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on parallel and distributed systems
PublicationTitleAbbrev TPDS
PublicationYear 2023
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 1
SubjectTerms Algorithms
Benchmark testing
Benchmarks
cache architecture
Central processing units
Computer architecture
Computer memory
CPUs
Hardware
In-memory database
Micromechanical devices
Microprocessors
multi-core
OLAP benchmark
Operator performance
Parallel processing
Performance evaluation
Performance prediction
Program processors
Sockets
Title Exploring Fine-Grained In-Memory Database Performance for Modern CPUs
URI https://ieeexplore.ieee.org/document/10086665
https://www.proquest.com/docview/2812841777
Volume 34