Design of a Multithreaded Barnes-Hut Algorithm for Multicore Clusters

We describe in this paper an implementation of the Barnes-Hut algorithm on multicore clusters. Based on a partitioned global address space (PGAS) library, the design integrates intranode multithreading and internode one-sided communication, exemplifying a PGAS + X programming style. Within a node, t...

Detailed description

Bibliographic details
Published in: IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 7, pp. 1861-1873
Main authors: Zhang, Junchao; Behzad, Babak; Snir, Marc
Format: Journal Article
Language: English
Published: New York: IEEE, 1 July 2015
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN: 1045-9219, 1558-2183
Online access: Full text
Abstract We describe in this paper an implementation of the Barnes-Hut algorithm on multicore clusters. Based on a partitioned global address space (PGAS) library, the design integrates intranode multithreading and internode one-sided communication, exemplifying a PGAS + X programming style. Within a node, the computation is decomposed into tasks (subtasks) and multitasking is used to hide network latency. We study the tradeoffs between locality in private caches and locality in shared caches and bring the insights into the design. As a result, our implementation consumes less memory per core, invokes less internode communication, and enjoys better load-balancing strategies. The final code achieves up to 41 percent performance improvement over a non-multithreaded counterpart. Through detailed comparison, we also show its advantages over other well-known Barnes-Hut implementations, both in programming complexity and in performance.
Author Zhang, Junchao
Behzad, Babak
Snir, Marc
Author_xml – sequence: 1
  givenname: Junchao
  surname: Zhang
  fullname: Zhang, Junchao
  email: jczhang@anl.gov
  organization: Math. & Comput. Sci. (MCS) Div., Argonne Nat. Lab., Argonne, IL, USA
– sequence: 2
  givenname: Babak
  surname: Behzad
  fullname: Behzad, Babak
  email: bbehza2@illinois.edu
  organization: Comput. Sci. Dept., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
– sequence: 3
  givenname: Marc
  surname: Snir
  fullname: Snir, Marc
  email: snir@anl.gov
  organization: Math. & Comput. Sci. (MCS) Div., Argonne Nat. Lab., Argonne, IL, USA
CODEN ITDSEO
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2015
DOI 10.1109/TPDS.2014.2331243
Discipline Engineering
Computer Science
EISSN 1558-2183
EndPage 1873
Genre orig-research
GrantInformation_xml – fundername: Advanced Scientific Computing Research
  grantid: DE-AC02-06CH11357
– fundername: Office of Science
  grantid: DE-AC02-06CH11357
  funderid: 10.13039/100006132
– fundername: DOE
  grantid: 1205852
  funderid: 10.13039/100000015
– fundername: Department of Energy
  grantid: DE-AC02-06CH11357
  funderid: 10.13039/100000015
ISICitedReferencesCount 5
IsPeerReviewed true
IsScholarly true
Issue 7
Keywords cluster
n-body
PGAS
multicore
Barnes-Hut
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
PageCount 13
StartPage 1861
SubjectTerms Algorithm design and analysis
Force
Instruction sets
Multicore processing
Octrees
Product introduction
Programming
Synchronization
URI https://ieeexplore.ieee.org/document/6837521
https://www.proquest.com/docview/1699256001
Volume 26