Resource allocation for task-level speculative scientific applications: A proof of concept using Parallel Trajectory Splicing

The constant increase in parallelism available on large-scale distributed computers poses major scalability challenges to many scientific applications. A common strategy to improve scalability is to express algorithms in terms of independent tasks that can be executed concurrently on a runtime syste...

Bibliographic details
Published in: Parallel computing Vol. 112; p. 102936
Main authors: Garmon, Andrew, Ramakrishnaiah, Vinay, Perez, Danny
Format: Journal Article
Language: English
Published: Elsevier B.V 01.09.2022
Subjects:
ISSN: 0167-8191, 1872-7336
Abstract The constant increase in parallelism available on large-scale distributed computers poses major scalability challenges to many scientific applications. A common strategy to improve scalability is to express algorithms in terms of independent tasks that can be executed concurrently on a runtime system. In this manuscript, we consider a generalization of this approach where task-level speculation is allowed. In this context, a probability is attached to each task which corresponds to the likelihood that the output of the speculative task will be consumed as part of the larger calculation. We consider the problem of optimal resource allocation to each of the possible tasks so as to maximize the total expected computational throughput. The power of this approach is demonstrated by analyzing its application to Parallel Trajectory Splicing, a massively-parallel long-time-dynamics method for atomistic simulations.
• Efficiently utilizing large-scale HPC machines has become increasingly difficult.
• Speculative task-based execution serves as a means of increasing parallelism.
• Performance relies on optimal resource allocation among speculative tasks.
• Improved scaling of traditionally scale-limited applications; Molecular Dynamics.
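The allocation problem described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm; it is a generic dynamic-programming treatment of a single-resource allocation problem under stated assumptions: each speculative task i has a probability `probs[i]` of being consumed, workers are indivisible, and every task shares the same hypothetical `throughput(n)` function giving the work rate obtained from n workers. The function name `allocate` and all parameter names are illustrative only.

```python
import math
from typing import Callable, List, Tuple


def allocate(probs: List[float], total_workers: int,
             throughput: Callable[[int], float]) -> Tuple[float, List[int]]:
    """Split total_workers among speculative tasks so as to maximize
    sum_i probs[i] * throughput(n_i), subject to sum_i n_i <= total_workers.

    Assumption (not from the source): throughput(0) == 0 and throughput
    is the same for every task.
    """
    n_tasks = len(probs)
    # best[k][w]: best expected throughput using the first k tasks and w workers.
    best = [[0.0] * (total_workers + 1) for _ in range(n_tasks + 1)]
    # choice[k][w]: workers given to task k-1 in that optimal solution.
    choice = [[0] * (total_workers + 1) for _ in range(n_tasks + 1)]

    for k in range(1, n_tasks + 1):
        for w in range(total_workers + 1):
            best_val, best_n = float("-inf"), 0
            for n in range(w + 1):
                # Give n workers to task k-1, the remaining w-n to earlier tasks.
                val = best[k - 1][w - n] + probs[k - 1] * throughput(n)
                if val > best_val:
                    best_val, best_n = val, n
            best[k][w] = best_val
            choice[k][w] = best_n

    # Walk back through the recorded choices to recover the allocation.
    alloc = [0] * n_tasks
    w = total_workers
    for k in range(n_tasks, 0, -1):
        alloc[k - 1] = choice[k][w]
        w -= choice[k][w]
    return best[n_tasks][total_workers], alloc


if __name__ == "__main__":
    # Hypothetical example: sub-linear scaling modeled as sqrt(n).
    value, alloc = allocate([0.9, 0.1], 10, math.sqrt)
    print(value, alloc)
```

With a sub-linear throughput model, a likely task can still absorb most of the workers, while equally likely tasks split them evenly. The cost of this sketch is O(tasks × workers²), so it is meant only to make the objective concrete, not to scale.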
ArticleNumber 102936
Author Perez, Danny
Garmon, Andrew
Ramakrishnaiah, Vinay
Author_xml – sequence: 1
  givenname: Andrew
  orcidid: 0000-0002-5478-5160
  surname: Garmon
  fullname: Garmon, Andrew
  email: agarmon@lanl.gov
  organization: Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, 87545, USA
– sequence: 2
  givenname: Vinay
  surname: Ramakrishnaiah
  fullname: Ramakrishnaiah, Vinay
  email: vinayr@lanl.gov
  organization: Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM, 87545, USA
– sequence: 3
  givenname: Danny
  surname: Perez
  fullname: Perez, Danny
  email: danny_perez@lanl.gov
  organization: Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, 87545, USA
CitedBy_id crossref_primary_10_7717_peerj_cs_2966
crossref_primary_10_1038_s41598_024_61647_6
Cites_doi 10.1109/JSYST.2017.2722476
10.1016/0021-9991(75)90060-1
10.1021/acs.jctc.5b00916
10.7717/peerj-cs.183
10.1016/S0167-8191(02)00216-8
10.1145/2742347
10.1145/224538.224564
10.1098/rsta.2019.0056
10.1137/18M1177792
10.1177/1094342007078442
10.1016/0377-2217(93)90177-O
10.1142/S0129626411000060
10.1145/2821505
10.1557/jmr.2017.456
10.1063/5.0014475
10.1287/trsc.36.2.231.561
10.1063/1.1415500
10.1177/0037549716674806
10.1016/j.cpc.2020.107262
10.1103/PhysRevLett.78.3908
10.1109/TCC.2015.2481400
10.1145/1465482.1465560
10.1145/165854.165874
10.1147/JRD.2011.2109230
10.1006/jcph.1995.1039
10.1145/2699715
10.1016/S1574-1400(09)00504-0
10.1016/j.commatsci.2014.12.011
10.1016/S0377-2217(99)00287-8
10.1063/1.481576
10.1088/1361-651X/aba511
10.1145/2400682.2400698
ContentType Journal Article
Copyright 2022 Elsevier B.V.
Copyright_xml – notice: 2022 Elsevier B.V.
DBID AAYXX
CITATION
DOI 10.1016/j.parco.2022.102936
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1872-7336
ExternalDocumentID 10_1016_j_parco_2022_102936
S0167819122000369
ID FETCH-LOGICAL-c303t-801fac69d4fce7dc0c9f3d4702a5ff89f43615030a72013879d087934e7eb1f13
ISICitedReferencesCount 3
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000805900400001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0167-8191
IngestDate Sat Nov 29 07:26:14 EST 2025
Tue Nov 18 21:44:55 EST 2025
Fri Feb 23 02:40:11 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords Task-based programming
Discrete event simulation
Accelerated molecular dynamics
Speculation
Resource allocation
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c303t-801fac69d4fce7dc0c9f3d4702a5ff89f43615030a72013879d087934e7eb1f13
ORCID 0000-0002-5478-5160
ParticipantIDs crossref_citationtrail_10_1016_j_parco_2022_102936
crossref_primary_10_1016_j_parco_2022_102936
elsevier_sciencedirect_doi_10_1016_j_parco_2022_102936
PublicationCentury 2000
PublicationDate September 2022
2022-09-00
PublicationDateYYYYMMDD 2022-09-01
PublicationDate_xml – month: 09
  year: 2022
  text: September 2022
PublicationDecade 2020
PublicationTitle Parallel computing
PublicationYear 2022
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
  doi: 10.1088/1361-651X/aba511
– volume: 9
  start-page: 1
  issue: 4
  year: 2013
  ident: 10.1016/j.parco.2022.102936_b8
  article-title: Optimizing software runtime systems for speculative parallelization
  publication-title: ACM Transactions on Architecture and Code Optimization (TACO)
  doi: 10.1145/2400682.2400698
– start-page: 1
  year: 2012
  ident: 10.1016/j.parco.2022.102936_b19
  article-title: Cray cascade: A scalable HPC system based on a dragonfly network
– start-page: 1
  year: 2012
  ident: 10.1016/j.parco.2022.102936_b3
  article-title: Legion: Expressing locality and independence with logical regions
– start-page: 109
  year: 2009
  ident: 10.1016/j.parco.2022.102936_b23
  article-title: Grey prediction control of adaptive resources allocation in virtualized computing system
StartPage 102936
SubjectTerms Accelerated molecular dynamics
Discrete event simulation
Resource allocation
Speculation
Task-based programming
Title Resource allocation for task-level speculative scientific applications: A proof of concept using Parallel Trajectory Splicing
URI https://dx.doi.org/10.1016/j.parco.2022.102936
Volume 112