Performance implications of synchronization structure in parallel programming

The restricted synchronization structure of so-called structured parallel programming paradigms has an advantageous effect on programmer productivity, cost modeling, and scheduling complexity. However, imposing these restrictions can lead to a loss of parallelism, compared to using a programming app...

Bibliographic details
Published in: Parallel computing, Vol. 35, No. 8, pp. 455-474
Main authors: González-Escribano, Arturo; van Gemund, Arjan J.C.; Cardeñoso-Payo, Valentín
Format: Journal Article
Language: English
Published: Elsevier B.V, 1 August 2009
ISSN: 0167-8191, 1872-7336
Online access: Full text
Abstract The restricted synchronization structure of so-called structured parallel programming paradigms has an advantageous effect on programmer productivity, cost modeling, and scheduling complexity. However, imposing these restrictions can lead to a loss of parallelism compared to a programming approach that does not impose synchronization structure. In this paper we study the potential loss of parallelism when expressing parallel computations in a programming model that limits the computation graph (DAG) to a series–parallel topology, which characterizes all well-known structured programming models. We present an analytical model that approximately captures this loss of parallelism in terms of simple parameters related to DAG topology and workload distribution. We validate the model using a wide range of synthetic and real-world parallel computations running on shared- and distributed-memory machines. Although the loss of parallelism is theoretically unbounded, our measurements show that for all of these applications the performance loss due to choosing a series–parallel structured model is invariably limited to at most 10%. In all cases, the loss of parallelism is predictable provided the topology and workload variability of the DAG are known.
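To make the notion of lost parallelism concrete, the following minimal Python sketch compares the critical path of a small task graph that is not series-parallel with that of a barrier-synchronized (layered) version of the same graph. It is an illustration only, not the analytical model from the paper: the four-task "N"-shaped graph, the workloads, and the function names `critical_path` and `layered_critical_path` are assumptions chosen for the example, and inserting a full barrier between depth levels is just one simple way to obtain a series-parallel structure.

```python
from collections import defaultdict


def _predecessors(edges):
    """Map each node to the list of nodes it directly depends on."""
    preds = defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
    return preds


def critical_path(work, edges):
    """Longest weighted path in the DAG (node weight = task workload).

    With unlimited processors this is the ideal parallel run time.
    """
    preds = _predecessors(edges)
    finish = {}

    def finish_time(v):
        if v not in finish:
            start = max((finish_time(u) for u in preds[v]), default=0)
            finish[v] = start + work[v]
        return finish[v]

    return max(finish_time(v) for v in work)


def layered_critical_path(work, edges):
    """Run time when every depth level ends with a full barrier.

    Hypothetical series-parallel version of the graph: each level costs
    as much as its slowest task, and levels run one after another.
    """
    preds = _predecessors(edges)
    depth = {}

    def level(v):
        if v not in depth:
            depth[v] = 1 + max((level(u) for u in preds[v]), default=-1)
        return depth[v]

    slowest = defaultdict(int)
    for v in work:
        slowest[level(v)] = max(slowest[level(v)], work[v])
    return sum(slowest.values())


# Assumed example: the "N"-shaped graph a->c, a->d, b->d.
work = {"a": 1, "b": 5, "c": 5, "d": 1}
edges = [("a", "c"), ("a", "d"), ("b", "d")]

ideal = critical_path(work, edges)            # 6
layered = layered_critical_path(work, edges)  # 10
print(f"ideal={ideal} layered={layered} ratio={layered / ideal:.2f}")
```

The workloads here are deliberately unbalanced, so the barrier version is much slower than the unrestricted graph. The abstract reports that the loss measured on the studied applications stayed within 10%, so the example shows the mechanism and not a typical magnitude.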
Author Cardeñoso-Payo, Valentín
González-Escribano, Arturo
van Gemund, Arjan J.C.
Author_xml – sequence: 1
  givenname: Arturo
  surname: González-Escribano
  fullname: González-Escribano, Arturo
  organization: Dept. de Informática, Universidad de Valladolid, E.T.I.T. Campus Miguel Delibes, 47011 Valladolid, Spain
– sequence: 2
  givenname: Arjan J.C.
  surname: van Gemund
  fullname: van Gemund, Arjan J.C.
  email: a.j.c.vangemund@tudelft.nl
  organization: Faculty of Electrical Engineering, Mathematics, and Computer Science, P.O. Box 5031, NL-2600 GA Delft, The Netherlands
– sequence: 3
  givenname: Valentín
  surname: Cardeñoso-Payo
  fullname: Cardeñoso-Payo, Valentín
  organization: Dept. de Informática, Universidad de Valladolid, E.T.I.T. Campus Miguel Delibes, 47011 Valladolid, Spain
ContentType Journal Article
Copyright 2009 Elsevier B.V.
DOI 10.1016/j.parco.2009.07.002
Discipline Computer Science
IsPeerReviewed true
IsScholarly true
Issue 8
Keywords Task graphs
Performance prediction
Parallel programming models
Language English
License https://www.elsevier.com/tdm/userlicense/1.0
URI https://dx.doi.org/10.1016/j.parco.2009.07.002