Structured parallel programming : patterns for efficient computation

Programming is now parallel programming. Much as structured programming revolutionized traditional serial programming decades ago, a new kind of structured programming, based on patterns, is relevant to parallel programming today. Parallel computing experts and industry insiders Michael McCool, Arch...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: McCool, Michael, Robison, Arch D., Reinders, James
Format: E-Book Buch
Sprache:Englisch
Veröffentlicht: Waltham, Mass Morgan Kaufmann 2012
Elsevier Science & Technology
Ausgabe:1
Schlagworte:
ISBN:9780124159938, 0124159931
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Inhaltsangabe:
  • A.3 Parallel Programming Models -- Appendix B: Cilk Plus -- B.1 Shared Design Principles with TBB -- B.2 Unique Characteristics -- B.3 Borrowing Components from TBB -- B.4 Keyword Spelling -- B.5 cilk_for -- B.6 cilk_spawn and cilk_sync -- B.7 Reducers (Hyperobjects) -- B.8 Array Notation -- B.9 #pragma simd -- B.10 Elemental Functions -- B.11 Note on C++11 -- B.12 Notes on Setup -- B.13 History -- B.14 Summary -- Appendix C: TBB -- C.1 Unique Characteristics -- C.2 Using TBB -- C.3 parallel_for -- C.4 parallel_reduce -- C.5 parallel_deterministic_reduce -- C.6 parallel_pipeline -- C.7 parallel_invoke -- C.8 task_group -- C.9 task -- C.10 atomic -- C.11 enumerable_thread_specific -- C.12 Notes on C++11 -- C.13 History -- C.14 Summary -- Appendix D: C++11 -- D.1 Declaring with auto -- D.2 Lambda Expressions -- D.3 std::move -- Appendix E: Glossary -- Bibliography -- Index
  • 8.4 Choosing Base Cases -- 8.5 Load Balancing -- 8.6 Complexity of Parallel Divide-and-Conquer -- 8.7 Karatsuba Multiplication of Polynomials -- 8.8 Cache Locality and Cache-Oblivious Algorithms -- 8.9 Quicksort -- 8.10 Reductions and Hyperobjects -- 8.11 Implementing Scan with Fork-Join -- 8.12 Applying Fork-Join to Recurrences -- 8.13 Summary -- CHAPTER 9. Pipeline -- 9.1 Basic Pipeline -- 9.2 Pipeline with Parallel Stages -- 9.3 Implementation of a Pipeline -- 9.4 Programming Model Support for Pipelines -- 9.5 More General Topologies -- 9.6 Mandatory versus Optional Parallelism -- 9.7 Summary -- PART II: Examples -- CHAPTER 10. Forward Seismic Simulation -- 10.1 Background -- 10.2 Stencil Computation -- 10.3 Impact of Caches on Arithmetic Intensity -- 10.4 Raising Arithmetic Intensity with Space-Time Tiling -- 10.5 Cilk Plus Code -- 10.6 ArBB Implementation -- 10.7 Summary -- CHAPTER 11. K-Means Clustering -- 11.1 Algorithm -- 11.2 K-Means with Cilk Plus -- 11.3 K-Means with TBB -- 11.4 Summary -- CHAPTER 12. Bzip2 Data Compression -- 12.1 The Bzip2 Algorithm -- 12.2 Three-Stage Pipeline Using TBB -- 12.3 Four-Stage Pipeline Using TBB -- 12.4 Three-Stage Pipeline Using Cilk Plus -- 12.5 Summary -- CHAPTER 13. Merge Sort -- 13.1 Parallel Merge -- 13.2 Parallel Merge Sort -- 13.3 Summary -- CHAPTER 14. Sample Sort -- 14.1 Overall Structure -- 14.2 Choosing the Number of Bins -- 14.3 Binning -- 14.4 Repacking and Subsorting -- 14.5 Performance Analysis of Sample Sort -- 14.6 For C++ Experts -- 14.7 Summary -- CHAPTER 15. Cholesky Factorization -- 15.1 Fortran Rules! -- 15.2 Recursive Cholesky Decomposition -- 15.3 Triangular Solve -- 15.4 Symmetric Rank Update -- 15.5 Where Is the Time Spent? -- 15.6 Summary -- Appendices -- Appendix A: Further Reading -- A.1 Parallel Algorithms and Patterns -- A.2 Computer Architecture Including Parallel Systems
  • Front Cover -- Structured Parallel Programming: Patterns for Efficient Computation -- Copyright Page -- Table of Contents -- Listings -- Preface -- Preliminaries -- CHAPTER 1. Introduction -- 1.1 Think Parallel -- 1.2 Performance -- 1.3 Motivation: Pervasive Parallelism -- 1.4 Structured Pattern-Based Programming -- 1.5 Parallel Programming Models -- 1.6 Organization of this Book -- 1.7 Summary -- CHAPTER 2. Background -- 2.1 Vocabulary and Notation -- 2.2 Strategies -- 2.3 Mechanisms -- 2.4 Machine Models -- 2.5 Performance Theory -- 2.6 Pitfalls -- 2.7 Summary -- PART I: Patterns -- CHAPTER 3. Patterns -- 3.1 Nesting Pattern -- 3.2 Structured Serial Control Flow Patterns -- 3.3 Parallel Control Patterns -- 3.4 Serial Data Management Patterns -- 3.5 Parallel Data Management Patterns -- 3.6 Other Parallel Patterns -- 3.7 Non-Deterministic Patterns -- 3.8 Programming Model Support for Patterns -- 3.9 Summary -- CHAPTER 4. Map -- 4.1 Map -- 4.2 Scaled Vector Addition (SAXPY) -- 4.3 Mandelbrot -- 4.4 Sequence of Maps versus Map of Sequence -- 4.5 Comparison of Parallel Models -- 4.6 Related Patterns -- 4.7 Summary -- CHAPTER 5. Collectives -- 5.1 Reduce -- 5.2 Fusing Map and Reduce -- 5.3 Dot Product -- 5.4 Scan -- 5.5 Fusing Map and Scan -- 5.6 Integration -- 5.7 Summary -- CHAPTER 6. Data Reorganization -- 6.1 Gather -- 6.2 Scatter -- 6.3 Converting Scatter to Gather -- 6.4 Pack -- 6.5 Fusing Map and Pack -- 6.6 Geometric Decomposition and Partition -- 6.7 Array of Structures vs. Structures of Arrays -- 6.8 Summary -- CHAPTER 7. Stencil and Recurrence -- 7.1 Stencil -- 7.2 Implementing Stencil with Shift -- 7.3 Tiling Stencils for Cache -- 7.4 Optimizing Stencils for Communication -- 7.5 Recurrence -- 7.6 Summary -- CHAPTER 8. Fork-Join -- 8.1 Definition -- 8.2 Programming Model Support for Fork-Join -- 8.3 Recursive Implementation of Map