Parallel Performance Evaluation and Optimization
| Published in: | Programming multi‐core and many‐core computing systems, pp. 343-362 |
|---|---|
| Format: | Chapter |
| Language: | English |
| Published: | Hoboken, NJ, USA: John Wiley & Sons, Inc, 24 January 2017 |
| ISBN: | 0470936908, 9780470936900 |
| Summary: | This chapter covers the most important aspects of shared‐memory parallel programming that impact performance. It gives guidance for diagnosing such issues in order to assist in performance tuning. The chapter gives an overview of the performance impact of cache coherence and presents guidelines for minimizing these overheads: minimize write sharing and avoid false sharing. Nonuniform memory access (NUMA) systems present a challenge to application performance because, depending on where a thread is running and which memory address it is accessing, the performance of the application may vary. This presents developers with the additional burden of ensuring that their applications do not suffer from NUMA latency effects. The chapter describes how this may be accomplished. I/O latency can be a major source of serialization in a parallel application. The best way to deal with I/O is to overlap it with other work when possible. |
|---|---|
| DOI: | 10.1002/9781119332015.ch17 |

