Programming Massively Parallel Processors: A Hands-on Approach, 5th Edition
By Wen-mei W. Hwu, David B. Kirk and Izzat El Hajj
Publication Date: 01 Sep 2026
Programming Massively Parallel Processors: A Hands-on Approach, Fifth Edition introduces students and professionals alike to the basic concepts of parallel programming and GPU architecture. Concise, intuitive, and practical, it is based on years of road-testing in the authors' own parallel computing courses. Various techniques for constructing and optimizing parallel programs are explored in detail, while case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. This new edition has been updated with an expanded repertoire of optimizations, new patterns and applications, and more coverage of important CUDA features.
Key Features
· Expanded optimization checklist with a more comprehensive demonstration of essential optimizations across patterns
· New pattern and application chapters including: filtering, wavefront parallelism, advanced optimizations for matrix multiplication, and large language models (LLMs)
· More coverage of important CUDA features including warp-level programming, cooperative groups, CUDA C++ atomics, and multi-GPU programming with NCCL and NVSHMEM
About the authors
Wen-mei W. Hwu, CTO, MulticoreWare, and professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign, USA; David B. Kirk, NVIDIA Fellow; and Izzat El Hajj, Assistant Professor, Department of Computer Science, American University of Beirut, Lebanon
1. Introduction
Part I. Fundamental Concepts
2. Heterogeneous data parallel computing
3. Multidimensional grids and data
4. Compute architecture and scheduling
5. Memory architecture and data locality
6. Performance considerations
Part II. Parallel Patterns
7. Convolution
8. Stencil
9. Parallel histogram
10. Reduction
11. Prefix sum (scan)
12. Merge
Part III. Advanced Patterns and Applications
13. Sorting
14. Filtering (new)
15. Sparse matrix computation
16. Wavefront Algorithms (new)
17. Graph traversal
18. Deep learning
19. Multi-GPU API (new)
20. Electrostatic potential map
21. Parallel programming and computational thinking
Part IV. Advanced Practices
22. Programming a heterogeneous computing cluster
23. Advanced Optimizations for Matrix Multiplication (new)
24. Advanced practices and future evolution
25. Conclusion and outlook
ISBN: 9780443439001
Page Count: 680
Upper-level undergraduate through graduate-level students studying parallel computing within computer science or engineering