Description
All computing systems, from mobile devices
to supercomputers, are becoming heterogeneous parallel
computers that combine multi-core CPUs and many-thread GPUs
for higher power efficiency and computational throughput.
While the computing community is racing to build tools and
libraries to ease the use of these heterogeneous parallel
computing systems, their effective and confident use will
always require knowledge of their low-level programming
interfaces. This lecture introduces, through examples based
on the CUDA programming model, the three abstractions that
form the foundation of GPU programming:
- Thread hierarchy
- Synchronization
- Memory hierarchy/shared memory
The aim of this lecture is to give the audience a solid
foundation on which to start building their own first GPU
application.
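As a taste of what the lecture covers, the sketch below is one illustrative (not official course material) CUDA kernel that exercises all three abstractions: a grid of thread blocks (thread hierarchy), a per-block scratch buffer (shared memory), and barriers between reduction steps (synchronization). The kernel name `blockSum` and the block size of 256 are arbitrary choices for this example.

```cuda
#include <cstdio>

// Each block reduces a 256-element chunk of `in` into one element of `out`.
__global__ void blockSum(const float *in, float *out, int n) {
    __shared__ float buf[256];                 // shared memory: visible to the whole block
    int tid = threadIdx.x;
    int i   = blockIdx.x * blockDim.x + tid;   // thread hierarchy: global index

    buf[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();                           // synchronization: wait for all loads

    // Tree reduction within the block.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) buf[tid] += buf[tid + stride];
        __syncthreads();                       // barrier between reduction steps
    }
    if (tid == 0) out[blockIdx.x] = buf[0];    // one partial sum per block
}

int main() {
    const int n = 1 << 20, threads = 256, blocks = (n + threads - 1) / threads;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, blocks * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    blockSum<<<blocks, threads>>>(in, out, n); // launch a grid of blocks of threads
    cudaDeviceSynchronize();

    float total = 0.0f;                        // finish the reduction on the host
    for (int b = 0; b < blocks; ++b) total += out[b];
    printf("sum = %.0f\n", total);

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Compiling with `nvcc` and running on any CUDA-capable GPU prints the sum of the input array; the lecture unpacks each of these constructs in turn.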
Audience and benefits
This lecture targets physicists and engineers who are
interested in improving the performance of their software by
using off-the-shelf graphics cards.
After this lecture, attendees are expected to have a
good understanding of the principles that govern parallel
programming in CUDA and to be able to write their first
GPU-accelerated application.