Computer Architecture and Performance Tuning
|
Session
|
Description
|
Lecturer
|
Lecture 1
|
Understanding scalable hardware
The first part of this double lecture describes the
hardware architecture of a modern PC server
with processors based on the Intel Core
micro-architecture. Other processor
architectures, such as ARM, will also be
mentioned. Acceleration opportunities (and
bottlenecks) in the architecture will be
covered in detail, not just inside the
processor but also in the memory hierarchy.
The aim is to give
each student a good understanding of what
resources are available from a hardware
viewpoint.
|
Sverre Jarp
|
Lecture 2
|
Software that scales with the hardware
In the second part of this double lecture we will
discuss several strategies which can allow
software to scale to the maximum resource
potential in a given architecture. These
strategies are based on both data and task
parallelism. We will stress the importance
of Data-Oriented Design and also mention
the issue of “performance portability”
across platforms. Some important factors
related to programming styles will be
reviewed. To back this up with
evidence, several scalable examples from
physics will be presented.
|
Sverre Jarp
|
Lecture 3
|
Key aspects of multi-threading
The vast majority of modern microprocessors come with
two to several dozen computing cores,
opening up new possibilities but also
creating some significant challenges. This
major shift in hardware has been underway
for many years, but the software world is
still struggling to take full advantage of
the new features. This lecture
goes into the details of key choices and
compromises associated with threaded
programming and scalability. New programming
paradigms are demonstrated alongside
real-world technologies that can be used to
implement them.
|
Andrzej Nowak
|
Lecture 4
|
Performance Optimization
Considering the rise of many-core processors,
performance tuning has become an even more
important step in software development.
Modern processor architectures often give us
the benefit of being able to look inside the
application from various angles; however,
drawing high-level conclusions is not always
straightforward. The objective of this
lecture is to familiarize the attendees with
the topic of performance optimization
“where it matters” and with common techniques
used to define and improve application
efficiency. Language-independent performance
tools for Linux will be demonstrated and used
to obtain information about program
characteristics and bottlenecks.
|
Andrzej Nowak
|
Exercise 1
Exercise 2
Exercise 3
|
The aim of the exercises in this series is to give the
attendees a practical introduction to
performance-oriented programming on Linux.
Advanced tools will be used during the
course, enabling the participants to
discover how the interaction of the code and
the hardware influences performance. The
participants will also be given the task of
correlating performance figures with certain
programming decisions. In addition, the
participants will learn the limits of
performance optimization and how to
establish where their workload sits within
those limits. The exercises will be
supported by demonstrations of real-world
problems in production environments,
including multi-threaded examples.
|
Sverre Jarp
Andrzej Nowak
|
Prerequisite and References
|
Desirable Prerequisite
- Basics of modern computer architecture
- Basic knowledge about compilers
- Familiarity with Linux and the C/C++ programming languages
|
|