Session descriptions and details - SWSC2015

Node-level performance engineering

This course teaches performance engineering approaches at the compute-node level. "Performance engineering" as we define it is more than employing tools to identify hotspots and bottlenecks. It is about developing a thorough understanding of the interactions between software and hardware. This process must start at the core, socket, and node level, where the code that does the actual computational work is executed. Once the architectural requirements of a code are understood and correlated with performance measurements, the potential benefit of optimizations can often be predicted. We introduce a "holistic" node-level performance engineering strategy, apply it to different algorithms from computational science, and also show how awareness of the performance features of an application can lead to notable reductions in power consumption.

  • Introduction and Motivation
  • Performance Engineering as a process
  • Topology and affinity in multicore systems
  • Microbenchmarking for architectural exploration
  • The Roofline Model
    • Basics and simple applications
    • Case study: sparse matrix-vector multiplication
    • Case study: Jacobi smoother (see the kernel sketch after this list)
  • Model-guided optimization
    • Blocking optimization for the Jacobi smoother
  • Programming for optimal use of parallel resources
    • Single Instruction Multiple Data (SIMD)
    • Cache-coherent Non-Uniform Memory Architecture (ccNUMA)
    • Simultaneous Multi-Threading (SMT)
  • Pattern-guided performance engineering
    • Hardware performance metrics
    • Typical performance patterns in scientific computing
    • Examples and best practices
  • Beyond Roofline: The ECM Model
  • Optional: Energy-efficient code execution
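
Several of the items above revolve around the Jacobi smoother case study. As a point of reference, the following is a minimal sketch of such a kernel with an OpenMP worksharing directive; the array names u and unew and the grid size n are assumptions made for illustration, not course material.

  ! Minimal 2D Jacobi sweep (illustrative sketch; array names and sizes
  ! are assumptions). Each interior point is replaced by the average of
  ! its four neighbors; the resulting data traffic per update is the kind
  ! of quantity a Roofline analysis reasons about.
  subroutine jacobi_sweep(u, unew, n)
    implicit none
    integer, intent(in) :: n
    double precision, intent(in)    :: u(0:n+1, 0:n+1)
    double precision, intent(inout) :: unew(0:n+1, 0:n+1)  ! boundaries untouched
    integer :: i, j
    !$omp parallel do private(i)
    do j = 1, n
       do i = 1, n
          unew(i,j) = 0.25d0*(u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1))
       end do
    end do
    !$omp end parallel do
  end subroutine jacobi_sweep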

Prerequisites

A good working knowledge of C, C++ or Fortran is required, as well as basic knowledge of OpenMP.

When

9 April 2015

Where

  • KU Leuven, Opleidingscentrum D, Willem De Croylaan 52A, BE-3001 Heverlee, Belgium

About the lecturer

Dr. Georg Hager, Regionales Rechenzentrum Erlangen (RRZE), Germany.

MPI

The Message Passing Interface (MPI) is a standardized library specification for message passing between separate processes. In layman's terms: MPI provides the mechanisms for handling data communication in a parallel program. It is particularly well suited to compute clusters, in which the nodes are connected by an interconnection network (e.g. InfiniBand or Gigabit Ethernet).
In this workshop, the applicability of MPI will be compared to that of other parallel programming paradigms such as OpenMP, CUDA, and MapReduce. Next, the basic principles of MPI (point-to-point communication, collective communication, MPI datatypes, etc.) will be introduced step by step. Hands-on exercises allow the participants to immediately put the newly acquired skills into practice. Finally, some more theoretical considerations regarding the scalability of algorithms will be presented.
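
For a flavor of the programming model, the following is a minimal point-to-point sketch in Fortran; the program name and the transmitted value are illustrative assumptions. With a typical MPI installation it would be compiled with mpif90 and launched with, for example, mpirun -np 2.

  ! Minimal MPI point-to-point sketch (illustrative; program name and
  ! payload are assumptions). Rank 0 sends one number to rank 1.
  program ping
    use mpi
    implicit none
    integer :: rank, ierr, status(MPI_STATUS_SIZE)
    double precision :: val
    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    if (rank == 0) then
       val = 3.14d0
       call MPI_Send(val, 1, MPI_DOUBLE_PRECISION, 1, 0, MPI_COMM_WORLD, ierr)
    else if (rank == 1) then
       call MPI_Recv(val, 1, MPI_DOUBLE_PRECISION, 0, 0, MPI_COMM_WORLD, &
                     status, ierr)
       print *, 'rank 1 received', val
    end if
    call MPI_Finalize(ierr)
  end program ping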

When

22 April 2015

Where

About the lecturer

Dr. Jan Fostier received his MS and PhD degrees in physical engineering from Ghent University in 2005 and 2009, respectively. He is currently an assistant professor in the Department of Information Technology (INTEC) at the same university. His main research interests are (parallel) algorithms for the biological sciences, high-performance computing, and computational electromagnetics.

Migrating old to modern Fortran

This one-day workshop provides guidance on migrating legacy FORTRAN 77 software to modern Fortran.
Participation in the workshop requires a good knowledge of Fortran.
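
To illustrate the kind of transformation involved, here is a minimal before/after sketch; the subroutine scale and its arguments are hypothetical, chosen purely for illustration. The old fixed-form routine relies on implicit typing and a labeled DO loop:

C     Fixed-form FORTRAN 77: implicit typing, a labeled DO loop,
C     and the array size passed as a separate argument.
      SUBROUTINE SCALE(X, N, A)
      DIMENSION X(N)
      DO 10 I = 1, N
         X(I) = A*X(I)
   10 CONTINUE
      END

A modern free-form version makes the types and intents explicit, uses an assumed-shape dummy array, and lives in a module so that the compiler can check the interface at every call site:

  ! Modern free-form Fortran: explicit typing, declared intents,
  ! an assumed-shape dummy array, and a whole-array operation.
  module scaling
    implicit none
  contains
    subroutine scale(x, a)
      real, intent(inout) :: x(:)   ! the size travels with the array
      real, intent(in)    :: a
      x = a * x                     ! replaces the explicit loop
    end subroutine scale
  end module scaling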

When

20 May 2015

Where

  • KU Leuven, Opleidingscentrum D, Willem De Croylaan 52A, BE-3001 Heverlee, Belgium

About the lecturer

Dr. Reinhold Bader studied physics and mathematics at the Ludwig-Maximilians-Universität in Munich, completing his studies with a PhD in theoretical solid-state physics in 1998. Since the beginning of 1999 he has worked at the Leibniz Supercomputing Centre (LRZ) as a member of the scientific staff, where he has been involved in HPC user support, procurement of new systems, benchmarking of prototypes in the context of the PRACE project, courses on parallel programming, and configuration management for the HPC systems deployed at LRZ. He currently leads the HPC services group, which is responsible for the operation of all HPC-related systems and system software packages at LRZ.

Introduction to multithreading and OpenMP

This workshop gives an introduction to shared-memory parallel programming and optimization on modern multicore systems, focusing on OpenMP, the dominant shared-memory programming model in computational science. Parallelism, multicore architecture, and the most important shared-memory programming models are discussed and then applied in hands-on exercises.
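
As a taste of what OpenMP looks like, the following is a minimal Fortran sketch that distributes the iterations of a vector update across threads; the program name, vector length, and values are assumptions for illustration. With gfortran it would be compiled with the -fopenmp flag.

  ! Minimal OpenMP sketch (illustrative; names and sizes are assumptions):
  ! the iterations of the vector update are shared among the threads.
  program axpy
    use omp_lib
    implicit none
    integer, parameter :: n = 1000000
    double precision :: x(n), y(n), a
    integer :: i
    a = 2.0d0
    x = 1.0d0
    y = 0.0d0
    !$omp parallel do
    do i = 1, n
       y(i) = y(i) + a*x(i)
    end do
    !$omp end parallel do
    print *, 'threads:', omp_get_max_threads(), ' y(1) =', y(1)
  end program axpy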

When

21-22 May 2015

Where

About the lecturer

Dr. Reinhold Bader, Leibniz Supercomputing Centre (LRZ); see the biography under "Migrating old to modern Fortran" above.