Specialist Workshops in Scientific Computing - SWSC2017

This series of workshops features seminars on important HPC topics, introduced by experts in the field. Fundamental topics, such as multithreading, multiprocessing and the Message Passing Interface (MPI), should feature in every edition. These can be complemented with other, more specialist topics related to particular hardware, such as GPU programming, “Big Data”-related paradigms (e.g. Google MapReduce and Hadoop), scientific programming languages (e.g. Fortran), or applications specific to certain research fields.

These workshops are organised in collaboration with the Flemish Supercomputing Center (VSC), all Flemish Universities and their Doctoral Schools.

Program

Every edition of SWSC spans several 1- or 2-day workshops, with sandwich lunches at the workshop locations.

The workshops are not held back-to-back but are spread out in time, to maximise the opportunity to attend.

Scheduled workshops in 2017:

Spring

Tue-Wed 18-19 April 2017: Introduction to multithreading and OpenMP (Reinhold Bader)
Wed 3 May 2017: Introduction to MPI (Jan Fostier)

Fall

Thu-Fri 23-24 November 2017: Introduction to Machine Learning Algorithms (Morris Riedel)
Thu-Fri 30 November-1 December 2017: Deep Learning using a Convolutional Neural Network (Morris Riedel)

(being planned)

Introduction to CP2K

Course Material

Lecture notes, additional reading and other teaching material will be distributed to all participants
Participants are invited to bring their own laptop computer for the hands-on sessions during each of the workshops

Target audience

The series of Specialist Workshops in Scientific Computing is primarily aimed at PhD students. Postdocs, staff members and interested non-academic parties can also participate, depending on availability. Priority will be given to PhD students.

Subscription

Subscription is free of charge.

All participants in a session will receive a certificate of attendance upon request.

SWSC2017 can be followed as part of the Ghent University Doctoral Training Programme: for the Spring session students should attend all three days in order to have it included in their curriculum as one specialist course; the workshops of the Fall session can count as two separate specialist courses.

Introduction to multithreading and OpenMP

This workshop gives an introduction to shared-memory parallel programming and optimization on modern multicore systems, focusing on OpenMP. This is the dominant shared-memory programming model in computational science. Parallelism, multicore architecture and the most important shared-memory programming models are discussed. These topics are then applied in the hands-on exercises.

About the lecturer: Dr. Reinhold Bader. Reinhold studied physics and mathematics at the Ludwig-Maximilians University in Munich, completing his studies with a PhD in theoretical solid-state physics in 1998. Since the beginning of 1999, he has worked at the Leibniz Supercomputing Centre (LRZ) as a member of the scientific staff, where he has been involved in HPC user support, procurement of new systems, benchmarking of prototypes in the context of the PRACE project, courses on parallel programming, and configuration management for the HPC systems deployed at LRZ. He is currently group leader of the HPC services group at LRZ, which is responsible for the operation of all HPC-related systems and system software packages at LRZ.

Course prerequisites

Participants should be able to work on the Unix/Linux command line, have a minimal level of programming skills (Fortran or C), and have a general understanding of computer architecture.

Practical information

Tuesday-Wednesday 18-19 April 2017, 9am - 6pm
Venue: Oude Infirmerie, het Pand, Onderbergen 1, 9000 Gent

Introduction to MPI

The Message Passing Interface (MPI) is a standardized library specification for message passing between different processes. In layman's terms: MPI provides mechanisms for handling the data communication in a parallel program. It is particularly suited for computational clusters, where the workstations are connected by an interconnection network (e.g. InfiniBand, Gigabit Ethernet).
In this workshop, the applicability of MPI will be compared to other parallel programming paradigms such as OpenMP, CUDA and MapReduce. Next, the basic principles of MPI will be gradually introduced (point-to-point communication, collective communication, MPI datatypes, etc.). Hands-on exercises allow the participants to immediately put the newly acquired skills into practice. Finally, some more theoretical considerations regarding the scalability of algorithms are presented.

About the lecturer: Dr. Jan Fostier. Jan received his MS and PhD degrees in physical engineering from Ghent University in 2005 and 2009 respectively. He is currently an assistant professor in the Department of Information Technology (INTEC) at the same university. His main research interests are (parallel) algorithms for biological sciences, high performance computing and computational electromagnetics.

Course prerequisites

Participants should be able to work on the Unix/Linux command line, have a minimal level of programming skills (Fortran or C), and have a general understanding of computer architecture.

Practical information

Wednesday 3 May 2017, 9am - 6pm

Venue: Multimediaroom, building S9, Campus De Sterre, Krijgslaan 281, 9000 Gent

Introduction to Machine Learning Algorithms

This course offers the basics of analysing datasets with machine learning algorithms and data mining techniques, in order to understand the foundations of learning from large quantities of data. It starts with general methods for data analysis in order to understand clustering, classification, and regression. This includes a thorough discussion of the training, validation, and test datasets required to learn from data with high accuracy. Simple application examples will reinforce the theoretical course elements and illustrate problems such as overfitting, followed by mechanisms such as validation and regularisation that prevent such problems.
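The three dataset roles mentioned above can be illustrated with a minimal sketch. The function name `split_dataset`, its parameters and the 60/20/20 proportions are assumptions made for this illustration and are not taken from the course material.

```python
import random


def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle samples and split them into training, validation and test sets.

    Hypothetical helper: the fractions are illustrative defaults only.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(samples)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]                 # used to fit the model
    val = shuffled[n_train:n_train + n_val]    # used to tune it, e.g. regularisation
    test = shuffled[n_train + n_val:]          # held back for the final evaluation
    return train, val, test


train, val, test = split_dataset(range(10))
print(len(train), len(val), len(test))
```

Keeping the test set untouched until the very end is what makes the final accuracy estimate trustworthy; the validation set absorbs all the tuning decisions.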

The course will start from a very simple application example in order to teach foundations like the role of features in data, linear separability, and decision boundaries for machine learning models. In particular, this course will point to key challenges in analysing large quantities of data (aka ‘big data’) in order to motivate the use of parallel and scalable machine learning algorithms. Hands-on exercises allow the participants to immediately put the newly acquired skills into practice. After this course, participants will have a general understanding of how to approach data analysis problems in a systematic way, including knowledge of where parallel computing provides benefits.
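As a rough illustration of features, linear separability and decision boundaries, the sketch below trains a perceptron on a small two-feature dataset. The dataset, the function names and the choice of the perceptron are assumptions for this example; the course description does not say which model or data is used.

```python
def train_perceptron(points, labels, max_epochs=100):
    """Learn weights w and bias b so that sign(w . x + b) matches the labels."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified, or exactly on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # every point is on the correct side: stop
            break
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the decision boundary."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Hypothetical toy data: two features per sample, labels +1 / -1.
points = [(2, 3), (3, 3), (3, 4), (0, 0), (1, 0), (0, 1)]
labels = [1, 1, 1, -1, -1, -1]

weights, bias = train_perceptron(points, labels)
print(weights, bias)
```

The learned decision boundary is the line where `weights[0] * x1 + weights[1] * x2 + bias` equals zero. The loop only terminates early because this toy dataset is linearly separable; on non-separable data it would simply run for `max_epochs`.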

About the lecturer: Prof. Dr.-Ing. Morris Riedel received his PhD from the Karlsruhe Institute of Technology (KIT) and heads the ‘high productivity data processing’ research group at the Juelich Supercomputing Centre (JSC) in Germany. As an adjunct associate professor at the School of Natural Sciences and Engineering of the University of Iceland, he teaches ‘High Performance Computing’, ‘Cloud Computing and Big Data’, and ‘Statistical Data Mining’, all of which lie at the intersection of parallel computing and machine learning. He has given tutorials like this course on numerous occasions, for example at the Barcelona Supercomputing Centre, the Smart Data Innovation Conference, and the PRACE Spring School in Cyprus. His research interests are parallel and scalable machine learning and data science. (More info at http://www.morrisriedel.de )

Course prerequisites

Participants should be able to work on the Unix/Linux command line, have a basic level of understanding of batch scripts required for HPC application submissions, and have a minimal knowledge of probability, statistics, and linear algebra.

Practical information

Thursday-Friday November 23-24 2017, 10am - 5pm

Venue: Multimediaroom, building S9, Campus De Sterre, Krijgslaan 281, 9000 Gent

Course materials

Recordings are available at https://www.youtube.com/playlist?list=PLrmNhuZo9sgbcWtMGN0i6G9HEvh08JG0J

Slides day 1:
Slides day 2:
Datasets and scripts are available on the HPC-UGent infrastructure at /apps/gent/tutorials/machine_learning

Deep Learning using a Convolutional Neural Network

This course focuses on deep learning, a recent machine learning method that has emerged as a promising disruptive approach, allowing knowledge discovery from large datasets with unprecedented effectiveness and efficiency. It is particularly relevant in research areas that are not accessible through the modelling and simulation often performed in HPC. Traditional learning, which was introduced in the 1950s and became a data-driven paradigm in the 90s, is usually based on an iterative process of feature engineering, learning, and modelling. Although successful on many tasks, the resulting models are often hard to transfer to other datasets and research areas.

This course provides an introduction to deep learning and its inherent ability to derive optimal and often quite generic problem representations from the data (aka ‘feature learning’). Concrete architectures such as Convolutional Neural Networks (CNNs) will be applied to real application datasets using well-known deep learning frameworks such as TensorFlow, Keras, or Torch. As the learning process with CNNs is extremely computationally intensive, the course will cover how parallel computing can be leveraged to speed up the learning process using general-purpose computing on graphics processing units (GPGPUs). Hands-on exercises allow the participants to immediately put the newly acquired skills into practice. After this course, participants will have a general understanding of which problems CNN learning architectures are useful for, and of how parallel and scalable computing facilitates the learning process when facing big datasets.
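The core operation of a CNN layer can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function name `conv2d_valid`, the 3x3 input and the 2x2 kernel are made up for this example, and the frameworks named above implement the same idea far more efficiently, typically on GPUs.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding ('valid' mode).

    As in common deep learning frameworks, the kernel is not flipped,
    so strictly speaking this computes a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Weighted sum of the image patch currently under the kernel.
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(k_rows)
                           for dj in range(k_cols)))
        output.append(row)
    return output


# Hypothetical input and kernel; in a CNN the kernel values are learned.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

Every output element is computed independently of the others, which is why this operation maps well onto the many parallel cores of a GPU.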

Course prerequisites

Practical information

Thursday-Friday November 30-December 1 2017, 10am - 5pm

Venue: Multimediaroom, building S9, Campus De Sterre, Krijgslaan 281, 9000 Gent

Course materials

Recordings are available at https://www.youtube.com/playlist?list=PLrmNhuZo9sgZUdaZ-f6OHK2yFW1kTS2qF
Slides day 1:
Slides day 2:
Datasets and scripts are available on the HPC-UGent infrastructure at /apps/gent/tutorials/deep_learning

Contact Information

Don't hesitate to contact hpc@ugent.be with any questions.

Alternatively consult your local VSC contact.

Supported by the Flemish Government