Presentations by Todd Gamblin (LLNL) & Georg Rath (NERSC)

Computer scientists from renowned HPC centers present Spack, Slurm and Performance Analysis tools

Date

Friday 2 February 2018 from 10:30 to 17:00

Agenda

Topics covered

[10:30 - 12:00] The Spack Project: State of the community and roadmap (Todd Gamblin, LLNL)

Spack (https://spack.io) is a flexible package manager for HPC that allows users to install packages built with multiple versions, compilers, build configurations, and dependency versions. Spack is targeted both at HPC cluster administrators and at HPC developers, who frequently need to build many versions of complex applications in user space. Spack has a rapidly growing community at DOE laboratories in the US as well as at HPC centers, in industry, and in academia. This talk will give a brief overview of Spack and its dependency and installation model, and will cover ongoing efforts in the community, such as binary packaging, better dependency resolution, and finer-grained support for architectures like ARM.

(slides - recording)

 

[14:30 - 16:00] Performance Analysis is Data Science (Todd Gamblin, LLNL)

Understanding the performance of a large HPC facility is very complex. Job runtimes can vary from run to run, and performance depends on many factors, including network congestion, filesystem contention, and application input parameters. Traditional performance tools allow easy analysis of a single run, but tools that can look deeply into the performance of applications across the center are rare. This talk will cover a number of efforts at Livermore Computing to do HPC center-wide performance analysis. We will discuss LLNL’s ongoing Sonar project, which aims to set up a performance cluster; several efforts to analyze and tune applications using performance data from the center; and the recent center-wide deployment of JupyterHub for data analysis, along with how we plan to use it for HPC center performance data.

(slides - recording)

 

[16:00 - 17:00] Slurm in Action: Batch Processing for the 21st Century (Georg Rath, NERSC)

This talk will give an overview of how we use Slurm to schedule the workloads of more than 6000 scientists at NERSC, while providing high throughput, ease of use, and ultimately user satisfaction.

With the emergence of data-intensive applications, it became necessary to update the classic scheduling infrastructure to handle things like user-defined software stacks (read: containers), data movement, and storage provisioning. We did all of this and more through facilities provided by Slurm. In addition to these features, we will discuss priority management and quality of service, and how they can greatly improve the user experience of computational infrastructures.

(recording)

Venue

Multimediaroom, building S9, Campus De Sterre, Krijgslaan 281, 9000 Gent

A group lunch will be provided, along with water & coffee/tea, for all registered attendees (free of charge).

Organization and lecturers

Workshop organized by HPC-UGent

Lecturers are:

Todd Gamblin

Todd Gamblin is a computer scientist in the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory (LLNL). His research focuses on scalable tools for measuring, analyzing, and visualizing performance data from massively parallel applications. Todd is also involved with many production projects at LLNL. He works with Livermore Computing’s Development Environment Group to build tools that allow users to deploy, run, debug, and optimize their software for machines with million-way concurrency. He is also the lead developer of Spack (https://spack.io).

Georg Rath

Georg Rath works as a systems engineer at the National Energy Research Scientific Computing Center (NERSC). He is the developer of OGRT (https://github.com/georg-rath/ogrt), a tool designed to track user processes on an HPC cluster.

Registration

Follow this link to register for these lectures: