M13 - Upgrade your Python Skills: Data Wrangling & Plotting


Type of course

This is an on-campus course with blended learning options.

Dates

5 Monday and Thursday evenings in March 2022: March 7, 10, 14, 17 and 24, 2022, from 5.30 pm to 9 pm.
Please note: The deadline for UGent PhD students who want a refund to open a dossier on the DS website (Application for Recognition) is February 7, 2022.

Venue

 To be confirmed

Description

Handling data is a recurring task for data analysts. Reading in experimental data, checking its properties and creating visualisations can become tedious, so making this process more efficient benefits many professionals who work with data. Spreadsheet-based software does not support this process well because it lacks automation and repeatability. A high-level scripting language such as Python is well suited to these tasks.

This course trains participants to use Python effectively for these tasks. It focuses on manipulating and cleaning tabular data, exploratory analysis and visualisation, using important packages such as Pandas, NumPy, Matplotlib and Seaborn.

After setting up the programming environment with the required packages using the conda package manager and an introduction to the Jupyter notebook environment, the data analysis package Pandas and the plotting packages Matplotlib and Seaborn are introduced. Advanced usage of Pandas for different data cleaning and manipulation tasks is taught, and the acquired skills are immediately put into practice on real-world data sets. Applications include time series handling, categorical data, merging data and geospatial data, among others.
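
As a rough illustration of the kind of task described above (not taken from the course material), the sketch below cleans a small table, merges it with metadata, converts a column to a categorical type and computes daily means with Pandas. The station names, timestamps, values and column names are all made up for the example.

```python
import pandas as pd

# Hypothetical measurements: one row per station and timestamp.
measurements = pd.DataFrame(
    {
        "station": ["A", "A", "A", "B", "B", "B"],
        "time": pd.to_datetime(
            [
                "2022-03-07 08:00",
                "2022-03-07 20:00",
                "2022-03-08 08:00",
                "2022-03-07 08:00",
                "2022-03-07 20:00",
                "2022-03-08 08:00",
            ]
        ),
        "value": [10.0, 14.0, 12.0, 20.0, None, 26.0],
    }
)

# Hypothetical station metadata to merge in.
stations = pd.DataFrame({"station": ["A", "B"], "zone": ["urban", "rural"]})

# Data cleaning: drop rows with a missing measurement.
clean = measurements.dropna(subset=["value"])

# Merging data: attach the zone of each station.
merged = clean.merge(stations, on="station", how="left")

# Categorical data: store the zone as a categorical column.
merged["zone"] = merged["zone"].astype("category")

# Time series handling: daily mean per station.
daily = merged.groupby(["station", pd.Grouper(key="time", freq="D")])["value"].mean()
print(daily)
```

Because every step is written down in a script, the same cleaning and aggregation can be rerun unchanged when new data arrive, which is the repeatability that spreadsheets lack.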

The course closes with a discussion of the scientific Python ecosystem and the visualisation landscape, teaching participants to create interactive charts.

The course does not cover statistics, data mining, machine learning, or predictive modelling. It aims to provide participants with the means to effectively tackle commonly encountered data handling tasks and so increase their overall efficiency. These skills are useful for both data cleaning and feature engineering.

All sessions are hands-on in Jupyter notebooks.

Target audience

The course is intended for professionals who wish to enhance their general data manipulation and visualisation skills in Python, with a specific focus on tabular data.

Exam

There is no exam connected to this module. Participants receive a certificate of attendance via e-mail at the end of the course.

Incorporation in DTP and reimbursement from DS for UGent PhD students

As a UGent PhD student, to be able to incorporate this course in your Doctoral Training Program (DTP) and get a reimbursement of the registration fee from your Doctoral School (DS), you need to follow strict rules: please take the necessary action in time. The deadline to open a dossier on the DS website (Application for Recognition) for this course is February 7, 2022. Please note that opening a dossier does not mean that you are enrolled. You still need to enrol via the registration form on this site.

Course prerequisites

Basic programming skills are required. A basic (scientific) programming course should suffice. For those who have experience in another programming language (e.g. Matlab, R, ...), following a Python tutorial prior to the course is strongly recommended. A good introduction is the ‘Python language introduction’ section of the Scipy lecture notes: https://scipy-lectures.org/intro/language/python_language.html.

Teachers

Dr. Joris Van den Bossche is a core contributor to Pandas, the main data analysis library in Python, and has given several tutorials on this topic at international conferences (SciPy, EuroSciPy, PyData Paris). He did a PhD at Ghent University (Faculty of Bio-science Engineering, Department of Mathematical Modelling, Statistics and Bio-informatics) and VITO in air quality research, worked at the Paris-Saclay Center for Data Science, and is currently a freelance software developer and teacher.


Dr. Stijn Van Hoey is currently working as a Research Software Engineer at Fluves, an engineering company operating in water and energy markets, combined with freelance teaching. Before that, he was a Research Software Engineer and Open Data Publisher at INBO, supporting and automating the cleaning and publishing of data. Formerly, he did a PhD at Ghent University (Faculty of Bio-science Engineering, Department of Mathematical Modelling, Statistics and Bio-informatics) in collaboration with VITO and was a teaching assistant for courses on modelling and simulation of environmental systems, process control and scientific programming.

Course material

Files and data will be made available via GitHub. A GitHub repository with instructions will be available before the course starts.

Fees

A different price applies, depending on your main type of employment.

Employment                                          Module 13
Industry/Private sector (1)                         600
Non-profit, government, higher education staff (2)  450
(Doctoral) students, retired, unemployed (2)        205

(1) If two or more employees from the same company enrol simultaneously for this course, a reduction of 20% on the module price applies from the second enrolment onwards.

(2) UGent staff and UGent doctoral students who pay internally via SAP or internal transfer can participate at these special rates.

Enrol for this course