M19 - Machine Learning with Python


Type of course

 This is an on-campus course, with blended learning options.

Dates

Seven Monday evenings in April, May and June 2022, from 5.30 pm to 9 pm: April 25; May 2, 9, 16, 23 and 30; and June 13.
Please note: UGent PhD students who want a refund must open a dossier on the DS website (Application for Recognition) by March 25, 2022.

Venue

 To be confirmed

Description

Modern digital applications increasingly rely on machine learning to derive predictive power from high-dimensional data sets. Compared to traditional statistics, the absence of a focus on scientific hypotheses and the need to easily leverage detailed signals in the data call for a different set of models, tools, and analytical reflexes.

This course aims to bring participants to the level where they can independently tackle the analytical part of data mining projects. The most common types of projects will be addressed: regression with continuous outcomes, classification with categorical outcomes, and clustering. For each of these, the practical use of a set of standard methods will be shown, such as Random Forests, Gradient Boosting Machines, Support Vector Machines, k-Nearest Neighbors and k-means. Furthermore, throughout the course, concepts that are of concern in every statistical learning application will be highlighted, such as the curse of dimensionality, model capacity, overfitting and regularization, and practical strategies will be offered to deal with them, introducing techniques such as the Lasso and ridge regression, cross-validation, bagging and boosting. Instruction will also be given on a selection of specific techniques that are often of interest, such as modern visualization of high-dimensional data, model calibration, outlier detection using isolation forests, and explanation of black-box models. Finally, the last lecture will introduce the idea of deep learning as a powerful tool for data analysis, discussing when and how to use it in practice, and when to shy away from it.

    Target audience

    This course targets professionals and investigators from all areas who are involved in predictive modeling based on large and/or high-dimensional databases.

    Exam

    Participants can, if they wish, take part in an exam. Those who pass receive a certificate from Ghent University. The exam consists of an individual take-home project.

    Incorporation in DTP and reimbursement from DS for UGent PhD students

    As a UGent PhD student, to be able to incorporate this course in your Doctoral Training Program (DTP) and get a reimbursement of the registration fee from your Doctoral School (DS) you need to follow strict rules: please take the necessary action in time. The deadline to open a dossier on the DS website (Application for Recognition) for this course is March 25, 2022.

    Please note: UGent PhD students no longer need to take or pass this exam to be able to incorporate the course in the DTP.

    Course prerequisites

    Participants are expected to be familiar with basic statistical modeling (as taught, for instance, in Module 2 of this program), and to have had a first experience programming in Python (as taught, for instance, in Module 4 of this program).

    Teacher

    As the head of Advanced Analytics and Machine Learning at KPMG, Dr. Bart Van Rompaye leads a group of data scientists applying modern data analytical approaches to a broad range of problems in a wide variety of sectors. Before this, Bart was active for 6 years as a Lead Data Scientist within KBC Group, creating products such as Matti, Indigo (Czech Republic) and the first AI-assisted investment fund in Belgium. He obtained his PhD at Ghent University on issues in survival analysis, and held postdoctoral positions at Ghent University and Umeå University, Sweden. In the past, he has taught numerous courses for the Master in Statistical Data Analysis, the Institute for Continuing Education in Science, and FLAMES, the Flanders Training Network for Methodology and Statistics.

    Course material

    Copies of the slides and Python code notebooks.

    Fees

    A different price applies, depending on your main type of employment.

    Employment                                           Module 19   Exam
    Industry/Private sector (1)                               1320     30
    Non-profit, government, higher education staff (2)         990     30
    (Doctoral) students, retired, unemployed (2)               445     30

    (1) If two or more employees from the same company enrol simultaneously for this course, a 20% reduction on the module price applies from the second enrolment onwards.

    (2) UGent staff and UGent doctoral students who pay internally via SAP or internal transfer can participate at these special prices.

    Enrol for this course