M10 - High Dimensional Data Analysis

Target audience

This course targets professionals and investigators from all areas who work with high-dimensional data.

Description

Modern high-throughput technologies easily generate data on thousands of variables, for example in health care, genomics, chemometrics, environmental monitoring, web logs and movie ratings. Conventional statistical methods are no longer suited for effectively analysing such high-dimensional data. Multivariate statistical methods may be used, but the dimensionality of the data set is often much larger than the number of (biological) samples. Modern advances in statistical data analysis allow for the appropriate analysis of such data. Methods for the analysis of high-dimensional data rely heavily on multivariate statistical methods. A large part of the course content is therefore devoted to multivariate methods, with a focus on high-dimensional settings and issues. Multivariate statistical analysis covers many methods; this course covers a selection of techniques that, in our experience, are frequently used in industry and research institutes. The course is taught using case studies with applications from different fields (analytical chemistry, ecology, biotechnology, genomics, etc.).


Content

  1. Dimension reduction: Singular Value Decomposition (SVD), Principal Component Analysis (PCA), Multidimensional Scaling (MDS) and biplots for dimension-reduced data visualisation
  2. Sparse SVD and sparse PCA
  3. Prediction with high dimensional predictors: principal component regression; ridge, lasso and elastic net penalised regression methods
  4. Classification (prediction of class membership): (penalised) logistic regression and linear discriminant analysis
  5. Evaluation of prediction models: sensitivity, specificity, ROC curves, mean squared error, cross validation
  6. Clustering
  7. Large-scale hypothesis testing: false discovery rate (FDR), FDR control methods, empirical Bayes (local) FDR control

Course prerequisites

Participants are expected to have a ready working knowledge of basic statistics: data exploration and descriptive statistics, statistical modelling, and inference (linear models, confidence intervals, t-tests, F-tests, ANOVA, chi-squared test), such as covered in the following modules of this year's course program:

  • Module 4 - Drawing Conclusions from Data: an Introduction
  • Module 8 - Exploiting Sources of Variation in your Data: the ANOVA Approach
  • Module 11 - Explaining and Predicting Outcomes with Linear Regression

Exam / Certificate

There is no exam connected to this module. If you attend all classes you will receive a certificate of attendance via e-mail at the end of the course.

Type of course

This is an on-campus course.

Schedule

6 evenings in February: February 6, 8, 13, 15, 20 & 22, 2024, from 5.30 pm to 9.30 pm.

Venue

UGent, Faculty of Science, Campus Sterre, Krijgslaan 281, Ghent. Buildings S1 & S9.

Teacher

Lieven Clement

Prof. Lieven Clement is an Associate Professor of Statistical Genomics at Ghent University. He is an expert in developing statistical methods and open source tools for differential omics data analysis. His lab is built around two strategic research pillars, each connected to an omics domain: (single cell) transcriptomics and proteomics. He is a member of the core team that established a new Master of Science in Bioinformatics at Ghent University and has a track record in teaching statistics, statistical genomics and high-dimensional data analysis to students in the life sciences and statistical data analysis. He also gives short courses in statistics and proteomics data analysis in prominent bioinformatics programmes in Europe (Wellcome Trust Advanced Courses and Gulbenkian Institute Training Programme, amongst others). He is a strong advocate of open and reproducible science and teaching: he aims to empower researchers and students with freely available, user-friendly, operating-system-independent, state-of-the-art bioinformatics tools, and to make all research code, data and teaching materials well documented, open and accessible.

Course material

Access to lecture notes and data files

Fees

The participation fee is 1320 EUR for participants from the private sector. Reduced prices apply to students and staff from non-profit, social profit, and government organizations.

Employment                                       Course fee (€)
Industry, private sector, profession             1320
Nonprofit, government, higher education staff    990
(Doctoral) student, unemployed                   595

Register

Register for this course

UGent PhD students

As a UGent PhD student, you can incorporate this 'specialist course' in your Doctoral Training Program (DTP). To get a refund of the registration fee from your Doctoral School (DS), please follow these strict rules and take the necessary action in time. The deadline to open a dossier on the DS website (Application for Registration) for this course is January 4, 2024.

Opening a dossier with your DS does not mean that you are enrolled for the course with our academy. You still need to register on the site.
It is you or your department that pays the fee first to our academy. The Doctoral School refunds that fee to you or your department once the course has ended.
Please note that it is not obligatory to participate or succeed in the exam.

KMO-portefeuille

Information on "KMO-portefeuille": https://www.ugent.be/nl/opleidingen/levenslang-leren/kmo