M8 - High Dimensional Data Analysis
Type of course
Due to the peak in Omicron infections, this course will be offered online only.
Dates
Six Monday and Thursday evenings in February 2022: February 7, 10, 14, 17, 21 and 24, 2022, from 5.30 pm to 9.30 pm.
Please note: the deadline for UGent PhD students who want a refund to open a dossier on the DS website (Application for Registration) is January 7, 2022.
Venue
This is an online course.
Description
Modern high-throughput technologies easily generate data on thousands of variables, e.g. health care data, genomics, chemometrics, environmental monitoring, web logs, movie ratings, …
Conventional statistical methods are no longer suited for effectively analysing such high-dimensional data.
Multivariate statistical methods may be used, but often the dimensionality of the data set is much larger than the number of (biological) samples. Modern advances in statistical data analysis allow for the appropriate analysis of such data.
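To illustrate why having far more variables than samples is a problem for conventional methods, here is a minimal sketch using simulated data. The sizes `n` and `p`, the penalty `lam` and the random data are assumptions chosen for illustration only; they are not taken from the course material. With more variables than samples, the matrix X'X used by ordinary least squares is singular, whereas adding a ridge penalty (one of the methods listed in the course content) makes the system solvable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: far more variables (p) than samples (n).
n, p = 20, 200
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# The rank of X (and hence of X'X) is at most n, so the p x p matrix X'X
# is singular and ordinary least squares has no unique solution.
rank = np.linalg.matrix_rank(X)

# Ridge regression adds lam * I, which makes the matrix positive definite.
lam = 1.0  # hypothetical penalty value
gram = X.T @ X
beta_ridge = np.linalg.solve(gram + lam * np.eye(p), X.T @ y)

print(f"rank of X: {rank} (number of variables: {p})")
print(f"ridge coefficient vector has shape {beta_ridge.shape}")
```

In practice the penalty would be chosen by cross validation rather than fixed in advance.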
Methods for the analysis of high-dimensional data rely heavily on multivariate statistical methods. Therefore, a large part of the course content is devoted to multivariate methods, but with a focus on high-dimensional settings and issues.
Multivariate statistical analysis covers many methods. In this course a selection of techniques is covered based on our experience that they are frequently used in industry and research institutes.
The course is taught using case studies with applications from different fields (analytical chemistry, ecology, biotechnology, genomics, …).
Content:
- Dimension reduction: Singular Value Decomposition (SVD), Principal Component Analysis (PCA), Multidimensional Scaling (MDS) and biplots for dimension-reduced data visualisation
- Sparse SVD and sparse PCA
- Prediction with high dimensional predictors: principal component regression; ridge, lasso and elastic net penalised regression methods
- Classification (prediction of class membership): (penalised) logistic regression and linear discriminant analysis
- Evaluation of prediction models: sensitivity, specificity, ROC curves, mean squared error, cross validation
- Clustering
- Large-scale hypothesis testing: FDR, FDR control methods, empirical Bayes (local) FDR control
Target audience
This course targets professionals and investigators from all areas who work with high-dimensional data.
Exam
Participants can, if they wish, take part in an exam. Those who pass the exam receive a certificate from Ghent University.
The exam consists of a take-home project assignment. Students are required to write a report by a set deadline.
Incorporation in DTP and reimbursement from DS for UGent PhD students
As a UGent PhD student, to be able to incorporate this 'specialist course' in your Doctoral Training Program (DTP) and get a refund of the registration fee from your Doctoral School (DS) you need to follow strict rules: please take the necessary action in time. The deadline to open a dossier on the DS website (Application for Registration) for this course is January 7, 2022. Please note that opening a dossier does not mean that you are enrolled. You still need to enrol via the registration form on this site.
Course prerequisites
Working knowledge of basic statistics: data exploration and descriptive statistics, statistical modelling and inference (linear models, confidence intervals, t-tests, F-tests, ANOVA, chi-squared test), as covered in Module 2, Module 5 and Module 12 of this year's course program.
Teacher
Course material
All material will be available on a GitHub course website.
Book recommendations
- The Elements of Statistical Learning: https://web.stanford.edu/~hastie/Papers/ESLII.pdf
- Introduction to Data Science: https://rafalab.github.io/dsbook/
Fees
Different prices apply, depending on your main type of employment.
Employment | Module 8 | Exam |
---|---|---|
Industry/Private sector¹ | 1110 | 30 |
Non-profit, government, higher education staff² | 835 | 30 |
(Doctoral) students, retired, unemployed² | 375 | 30 |

¹ If two or more employees from the same company enrol simultaneously for this course, a reduction of 20% on the module price is taken into account starting from the second enrolment.
² UGent staff and UGent doctoral students who pay internally via SAP or internal transfer can participate at these special rates.