Module 4: Extensions to the Design and Analysis of Case-Control Studies: Part I + II


Type of course

This is an online course.


Dates

Part I: three days, Monday 14, Tuesday 15 and Wednesday 16 December 2020, from 9 am to 1 pm and from 2 pm to 5 pm.
Part II: two days, Thursday 17 and Friday 18 December 2020, from 9 am to 1 pm and from 2 pm to 5 pm.
Please note: UGent PhD students who want a refund must open a dossier on the DS website (Application for Registration) by November 13, 2020.


Venue

Online course


Description

The purpose of the course is to enable health researchers to recognise the various extensions of standard epidemiological designs in the published literature, to understand how these represent the underlying population/cohort of interest, and to decide which design is most appropriate/efficient for a given research question. In particular, the course will focus on how the biased sampling in different designs can be overcome by reweighting to “reconstruct” the experience of the whole cohort and produce unbiased estimates. The course will also present extensions to the analysis of standard designs to enable estimation of alternative measures of risk.
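
As a rough illustration of the reweighting idea described above, the following minimal Python sketch weights each sampled person by the inverse of their sampling probability, so that the weighted sample stands in for the full cohort. All numbers are hypothetical and the names (`Group`, `reconstructed_count`, `risk`) are invented for this example; it is not taken from the course material, and the methods taught in the course may differ.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Group:
    """One exposure-by-outcome cell of a case-control sample (hypothetical data)."""
    exposed: bool
    case: bool
    n_sampled: int
    sampling_fraction: Fraction  # probability that a cohort member in this cell was sampled


def reconstructed_count(groups, exposed, case=None):
    """Sum of inverse-probability weights: estimated number of cohort members."""
    return sum(
        g.n_sampled / g.sampling_fraction
        for g in groups
        if g.exposed == exposed and (case is None or g.case == case)
    )


def risk(groups, exposed):
    """Estimated risk in the reconstructed cohort for one exposure level."""
    return reconstructed_count(groups, exposed, case=True) / reconstructed_count(groups, exposed)


# Assumed design: every case is sampled, and 10% of non-cases are sampled as controls.
sample = [
    Group(exposed=True, case=True, n_sampled=100, sampling_fraction=Fraction(1)),
    Group(exposed=False, case=True, n_sampled=200, sampling_fraction=Fraction(1)),
    Group(exposed=True, case=False, n_sampled=190, sampling_fraction=Fraction(1, 10)),
    Group(exposed=False, case=False, n_sampled=780, sampling_fraction=Fraction(1, 10)),
]

# Unweighted analysis of the sample can only give an odds ratio.
crude_or = (100 * 780) / (200 * 190)

# Weighted analysis recovers cohort-level risks and hence a risk ratio.
weighted_rr = risk(sample, exposed=True) / risk(sample, exposed=False)

print(f"crude odds ratio:    {crude_or:.3f}")
print(f"weighted risk ratio: {float(weighted_rr):.3f}")
```

Exact fractions are used for the sampling probabilities so that the reconstructed counts are not affected by floating-point rounding.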

Part I

Day 1: Disease occurrence and risk; classical sampling designs (review); Tests of association (RR, OR); Confounding, stratification, matching; Tests of homogeneity and trend; From tables to logistic regression models: prospective/retrospective data, interpreting parameters; Conditional logistic regression for matched data

Day 2: Missing data; two-stage studies, weighted logistic; recognising 2-stage data; potential in own research; Reusing case-control data for a new outcome; potential in own research; Introducing time: survival analysis; nested case-control (NCC) design; Cox, logistic and conditional logistic regression.

Day 3: Case-cohort design; matched nested case-control and matched case-cohort, comparison of risk sets; implementation of case-cohort analysis; Breaking the matching in nested case-control data, re-weighting to obtain absolute risk.

Part II

Day 4: Estimates available from standard designs: RR estimates from “quasi cohort” analysis; RR from logistic regression by “doubling the cases”; time-dependent exposure effects from nested case-control data. Optimal design of two-stage studies; Reusing controls from a nested case-control study.

Day 5: Extensions to nested case-control design: counter-matched, anti-matched, exposure-enriched controls, exposure-density sampling; sub-selecting controls. Extensions to analysis: complex exposure; combining matched and unmatched data; clustered two-stage data.


All lectures will be interspersed with tutorials consisting of appraisal of published studies, simple exercises using a calculator and an online statistical calculator, and computer exercises using statistical software (R and Stata). During the course, participants will develop and refine a study design or approach to analysis to address a clinical/epidemiological research question in their own work, which will be presented and discussed on the final day.

After completing this course you as a student are expected to be able to:

  • select a suitable epidemiological design for addressing a specified research question and justify the choice of design compared to other options
  • compare the risk estimates obtained by different sampling strategies from the same underlying cohort and interpret these estimates for common designs
  • compare and contrast the purpose of time-matching and confounder-matching in (nested) case-control studies, and generalise the resulting risk sets to a wide range of standard and non-standard designs
  • compute weights that enable the reconstruction of an underlying cohort from a (nested) case-control sample and recognise that two-stage designs, re-use of case-control data, and extended/extreme case-control designs can all be analysed using appropriate weights to reflect the sampling
  • discuss the designs of published studies with particular attention to the choice of controls and devise more efficient alternatives

Target audience

This course is aimed at health researchers, graduate students or postdocs in epidemiology, biostatistics or other health research areas where the focus is on studying health outcomes in a population or well-defined cohort.  


Exam

There is no exam connected to this module. Participants receive a certificate of attendance via e-mail at the end of the course.

Incorporation in DTP and reimbursement from DS for UGent PhD students

As a UGent PhD student, to be able to incorporate this 'specialist course' in your Doctoral Training Program (DTP) and get a refund of the registration fee from your Doctoral School (DS) you need to follow strict rules: please take the necessary action in time. The deadline to open a dossier on the DS website (Application for Registration) for this course is November 13, 2020. Please note that opening a dossier does not mean that you are enrolled. You still need to enrol via the registration form on this site.

Course prerequisites

Part I

Participants need some familiarity with logistic regression and simple Cox regression (knowledge of competing events, for example, is not required). Skills in Stata or R are also necessary for the exercises. Participants who are highly skilled in other software (e.g. SAS) are welcome to try to implement the methods, but no support will be offered.

Part II

In addition to the prerequisites for Part I, participants need to be familiar with the topics discussed in Part I of this course.

Preparation for the course

Before the start of the course you will get access to a website with all the lectures and reading material, together with an "exercise 0" that you are asked to complete after going through the reading material. You need to upload this exercise to a website at least a couple of days before the course starts.


Teachers

Marie Reilly is Professor of Biostatistics at the Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm. Her methodological research interests include: development of methods for extending the designs and analysis of epidemiological studies, including re-use of nested case-control data, extensions to case-control type designs, and rank-order logit models for continuous outcomes; development of statistical methodology to analyze Elispot responses, including estimation of individual effects from pooled assays; development of software tools for exploring flow cytometry data.
Applied research work: Perinatal epidemiology - prevalence, timing and effect of alloimmunisation with specific RBC antibodies; studies of preeclampsia, pregnancy-associated VTE and various adverse pregnancy outcomes. HIV studies - Phase I/II randomized clinical trial of candidate HIV vaccines in infants and adults; studies of the effect of maternal antiretroviral treatment on infant immunity.
Marie Reilly is responsible for the design and delivery of biostatistics education to undergraduate medical students at Karolinska Institutet. She lectures to 3rd-year and final-year thesis students, and presents courses on extended design and analysis of epidemiological studies at Karolinska Institutet and overseas.

Dr. Jay Achar graduated as a medical doctor from University College London and completed his specialist training in Internal Medicine and Infectious Diseases in Melbourne, Australia. His MSc in Epidemiology from the London School of Hygiene and Tropical Medicine focused on the use of advanced statistical methods, which he has applied to a wide range of infectious disease data sets across the developing world. His current research interests focus on drug-resistant tuberculosis, particularly in the countries of the Former Soviet Union. He supports clinical and research training organized by the WHO, European universities and national programmes. Since May 2020, he has been working part-time as a research assistant with Professor Reilly in the development of illustrative examples and software tools for the analysis of data from extended epidemiological designs.

Ana Lam has a Master of Public Health from the University of Edinburgh and is currently a PhD student at the University of St. Andrews and the Max Planck Institute of Demographic Research. Her research interests include the social and structural determinants of health, social protection, health economics, and infectious disease modelling. She has spent the last two years as a research assistant with Professor Marie Reilly at Karolinska Institutet where she has helped organise, develop, and create examples and exercises for a textbook on extensions to epidemiological study design and analyses. She has experience using both R and Stata to conduct analyses related to nested case-control, case-cohort, and two-stage data. She was a teaching assistant for a previous version of this course held in March 2020, and has also lectured in the Applied Epidemiology I course for Master of Public Health students at Karolinska Institutet.

Dr Yilin Ning has a PhD in Biostatistics and Epidemiology from National University of Singapore, where she currently works as a Research Assistant in the Saw Swee Hock School of Public Health, on collaborative projects with Professor Reilly at Karolinska Institutet. Her research concerns the extension and application of regression models for robust analyses of epidemiological and clinical data, and her work has been published in Statistical Methods in Medical Research, American Journal of Epidemiology and Lancet Child and Adolescent Health. She is experienced with statistical programming in R and has published two packages on CRAN. In 2017, she was a tutor for a previous version of this course held in Italy. In 2018 and 2019 she delivered short workshops on data cleaning and analysis using R. In 2020 she assisted as a tutor in a course on advanced epidemiology, where she prepared answers, R scripts and an interactive web-based application for the exercises in the course.

Course material

Participants will receive all lectures and further reading material beforehand, together with an "exercise 0" to be completed after reading the material and uploaded to a website at least a couple of days before the course starts.


Fees

A different price applies, depending on your main type of employment.

Employment                                             Module 4, part I   Module 4, part II
Industry/Private sector (1)                            1110               740
Non-profit, government, university outside AUGent (2)  945                630
(Doctoral) student outside AUGent (2)                  425                285

(1) If three or more employees from the same company enrol simultaneously for this course, a 10% reduction on the module price applies.

(2) AUGent staff and AUGent doctoral students who pay via an SAP internal order/invoice can participate at these special prices.

Enrol for this course