M2 - Getting Started with R Software for Data Analysis

Target audience

This course targets professionals and investigators from diverse areas with little to no R-programming experience who wish to start using R for their data manipulation, data exploration or statistical analysis.

Description

R is a flexible environment for statistical computing and graphics, and it is becoming increasingly popular as a tool for gaining insight into often complex data. While in some ways similar to other programming languages (such as C, Java and Perl), R is particularly suited for data analysis because ready-made functions are available for a wide variety of statistical techniques (classical statistical tests, linear and nonlinear modeling, time series analysis, classification, clustering, ...) and graphical techniques.

The base R program can be extended with user-submitted packages, which means new techniques are often implemented in R before being available in other software. This is one of the reasons why R is becoming the de facto standard in certain fields such as bioinformatics (Bioconductor) and financial services.

This course introduces the use of the R environment for the implementation of data management, data exploration, basic statistical analysis and automation of procedures.

It starts with a description of the R GUI, the use of the command line and an overview of basic data structures. The application of standard procedures to import data or to export results to external files will be illustrated.

Creation of new variables, subsetting, merging and stacking of data sets will be covered in the data management section. The data will be explored using histograms, box plots, scatter plots, summary statistics, correlation coefficients and cross-tabulations.

Simple statistical procedures that will be covered are:

  • comparisons of observed group means (t-test, ANOVA and their non-parametric versions) and proportions
  • tests for independence in two-way cross tables and linear regression (focusing on the R implementation of the statistical methods that are the subject of other modules of the statistics series)

Finally, installing new packages and automation of analysis procedures will also be discussed.

Practical sessions and specific exercises will be provided to allow participants to practice their R skills in interaction with the teacher.

Course prerequisites

The course is open to all interested persons.

Knowledge of basic statistical concepts and experience with other programming languages are advantages, but they are not required for learning the R language.

Exam / Certificate

There is no exam connected to this module. If you attend all four classes you will receive a certificate of attendance via e-mail at the end of the course.

Micro-credential

This module is part of the micro-credential 'Data Analysis in R: Basics and Beyond' that consists of three modules:

  • Module 2 - Getting Started with R Software for Data Analysis
  • Module 6 - Leverage your R Skills: Data Wrangling & Plotting with Tidyverse
  • Module 7 - Dynamic Report Generation with R Markdown

If you are planning on registering for all three modules, consider enrolling for the micro-credential instead.

Type of course

This is an on-campus course. We offer blended learning options if, exceptionally, you can't attend a session on campus.

Schedule

October 23, 26 and 30 and November 6, 2023, from 5:30 pm to 9 pm.

Venue

Faculty of Science, Campus Sterre, Krijgslaan 281, 9000 Gent, Building S9, 3rd floor, Auditorium 3.

Teacher

Dr. Emmanuel Abatih is a post-doctoral research fellow at Ghent University, where he works as the FLAMES coordinator for UGent and as a statistical consultant for FIRE and DASS. He obtained a PhD in Epidemiology and Biostatistics in 2008 at the University of Copenhagen on the topic “Assessment of the impact of the non-human use of Antimicrobial Agents on the Selection, Transmission and Distribution of Antimicrobial Resistant Bacteria”. He worked for the Institute of Tropical Medicine (ITM) in Antwerp as a post-doctoral statistician on topics including space-time analysis, diagnostic test evaluation, transmission dynamic modeling and risk analysis. He also served as a statistical consultant for the TB, Malaria and Parasitology units of the ITM. He has supervised or co-supervised over 30 master's and 7 PhD students. He has taught courses ranging from general statistics to more specialized areas such as Machine Learning, Causal Inference and Structural Equation Modeling. He has experience with R, Python, SPSS and Stata.

Course material

Access to slides and data files.

Fees

The participation fee is 620 EUR for participants from the private sector. Reduced prices apply to students and staff from non-profit, social profit, and government organizations. The exam fee is 35 EUR.

Employment                                        Fee (€)
Industry, private sector, profession              620
Non-profit, government, higher education staff    465
(Doctoral) students, unemployed                   280

Register

Register for this course

UGent PhD students

As a UGent PhD student you can incorporate this 'transferable skills seminar: research & valorization' in your Doctoral Training Program (DTP). To get a refund of the registration fee from your Doctoral School (DS), please follow these strict rules and take the necessary action in time. The deadline to open a dossier on the DS website (Application for Registration) for this course is September 22, 2023.

Opening a dossier with your DS does not mean that you are enrolled for the course with our academy. You still need to register on the site.
You or your department first pay the fee to our academy. The Doctoral School refunds that fee to you or your department once the course has ended.

KMO-portefeuille

Information on "KMO-portefeuille": https://www.ugent.be/nl/opleidingen/levenslang-leren/kmo