Regression Modeling and extensions with R for linguistics


PhD students of the Department of Linguistics of the Doctoral Schools of Ghent University, UA and VUB. Prior knowledge and preparatory readings are required (see below).

Organising & Scientific Committee

Renata Enghels (GLIMS), Department of Linguistics, LW06, Spanish Section. Blandijnberg 2, 9000 Gent.

Timothy Colleman (GLIMS), Department of Linguistics, Dutch Section.

Gert De Sutter (EQTIS), Department of Translation, Interpreting and Communication.

Marlies Jansegers (GLIMS), Department of Linguistics, Spanish Section.

Prof. Dr. Pedro Gras, UAntwerpen

Prof. Dr. An Vande Casteele, VUB

Prof. Dr. Rik Vosters, VUB

Topic and theme

In the last few decades, linguistics has experienced a significant shift from intuition-based approaches towards the use of corpora and empirical methods. This empirical turn has made it necessary to rethink the methodological basis of the discipline, a need that has (partly) been met by introducing statistical modeling and testing into the field. The proposed course on different methods of regression modeling in linguistics fits well within these recent developments.
The objective of this course is to teach PhD students in linguistics how to apply advanced methods of quantitative linguistics to their own data, using the open-source software tool R. Unlike previous courses, which offered an introduction to statistics for linguistics with R (e.g., the doctoral school’s course by Dirk Speelman, 2010-2011), this advanced course focuses on a selection of multivariate, regression-based methods: linear regression, logistic regression and mixed-effects modeling. The course is set up as an interactive seminar: participants can work with their own corpus material and can discuss particular problems with the lecturer in individual sessions.


  • Prof. Dr. Stefan Th. Gries (University of California, Santa Barbara)

Dates and Venue

16-20 January 2017
Location: Blandijnberg 2, 9000 Gent. PC-room Campus Boekentoren - HIKO b.011
Participants can bring their own laptop to work with.


Day 1: Basic data import/export and processing with R
Day 2: (Generalized) linear regression modeling with R
Day 3: Mixed-effects modeling 1
Day 4: Mixed-effects modeling 2
Day 5: Extensions and alternatives

Required readings, previous knowledge and teaching material

Participants are assumed to be familiar with the basic concepts of quantitative linguistics.
Basic knowledge of how to work with the software program R is required.
Participants who will work on their own laptop should make sure they have the following software installed:
o R and the math library
o RStudio
Participants are asked to have read chapters 1 to 4 of the following book before the start of the course: Gries, Stefan Th. 2013. Statistics for linguistics with R. 2nd rev. and ext. ed. Berlin & New York: De Gruyter Mouton.

Registration fee

Free of charge for members of the Ghent University Doctoral School of Arts, Humanities & Law


Number of participants

Maximum 25

Evaluation criteria (doctoral training programme)

100% active participation in all sessions