Tom Paridaens

(18-12-2018) The doctoral work of Tom Paridaens, former researcher at the ELIS research group IDLab, makes it easier for scientists to get started with DNA sequences.

Genomics research, used for example to find the cause and course of diseases such as cancer or to examine the effects and side effects of new medicines, used to be very expensive and time-consuming. This was because researchers looked not only at each individual gene but also at how the genes relate to one another, so a very large amount of data had to be analyzed.

In the past decades, enormous progress has been made in the technologies used for DNA sequencing. The cost of sequencing a human genome has dropped from more than $10 million in 2007 to less than $1,000 now. At the same time, the time required to sequence a human genome has dropped from a few years to several hours. As a result of these developments, the amount of generated data is rising exponentially. In his doctoral thesis, Tom described various solutions for the effective storage of these data. By also dividing the data according to location and exact meaning, the effort of downloading, decompressing and analyzing the data can be reduced to a quarter. In addition to compression effectiveness, emphasis is also placed on usability and functionality.

His research formed a basis for the MPEG-G standard for the representation, compression and management of genomic data, developed by MPEG (Moving Picture Experts Group).

Tom visited the Carl R. Woese Institute for Genomic Biology Research this autumn to continue work on an open-source MPEG-G encoder, the identification and design of applications for MPEG-G, and the further development of the MPEG-G standard.