Multimedia Content Analysis

The past few years have witnessed a significant increase in both the consumption and the availability of online multimedia content. This can be attributed to the introduction of easy-to-use devices and online services, the availability of cheap storage and bandwidth, and a growing number of people going online. As of January 2015, for example, people watched six billion hours of video on YouTube each month and uploaded 300 hours of video each minute. Similarly, 300 to 500 million microposts were created on Twitter every day. These statistics point to the problem of multimedia content overload (infobesity): our ability to manage and consume multimedia content cannot keep up with our ability to create it.

Our main research objective is to narrow the gap between multimedia content creation on the one hand and multimedia content management and consumption on the other. To that end, we focus on the development of novel techniques for machine-based understanding of both textual and visual content, paying particular attention to the use of deep learning. The term ‘deep learning’ came into wide use around 2006, and refers to machine learning algorithms that stack multiple non-linear layers to construct feature hierarchies, typically through the use of artificial neural networks.
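As a rough illustration of what "multiple non-linear layers" means, the sketch below passes an input vector through a small stack of layers, each applying an affine map followed by a tanh non-linearity. It is a minimal, untrained example in plain Python, not the models used in our research: the layer sizes, the random weights, and the helper names (`init_layer`, `dense`, `forward`) are all assumptions chosen for illustration.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create random, untrained weights and zero biases for one layer (hypothetical helper)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases):
    """Apply an affine map followed by a tanh non-linearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Return the activations of every layer, lowest level first."""
    activations = []
    current = inputs
    for weights, biases in layers:
        current = dense(current, weights, biases)
        activations.append(current)
    return activations


# Illustrative sizes only: an 8-dimensional input and three stacked layers.
rng = random.Random(0)
layer_sizes = [8, 6, 4, 2]
layers = [
    init_layer(n_in, n_out, rng)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]

features = forward([0.5] * 8, layers)
for depth, activation in enumerate(features, start=1):
    print(f"layer {depth}: {len(activation)} features")
```

Each entry of `features` is computed from the layer below it, which is the sense in which the layers form a hierarchy. In a trained network the weights would be learned from data rather than drawn at random.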

Specifically, our research efforts focus on the following topics:

Machine learning for modeling and understanding of natural language
Natural language processing of noisy and short-form text, such as status updates on social networks
Machine learning for modeling and understanding of visual data, such as news broadcast video clips and short-form noisy video clips

Staff

Wesley De Neve, Azarakhsh Jalalvand, Joni Dambre.

Researchers

Fréderic Godin, Baptist Vandersmissen.

Projects

MiX-ICON STEAMER: Smart Text Enrichment Algorithms for MEdia Retrieval applications

MiX-ICON Audience Measurement: Research into new measurement protocols and multi-platform media consumption

ICON iRead+: The Intelligent Reading Companion

Key publications

Baptist Vandersmissen, Lucas Sterckx, Thomas Demeester, Azarakhsh Jalalvand, Wesley De Neve and Rik Van de Walle (2016). An automated end-to-end pipeline for fine-grained video annotation using deep neural networks. International Conference on Multimedia Retrieval. p. 409-412

Fréderic Godin, Wesley De Neve and Rik Van de Walle (2015). Part-of-speech tagging of Twitter microposts only using distributed word representations and a neural network. Computational Linguistics in the Netherlands (CLIN 2015). p. 45

Fréderic Godin, Baptist Vandersmissen, Wesley De Neve and Rik Van de Walle (2015). Multimedia Lab @ ACL W-NUT NER shared task: named entity recognition for Twitter microposts using distributed word representations. ACL 2015 Workshop on Noisy User-generated Text. p. 146-153

Hans Paulussen, Francisco Capdevila, Pedro Debevere, Maribel M. Perez, Martin Vanbrabant, Wesley De Neve and Stefan De Wannemacker (2014). Building an NLP pipeline within a digital publishing workflow. Computational Linguistics in the Netherlands Journal. p. 71-84

Fréderic Godin, Jasper Zuallaert, Baptist Vandersmissen, Wesley De Neve and Rik Van de Walle (2014). Beating the bookmakers: leveraging statistics and Twitter microposts for predicting soccer results. Workshop on Large-Scale Sports Analytics

Baptist Vandersmissen, Fréderic Godin, Abhineshwar Tomar, Wesley De Neve and Rik Van de Walle (2014). The rise of mobile and social short-form video: an in-depth measurement study of Vine. CEUR workshop proceedings. 1198. p. 1-10

Fréderic Godin, Baptist Vandersmissen, Azarakhsh Jalalvand, Wesley De Neve and Rik Van de Walle (2014). Alleviating manual feature engineering for part-of-speech tagging of Twitter microposts using distributed word representations. NIPS Workshop on Modern Machine Learning and Natural Language Processing

Visualization of a word vector space of the 1000 most frequent words on Twitter. During training, the model learned which words are similar to each other. For example, the word “the” is similar to the slang word “da” on Twitter.
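Similarity in such a word vector space is commonly measured with the cosine of the angle between two vectors. The sketch below shows that computation on hand-made three-dimensional toy vectors. The values in `toy_vectors` are invented for illustration and are not taken from the trained model, and `most_similar` is a hypothetical helper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors, invented for illustration; a real model learns
# vectors with many more dimensions from large amounts of text.
toy_vectors = {
    "the": [0.9, 0.1, 0.0],
    "da": [0.8, 0.2, 0.1],
    "goal": [0.0, 0.3, 0.9],
}


def most_similar(word, vectors):
    """Return the other word whose vector has the highest cosine similarity."""
    return max(
        (other for other in vectors if other != word),
        key=lambda other: cosine_similarity(vectors[word], vectors[other]),
    )


print(most_similar("the", toy_vectors))
```

With these toy values, the nearest neighbour of "the" is "da", mirroring the example in the caption.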

Visualization of the automatic detection of products in a YouTube video clip and linkage to a retailer website. Machine learning models learn to recognize and link products in video clips to facilitate e-commerce.