Multimedia Content Analysis

The past few years have witnessed a significant increase in both the availability and the consumption of online multimedia content. This can be attributed to the introduction of easy-to-use devices and online services, the availability of cheap storage and bandwidth, and a steadily growing number of people going online. To give a few statistics: as of January 2015, people watch six billion hours of video on YouTube each month and upload 300 hours of video each minute. Similarly, 300 to 500 million microposts are created on Twitter every day. These statistics illustrate the problem of multimedia content overload (infobesity): our ability to manage and consume multimedia content cannot keep up with our ability to create it.

Our main research objective is to narrow the gap between multimedia content creation on the one hand, and multimedia content management and consumption on the other hand. To that end, we focus on the development of novel techniques for machine-based understanding of both textual and visual content, paying particular attention to the use of deep learning. The term ‘deep learning’ gained wide currency around 2006, and refers to machine learning algorithms that stack multiple non-linear layers to construct feature hierarchies, typically through the use of artificial neural networks.
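To make the notion of stacked non-linear layers concrete, the following is a minimal sketch in plain Python, not a description of the models used in our research. The layer sizes, the tanh non-linearity, the random (untrained) weights, and the helper names `dense_layer`, `random_layer`, and `forward` are all assumptions chosen for illustration.

```python
import math
import random


def dense_layer(inputs, weights, biases):
    """Apply an affine transform followed by a tanh non-linearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: untrained random weights, for illustration only."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(inputs, layers):
    """Pass the input through each layer in turn, keeping every intermediate output."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(dense_layer(activations[-1], weights, biases))
    return activations


rng = random.Random(0)
layer_sizes = [8, 6, 4, 2]  # input width followed by three layer widths (assumed)
layers = [
    random_layer(n_in, n_out, rng)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
raw_input = [rng.uniform(-1.0, 1.0) for _ in range(layer_sizes[0])]
features = forward(raw_input, layers)

for depth, feature in enumerate(features):
    print(f"level {depth}: {len(feature)} values")
```

Each entry of `features` is computed from the one before it, which is the sense in which the layers form a hierarchy. In a trained network the weights would be learned from data rather than drawn at random.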

Specifically, our research efforts focus on the following topics:

  • Machine learning for modeling and understanding of natural language
  • Natural language processing of noisy and short-form text, such as status updates on social networks
  • Machine learning for modeling and understanding of visual data, such as news broadcast video clips and short-form noisy video clips

Staff

Wesley De Neve, Azarakhsh Jalalvand, Joni Dambre.

Researchers

Fréderic Godin, Baptist Vandersmissen.

Projects

  • MiX-ICON STEAMER: Smart Text Enrichment Algorithms for MEdia Retrieval applications
  • MiX-ICON Audience Measurement: Research into new measurement protocols and multi-platform media consumption
  • ICON iRead+: The Intelligent Reading Companion

Key publications

Visualization of a word vector space of the 1000 most frequent words on Twitter. During training, the model learned which words are similar to each other. For example, the word “the” is similar to the slang word “da” on Twitter.

Visualization of the automatic detection of products in a YouTube video clip and linkage to a retailer website. Machine learning models learn to recognize and link products in video clips to facilitate e-commerce.