
Multimedia Lab @ Ghent University - iMinds - Organizational Overview & Outline Research Activities


  1. ELIS – Multimedia Lab. Multimedia Lab @ Ghent University - iMinds: Organizational Overview & Outline Research Activities. Research Seminar, KAIST, 1 August 2014. Wesley De Neve (@wmdeneve), Ghent University – iMinds & KAIST
  2. Outline • Organizational overview (15 minutes) - Ghent University - iMinds - Multimedia Lab • Outline research activities (45 minutes) - social media analysis - visual content understanding - deep machine learning
  4. Ghent University (1/3) • A Dutch-speaking public university - located in Ghent, Belgium - established in 1817 [Map labels: Ghent, Brussels]
  5. Ghent University (2/3) • Has 38,000 students and 8,000 staff members - about 4,000 foreign students and 800 foreign staff members • Comprises eleven faculties, composed of more than 130 departments - campus buildings distributed all over the city [Photos: Congress Center ‘Het Pand’; Faculty of Engineering and Architecture; Aula Academica]
  6. Ghent University (3/3) • Ghent University Global Campus in Songdo - offers academic programs in molecular biotechnology, environmental technology, and food technology - operates together with the State University of New York (SUNY), George Mason University, and the University of Utah [Photos: Songdo Global University Campus; visit to Samsung Biologics]
  7. Outline • Organizational overview - Ghent University - iMinds - Multimedia Lab • Outline research activities - social media analysis - visual content understanding - deep machine learning
  8. iMinds • Research institute founded in 2004 by the Flemish government, with the aim of creating lasting economic and social value through ICT innovation
  9. iMinds: A Virtual Research Institute • Leverages the strengths of 5 universities, 20 research groups, and more than 850 researchers
  10. iMinds’ Research Departments [Diagram labels: ICT, Media, Health, Energy, Smart Cities, Manufacturing; Internet Technologies, Multimedia Technologies, Security, Medical Information Technologies, Digital Society]
  11. From Idea to Business: The iMinds Innovation Toolbox [Diagram labels: time-to-market from 5+ years to …1 year; strategic research, incubation & entrepreneurship, applied research, pre-competitive testing; knowledge-driven, explorative, basics for applied research; training & coaching, financing, facilities, networking, internationalization; business-driven, interdisciplinary, cooperative, demand-driven; proof of concept, ICON projects, large-scale user trials & living labs, evaluation of technical feasibility, simulations]
  12. iMinds ICON: Example Projects • iRead+ – The intelligent reading companion - January 2012 to December 2013 - finished project that built a text analysis pipeline for enriching digital news articles in Dutch and French with links to Wikipedia, dictionary definitions, and images • GiPA – Generic platform for augmented reality - January 2014 to December 2015 - aims at building an interoperable platform for augmented reality applications, ranging from games to simulations, addressing diverse requirements, from capturing to rendering
  13. Outline • Organizational overview - Ghent University - iMinds - Multimedia Lab • Outline research activities - social media analysis - visual content understanding - deep machine learning
  14. People (Speech Lab excluded) • Staff - Rik Van de Walle – senior full professor, head of MMLab - Peter Lambert – associate professor - Piet Verhoeve – guest lecturer (ICON program manager at iMinds) - Erik Mannens, Jan De Cock & Wesley De Neve – research management - Ellen Lammens & Laura Smekens – administrative management • 35 researchers - 50% PhD students • Miscellaneous - about 15 master’s thesis students per year - a few summer internships each year
  15. Research Activities (1/2) • Cluster 1: Video Coding (Jan De Cock) - compression and transport of video - transcoding and scalable coding - high-dynamic range video • Cluster 2: Game Tech & Graphics (Peter Lambert) - augmented and virtual reality - texture and mesh compression - path planning
  16. Research Activities (2/2) • Cluster 3: Semantic Web (SWTF; Erik Mannens) - multimedia and interactivity on the Web - knowledge representation and reasoning - (big) data analytics and visualization - digital publishing • Cluster 4: Social & Visual Intelligence (SaVI; Wesley De Neve) - social media analysis - visual content analysis - machine learning
  17. Teaching Activities • Bachelor/Master Computer Science and Bachelor/Master Electronics (Faculty of Engineering and Architecture) - Multimedia Techniques - Design of Multimedia Applications - Advanced Multimedia Applications • Bachelor Informatics (Faculty of Sciences) - Multimedia - Internet Technology • Bachelor Biotechnology (Songdo Global Campus) - Structured Programming • New graduate course on Big Data Analytics (pending approval)
  18. Standardization Activities • W3C (World Wide Web Consortium) - new Web techniques (e.g., HTML5 and Media Annotations) • MPEG (Moving Picture Experts Group) - new compression techniques (e.g., H.264/AVC and 3-D Video Coding) - new storage and transport techniques (e.g., MP4 file format and MPEG DASH) • VQEG (Video Quality Experts Group) - measurement of video quality (e.g., subjective quality evaluations)
  19. Outline • Organizational overview - Ghent University - iMinds - Multimedia Lab • Outline research activities - social media analysis - visual content understanding - deep machine learning
  20. Twitter • An online social network service that enables users to send and read short text messages of up to 140 characters, called "tweets" or "microposts" [Annotated screenshot labels: hashtag (starts with #); tweet or micropost; mention (starts with @); favorite (like or bookmark); retweet (sharing)]
  21. Famous Tweets • Note the presence of both textual and (embedded) visual information!
  22. Twitter Statistics • Usage in general - 271 million monthly active users - 500 million tweets sent per day - 78% of active users are on mobile - expected revenue for 2014 is $1.33 billion (mobile advertising + data licensing) • Usage during the World Cup 2014 - fans sent 672 million related tweets in total - during the semi-final between Brazil and Germany, fans sent more than 35.6 million tweets - during the final, the number of tweets sent by fans peaked at 618,725 Tweets Per Minute (TPM)
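
To put the peak figure in perspective, the daily volume quoted on the slide can be converted to an average per-minute rate. The sketch below is plain arithmetic on the two numbers from the slide; the variable names are illustrative only.

```python
# Figures quoted on the Twitter Statistics slide.
TWEETS_PER_DAY = 500_000_000
WORLD_CUP_FINAL_PEAK_TPM = 618_725

MINUTES_PER_DAY = 24 * 60

# Average Tweets Per Minute (TPM) implied by the daily volume.
average_tpm = TWEETS_PER_DAY / MINUTES_PER_DAY

# How the World Cup final peak compares with that average rate.
peak_ratio = WORLD_CUP_FINAL_PEAK_TPM / average_tpm

print(f"average TPM: {average_tpm:,.0f}")
print(f"peak / average: {peak_ratio:.2f}")
```

Note that the peak counts only World Cup related tweets while the average covers all tweets, so the ratio understates how unusual the peak was.
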
  23. Twitter Research Goal and Challenges • Research goal - to make sense of the vast amounts of textual and visual information communicated on Twitter by means of machine learning • Challenges - microposts are noisy in nature - microposts are short-form in nature - microposts are multi-lingual in nature - microposts come in highly varying quantities - microposts are real-time in nature - microposts are multi-modal in nature (textual & visual, among others)
  24. Deep Learning (1/4) • What? - simply speaking: use of multi-layered neural networks that are able to learn complicated mappings between inputs and outputs - deep learning = (hierarchical) representation learning [Diagram: input x; learned intermediate features; output y = hθ(x)]
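
The mapping y = hθ(x) can be made concrete with a tiny forward pass. The following is a minimal sketch, not the lab's implementation: the layer sizes, the ReLU non-linearity, and the hand-picked weights in `theta` are all assumptions chosen so the arithmetic is easy to follow.

```python
def dense(x, weights, biases):
    """One fully connected layer: computes W·x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Element-wise non-linearity applied to the intermediate features."""
    return [max(0.0, v) for v in values]

def h_theta(x, theta):
    """Apply each (W, b) layer in turn; ReLU after every layer except the last."""
    for index, (weights, biases) in enumerate(theta):
        x = dense(x, weights, biases)
        if index < len(theta) - 1:
            x = relu(x)
    return x

# Hypothetical parameters: one hidden layer (2 -> 2) and one output layer (2 -> 1).
theta = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 2.0]], [0.1]),
]

y = h_theta([2.0, 1.0], theta)
print(y)
```

In a trained network the values in `theta` would be learned from data, and the hidden-layer outputs play the role of the "learned intermediate features" on the slide.
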
  25. Deep Learning (2/4) • Example learned features [Figures: supervised handwritten digit recognition; unsupervised visual object recognition (Google Brain)]
  26. Deep Learning (3/4) • Why the resurgence of neural networks? - availability of large data sets (cf. social media & Internet of Things) - availability of cheap computing power (cf. GPU & cloud) - availability of algorithmic improvements (cf. dropout & max pooling) • Current achievements - top performance in handwritten digit recognition - top performance in automatic speech recognition - top performance in large-scale visual concept detection • Attracts substantial private R&D investments - Google (Geoffrey Hinton & Ray Kurzweil), Facebook (Yann LeCun), Baidu (Andrew Ng & Kai Yu), Microsoft, Twitter, Netflix, and so on
  27. Deep Learning (4/4) • Plenty of open research challenges - how to tailor deep neural networks to novel applications? - how to scale up deep neural networks? - how to scale down neural networks at no cost in effectiveness? - how to take advantage of massively parallel hardware? - how to develop effective hybrid architectures? - how to take into account long-term temporal dependencies? - how to implement multi-modal approaches? - how to establish solid theoretical foundations? - how to bridge the gap between deep learning and strong A.I.?
  28. Ongoing Research Topics with a Twitter Focus • Hashtag recommendation • Named entity recognition and disambiguation • Sports analytics • Social television • Vine video classification
  29. Social and Visual Intelligence (SaVI) • Abhineshwar Tomar (abhineshwar.tomar@ugent.be) • Fréderic Godin (frederic.godin@ugent.be) • Baptist Vandersmissen (baptist.vandersmissen@ugent.be) • Wesley De Neve (wesley.deneve@ugent.be) • Azarakhsh Jalalvand (azarakhsh.jalalvand@ugent.be) • 3 master’s thesis students
  30. Research Topics with a Twitter Focus • Hashtag recommendation • Named entity recognition and disambiguation • Social television • Sports analytics • Vine video classification
  31. Hashtags on Twitter • Hashtag usage - topic-based indexing & search (e.g., #socialnetwork, #Reddit) - conversational/event clustering (e.g., #www2014) • Observation: only about 10% of tweets contain a hashtag • Research challenge: develop techniques for Twitter hashtag recommendation
  32. Twitter Hashtag Recommendation Using Deep Learning (1/2) • Training: learning the relation between tweets and hashtags [Diagram: tweet → word2vec → 300-D tweet vector → deep feed-forward neural network (300-D input layer, 1000-D hidden layer, 500-D hidden layer, 400-D hidden layer, 300-D output layer) → 300-D hashtag vector ← word2vec ← hashtag. Example tweet: “Elizabeth Warren Taking on Hillary as New Democratic Powerhouse”; example hashtag: #politics]
  33. Twitter Hashtag Recommendation Using Deep Learning (2/2) • Testing: recommending hashtags to tweets [Diagram: tweet → word2vec → 300-D tweet vector → deep feed-forward neural network (300-D input layer, 1000-D hidden layer, 500-D hidden layer, 400-D hidden layer, 300-D output layer) → 300-D hashtag vector → vec2word → hashtags. Example tweet: “House Democrats suggest Obama impeachment is imminent to raise cash”; recommended hashtags: #politics, #crisis]
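
The test-time pipeline on the two slides above (tweet → vector → network → nearest hashtags) can be sketched as follows. Several parts are assumptions, because the slides do not specify them: the tweet vector is built by averaging word vectors, "vec2word" is modelled as a cosine-similarity nearest-neighbour lookup, the vectors are 3-D toy values rather than 300-D word2vec vectors, and `identity_network` is a placeholder for the trained feed-forward network.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def tweet_vector(tokens, word_vectors):
    """Average the vectors of the known tokens (averaging is an assumption)."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None
    return [sum(column) / len(known) for column in zip(*known)]

def recommend_hashtags(tokens, word_vectors, hashtag_vectors, network, top_k=2):
    """Map the tweet vector through `network`, then return the closest hashtags."""
    vector = tweet_vector(tokens, word_vectors)
    if vector is None:
        return []
    predicted = network(vector)
    ranked = sorted(hashtag_vectors,
                    key=lambda tag: cosine(predicted, hashtag_vectors[tag]),
                    reverse=True)
    return ranked[:top_k]

# Toy 3-D vectors standing in for 300-D word2vec vectors (hypothetical values).
WORD_VECTORS = {
    "democrats": [0.9, 0.1, 0.0],
    "obama": [0.8, 0.2, 0.1],
    "cash": [0.2, 0.1, 0.9],
}
HASHTAG_VECTORS = {
    "#politics": [1.0, 0.0, 0.0],
    "#crisis": [0.6, 0.0, 0.8],
    "#sports": [0.0, 1.0, 0.0],
}

def identity_network(vector):
    """Placeholder for the trained deep feed-forward network."""
    return vector

suggestions = recommend_hashtags(
    ["house", "democrats", "obama", "cash"],  # "house" is out of vocabulary here
    WORD_VECTORS, HASHTAG_VECTORS, identity_network)
print(suggestions)
```

Passing the network in as a function keeps the lookup logic independent of how the mapping from tweet space to hashtag space was trained.
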
  34. word2vec • Developed by Google Research • Computes vector representations for words - through the use of neural network technology - trained on part of the Google News dataset (about 100 billion words) - the model contains vectors for 3 million words and phrases - the vectors capture the semantic meaning of a word • Example word vector properties - vector('Paris') - vector('France') + vector('Italy') ≈ vector('Rome') - vector('king') - vector('man') + vector('woman') ≈ vector('queen')
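
The analogy property can be illustrated with vector arithmetic followed by a nearest-neighbour search. This is a minimal sketch with hand-made 2-D vectors (roughly "royalty" and "gender" axes), not real word2vec output; the `analogy` helper and the `VECTORS` table are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def analogy(a, b, c, vectors):
    """Return the word closest to vector(a) - vector(b) + vector(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # The query words themselves are excluded from the candidates.
    candidates = (word for word in vectors if word not in {a, b, c})
    return max(candidates, key=lambda word: cosine(target, vectors[word]))

# Hand-made toy vectors (assumed values, for illustration only).
VECTORS = {
    "king": [0.9, 0.8],
    "queen": [0.9, -0.8],
    "man": [0.1, 0.9],
    "woman": [0.1, -0.9],
    "prince": [0.7, 0.7],
}

answer = analogy("king", "man", "woman", VECTORS)
print(answer)
```

With real 300-D vectors the same procedure applies; only the vector table changes.
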
  35. Experimental Results
       Tweet 1: Someone dm/text me bc I’m so bored
         Recommended hashtags: madd, Oh noes, rainnwilson, sooooooo, fricken
       Tweet 2: The good life is one inspired by love and guided by knowledge.
         Recommended hashtags: Ahh yes, FIVE THINGS About, YANKEES TALK, Kinder gentler, Ya gotta love
       Tweet 3: Method of Losing Weight http://t.co/rs64CEuo5W
         Recommended hashtags: Shape Shifting, Treat Acne, Detect Cancer, Warps, Calorie Burn
       Tweet 4: I hate today cause its room cleaning day for me!!!
         Recommended hashtags: FAN ’S ATTIC, Puh leez, Mopping robot, % #F######## 3v.jsn, Interest EURO JAP
       Tweet 5: SPELLS AND SPELL-CASTING:ENCYCLOPEDIA OF 5000 SPELLS ( JUDIKA ILLES ):BLACKSMITH’S WATER HEALING SPELL: A... http://t.co/k0TfrqJFQW
         Recommended hashtags: DEBUTS NEW, NOW AVAILABLE FOR, TO PUBLISH, DESIGNED TO, IS READY TO
  36. Research Topics with a Twitter Focus • Hashtag recommendation • Named entity recognition and disambiguation • Sports analytics • Social television • Vine video classification
  37. Named Entity Recognition and Disambiguation • Named entity - person - location - organization - miscellaneous (film/movie, entertainment award event, political event, programming language, sporting event, and TV show) • Recognition - identification of a named entity in a given text • Disambiguation - e.g., fruit ‘apple’ versus company ‘Apple’
  38. Research Challenge • Tools for named entity recognition and disambiguation have thus far been developed for long-form news articles using formal language • Need for development of tools for named entity recognition and disambiguation for short-form microposts using informal language
  39. Natural Language Processing (NLP) for Twitter from Scratch [Pipeline diagram: tweet → tokenization → part-of-speech (PoS) tagging → chunking → named entity recognition and disambiguation; further labels: information retrieval, text-to-speech, artificial intelligence (cf. Siri, Cortana, Google Now), general text parsing. PoS example: “Tom likes Sprite.” with the labels pronoun, verb, noun]
  40. Our Approach: Twitter PoS using Deep Learning • Use of a feed-forward neural network for learning the mapping between a collection of word vector representations and a PoS tag - feature learning and not feature engineering • Use of word vector representations derived from Twitter - not from Google News [Diagram: word 1, word 2, word 3 → lookup → three word vectors → neural network → PoS tag of word 2]
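
The window-based tagging scheme in the diagram (look up three word vectors, feed them to a network, predict the tag of the middle word) can be sketched as follows. Everything numeric here is an assumption: the 2-D embeddings, the two-tag tag set, and the hand-set weight matrix, which stands in for a trained network and is reduced to a single linear layer for brevity.

```python
# Hypothetical 2-D embeddings; the real system uses vectors learned from tweets.
EMBEDDINGS = {"tom": [1.0, 0.0], "likes": [0.0, 1.0], "sprite": [1.0, 0.0]}
PAD = [0.0, 0.0]          # used beyond sentence boundaries and for unknown words
TAGS = ["noun", "verb"]

# One row of weights per tag over the 6-D input (left, centre, right vectors).
# Hand-set for illustration, not learned.
WEIGHTS = [
    [0.0, 0.3, 1.0, 0.0, 0.0, 0.3],  # noun
    [0.3, 0.0, 0.0, 1.0, 0.3, 0.0],  # verb
]

def window_input(tokens, index):
    """Concatenate the vectors of the previous, current, and next token."""
    features = []
    for position in (index - 1, index, index + 1):
        if 0 <= position < len(tokens):
            features.extend(EMBEDDINGS.get(tokens[position], PAD))
        else:
            features.extend(PAD)
    return features

def tag_sentence(tokens):
    """Score every tag for each token's window and keep the highest-scoring tag."""
    tags = []
    for index in range(len(tokens)):
        x = window_input(tokens, index)
        scores = [sum(w * xi for w, xi in zip(row, x)) for row in WEIGHTS]
        tags.append(TAGS[scores.index(max(scores))])
    return tags

predicted_tags = tag_sentence(["tom", "likes", "sprite"])
print(predicted_tags)
```

The point of the design is that the only input features are word vectors, so no hand-crafted features (suffixes, capitalization, gazetteers) are needed.
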
  41. Twitter-based word2vec Examples (1/2) • Captures spelling mistakes • The "cosine distance" column is the score as reported by the tool; higher values mean more similar words
       Input: reddish
       Word        Cosine distance
       redish      0.829081
       brownish    0.814688
       purple      0.812775
       burgundy    0.804166
       blueish     0.786641
       pastel      0.783559
       magenta     0.779790
       ombre       0.778065
       lilac       0.777773
       pink        0.775110
  42. Twitter-based word2vec Examples (2/2)
       Input: :)
       Word        Cosine distance
       :))         0.918219
       (:          0.870493
       :-)         0.855738
       =)          0.855088
       :)))        0.853806
       xo          0.852893
       xx          0.846706
       ;))         0.829732
       !:)         0.822094
       xox         0.819353
       Input: :(
       Word        Cosine distance
       :'(         0.865362
       ;(          0.858428
       :((         0.829048
       :-(         0.825194
       :((((       0.812367
       !:(         0.807746
       )):         0.791888
       /:          0.769977
       :(((        0.758594
       :(((((      0.739779
  43. Experimental Results
       Dataset                    Vector size    Accuracy
       2 weeks (~5M tweets)       100            82%
       2 weeks (~5M tweets)       300            83%
       2 weeks (~5M tweets)       500            83%
       6 months (~70M tweets)     300            81.5%
       CMU ARK Tagger             n/a            91.6%
  44. Research Topics with a Twitter Focus • Hashtag recommendation • Named entity recognition • Sports analytics • Social television • Video classification
  45. Rationale • What? - prediction of the outcome of football matches in the English Premier League (EPL), using both traditional statistics and Twitter microposts • Why? - betting on football is a billion-dollar industry - Twitter is highly popular for real-time coverage of sports events • How? - fusion of the output of four simple methods, using different features and machine learning techniques
  46. Approach • Method 1: Statistical features - ranking in the league, the number of points gathered in the league, the number of points gathered during the last five games, the number of goals scored, and the number of goals conceded • Method 2: Twitter volume changes • Method 3: Twitter sentiment analysis • Method 4: Twitter user predictions • Machine learning - Naive Bayes, Logistic Regression, and SVM • Social features derived from more than 50 million tweets
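
Most of the statistical features of Method 1 can be derived from a team's match history. The sketch below assumes a chronological list of (goals scored, goals conceded) pairs and the standard league scoring of 3 points for a win and 1 for a draw; the function name and data layout are illustrative, and league ranking is left out because it requires the results of all teams.

```python
def team_features(matches):
    """Compute per-team features from chronological (goals_for, goals_against) pairs."""
    def points(goals_for, goals_against):
        if goals_for > goals_against:
            return 3  # win
        if goals_for == goals_against:
            return 1  # draw
        return 0      # loss

    per_match = [points(gf, ga) for gf, ga in matches]
    return {
        "points": sum(per_match),
        "points_last_five": sum(per_match[-5:]),
        "goals_for": sum(gf for gf, _ in matches),
        "goals_against": sum(ga for _, ga in matches),
    }

# Hypothetical season so far for one team.
history = [(2, 0), (1, 1), (0, 3), (2, 1), (1, 0), (0, 0), (3, 2)]
features = team_features(history)
print(features)
```

Computing the same dictionary for both teams in a fixture yields one feature vector per match for the classifiers listed on the slide.
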
  47. Experimental Results (1/2)
       Method                                  Accuracy
       Baseline methods
         Naive predictions                     51%
         Expert predictions                    60%
         Bookmaker predictions                 67%
       Individual methods
         Statistical features                  64%
         Twitter volume changes                50%
         Twitter sentiment analysis            52%
         Twitter user predictions              63%
       Combination of statistical features and Twitter user predictions
         Majority voting                       64%
         Early fusion                          68%
         Late fusion                           66%
  48. Experimental Results (2/2)
       Method                    Monetary profit (when betting 100 EUR)
       Bookmaker predictions     +18.55 EUR
       Proposed method           +29.70 EUR
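
The three combination strategies named in the results table differ in where the merging happens. The sketch below shows the generic versions of each, under stated assumptions: late fusion is taken to mean averaging class probabilities, early fusion to mean concatenating feature vectors before training a single classifier, and all numbers are made up. The slides do not describe the exact procedure used in the study.

```python
from collections import Counter

def majority_vote(labels):
    """Pick the label predicted by most classifiers."""
    return Counter(labels).most_common(1)[0][0]

def late_fusion(probabilities):
    """Average per-class probabilities across classifiers, then take the best class."""
    classes = probabilities[0].keys()
    averaged = {c: sum(p[c] for p in probabilities) / len(probabilities)
                for c in classes}
    return max(averaged, key=averaged.get)

def early_fusion(*feature_vectors):
    """Concatenate feature vectors so that one classifier sees all features."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused

# Hypothetical outputs for one match (home win / draw / away win).
voted = majority_vote(["home", "home", "draw"])
fused_label = late_fusion([
    {"home": 0.5, "draw": 0.3, "away": 0.2},
    {"home": 0.2, "draw": 0.3, "away": 0.5},
    {"home": 0.6, "draw": 0.2, "away": 0.2},
])
fused_features = early_fusion([14, 10, 9, 7], [0.62])  # statistics + Twitter feature
print(voted, fused_label, fused_features)
```

Early fusion lets a single model learn interactions between statistical and Twitter features, whereas late fusion and voting only combine finished predictions.
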
  49. Research Topics with a Twitter Focus • Hashtag recommendation • Named entity recognition • Sports analytics • Social television • Video classification
  50. Rationale (1/2) • Social television (second screen) - interaction between televised content and online social networks, e.g., Breaking Bad finale (peak of 22,373 TPM), Super Bowl 2014 (peak of 382,000 TPM), and World Cup 2014 final (peak of 618,725 TPM)
  51. Rationale (2/2) • Challenges - how to measure engagement and reach on online social networks? (cf. the Nielsen television ratings) - how to profile your audience? (e.g., age, gender, and location) • Addressing these challenges is important for the allocation of advertisement budgets and for targeted advertisement strategies
  52. Measurement of Engagement and Reach in Flanders • Three major difficulties - privacy concerns - low usage of Twitter (at that time) - identification of Flemish users of Twitter
  53. Twitter User Profiling: Gender Detection (1/3) • What? - classification of Flemish Twitter users into male and female classes • Why? - current user profiles do not contain gender information - gender information is important for targeted advertising • How? - through (mostly n-gram) features extracted from the profile of the user, the tweets of the user, and the social network of the user - through machine learning based on Naive Bayes and SVM
  54. Twitter User Profiling: Gender Detection (2/3) [Diagram: six classifiers feed an ensemble that averages their probabilities to decide between male and female. Username Classifier (example: @wmdeneve); Name Classifier (example: Wesley De Neve); Description Classifier (example: “Senior Researcher at Ghent University - iMinds & KAIST. Interested in social media analysis, visual content understanding and machine learning.”); Tweet Content Classifier (example: “Attending "The Future of Metadata" at CONTEC. #TISP”); Tweet Style Classifier (URL usage, emoticon usage, and punctuation); Friend Description Classifier (example: “Sports fan, basketball player, outdoor lover and a Ph.D. researcher #SocialTV and Natural Language Processing (#NLP) @iMinds - @UGent”)]
  55. Twitter User Profiling: Gender Detection (3/3)
       Classifier            Accuracy
       Username              78.86%
       Name                  87.54%
       Description           65.74%
       Tweet content         75.36%
       Tweet style           66.34%
       Friend description    75.34%

       Test set      TweetGenie    Ensemble
       Test set 2    82.15%        91.89%
       Test set 3    86.44%        93.32%
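
The "averaging of probabilities" step of the ensemble can be sketched in a few lines. The probabilities below are invented, the 0.5 threshold is an assumption, and skipping classifiers that have no output (for example, a user without a profile description) is a design choice made for this illustration rather than something stated on the slides.

```python
def ensemble_gender(male_probabilities, threshold=0.5):
    """Average P(male) over the classifiers that produced an output."""
    available = [p for p in male_probabilities.values() if p is not None]
    if not available:
        return None, None
    p_male = sum(available) / len(available)
    label = "male" if p_male >= threshold else "female"
    return label, p_male

# Hypothetical per-classifier outputs for one user; None means "no input available".
outputs = {
    "username": 0.70,
    "name": 0.90,
    "description": None,
    "tweet_content": 0.60,
    "tweet_style": 0.40,
    "friend_description": 0.65,
}

label, p_male = ensemble_gender(outputs)
print(label, round(p_male, 2))
```

A weighted average, for instance giving the name classifier more influence because of its higher accuracy, would be a natural variation.
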
  56. Research Topics with a Twitter Focus • Hashtag recommendation • Named entity recognition • Social television • Sports analytics • Vine video classification
  57. What is Vine? (1/4) • Platform for social & mobile video - established in June 2012 • Allows creating & distributing videos of up to 6 seconds - maximum video length resembles Twitter’s character limitation • Acquired by Twitter in October 2012 - currently has more than 40 million users • Has the potential to become a new social news platform - cf. Ninja News in Belgium
  58. What is Vine? (2/4)
  59. What is Vine? (3/4)
  60. What is Vine? (4/4)
  61. Automatic Understanding of Social Video Content (1/2) • Recognition of general concepts in video fragments • Categorize short and noisy video fragments • Localize and recognize named entities in video fragments • Localize and recognize products in video fragments [Diagram: video fragments → neural network → output]
  62. Automatic Understanding of Social Video Content (2/2) • Representation learning for social video • Learn general noise-robust features • Exploitation of temporal information in video to improve classification • Investigate recurrent neural networks and reservoir computing networks
  63. Future Research Vision SaVI & SWTF [Diagram labels: data (online social networks & Internet of Things); deep learning; Semantic Web; natural language understanding; visual content understanding; visualization; machine-understandable information; human & machine action; application domains; technology stacks; cognitive computing? strong A.I.? technological singularity ;-)?]
