ELIS – Multimedia Lab 
Towards Twitter Hashtag Recommendation Using 
Distributed Word Representations and a Deep Feed 
Forward Neural Network 
CSSC-2014 
New Delhi, 24 September 2014 
Abhineshwar Tomar, Frederic Godin, Baptist Vandersmissen, 
Wesley De Neve, Rik Van de Walle 
Multimedia Lab, Ghent University – iMinds, Belgium 
Image and Video Systems Lab, KAIST, South Korea
Overview
• Introduction
• Goal
• Motivation
• Methodology
• Results
• Conclusion
• Future work
Twitter
• An online social network service that enables users to send and read short 140-character text messages, called "tweets" or "microposts"
• Annotated elements of an example tweet:
- Tweet or micropost
- Hashtag (starts with #)
- Mention (starts with @)
- Favorite (like or bookmark)
- Retweet (sharing)
Famous Tweets 
Note the presence of both textual and (embedded) visual information!
Twitter Statistics
• Usage in general
- 271 million monthly active users 
- 500 million Tweets are sent per day 
• Hashtags 
- Only 8% of the tweets contain hashtags 
- 3% of the hashtags are used more than 5 times
Hashtags on Twitter 
Hashtag usage: 
- topic-based indexing & search 
• #socialnetwork 
• #Reddit 
- conversational/event clustering 
• #www2014 
Observation: only 8% of tweets contain a hashtag
Goal 
Generate hashtags that adhere to the 
semantic and linguistic regularity of a tweet
Why
• Hashtags
- Content categorization and discovery 
- Effective search of tweets 
• Our approach 
- Connect similar hashtags (topics) 
- Promote the use of hashtags 
• By understanding the semantics of the tweet
Methodology (1/3)
• Preprocessing
- Remove non-English words
- Remove non-ASCII characters
- Remove mentions (@USER)
- Remove URLs
- Remove the "RT @" marker from retweets
• Feature vector generation
• Training of a feed-forward neural network
• Evaluation
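The preprocessing steps can be sketched as follows. This is a minimal illustration, not the authors' code: the regular expressions, the function name `preprocess`, and the vocabulary lookup used to approximate "non-English" filtering are all assumptions.

```python
import re

# Hypothetical patterns; the slides do not give the exact rules used.
URL_RE = re.compile(r"https?://\S+")
RETWEET_RE = re.compile(r"\bRT\s+@\w+:?")
MENTION_RE = re.compile(r"@\w+")


def preprocess(tweet, vocabulary=None):
    """Apply the listed cleaning steps to one tweet and return its tokens."""
    text = RETWEET_RE.sub(" ", tweet)   # drop the "RT @user" retweet marker
    text = URL_RE.sub(" ", text)        # drop URLs
    text = MENTION_RE.sub(" ", text)    # drop remaining @mentions
    # Drop non-ASCII characters (this also strips accents: "café" -> "caf").
    text = text.encode("ascii", "ignore").decode("ascii")
    tokens = text.split()
    if vocabulary is not None:
        # Assumption: a word counts as English if it is in a known vocabulary.
        tokens = [t for t in tokens
                  if t.strip(".,!?:;\"'()").lower() in vocabulary]
    return tokens


print(preprocess("RT @bob: Check this http://t.co/abc café time @alice"))
```

Passing a set of known English words as `vocabulary` enables the non-English filter; without it, only the character-level and pattern-based steps are applied.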
Methodology (2/3)
• Training: learning the relation between tweets and hashtags
- Tweet → word2vec → 300-D tweet vector (network input)
- Hashtag → word2vec → 300-D hashtag vector (network output)
- Deep feed-forward neural network: 300-D input layer, 1000-D hidden layer, 500-D hidden layer, 400-D hidden layer, 300-D output layer
- Example: tweet "Elizabeth Warren Taking on Hillary as New Democratic Powerhouse", hashtag #politics
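A rough sketch of a network with these layer sizes, in plain Python. Only the layer widths come from the slide. The tanh hidden activations, the linear output, the mean-squared-error loss, the weight initialization, and the averaging of word vectors into a single tweet vector are assumptions made for illustration.

```python
import math
import random

LAYER_SIZES = [300, 1000, 500, 400, 300]  # layer widths from the slide


def tweet_vector(word_vectors):
    """Assumption: a tweet vector is the mean of its word vectors."""
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]


def init_network(sizes, seed=0):
    """Create one (weights, biases) pair per pair of consecutive layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Map a 300-D tweet vector to a predicted 300-D hashtag vector."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:  # assumed: tanh on hidden layers only
            x = [math.tanh(v) for v in x]
    return x


def mean_squared_error(predicted, target):
    """Assumed training loss between predicted and true hashtag vectors."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


network = init_network(LAYER_SIZES)
prediction = forward(network, [0.1] * 300)
print(len(prediction))
```

Training would adjust the weights to reduce the loss between `forward(network, tweet_vec)` and the word2vec vector of the tweet's hashtag; the optimizer is not specified in the slides and is omitted here.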
Methodology (3/3)
• Testing: recommending hashtags to tweets
- Tweet → word2vec → 300-D tweet vector
- Deep feed-forward neural network (300-D input layer, 1000-D hidden layer, 500-D hidden layer, 400-D hidden layer, 300-D output layer) → 300-D hashtag vector
- 300-D hashtag vector → vec2word → recommended hashtags
- Example: tweet "House Democrats suggest Obama impeachment is imminent to raise cash", hashtags #politics, #crisis
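The vec2word step is not detailed in the slides. One plausible reading, sketched below, is a nearest-neighbour lookup: rank candidate hashtags by cosine similarity between their vectors and the predicted vector. The function names and the 3-D toy vectors are hypothetical; real vectors would be 300-D.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vec2word(predicted, hashtag_vectors, k=5):
    """Return the k hashtags whose vectors are closest to the prediction."""
    ranked = sorted(hashtag_vectors,
                    key=lambda tag: cosine(predicted, hashtag_vectors[tag]),
                    reverse=True)
    return ranked[:k]


# Toy vectors, invented for illustration only.
candidates = {
    "#politics": [1.0, 0.0, 0.0],
    "#crisis": [0.8, 0.6, 0.0],
    "#sports": [0.0, 0.0, 1.0],
}
print(vec2word([0.9, 0.2, 0.0], candidates, k=2))
```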
word2vec 
• Developed by Google Research 
• Computes vector representations for words 
- Through the use of neural network technology 
• Trained on part of the Google News dataset (approximately 100 billion words)
• The model contains vectors for 3 million words and phrases
- The vectors capture the semantic meaning of a word
• Example word vector properties 
- vector('Paris') - vector('France') + vector('Italy') ≈ vector('Rome') 
- vector('king') - vector('man') + vector('woman') ≈ vector('queen')
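The vector arithmetic can be illustrated with a toy example. The 2-D vectors below are hand-made for the illustration (one axis for royalty, one for gender) and are not real word2vec output; the helper names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def analogy(vectors, a, b, c):
    """Return the word closest to vector(a) - vector(b) + vector(c)."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    others = [w for w in vectors if w not in (a, b, c)]  # skip the inputs
    return max(others, key=lambda w: cosine(query, vectors[w]))


# Toy vectors: (royalty, gender). Invented for illustration only.
toy_vectors = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.2],
}
print(analogy(toy_vectors, "king", "man", "woman"))
```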
Results (1/3)
No. | Tweet | Recommended hashtags
1 | Someone dm/text me bc I’m so bored | madd, Oh noes, rainnwilson, sooooooo, fricken
2 | The good life is one inspired by love and guided by knowledge. | Ahh yes, FIVE THINGS About, YANKEES TALK, Kinder gentler, Ya gotta love
3 | Method of Losing Weight http://t.co/rs64CEuo5W | Shape Shifting, Treat Acne, Detect Cancer, Warps, Calorie Burn
4 | I hate today cause its room cleaning day for me!!! | FAN ’S ATTIC, Puh leez, Mopping robot, % #F######## 3v.jsn, Interest EURO JAP
5 | SPELLS AND SPELL-CASTING:ENCYCLOPEDIA OF 5000 SPELLS ( JUDIKA ILLES ):BLACKSMITH’S WATER HEALING SPELL: A... http://t.co/k0TfrqJFQW | DEBUTS NEW, NOW AVAILABLE FOR, TO PUBLISH, DESIGNED TO, IS READY TO
Results (2/3)
Results (3/3)
Hit-rate per top-k recommendation
Top-k | She et al. | Our approach
Top-5 | 82% | 83.33%
Top-10 | 89% | 86.67%
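The slides do not define the hit-rate. A common definition, assumed here, is the share of test tweets for which at least one of the top-k recommended hashtags is judged relevant. The function name and the toy data below are hypothetical.

```python
def hit_rate(recommendations, relevant, k):
    """Fraction of tweets with at least one relevant hashtag in the top k."""
    hits = sum(1 for recs, rel in zip(recommendations, relevant)
               if set(recs[:k]) & set(rel))
    return hits / len(recommendations)


# Toy data, invented for illustration: ranked recommendations per tweet
# and the hashtags judged relevant for that tweet.
recommended = [["#a", "#b"], ["#c", "#d"], ["#e", "#f"]]
judged_relevant = [["#b"], ["#x"], ["#e"]]
print(hit_rate(recommended, judged_relevant, k=2))
```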
Conclusion 
• Introduced a novel approach for hashtag recommendation, using 
distributed word representations and a feed forward neural network 
• Learns semantic and linguistic regularities without requiring careful 
feature engineering 
• Can easily take advantage of temporal information 
• Supports the automatic creation of new hashtags/trends
Future Work 
• Use of more than four days of data 
• Use word representations from different data sources 
• Investigate impact of the quality of the word representations created 
• Investigate impact of the use of DBpedia and Freebase