Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network
1. ELIS – Multimedia Lab
CSSC-2014
New Delhi, 24 September 2014
Abhineshwar Tomar, Frederic Godin, Baptist Vandersmissen,
Wesley De Neve, Rik Van de Walle
Multimedia Lab, Ghent University – iMinds, Belgium
Image and Video Systems Lab, KAIST, South Korea
Twitter
• An online social network service that enables users to send and read
short 140-character text messages, called "tweets" or "microposts"
• Tweet or micropost
• Hashtag (starts with #)
• Mention (starts with @)
• Favorite (like or bookmark)
• Retweet (sharing)
Famous Tweets
Note the presence of both textual and (embedded) visual information!
Twitter Statistics
• Usage in general
- 271 million monthly active users
- 500 million Tweets are sent per day
• Hashtags
- Only 8% of the tweets contain hashtags
- 3% of the hashtags are used more than 5 times
Hashtags on Twitter
Hashtag usage:
- topic-based indexing & search
• #socialnetwork
• #Reddit
- conversational/event clustering
• #www2014
Observation: only 8% of tweets contain a hashtag
Why
• Hashtags
- Content categorization and discovery
- Effective search of tweets
• Our approach
- Connect similar hashtags (topics)
- Promote the use of hashtags
• By understanding the semantics of the tweet
word2vec
• Developed by Google Research
• Computes vector representations for words
- Through the use of neural network technology
• Trained on part of the Google News dataset (about 100 billion words)
• The model contains vectors for 3 million words and phrases
- Capture the semantic meaning of a word
• Example word vector properties
- vector('Paris') - vector('France') + vector('Italy') ≈ vector('Rome')
- vector('king') - vector('man') + vector('woman') ≈ vector('queen')
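The vector arithmetic above can be illustrated with a minimal sketch. The three-dimensional vectors below are hand-made toy values chosen so that the analogy works; they are not taken from the Google News model, whose vectors are learned from text and have many more dimensions. The helper names `cosine` and `analogy` are hypothetical.

```python
import numpy as np

# Toy vectors, hand-made for illustration only (an assumption, not real
# word2vec output). Hypothetical axes: [royalty, maleness, femaleness].
toy_vectors = {
    "king":  np.array([1.0, 1.0, 0.0]),
    "queen": np.array([1.0, 0.0, 1.0]),
    "man":   np.array([0.0, 1.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0]),
    "child": np.array([0.0, 0.5, 0.5]),
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def analogy(a, b, c, vectors):
    """Return the word closest to vector(a) - vector(b) + vector(c).

    The three input words are excluded from the candidates.
    """
    query = vectors[a] - vectors[b] + vectors[c]
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))


print(analogy("king", "man", "woman", toy_vectors))  # prints "queen"
```

With real pretrained vectors the nearest neighbour is only approximately the expected word, which is why the slide uses ≈ rather than =.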
Results (1/3)
No. | Tweet | Recommended hashtags
1 | Someone dm/text me bc I’m so bored | madd, Oh noes, rainnwilson, sooooooo, fricken
2 | The good life is one inspired by love and guided by knowledge. | Ahh yes, FIVE THINGS About, YANKEES TALK, Kinder gentler, Ya gotta love
3 | Method of Losing Weight http://t.co/rs64CEuo5W | Shape Shifting, Treat Acne, Detect Cancer, Warps, Calorie Burn
4 | I hate today cause its room cleaning day for me!!! | FAN ’S ATTIC, Puh leez, Mopping robot, % #F######## 3v.jsn, Interest EURO JAP
5 | SPELLS AND SPELL-CASTING:ENCYCLOPEDIA OF 5000 SPELLS ( JUDIKA ILLES ):BLACKSMITH’S WATER HEALING SPELL: A... http://t.co/k0TfrqJFQW | DEBUTS NEW, NOW AVAILABLE FOR, TO PUBLISH, DESIGNED TO, IS READY TO
Conclusion
• Introduced a novel approach for hashtag recommendation, using
distributed word representations and a feed forward neural network
• Learns semantic and linguistic regularities without requiring careful
feature engineering
• Can easily take advantage of temporal information
• Supports the automatic creation of new hashtags/trends
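The combination of word representations and a feed forward network can be sketched roughly as follows. This is only an illustration under several assumptions that the slides do not confirm: the tweet is represented by averaging its word vectors, the network maps that average to a point in the same vector space, and hashtags are suggested by nearest-neighbour lookup. The vocabulary, dimensions, and weights are hypothetical, random, and untrained, so the suggestions themselves carry no meaning.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8      # toy embedding size (assumption; real models use far more)
HIDDEN = 16  # toy hidden-layer size (assumption)

# Hypothetical vocabulary; random vectors stand in for pretrained embeddings.
vocab = ["bored", "text", "me", "weight", "losing", "diet", "fitness", "chat"]
embeddings = {w: rng.normal(size=DIM) for w in vocab}

# Untrained random weights for a network with one hidden layer.
W1 = rng.normal(scale=0.1, size=(DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(HIDDEN, DIM))
b2 = np.zeros(DIM)


def embed_tweet(tweet):
    """Average the vectors of known words; zero vector if none are known."""
    vecs = [embeddings[t] for t in tweet.lower().split() if t in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)


def forward(x):
    """Feed forward pass: tanh hidden layer, linear output layer."""
    hidden = np.tanh(x @ W1 + b1)
    return hidden @ W2 + b2


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def recommend(tweet, k=3):
    """Return the k vocabulary words nearest to the network output."""
    target = forward(embed_tweet(tweet))
    ranked = sorted(vocab, key=lambda w: cosine(target, embeddings[w]),
                    reverse=True)
    return ranked[:k]


print(recommend("Someone text me so bored"))
```

Because candidates come from the embedding vocabulary rather than from a fixed hashtag list, a design like this can suggest words that have never been used as hashtags before.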
Future Work
• Use of more than four days of data
• Use word representations from different data sources
• Investigate impact of the quality of the word representations created
• Investigate impact of the use of DBpedia and Freebase