Using Natural Language Processing to Understand Emoji in Social Media Text

The ability to automatically process and interpret text that contains emoji will be essential as society embraces emoji as a standard form of online communication. Yet emoji are hard to analyze with traditional Natural Language Processing (NLP) techniques for three reasons: they are pictorial, the same emoji may express different meanings in different contexts, and cultures around the world interpret emoji differently. This talk presents the creation of EmojiNet, the first machine-readable emoji sense repository, built by extracting emoji meanings from reliable web sources, and its applications for understanding emoji meaning in social media text. It discusses how EmojiNet enables the use of NLP techniques to solve novel emoji research problems, including emoji similarity and emoji sense disambiguation. A live demo of EmojiNet is available at http://emojinet.knoesis.org

Published in: Education

  1. Using Natural Language Processing to Understand Emoji in Social Media Text. Presented at Emojicon 2018, Brooklyn, NY, USA, 14th July, 2018. Sanjaya Wijeratne, Ph.D. Candidate @ Kno.e.sis Center, Department of Computer Science and Engineering, Wright State University, Dayton, OH. Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University, USA. sanjaya@knoesis.org @sanjrockz
  2. Emoji Usage Distribution: Gang vs. Non-gang Members. [Chart; axis label: Percent of profiles that contain the emoji in Tweets] Balasuriya et al. Finding Street Gang Members on Twitter, ASONAM 2016. https://goo.gl/12qtD2
  3. Gas vs. Marijuana. Balasuriya et al. Finding Street Gang Members on Twitter, ASONAM 2016. https://goo.gl/12qtD2
  4. Emoji Sense/Meaning Disambiguation. [Figure text: "I", "Look". Image Source – https://goo.gl/rjS1hX] Wijeratne et al. EmojiNet: An Open Service and API for Emoji Sense Discovery, ICWSM 2017. https://goo.gl/hksTzS
  5. Challenges in Building an Emoji Meaning Dictionary • Multiple websites provide basic information on emoji, but not in the form of a dictionary. Image Sources – http://www.unicode.org/emoji/charts/full-emoji-list.html, https://emojipedia.org/thinking-face/, https://emojidictionary.emojifoundation.com/thinking_face
  6. Can we combine the online resources and create a dictionary that we can feed to a computer to understand emoji? Wijeratne et al. EmojiNet: An Open Service and API for Emoji Sense Discovery, ICWSM 2017. https://goo.gl/hksTzS
  7. We Created EmojiNet • The largest computer-readable emoji meaning dictionary - Combines information extracted from online websites: Unicode.org, Emojipedia.org, EmojiDictionary.com - Contains 12,904 emoji meaning definitions for 2,389 emoji • Accessible at http://emojinet.knoesis.org/ Wijeratne et al. EmojiNet: An Open Service and API for Emoji Sense Discovery, ICWSM 2017. https://goo.gl/hksTzS
  8. Building EmojiNet. [Figure: Building EmojiNet using Multiple Web Resources] Wijeratne et al. EmojiNet: An Open Service and API for Emoji Sense Discovery, ICWSM 2017. https://goo.gl/hksTzS
  9. Wijeratne et al. EmojiNet: An Open Service and API for Emoji Sense Discovery, ICWSM 2017. https://goo.gl/hksTzS
  10. Disambiguating Emoji Meaning. Table: Context Words Extracted from EmojiNet (sense, followed by the context words extracted from EmojiNet for that sense):
      Pray (verb): worship, thanksgiving, saint, pray, higher, god, confession
      High five (noun): palm, high, hand, slide, celebrate, raise, person, head, five
      T1 – Pray for my family God gained an angel today
      T2 – Hard to win, but we did it man Lets celebrate!
      Wijeratne et al. EmojiNet: Building a Machine Readable Sense Inventory for Emoji, SocInfo 2016. https://goo.gl/EWCXAV
  11. Disambiguating Emoji Meaning Cont. • We selected the 25 most commonly misunderstood emoji and chose 50 tweets for each emoji - Used the Simplified Lesk algorithm for disambiguation - Context words were learned for each emoji sense definition using Twitter-based and Google News-based word embedding models - Twitter-based embeddings outperform the others. [Figure: Top 10 Emoji based on the Emoji Sense Disambiguation Accuracy (in % values)] Wijeratne et al. EmojiNet: An Open Service and API for Emoji Sense Discovery, ICWSM 2017. https://goo.gl/hksTzS
  12. Can a computer calculate the similarity of two emoji? Wijeratne et al. A Semantics-Based Measure of Emoji Similarity, Web Intelligence 2017. https://goo.gl/Ye2gyh
  13. Measuring Emoji Similarity. Wijeratne et al. A Semantics-Based Measure of Emoji Similarity, Web Intelligence 2017. https://goo.gl/Ye2gyh
  14. Measuring Emoji Similarity Cont. • Extract emoji definitions from EmojiNet. [Figure: Extracting Emoji Definitions from EmojiNet] Wijeratne et al. A Semantics-Based Measure of Emoji Similarity, Web Intelligence 2017. https://goo.gl/Ye2gyh
  15. Ground Truth Data Creation • 110M tweets were used to identify the 508 most frequently co-occurring emoji pairs, which covered 25% of the dataset. Full emoji list – https://goo.gl/fvP7K9 Wijeratne et al. A Semantics-Based Measure of Emoji Similarity, Web Intelligence 2017. https://goo.gl/Ye2gyh
  16. Emoji Similarity Evaluation • Sense_Desc.-based embedding models correlate moderately with the ground truth data (40.0 < ρ < 59.0); all other models show a strong correlation (60.0 < ρ < 79.0) • Sense_Labels-based embedding models correlate best with the ground truth data. [Table: Spearman's Rank Correlation Results] Wijeratne et al. A Semantics-Based Measure of Emoji Similarity, Web Intelligence 2017. https://goo.gl/Ye2gyh
  17. Emoji Similarity Evaluation Cont. Table: Accuracy of the Sentiment Analysis task using Emoji Embeddings (classification accuracy on sections of the testing dataset):
      Word Embedding Model           All Tweets     Tweets with Emoji   Tweets w/ top 10% Emoji   Tweets w/ bottom 10% Emoji
                                     (N = 12,920)   (N = 2,295)         (N = 2,186)               (N = 308)
                                     RF     SVM     RF     SVM          RF     SVM                RF     SVM
      Google News                    57.5   58.5    46.0   47.1         47.3   45.1               44.7   43.2
      Google News + emoji2vec        59.5   60.5    54.4   59.2         55.0   59.5               54.5   55.2
      Google News + (Sense_Label)    60.3   63.3    55.0   61.8         56.8   62.3               54.2   59.0
      Twitter + (Sense_Label)        60.7   63.6    57.3   60.8         57.5   61.5               56.1   58.4
      Wijeratne et al. A Semantics-Based Measure of Emoji Similarity, Web Intelligence 2017. https://goo.gl/Ye2gyh
  18. Ongoing Emoji Research @ Kno.e.sis • Emoji Prediction - Given a text message (e.g., a Tweet), can we predict an emoji that summarizes the text? - Example: "Yay! My talk got accepted at Emojicon!" • Applying NLP to understand emoji use in different problem settings - Depression Detection on Social Media - Harassment Detection in Social Media - Opioid Crisis in Social Media - Disaster Relief in Social Media
  19. Collaborators @ Kno.e.sis and Sponsors: Lakshika Balasuriya (lakshika@knoesis.org), Derek Doran (derek@knoesis.org), Amit Sheth (amit@knoesis.org)
  20. sanjaya@knoesis.org @sanjrockz
  21. EmojiNet References
      1. Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. EmojiNet: An Open Service and API for Emoji Sense Discovery. In 11th International AAAI Conference on Web and Social Media (ICWSM 2017). Montreal, Canada; 2017.
      2. Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. EmojiNet: Building a Machine Readable Sense Inventory for Emoji. In 8th International Conference on Social Informatics (SocInfo 2016). Bellevue, WA, USA; 2016.
      3. Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. A Semantics-Based Measure of Emoji Similarity. In 2017 IEEE/WIC/ACM International Conference on Web Intelligence (Web Intelligence 2017). Leipzig, Germany; 2017.
      4. EmojiNet Datasets – http://emojinet.knoesis.org/home.php
      5. A more technical version of the talk is available at https://www.youtube.com/watch?v=mp29MqcaY04
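
The disambiguation step described on slides 10 and 11 can be illustrated with a minimal sketch of the Simplified Lesk idea: pick the sense whose context words overlap most with the words of the message. The context words and the two example messages (T1, T2) are copied from slide 10. The function names, the tokenizer, and the tie-breaking rule are assumptions made for illustration, not the authors' implementation, and the sketch omits the embedding-based learning of context words mentioned on slide 11.

```python
import re

# Context words copied from slide 10 (lowercased).
SENSE_CONTEXT = {
    "pray (verb)": {"worship", "thanksgiving", "saint", "pray",
                    "higher", "god", "confession"},
    "high five (noun)": {"palm", "high", "hand", "slide", "celebrate",
                         "raise", "person", "head", "five"},
}

def tokenize(text):
    """Hypothetical tokenizer: lowercase alphabetic tokens as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def simplified_lesk(message, sense_context):
    """Return the sense with the largest word overlap, plus all overlap counts.

    Ties are broken by dictionary order (an assumption of this sketch).
    """
    tokens = tokenize(message)
    overlaps = {sense: len(tokens & words)
                for sense, words in sense_context.items()}
    best = max(overlaps, key=overlaps.get)
    return best, overlaps

t1 = "Pray for my family God gained an angel today"
t2 = "Hard to win, but we did it man Lets celebrate!"

for message in (t1, t2):
    sense, counts = simplified_lesk(message, SENSE_CONTEXT)
    print(message, "->", sense, counts)
```

With these inputs, T1 shares "pray" and "god" with the Pray sense, while T2 shares only "celebrate" with the High five sense, so each message is assigned a different sense.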

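The evaluation on slide 16 compares model similarity scores against ground truth using Spearman's rank correlation. The following is a minimal standard-library sketch of that statistic, computed as the Pearson correlation of average ranks. The score lists are made-up toy values for illustration only, not data from the talk, and the function names are hypothetical. The slide reports ρ on a 0 to 100 scale, so the sketch multiplies by 100 at the end.

```python
import math

def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy values (assumptions): similarity scores from a model and
# human ratings for the same four emoji pairs.
model_scores = [0.9, 0.1, 0.5, 0.7]
human_ratings = [4.0, 1.0, 3.0, 2.0]

rho = spearman(model_scores, human_ratings)
print(round(rho * 100, 1))  # 80.0 for these toy values
```

A rank-based measure suits this comparison because model scores and human ratings sit on different scales; only the ordering of the emoji pairs matters.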