Enhancing Named Entity Recognition in Twitter Messages Using Entity Linking

The slides of the talk at ACL 2015 Workshop on Noisy User-generated Text


  1. Enhancing Named Entity Recognition in Twitter Messages Using Entity Linking
     Ikuya Yamada (1,2,3), Hideaki Takeda (3), Yoshiyasu Takefuji (2)
     (1) Studio Ousia, (2) Keio University, (3) National Institute of Informatics
     Friday, July 31, 2015
  2. Background
     ‣ Twitter NER is difficult because of the noisy, short, and colloquial nature of tweets
     ‣ The performance of standard NER software suffers significantly
  3. Entity Linking
     Example tweet: "New Frozen Boutique to Open at Disney's Hollywood Studios"
     Linked entities: /wiki/Frozen_(2013_film), /wiki/The_Walt_Disney_Company, /wiki/Disney’s_Hollywood_Studios
     ‣ Entity Linking: The task of linking entity mentions to entries in a knowledge base (KB) (e.g., Wikipedia)
     ‣ Recently entity linking has received considerable attention
       ✦ Many research papers (2006-) [Cucerzan 2007, Milne et al. 2008, etc.]
       ✦ Competitions (TAC KBP, ERD@SIGIR, #Microposts@WWW, etc.)
  4. Can we enhance Twitter NER by using entity linking?
  5. Example tweet: "New Frozen Boutique to Open at Disney's Hollywood Studios"
     Detecting “Frozen” in this tweet is difficult
  6. Entity Linking
     Example tweet: "New Frozen Boutique to Open at Disney's Hollywood Studios"
     Linked entities: /wiki/Frozen_(2013_film), /wiki/The_Walt_Disney_Company, /wiki/Disney’s_Hollywood_Studios
     ‣ By using entity linking, we can detect “Frozen”:
       ✦ “Frozen” is a very popular entity (from Wikipedia link structure and page view count)
       ✦ “Frozen” is semantically related to the context entities
  7. Our Approach
     ‣ Our system first performs entity linking in an end-to-end manner
     ‣ Detected entity mentions are used to enhance the NER tasks
     ‣ The data of entities are extracted from several open knowledge bases (Wikipedia, DBpedia, Freebase)
     ‣ Segmentation and classification tasks are addressed by using separate components
     Pipeline components: End-to-End Entity Linking, Segmentation (NER), Classification (NER)
  8. End-to-End Entity Linking (section title slide)
  9. End-to-End Entity Linking
     ‣ An entity linking system specifically designed for tweets
       ✦ Does not depend on NER to detect entity mentions (considering all possible n-grams as mention candidates)
       ✦ Based on supervised machine-learning (random forest) using various kinds of features (trained using #Microposts2015 dataset)
       ✦ Winner of a recent Twitter entity linking competition called #Microposts2015 NEEL Challenge at WWW2015
     ‣ For further details, please refer to: Yamada et al, An End-to-End Entity Linking Approach for Tweets in Proceedings of #Microposts 2015
     Image taken from NEEL2015 Challenge Summary: http://www.slideshare.net/giusepperizzo/neel2015-challenge-summary
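The idea of treating every n-gram as a mention candidate can be illustrated with a minimal sketch. The whitespace tokenization, the function name `ngram_candidates`, and the `max_len` cap are assumptions made for this example; the slides do not say how the actual system tokenizes tweets or bounds n-gram length.

```python
def ngram_candidates(tokens, max_len=3):
    """Return (start, end, text) for every n-gram of up to max_len tokens.

    `end` is exclusive. `max_len` is a hypothetical cap chosen for this sketch.
    """
    candidates = []
    for start in range(len(tokens)):
        longest_end = min(start + max_len, len(tokens))
        for end in range(start + 1, longest_end + 1):
            candidates.append((start, end, " ".join(tokens[start:end])))
    return candidates


# Whitespace splitting is a simplification used only for illustration.
tweet = "New Frozen Boutique to Open at Disney's Hollywood Studios"
candidates = ngram_candidates(tweet.split(), max_len=3)
```

Each candidate would then be scored by a downstream model; in this sketch nothing is filtered, so both "Frozen" and "Disney's Hollywood Studios" appear in the candidate list.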
  10. Segmentation of Named Entities (section title slide)
  11. Segmentation: Approach
     ‣ Supervised machine-learning is used to assign a binary label to each of possible n-grams
     ‣ Random forest is used as the machine-learning algorithm
     ‣ Overlaps of mentions are resolved by iteratively selecting the longest entity mention from the beginning of the tweet
     ‣ Machine-learning features can be classified as follows:
       ✦ Entity-based features
       ✦ Linguistic features
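One possible reading of the overlap-resolution rule is a left-to-right greedy pass: at the earliest remaining position, keep the longest positively labeled span and skip everything that overlaps it. The sketch below follows that reading; the function name `resolve_overlaps` and the span representation (token offsets, exclusive end) are assumptions for illustration, not the authors' implementation.

```python
def resolve_overlaps(mentions):
    """Greedily keep non-overlapping spans, scanning from the start of the tweet.

    `mentions` is a list of (start, end) token spans that the classifier
    labeled positive; `end` is exclusive. Among spans starting at the same
    token, the longest one is preferred.
    """
    selected = []
    next_free = 0  # first token index not covered by a selected span
    ordered = sorted(mentions, key=lambda span: (span[0], -(span[1] - span[0])))
    for start, end in ordered:
        if start >= next_free:
            selected.append((start, end))
            next_free = end
    return selected


# Hypothetical positive spans over the tokens of the example tweet:
# "Frozen" (1, 2), "Disney's" (6, 7), "Disney's Hollywood Studios" (6, 9),
# and "Hollywood Studios" (7, 9).
resolved = resolve_overlaps([(1, 2), (6, 7), (6, 9), (7, 9)])
```

Here the three-token span starting at token 6 wins over the shorter spans that overlap it, while the non-overlapping span for "Frozen" is kept.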
  12. Segmentation: Entity-based Features
     ‣ The relevance score assigned by the entity linking system
     ‣ The popularity of the entity:
       ✦ The number of inbound links of the entity in Wikipedia
       ✦ The average page view count of the Wikipedia entity
     ‣ Mention statistics in Wikipedia:
       ✦ Link probability
       ✦ Capitalization probability
  13. Segmentation: Link Probability Feature
     Example occurrences of "Harajuku" in Wikipedia articles (figure legend: link / plain text):
     ‣ Kyary Pamyu Pamyu: "Her public image is associated with Japan's kawaisa culture centered in Harajuku, Tokyo"
     ‣ Takeshita Street: "Takeshita Street is a street lined with fashion boutiques, and cafes in Harajuku in Tokyo, Japan."
     ‣ Laforet: "Department Store and Museum is a department store located in the Harajuku..."
     LINK_PROBABILITY(Harajuku) = 2/3
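The 2/3 value on this slide is consistent with counting how often a mention string appears as a link anchor out of all its occurrences. A minimal sketch of that arithmetic follows; the function name `link_probability` and the boolean-list input are assumptions for illustration, and the order of linked versus plain-text occurrences below is arbitrary because the figure's highlighting is not recoverable from the transcript.

```python
from fractions import Fraction


def link_probability(occurrences):
    """Fraction of occurrences in which the mention text is a link anchor.

    `occurrences` holds one boolean per appearance of the mention string:
    True if it appears as a link, False if it appears as plain text.
    """
    if not occurrences:
        return Fraction(0)
    return Fraction(sum(occurrences), len(occurrences))


# "Harajuku" appears three times in the slide's example: twice as a link
# and once as plain text (which occurrence is which is illustrative here).
harajuku_occurrences = [True, True, False]
probability = link_probability(harajuku_occurrences)
```

With two linked occurrences out of three, the result equals the slide's LINK_PROBABILITY(Harajuku) = 2/3.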
  14. Segmentation: Linguistic Features
     ‣ Whether or not Stanford NER detects the mention
     ‣ Part-of-speech tags of the current and surrounding words
     ‣ Whether or not the current and surrounding words are capitalized
     ‣ Mention length (# of words, # of characters)
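The capitalization and length features are simple surface statistics and can be sketched without external tools. The function name `surface_features`, the feature keys, and the one-word context window are assumptions for this example; the Stanford NER and part-of-speech features are omitted because they need external taggers.

```python
def surface_features(tokens, start, end):
    """Capitalization and length features for the span tokens[start:end].

    The context window of one word on each side is a hypothetical choice.
    """
    mention = tokens[start:end]
    previous_token = tokens[start - 1] if start > 0 else ""
    next_token = tokens[end] if end < len(tokens) else ""
    return {
        "num_words": len(mention),
        "num_chars": len(" ".join(mention)),
        "mention_capitalized": all(token[:1].isupper() for token in mention),
        "prev_capitalized": previous_token[:1].isupper(),
        "next_capitalized": next_token[:1].isupper(),
    }


tokens = "New Frozen Boutique to Open at Disney's Hollywood Studios".split()
frozen_features = surface_features(tokens, 1, 2)
```

In a title-cased tweet like this one, capitalization alone does not separate "Frozen" from its neighbors, which matches the earlier point that detecting it is difficult without entity-based evidence.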
  15. Classification of Named Entities (section title slide)
  16. Classification
     ‣ Supervised machine-learning is used to classify detected mentions into the predefined types
     ‣ Linear SVM is used as the machine-learning algorithm
     ‣ Main machine-learning features:
       ✦ Entity types in knowledge bases (DBpedia Ontology Classes and Freebase Types)
       ✦ Entity type detected by Stanford NER (i.e., PERSON, ORGANIZATION, LOCATION)
       ✦ The average of vectors of words in the n-gram using Stanford GloVe word embeddings (840B model)
       ✦ The relevance score assigned by entity linking
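The averaged word-vector feature can be sketched as a component-wise mean over the words of the n-gram. The toy three-dimensional vectors, the function name `average_vector`, and the choice to skip out-of-vocabulary words are assumptions for illustration; real GloVe vectors are much larger and the slides do not say how unknown words are handled.

```python
def average_vector(words, embeddings, dim):
    """Component-wise mean of the embeddings of the words found in `embeddings`.

    Words without an embedding are skipped (an assumption of this sketch);
    if no word has an embedding, a zero vector of length `dim` is returned.
    """
    vectors = [embeddings[word] for word in words if word in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(components) / len(vectors) for components in zip(*vectors)]


# Hypothetical toy embeddings standing in for pretrained GloVe vectors.
toy_embeddings = {
    "hollywood": [1.0, 0.0, 2.0],
    "studios": [3.0, 2.0, 0.0],
}
feature = average_vector(["hollywood", "studios", "unknownword"], toy_embeddings, dim=3)
```

The resulting fixed-length vector could be concatenated with the other features before being passed to a linear classifier.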
  17. Results
     ‣ Our method outperformed the 2nd-ranked method by 10.34 F1 on the segmentation task and by 5.01 F1 on the end-to-end task!
     [Table: Performances of the proposed systems at segmenting entities]
     [Table: Performances of the proposed systems at both segmentation and classification tasks]
  18. Conclusion
     ‣ Twitter NER can be enhanced by using entity linking
     ‣ Entity linking enables us to use quality data in knowledge bases for NER tasks
  19. THANK YOU!
