Recognizing, Classifying and Linking Entities with Wikipedia and DBpedia

The 7th Workshop on Intelligent and Knowledge Oriented Technologies, Smolenice, Slovakia.


Transcript

  • 1. Recognizing, Classifying and Linking Entities with Wikipedia and DBpedia
    Milan Dojchinovski (1), Tomas Kliegr (2)
    (1) Faculty of Information Technology, Czech Technical University in Prague
    (2) Faculty of Informatics and Statistics, University of Economics, Prague
    Milan Dojchinovski - milan.dojchinovski@fit.cvut.cz - @m1ci - http://dojchinovski.mk
    The 7th Workshop on Intelligent and Knowledge Oriented Technologies (WIKT 2012), November 22-23, 2012, Smolenice, SK
    Except where otherwise noted, the content of this presentation is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported
  • 2. Overview
    ‣ Introduction
    ‣ Entity Recognition, Classification and Publication
    ‣ Experiments
    ‣ Conclusion and Future Work
  • 3. Introduction
    ‣ Unsupervised and fully automated:
      - entity recognition
        • rule-based lexico-syntactic patterns
      - entity classification by extraction of hypernyms
        • targeted hypernym extraction
      - entity linking to DBpedia concepts
    ‣ Publication as Linked Data
      - results in NLP Interchange Format (NIF)
  • 4. Overview
    ‣ Introduction
    ‣ Entity Recognition, Classification and Publication
    ‣ Experiments
    ‣ Conclusion and Future Work
  • 5. Tool Architecture
    ‣ Available as a Web 2.0 application at: http://ner.vse.cz/thd
    ‣ Web API available at: http://ner.vse.cz/thd/docs
    Fig 1. Architecture overview
  • 6. Entity Recognition and Classification
    ‣ Entity Recognition
      - 2 JAPE grammars: 1) NNP+ 2) JJ* NN+
      - input: free text
      - output: Named Entities (e.g., Diego Maradona) or Common Entities (e.g., hockey player)
    ‣ Entity Classification
      - supported by the Targeted Hypernym Discovery algorithm
      - lexico-syntactic patterns, e.g. _x_ is a _y_
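The two grammars and the "_x_ is a _y_" pattern on slide 6 can be illustrated with a small sketch. This is not the tool's JAPE code: it assumes the input is already tokenized and tagged with Penn Treebank POS tags, it matches the tags NNP, JJ and NN exactly, and the function names `chunk_entities` and `find_hypernym` are hypothetical. Taking the last noun of the matched phrase as the hypernym is also an assumption made for illustration.

```python
def chunk_entities(tagged):
    """Scan (token, pos_tag) pairs and return (kind, text) entity candidates.

    Pattern 1: NNP+      -> "named" entity
    Pattern 2: JJ* NN+   -> "common" entity
    """
    entities = []
    i, n = 0, len(tagged)
    while i < n:
        # Pattern 1: one or more consecutive proper nouns.
        j = i
        while j < n and tagged[j][1] == "NNP":
            j += 1
        if j > i:
            entities.append(("named", " ".join(tok for tok, _ in tagged[i:j])))
            i = j
            continue
        # Pattern 2: zero or more adjectives followed by one or more nouns.
        j = i
        while j < n and tagged[j][1] == "JJ":
            j += 1
        k = j
        while k < n and tagged[k][1] == "NN":
            k += 1
        if k > j:
            entities.append(("common", " ".join(tok for tok, _ in tagged[i:k])))
            i = k
            continue
        i += 1
    return entities


def find_hypernym(tagged):
    """Apply an "x is a y" pattern and return the head noun of y, or None.

    Assumption: the head (last) noun of the phrase after "is a/an" is
    taken as the hypernym.
    """
    for i in range(len(tagged) - 2):
        if tagged[i][0].lower() == "is" and tagged[i + 1][0].lower() in ("a", "an"):
            j = i + 2
            while j < len(tagged) and tagged[j][1] == "JJ":
                j += 1
            k = j
            while k < len(tagged) and tagged[k][1] == "NN":
                k += 1
            if k > j:
                return tagged[k - 1][0]
    return None


# Hand-tagged example sentence (the tags are assigned manually here).
sentence = [
    ("Diego", "NNP"), ("Maradona", "NNP"), ("is", "VBZ"), ("a", "DT"),
    ("retired", "JJ"), ("football", "NN"), ("player", "NN"), (".", "."),
]
print(chunk_entities(sentence))
print(find_hypernym(sentence))
```

On the hand-tagged sentence, the first call yields one named entity ("Diego Maradona") and one common entity ("retired football player"), and the second returns "player" as the hypernym candidate.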
  • 7. Entity Linking and Publication
    ‣ Entity Linking
      - linking with concepts from DBpedia
      - uses the Wikipedia Search API
      - maps the Wikipedia article URL to its DBpedia representation
    ‣ Publication in NIF
      - NLP Interchange Format (RDF-based representation)
      - each processed document (context) has a unique identifier
      - each entity and hypernym is represented as an offset-based string
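The URL mapping and the offset-based strings on slide 7 might look roughly like the sketch below. It rests on two assumptions that the slides do not confirm: that an English Wikipedia article at `/wiki/<Title>` corresponds to `http://dbpedia.org/resource/<Title>`, and that string offsets are written as an RFC 5147-style `#char=begin,end` fragment, as in later NIF versions. The tool's actual URI scheme may differ. The function names and the `http://example.org/doc1` context identifier are hypothetical.

```python
from urllib.parse import urlsplit

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def wikipedia_to_dbpedia(url):
    """Map an English Wikipedia article URL to a DBpedia resource URI.

    Assumption: only en.wikipedia.org is handled; other language editions
    would need their own DBpedia namespace.
    """
    parts = urlsplit(url)
    if parts.netloc != "en.wikipedia.org" or not parts.path.startswith("/wiki/"):
        raise ValueError("not an English Wikipedia article URL: " + url)
    title = parts.path[len("/wiki/"):]
    return DBPEDIA_RESOURCE + title


def offset_uri(context_uri, text, mention, start=0):
    """Build an offset-based identifier for the first occurrence of `mention`.

    Assumption: offsets are character positions, end-exclusive, written as
    a '#char=begin,end' fragment on the document (context) identifier.
    """
    begin = text.index(mention, start)  # raises ValueError if not found
    end = begin + len(mention)
    return f"{context_uri}#char={begin},{end}"


text = "Diego Maradona is a hockey player."
print(wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Diego_Maradona"))
print(offset_uri("http://example.org/doc1", text, "Diego Maradona"))
print(offset_uri("http://example.org/doc1", text, "hockey player"))
```

Because the identifier is derived from the document identifier plus character offsets, the same entity string at two positions in one document gets two distinct URIs.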
  • 8. Overview
    ‣ Introduction
    ‣ Entity Recognition, Classification and Publication
    ‣ Experiments
    ‣ Conclusion and Future Work
  • 9. Experiments
    ‣ Question addressed
      - How well does our tool recognize, classify and link Named and Common Entities?
    ‣ Experiment setup
      - manually created dataset, the Czech Traveler Dataset
      - 101 Named Entities, 85 Common Entities
      - comparison with 3 other systems: DBpedia Spotlight, Open Calais, Alchemy API
    ‣ Results
      - Named Entities
        • f-score: recognition 0.66, classification 0.66, linking 0.58
      - Common Entities
        • f-score: recognition 0.60, classification 0.51, linking 0.61
      - better results than the other systems in all tasks
        • overtaken only by DBpedia Spotlight in linking of common entities (f-score 0.69)
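Slide 9 reports f-scores without the underlying precision and recall. As a reminder of how such a score is typically derived, here is a minimal sketch that assumes the balanced F1 measure (the slides only say "f-score"). The counts in the usage example are made up for illustration and are not taken from the experiments.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and balanced F1 from raw counts.

    tp: correct system answers, fp: spurious system answers,
    fn: gold-standard items the system missed.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative counts only (not from the Czech Traveler Dataset).
p, r, f = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```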
  • 10. Overview
    ‣ Introduction
    ‣ Entity Recognition, Classification and Publication
    ‣ Experiments
    ‣ Conclusion and Future Work
  • 11. Conclusion and Future Work
    ‣ Tool for Entity Recognition, Classification and Publication
    ‣ Future directions
      - multilingual support: Dutch, German and Czech
      - grammar improvements
      - evaluation on a standard benchmark
  • 12. Feedback
    Thank you! Questions, comments, ideas?
    demo at: http://ner.vse.cz/thd
    Milan Dojchinovski - @m1ci - milan.dojchinovski@fit.cvut.cz - http://dojchinovski.mk