University of Economics, Prague                                                  Czech Technical University in Prague



           Recognizing, Classifying and Linking
           Entities with Wikipedia and DBpedia

                                                  Milan Dojchinovski¹, Tomas Kliegr²
¹ Faculty of Information Technology, Czech Technical University in Prague
² Faculty of Informatics and Statistics, University of Economics, Prague


                                                                Milan Dojchinovski
                              milan.dojchinovski@fit.cvut.cz - @m1ci - http://dojchinovski.mk



                                            The 7th Workshop on Intelligent and Knowledge Oriented Technologies (WIKT 2012)
                                                                                        November 22-23, 2012, Smolenice, SK

 Except where otherwise noted, the content of this presentation is licensed under
 Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported
Overview

 ‣   Introduction

 ‣   Entity Recognition, Classification and Publication

 ‣   Experiments

 ‣   Conclusion and Future Work




Recognizing, Classifying and Linking Entities with Wikipedia and DBpedia - @m1ci - http://dojchinovski.mk   2
Introduction

 ‣    Unsupervised and fully automated:
  -    entity recognition - rule-based lexico-syntactic patterns
  -    entity classification by extraction of hypernyms - targeted hypernym extraction
  -    entity linking to DBpedia concepts

 ‣    Publication as Linked Data
  -    results in NLP Interchange Format (NIF)




Tool Architecture

 ‣   Available as a Web 2.0 application at: http://ner.vse.cz/thd

 ‣   Web API available at: http://ner.vse.cz/thd/docs




                                                          Fig 1. Architecture overview




Entity Recognition and Classification

 ‣    Entity Recognition
  -    2 JAPE grammars: 1) NNP+ 2) JJ* NN+
  -    input: free text
  -    output: Named (e.g., Diego Maradona ) or Common Entities (e.g., hockey player )

 ‣    Entity Classification
  -    supported by the Targeted Hypernym Discovery algorithm
  -    lexico-syntactic patterns, e.g. _x_ is a _y_
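
The two recognition patterns and the "_x_ is a _y_" pattern can be sketched in a few lines. This is only an illustration in Python, not the tool's JAPE grammars: it assumes the text is already tokenized and tagged with Penn Treebank part-of-speech tags, and the function names `chunk_entities` and `find_hypernym` are hypothetical. Taking the last noun of the matched phrase as the hypernym is also an assumption made for the sketch.

```python
def chunk_entities(tagged):
    """Return (kind, text) pairs for maximal NNP+ and JJ* NN+ runs.

    `tagged` is a list of (token, POS tag) pairs; tagging is assumed
    to have been done beforehand.
    """
    entities, i, n = [], 0, len(tagged)
    while i < n:
        j = i
        # Pattern 1: NNP+ (one or more proper nouns) -> Named Entity
        while j < n and tagged[j][1] == "NNP":
            j += 1
        if j > i:
            entities.append(("named", " ".join(w for w, _ in tagged[i:j])))
            i = j
            continue
        # Pattern 2: JJ* NN+ (optional adjectives, then nouns) -> Common Entity
        while j < n and tagged[j][1] == "JJ":
            j += 1
        k = j
        while k < n and tagged[k][1] == "NN":
            k += 1
        if k > j:
            entities.append(("common", " ".join(w for w, _ in tagged[i:k])))
            i = k
        else:
            i += 1
    return entities


def find_hypernym(tagged):
    """Return the head noun of the first 'is/was a/an JJ* NN+' match, or None."""
    for i in range(len(tagged) - 2):
        if tagged[i][0].lower() in {"is", "was"} and tagged[i + 1][0].lower() in {"a", "an"}:
            j = i + 2
            while j < len(tagged) and tagged[j][1] == "JJ":
                j += 1
            k = j
            while k < len(tagged) and tagged[k][1] == "NN":
                k += 1
            if k > j:
                # Assumption: the last noun of the phrase is the hypernym.
                return tagged[k - 1][0]
    return None


# Hand-tagged example sentence (the tags are supplied manually here).
sentence = [("Diego", "NNP"), ("Maradona", "NNP"), ("was", "VBD"), ("a", "DT"),
            ("famous", "JJ"), ("football", "NN"), ("player", "NN"), (".", ".")]
print(chunk_entities(sentence))
print(find_hypernym(sentence))
```

In the tool itself the hypernym pattern is applied by the Targeted Hypernym Discovery algorithm; the sketch only shows the shape of such a pattern match.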




Entity Linking and Publication

 ‣    Entity Linking
  -    linking with concepts from DBpedia
  -    uses the Wikipedia Search API
  -    mapping Wikipedia article URL to its DBpedia representation

 ‣    Publication in NIF
  -    NLP Interchange Format (RDF-based representation)
  -    each processed document (context) has a unique identifier
  -    each entity and hypernym is represented as an offset-based string
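
The URL mapping and the offset-based identifiers can be illustrated as follows. This is a minimal sketch, not the tool's code: it handles English Wikipedia URLs only, the `#char=begin,end` fragment follows the RFC 5147 style used by NIF 2.0 (the tool's own output may use a different offset scheme), and the document URI `http://example.org/doc/1` and the function names are hypothetical.

```python
from urllib.parse import urlsplit


def wikipedia_to_dbpedia(article_url):
    """Map an English Wikipedia article URL to its DBpedia resource URI."""
    path = urlsplit(article_url).path
    prefix = "/wiki/"
    if not path.startswith(prefix):
        raise ValueError(f"not a Wikipedia article URL: {article_url}")
    # DBpedia reuses the Wikipedia article title as the resource name.
    return "http://dbpedia.org/resource/" + path[len(prefix):]


def offset_uri(context_uri, begin, end):
    """Build an offset-based string URI (assumed '#char=begin,end' form)."""
    return f"{context_uri}#char={begin},{end}"


def annotate(context_uri, text, surface, article_url):
    """Describe the first occurrence of `surface` in `text` as a linked string."""
    begin = text.index(surface)
    end = begin + len(surface)
    return {
        "string_uri": offset_uri(context_uri, begin, end),
        "anchor": text[begin:end],
        "begin": begin,
        "end": end,
        "link": wikipedia_to_dbpedia(article_url),
    }


text = "Diego Maradona was a football player."
print(annotate("http://example.org/doc/1", text, "Diego Maradona",
               "http://en.wikipedia.org/wiki/Diego_Maradona"))
```

In an RDF serialization, the dictionary fields would correspond to properties of the string resource, with the context URI identifying the processed document.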




Experiments

 ‣   Question addressed
     -   How well does our tool recognize, classify and link Named and Common Entities?
 ‣   Experiment setup
     -   manually created dataset, Czech Traveler Dataset
     -   101 Named Entities, 85 Common Entities
     -   comparison with 3 other systems: DBpedia Spotlight, Open Calais, Alchemy API
 ‣   Results
     -   Named Entities
         •   f-score: recognition 0.66, classification 0.66, linking 0.58

     -   Common Entities
         •   f-score: recognition 0.60, classification 0.51, linking 0.61

     -   better results than the compared systems in all tasks except one
         •   outperformed only by DBpedia Spotlight in linking of Common Entities (f-score 0.69)
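
For reference, an f-score combines precision and recall. The sketch below assumes the balanced F1 measure, which the slides do not state explicitly, and the counts in the usage example are purely illustrative, not taken from the evaluation. The function name is hypothetical.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from raw match counts."""
    predicted = true_positives + false_positives   # everything the system returned
    gold = true_positives + false_negatives        # everything in the annotated dataset
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / gold if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative counts only (not from the Czech Traveler Dataset).
p, r, f = precision_recall_f1(true_positives=6, false_positives=2, false_negatives=4)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```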


Conclusion and Future Work

 ‣   Tool for Entity Recognition, Classification and Publication

 ‣   Future directions
     -   multilingual support: Dutch, German and Czech
     -   grammar improvements
     -   evaluation on a standard benchmark




Feedback




                                                               Thank you!
                                             Questions, comments, ideas?


                                          demo at: http://ner.vse.cz/thd

                            Milan Dojchinovski                                       @m1ci
                            milan.dojchinovski@fit.cvut.cz                            http://dojchinovski.mk


