NERD: Evaluating Named Entity Recognition Tools in the Web of Data

Giuseppe Rizzo, Senior Data Scientist at Istituto Superiore Mario Boella

Talk "NERD: Evaluating Named Entity Recognition Tools in the Web of Data", given at the WEKEX'11 workshop (ISWC'11), Bonn, Germany

Giuseppe Rizzo <giuseppe.rizzo@eurecom.fr>
Raphaël Troncy <raphael.troncy@eurecom.fr>

What is a Named Entity recognition task?

A task that aims to locate and classify, in a textual document, the names of
persons, organizations, locations, brands and products, as well as numeric
expressions including time, date, money and percent




24 October 2011   Workshop on Web Scale Knowledge Extraction (WEKEX'11)   - 2/21
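
To make "locate and classify" concrete, here is a minimal sketch of what the output of such a task can look like. It is a toy lookup-based tagger, not how any of the extractors discussed in this talk works; the `GAZETTEER` table, the type names and the sample sentence are all invented for illustration.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    text: str   # surface form found in the document
    type: str   # class assigned to the mention
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention


# Hypothetical hand-made lookup table; real extractors use far richer models.
GAZETTEER = {
    "Google": "Organization",
    "Bonn": "Location",
}

# Very rough pattern for a money expression such as "$5" or "$5.50".
MONEY = re.compile(r"\$\d+(?:\.\d+)?")


def recognize(document: str) -> List[Entity]:
    """Locate known names and money amounts, and attach a type to each."""
    entities = []
    for name, entity_type in GAZETTEER.items():
        for match in re.finditer(re.escape(name), document):
            entities.append(
                Entity(match.group(), entity_type, match.start(), match.end())
            )
    for match in MONEY.finditer(document):
        entities.append(Entity(match.group(), "Money", match.start(), match.end()))
    # Return mentions in reading order.
    return sorted(entities, key=lambda entity: entity.start)


document = "Google opened an office in Bonn for $5 million."
entities = recognize(document)
for entity in entities:
    print(entity)
```

Each result carries both the location (character offsets) and the classification (type) of a mention, which are the two halves of the task definition above.
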
Named Entity recognition tools




Differences among those NER extractors

  - Granularity
      - extract NE from sentences vs from the entire document
  - Technologies used
      - algorithms used to extract NE
      - supported languages
      - taxonomy of type of NE recognized
      - disambiguation (dataset used to provide links)
      - content request size
      - response format

And ...

  - What about precision and recall?
  - Which extractor best fits my needs?

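As a reminder of what the two measures in the first question mean, the sketch below computes precision and recall for a single document from a set of extracted entities and a set of entities judged correct. The function name and the sample entity sets are illustrative assumptions, not data from the talk.

```python
from typing import Set, Tuple


def precision_recall(extracted: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one document.

    precision = correct extractions / everything the extractor returned
    recall    = correct extractions / everything it should have returned
    """
    true_positives = len(extracted & relevant)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: the extractor returned four entities, two of them correct,
# and missed one entity that a human judged relevant.
extracted = {"Google", "Bonn", "Tuesday", "Cars"}
relevant = {"Google", "Bonn", "Sebastian Thrun"}

precision, recall = precision_recall(extracted, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

An extractor that returns many entities tends to score higher on recall and lower on precision, which is one reason the second question has no single answer.
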
What is NERD?

NERD seeks to find the pros and cons of those extractors.

  - REST API: http://nerd.eurecom.fr/api/application.wadl
  - UI: http://nerd.eurecom.fr/
  - ontology: http://nerd.eurecom.fr/ontology

NERD: Evaluating Named Entity Recognition Tools in the Web of Data

  • 1. NERD: Evaluating Named Entity Recognition Tools in the Web of Data Giuseppe Rizzo <giuseppe.rizzo@eurecom.fr> Raphaël Troncy <raphael.troncy@eurecom.fr>
  • 2. What is a Named Entity recognition task? A task that aims to locate and classify the name of a person or an organization, a location, a brand, a product, a numeric expression including time, date, money and percent in a textual document 24 October 2011 Workshop on Web Scale Knowledge Extraction (WEKEX'11) - 2/21
  • 3. Named Entity recognition tools
  • 4. Differences among those NER extractors
      - Granularity
          - extract NE from sentences vs. from the entire document
      - Technologies used
          - algorithms used to extract NE
          - supported languages
          - taxonomy of type of NE recognized
          - disambiguation (dataset used to provide links)
          - content request size
          - response format
  • 5. And ...
      - What about precision and recall?
      - Which extractor best fits my needs?
  • 6. What is NERD? It seeks to find the pros and cons of those extractors. Components: REST API [1], UI [2], ontology [3]
      [1] http://nerd.eurecom.fr/api/application.wadl
      [2] http://nerd.eurecom.fr/
      [3] http://nerd.eurecom.fr/ontology
  • 7. Showcase: http://nerd.eurecom.fr. Science article: "Google Cars Drive Themselves", http://bit.ly/oTj8md (part of the original resource found at http://nyti.ms/9p19i8)
  • 8. Evaluation: 5 extractors using default configurations
      - Controlled experiment
          - 4 human raters
          - 10 English news articles (5 from BBC and 5 from The New York Times)
          - each rater evaluated each article for all the extractors
          - 200 evaluations in total
      - Uncontrolled experiment
          - 17 human raters
          - 53 English news articles (sources: CNN, BBC, The New York Times and Yahoo! News)
          - free selection of articles
      Each human rater received training [1]
      [1] http://nerd.eurecom.fr/help
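
The figure of 200 evaluations on slide 8 follows from the controlled design: 4 raters, each rating 10 articles for all 5 extractors. A minimal sketch that enumerates those assignments; the identifiers below are placeholders, since the slide gives only the counts.

```python
from itertools import product

# Placeholder identifiers; the slide states only how many of each there are.
raters = [f"rater-{i}" for i in range(1, 5)]          # 4 human raters
articles = [f"article-{i}" for i in range(1, 11)]     # 10 news articles
extractors = [f"extractor-{i}" for i in range(1, 6)]  # 5 extractors

# Each rater evaluates each article for all the extractors.
evaluations = list(product(raters, articles, extractors))
print(len(evaluations))  # 4 * 10 * 5 = 200
```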
  • 9. Evaluation output: t = (NE, type, URI, relevant). The assessment consists of rating each of these criteria with a Boolean value. If the extractor provides no type or no disambiguation URI, that criterion is considered false by default
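
The rating tuple on slide 9 could be modelled as a small record of four Booleans. This is only a sketch: the class, field and function names are illustrative and are not taken from the NERD code base. The one rule it encodes from the slide is that a missing type or URI is rated false by default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assessment:
    """One rated tuple t = (NE, type, URI, relevant). Names are hypothetical."""
    ne_correct: bool    # the detected string is a named entity
    type_correct: bool  # the assigned type is right
    uri_correct: bool   # the URI disambiguates the entity correctly
    relevant: bool      # the entity is relevant to the article


def rate(entity_type: Optional[str], uri: Optional[str],
         ne_ok: bool, type_ok: bool, uri_ok: bool, relevant: bool) -> Assessment:
    """Build an assessment, forcing a criterion to False when the extractor
    returned no type or no disambiguation URI."""
    return Assessment(
        ne_correct=ne_ok,
        type_correct=type_ok and entity_type is not None,
        uri_correct=uri_ok and uri is not None,
        relevant=relevant,
    )


# An entity returned with a type but without a URI: the URI criterion is False.
example = rate("Organization", None, True, True, True, True)
print(example)
```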
  • 10. Controlled experiment - dataset [1]. Categories: World, Business, Sport, Science, Health, with 1 BBC article and 1 NYT article for each category. Average number of words per article: 981. The final number of unique entities detected is 4641, with an average number of named entities per article equal to 23.2. Some of the extractors (e.g. DBpedia Spotlight and Extractiv) return duplicate NEs; we removed all duplicates so as not to bias the statistics. [1] http://nerd.eurecom.fr/ui/evaluation/wekex2011-goldenset.tar.gz
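
The duplicate removal mentioned on slide 10 might look like the sketch below. The slides do not say what counts as a duplicate, so the key used here (case-insensitive surface form, type and URI) and the dictionary layout of an entity are assumptions made for illustration.

```python
def deduplicate(entities):
    """Keep the first occurrence of each (surface form, type, URI) triple.

    Assumption: two entities are duplicates when they share the same
    case-insensitive text, the same type and the same URI.
    """
    seen = set()
    unique = []
    for entity in entities:
        key = (entity["text"].strip().lower(), entity.get("type"), entity.get("uri"))
        if key not in seen:
            seen.add(key)
            unique.append(entity)
    return unique


# Made-up extractor output for a single article.
raw = [
    {"text": "Google", "type": "Organization", "uri": "http://dbpedia.org/resource/Google"},
    {"text": "google", "type": "Organization", "uri": "http://dbpedia.org/resource/Google"},
    {"text": "Stanford", "type": "Organization", "uri": None},
]
unique = deduplicate(raw)
print(len(unique))  # 2
```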
  • 11. Controlled experiment - agreement score. Fleiss' kappa score [1]: grouped by extractor; grouped by source; grouped by category. [1] Joseph L. Fleiss. Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5):378-382, 1971
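
For readers who want to reproduce an agreement score like the one on slide 11, here is a minimal sketch of Fleiss' kappa following the standard definition in Fleiss (1971). The input layout (one row per rated item, one column per category, each cell holding the number of raters who chose that category) and the sample data are assumptions; the slides do not show how the authors computed the score.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to category j. Every row must sum to the same
    number of raters n, with n >= 2."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p = [sum(row[j] for row in counts) / (n_items * n_raters)
         for j in range(n_categories)]
    # Observed agreement per item, then averaged over items.
    per_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
                for row in counts]
    p_bar = sum(per_item) / n_items
    # Agreement expected by chance.
    p_e = sum(pj * pj for pj in p)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1 - p_e)


# Made-up Boolean ratings from 4 raters; columns are (true, false).
perfect = [[4, 0], [0, 4], [4, 0], [0, 4]]
split = [[2, 2], [2, 2]]
print(fleiss_kappa(perfect))  # 1.0
print(fleiss_kappa(split))    # negative: less agreement than chance
```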
  • 12. Controlled experiment - statistical results: overall statistics; grouped by extractor (different behavior for different sources); grouped by category
  • 13. Uncontrolled experiment - dataset. 17 raters were free to select English news articles from CNN, BBC, The New York Times and Yahoo! News; 53 news articles were selected. Total number of assessments = 94, and the average number of assessments per user = 5.2. Each article was assessed by at least 2 different tools. The final number of unique entities detected is 1616, with an average number of named entities per article equal to 34. Some of the extractors (e.g. DBpedia Spotlight and Extractiv) return duplicate NEs; we removed all duplicates so as not to bias the statistics
  • 14. Uncontrolled experiment - statistical results (I): overall precision; grouped by extractor
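
The slides do not state how precision was derived from the Boolean ratings. One plausible reading, sketched here purely as an assumption, is the share of assessed items that raters marked true for a given criterion, computed per extractor. The extractor names and ratings below are invented.

```python
def precision(ratings):
    """Fraction of Boolean ratings that are True (assumed definition)."""
    ratings = list(ratings)
    if not ratings:
        raise ValueError("no ratings to aggregate")
    return sum(ratings) / len(ratings)


# Invented ratings for one criterion (e.g. the NE itself), per extractor.
by_extractor = {
    "extractor-A": [True, True, False, True],
    "extractor-B": [True, False, False, False],
}
scores = {name: precision(values) for name, values in by_extractor.items()}
print(scores)
```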
  • 15. Uncontrolled experiment - statistical results (II): grouped by category
  • 16. Conclusion. Q. Which are the best NER tools? A. They are ... AlchemyAPI obtained the best results in NE extraction and categorization. DBpedia Spotlight and Zemanta showed the ability to disambiguate NEs in the LOD cloud. Experiments across categories of articles did not show significant differences in the analysis. The WEKEX'11 ground truth is published at http://nerd.eurecom.fr/ui/evaluation/wekex2011-goldenset.tar.gz
  • 17. Future Work (NERD Timeline): beginning; core application; uncontrolled experiment; controlled experiment; today: REST API, release of the WEKEX'11 ground truth; release of the ISWC'11 ground truth; NERD "smart" service: combining the best of all NER tools
  • 18. ISWC'11 golden set. Do you believe it is easy to reach agreement among all raters? We would like to invite you to create a new golden set during the ISWC 2011 poster and demo session. We will kindly ask each rater to evaluate two short parts of two English news articles with all extractors supported by NERD
  • 19. Thanks for your time and your attention. http://nerd.eurecom.fr @giusepperizzo @rtroncy #nerd http://www.slideshare.net/giusepperizzo
  • 20. Fleiss' kappa: agreement among raters, corrected for chance agreement. K = 1: full agreement among all raters. K = 0 (or less): poor agreement
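
The formula shown on slide 20 did not survive in this transcript. The standard definition from Fleiss (1971), which is assumed to be what the slide displayed, is given below for N items, n raters per item and k categories, where n_ij is the number of raters who assigned item i to category j.

```latex
\kappa = \frac{\bar{P} - \bar{P}_e}{1 - \bar{P}_e},
\qquad
\bar{P} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{n(n-1)} \left( \sum_{j=1}^{k} n_{ij}^{2} - n \right),
\qquad
\bar{P}_e = \sum_{j=1}^{k} p_j^{2},
\qquad
p_j = \frac{1}{N n} \sum_{i=1}^{N} n_{ij}
```

Here \bar{P}_e is the chance agreement term: \kappa = 1 when observed agreement \bar{P} is complete, and \kappa \le 0 when it is no better than chance.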
  • 21. Fleiss' kappa interpretation
      Kappa          Interpretation
      < 0            Poor agreement
      0.01 - 0.20    Slight agreement
      0.21 - 0.40    Fair agreement
      0.41 - 0.60    Moderate agreement
      0.61 - 0.80    Substantial agreement
      0.81 - 1.00    Almost perfect agreement
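
The interpretation bands on slide 21 can be applied programmatically with a small lookup. This is a sketch with two assumptions: a kappa of exactly 0 is treated as poor agreement (as slide 20 suggests), and values that fall between the printed band edges, such as 0.205, are assigned to the next higher band.

```python
# Upper bound of each band and its label, taken from the slide's table.
BANDS = [
    (0.20, "Slight agreement"),
    (0.40, "Fair agreement"),
    (0.60, "Moderate agreement"),
    (0.80, "Substantial agreement"),
    (1.00, "Almost perfect agreement"),
]


def interpret_kappa(kappa):
    """Map a Fleiss' kappa value to the label from the interpretation table."""
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa <= 0.0:
        return "Poor agreement"
    for upper, label in BANDS:
        if kappa <= upper:
            return label
    raise ValueError("kappa is not a valid number")


print(interpret_kappa(0.55))  # Moderate agreement
```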