NERD meets NIF:
Lifting NLP Extraction Results to the Linked Data Cloud

Giuseppe Rizzo and Raphaël Troncy
EURECOM, France
Sebastian Hellmann and Martin Bruemmer
Universität Leipzig, Germany

What is a Named Entity Recognition task?
A task that aims to locate and classify, in a textual document, the names
of persons, organizations, locations, brands and products, as well as
numeric expressions such as times, dates, monetary amounts and percentages




 16/04/2012        5th Workshop on Linked Data on the Web (LDOW2012)   2/15
NER tools

 Standalone software
   GATE
   Stanford CoreNLP
   Temis

 Web APIs




Factual comparison of 10 Web NER tools
Tool               Language                 Granularity  Entity position  Classification schema          Classes  Response format              Quota (calls/day)
Alchemy API        EN,FR,GR,IT,PT,RU,SP,SW  OEN          N/A              Alchemy                        324      JSON, MicroFormat, XML, RDF  30000
DBpedia Spotlight  EN,GR*,PT*,SP*           OEN          char offset      DBpedia, FreeBase, Schema.org  320      HTML, JSON, RDF, XML         unlimited
Evri               EN,IT                    OED          N/A              Evri                           5        HTML, JSON, RDF              3000
Extractiv          EN                       OEN          word offset      DBpedia                        34       HTML, JSON, RDF, XML         3000
Lupedia            EN,FR,IT                 OEN          range of chars   DBpedia, LinkedMDB             319      HTML, JSON, RDFa, XML        unlimited
OpenCalais         EN,FR,SP                 OEN          char offset      OpenCalais                     95       JSON, MicroFormat            50000
Saplo              EN,SW                    OED          N/A              N/A                            5        JSON                         1333
Wikimeta           EN,FR,SP                 OEN          POS offset       ESTER                          7        JSON, XML                    unlimited
Yahoo!             EN                       OEN          range of chars   Yahoo                          13       JSON, XML                    5000
Zemanta            EN                       OED          N/A              FreeBase                       81       XML, JSON, RDF               10000
What is NERD?
    ontology [1]    REST API [2]    UI [3]

    The NERD ontology has been integrated in the NIF project, an EU FP7
    effort in the context of LOD2: Creating Knowledge out of Interlinked Data

    [1] http://nerd.eurecom.fr/ontology
    [2] http://nerd.eurecom.fr/api/application.wadl
    [3] http://nerd.eurecom.fr


NERD Ontology




 Aligned the taxonomies used by the extractors
Building the NERD Ontology

 NERD type       Occurrence
 Person          10
 Organization    10
 Country          6
 Company          6
 Location         6
 Continent        5
 City             5
 RadioStation     5
 Album            5
 Product          5
 ...             ...




Ontology alignment validation

 5 TED talks
 1000 NYT news articles
 217 WWW2011 abstracts

Integration
 Different outputs for the NLP tools (standalone and Web APIs)

   OpenCalais
     "_type": "Organization",
     "name": "North Atlantic Treaty Organization",
     "organizationtype": "governmental civilian",
     "nationality": "N/A",
     "_typeReference": "http://s.opencalais.com/1/type/em/e/Organization",
     ...

   DBpedia Spotlight
     "@URI": "http://dbpedia.org/resource/DBpedia",
     "@types": "DBpedia:Software,DBpedia:Work",
     "@surfaceForm": "dbpedia",
     "@offset": "0",
     "@support": "11",
     "@similarityScore": "0.2387271374464035",
     ...

 Manual effort is needed for integration or reuse
   time consuming
   difficult to track definitions
 NERD creates a shareable JSON/RDF annotation output

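The integration effort described on this slide can be sketched in a few lines of Python. Two adapter functions map the OpenCalais and DBpedia Spotlight records quoted above onto one shared shape. The target keys (entity, type, uri) are borrowed from the NERD JSON output; the adapter names, the extra "extractor" key, the use of @surfaceForm as the entity label and the choice of the first listed DBpedia type are assumptions made for illustration, not NERD's actual implementation.

```python
# Sample records copied from the slide (elided fields are left out).
opencalais_record = {
    "_type": "Organization",
    "name": "North Atlantic Treaty Organization",
    "organizationtype": "governmental civilian",
    "nationality": "N/A",
}

spotlight_record = {
    "@URI": "http://dbpedia.org/resource/DBpedia",
    "@types": "DBpedia:Software,DBpedia:Work",
    "@surfaceForm": "dbpedia",
    "@offset": "0",
    "@support": "11",
    "@similarityScore": "0.2387271374464035",
}


def from_opencalais(record):
    """Hypothetical adapter: OpenCalais record -> shared annotation shape."""
    return {
        "entity": record["name"],
        "type": record["_type"],
        "uri": None,  # the OpenCalais sample on the slide shows no DBpedia URI
        "extractor": "opencalais",
    }


def from_spotlight(record):
    """Hypothetical adapter: DBpedia Spotlight record -> shared annotation shape."""
    # "@types" is a comma-separated string; keep the first type, minus its prefix.
    types = [t for t in record["@types"].split(",") if t]
    return {
        "entity": record["@surfaceForm"],
        "type": types[0].split(":", 1)[1] if types else None,
        "uri": record["@URI"],
        "extractor": "dbpediaspotlight",
    }


annotations = [from_opencalais(opencalais_record), from_spotlight(spotlight_record)]
for annotation in annotations:
    print(annotation["extractor"], annotation["entity"], annotation["type"])
```

Once both records share the same keys, downstream code no longer needs to know which extractor produced an annotation.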
NERD REST API

 Resources
   /document/{idDocument}
   /user/{idUser}
   /annotation/{extractor}
   /extraction/{idExtraction}
   /evaluation
   ...

 Methods: GET, POST, PUT, DELETE
 Output: JSON, RDF

 JSON output
   "entities": [{
       "entity": "W3C",
       "type": "Organization",
       "uri": "http://dbpedia.org/page/W3C",
       "nerdType": "http://nerd.eurecom.fr/ontology#Organization",
       "startChar": 30,
       "endChar": 32,
       "confidence": 1,
       "relevance": 0.5
   }]




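A minimal sketch of how a client might consume the JSON output shown on this slide. The slide prints only the "entities" fragment, so wrapping it in an enclosing JSON object is an assumption, and the helper name group_by_nerd_type is hypothetical. No HTTP request is made, because the slide does not show the request parameters.

```python
import json

# The "entities" fragment from the slide, wrapped in an object (assumption)
# so that it parses as a complete JSON document.
response_body = """
{
  "entities": [{
    "entity": "W3C",
    "type": "Organization",
    "uri": "http://dbpedia.org/page/W3C",
    "nerdType": "http://nerd.eurecom.fr/ontology#Organization",
    "startChar": 30,
    "endChar": 32,
    "confidence": 1,
    "relevance": 0.5
  }]
}
"""


def group_by_nerd_type(body):
    """Group entity labels by the local name of their NERD class."""
    grouped = {}
    for item in json.loads(body)["entities"]:
        # Keep what follows '#', e.g. "Organization".
        local_name = item["nerdType"].rsplit("#", 1)[-1]
        grouped.setdefault(local_name, []).append(item["entity"])
    return grouped


print(group_by_nerd_type(response_body))  # {'Organization': ['W3C']}
```

Grouping on nerdType rather than on the extractor-specific type field is what lets results from different extractors be compared directly.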
Textual annotation

 Let's consider the URI:
 http://www.w3.org/DesignIssues/LinkedData.html
     The Semantic Web isn't just about putting data on the web. It is about
     making links, so that a person or machine can explore the web of data.
     With linked data, when you have some of it, you can find other, related,
     data.
     ...
     All the above plus, Use open standards from W3C (RDF and SPARQL) to
     identify things, so that people can point at your stuff
     ...

   entities: {
       …
       [entity: W3C, startChar: 23107, endChar: 23110],
       …
   }
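The offsets in this annotation can be illustrated with a short sketch. The pair 23107/23110 spans three characters, the length of "W3C", which is consistent with an end-exclusive endChar; that reading is an assumption. The text below is only the excerpt quoted on the slide, so the offsets it yields differ from those of the full document, and the helper name locate is hypothetical.

```python
# Excerpt quoted on the slide; the full page is not reproduced here.
text = ("All the above plus, Use open standards from W3C (RDF and SPARQL) "
        "to identify things, so that people can point at your stuff")


def locate(text, surface_form):
    """Return every (startChar, endChar) pair for surface_form, end-exclusive."""
    spans = []
    start = text.find(surface_form)
    while start != -1:
        spans.append((start, start + len(surface_form)))
        start = text.find(surface_form, start + 1)
    return spans


spans = locate(text, "W3C")
for start, end in spans:
    # Slicing with the offsets must give back the surface form.
    assert text[start:end] == "W3C"
print(spans)
```

Checking that a slice of the document gives back the surface form is a cheap way to detect offset conventions that differ between extractors.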
NERD meets NIF

 Model documents through a set of strings dereferenceable within the Web
     :offset_23107_23110 a str:String ;
         str:referenceContext :offset_0_26546 .

 Map string to entity
     :offset_23107_23110 sso:oen dbpedia:W3C .

 Classification
     dbpedia:W3C rdf:type nerd:Organization .



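The three statements on this slide can be produced from one extracted entity with a few lines of Python. This is a sketch under stated assumptions: the function name nif_triples and its parameters are hypothetical, the output is plain strings rather than a parsed RDF graph, and prefix declarations are omitted because the slide does not give the namespace URIs for str: and sso:.

```python
def nif_triples(start, end, doc_length, dbpedia_name, nerd_class):
    """Render the slide's three statements for one entity as Turtle-like lines."""
    string_id = f":offset_{start}_{end}"      # the annotated string
    context_id = f":offset_0_{doc_length}"    # the whole document
    return [
        # 1. Model the string and link it to its document context.
        f"{string_id} a str:String ;",
        f"    str:referenceContext {context_id} .",
        # 2. Map the string to the entity.
        f"{string_id} sso:oen dbpedia:{dbpedia_name} .",
        # 3. Classify the entity with a NERD class.
        f"dbpedia:{dbpedia_name} rdf:type nerd:{nerd_class} .",
    ]


for line in nif_triples(23107, 23110, 26546, "W3C", "Organization"):
    print(line)
```

Called with the values from the slide, the function reproduces the statements above for the W3C entity.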
NERD User Interface




Conclusions and perspectives
NERD UI and REST API
    unified interface for extracting NEs from various types of text
NERD ontology
    common schema for entity classification
NERD & NIF
    lift the extraction and annotation results to the LOD cloud

Systematic comparison for the NE extraction and classification tasks:
    ETAPE corpus
    CoNLL 2003 corpus

Combining several extractions to go beyond the strengths of a single tool
Thanks for your time and your attention




             http://nerd.eurecom.fr

             @giusepperizzo @rtroncy #nerd



             http://www.slideshare.net/giusepperizzo




16/04/2012   5th Workshop on Linked Data on the Web (LDOW2012)   15/15

More Related Content

What's hot

LDL 2012 - Linking to ISOcat Data Categories
LDL 2012 - Linking to ISOcat Data CategoriesLDL 2012 - Linking to ISOcat Data Categories
LDL 2012 - Linking to ISOcat Data Categories
Menzo Windhouwer
 
Pal gov.tutorial2.session10.sparql
Pal gov.tutorial2.session10.sparqlPal gov.tutorial2.session10.sparql
Pal gov.tutorial2.session10.sparql
Mustafa Jarrar
 
Pal gov.tutorial2.session13 3.data integration and fusion using rdf
Pal gov.tutorial2.session13 3.data integration and fusion using rdfPal gov.tutorial2.session13 3.data integration and fusion using rdf
Pal gov.tutorial2.session13 3.data integration and fusion using rdf
Mustafa Jarrar
 
Calvin Hendryx Parker, Enabling the Semantic Web with RDF
Calvin Hendryx Parker, Enabling the Semantic Web with RDFCalvin Hendryx Parker, Enabling the Semantic Web with RDF
Calvin Hendryx Parker, Enabling the Semantic Web with RDF
webcontent2007
 
Pal gov.tutorial2.session5 2.rdfs_jarrar
Pal gov.tutorial2.session5 2.rdfs_jarrarPal gov.tutorial2.session5 2.rdfs_jarrar
Pal gov.tutorial2.session5 2.rdfs_jarrar
Mustafa Jarrar
 

What's hot (14)

Ontologies in RDF-S/OWL
Ontologies in RDF-S/OWLOntologies in RDF-S/OWL
Ontologies in RDF-S/OWL
 
A year on the Semantic Web @ W3C
A year on the Semantic Web @ W3CA year on the Semantic Web @ W3C
A year on the Semantic Web @ W3C
 
LDL 2012 - Linking to ISOcat Data Categories
LDL 2012 - Linking to ISOcat Data CategoriesLDL 2012 - Linking to ISOcat Data Categories
LDL 2012 - Linking to ISOcat Data Categories
 
Contexts and Importing in RDF
Contexts and Importing in RDFContexts and Importing in RDF
Contexts and Importing in RDF
 
Data in RDF
Data in RDFData in RDF
Data in RDF
 
2010 10 provxg_datagovuk
2010 10 provxg_datagovuk2010 10 provxg_datagovuk
2010 10 provxg_datagovuk
 
Exchange and Consumption of Huge RDF Data
Exchange and Consumption of Huge RDF DataExchange and Consumption of Huge RDF Data
Exchange and Consumption of Huge RDF Data
 
Jarrar: RDF Stores -Challenges and Solutions
Jarrar: RDF Stores -Challenges and SolutionsJarrar: RDF Stores -Challenges and Solutions
Jarrar: RDF Stores -Challenges and Solutions
 
Pal gov.tutorial2.session10.sparql
Pal gov.tutorial2.session10.sparqlPal gov.tutorial2.session10.sparql
Pal gov.tutorial2.session10.sparql
 
Pal gov.tutorial2.session13 3.data integration and fusion using rdf
Pal gov.tutorial2.session13 3.data integration and fusion using rdfPal gov.tutorial2.session13 3.data integration and fusion using rdf
Pal gov.tutorial2.session13 3.data integration and fusion using rdf
 
Compact Representation of Large RDF Data Sets for Publishing and Exchange
Compact Representation of Large RDF Data Sets for Publishing and ExchangeCompact Representation of Large RDF Data Sets for Publishing and Exchange
Compact Representation of Large RDF Data Sets for Publishing and Exchange
 
Binary RDF for Scalable Publishing, Exchanging and Consumption in the Web of ...
Binary RDF for Scalable Publishing, Exchanging and Consumption in the Web of ...Binary RDF for Scalable Publishing, Exchanging and Consumption in the Web of ...
Binary RDF for Scalable Publishing, Exchanging and Consumption in the Web of ...
 
Calvin Hendryx Parker, Enabling the Semantic Web with RDF
Calvin Hendryx Parker, Enabling the Semantic Web with RDFCalvin Hendryx Parker, Enabling the Semantic Web with RDF
Calvin Hendryx Parker, Enabling the Semantic Web with RDF
 
Pal gov.tutorial2.session5 2.rdfs_jarrar
Pal gov.tutorial2.session5 2.rdfs_jarrarPal gov.tutorial2.session5 2.rdfs_jarrar
Pal gov.tutorial2.session5 2.rdfs_jarrar
 

Viewers also liked

Habits of mind launch
Habits of mind launchHabits of mind launch
Habits of mind launch
douglasgreig
 
Edisi 12 Aceh Sep
Edisi 12 Aceh SepEdisi 12 Aceh Sep
Edisi 12 Aceh Sep
epaper
 
10desnasyg Bner
10desnasyg Bner10desnasyg Bner
10desnasyg Bner
epaper
 
Edisi 31 Okt Nas
Edisi 31 Okt NasEdisi 31 Okt Nas
Edisi 31 Okt Nas
epaper
 
28des N As
28des N As28des N As
28des N As
epaper
 
Edisi2novaceh
Edisi2novacehEdisi2novaceh
Edisi2novaceh
epaper
 
Edisi2novnas
Edisi2novnasEdisi2novnas
Edisi2novnas
epaper
 
Edisi10okt
Edisi10oktEdisi10okt
Edisi10okt
epaper
 
Edisi1nas
Edisi1nasEdisi1nas
Edisi1nas
epaper
 
Edisi26sep
Edisi26sepEdisi26sep
Edisi26sep
epaper
 
GXYSearch: job seeking advice
GXYSearch: job seeking adviceGXYSearch: job seeking advice
GXYSearch: job seeking advice
Joanne Spain
 
Edisi26aceh
Edisi26acehEdisi26aceh
Edisi26aceh
epaper
 
Edisi 6 Des Nas
Edisi 6 Des NasEdisi 6 Des Nas
Edisi 6 Des Nas
epaper
 
Edisi 16 Nov Aceh
Edisi 16 Nov AcehEdisi 16 Nov Aceh
Edisi 16 Nov Aceh
epaper
 

Viewers also liked (20)

Habits of mind launch
Habits of mind launchHabits of mind launch
Habits of mind launch
 
Edisi 12 Aceh Sep
Edisi 12 Aceh SepEdisi 12 Aceh Sep
Edisi 12 Aceh Sep
 
10desnasyg Bner
10desnasyg Bner10desnasyg Bner
10desnasyg Bner
 
Edisi 31 Okt Nas
Edisi 31 Okt NasEdisi 31 Okt Nas
Edisi 31 Okt Nas
 
28des N As
28des N As28des N As
28des N As
 
Edisi2novaceh
Edisi2novacehEdisi2novaceh
Edisi2novaceh
 
Edisi2novnas
Edisi2novnasEdisi2novnas
Edisi2novnas
 
Lect 8 env sustainability in oe 2013
Lect 8 env sustainability in oe 2013Lect 8 env sustainability in oe 2013
Lect 8 env sustainability in oe 2013
 
Edisi10okt
Edisi10oktEdisi10okt
Edisi10okt
 
Edisi1nas
Edisi1nasEdisi1nas
Edisi1nas
 
Edisi26sep
Edisi26sepEdisi26sep
Edisi26sep
 
Vim Cards - Keynote Format
Vim Cards - Keynote FormatVim Cards - Keynote Format
Vim Cards - Keynote Format
 
102215 front porch think
102215 front porch think102215 front porch think
102215 front porch think
 
GXYSearch: job seeking advice
GXYSearch: job seeking adviceGXYSearch: job seeking advice
GXYSearch: job seeking advice
 
Edisi26aceh
Edisi26acehEdisi26aceh
Edisi26aceh
 
Edisi 6 Des Nas
Edisi 6 Des NasEdisi 6 Des Nas
Edisi 6 Des Nas
 
15 sep 11 bt property 2011_ultra luxury apartments here to stay
15 sep 11 bt property 2011_ultra luxury apartments here to stay15 sep 11 bt property 2011_ultra luxury apartments here to stay
15 sep 11 bt property 2011_ultra luxury apartments here to stay
 
Analisis tendencias pedagogicas caso personal
Analisis tendencias pedagogicas caso personalAnalisis tendencias pedagogicas caso personal
Analisis tendencias pedagogicas caso personal
 
Edisi 16 Nov Aceh
Edisi 16 Nov AcehEdisi 16 Nov Aceh
Edisi 16 Nov Aceh
 
great Frank lloyd wright
great Frank lloyd wright great Frank lloyd wright
great Frank lloyd wright
 

Similar to NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud

Semantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceSemantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business Intelligence
Marin Dimitrov
 
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift
 
LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013
Luis Daniel Ibáñez
 
EDF2012 Mariana Damova - Factforge
EDF2012   Mariana Damova - FactforgeEDF2012   Mariana Damova - Factforge
EDF2012 Mariana Damova - Factforge
European Data Forum
 

Similar to NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud (20)

Accessing the Linked Open Data Cloud via ODBC
Accessing the Linked Open Data Cloud via ODBCAccessing the Linked Open Data Cloud via ODBC
Accessing the Linked Open Data Cloud via ODBC
 
RJ Broker, ORI, & OpenDepot.org
RJ Broker, ORI, & OpenDepot.orgRJ Broker, ORI, & OpenDepot.org
RJ Broker, ORI, & OpenDepot.org
 
Semantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceSemantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business Intelligence
 
What Factors Influence the Design of a Linked Data Generation Algorithm?
What Factors Influence the Design of a Linked Data Generation Algorithm?What Factors Influence the Design of a Linked Data Generation Algorithm?
What Factors Influence the Design of a Linked Data Generation Algorithm?
 
Applying Semantic Extensions And New Services To Drupal Sem Tech June 2010
Applying Semantic Extensions And New Services To Drupal   Sem Tech June 2010Applying Semantic Extensions And New Services To Drupal   Sem Tech June 2010
Applying Semantic Extensions And New Services To Drupal Sem Tech June 2010
 
Make your data great again - Ver 2
Make your data great again - Ver 2Make your data great again - Ver 2
Make your data great again - Ver 2
 
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
 
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
 
Websci17 final
Websci17 finalWebsci17 final
Websci17 final
 
Linking the world with Python and Semantics
Linking the world with Python and SemanticsLinking the world with Python and Semantics
Linking the world with Python and Semantics
 
.Net and Rdf APIs
.Net and Rdf APIs.Net and Rdf APIs
.Net and Rdf APIs
 
Comparing Vocabularies for Representing Geographical Features and Their Geometry
Comparing Vocabularies for Representing Geographical Features and Their GeometryComparing Vocabularies for Representing Geographical Features and Their Geometry
Comparing Vocabularies for Representing Geographical Features and Their Geometry
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013
 
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven RecipesReasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
 
Fact forge20 edf
Fact forge20 edfFact forge20 edf
Fact forge20 edf
 
LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013
 
Sheldon challenge
Sheldon challengeSheldon challenge
Sheldon challenge
 
EDF2012 Mariana Damova - Factforge
EDF2012   Mariana Damova - FactforgeEDF2012   Mariana Damova - Factforge
EDF2012 Mariana Damova - Factforge
 
LOD2 Webinar: The 2nd release of the LOD2 stack
LOD2 Webinar: The 2nd release of the LOD2 stackLOD2 Webinar: The 2nd release of the LOD2 stack
LOD2 Webinar: The 2nd release of the LOD2 stack
 
Open Data & Linked Open Data
Open Data & Linked Open DataOpen Data & Linked Open Data
Open Data & Linked Open Data
 

More from Giuseppe Rizzo

Zenaminer: driving the SCORM tandard towards the Web of Data
Zenaminer: driving the SCORM tandard towards the Web of DataZenaminer: driving the SCORM tandard towards the Web of Data
Zenaminer: driving the SCORM tandard towards the Web of Data
Giuseppe Rizzo
 

More from Giuseppe Rizzo (20)

Artificial intelligence for social good
Artificial intelligence for social goodArtificial intelligence for social good
Artificial intelligence for social good
 
AI in 60 minutes
AI in 60 minutesAI in 60 minutes
AI in 60 minutes
 
COMPRENDE, PERSONALIZZA, INTERAGISCE E IMPARA: L’AI COGNITIVA PER L’HR
COMPRENDE, PERSONALIZZA, INTERAGISCE E  IMPARA: L’AI COGNITIVA PER L’HRCOMPRENDE, PERSONALIZZA, INTERAGISCE E  IMPARA: L’AI COGNITIVA PER L’HR
COMPRENDE, PERSONALIZZA, INTERAGISCE E IMPARA: L’AI COGNITIVA PER L’HR
 
Understand, Answer and Argument: Conversational Agents
Understand, Answer and Argument: Conversational AgentsUnderstand, Answer and Argument: Conversational Agents
Understand, Answer and Argument: Conversational Agents
 
AI For Profiling Your Customers
AI For Profiling Your CustomersAI For Profiling Your Customers
AI For Profiling Your Customers
 
AI for Personalized Chatbot
AI for Personalized ChatbotAI for Personalized Chatbot
AI for Personalized Chatbot
 
Tourist Knowledge Graph Creation to Automating Travel Bookings
Tourist Knowledge Graph Creation to Automating Travel BookingsTourist Knowledge Graph Creation to Automating Travel Bookings
Tourist Knowledge Graph Creation to Automating Travel Bookings
 
The SentiME System at the SSA Challenge Task 1
The SentiME System at the SSA Challenge Task 1The SentiME System at the SSA Challenge Task 1
The SentiME System at the SSA Challenge Task 1
 
Context-Enhanced Adaptive Entity Linking
Context-Enhanced Adaptive Entity LinkingContext-Enhanced Adaptive Entity Linking
Context-Enhanced Adaptive Entity Linking
 
From Data to Knowledge for Tourists
From Data to Knowledge for TouristsFrom Data to Knowledge for Tourists
From Data to Knowledge for Tourists
 
Enabling Visitors to Explore a Smart City
Enabling Visitors to Explore a Smart CityEnabling Visitors to Explore a Smart City
Enabling Visitors to Explore a Smart City
 
NEEL2015 challenge summary
NEEL2015 challenge summaryNEEL2015 challenge summary
NEEL2015 challenge summary
 
Inductive Entity Typing Alignment
Inductive Entity Typing AlignmentInductive Entity Typing Alignment
Inductive Entity Typing Alignment
 
Benchmarking the Extraction and Disambiguation of Named Entities on the Seman...
Benchmarking the Extraction and Disambiguation of Named Entities on the Seman...Benchmarking the Extraction and Disambiguation of Named Entities on the Seman...
Benchmarking the Extraction and Disambiguation of Named Entities on the Seman...
 
CrossLanguageSpotter: A Library for Detecting Relations in Polyglot Frameworks
CrossLanguageSpotter: A Library for Detecting Relations in Polyglot FrameworksCrossLanguageSpotter: A Library for Detecting Relations in Polyglot Frameworks
CrossLanguageSpotter: A Library for Detecting Relations in Polyglot Frameworks
 
Learning with the Web. Structuring data to ease machine understanding
Learning with the Web. Structuring data to ease  machine understandingLearning with the Web. Structuring data to ease  machine understanding
Learning with the Web. Structuring data to ease machine understanding
 
Learning with the Web: Spotting Named Entities on the intersection of NERD an...
Learning with the Web: Spotting Named Entities on the intersection of NERD an...Learning with the Web: Spotting Named Entities on the intersection of NERD an...
Learning with the Web: Spotting Named Entities on the intersection of NERD an...
 
L'enorme archivio di dati: il Web
L'enorme archivio di dati: il WebL'enorme archivio di dati: il Web
L'enorme archivio di dati: il Web
 
NERD: Evaluating Named Entity Recognition Tools in the Web of Data
NERD: Evaluating Named Entity Recognition Tools in the Web of DataNERD: Evaluating Named Entity Recognition Tools in the Web of Data
NERD: Evaluating Named Entity Recognition Tools in the Web of Data
 
Zenaminer: driving the SCORM tandard towards the Web of Data
Zenaminer: driving the SCORM tandard towards the Web of DataZenaminer: driving the SCORM tandard towards the Web of Data
Zenaminer: driving the SCORM tandard towards the Web of Data
 

Recently uploaded

Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
FIDO Alliance
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
FIDO Alliance
 
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
Muhammad Subhan
 

Recently uploaded (20)

Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
Intro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxIntro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptx
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
 
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
 
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdfFrisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdf
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoft
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
Using IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandUsing IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & Ireland
 

NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud

  • 1. s NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud Giuseppe Rizzo and Raphaël Troncy EURECOM, France Sebastian Hellmann and Martin Bruemmer Universität Leipzig, Germany
  • 2. What is a Named Entity recognition task? A task that aims to locate and classify the name of a person or an organization, a location, a brand, a product, a numeric expression including time, date, money and percent in a textual document 16/04/2012 5th Workshop on Linked Data on the Web (LDOW2012) 2/15
  • 3. NER tools  Standalone software GATE Stanford CoreNLP Temis  Web APIs 16/04/2012 5th Workshop on Linked Data on the Web (LDOW2012) 3/15
  • 4. Factual comparison of 10 Web NER tools
       Tool               Language                 Granularity  Entity position  Classification schema          Classes  Response format              Quota (calls/day)
       AlchemyAPI         EN,FR,GR,IT,PT,RU,SP,SW  OEN          N/A              Alchemy                        324      JSON, MicroFormat, XML, RDF  30000
       DBpedia Spotlight  EN,GR*,PT*,SP*           OEN          char offset      DBpedia, FreeBase, Schema.org  320      HTML, JSON, RDF, XML         unlimited
       Evri               EN,IT                    OED          N/A              Evri                           5        HTML, JSON, RDF              3000
       Extractiv          EN                       OEN          word offset      DBpedia                        34       HTML, JSON, RDF, XML         3000
       Lupedia            EN,FR,IT                 OEN          range of chars   DBpedia, LinkedMDB             319      HTML, JSON, RDFa, XML        unlimited
       OpenCalais         EN,FR,SP                 OEN          char offset      OpenCalais                     95       JSON, MicroFormat            50000
       Saplo              EN,SW                    OED          N/A              N/A                            5        JSON                         1333
       Wikimeta           EN,FR,SP                 OEN          POS offset       ESTER                          7        JSON, XML                    unlimited
       Yahoo!             EN                       OEN          range of chars   Yahoo                          13       JSON, XML                    5000
       Zemanta            EN                       OED          N/A              FreeBase                       81       XML, JSON, RDF               10000
  • 5. What is NERD? An ontology [1], a REST API [2] and a UI [3]. The NERD ontology has been integrated into the NIF project, part of the EU FP7 project LOD2: Creating Knowledge out of Interlinked Data.
       [1] http://nerd.eurecom.fr/ontology
       [2] http://nerd.eurecom.fr/api/application.wadl
       [3] http://nerd.eurecom.fr
  • 6. NERD Ontology: aligns the taxonomies used by the extractors
  • 7. Building the NERD Ontology
       NERD type      Occurrence
       Person         10
       Organization   10
       Country        6
       Company        6
       Location       6
       Continent      5
       City           5
       RadioStation   5
       Album          5
       Product        5
       ...            ...
  • 8. Ontology alignment validation: 5 TED talks, 1000 NYT news articles, 217 WWW2011 abstracts
  • 9. Integration: the NLP tools (standalone and Web APIs) produce different outputs.
       OpenCalais:
         "_type": "Organization",
         "name": "North Atlantic Treaty Organization",
         "organizationtype": "governmental civilian",
         "nationality": "N/A",
         "_typeReference": "http://s.opencalais.com/1/type/em/e/Organization",
         ...
       DBpedia Spotlight:
         "@URI": "http://dbpedia.org/resource/DBpedia",
         "@types": "DBpedia:Software,DBpedia:Work",
         "@surfaceForm": "dbpedia",
         "@offset": "0",
         "@support": "11",
         "@similarityScore": "0.2387271374464035",
         ...
       For integration or reuse, manual effort is needed: it is time consuming and definitions are difficult to track.
       NERD creates a sharable JSON/RDF annotation output.
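The manual effort mentioned on slide 9 can be illustrated with a minimal Python sketch that maps the two snippets onto one shared record. The helper names (`from_opencalais`, `from_spotlight`) and the record fields (`entity`, `types`, `uri`, `startChar`, `extractor`) are assumptions chosen for illustration; they are not NERD's actual implementation.

```python
from typing import Any, Dict

# Raw results copied from the slide; fields elided there ("...") are left out.
OPENCALAIS_RESULT = {
    "_type": "Organization",
    "name": "North Atlantic Treaty Organization",
    "organizationtype": "governmental civilian",
    "nationality": "N/A",
    "_typeReference": "http://s.opencalais.com/1/type/em/e/Organization",
}

SPOTLIGHT_RESULT = {
    "@URI": "http://dbpedia.org/resource/DBpedia",
    "@types": "DBpedia:Software,DBpedia:Work",
    "@surfaceForm": "dbpedia",
    "@offset": "0",
    "@support": "11",
    "@similarityScore": "0.2387271374464035",
}


def from_opencalais(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical adapter: OpenCalais-style result -> shared record."""
    return {
        "entity": raw["name"],
        "types": [raw["_type"]],
        "uri": None,        # the snippet on the slide shows no resource URI
        "startChar": None,  # the snippet on the slide shows no offset
        "extractor": "opencalais",
    }


def from_spotlight(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical adapter: DBpedia Spotlight-style result -> shared record."""
    return {
        "entity": raw["@surfaceForm"],
        # Spotlight packs several types into one comma-separated string.
        "types": [t for t in raw["@types"].split(",") if t],
        "uri": raw["@URI"],
        # Offsets arrive as strings and need converting.
        "startChar": int(raw["@offset"]),
        "extractor": "dbpedia-spotlight",
    }


if __name__ == "__main__":
    for record in (from_opencalais(OPENCALAIS_RESULT), from_spotlight(SPOTLIGHT_RESULT)):
        print(record)
```

Each extractor needs its own adapter of this kind (different key names, string-encoded numbers, packed type lists), which is the per-tool effort a single shared output format removes.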
  • 10. NERD REST API
       Resources: /document/{idDocument}, /user/{idUser}, /annotation/{extractor}, /extraction/{idExtraction}, /evaluation, ...
       Methods: GET, POST, PUT, DELETE
       Output formats: JSON, RDF
       JSON example:
         "entities" : [{
           "entity": "W3C",
           "type": "Organization",
           "uri": "http://dbpedia.org/page/W3C",
           "nerdType": "http://nerd.eurecom.fr/ontology#Organization",
           "startChar": 30,
           "endChar": 32,
           "confidence": 1,
           "relevance": 0.5
         }]
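As a rough sketch of how a client might consume the JSON output shown on slide 10, the Python below parses a payload and groups entity mentions by NERD type. The slide shows only the "entities" member, so wrapping it in a top-level JSON object is an assumption, and `entities_by_nerd_type` is a hypothetical helper rather than part of the NERD API.

```python
import json
from typing import Dict, List

# The "entities" array is copied from the slide; the enclosing {...} object
# is an assumption, since the slide shows only this member.
RESPONSE = """
{"entities": [{
    "entity": "W3C",
    "type": "Organization",
    "uri": "http://dbpedia.org/page/W3C",
    "nerdType": "http://nerd.eurecom.fr/ontology#Organization",
    "startChar": 30,
    "endChar": 32,
    "confidence": 1,
    "relevance": 0.5
}]}
"""


def entities_by_nerd_type(payload: str) -> Dict[str, List[str]]:
    """Group entity surface forms by the local name of their nerdType URI."""
    grouped: Dict[str, List[str]] = {}
    for item in json.loads(payload)["entities"]:
        # Keep only the fragment after '#', e.g. "Organization".
        label = item["nerdType"].rsplit("#", 1)[-1]
        grouped.setdefault(label, []).append(item["entity"])
    return grouped


if __name__ == "__main__":
    print(entities_by_nerd_type(RESPONSE))
```

Because every extractor's result carries the same `nerdType` field, grouping or filtering by type needs no extractor-specific logic.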
  • 11. Textual annotation. Let's consider the URI http://www.w3.org/DesignIssues/LinkedData.html, whose text reads: "The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data. ... All the above plus, Use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff ..." The resulting annotation: entities: { … [entity: W3C, startChar: 23107, endChar: 23110], … }
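The offsets in the annotation on slide 11 differ by 3, the length of "W3C", which is consistent with a start-inclusive, end-exclusive character range; that reading is an assumption, as the slide does not state the convention. A minimal Python sketch under that assumption, with hypothetical helpers `annotate` and `surface_form` and a one-sentence excerpt standing in for the full page:

```python
from typing import Dict

# Short excerpt of the page text; the real document is far longer, so the
# offsets computed here differ from the 23107/23110 pair on the slide.
SENTENCE = "Use open standards from W3C (RDF and SPARQL) to identify things"


def annotate(text: str, surface: str) -> Dict[str, object]:
    """Build an annotation for the first occurrence of `surface` in `text`."""
    start = text.index(surface)  # raises ValueError if the mention is absent
    return {
        "entity": surface,
        "startChar": start,
        "endChar": start + len(surface),  # end-exclusive (assumed convention)
    }


def surface_form(text: str, annotation: Dict[str, object]) -> str:
    """Recover the mention from the text using the stored offsets."""
    return text[annotation["startChar"]:annotation["endChar"]]


if __name__ == "__main__":
    ann = annotate(SENTENCE, "W3C")
    print(ann, "->", surface_form(SENTENCE, ann))
```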
  • 12. NERD meets NIF
       Model documents through a set of strings dereferenceable within the Web:
         :offset_23107_23110 a str:String ;
             str:referenceContext :offset_0_26546 .
       Map string to entity:
         :offset_23107_23110 sso:oen dbpedia:W3C .
       Classification:
         dbpedia:W3C rdf:type nerd:Organization .
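The three statements on slide 12 follow a regular pattern, so they can be generated from an offset-based annotation. The Python sketch below uses plain string formatting; `nif_statements` is a hypothetical helper, prefix declarations are omitted because the slide does not show them, and reading 26546 as the end offset of the whole document is an assumption based on the `:offset_0_26546` context.

```python
from typing import List


def nif_statements(start: int, end: int, doc_end: int,
                   entity: str, nerd_type: str) -> List[str]:
    """Return Turtle lines mirroring the three steps on the slide."""
    string_uri = f":offset_{start}_{end}"
    context_uri = f":offset_0_{doc_end}"  # assumed to span the whole document
    return [
        # 1. Model the mention as a string within its document context.
        f"{string_uri} a str:String ;",
        f"    str:referenceContext {context_uri} .",
        # 2. Map the string to the entity it denotes.
        f"{string_uri} sso:oen {entity} .",
        # 3. Classify the entity with a NERD type.
        f"{entity} rdf:type {nerd_type} .",
    ]


if __name__ == "__main__":
    for line in nif_statements(23107, 23110, 26546,
                               "dbpedia:W3C", "nerd:Organization"):
        print(line)
```

Run with the slide's values, this prints the same statements as slide 12; a production pipeline would more likely build the graph with an RDF library than with string templates.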
  • 13. NERD User Interface
  • 14. Conclusions and perspectives
       NERD UI and REST API: unified interface for extracting NEs from various types of text
       NERD ontology: common schema for entity classification
       NERD & NIF: lift the extraction annotation results to the LOD cloud
       Systematic comparison for the NE extraction and classification tasks: ETAPE corpus, CoNLL 2003 corpus
       Combining several extractions to improve on the strengths of a single tool
  • 15. Thanks for your time and your attention. http://nerd.eurecom.fr @giusepperizzo @rtroncy #nerd http://www.slideshare.net/giusepperizzo