NIF Tutorial – 2013/09/24 – Page 1 http://lod2.eu
Creating Knowledge out of Interlinked Data
AKSW, Universität Leipzig
Sebastian Hellmann
Content Analysis
and the Semantic Web
NIF 2.0 Tutorial
http://nlp2rdf.org
http://lod2.eu
http://slideshare.net/kurzum
Sebastian Hellmann – researcher working on LOD2 EU Project
AKSW – Agile Knowledge Engineering and Semantic Web research group in Leipzig - http://aksw.org
InfAI – Institute for Applied Informatics - http://infai.org
ALL DEMOS ARE AVAILABLE AT:
http://nlp2rdf.org/leipzig-24-9-2013
Introduction
End users have tasks for NLP, but:
Each new tool is a challenge:
• How to download and start it?
• What kind of annotations does it use?
• How well does it perform (on my domain)?
• If badly, are there any alternatives? How can I find them?
• Open source?
• Lots of know-how needed to exploit NLP.
• Lots of data needed to exploit NLP.
Barriers to NLP
The Semantic Gap
• Part 1: exploiting free, open and interoperable (FOI)
language resources
• Part 2: Connecting text to these resources
• Part 3: tools, demos, infrastructure
From a walled garden to
an interoperable infrastructure
• Part 1: exploiting free, open and interoperable (FOI)
language resources
From a walled garden to
an interoperable infrastructure
http://lod-cloud.net
Linguistic/NLP Data currently filed
under “cross-domain”
http://lod-cloud.net
Linked Open Data
- All datasets provide open access to individual records via HTTP
- Many are free (no payment required, as in royalty-free)
- Some are openly licensed, e.g. CC-0 or CC-BY-SA
=> Open access also applies to published HTML on the WWW, but in LOD the data
itself is published unrendered via RDF
Question:
• Who knows how to add a new bubble to the LOD cloud?
http://datahub.io/group/linguistics
https://github.com/jmccrae/llod-cloud.py
http://validator.lod-cloud.net/validate.php
From a walled garden to
an interoperable infrastructure
Question:
• What are the most important data sets and ontologies for NLP?
• Who has used what?
FOI data
Analysis of mentions of Wikipedia / DBpedia at LREC 2012:
• https://www.google.com/webhp?q=site:http%3A%2F%2Fwww.lrec-conf.org%2
→ 163 papers
• https://www.google.com/webhp?q=site:http%3A%2F%2Fwww.lrec-conf.org%2
→ 24 papers
FOI data 1: Wikipedia / DBpedia
• Training data for NLP, e.g. URI, surrounding text, surface form
• Probabilities:
• P(sf|URI): probability that wikipedia:Apple_Inc. appears as “apple” in text
• P(URI|sf): probability that “apple” in text refers to wikipedia:Apple_Inc.
FOI data 1: Wikipedia / DBpedia
http://wiki.dbpedia.org/Datasets/NLP
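A minimal sketch of how the two probabilities could be estimated from link data, assuming the training data is available as (surface form, URI) pairs. Here P(sf|URI) is the share of all links to a URI that use a given surface form, and P(URI|sf) is the share of all occurrences of a surface form that link to a given URI. The function name and the toy mentions are invented for illustration and are not real Wikipedia counts.

```python
from collections import Counter


def estimate_probabilities(mentions):
    """Estimate P(sf|URI) and P(URI|sf) from a list of (surface form, URI) pairs."""
    pair_counts = Counter(mentions)
    sf_counts = Counter(sf for sf, _ in mentions)
    uri_counts = Counter(uri for _, uri in mentions)
    p_sf_given_uri = {
        (sf, uri): n / uri_counts[uri] for (sf, uri), n in pair_counts.items()
    }
    p_uri_given_sf = {
        (sf, uri): n / sf_counts[sf] for (sf, uri), n in pair_counts.items()
    }
    return p_sf_given_uri, p_uri_given_sf


# Toy data, made up for this example.
mentions = [
    ("apple", "wikipedia:Apple_Inc."),
    ("apple", "wikipedia:Apple_Inc."),
    ("apple", "wikipedia:Apple"),
    ("Apple Inc.", "wikipedia:Apple_Inc."),
    ("Apple Inc.", "wikipedia:Apple_Inc."),
]
p_sf_given_uri, p_uri_given_sf = estimate_probabilities(mentions)
key = ("apple", "wikipedia:Apple_Inc.")
print("P(sf|URI) =", p_sf_given_uri[key])
print("P(URI|sf) =", p_uri_given_sf[key])
```

Both tables are keyed by the same (surface form, URI) pair, so the only difference between the two estimates is the denominator.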
FOI data: Wikipedia / DBpedia
http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=sodium
Available data for “Sodium”
http://dbpedia.org/snorql
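# Labels of the DBpedia resource for sodium
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?labels WHERE {
  <http://dbpedia.org/resource/Sodium> rdfs:label ?labels .
} LIMIT 100
# Alternative labels: labels of the pages that redirect to Sodium
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?altlabel WHERE {
  ?redirect dbpedia-owl:wikiPageRedirects <http://dbpedia.org/resource/Sodium> .
  ?redirect rdfs:label ?altlabel .
} LIMIT 100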
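The label query can also be sent from a script instead of the SNORQL form. The sketch below only builds the request URL, using the standard `query` parameter of the SPARQL protocol. The endpoint address `http://dbpedia.org/sparql` and the helper names are assumptions for illustration; actually fetching the results (for example with `urllib.request.urlopen`) is left out so the example runs offline.

```python
from urllib.parse import urlencode

# Assumed endpoint; the slides only mention the SNORQL web form.
ENDPOINT = "http://dbpedia.org/sparql"


def label_query(resource_uri, limit=100):
    """Build the rdfs:label query from the slide for an arbitrary resource."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        f"SELECT ?labels WHERE {{ <{resource_uri}> rdfs:label ?labels . }} "
        f"LIMIT {limit}"
    )


def request_url(query):
    """Encode the query as a GET request URL (SPARQL protocol 'query' parameter)."""
    return ENDPOINT + "?" + urlencode({"query": query})


query = label_query("http://dbpedia.org/resource/Sodium")
print(request_url(query))
```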
http://lcl.uniroma1.it/babelnet/explore.jsp?word=sodium&lang=EN
Wiktionary2RDF – Mediator Wrapper
http://dbpedia.org/Wiktionary
Mediator
Lemon
Wiktionary2RDF – Mediator Wrapper
http://lcl.uniroma1.it/babelnet/explore.jsp?word=sodium&lang=EN
https://en.wiktionary.org/wiki/sodium#English
http://wiktionary.dbpedia.org/resource/sodium
Lemon Ontology - http://lemon-model.net
IntersectiveDataPropertyAdjective ("extinct" ,
dbpedia:conservationStatus ,"EX")
IntersectiveDataPropertyAdjective ("endangered" ,
dbpedia:conservationStatus ,"EN")
https://github.com/cunger/lemon.dbpedia
Christina Unger, John McCrae, Sebastian Walter, Sara Winter and Philipp Cimiano (2013):
A lemon lexicon for DBpedia. NLP & DBpedia Workshop
• Part 2: Connecting text to these resources
From a walled garden to
an interoperable infrastructure
From a walled garden to
an interoperable infrastructure
https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki
From a walled garden to
an interoperable infrastructure
Overview of existing tools:
• http://en.wikipedia.org/wiki/Knowledge_extraction#Tools
From a walled garden to
an interoperable infrastructure
Developer's nightmare:
• All tools belong to a similar class of NLP tools
→ Wikifier or Named Entity Linking, SOA principle
But they all have:
• Heterogeneous output formats (JSON, XML)
• Heterogeneous API parameters
• Heterogeneous ways of annotating text:
• Some remove HTML internally, offsets not usable
• Some use byte offset instead of char offset
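The byte versus character offset problem is easy to reproduce. A small sketch, assuming UTF-8 as the byte encoding; the sentence and the helper names are made up for illustration.

```python
def char_offset(text, mention):
    """Offset of the mention counted in characters (Unicode code points)."""
    return text.index(mention)


def byte_offset(text, mention, encoding="utf-8"):
    """Offset of the mention counted in bytes of the encoded text."""
    return text.encode(encoding).index(mention.encode(encoding))


text = "Universität Leipzig hosts AKSW"
mention = "Leipzig"

c = char_offset(text, mention)
b = byte_offset(text, mention)
print(c, b)

# Using the byte offset as if it were a character offset cuts the wrong span.
print(repr(text[c:c + len(mention)]))
print(repr(text[b:b + len(mention)]))
```

The two offsets agree only while the text before the mention is pure ASCII; a single "ä" is enough to shift every later byte offset by one.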
From a walled garden to
an interoperable infrastructure
Demo
• http://rdface.aksw.org/new/tinymce/examples/rdface.html
ITS 2.0 - http://www.w3.org/TR/its20/
The Internationalization Tag Set (ITS) 2.0 – enhances the foundation to
integrate automated processing of human language into core Web
technologies.
• Currently in Last Call
• Driven by localization industry
• Embed translation aids into HTML and XML
• Robust way to encode NLP information in HTML
• ITS 2.0 describes 20 data categories → ontology
NIF overview
Summary
• Motivated the Walled Garden problem
• Overview of the emerging Web of Language resources
• Motivated the NLP tool heterogeneity problem
• Introduced the ITS 2.0 use case for NIF
• Now: NIF 2.0
The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to
achieve interoperability between Natural Language Processing (NLP) tools,
language resources and annotations.
• Reuse of existing standards such as RDF, OWL 2, the PROV Ontology, LAF
(ISO 24612), Unicode and RFC 5147
• Standardize access parameters, annotations (e.g. tokenization), validation
and log messages.
• A NIF workflow, however, can obviously not provide any better performance
(F-measure, speed) than a properly configured UIMA or GATE pipeline with
the same components.
• Lower entry barrier, easy data integration, reusability of tools and
conceptualisation, off-the-shelf solutions for common tasks.
NIF Overview
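The reuse of RFC 5147 can be illustrated with a small sketch. RFC 5147 defines `char=begin,end` fragment identifiers for plain text, and NIF 2.0 string URIs append such a fragment to a document URI. The helper names and the `example.org` document URI below are hypothetical.

```python
import re

# Matches an RFC 5147 style character-range fragment at the end of a URI.
_CHAR_FRAGMENT = re.compile(r"#char=(\d+),(\d+)$")


def string_uri(document_uri, begin, end):
    """Build a URI for the substring [begin, end) of a document."""
    if not 0 <= begin <= end:
        raise ValueError("offsets must satisfy 0 <= begin <= end")
    return f"{document_uri}#char={begin},{end}"


def parse_offsets(uri):
    """Recover (begin, end) from a URI produced by string_uri."""
    match = _CHAR_FRAGMENT.search(uri)
    if match is None:
        raise ValueError(f"no char fragment in {uri!r}")
    return int(match.group(1)), int(match.group(2))


text = "NIF counts in code points"
uri = string_uri("http://example.org/doc.txt", 0, 3)  # hypothetical document URI
begin, end = parse_offsets(uri)
print(uri, "->", repr(text[begin:end]))
```

Because the offsets live in the URI itself, any tool that knows the document text can resolve the annotated string without a tool-specific output format.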
Relation of NIF to UIMA and GATE
• A Formal Framework for Linguistic Annotation (2000) by Steven Bird, Mark
Liberman
• take home message: generic annotation formats should be based on
graphs
• Ontologies in NIF (e.g. OLiA, lemon) can be hard-compiled for internal use (as
is done in Stanbol)
WP3 Task 3.2 – Community work: NLP2RDF
Not primarily aimed at
increasing features or
performance (F-Measure)
WP3 Task 3.2 – NIF overview
• NIF turns out to have a unique selling proposition regarding NLP and RDF
• NIF will be the recommended RDF conversion of the Internationalization
Tag Set 2.0 of the W3C (ITS 2.0) - http://www.w3.org/TR/its20/
• There was no alternative RDF vocabulary for this conversion available.
NIF Overview
WP3 Task 3.2 – Community work: NLP2RDF
RDFa parsers lose all provenance information:
<http://examples.com/books/wikinomics> dc:title "Wikinomics" .
https://en.wikipedia.org/wiki/RDFa
Available resources:
http://persistence.uni-leipzig.org/nlp2rdf/
Disclaimer
Migration to the online presence is still on-going, but there are 15 scientific
publications, e.g.
Integrating NLP using Linked Data. Sebastian Hellmann, Jens Lehmann, Sören Auer, and Martin Brümmer. 12th
International Semantic Web Conference, 21-25 October 2013, Sydney, Australia, (2013) -
http://svn.aksw.org/papers/2013/ISWC_NIF/public.pdf
NIF Overview
Question:
• What is a String?
NIF Basics
Counting strings is more difficult than it seems:
• Three ways to count Unicode:
• Code Units
• Code Points
• Graphemes
• Encoding:
• UTF-8, 16, 32
NIF Basics Unicode
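The three ways of counting can give three different answers for the same string. The sketch below counts code units per encoding and code points exactly; the grapheme count is only a rough approximation that skips combining marks, not an implementation of the full Unicode segmentation rules. The function name and sample string are chosen for illustration.

```python
import unicodedata


def count_units(text):
    """Count a string in code units, code points and (approximate) graphemes."""
    return {
        "utf8_code_units": len(text.encode("utf-8")),
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "utf32_code_units": len(text.encode("utf-32-le")) // 4,
        # Python strings are sequences of code points.
        "code_points": len(text),
        # Approximation: every code point that is not a combining mark
        # starts a new grapheme.
        "approx_graphemes": sum(
            1 for ch in text if unicodedata.combining(ch) == 0
        ),
    }


# "e" + combining acute accent, followed by an emoji outside the BMP.
sample = "e\u0301\U0001F600"
for name, value in count_units(sample).items():
    print(name, value)
```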
• Code Unit. The minimal bit combination that can represent a unit of encoded
text for processing or interchange. The Unicode Standard uses 8-bit code
units in the UTF-8 encoding form, 16-bit code units in the UTF-16 encoding
form, and 32-bit code units in the UTF-32 encoding form.
• Code Point. (1) Any value in the Unicode codespace; that is, the range of
integers from 0 to 10FFFF (hexadecimal). Not all code points are assigned to encoded
characters. See code point type. (2) A value, or position, for a character, in
any coded character set.
• Unicode Normal Form C
• http://unicode.org/reports/tr15/#Norm_Forms
Unicode
• Recommendation for RDF Literals
• http://unicode.org/reports/tr15/#Norm_Forms
Unicode Normal Form C
• NIF uses Unicode Normal Form C
• NIF counts in Code Points
Unicode
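A short sketch of what these two rules imply for offsets: normalize to NFC first, then count code points. The helper name `nif_length` is made up for this example; `unicodedata.normalize` is the Python standard library function for Unicode normalization.

```python
import unicodedata


def nif_length(text):
    """Length in code points after normalization to Unicode Normal Form C."""
    return len(unicodedata.normalize("NFC", text))


# The same word, once with "a" + combining diaeresis, once precomposed.
decomposed = "Universita\u0308t"
composed = "Universit\u00e4t"

print(len(decomposed), len(composed))                # raw code point counts differ
print(nif_length(decomposed), nif_length(composed))  # equal after NFC
```

Without the normalization step, two tools could disagree on every offset after the first decomposed character even though both count code points.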
• Sadly, there are still implementation problems:
• Java length() (counts UTF-16 code units) vs. PHP strlen() (counts bytes)
• curl --data-urlencode i=" 대 " -d f=text "http://nlp2rdf.lod2.eu/nif-ws.php"
• The Korean character is URL-encoded (%EB%8C%80) and counted as 3
characters (not NFC in PHP)
Demo
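The numbers in this demo can be checked offline. The sketch below does not call the web service; it only reproduces the URL encoding of the Korean character and compares the three counts a service might report, depending on what its string length function measures.

```python
from urllib.parse import quote

char = "대"

encoded = quote(char)                             # percent-encoded UTF-8 bytes
byte_count = len(char.encode("utf-8"))            # what a byte-counting strlen sees
utf16_units = len(char.encode("utf-16-le")) // 2  # what a UTF-16 based length sees
code_points = len(char)                           # what NIF expects

print(encoded, byte_count, utf16_units, code_points)
```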
• Now some RDF (finally):
• Note that in NIF the document != the content of the document.
• two different documents can have the same content
=> must not have the same URI
Context
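A sketch of what this could look like in Turtle, generated from Python. It assumes the NIF 2.0 core vocabulary terms `nif:Context`, `nif:isString`, `nif:beginIndex` and `nif:endIndex` under the namespace shown below; the document URIs are hypothetical, and the literal escaping is simplified (only backslashes and double quotes are handled).

```python
PREFIXES = (
    "@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)


def context_turtle(document_uri, text):
    """Describe the whole text of one document as a context resource."""
    end = len(text)  # counted in code points
    literal = text.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"<{document_uri}#char=0,{end}>\n"
        f"    a nif:Context ;\n"
        f'    nif:isString "{literal}" ;\n'
        f'    nif:beginIndex "0"^^xsd:nonNegativeInteger ;\n'
        f'    nif:endIndex "{end}"^^xsd:nonNegativeInteger .\n'
    )


# Two different documents with identical content get two different URIs.
doc_a = context_turtle("http://example.org/a.txt", "Hello world")
doc_b = context_turtle("http://example.org/b.txt", "Hello world")
print(PREFIXES)
print(doc_a)
print(doc_b)
```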
Annotations
Tokenization
Christian Chiarcos, Julia Ritz, Manfred Stede: By all these lovely tokens... Merging conflicting tokenizations.
Language Resources and Evaluation 46(1): 53-74 (2012)
NIF
Demo:
http://nlp2rdf.lod2.eu/demo.php
• SPARQL queries produce (find) errors
• http://persistence.uni-leipzig.org/nlp2rdf/ontologies/testcase/lib/nif-2.0-suite.t
• RLOG – An RDF Logging Ontology
• ./validate.jar -i nif-erroneous-model.ttl -t file
• Demo → character count
• Demo → all errors
Validation over specification
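The idea of a test that finds errors, such as the character count check, can be sketched in plain Python. This is only an analogue of the kind of rule a SPARQL test case might encode; it is not the actual test suite, and the dictionary keys simply mirror the NIF property names `beginIndex`, `endIndex` and `anchorOf` as an assumption for illustration.

```python
def find_offset_errors(context_text, annotations):
    """Return one message per annotation whose offsets do not match its anchor."""
    errors = []
    for ann in annotations:
        begin, end, anchor = ann["beginIndex"], ann["endIndex"], ann["anchorOf"]
        if not 0 <= begin <= end <= len(context_text):
            errors.append(f"{begin},{end}: offsets outside the context")
        elif context_text[begin:end] != anchor:
            found = context_text[begin:end]
            errors.append(f"{begin},{end}: expected {anchor!r}, found {found!r}")
    return errors


context = "Sodium is a chemical element."
annotations = [
    {"beginIndex": 0, "endIndex": 6, "anchorOf": "Sodium"},
    {"beginIndex": 12, "endIndex": 20, "anchorOf": "chemical"},
    # Erroneous on purpose: the end offset includes the full stop.
    {"beginIndex": 21, "endIndex": 29, "anchorOf": "element"},
]
for message in find_offset_errors(context, annotations):
    print(message)
```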
• http://www.w3.org/TR/its20/#conversion-to-nif
• http://www.w3.org/TR/its20/#nif-backconversion
NIF
• Demo
• Load Terminological model or Inference Model
Reasoning
Open Community – All feedback is welcome!
http://slideshare.net/kurzum
Websites:
http://dbpedia.org
http://nlp2rdf.org
http://lod2.eu
Thanks for your attention

Google I/O Extended Harare Merged Slides
Google Developer Group - Harare
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 

NIF 2.0 Tutorial: Content Analysis and the Semantic Web

  • 1. NIF Tutorial – 2013/09/24 – Page 1 http://lod2.eu Creating Knowledge out of Interlinked Data LOD2 Presentation . 02.09.2010 . Page http://lod2.eu AKSW, Universität Leipzig Sebastian Hellmann Content Analysis and the Semantic Web NIF 2.0 Tutorial http://nlp2rdf.org http://lod2.eu http://slideshare.net/kurzum
  • 2. Introduction: Sebastian Hellmann, researcher working on the LOD2 EU project. AKSW (Agile Knowledge and the Semantic Web), research group in Leipzig: http://aksw.org InfAI (Institute for Applied Informatics): http://infai.org All demos are available at: http://nlp2rdf.org/leipzig-24-9-2013
  • 3. Introduction
  • 4. Barriers to NLP: End users have tasks for NLP, but each new tool is a challenge: • How to download and start it? • What kind of annotations does it use? • How well does it perform (on my domain)? • If badly, are there any alternatives? How can I find them? • Is it open source? • Lots of know-how needed to exploit NLP. • Lots of data needed to exploit NLP.
  • 5. The Semantic Gap
  • 7. From a walled garden to an interoperable infrastructure: • Part 1: Exploiting free, open and interoperable (FOI) language resources • Part 2: Connecting text to these resources • Part 3: Tools, demos, infrastructure
  • 8. From a walled garden to an interoperable infrastructure: • Part 1: Exploiting free, open and interoperable (FOI) language resources
  • 9. http://lod-cloud.net Linguistic/NLP data is currently filed under “cross-domain”
  • 10. Linked Open Data (http://lod-cloud.net): - All datasets provide open access to individual records via HTTP - Many are free (no payment required, as in royalty-free) - Some are openly licensed, e.g. CC-0 or CC-BY-SA => Open access also applies to published HTML on the WWW, but in LOD the data itself is published unrendered via RDF
  • 11. From a walled garden to an interoperable infrastructure. Question: • Who knows how to add a new bubble to the LOD cloud?
  • 12. From a walled garden to an interoperable infrastructure. • Who knows how to add a new bubble to the LOD cloud? http://datahub.io/group/linguistics https://github.com/jmccrae/llod-cloud.py http://validator.lod-cloud.net/validate.php
  • 15. FOI data. Question: • What are the most important data sets and ontologies for NLP? • Who has used what?
  • 16. FOI data 1: Wikipedia / DBpedia. Analysis of mentions of Wikipedia / DBpedia at LREC 2012: • https://www.google.com/webhp?q=site:http%3A%2F%2Fwww.lrec-conf.org%2 → 163 papers • https://www.google.com/webhp?q=site:http%3A%2F%2Fwww.lrec-conf.org%2 → 24 papers
  • 17. FOI data 1: Wikipedia / DBpedia (http://wiki.dbpedia.org/Datasets/NLP) • Training data for NLP, e.g. URI, surrounding text, surface form (sf) • Probabilities: • P(URI|sf): probability that “apple” in text refers to wikipedia:Apple_Inc. • P(sf|URI): probability that wikipedia:Apple_Inc. is written as “apple” in text
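The two probabilities on this slide can be estimated from link counts. The following is a minimal sketch, assuming a tiny hand-made list of (surface form, URI) pairs; the data and the function name `conditional_probabilities` are hypothetical and are not taken from the DBpedia NLP datasets.

```python
from collections import Counter

# Hypothetical link observations: (surface form in text, URI it links to).
pairs = [
    ("apple", "wikipedia:Apple_Inc."),
    ("apple", "wikipedia:Apple_Inc."),
    ("apple", "wikipedia:Apple"),
    ("Apple Inc.", "wikipedia:Apple_Inc."),
    ("Apple Inc.", "wikipedia:Apple_Inc."),
]


def conditional_probabilities(observations):
    """Estimate P(URI|sf) and P(sf|URI) by relative frequency."""
    pair_counts = Counter(observations)
    sf_counts = Counter(sf for sf, _ in observations)
    uri_counts = Counter(uri for _, uri in observations)
    p_uri_given_sf = {
        (sf, uri): n / sf_counts[sf] for (sf, uri), n in pair_counts.items()
    }
    p_sf_given_uri = {
        (sf, uri): n / uri_counts[uri] for (sf, uri), n in pair_counts.items()
    }
    return p_uri_given_sf, p_sf_given_uri


p_uri_given_sf, p_sf_given_uri = conditional_probabilities(pairs)
key = ("apple", "wikipedia:Apple_Inc.")
print(p_uri_given_sf[key], p_sf_given_uri[key])
```

With this toy data, two of the three occurrences of "apple" link to the company, while two of the four links to the company use the surface form "apple", so the two values differ.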
  • 18. FOI data: Wikipedia / DBpedia. Available data for “Sodium”: http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=sodium Queries for http://dbpedia.org/snorql: select ?labels where { <http://dbpedia.org/resource/Sodium> rdfs:label ?labels . } LIMIT 100 select ?altlabel where { ?redirect dbpedia-owl:wikiPageRedirects <http://dbpedia.org/resource/Sodium> . ?redirect rdfs:label ?altlabel . } LIMIT 100 http://lcl.uniroma1.it/babelnet/explore.jsp?word=sodium&lang=EN
  • 19. Wiktionary2RDF – Mediator Wrapper http://dbpedia.org/Wiktionary
  • 20. http://dbpedia.org/Wiktionary
  • 21. http://dbpedia.org/Wiktionary
  • 22. Wiktionary2RDF – Mediator Wrapper http://dbpedia.org/Wiktionary Mediator Lemon
  • 23. Wiktionary2RDF – Mediator Wrapper http://lcl.uniroma1.it/babelnet/explore.jsp?word=sodium&lang=EN https://en.wiktionary.org/wiki/sodium#English http://wiktionary.dbpedia.org/resource/sodium
  • 24. Lemon Ontology - http://lemon-model.net
  • 25. Lemon Ontology - http://lemon-model.net IntersectiveDataPropertyAdjective ("extinct", dbpedia:conservationStatus, "EX") IntersectiveDataPropertyAdjective ("endangered", dbpedia:conservationStatus, "EN") https://github.com/cunger/lemon.dbpedia Christina Unger, John McCrae, Sebastian Walter, Sara Winter and Philipp Cimiano (2013): A lemon lexicon for DBpedia. NLP & DBpedia Workshop
  • 26. From a walled garden to an interoperable infrastructure: • Part 2: Connecting text to these resources
  • 27. From a walled garden to an interoperable infrastructure https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki
  • 28. From a walled garden to an interoperable infrastructure. Overview of existing tools: • http://en.wikipedia.org/wiki/Knowledge_extraction#Tools
  • 29. From a walled garden to an interoperable infrastructure. Developer's nightmare: • All tools belong to a similar class of NLP tools → Wikifier or Named Entity Linking, SOA principle. But they all have: • Heterogeneous output formats (JSON, XML) • Heterogeneous API parameters • Heterogeneous ways of annotating text: • Some remove HTML internally, so the offsets are not usable • Some use byte offsets instead of character offsets
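The last point, byte offsets versus character offsets, is easy to reproduce. The sketch below assumes a tool that reports UTF-8 byte offsets; the helper `byte_to_char_offset` is a hypothetical name, not part of any of the tools mentioned.

```python
def byte_to_char_offset(text: str, byte_offset: int, encoding: str = "utf-8") -> int:
    """Convert an offset counted in encoded bytes into one counted in code points.

    Raises UnicodeDecodeError if byte_offset falls inside a multi-byte character.
    """
    prefix = text.encode(encoding)[:byte_offset]
    return len(prefix.decode(encoding))


text = "Universität Leipzig"

# Offset of "Leipzig" as a character-based tool would report it.
char_start = text.index("Leipzig")
# Offset of "Leipzig" as a byte-based tool would report it ("ä" takes two bytes).
byte_start = text.encode("utf-8").index(b"Leipzig")

print(char_start, byte_start, byte_to_char_offset(text, byte_start))
```

The two offsets differ by one because of the single two-byte character before the match; in longer non-ASCII texts the drift accumulates, which is why mixing the two conventions silently misaligns annotations.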
  • 30. From a walled garden to an interoperable infrastructure. Demo: • http://rdface.aksw.org/new/tinymce/examples/rdface.html
  • 31. ITS 2.0 - http://www.w3.org/TR/its20/ The Internationalization Tag Set (ITS) 2.0 enhances the foundation to integrate automated processing of human language into core Web technologies. • Currently in Last Call • Driven by the localization industry • Embeds translation aids into HTML and XML • Robust way to encode NLP information in HTML • ITS 2.0 describes 20 data categories → ontology
  • 32. NIF overview. Summary: • Motivated the Walled Garden problem • Overview of the emerging Web of Language resources • Motivated the NLP tool heterogeneity problem • Introduction of ITS 2.0 Use case for NIF • Now: NIF 2.0
  • 33. NIF Overview. The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations. • Reuse of existing standards such as RDF, OWL 2, the PROV Ontology, LAF (ISO 24612), Unicode and RFC 5147 • Standardized access parameters, annotations (e.g. tokenization), validation and log messages • A NIF workflow, however, cannot provide better performance (F-measure, speed) than a properly configured UIMA or GATE pipeline with the same components • Lower entry barrier, easy data integration, reusability of tools and conceptualisation, off-the-shelf solutions for common tasks
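RFC 5147, listed among the reused standards, defines `#char=begin,end` fragment identifiers for text. A minimal sketch of how a substring could be addressed this way follows; the function `char_fragment_uri`, the document URI and the example sentence are assumptions for illustration, and the exact URI recipe NIF prescribes should be checked against the NIF specification.

```python
def char_fragment_uri(document_uri: str, text: str, substring: str) -> str:
    """Build an RFC 5147 style URI for the first occurrence of substring.

    Offsets are counted in code points, which is what Python 3 str indexing uses.
    """
    begin = text.index(substring)
    end = begin + len(substring)
    return f"{document_uri}#char={begin},{end}"


# Hypothetical document URI and content.
doc_uri = "http://example.org/doc"
content = "My favourite actress is Natalie Portman."

whole_text_uri = char_fragment_uri(doc_uri, content, content)
mention_uri = char_fragment_uri(doc_uri, content, "Natalie Portman")
print(whole_text_uri)
print(mention_uri)
```

Because the fragment is derived only from the document URI and the offsets, two tools that annotate the same text can arrive at the same URI for the same span without coordinating.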
  • 34. WP3 Task 3.2 – Community work: NLP2RDF. Relation of NIF to UIMA and GATE • A Formal Framework for Linguistic Annotation (2000) by Steven Bird, Mark Liberman • Take-home message: generic annotation formats should be based on graphs • Ontologies in NIF (e.g. OLiA, lemon) can be hard-compiled for internal use (as is done in Stanbol) • Not primarily aimed at increasing features or performance (F-measure)
  • 35. WP3 Task 3.2 – NIF overview
  • 36. NIF Overview • NIF turns out to have a unique selling proposition regarding NLP and RDF • NIF will be the recommended RDF conversion of the W3C Internationalization Tag Set 2.0 (ITS 2.0) - http://www.w3.org/TR/its20/ • There was no alternative RDF vocabulary available for this conversion.
  • 37. WP3 Task 3.2 – Community work: NLP2RDF. RDFa parsers lose all provenance information: <http://examples.com/books/wikinomics> dc:title "Wikinomics" . https://en.wikipedia.org/wiki/RDFa
  • 38. NIF Overview. Available resources: http://persistence.uni-leipzig.org/nlp2rdf/ Disclaimer: Migration to the online presence is still ongoing, but there are 15 scientific publications, e.g. Integrating NLP using Linked Data. Sebastian Hellmann, Jens Lehmann, Sören Auer, and Martin Brümmer. 12th International Semantic Web Conference, 21-25 October 2013, Sydney, Australia, (2013) - http://svn.aksw.org/papers/2013/ISWC_NIF/public.pdf
  • 39. NIF Basics. Question: • What is a String?
  • 40. NIF Basics: Unicode. Counting the characters of a string is more difficult than it seems: • Three ways to count Unicode: • Code units • Code points • Graphemes • Encoding: • UTF-8, UTF-16, UTF-32
  • 41. Unicode • Code Unit. The minimal bit combination that can represent a unit of encoded text for processing or interchange. The Unicode Standard uses 8-bit code units in the UTF-8 encoding form, 16-bit code units in the UTF-16 encoding form, and 32-bit code units in the UTF-32 encoding form. • Code Point. (1) Any value in the Unicode codespace; that is, the range of integers from 0 to 10FFFF (hexadecimal). Not all code points are assigned to encoded characters. See code point type. (2) A value, or position, for a character, in any coded character set. • Unicode Normal Form C • http://unicode.org/reports/tr15/#Norm_Forms
  • 42. Unicode Normal Form C • Recommendation for RDF Literals • http://unicode.org/reports/tr15/#Norm_Forms
  • 43. Unicode • NIF uses Unicode Normal Form C • NIF counts in Code Points
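The difference between code units and code points, and the effect of Normal Form C, can be checked with Python's standard library. This is a minimal sketch; the helper `unicode_counts` and the two sample strings are chosen for illustration only. Grapheme counting is left out because the standard library has no function for it.

```python
import unicodedata


def unicode_counts(text: str) -> dict:
    """Count a string in several units after normalizing it to NFC."""
    nfc = unicodedata.normalize("NFC", text)
    return {
        "code_points": len(nfc),  # Python 3 str length is in code points
        "utf8_code_units": len(nfc.encode("utf-8")),
        "utf16_code_units": len(nfc.encode("utf-16-le")) // 2,
        "utf32_code_units": len(nfc.encode("utf-32-le")) // 4,
    }


# "a" followed by a combining diaeresis: two code points before NFC, one after.
decomposed = "a\u0308"
# MUSICAL SYMBOL G CLEF, a code point outside the Basic Multilingual Plane.
clef = "\U0001D11E"

for sample in (decomposed, clef):
    print(unicode_counts(sample))
```

Only the code point count stays at 1 for both samples; the UTF-8 and UTF-16 code unit counts depend on the encoding form, which is the reason a format has to fix one counting unit.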
  • 44. Demo • Sadly, there are still implementation problems: • Java length() vs. PHP strlen() function • curl --data-urlencode i="대" -d f=text "http://nlp2rdf.lod2.eu/nif-ws.php" • The Korean character is URL-encoded (#%EB%8C%80) and counted as 3 characters (not NFC in PHP)
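The count of 3 comes from counting UTF-8 bytes rather than code points: PHP's strlen() returns a byte count, while Java's String.length() returns UTF-16 code units. The following minimal sketch reproduces the numbers for the character used in the curl call; it only illustrates the counting and does not contact the web service.

```python
from urllib.parse import quote

korean = "\ub300"  # the Hangul syllable sent in the curl call above

code_points = len(korean)                 # what NIF expects to be counted
utf8_bytes = len(korean.encode("utf-8"))  # what a byte-based length function reports
url_encoded = quote(korean)               # percent-encoding of the UTF-8 bytes

print(code_points, utf8_bytes, url_encoded)
```

The percent-encoded form has three escaped bytes, which matches the "3 characters" reported on the slide.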
  • 45. Context • Now some RDF (finally): • Note that in NIF the document is not the same as the content of the document. • Two different documents can have the same content => they must not have the same URI
  • 46. Annotations
  • 47. Tokenization. Christian Chiarcos, Julia Ritz, Manfred Stede: By all these lovely tokens... Merging conflicting tokenizations. Language Resources and Evaluation 46(1): 53-74 (2012)
  • 48. NIF Demo: http://nlp2rdf.lod2.eu/demo.php
  • 49. Validation over specification • SPARQL queries produce (find) errors • http://persistence.uni-leipzig.org/nlp2rdf/ontologies/testcase/lib/nif-2.0-suite.t • RLOG – An RDF Logging Ontology • ./validate.jar -i nif-erroneous-model.ttl -t file • Demo → character count • Demo → all errors
  • 50. NIF Demo: http://nlp2rdf.lod2.eu/demo.php
  • 51. NIF
  • 52. NIF • http://www.w3.org/TR/its20/#conversion-to-nif • http://www.w3.org/TR/its20/#nif-backconversion
  • 53. Reasoning • Demo • Load Terminological model or Inference Model
  • 54. Thanks for your attention. Open Community – All feedback is welcome! http://slideshare.net/kurzum Websites: http://dbpedia.org http://nlp2rdf.org http://lod2.eu