SlideShare a Scribd company logo
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Roundtripping of NIF based
Linguistic Linked Data with non
linked data sources
Felix Sasaki
DFKI / W3C Fellow
Slides:
http://de.slideshare.net/atcfsenzoku/sasaki-datathonmadrid2015
1
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
What is NIF?
• Natural Language Processing Interchange
Format
– See http://nlp2rdf.org/
• LLD format to store annotations & to organize
NLP pipelines
• API specification to create NIF workflows
• More details: after the coffee break 
• Following slides: main roles for NIF
2
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Example (Partial; JSON-LD Syntax)
{ "@graph" : [ {
"@id" : "p:char=0,18",
"@type" : [ "nif:Context", "nif:Sentence", "nif:RFC5147String" ],
"anchorOf" : "Welcome to Prague.",
"beginIndex" : "0",
"endIndex" : "18",
"isString" : "Welcome to Prague.",
"referenceContext" : "p:char=0,18”
}, {
"@id" : "p:char=11,17",
"@type" : [ "nif:RFC5147String", "nif:Word" ], …
"referenceContext" : "p:char=0,18",
"taIdentRef" : "http://dbpedia.org/resource/Prague" }, …] }
3
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Example (Partial; JSON-LD Syntax)
{ "@graph" : [ {
"@id" : "p:char=0,18",
"@type" : [ "nif:Context", "nif:Sentence", "nif:RFC5147String" ],
"anchorOf" : "Welcome to Prague.",
"beginIndex" : "0",
"endIndex" : "18",
"isString" : "Welcome to Prague.",
"referenceContext" : "p:char=0,18”
}, {
"@id" : "p:char=11,17",
"@type" : [ "nif:RFC5147String", "nif:Word" ], …
"referenceContext" : "p:char=0,18",
"taIdentRef" : "http://dbpedia.org/resource/Prague" }, …] }
4
• Identifying and typing
annotations
• Identifying annotation
offsets
• Adding additional
knowledge, e.g. named
entity identifier
• Interrelating
annotations
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Example (Partial; JSON-LD Syntax)
{ "@graph" : [ {
"@id" : "p:char=0,18",
"@type" : [ "nif:Context", "nif:Sentence", "nif:RFC5147String" ],
"anchorOf" : "Welcome to Prague.",
"beginIndex" : "0",
"endIndex" : "18",
"isString" : "Welcome to Prague.",
"referenceContext" : "p:char=0,18”
}, {
"@id" : "p:char=11,17",
"@type" : [ "nif:RFC5147String", "nif:Word" ], …
"referenceContext" : "p:char=0,18",
"taIdentRef" : "http://dbpedia.org/resource/Prague" }, …] }
5
• Identifying and typing
annotations
• Identifying annotation
offsets
• Adding additional
knowledge, e.g. named
entity identifier
• Interrelating
annotations
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Example (Partial; JSON-LD Syntax)
{ "@graph" : [ {
"@id" : "p:char=0,18",
"@type" : [ "nif:Context", "nif:Sentence", "nif:RFC5147String" ],
"anchorOf" : "Welcome to Prague.",
"beginIndex" : "0",
"endIndex" : "18",
"isString" : "Welcome to Prague.",
"referenceContext" : "p:char=0,18”
}, {
"@id" : "p:char=11,17",
"@type" : [ "nif:RFC5147String", "nif:Word" ], …
"referenceContext" : "p:char=0,18",
"taIdentRef" : "http://dbpedia.org/resource/Prague" }, …] }
6
• Identifying and typing
annotations
• Identifying annotation
offsets
• Adding additional
knowledge, e.g.
named entity identifier
• Interrelating
annotations
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Example (Partial; JSON-LD Syntax)
{ "@graph" : [ {
"@id" : "p:char=0,18",
"@type" : [ "nif:Context", "nif:Sentence", "nif:RFC5147String" ],
"anchorOf" : "Welcome to Prague.",
"beginIndex" : "0",
"endIndex" : "18",
"isString" : "Welcome to Prague.",
"referenceContext" : "p:char=0,18”
}, {
"@id" : "p:char=11,17",
"@type" : [ "nif:RFC5147String", "nif:Word" ], …
"referenceContext" : "p:char=0,18",
"taIdentRef" : "http://dbpedia.org/resource/Prague" }, …] }
7
• Identifying and typing
annotations
• Identifying annotation
offsets
• Adding additional
knowledge, e.g.
named entity identifier
• Interrelating
annotations
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
A NIF workflow
8
Existing
content
Content analytics, e.g.
named entity
recognition
Conversion to
NIF
Deploying knowledge from the LLD cloud
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Potential scenario: roundtripping
9
Existing
content
Content analytics, e.g.
named entity
recognition
Conversion to
NIF
Storing annotations in original content
Deploying knowledge from the LLD cloud
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Roundtripping
• Roundtripping: Storing the outcome of
content processing (analytics) tasks in the
original content
• Not always needed, but sometimes –
examples:
– Enriching Web content with named entity
information; generating Schema.org markup via
NIF pipelines. Format: HTML
– Enriching localisation content, to add value
beyond translation: Format: XLIFF
10
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Example: HTML
Example roundtripping workflow
11
… <p>Welcome to Prague!</p>…
…<p>Welcome to <span …
itemtype="http://schema.org/Place">Prague</span>!<
/p>…
1) Conversion to NIF 2) NER processing
3) Back conversion to HTML
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Example: XLIFF
Example roundtripping workflow
12
… <xlf:source>Welcome to Prague!</xlf:source> …
… <xlf:source>Welcome to <mrk …
its:taClassRef="http://schema.org/Place">Prague
</mrk>!</xlf:source> …
1) Conversion to NIF 2) NER processing
3) Back conversion to HTML
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Example usage scenario:
FREME project
• See http://www.freme-project.eu/
• Developing interfaces for multilingual and semantic
enrichment of digital content
• Relies on NIF based enrichment workflows
– See FREME API version 0.1
http://api.freme-project.eu/doc/0.1/
• Deploys aspects of the LIDER reference architecture for LLD
processing
– See D3.1.1 at http://lider-project.eu/?q=doc/deliverables
• Focuses on four business cases
– Localization BC requires XLIFF roundtripping
– Web content personalisation BC requires HTML roundtripping
13
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Challenges for roundtripping
• Source format
– How to store enrichment information
(annotations)
– How to handle existing information
• Annotation model
– NIF = a general graph-based annotation model
– Sources format and annotation motivation may
require restriction of the model
14
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
How to store annotations in various
source formats
• Solvable for markup languages like HTML or
XLIFF
• Challenge to preserve existing markup
“<p>Welcome to <b>Prague</b>!</p>”
• General issue with complex and proprietary
formats:
– “My own” storage mechanism = no tool support
– Using existing storage mechanisms may mean:
overloading semantics
15
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Source format example: Word
… <w:t>Welcome to Prague!</w:t> …
16
… <w:commentRangeStart w:id="0"/><w:t>Prague</w:t>
<w:commentRangeEnd w:id="0"/>
<w:r w:rsidR="00987079"> …
<w:p w:rsidRPr="00987079">… Enrichment: type "http://schema.org/Place"…</w:p>
Enrichment process; storing enrichment as comments
Change of original content: creation of anchor
Comment stored separately; refers to anchor: “standoff approach”
Content storage
Comment storage
Content storage (Word file unzipped)
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Annotation models
• NIF: like RDF = general graph model
– Consisting of nodes and arcs
17
p:char=11,17 dbp:Prague
taIdentRef
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Restricting graphs: Tree structured annotations
on several layers
18
• Tree structures
for syntactic
annotations
• Several
annotation layers
for the same text
• Concurrent
hierarchies
• Representation
only of one of
these in
roundtripping
with XML
Example taken from TEI http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Representing overlapping hierarchies
with markup (1/2)
Solutions advertised by the TEI
• Multiple encoding of the same information
– One XML document per annotation
• Boundary marking with empty “milestone”
elements
– Also used by XLIFF
19
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Representing overlapping hierarchies
with markup (2/2)
Solutions advertised by the TEI
• Fragmentation and reconstitution of virtual
elements
– One hierarchy explicit, others with interrelated
marked-up spans
• Stand-off markup
– Separation of text and annotations, interlinked via
anchor and reference
– Cf. Word example
20
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Representing overlapping hierarchies
in RDF
POWLA (cf. Chiarcos, 2012)
• RDF representation for corpus annotation,
based on PAULA XML Standoff format
• Allows to represent hierarchical, multi-layer
corpora in RDF and query in SPARQL
• Not relevant for roundtripping, but for
linguistic annotation representation and
processing in RDF
21
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Lessons learned
• Choose the overlap solution that fits your
roundtripping modelling and processing needs
• Consider off-the-shelf tooling
– For 100% hierarchical data: XPath / CSS selectors, DOM, …
• Consider libraries
– For extraction only: Tika http://tika.apache.org/
– For roundtripping: Okapi http://okapi.opentag.com/ - in
FREME currently being adapted for roundtripping in
selected formats
• Make sure the annotation survives in the original
format – cf. Word example
– Soon to be made easier by using Okapi
22
Sasaki – LLD Datathon – Cercedilla, Spain, May 2015
Roundtripping of NIF based
Linguistic Linked Data with non
linked data sources
Felix Sasaki
DFKI / W3C Fellow
23

More Related Content

What's hot

Getting Started with Knowledge Graphs
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge Graphs
Peter Haase
 
Linked base Registries | The Scottish Government - Webinar 2017
Linked base Registries | The Scottish Government - Webinar 2017Linked base Registries | The Scottish Government - Webinar 2017
Linked base Registries | The Scottish Government - Webinar 2017
Raf Buyle
 
Linked data-tooling-xml
Linked data-tooling-xmlLinked data-tooling-xml
Linked data-tooling-xml
Felix Sasaki
 
Semantic Variation Graphs the case for RDF & SPARQL
Semantic Variation Graphs the case for RDF & SPARQLSemantic Variation Graphs the case for RDF & SPARQL
Semantic Variation Graphs the case for RDF & SPARQL
Jerven Bolleman
 
ESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge GraphsESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge Graphs
Peter Haase
 
Hybrid Enterprise Knowledge Graphs
Hybrid Enterprise Knowledge GraphsHybrid Enterprise Knowledge Graphs
Hybrid Enterprise Knowledge Graphs
Peter Haase
 
Semantic Web Technology
Semantic Web TechnologySemantic Web Technology
Semantic Web Technology
Rathachai Chawuthai
 
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeRDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
National Institute of Informatics
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdf
Heiko Paulheim
 
LD4KD 2015 - Demos and tools
LD4KD 2015 - Demos and toolsLD4KD 2015 - Demos and tools
LD4KD 2015 - Demos and tools
Vrije Universiteit Amsterdam
 
The Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open DataThe Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open Data
Ontotext
 
ROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data StackROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data Stack
Martin Voigt
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
Ontotext
 
Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020
Ontotext
 
Semantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceSemantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceMarin Dimitrov
 
Regal - a Repository for Electronic Documents and Bibliographic Data
Regal - a Repository for Electronic Documents and Bibliographic DataRegal - a Repository for Electronic Documents and Bibliographic Data
Regal - a Repository for Electronic Documents and Bibliographic DataFelix Ostrowski
 
Hacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsHacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge Graphs
ArangoDB Database
 
Linked Data Experiences at Springer Nature
Linked Data Experiences at Springer NatureLinked Data Experiences at Springer Nature
Linked Data Experiences at Springer Nature
Michele Pasin
 
Linked Data at the National Széchényi Library : road to the publication
Linked Data at the National Széchényi Library : road to the publicationLinked Data at the National Széchényi Library : road to the publication
Linked Data at the National Széchényi Library : road to the publication
horvadam
 
Semantics 2017 - Trying Not to Die Benchmarking using LITMUS
Semantics 2017 - Trying Not to Die Benchmarking using LITMUSSemantics 2017 - Trying Not to Die Benchmarking using LITMUS
Semantics 2017 - Trying Not to Die Benchmarking using LITMUS
Harsh Thakkar
 

What's hot (20)

Getting Started with Knowledge Graphs
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge Graphs
 
Linked base Registries | The Scottish Government - Webinar 2017
Linked base Registries | The Scottish Government - Webinar 2017Linked base Registries | The Scottish Government - Webinar 2017
Linked base Registries | The Scottish Government - Webinar 2017
 
Linked data-tooling-xml
Linked data-tooling-xmlLinked data-tooling-xml
Linked data-tooling-xml
 
Semantic Variation Graphs the case for RDF & SPARQL
Semantic Variation Graphs the case for RDF & SPARQLSemantic Variation Graphs the case for RDF & SPARQL
Semantic Variation Graphs the case for RDF & SPARQL
 
ESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge GraphsESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge Graphs
 
Hybrid Enterprise Knowledge Graphs
Hybrid Enterprise Knowledge GraphsHybrid Enterprise Knowledge Graphs
Hybrid Enterprise Knowledge Graphs
 
Semantic Web Technology
Semantic Web TechnologySemantic Web Technology
Semantic Web Technology
 
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeRDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdf
 
LD4KD 2015 - Demos and tools
LD4KD 2015 - Demos and toolsLD4KD 2015 - Demos and tools
LD4KD 2015 - Demos and tools
 
The Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open DataThe Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open Data
 
ROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data StackROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data Stack
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
 
Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020
 
Semantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceSemantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business Intelligence
 
Regal - a Repository for Electronic Documents and Bibliographic Data
Regal - a Repository for Electronic Documents and Bibliographic DataRegal - a Repository for Electronic Documents and Bibliographic Data
Regal - a Repository for Electronic Documents and Bibliographic Data
 
Hacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsHacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge Graphs
 
Linked Data Experiences at Springer Nature
Linked Data Experiences at Springer NatureLinked Data Experiences at Springer Nature
Linked Data Experiences at Springer Nature
 
Linked Data at the National Széchényi Library : road to the publication
Linked Data at the National Széchényi Library : road to the publicationLinked Data at the National Széchényi Library : road to the publication
Linked Data at the National Széchényi Library : road to the publication
 
Semantics 2017 - Trying Not to Die Benchmarking using LITMUS
Semantics 2017 - Trying Not to Die Benchmarking using LITMUSSemantics 2017 - Trying Not to Die Benchmarking using LITMUS
Semantics 2017 - Trying Not to Die Benchmarking using LITMUS
 

Similar to Sasaki datathon-madrid-2015

MuseoTorino, first italian project using a GraphDB, RDFa, Linked Open Data
MuseoTorino, first italian project using a GraphDB, RDFa, Linked Open DataMuseoTorino, first italian project using a GraphDB, RDFa, Linked Open Data
MuseoTorino, first italian project using a GraphDB, RDFa, Linked Open Data
21Style
 
The Nature.com ontologies portal - Linked Science 2015
The Nature.com ontologies portal - Linked Science 2015The Nature.com ontologies portal - Linked Science 2015
The Nature.com ontologies portal - Linked Science 2015
Michele Pasin
 
The nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesThe nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologies
Tony Hammond
 
Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015
Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015
Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015
Sergio Fernández
 
Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.
Enrico Daga
 
Eclipse RDF4J - Working with RDF in Java
Eclipse RDF4J - Working with RDF in JavaEclipse RDF4J - Working with RDF in Java
Eclipse RDF4J - Working with RDF in Java
Jeen Broekstra
 
Sasaki practical-linked-data
Sasaki practical-linked-dataSasaki practical-linked-data
Sasaki practical-linked-data
Felix Sasaki
 
Linked data tooling XML
Linked data tooling XMLLinked data tooling XML
Linked data tooling XML
FREMEProjectH2020
 
Oc wg-nif-20130711
Oc wg-nif-20130711Oc wg-nif-20130711
Oc wg-nif-20130711
STIinnsbruck
 
Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDF
Dimitris Kontokostas
 
Linked Open Data: A simple how-to
Linked Open Data: A simple how-toLinked Open Data: A simple how-to
Linked Open Data: A simple how-to
nvitucci
 
A year on the Semantic Web @ W3C
A year on the Semantic Web @ W3CA year on the Semantic Web @ W3C
A year on the Semantic Web @ W3C
Ivan Herman
 
The Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management SystemThe Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management System
Roberto García
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
Ivan Ermilov
 
Knowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything ProjectKnowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything Project
Enrico Daga
 
Querying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge GraphQuerying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge Graph
Ioan Toma
 
Querying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge GraphQuerying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge Graph
LDBC council
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic Web
Ivan Herman
 
FIWARE Global Summit - IDS Implementation with FIWARE Software Components
FIWARE Global Summit - IDS Implementation with FIWARE Software ComponentsFIWARE Global Summit - IDS Implementation with FIWARE Software Components
FIWARE Global Summit - IDS Implementation with FIWARE Software Components
FIWARE
 
RDF Linked Data - Automatic Exchange of BIM Containers
RDF Linked Data - Automatic Exchange of BIM ContainersRDF Linked Data - Automatic Exchange of BIM Containers
RDF Linked Data - Automatic Exchange of BIM Containers
Safe Software
 

Similar to Sasaki datathon-madrid-2015 (20)

MuseoTorino, first italian project using a GraphDB, RDFa, Linked Open Data
MuseoTorino, first italian project using a GraphDB, RDFa, Linked Open DataMuseoTorino, first italian project using a GraphDB, RDFa, Linked Open Data
MuseoTorino, first italian project using a GraphDB, RDFa, Linked Open Data
 
The Nature.com ontologies portal - Linked Science 2015
The Nature.com ontologies portal - Linked Science 2015The Nature.com ontologies portal - Linked Science 2015
The Nature.com ontologies portal - Linked Science 2015
 
The nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesThe nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologies
 
Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015
Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015
Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015
 
Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.
 
Eclipse RDF4J - Working with RDF in Java
Eclipse RDF4J - Working with RDF in JavaEclipse RDF4J - Working with RDF in Java
Eclipse RDF4J - Working with RDF in Java
 
Sasaki practical-linked-data
Sasaki practical-linked-dataSasaki practical-linked-data
Sasaki practical-linked-data
 
Linked data tooling XML
Linked data tooling XMLLinked data tooling XML
Linked data tooling XML
 
Oc wg-nif-20130711
Oc wg-nif-20130711Oc wg-nif-20130711
Oc wg-nif-20130711
 
Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDF
 
Linked Open Data: A simple how-to
Linked Open Data: A simple how-toLinked Open Data: A simple how-to
Linked Open Data: A simple how-to
 
A year on the Semantic Web @ W3C
A year on the Semantic Web @ W3CA year on the Semantic Web @ W3C
A year on the Semantic Web @ W3C
 
The Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management SystemThe Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management System
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
 
Knowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything ProjectKnowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything Project
 
Querying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge GraphQuerying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge Graph
 
Querying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge GraphQuerying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge Graph
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic Web
 
FIWARE Global Summit - IDS Implementation with FIWARE Software Components
FIWARE Global Summit - IDS Implementation with FIWARE Software ComponentsFIWARE Global Summit - IDS Implementation with FIWARE Software Components
FIWARE Global Summit - IDS Implementation with FIWARE Software Components
 
RDF Linked Data - Automatic Exchange of BIM Containers
RDF Linked Data - Automatic Exchange of BIM ContainersRDF Linked Data - Automatic Exchange of BIM Containers
RDF Linked Data - Automatic Exchange of BIM Containers
 

More from Felix Sasaki

Thb tag-des-offenen-fensters-2021-sasaki-graphdatenbanken
Thb tag-des-offenen-fensters-2021-sasaki-graphdatenbankenThb tag-des-offenen-fensters-2021-sasaki-graphdatenbanken
Thb tag-des-offenen-fensters-2021-sasaki-graphdatenbanken
Felix Sasaki
 
XML Seminar
XML SeminarXML Seminar
XML Seminar
Felix Sasaki
 
Sasaki Presentation at EVA 2016
Sasaki Presentation at EVA 2016Sasaki Presentation at EVA 2016
Sasaki Presentation at EVA 2016
Felix Sasaki
 
Freme at feisgiltt 2015 freme & linked data & localisers
Freme at feisgiltt 2015   freme & linked data & localisersFreme at feisgiltt 2015   freme & linked data & localisers
Freme at feisgiltt 2015 freme & linked data & localisers
Felix Sasaki
 
Freme at feisgiltt 2015 freme use cases
Freme at feisgiltt 2015   freme use casesFreme at feisgiltt 2015   freme use cases
Freme at feisgiltt 2015 freme use casesFelix Sasaki
 
1114 sasaki-metadata
1114 sasaki-metadata1114 sasaki-metadata
1114 sasaki-metadata
Felix Sasaki
 
Its2 ontology-localization
Its2 ontology-localizationIts2 ontology-localization
Its2 ontology-localization
Felix Sasaki
 
Sasaki ins-netz-gegangen-20111117
Sasaki ins-netz-gegangen-20111117Sasaki ins-netz-gegangen-20111117
Sasaki ins-netz-gegangen-20111117Felix Sasaki
 
"Warum Metadaten? Ein Plädoyer und mehr …" - webtechcon 2011 Präsentation
"Warum Metadaten? Ein Plädoyer und mehr …" - webtechcon 2011 Präsentation"Warum Metadaten? Ein Plädoyer und mehr …" - webtechcon 2011 Präsentation
"Warum Metadaten? Ein Plädoyer und mehr …" - webtechcon 2011 PräsentationFelix Sasaki
 
Sasaki markupforum2011
Sasaki markupforum2011Sasaki markupforum2011
Sasaki markupforum2011Felix Sasaki
 
Sasaki webtechcon2010
Sasaki webtechcon2010Sasaki webtechcon2010
Sasaki webtechcon2010Felix Sasaki
 
Mlw sasaki-20101027
Mlw sasaki-20101027Mlw sasaki-20101027
Mlw sasaki-20101027Felix Sasaki
 
HTML5 - presentation at W3C-Tag 2009
HTML5 - presentation at W3C-Tag 2009HTML5 - presentation at W3C-Tag 2009
HTML5 - presentation at W3C-Tag 2009
Felix Sasaki
 

More from Felix Sasaki (13)

Thb tag-des-offenen-fensters-2021-sasaki-graphdatenbanken
Thb tag-des-offenen-fensters-2021-sasaki-graphdatenbankenThb tag-des-offenen-fensters-2021-sasaki-graphdatenbanken
Thb tag-des-offenen-fensters-2021-sasaki-graphdatenbanken
 
XML Seminar
XML SeminarXML Seminar
XML Seminar
 
Sasaki Presentation at EVA 2016
Sasaki Presentation at EVA 2016Sasaki Presentation at EVA 2016
Sasaki Presentation at EVA 2016
 
Freme at feisgiltt 2015 freme & linked data & localisers
Freme at feisgiltt 2015   freme & linked data & localisersFreme at feisgiltt 2015   freme & linked data & localisers
Freme at feisgiltt 2015 freme & linked data & localisers
 
Freme at feisgiltt 2015 freme use cases
Freme at feisgiltt 2015   freme use casesFreme at feisgiltt 2015   freme use cases
Freme at feisgiltt 2015 freme use cases
 
1114 sasaki-metadata
1114 sasaki-metadata1114 sasaki-metadata
1114 sasaki-metadata
 
Its2 ontology-localization
Its2 ontology-localizationIts2 ontology-localization
Its2 ontology-localization
 
Sasaki ins-netz-gegangen-20111117
Sasaki ins-netz-gegangen-20111117Sasaki ins-netz-gegangen-20111117
Sasaki ins-netz-gegangen-20111117
 
"Warum Metadaten? Ein Plädoyer und mehr …" - webtechcon 2011 Präsentation
"Warum Metadaten? Ein Plädoyer und mehr …" - webtechcon 2011 Präsentation"Warum Metadaten? Ein Plädoyer und mehr …" - webtechcon 2011 Präsentation
"Warum Metadaten? Ein Plädoyer und mehr …" - webtechcon 2011 Präsentation
 
Sasaki markupforum2011
Sasaki markupforum2011Sasaki markupforum2011
Sasaki markupforum2011
 
Sasaki webtechcon2010
Sasaki webtechcon2010Sasaki webtechcon2010
Sasaki webtechcon2010
 
Mlw sasaki-20101027
Mlw sasaki-20101027Mlw sasaki-20101027
Mlw sasaki-20101027
 
HTML5 - presentation at W3C-Tag 2009
HTML5 - presentation at W3C-Tag 2009HTML5 - presentation at W3C-Tag 2009
HTML5 - presentation at W3C-Tag 2009
 

Recently uploaded

[HUN][hackersuli] Red Teaming alapok 2024
[HUN][hackersuli] Red Teaming alapok 2024[HUN][hackersuli] Red Teaming alapok 2024
[HUN][hackersuli] Red Teaming alapok 2024
hackersuli
 
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
eutxy
 
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
zyfovom
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
3ipehhoa
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
Rogerio Filho
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
3ipehhoa
 
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
cuobya
 
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
keoku
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
JeyaPerumal1
 
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC
 
Understanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdfUnderstanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdf
SEO Article Boost
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
ufdana
 
7 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 20247 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 2024
Danica Gill
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
Arif0071
 
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
CIOWomenMagazine
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
3ipehhoa
 
Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027
harveenkaur52
 
Search Result Showing My Post is Now Buried
Search Result Showing My Post is Now BuriedSearch Result Showing My Post is Now Buried
Search Result Showing My Post is Now Buried
Trish Parr
 
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
cuobya
 
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdfJAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
Javier Lasa
 

Recently uploaded (20)

[HUN][hackersuli] Red Teaming alapok 2024
[HUN][hackersuli] Red Teaming alapok 2024[HUN][hackersuli] Red Teaming alapok 2024
[HUN][hackersuli] Red Teaming alapok 2024
 
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
 
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
 
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
 
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
 
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
 
Understanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdfUnderstanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdf
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
 
7 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 20247 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 2024
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027
 
Search Result Showing My Post is Now Buried
Search Result Showing My Post is Now BuriedSearch Result Showing My Post is Now Buried
Search Result Showing My Post is Now Buried
 
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
 
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdfJAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
 

Sasaki datathon-madrid-2015

  • 1. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Roundtripping of NIF based Linguistic Linked Data with non linked data sources Felix Sasaki DFKI / W3C Fellow Slides: http://de.slideshare.net/atcfsenzoku/sasaki-datathonmadrid2015 1
  • 2. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 What is NIF? • Natural Language Processing Interchange Format – See http://nlp2rdf.org/ • LLD format to store annotations & to organize NLP pipelines • API specification to create NIF workflows • More details: after the coffee break  • Following slides: main roles for NIF 2
  • 3. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Example (Partial; JSON-LD Syntax) { "@graph" : [ { "@id" : "p:char=0,18", "@type" : [ "nif:Context", "nif:Sentence", "nif:RFC5147String" ], "anchorOf" : "Welcome to Prague.", "beginIndex" : "0", "endIndex" : "18", "isString" : "Welcome to Prague.", "referenceContext" : "p:char=0,18” }, { "@id" : "p:char=11,17", "@type" : [ "nif:RFC5147String", "nif:Word" ], … "referenceContext" : "p:char=0,18", "taIdentRef" : "http://dbpedia.org/resource/Prague" }, …] } 3
  • 4. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Example (Partial; JSON-LD Syntax) { "@graph" : [ { "@id" : "p:char=0,18", "@type" : [ "nif:Context", "nif:Sentence", "nif:RFC5147String" ], "anchorOf" : "Welcome to Prague.", "beginIndex" : "0", "endIndex" : "18", "isString" : "Welcome to Prague.", "referenceContext" : "p:char=0,18” }, { "@id" : "p:char=11,17", "@type" : [ "nif:RFC5147String", "nif:Word" ], … "referenceContext" : "p:char=0,18", "taIdentRef" : "http://dbpedia.org/resource/Prague" }, …] } 4 • Identifying and typing annotations • Identifying annotation offsets • Adding additional knowledge, e.g. named entity identifier • Interrelating annotations
  • 5. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Example (Partial; JSON-LD Syntax) { "@graph" : [ { "@id" : "p:char=0,18", "@type" : [ "nif:Context", "nif:Sentence", "nif:RFC5147String" ], "anchorOf" : "Welcome to Prague.", "beginIndex" : "0", "endIndex" : "18", "isString" : "Welcome to Prague.", "referenceContext" : "p:char=0,18” }, { "@id" : "p:char=11,17", "@type" : [ "nif:RFC5147String", "nif:Word" ], … "referenceContext" : "p:char=0,18", "taIdentRef" : "http://dbpedia.org/resource/Prague" }, …] } 5 • Identifying and typing annotations • Identifying annotation offsets • Adding additional knowledge, e.g. named entity identifier • Interrelating annotations
  • 6. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Example (Partial; JSON-LD Syntax) { "@graph" : [ { "@id" : "p:char=0,18", "@type" : [ "nif:Context", "nif:Sentence", "nif:RFC5147String" ], "anchorOf" : "Welcome to Prague.", "beginIndex" : "0", "endIndex" : "18", "isString" : "Welcome to Prague.", "referenceContext" : "p:char=0,18” }, { "@id" : "p:char=11,17", "@type" : [ "nif:RFC5147String", "nif:Word" ], … "referenceContext" : "p:char=0,18", "taIdentRef" : "http://dbpedia.org/resource/Prague" }, …] } 6 • Identifying and typing annotations • Identifying annotation offsets • Adding additional knowledge, e.g. named entity identifier • Interrelating annotations
  • 7. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Example (Partial; JSON-LD Syntax) { "@graph" : [ { "@id" : "p:char=0,18", "@type" : [ "nif:Context", "nif:Sentence", "nif:RFC5147String" ], "anchorOf" : "Welcome to Prague.", "beginIndex" : "0", "endIndex" : "18", "isString" : "Welcome to Prague.", "referenceContext" : "p:char=0,18” }, { "@id" : "p:char=11,17", "@type" : [ "nif:RFC5147String", "nif:Word" ], … "referenceContext" : "p:char=0,18", "taIdentRef" : "http://dbpedia.org/resource/Prague" }, …] } 7 • Identifying and typing annotations • Identifying annotation offsets • Adding additional knowledge, e.g. named entity identifier • Interrelating annotations
  • 8. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 A NIF workflow 8 Existing content Content analytics, e.g. named entity recognition Conversion to NIF Deploying knowledge from the LLD cloud
  • 9. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Potential scenario: roundtripping 9 Existing content Content analytics, e.g. named entity recognition Conversion to NIF Storing annotations in original content Deploying knowledge from the LLD cloud
  • 10. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Roundtripping • Roundtripping: Storing the outcome of content processing (analytics) tasks in the original content • Not always needed, but sometimes – examples: – Enriching Web content with named entity information; generating Schema.org markup via NIF pipelines. Format: HTML – Enriching localisation content, to add value beyond translation: Format: XLIFF 10
  • 11. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Example: HTML Example roundtripping workflow 11 … <p>Welcome to Prague!</p>… …<p>Welcome to <span … itemtype="http://schema.org/Place">Prague</span>!< /p>… 1) Conversion to NIF 2) NER processing 3) Back conversion to HTML
  • 12. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Example: XLIFF Example roundtripping workflow 12 … <xlf:source>Welcome to Prague!</xlf:source> … … <xlf:source>Welcome to <mrk … its:taClassRef="http://schema.org/Place">Prague </mrk>!</xlf:source> … 1) Conversion to NIF 2) NER processing 3) Back conversion to HTML
  • 13. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Example usage scenario: FREME project • See http://www.freme-project.eu/ • Developing interfaces for multilingual and semantic enrichment of digital content • Relies on NIF based enrichment workflows – See FREME API version 0.1 http://api.freme-project.eu/doc/0.1/ • Deploys aspects of the LIDER reference architecture for LLD processing – See D3.1.1 at http://lider-project.eu/?q=doc/deliverables • Focuses on four business cases – Localization BC requires XLIFF roundtripping – Web content personalisation BC requires HTML roundtripping 13
  • 14. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Challenges for roundtripping • Source format – How to store enrichment information (annotations) – How to handle existing information • Annotation model – NIF = a general graph-based annotation model – Sources format and annotation motivation may require restriction of the model 14
  • 15. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 How to store annotations in various source formats • Solvable for markup languages like HTML or XLIFF • Challenge to preserve existing markup “<p>Welcome to <b>Prague</b>!</p>” • General issue with complex and proprietary formats: – “My own” storage mechanism = no tool support – Using existing storage mechanisms may mean: overloading semantics 15
  • 16. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Source format example: Word … <w:t>Welcome to Prague!</w:t> … 16 … <w:commentRangeStart w:id="0"/><w:t>Prague</w:t> <w:commentRangeEnd w:id="0"/> <w:r w:rsidR="00987079"> … <w:p w:rsidRPr="00987079">… Enrichment: type "http://schema.org/Place"…</w:p> Enrichment process; storing enrichment as comments Change of original content: creation of anchor Comment stored separately; refers to anchor: “standoff approach” Content storage Comment storage Content storage (Word file unzipped)
  • 17. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Annotation models • NIF: like RDF = general graph model – Consisting of nodes and arcs 17 p:char=11,17 dbp:Prague taIdentRef
  • 18. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Restricting graphs: Tree structured annotations on several layers 18 • Tree structures for syntactic annotations • Several annotation layers for the same text • Concurrent hierarchies • Representation only of one of these in roundtripping with XML Example taken from TEI http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html
  • 19. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Representing overlapping hierarchies with markup (1/2) Solutions advertised by the TEI • Multiple encoding of the same information – One XML document per annotation • Boundary marking with empty “milestone” elements – Also used by XLIFF 19
  • 20. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Representing overlapping hierarchies with markup (2/2) Solutions advertised by the TEI • Fragmentation and reconstitution of virtual elements – One hierarchy explicit, others with interrelated marked-up spans • Stand-off markup – Separation of text and annotations, interlinked via anchor and reference – Cf. Word example 20
  • 21. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Representing overlapping hierarchies in RDF POWLA (cf. Chiarcos, 2012) • RDF representation for corpus annotation, based on PAULA XML Standoff format • Allows to represent hierarchical, multi-layer corpora in RDF and query in SPARQL • Not relevant for roundtripping, but for linguistic annotation representation and processing in RDF 21
  • 22. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Lessons learned • Choose the overlap solution that fits your roundtripping modelling and processing needs • Consider off-the-shelf tooling – For 100% hierarchical data: XPath / CSS selectors, DOM, … • Consider libraries – For extraction only: Tika http://tika.apache.org/ – For roundtripping: Okapi http://okapi.opentag.com/ - in FREME currently being adapted for roundtripping in selected formats • Make sure the annotation survives in the original format – cf. Word example – Soon to be made easier by using Okapi 22
  • 23. Sasaki – LLD Datathon – Cercedilla, Spain, May 2015 Roundtripping of NIF based Linguistic Linked Data with non linked data sources Felix Sasaki DFKI / W3C Fellow 23