Linguistic Linked Open Data
LLOD
Challenges, Approaches, Future Work
Sebastian Hellmann
TKE 2016
Sebastian Hellmann - AKSW/KILT Copenhagen TKE 2016
AKSW / KILT in Leipzig
Leipzig has become one of the largest Semantic Web centers
AKSW has 4 subgroups and 45 PhD students http://aksw.org/Team.html
Current position:
- Head of AKSW / KILT research group (8 PhD students)
- Knowledge Integration and Language Technology (KILT) http://aksw.org/Groups/KILT.html
- Project manager for 2 H2020 and 1 German research project (BMWi)
- http://freme-project.eu/ , http://aligned-project.eu/ , http://smartdataweb.de/
- Executive Director of the DBpedia Association http://dbpedia.org
Outline
● The vision behind Linked Data - a technological introduction
● Linguistic Linked Open Data
● Knowledge Modelling vs. Data Encoding
● LIDER
● Challenges and Approaches
Linked Data
Web of Data
WWW vs. GGG - https://en.wikipedia.org/wiki/Giant_Global_Graph
Data on the Web vs. the Web of Data vs. the Semantic Web
RDF - Entity Attribute Value - http://dbpedia.org/resource/Copenhagen
Three ways to publish RDF:
1. Linked Data: resource-level access via HTTP request (next slide)
2. SPARQL: query access via triplestore database
3. Dump: dataset-level access via bulk download
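The first two access paths can be sketched as plain HTTP requests. This is a minimal sketch, not a definitive client: it only builds the requests and never sends them. The endpoint address `http://dbpedia.org/sparql`, the chosen media types, and the helper names are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

RESOURCE_URI = "http://dbpedia.org/resource/Copenhagen"
# Assumption: a SPARQL endpoint is reachable at this address.
SPARQL_ENDPOINT = "http://dbpedia.org/sparql"


def linked_data_request(resource_uri: str) -> Request:
    """Resource-level access: ask for an RDF serialization of one URI."""
    return Request(resource_uri, headers={"Accept": "text/turtle"})


def sparql_request(endpoint: str, query: str) -> Request:
    """Query access: pass a SPARQL query as a URL-encoded GET parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


query = "SELECT ?p ?o WHERE { <%s> ?p ?o } LIMIT 10" % RESOURCE_URI
ld = linked_data_request(RESOURCE_URI)
sq = sparql_request(SPARQL_ENDPOINT, query)

# Nothing is sent over the network here; we only inspect the prepared requests.
print(ld.full_url, ld.get_header("Accept"))
print(sq.full_url)
```

Passing either object to `urllib.request.urlopen` would perform the actual lookup. The third path, a dump, is an ordinary file download and needs no special request.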
Linked Data
Four rules of https://www.w3.org/DesignIssues/LinkedData
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the
standards (RDF*, SPARQL)
4. Include links to other URIs, so that they can discover more things.
https://en.wikipedia.org/wiki/Copenhagen vs.
http://dbpedia.org/resource/Copenhagen
Source: https://www.w3.org/DesignIssues/LinkedData.html
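Rules 1 and 4 can be illustrated with entity-attribute-value triples. The sketch below is an assumption-laden illustration in plain Python: the two triples are written by hand for this example (not fetched from DBpedia), and `outgoing_links` is a hypothetical helper name.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (entity, attribute, value)

DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hand-written illustrative triples; every name is an HTTP URI (rules 1 and 2).
triples: List[Triple] = [
    (DBR + "Copenhagen", RDFS_LABEL, '"Copenhagen"@en'),
    (DBR + "Copenhagen", DBO + "country", DBR + "Denmark"),
]


def outgoing_links(subject: str, data: List[Triple]) -> List[str]:
    """Return the values of a subject that are HTTP URIs, i.e. followable links (rule 4)."""
    return [
        value
        for entity, _, value in data
        if entity == subject and value.startswith(("http://", "https://"))
    ]


print(outgoing_links(DBR + "Copenhagen", triples))
```

The literal label is skipped because it is a value, not a name that can be looked up; only the URI of Denmark counts as a link to discover more things.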
Open Data != Open Data
Open Access vs Open License
Open Access means accessible like a web page (often unclear license)
http://opendefinition.org by OKFN:
“Knowledge is open if anyone is free to access, use, modify, and share it —
subject, at most, to measures that preserve provenance and openness.”
http://lod-cloud.net/
How is the Linked Data Cloud built?
- Open Access as the basis
- At least 50 links between things are required to receive a dataset link
- http://lov.okfn.org
- http://datahub.io
- Assessing Quantity and Quality of Links Between Linked Data Datasets by Cir
Sebastian Hellmann, Kay Müller, and Martin Brümmer in LDOW 2016 http://ev
org/ldow2016/papers/LDOW2016_paper_09.pdf
Linguistic Linked Open Data
Linguistic Linked Open Data
● Movement originated in the context of the Working Group for Open Data in
Linguistics (OWLG) at Open Knowledge Foundation (OKFN)
● Open is supposed to mean Open license
● Join community mailing list at http://linguistics.okfn.org/
● Current information at http://linguistic-lod.org/
maintained by John McCrae
-> Instructions on how to join the LLOD cloud
January 2011
February 2012
Linked Data in Linguistics. Representing Language Data and Metadata. Christian Chiarcos, Sebastian Nordhoff, and
Sebastian Hellmann (Eds.). Springer, Heidelberg (2012)
http://www.springer.com/computer/ai/book/978-3-642-28248-5
August 2012
Sept 2012
MLODE
Special Issue on Multilingual Linked Open Data (MLOD)
Editors: Sebastian Hellmann, Steven Moran, Martin Brümm
and John McCrae,
Semantic Web, vol. 6, no. 4, pp. 315-317, 2015
Jan 2013
Sep 2013
LIDER FP7 EU Project
Start: Nov 2013
Duration: 2 years
http://lider-project.eu/
May 2014
Nov 2014
May 2015
May 2016
Should we all use Linked Data?
Should we all use Linked Data?
When should we use linked data?
How should we use linked data?
When should we not use it?
Knowledge Modeling vs. Data Encoding
Entity Relationship Diagrams and UML
The Metadata Ecosystem of the DataId Ontology, Markus Freudenberg, submitted to MTSR Conf 2016
http://dataid.dbpedia.org
XML encoding variants
<same> should be symmetric, reflexive and transitive https://en.wikipedia.org/wiki/Equivalence_relation
Apples and oranges
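What it means for a `<same>` relation to be an equivalence relation can be checked mechanically. The following is a minimal sketch in plain Python; the item names and the example relation are hypothetical and only serve to show the three properties.

```python
from itertools import product
from typing import Hashable, Iterable, Set, Tuple

Pair = Tuple[Hashable, Hashable]


def is_reflexive(items: Iterable[Hashable], rel: Set[Pair]) -> bool:
    """Every item is the same as itself."""
    return all((a, a) in rel for a in items)


def is_symmetric(rel: Set[Pair]) -> bool:
    """If a is the same as b, then b is the same as a."""
    return all((b, a) in rel for a, b in rel)


def is_transitive(rel: Set[Pair]) -> bool:
    """If a is the same as b and b is the same as d, then a is the same as d."""
    return all((a, d) in rel for (a, b), (c, d) in product(rel, rel) if b == c)


def is_equivalence(items: Iterable[Hashable], rel: Set[Pair]) -> bool:
    items = list(items)
    return is_reflexive(items, rel) and is_symmetric(rel) and is_transitive(rel)


# Hypothetical example: a <same> link recorded in one direction only.
items = {"apple", "Apfel", "orange"}
identity = {(x, x) for x in items}
one_way = identity | {("apple", "Apfel")}
two_way = one_way | {("Apfel", "apple")}

print(is_equivalence(items, one_way))  # symmetry is violated
print(is_equivalence(items, two_way))
```

An XML encoding that stores the link on only one of the two elements behaves like `one_way`: a consumer has to know the intended semantics to add the missing direction.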
Who can you ask what XML tags and structure
mean and what they are used for?
Internationalization Tag Set (ITS) 2.0
http://www.w3.org/TR/its20/
● W3C Recommendation since 29 October 2013
● defines how to embed Machine Translation and Localisation
annotations, so-called Data Categories, in (X)HTML and XML
● In addition to the human-readable document two ontologies are referenced
that capture the semantics of the standard.
● ITS Ontology as companion
● NLP Interchange Format (NIF) is the recommended format for RDF
conversion of ITS 2.0 http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core
One of the most efficient and robust ways to annotate HTML in a standardized manner
NLP Interchange Format 2.0 (old example)
NIF 2.1 release pending
Join W3C Community Group: https://www.w3.org/community/ld4lt/
NIF useful for:
● Adding semantics to NLP tool output and corpora
● Providing and publishing identifiers for text and annotations
NIF is compact and scalable (cf. http://wiki-link.nlp2rdf.org/ ):
● Google Wikilinks Corpus with 10.6 million webpages and 31.5 million Wikipedia links (about 3 per page) with a zipped size of 180 GB.
● 533 million triples (other formats 7-27% more)
● 79 GB (12 GB gzipped dumps) in Turtle format (original size 180 GB containing HTML markup)
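How NIF provides identifiers for text and annotations can be sketched as follows. This is a rough illustration, not the reference implementation: it assumes the offset-based `#char=begin,end` URI scheme and the property names `isString`, `referenceContext`, `anchorOf`, `beginIndex` and `endIndex` from the NIF core ontology, which should be checked against the ontology itself. The document URI `http://example.org/doc1`, the sample sentence and the helper names are hypothetical, and the text is assumed to contain no quotes or backslashes that would need escaping.

```python
from typing import List

NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_NNI = "http://www.w3.org/2001/XMLSchema#nonNegativeInteger"


def offset_uri(doc_uri: str, begin: int, end: int) -> str:
    """Mint an identifier for the character span [begin, end) of a document."""
    return f"{doc_uri}#char={begin},{end}"


def annotate(doc_uri: str, text: str, phrase: str) -> List[str]:
    """Return N-Triples-style lines describing the text and one phrase in it."""
    begin = text.index(phrase)  # raises ValueError if the phrase is absent
    end = begin + len(phrase)
    context = offset_uri(doc_uri, 0, len(text))
    span = offset_uri(doc_uri, begin, end)
    return [
        f"<{context}> <{RDF_TYPE}> <{NIF}Context> .",
        f'<{context}> <{NIF}isString> "{text}" .',
        f"<{span}> <{NIF}referenceContext> <{context}> .",
        f'<{span}> <{NIF}anchorOf> "{phrase}" .',
        f'<{span}> <{NIF}beginIndex> "{begin}"^^<{XSD_NNI}> .',
        f'<{span}> <{NIF}endIndex> "{end}"^^<{XSD_NNI}> .',
    ]


text = "Copenhagen is the capital of Denmark."
lines = annotate("http://example.org/doc1", text, "Denmark")
print("\n".join(lines))
```

Because the span URI is derived from offsets, any tool that processes the same text arrives at the same identifier, so annotations from different tools can be merged on it.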
LIDER
Towards a linguistic linked data ecosystem
Website: http://lider-project.eu
Guidelines: http://lider-project.eu/?q=guidelines
NIF
LIDER - Deliverable 2.1.2
http://www.lider-project.eu/sites/default/files/D2.1.2-Phase-II.pdf
LIDER Reference Architecture Deliverable 3.1.2.
General:
lemon - developed by
http://www.lider-project.eu/sites/default/files/D3.1.2-v2.0.pdf
Challenges and Work in Progress
Identifier management
- Ideal identifiers are stable, i.e. the meaning behind the URI does not change
- Unrealistic for most use cases
- Easier for individuals, i.e. persons, organisations
- Non-trivial for terminology
Proposals:
1. Apply software development practices, e.g. versioning, update scripts: http://vocol.org , http://github.org , http://aligned-project.eu
2. ??
Knowledge Fusion
- Linking is mostly done manually
- Linking 200 datasets pairwise requires maintaining about 40,000 mappings (200 × 199 = 39,800 directed pairs)
- Adding one after the other depends on the merge order
- Ideally we would be able to structure all datasets into clusters before linking
Proposals:
1. Under discussion with: Erhard Rahm - The Case for Holistic Data Integration
ADBIS 2016 Keynote: http://adbis2016.vsb.cz/keynote/ (to appear)
2. Apply software development processes: https://github.com/dbpedia/links
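The growth of pairwise linking is simple arithmetic and can be checked with a short sketch. The function name is hypothetical; "directed" counts a mapping from A to B separately from one from B to A.

```python
def pairwise_mappings(n_datasets: int, directed: bool = True) -> int:
    """Number of mappings needed to link every dataset with every other one."""
    ordered_pairs = n_datasets * (n_datasets - 1)
    return ordered_pairs if directed else ordered_pairs // 2


for n in (10, 50, 200):
    print(n, pairwise_mappings(n), pairwise_mappings(n, directed=False))

# Cost of adding one more dataset to 200 already linked ones (undirected):
extra = pairwise_mappings(201, directed=False) - pairwise_mappings(200, directed=False)
print(extra)
```

The count grows quadratically, so each newly added dataset needs a mapping to every dataset already present. This is the maintenance burden behind the wish to cluster datasets before linking.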
The Metadata Challenge
Where to publish metadata for your data?
- Barrier between data and dataset description
- Stale metadata
- Single point of truth missing
- Metadata too heterogeneous
- Download link missing
- No (sufficiently) complete view over the web of data possible, discovery failure
Proposals:
1. build an index: http://linghub.lider-project.eu/ (Clarin, LRE Map, Metashare, Datahub)
2. create a better schema: http://dataid.dbpedia.org and provide benefits for complying
MMoOn
- LIDER
- Lemon
- ODRL
- Olia
- NIF
- Morphology quite complex
- Specific to language and to the linguist
- http://mmoon.org
The Metadata Challenge 2
● RDF structure is too simple to keep additional metadata
○ Scope
○ Validity
○ Confidence
○ Technical metadata, e.g. collection time
Contextualisation is probably already better researched in lexicography than in Semantic Web.
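The limitation can be made concrete: a triple has exactly three slots, so scope, validity, confidence and collection time must be attached to the statement as a whole (in RDF this is commonly done with reification or named graphs). The sketch below stays in plain Python; the class, its field names and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object): no room for context


@dataclass(frozen=True)
class AnnotatedStatement:
    """A triple plus the statement-level metadata named above."""
    triple: Triple
    scope: Optional[str] = None
    valid_until: Optional[str] = None
    confidence: Optional[float] = None
    collected_at: Optional[str] = None


def trusted(statements: List[AnnotatedStatement], min_confidence: float) -> List[Triple]:
    """Keep only triples whose recorded confidence reaches the threshold."""
    return [
        s.triple
        for s in statements
        if s.confidence is not None and s.confidence >= min_confidence
    ]


# Hypothetical example data under example.org.
statements = [
    AnnotatedStatement(
        ("http://example.org/term/1", "http://example.org/vocab/label", '"linked data"@en'),
        scope="computing", confidence=0.9, collected_at="2016-06-01",
    ),
    AnnotatedStatement(
        ("http://example.org/term/1", "http://example.org/vocab/label", '"LD"@en'),
        confidence=0.4,
    ),
]

print(trusted(statements, 0.5))
```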
Future work and take home messages
Data quality and verification
● Data quality can be defined and measured with tools.
● http://svn.aksw.org/papers/2014/WWW_Databugger/public.pdf Test-driven
Evaluation of Linked Data Quality by Dimitris Kontokostas, Patrick Westphal,
Sören Auer, Sebastian Hellmann, Jens Lehmann, Roland Cornelissen, and
Amrapali J. Zaveri in Proceedings of the 23rd International Conference on
World Wide Web
● Current standard:
○ https://www.w3.org/TR/shacl/
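The idea of test-driven data quality can be sketched as a constraint that is run against the data and reports violations. This is a plain-Python illustration under stated assumptions, not the API of the cited tool or of SHACL; the helper name, the `example.org` vocabulary and the data are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace


def missing_property(data: List[Triple], rdf_class: str, prop: str) -> List[str]:
    """Return all instances of rdf_class that have no value for prop."""
    instances = {s for s, p, o in data if p == RDF_TYPE and o == rdf_class}
    with_prop = {s for s, p, _ in data if p == prop}
    return sorted(instances - with_prop)


data = [
    (EX + "term/a", RDF_TYPE, EX + "Term"),
    (EX + "term/a", EX + "label", '"alpha"'),
    (EX + "term/b", RDF_TYPE, EX + "Term"),
]

violations = missing_property(data, EX + "Term", EX + "label")
print(violations)
```

A dataset passes this test when the list is empty. Running such checks on every update gives a measurable definition of quality, much as unit tests do for software.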
Open licenses in research
[Flowchart beginning at "Start", with two yes/no decisions and two outcomes]
Decisions:
- Are you willing to publish your data under an open license?
- Can you make a product out of your data?
Outcomes:
- Congratulations, your paper has been accepted
- Good luck, we wish you all the best and a high profit
Entity Linking Verification - new translator job profile
● http://www.freme-project.eu/
● Business Case: Integrating semantic enrichment into multilingual content in
translation and localisation
● In the future, translators and lexicographers might be asked to judge entity linking and verify data
Should I invest in publishing linked data?
Long-term data strategy, if:
● You expect many inbound links
● Persistent IDs and
● Long-term hosting and curation
are no problem for you
-> yes (data value increases)
One-time thing:
● Externals are interested only in the yellow zone
-> Publish under an open license (let someone else do it)
DBpedia Association
DBpedia+
● Maintain identifier space
● Add open and member data to DBpedia+
● Add data following the LIDER guidelines
● Ability to add your backlinks
DBpedia Community meeting on the 15th of September in Leipzig
Events in 2016
● KEKI 2016 Workshop - Uses of Linguistic Linked Open Data http://keki2016.linguistic-lod.org/ Deadline is 1st of July, but might be extended
● http://2016.semantics.cc
Thank you
hellmann@informatik.uni-leipzig.de
54

More Related Content

What's hot

LOD2 Webinar: SIREn
LOD2 Webinar: SIREnLOD2 Webinar: SIREn
DBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, DublinDBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, Dublin
m_ackermann
 
The Semantic Data Web, Sören Auer, University of Leipzig
The Semantic Data Web, Sören Auer, University of LeipzigThe Semantic Data Web, Sören Auer, University of Leipzig
The Semantic Data Web, Sören Auer, University of Leipzig
LOD2 Creating Knowledge out of Interlinked Data
 
Swib12 workshop lod_beginners
Swib12 workshop lod_beginnersSwib12 workshop lod_beginners
Swib12 workshop lod_beginners
dr0i
 
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
Sergio Fernández
 
Linked Data for Abbreviations and Segmentation
Linked Data for Abbreviations and SegmentationLinked Data for Abbreviations and Segmentation
Linked Data for Abbreviations and Segmentation
Sebastian Hellmann
 
KEDL DBpedia 2019
KEDL DBpedia  2019KEDL DBpedia  2019
KEDL DBpedia 2019
Sebastian Hellmann
 
CKAN overview
CKAN overviewCKAN overview
Industry Ontologies: Case Studies in Creating and Extending Schema.org
Industry Ontologies: Case Studies in Creating and Extending Schema.org Industry Ontologies: Case Studies in Creating and Extending Schema.org
Industry Ontologies: Case Studies in Creating and Extending Schema.org
sopekmir
 
Adoption of the Linked Data Best Practices in Different Topical Domains
Adoption of the Linked Data Best Practices in Different Topical DomainsAdoption of the Linked Data Best Practices in Different Topical Domains
Adoption of the Linked Data Best Practices in Different Topical Domains
Chris Bizer
 
Redlink, The Data Linking API
Redlink, The Data Linking APIRedlink, The Data Linking API
Redlink, The Data Linking API
Sergio Fernández
 
PhD Defense
PhD DefensePhD Defense
Linked data tooling XML
Linked data tooling XMLLinked data tooling XML
Linked data tooling XML
FREMEProjectH2020
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Sören Auer
 
LOD2 Webinar Series: D2R and Sparqlify
LOD2 Webinar Series: D2R and SparqlifyLOD2 Webinar Series: D2R and Sparqlify
LOD2 Webinar Series: D2R and Sparqlify
LOD2 Creating Knowledge out of Interlinked Data
 
LOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and Repair
LOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and RepairLOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and Repair
LOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and Repair
LOD2 Creating Knowledge out of Interlinked Data
 
Haystack 2018 apache_tika-eval_tallison
Haystack 2018 apache_tika-eval_tallisonHaystack 2018 apache_tika-eval_tallison
Haystack 2018 apache_tika-eval_tallison
Tim Allison
 
Lider Reference Model ld4lt session March, 3rd, 2015
Lider Reference Model ld4lt session  March, 3rd, 2015Lider Reference Model ld4lt session  March, 3rd, 2015
Lider Reference Model ld4lt session March, 3rd, 2015
Sebastian Hellmann
 
Publishing "5 star" data: the case for RDF
Publishing "5 star" data: the case for RDFPublishing "5 star" data: the case for RDF
Publishing "5 star" data: the case for RDF
PeterWinstanley1
 
DBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataDBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of Data
Sebastian Hellmann
 

What's hot (20)

LOD2 Webinar: SIREn
LOD2 Webinar: SIREnLOD2 Webinar: SIREn
LOD2 Webinar: SIREn
 
DBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, DublinDBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, Dublin
 
The Semantic Data Web, Sören Auer, University of Leipzig
The Semantic Data Web, Sören Auer, University of LeipzigThe Semantic Data Web, Sören Auer, University of Leipzig
The Semantic Data Web, Sören Auer, University of Leipzig
 
Swib12 workshop lod_beginners
Swib12 workshop lod_beginnersSwib12 workshop lod_beginners
Swib12 workshop lod_beginners
 
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
 
Linked Data for Abbreviations and Segmentation
Linked Data for Abbreviations and SegmentationLinked Data for Abbreviations and Segmentation
Linked Data for Abbreviations and Segmentation
 
KEDL DBpedia 2019
KEDL DBpedia  2019KEDL DBpedia  2019
KEDL DBpedia 2019
 
CKAN overview
CKAN overviewCKAN overview
CKAN overview
 
Industry Ontologies: Case Studies in Creating and Extending Schema.org
Industry Ontologies: Case Studies in Creating and Extending Schema.org Industry Ontologies: Case Studies in Creating and Extending Schema.org
Industry Ontologies: Case Studies in Creating and Extending Schema.org
 
Adoption of the Linked Data Best Practices in Different Topical Domains
Adoption of the Linked Data Best Practices in Different Topical DomainsAdoption of the Linked Data Best Practices in Different Topical Domains
Adoption of the Linked Data Best Practices in Different Topical Domains
 
Redlink, The Data Linking API
Redlink, The Data Linking APIRedlink, The Data Linking API
Redlink, The Data Linking API
 
PhD Defense
PhD DefensePhD Defense
PhD Defense
 
Linked data tooling XML
Linked data tooling XMLLinked data tooling XML
Linked data tooling XML
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
 
LOD2 Webinar Series: D2R and Sparqlify
LOD2 Webinar Series: D2R and SparqlifyLOD2 Webinar Series: D2R and Sparqlify
LOD2 Webinar Series: D2R and Sparqlify
 
LOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and Repair
LOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and RepairLOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and Repair
LOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and Repair
 
Haystack 2018 apache_tika-eval_tallison
Haystack 2018 apache_tika-eval_tallisonHaystack 2018 apache_tika-eval_tallison
Haystack 2018 apache_tika-eval_tallison
 
Lider Reference Model ld4lt session March, 3rd, 2015
Lider Reference Model ld4lt session  March, 3rd, 2015Lider Reference Model ld4lt session  March, 3rd, 2015
Lider Reference Model ld4lt session March, 3rd, 2015
 
Publishing "5 star" data: the case for RDF
Publishing "5 star" data: the case for RDFPublishing "5 star" data: the case for RDF
Publishing "5 star" data: the case for RDF
 
DBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataDBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of Data
 

Similar to Linguistic Linked Open Data, Challenges, Approaches, Future Work

CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage information
Enno Meijers
 
Uk discovery-jisc-project-showcase
Uk discovery-jisc-project-showcaseUk discovery-jisc-project-showcase
Uk discovery-jisc-project-showcase
RDTF-Discovery
 
Microservices in LoCloud
Microservices in LoCloud Microservices in LoCloud
Microservices in LoCloud
locloud
 
Integrating NLP using Linked Data
Integrating NLP using Linked DataIntegrating NLP using Linked Data
Integrating NLP using Linked Data
Sebastian Hellmann
 
Incubating Apache Linda (ApacheCon Europe 2012)
Incubating Apache Linda (ApacheCon Europe 2012)Incubating Apache Linda (ApacheCon Europe 2012)
Incubating Apache Linda (ApacheCon Europe 2012)
Sergio Fernández
 
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenSem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Vladimir Alexiev, PhD, PMP
 
20170501 Distributed Network of Digital Heritage Information
20170501  Distributed Network of Digital Heritage Information20170501  Distributed Network of Digital Heritage Information
20170501 Distributed Network of Digital Heritage Information
Enno Meijers
 
Medical Heritage Library (MHL) on ArchiveSpark
Medical Heritage Library (MHL) on ArchiveSparkMedical Heritage Library (MHL) on ArchiveSpark
Medical Heritage Library (MHL) on ArchiveSpark
Helge Holzmann
 
Web Data Engineering - A Technical Perspective on Web Archives
Web Data Engineering - A Technical Perspective on Web ArchivesWeb Data Engineering - A Technical Perspective on Web Archives
Web Data Engineering - A Technical Perspective on Web Archives
Helge Holzmann
 
Present and future of unified, portable and efficient data processing with Ap...
Present and future of unified, portable and efficient data processing with Ap...Present and future of unified, portable and efficient data processing with Ap...
Present and future of unified, portable and efficient data processing with Ap...
DataWorks Summit
 
GLENNA: The Nordic cloud
GLENNA: The Nordic cloud GLENNA: The Nordic cloud
GLENNA: The Nordic cloud
EOSC-hub project
 
Linking Open Data
Linking Open DataLinking Open Data
Linking Open Data
Stefan Gradmann
 
Local content in a Europeana cloud for small & medium content providers
Local content in a Europeana cloud for small & medium content providersLocal content in a Europeana cloud for small & medium content providers
Local content in a Europeana cloud for small & medium content providers
locloud
 
Putting the L in front: from Open Data to Linked Open Data
Putting the L in front: from Open Data to Linked Open DataPutting the L in front: from Open Data to Linked Open Data
Putting the L in front: from Open Data to Linked Open Data
Martin Kaltenböck
 
ICIC 2017: Building a Linked Data Knowledge Graph for the Scholarly Publishin...
ICIC 2017: Building a Linked Data Knowledge Graph for the Scholarly Publishin...ICIC 2017: Building a Linked Data Knowledge Graph for the Scholarly Publishin...
ICIC 2017: Building a Linked Data Knowledge Graph for the Scholarly Publishin...
Dr. Haxel Consult
 
Towards a Linked Data Publishing Methodology
Towards a Linked Data Publishing MethodologyTowards a Linked Data Publishing Methodology
Towards a Linked Data Publishing Methodology
Danube University Krems, Centre for E-Governance
 
Present and future of unified, portable, and efficient data processing with A...
Present and future of unified, portable, and efficient data processing with A...Present and future of unified, portable, and efficient data processing with A...
Present and future of unified, portable, and efficient data processing with A...
DataWorks Summit
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
Marin Dimitrov
 
ArchiveSpark at CEDWARC workshop 2019
ArchiveSpark at CEDWARC workshop 2019ArchiveSpark at CEDWARC workshop 2019
ArchiveSpark at CEDWARC workshop 2019
Helge Holzmann
 
EuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage informationEuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage information
Enno Meijers
 

Similar to Linguistic Linked Open Data, Challenges, Approaches, Future Work (20)

CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage information
 
Uk discovery-jisc-project-showcase
Uk discovery-jisc-project-showcaseUk discovery-jisc-project-showcase
Uk discovery-jisc-project-showcase
 
Microservices in LoCloud
Microservices in LoCloud Microservices in LoCloud
Microservices in LoCloud
 
Integrating NLP using Linked Data
Integrating NLP using Linked DataIntegrating NLP using Linked Data
Integrating NLP using Linked Data
 
Incubating Apache Linda (ApacheCon Europe 2012)
Incubating Apache Linda (ApacheCon Europe 2012)Incubating Apache Linda (ApacheCon Europe 2012)
Incubating Apache Linda (ApacheCon Europe 2012)
 
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenSem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
 
20170501 Distributed Network of Digital Heritage Information
20170501  Distributed Network of Digital Heritage Information20170501  Distributed Network of Digital Heritage Information
20170501 Distributed Network of Digital Heritage Information
 
Medical Heritage Library (MHL) on ArchiveSpark
Medical Heritage Library (MHL) on ArchiveSparkMedical Heritage Library (MHL) on ArchiveSpark
Medical Heritage Library (MHL) on ArchiveSpark
 
Web Data Engineering - A Technical Perspective on Web Archives
Web Data Engineering - A Technical Perspective on Web ArchivesWeb Data Engineering - A Technical Perspective on Web Archives
Web Data Engineering - A Technical Perspective on Web Archives
 
Present and future of unified, portable and efficient data processing with Ap...
Present and future of unified, portable and efficient data processing with Ap...Present and future of unified, portable and efficient data processing with Ap...
Present and future of unified, portable and efficient data processing with Ap...
 
GLENNA: The Nordic cloud
GLENNA: The Nordic cloud GLENNA: The Nordic cloud
GLENNA: The Nordic cloud
 
Linking Open Data
Linking Open DataLinking Open Data
Linking Open Data
 
Local content in a Europeana cloud for small & medium content providers
Local content in a Europeana cloud for small & medium content providersLocal content in a Europeana cloud for small & medium content providers
Local content in a Europeana cloud for small & medium content providers
 
Putting the L in front: from Open Data to Linked Open Data
Putting the L in front: from Open Data to Linked Open DataPutting the L in front: from Open Data to Linked Open Data
Putting the L in front: from Open Data to Linked Open Data
 
ICIC 2017: Building a Linked Data Knowledge Graph for the Scholarly Publishin...
ICIC 2017: Building a Linked Data Knowledge Graph for the Scholarly Publishin...ICIC 2017: Building a Linked Data Knowledge Graph for the Scholarly Publishin...
ICIC 2017: Building a Linked Data Knowledge Graph for the Scholarly Publishin...
 
Towards a Linked Data Publishing Methodology
Towards a Linked Data Publishing MethodologyTowards a Linked Data Publishing Methodology
Towards a Linked Data Publishing Methodology
 
Present and future of unified, portable, and efficient data processing with A...
Present and future of unified, portable, and efficient data processing with A...Present and future of unified, portable, and efficient data processing with A...
Present and future of unified, portable, and efficient data processing with A...
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
ArchiveSpark at CEDWARC workshop 2019
ArchiveSpark at CEDWARC workshop 2019ArchiveSpark at CEDWARC workshop 2019
ArchiveSpark at CEDWARC workshop 2019
 
EuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage informationEuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage information
 

More from Sebastian Hellmann

DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016
Sebastian Hellmann
 
LD4LT Roadmap session 19_02_2015
LD4LT Roadmap session 19_02_2015LD4LT Roadmap session 19_02_2015
LD4LT Roadmap session 19_02_2015
Sebastian Hellmann
 
NIF 2.0 Tutorial: Content Analysis and the Semantic Web
NIF 2.0 Tutorial: Content Analysis and the Semantic Web  NIF 2.0 Tutorial: Content Analysis and the Semantic Web
NIF 2.0 Tutorial: Content Analysis and the Semantic Web
Sebastian Hellmann
 
NIF 2.0 Phd thesis intermediate report
NIF 2.0 Phd thesis intermediate reportNIF 2.0 Phd thesis intermediate report
NIF 2.0 Phd thesis intermediate report
Sebastian Hellmann
 
Navigation-induced Knowledge Engineering by Example
 Navigation-induced Knowledge Engineering by Example Navigation-induced Knowledge Engineering by Example
Navigation-induced Knowledge Engineering by Example
Sebastian Hellmann
 
Improving the Performance of the DL-Learner SPARQL Component for Semantic We...
Improving the Performance of the  DL-Learner SPARQL Component for Semantic We...Improving the Performance of the  DL-Learner SPARQL Component for Semantic We...
Improving the Performance of the DL-Learner SPARQL Component for Semantic We...
Sebastian Hellmann
 
NIF 2.0 draft for Pisa
NIF 2.0 draft for PisaNIF 2.0 draft for Pisa
NIF 2.0 draft for Pisa
Sebastian Hellmann
 
Linked Data in Linguistics for NLP and Web Annotation
Linked Data in Linguistics for NLP and Web AnnotationLinked Data in Linguistics for NLP and Web Annotation
Linked Data in Linguistics for NLP and Web Annotation
Sebastian Hellmann
 
Introduction to LDL 2012
Introduction to LDL 2012Introduction to LDL 2012
Introduction to LDL 2012
Sebastian Hellmann
 
Thesis presentation
Thesis presentationThesis presentation
Thesis presentation
Sebastian Hellmann
 
NIF - Version 1.0 - 2011/10/23
Linguistic Linked Open Data, Challenges, Approaches, Future Work

  • 1. Linguistic Linked Open Data (LLOD): Challenges, Approaches, Future Work. Sebastian Hellmann, TKE 2016
  • 2. Sebastian Hellmann - AKSW/KILT Copenhagen TKE 2016
      AKSW / KILT in Leipzig
      Leipzig has become one of the largest Semantic Web centers.
      AKSW has 4 subgroups and 45 PhD students: http://aksw.org/Team.html
      Current position:
      - Head of the AKSW / KILT research group (8 PhD students)
      - Knowledge Integration and Language Technology (KILT): http://aksw.org/Groups/KILT.html
      - Project manager for 2 H2020 projects and 1 German research project (BMWi)
      - http://freme-project.eu/ , http://aligned-project.eu/ , http://smartdataweb.de/
      - Executive Director of the DBpedia Association: http://dbpedia.org
  • 3. Outline
      ● The vision behind Linked Data - a technological introduction
      ● Linguistic Linked Open Data
      ● Knowledge Modelling vs. Data Encoding
      ● LIDER
      ● Challenges and Approaches
  • 5. Web of Data
      WWW vs. GGG - https://en.wikipedia.org/wiki/Giant_Global_Graph
      Data on the Web vs. the Web of Data vs. the Semantic Web
      RDF - Entity Attribute Value - http://dbpedia.org/resource/Copenhagen
      Three ways to publish RDF:
      1. Linked Data: resource-level access via HTTP request (next slide)
      2. SPARQL: query access via a triplestore database
      3. Dump: dataset-level access via bulk download
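The first two access modes on the Web of Data slide can be sketched as follows. This is a minimal illustration that only builds the requests and does not send them. It assumes that the server answers an `Accept: text/turtle` header with RDF (content negotiation) and that a SPARQL endpoint is reachable at `http://dbpedia.org/sparql`; both are assumptions, not statements from the slides.

```python
from urllib.parse import urlencode
from urllib.request import Request

RESOURCE = "http://dbpedia.org/resource/Copenhagen"  # URI taken from the slide
SPARQL_ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint location


def linked_data_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build (but do not send) a lookup of one resource via content negotiation."""
    return Request(uri, headers={"Accept": media_type})


def sparql_request_url(endpoint: str, query: str) -> str:
    """Build the GET URL for a SPARQL query, passed as the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# 1. Linked Data: ask for the description of a single resource.
request = linked_data_request(RESOURCE)

# 2. SPARQL: ask the triplestore for the same resource's properties.
query = f"SELECT ?p ?o WHERE {{ <{RESOURCE}> ?p ?o }} LIMIT 10"
url = sparql_request_url(SPARQL_ENDPOINT, query)
```

Passing `request` or `url` to `urllib.request.urlopen` would perform the actual lookup. The third mode, the dump, is an ordinary file download and is not sketched here.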
  • 6. Linked Data
      Four rules of https://www.w3.org/DesignIssues/LinkedData
      1. Use URIs as names for things.
      2. Use HTTP URIs so that people can look up those names.
      3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
      4. Include links to other URIs, so that they can discover more things.
      https://en.wikipedia.org/wiki/Copenhagen vs. http://dbpedia.org/resource/Copenhagen
      Source: https://www.w3.org/DesignIssues/LinkedData.html
  • 7. Open Data != Open Data
      Open Access vs. Open License
      Open Access means accessible like a web page (often with an unclear license).
      http://opendefinition.org by OKFN: “Knowledge is open if anyone is free to access, use, modify, and share it — subject, at most, to measures that preserve provenance and openness.”
  • 8. http://lod-cloud.net/
  • 9. How is the Linked Data Cloud built?
      - Open Access as the basis
      - 50 links between things required to receive a dataset link
      - http://lov.okfn.org
      - http://datahub.io
      - Assessing Quantity and Quality of Links Between Linked Data Datasets by Cir Sebastian Hellmann, Kay Müller, and Martin Brümmer in LDOW 2016 http://ev org/ldow2016/papers/LDOW2016_paper_09.pdf
  • 11. Linguistic Linked Open Data
      ● The movement originated in the Working Group for Open Data in Linguistics (OWLG) at the Open Knowledge Foundation (OKFN)
      ● "Open" is supposed to mean an open license
      ● Join the community mailing list at http://linguistics.okfn.org/
      ● Current information at http://linguistic-lod.org/ , maintained by John McCrae -> instructions on how to join the LLOD cloud
  • 13. February 2012. Linked Data in Linguistics. Representing Language Data and Metadata (http://www.springer.com/computer/ai/book/978-3-642-28248-5). Christian Chiarcos, Sebastian Nordhoff, and Sebastian Hellmann (Eds.). Springer, Heidelberg (2012)
  • 15. Sept 2012. MLODE. Special Issue on Multilingual Linked Open Data (MLOD). Editors: Sebastian Hellmann, Steven Moran, Martin Brümmer and John McCrae. Semantic Web, vol. 6, no. 4, pp. 315-317, 2015
  • 17. Sep 2013. LIDER FP7 EU Project. Start: Nov 2013. Duration: 2 years. http://lider-project.eu/
  • 18. May 2014. LIDER FP7 EU Project. Start: Nov 2013. Duration: 2 years. http://lider-project.eu/
  • 19. Nov 2014. LIDER FP7 EU Project. Start: Nov 2013. Duration: 2 years. http://lider-project.eu/
  • 20. May 2015. LIDER FP7 EU Project. Start: Nov 2013. Duration: 2 years. http://lider-project.eu/
  • 21. May 2016. LIDER FP7 EU Project. Start: Nov 2013. Duration: 2 years. http://lider-project.eu/
  • 23. Should we all use Linked Data?
  • 24. Should we all use Linked Data? When should we use linked data? How should we use linked data? When should we not use it?
  • 25. Knowledge Modeling vs. Data Encoding
  • 26. Entity Relationship Diagrams and UML. The Metadata Ecosystem of the DataId Ontology, Markus Freudenberg, submitted to MTSR Conf 2016. http://dataid.dbpedia.org
  • 27. XML encoding variants
  • 28. XML encoding variants
  • 29. XML encoding variants
      <same> should be symmetric, reflexive and transitive: https://en.wikipedia.org/wiki/Equivalence_relation
      Apples and oranges
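The requirement that a "same" relation be reflexive, symmetric and transitive can be checked mechanically. The following is a minimal sketch over a plain set of pairs; the function name and the apple/orange data are illustrative only and do not come from the slides.

```python
from itertools import product


def is_equivalence(elements: set, pairs: set) -> bool:
    """Return True if `pairs` is an equivalence relation on `elements`."""
    # Reflexive: every element is "same" as itself.
    reflexive = all((a, a) in pairs for a in elements)
    # Symmetric: a same b implies b same a.
    symmetric = all((b, a) in pairs for (a, b) in pairs)
    # Transitive: a same b and b same c implies a same c.
    transitive = all(
        (a, d) in pairs
        for (a, b), (c, d) in product(pairs, repeat=2)
        if b == c
    )
    return reflexive and symmetric and transitive


things = {"apple", "orange"}
one_way = {("apple", "apple"), ("orange", "orange"), ("apple", "orange")}
both_ways = one_way | {("orange", "apple")}
```

Here `is_equivalence(things, one_way)` is false because the link is stated in one direction only, while `is_equivalence(things, both_ways)` is true. A tag whose usage is never checked against these three properties can end up equating apples and oranges.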
  • 30. Who can you ask what XML tags and structure mean and what they are used for?
  • 31. Who can you ask what XML tags and structure mean and what they are used for?
  • 32. Internationalization Tag Set (ITS) 2.0 http://www.w3.org/TR/its20/
      ● W3C Recommendation since 29 October 2013
      ● Defines how to embed Machine Translation and Localisation annotations, so-called Data Categories, in (X)HTML and XML
      ● In addition to the human-readable document, two ontologies are referenced that capture the semantics of the standard
      ● ITS Ontology as companion
      ● NLP Interchange Format (NIF) is the recommended format for RDF conversion of ITS 2.0: http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core
  • 33. Internationalization Tag Set (ITS) 2.0. One of the most efficient and robust ways to annotate HTML in a standardized manner
  • 34. NLP Interchange Format 2.0 (old example)
  • 35. NLP Interchange Format 2.0 (old example)
  • 36. NIF 2.1 release pending
      Join the W3C Community Group: https://www.w3.org/community/ld4lt/
      NIF is useful for:
      ● Adding semantics to NLP tool output and corpora
      ● Providing and publishing identifiers for text and annotations
      NIF is compact and scalable (cf. http://wiki-link.nlp2rdf.org/ ):
      ● Google Wikilinks Corpus with 10.6 million webpages and 31.5 million Wikipedia links (about 3 per page) with a zipped size of 180 GB
      ● 533 million triples (other formats 7-27% more)
      ● 79 GB (12 GB gzipped dumps) in Turtle format (original size 180 GB containing HTML markup)
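How an identifier for a piece of text might be minted can be sketched as follows. This assumes an offset-based scheme in which the fragment `#char=begin,end` (in the style of RFC 5147) addresses a character range of a document; the NIF specification remains the normative source for the exact scheme. The document URI and the sentence are hypothetical.

```python
from typing import Tuple


def offset_uri(document_uri: str, text: str, begin: int, end: int) -> Tuple[str, str]:
    """Return an offset-based identifier and the substring it addresses.

    Offsets are zero-based character positions; `end` is exclusive.
    """
    if not 0 <= begin <= end <= len(text):
        raise ValueError("offsets out of range")
    return f"{document_uri}#char={begin},{end}", text[begin:end]


text = "Copenhagen is the capital of Denmark."
uri, surface = offset_uri("http://example.org/doc1", text, 0, 10)
```

The resulting `uri` can serve as the subject of RDF statements about the annotated string `surface`, for example a link from the mention to http://dbpedia.org/resource/Copenhagen.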
  • 37. LIDER: Towards a linguistic linked data ecosystem. Website: http://lider-project.eu Guidelines: http://lider-project.eu/?q=guidelines
  • 38. NIF
  • 39. LIDER - Deliverable 2.1.2 http://www.lider-project.eu/sites/default/files/D2.1.2-Phase-II.pdf
  • 40. LIDER Reference Architecture Deliverable 3.1.2
      General: lemon - developed by
      http://www.lider-project.eu/sites/default/files/D3.1.2-v2.0.pdf
  • 41. Challenges and Work in Progress
  • 42. Identifier management
      - Ideal identifiers are stable, i.e. the meaning behind the URI does not change
      - Unrealistic for most use cases
      - Easier for individuals, i.e. persons, organisations
      - Non-trivial for terminology
      Proposals:
      1. Apply software development practices, i.e. versioning, update scripts: http://vocol.org , http://github.org , http://aligned-project.eu
      2. ??
  • 43. Knowledge Fusion
      - Linking is mostly done manually
      - Linking 200 datasets pairwise requires maintenance of about 40000 mappings
      - Adding one dataset after the other makes the result depend on the merge order
      - Ideally we would be able to structure all datasets into clusters before linking
      Proposals:
      1. Under discussion with Erhard Rahm: The Case for Holistic Data Integration, ADBIS 2016 Keynote: http://adbis2016.vsb.cz/keynote/ (to appear)
      2. Apply software development processes: https://github.com/dbpedia/links
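The mapping count on the Knowledge Fusion slide follows from simple combinatorics. The sketch below assumes one mapping per ordered pair of distinct datasets, which gives n(n-1) and is close to the n² = 40000 quoted for n = 200; counting each pair only once halves the number. Which of these counts the slide intends is an assumption.

```python
def directed_mappings(n: int) -> int:
    """Mappings needed if every dataset is mapped to every other, per direction."""
    return n * (n - 1)


def undirected_mappings(n: int) -> int:
    """Mappings needed if one mapping per unordered pair of datasets suffices."""
    return n * (n - 1) // 2


datasets = 200
per_direction = directed_mappings(datasets)  # 39800
per_pair = undirected_mappings(datasets)  # 19900
```

Either way the number of mappings grows quadratically with the number of datasets, which is why manual pairwise linking does not scale.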
  • 44. The Metadata Challenge
      Where to publish metadata for your data?
      - Barrier between data and dataset description
      - Stale metadata
      - Single point of truth missing
      - Metadata too heterogeneous
      - Download link missing
      - No (sufficiently) complete view over the web of data possible, discovery failure
      Proposals:
      1. Build an index: http://linghub.lider-project.eu/ (Clarin, LRE Map, Metashare, Datahub)
      2. Create a better schema: http://dataid.dbpedia.org and provide benefits for complying
  • 45. MMoOn
      - LIDER
      - Lemon
      - ODRL
      - Olia
      - NIF
      - Morphology is quite complex
      - Specific to the language and to the linguist
      - http://mmoon.org
  • 46. The Metadata Challenge 2
      ● RDF structure is too simple to keep additional metadata
        ○ Scope
        ○ Validity
        ○ Confidence
        ○ Technical metadata, i.e. collection time
      Contextualisation is probably already better researched in lexicography than in the Semantic Web.
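The limitation named on this slide can be made concrete with a small data-structure sketch. A bare triple has exactly three slots, so scope, validity, confidence and collection time need a wrapper around the statement. The class and field names below are hypothetical and the example values are invented for illustration; this is not a proposal from the talk.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# A plain RDF-style statement: subject, predicate, object and nothing else.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class AnnotatedStatement:
    """A triple plus the kinds of metadata listed on the slide (hypothetical model)."""

    triple: Triple
    scope: Optional[str] = None  # context in which the statement holds
    valid_from: Optional[str] = None  # validity interval, ISO 8601 dates
    valid_until: Optional[str] = None
    confidence: Optional[float] = None  # between 0.0 and 1.0
    collected_at: Optional[str] = None  # technical metadata: collection time


statement = AnnotatedStatement(
    triple=(
        "http://example.org/term/42",
        "http://example.org/vocab/prefLabel",
        "linked data",
    ),
    scope="terminology",
    confidence=0.9,
    collected_at="2016-06-01",
)
```

Within RDF itself, comparable wrappers are usually built with reification or named graphs, at the cost of extra triples per statement.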
  • 47. Future work and take home messages
  • 48. Data quality and verification
      ● Data quality can be defined and measured with tools.
      ● Test-driven Evaluation of Linked Data Quality by Dimitris Kontokostas, Patrick Westphal, Sören Auer, Sebastian Hellmann, Jens Lehmann, Roland Cornelissen, and Amrapali J. Zaveri in Proceedings of the 23rd International Conference on World Wide Web: http://svn.aksw.org/papers/2014/WWW_Databugger/public.pdf
      ● Current standard:
        ○ https://www.w3.org/TR/shacl/
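The idea of defining data quality as executable tests can be illustrated in a few lines. This sketch is plain Python over tuples and is not the tooling from the cited paper or SHACL, both of which express such constraints declaratively; the predicate URI, the data and the validity rule are made up for the example.

```python
from typing import Callable, Iterable, List, Tuple

Triple = Tuple[str, str, str]


def violations(
    triples: Iterable[Triple],
    predicate: str,
    is_valid: Callable[[str], bool],
) -> List[Triple]:
    """Return every triple with the given predicate whose object fails the check."""
    return [t for t in triples if t[1] == predicate and not is_valid(t[2])]


POPULATION = "http://example.org/vocab/population"  # hypothetical property

data = [
    ("http://example.org/city/A", POPULATION, "591481"),
    ("http://example.org/city/B", POPULATION, "unknown"),
    ("http://example.org/city/B", "http://example.org/vocab/label", "B"),
]

# Test case: every population value must consist of digits only.
failed = violations(data, POPULATION, str.isdigit)
```

Counting `failed` against the number of triples that use the property turns the test into a measurable quality score for the dataset.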
  • 49. Open licenses in research (flowchart)
      Decision nodes: "Are you willing to publish your data under an open license?" and "Can you make a product out of your data?"
      Outcome nodes: "Congratulations, your paper has been accepted" and "Good luck, we wish you all the best and a high profit"
      Branch labels: Start, Yes, No
  • 50. Entity Linking Verification - new translator job profile
      ● http://www.freme-project.eu/
      ● Business case: integrating semantic enrichment into multilingual content in translation and localisation
      ● In the future, translators and lexicographers might be asked to judge entity linking and verify data
  • 51. Should I invest in publishing linked data?
      Long-term data strategy, if:
      ● You expect many inbound links
      ● Persistent IDs and
      ● Long-term hosting and curation are no problem for you
      -> yes (data value increases)
      One-time thing:
      ● Interest of externals only in the yellow zone -> publish under an open license (let someone else do it)
  • 52. DBpedia Association DBpedia+
      ● Maintain identifier space
      ● Add open and member data to DBpedia+
      ● Add data following the LIDER guidelines
      ● Ability to add your backlinks
      DBpedia Community meeting on the 15th of September in Leipzig
  • 53. Events in 2016
      ● KEKI 2016 Workshop - Uses of Linguistic Linked Open Data http://keki2016.linguistic-lod.org/ Deadline is 1st of July, but might be extended
      ● http://2016.semantics.cc