Turning Data into Knowledge – profiling and interlinking Web datasets 
Stefan Dietze 
L3S Research Center 
- KESW2014 - 
30/09/14 
Introduction
http://www.l3s.de/
Research areas
• Web science, Information Retrieval, Semantic Web & Linked Data, data & knowledge integration (mapping, classification, interlinking)
• Application domains: education/TEL, Web archiving, …
Recent work on Linked Data exploration/discovery/search
• Entity interlinking & dataset interlinking recommendation
• Dataset profiling
• Data consistency & conflicts
Some projects
• See also: http://purl.org/dietze
Linked Data is awesome, but...
“HTTP-accessibility” (SPARQL, URI dereferencing)
“Structure” & “Semantics” (=> shared/linked vocabularies)
“Interlinked”
“Persistent”
Hm, really?
…why are there so few datasets actually used?
Data reuse and in-links focus on trusted “reference graphs” such as DBpedia, Freebase etc.
Long tail of LD datasets which are neither reused nor linked to (LOD Cloud alone: 300+ datasets, 50 bn triples)
Explanations?
Linked data is more diverse (and messy) than we think 
Accessibility of datasets?
Less than 50% of all SPARQL endpoints are actually responsive at any given point in time [Buil-Aranda2013]
“THE” SPARQL protocol? No, but many variants & subsets
[Figure: SPARQL endpoint availability over time [Buil-Aranda et al 2013]]
“Semantics”, links, quality? 
…data accuracy (eg DBpedia)? [Paulheim2013] 
…vocabulary reuse? [D’AquinWebSci13] 
…schema compliance (RDFS, schemas) [HoganJWS2012] 
SPARQL Web-Querying Infrastructure: Ready for Action?, Buil-Aranda, C., Hogan, A., Umbrich, J., Vandenbussche, P.-Y., International Semantic Web Conference 2013 (ISWC2013).
Assessing the Educational Linked Data Landscape, D’Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
Type Inference on Noisy RDF Data, Paulheim, H., Bizer, C., Semantic Web – ISWC 2013, Lecture Notes in Computer Science Volume 8218, 2013, pp 510-525.
An Empirical Survey of Linked Data Conformance, Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S., Journal of Web Semantics 14, 2012.
What about data consistency? 
Analyzing Relative Incompleteness of Movie Descriptions in the Web of Data: A Case Study, Yuan, W., Demidova, E., Dietze, S., Zhu, X., International Semantic Web Conference 2014 (ISWC2014) 
Too many/diverse datasets, too little knowledge 
Topics? Which datasets are useful & trustworthy for case XY (eg “learning about the solar system”)? Which topics are covered?
Types? Which datasets describe statistics, videos, slides, publications etc?
Quality? Currency, dynamics, accessibility/reliability, data quantity & quality?
Data mapping, linking and profiling
[Diagram: Dataset Catalog/Registry with Dataset Metadata; example datasets BBC Programme (po:Programme) and Yovisto Video (yov:Video); vocabularies BIBO, AAISO, FOAF; mapped type bibo:Film; DBpedia topics db:Astronomy, db:Astro. Objects]
BBC Programme
<po:Programme …> <po:Series>Wonders of the Solar System</.> <po:Actor>Brian Cox</…> </po:Programme…>
Yovisto Video
<yo:Video …> <dc:title>Pluto & the Dwarf Planets</dc:title> … </yo:Video…>
Schema mappings [WebSci13]
Entity & dataset disambiguation & linking [ESWC13]
Topic profile extraction [WWW13, ESWC14]
Schemas/vocabularies on the Web: XKCD 927 
https://xkcd.com/927/ 
schemas & vocabularies 
Schema assessment and mapping 
Co-occurrence of data types (in 146 datasets: 144 vocabularies, 588 highly overlapping types, 719 properties)
Assessing the Educational Linked Data Landscape, D’Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
[Figure: co-occurrence graph of data types; visible labels: po:Programme, sioc:Item, yov:Video]
Co-occurrence after mapping into most frequent schemas (201 frequent types mapped into 79 classes)
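One way to read the co-occurrence statistic: two types co-occur when the same dataset uses both. The following is a minimal Python sketch under that assumption, not the procedure from [WebSci13]. The dataset names, type sets and the type mapping are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical example data: dataset name -> set of types it uses.
DATASET_TYPES = {
    "dataset_a": {"po:Programme", "sioc:Item"},
    "dataset_b": {"yov:Video", "sioc:Item"},
    "dataset_c": {"po:Programme", "yov:Video"},
}

# Hypothetical mapping of source types into a more frequent schema.
TYPE_MAPPING = {"po:Programme": "bibo:Film", "yov:Video": "bibo:Film"}


def cooccurrence(dataset_types, mapping=None):
    """Count, for each unordered type pair, the datasets that use both types."""
    mapping = mapping or {}
    counts = Counter()
    for types in dataset_types.values():
        # Apply the mapping first; unmapped types are kept unchanged.
        mapped = {mapping.get(t, t) for t in types}
        for pair in combinations(sorted(mapped), 2):
            counts[pair] += 1
    return counts


before = cooccurrence(DATASET_TYPES)
after = cooccurrence(DATASET_TYPES, TYPE_MAPPING)
print(before)
print(after)
```

In this toy input the mapping collapses three distinct type pairs into a single pair, which is the kind of consolidation the mapping step aims for.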
[Figure: co-occurrence graph after mapping; visible labels: bibo:Film, bibo:Document, foaf:Document, po:Programme, sioc:Item, yov:Video]
Application: LinkedUp Data Catalog in a nutshell
• RDF (VoID) dataset catalog: browse & query distributed datasets
• Federated queries using type mappings
• Live information about endpoint accessibility
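To illustrate how type mappings can support querying across datasets, here is a minimal Python sketch that expands a query for one common class into a query over all types mapped to it. This is an assumption about how such a rewrite could look, not the catalog's actual implementation. The mapping table, the function name `expand_type_query` and the `yov` namespace IRI are hypothetical.

```python
# Hypothetical mapping: common class -> dataset-specific types mapped to it.
TYPE_MAPPINGS = {
    "bibo:Film": ["po:Programme", "yov:Video"],
}

# Namespace IRIs used for the PREFIX lines; the yov IRI is a placeholder.
PREFIXES = {
    "bibo": "http://purl.org/ontology/bibo/",
    "po": "http://purl.org/ontology/po/",
    "yov": "http://example.org/yovisto#",
}


def expand_type_query(common_class, mappings=TYPE_MAPPINGS, limit=10):
    """Build a SPARQL query matching the common class and all mapped types."""
    types = [common_class] + mappings.get(common_class, [])
    used = sorted({t.split(":", 1)[0] for t in types})
    prefix_lines = [f"PREFIX {p}: <{PREFIXES[p]}>" for p in used]
    values = " ".join(types)
    body = (
        "SELECT ?resource ?type WHERE {\n"
        f"  VALUES ?type {{ {values} }}\n"
        "  ?resource a ?type .\n"
        f"}} LIMIT {limit}"
    )
    return "\n".join(prefix_lines + [body])


query = expand_type_query("bibo:Film")
print(query)
```

The resulting query string could then be sent to each catalogued endpoint in turn; dispatching and merging results are left out of the sketch.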
http://data.linkededucation.org/linkedup/catalog/ 
http://datahub.io/group/linked-education 
DBpedia categories
Towards profiling: dataset disambiguation/linking
[Diagram: BBC Programme (po:Programme) and Yovisto Video (yov:Video) example instances with candidate links, marked “?”, to db:Astro. Objects and db:CartoonCharacters]
Relatedness of entities, meaningfulness of paths? [ESWC13]
Extraction of “topics” & relatedness of datasets [ESWC14]
Dataset disambiguation/linking
[Diagram: BBC Programme (po:Programme) and Yovisto Video (yov:Video) example instances connected via db:Pluto (Dwarf Planet), db:Sun, db:Astronomical Objects and db:Astronomy; example scores: SCS = 0.32, CBM = 0.24]
Computation of connectivity scores between entities
Combination of (i) a semantic (graph-based) connectivity score (SCS) with (ii) a Web co-occurrence-based measure (CBM) (similar to the Normalized Google Distance, NGD)
For (i): adaptation of the Katz index from social network analysis (SNA) for (linked) data graphs (considering path number and path lengths of transversal properties)
Combining a co-occurrence-based and a semantic measure for entity linking, B. P. Nunes, S. Dietze, M.A. Casanova, R. Kawase, B. Fetahu, and W. Nejdl, ESWC 2013 - 10th Extended Semantic Web Conference (May 2013).
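The graph-based score can be illustrated with a truncated Katz-style index: paths between two entities are counted per length and longer paths are damped. The Python sketch below is a simplified reading, not the exact SCS formula from [ESWC13] (it omits any normalisation). The damping factor `beta`, the path length bound `max_len` and the toy graph edges are assumptions for illustration.

```python
# Hypothetical undirected toy graph as adjacency lists.
GRAPH = {
    "db:Pluto": ["db:Astronomical_Objects"],
    "db:Sun": ["db:Astronomical_Objects", "db:Astronomy"],
    "db:Astronomical_Objects": ["db:Pluto", "db:Sun", "db:Astronomy"],
    "db:Astronomy": ["db:Astronomical_Objects", "db:Sun"],
}


def count_paths_by_length(graph, source, target, max_len):
    """Count simple paths from source to target for each length 1..max_len."""
    counts = [0] * (max_len + 1)

    def walk(node, visited, depth):
        if depth == max_len:
            return
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                counts[depth + 1] += 1
            elif neighbour not in visited:
                walk(neighbour, visited | {neighbour}, depth + 1)

    walk(source, {source}, 0)
    return counts


def katz_score(graph, source, target, beta=0.5, max_len=3):
    """Sum beta**length over all simple paths up to max_len edges."""
    counts = count_paths_by_length(graph, source, target, max_len)
    return sum(beta ** length * counts[length] for length in range(1, max_len + 1))


score = katz_score(GRAPH, "db:Pluto", "db:Sun")
print(score)
```

With this toy graph there is one path of length 2 and one of length 3 between the two entities, so shorter connections dominate the score.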
Entity linking: evaluation 
• Evaluation based on USA Today News items (80,000 entity pairs)
• Manually created gold standard (1,000 entity pairs)
• Baseline: Explicit Semantic Analysis (ESA) => CBM/SCS: “relatedness”; ESA: “similarity”
Precision/Recall/F1 for SCS, CBM, ESA. 
„SCS Connector“ demo 
http://lod2.inf.puc-rio.br/scs/SemConnectivities 
SCS Connector – Quantifying and Visualising Semantic Paths between Entity Pairs, Nunes, B. P., Herrera, J. E. T., Taibi, D., Lopes, G. R., Casanova, M. A., Dietze, S., Demo Paper at 11th Extended Semantic Web Conference (ESWC2014), Heraklion, Crete, Greece (2014). *BEST ESWC2014 DEMO AWARD*
Dataset profiling: what's the data about?
[Diagram: Yovisto Video (yov:Video) example instance “Pluto & the Dwarf Planets” in the Dataset Catalog/Registry; Dataset Metadata with db:Pluto (Dwarf Planet), db:Astro. Objects and db:Astronomy]
Extracting representative (DBpedia) categories (“topic profile”) & entities for arbitrary datasets
Sounds easy? But how to do that for 300+ datasets with < 50 bn triples?
Scalability vs representativeness: sampling & ranking for a good scalability/accuracy balance [ESWC2014] (applied to all responsive LOD datasets)
A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles, Fetahu, B., Dietze, S., Nunes, B. P., Casanova, M. A., Nejdl, W., 11th Extended Semantic Web Conference (ESWC2014), Crete, Greece, (2014).
Efficient dataset profiling: method 
1. Sampling of resource instances (random sampling, weighted sampling, resource centrality sampling)
2. Entity and topic extraction (NER via DBpedia Spotlight, category mapping and expansion)
3. Normalisation and ranking (using graphical models such as PageRank with Priors, HITS with Priors and K-Step Markov)
Result: weighted dataset-topic profile graph
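The ranking step can be sketched with a small PageRank-with-priors power iteration over a dataset, entity and category graph. This is a simplified illustration, not the implementation from [ESWC2014]. The edges, node names, damping value and the choice to return dangling-node mass to the prior are all assumptions.

```python
# Hypothetical directed edges: dataset -> entity -> category.
EDGES = [
    ("dataset:yovisto", "db:Pluto"),
    ("dataset:yovisto", "db:Sun"),
    ("db:Pluto", "cat:Dwarf_planets"),
    ("db:Pluto", "cat:Astronomy"),
    ("db:Sun", "cat:Astronomy"),
]


def pagerank_with_priors(edges, priors, damping=0.85, iterations=50):
    """Power iteration where teleportation follows the prior distribution."""
    nodes = sorted({n for edge in edges for n in edge} | set(priors))
    total = sum(priors.values())
    prior = {n: priors.get(n, 0.0) / total for n in nodes}
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = dict(prior)
    for _ in range(iterations):
        nxt = {n: (1 - damping) * prior[n] for n in nodes}
        for n in nodes:
            if out[n]:
                share = damping * rank[n] / len(out[n])
                for dst in out[n]:
                    nxt[dst] += share
            else:
                # Dangling node: return its mass to the prior distribution.
                for m in nodes:
                    nxt[m] += damping * rank[n] * prior[m]
        rank = nxt
    return rank


ranks = pagerank_with_priors(EDGES, priors={"dataset:yovisto": 1.0})
topics = sorted((n for n in ranks if n.startswith("cat:")), key=ranks.get, reverse=True)
print(topics)
```

Putting the whole prior on the dataset node biases the scores towards categories reachable from that dataset, which gives a per-dataset topic ranking.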
Dataset profiling: exploring LOD datasets/topics in a nutshell 
http://data-observatory.org/lod-profiles/ 
Automatic extraction of dataset “topics” [ESWC2014] => RDF/VoID dataset profiles
Visualisation & exploration of dataset-topic graph (datasets, topics, relationships) 
Includes all (responsive) datasets of LOD Cloud 
Dataset profiling: evaluation 
Datasets & Ground Truth
Yovisto, Oxpoints, LAK Dataset, Semantic Web Dogfood
Crowd-sourced topic indicators from datasets (keywords, tags)
Manual mapping to entities & category extraction (ranking according to frequency)
Baselines
1) LDA, 2) tf/idf (applied to entire datasets)
Topic extraction according to our approach, weighting/ranking based on term weight
Measure
NDCG @ rank l
[Figure: NDCG (averaged over all datasets)]
Performance (time/NDCG) for different sampling strategies/sizes etc
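For reference, NDCG at a rank cutoff can be computed as in the short Python sketch below. It uses the common log2 discount and derives the ideal ranking by sorting the same relevance list, which assumes every relevant topic appears somewhere in the ranked list. The relevance values in the example are made up.

```python
import math


def dcg(relevances, cutoff):
    """Discounted cumulative gain over the first `cutoff` positions."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:cutoff]))


def ndcg_at(ranked_relevances, cutoff):
    """NDCG at rank `cutoff` (the rank l above) for one ranked topic list."""
    ideal = dcg(sorted(ranked_relevances, reverse=True), cutoff)
    return dcg(ranked_relevances, cutoff) / ideal if ideal > 0 else 0.0


# Hypothetical graded relevance of the topics returned for one dataset.
example = ndcg_at([0, 1], cutoff=2)
print(example)
```

Averaging `ndcg_at` over the per-dataset rankings gives the aggregate figure reported per method.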
What (dataset) do these categories have in common?
dbp:Category:1955_births 
dbp:Category:People_from_London 
dbp:Category:Buzzwords 
dbp:Category:Semantic_Web 
dbp:Category:Web_Services 
dbp:Category:HTTP 
dbp:Category:Unitarian_Universalists 
dbp:Category:World_Wide_Web 
dbp:Category:Royal_Medal_winners 
Diversity of category profile for a single publication 
Berners-Lee, Tim; Hendler, James; Lassila, Ora (2001). "The Semantic Web". Scientific American Magazine.
foaf:Person 
foaf:Document 
dbp:Tim_Berners-Lee 
dbp:Category:1955_births 
dbp:Category:People_from_London 
dbp:Category:Buzzwords 
dbp:Semantic_Web 
dbp:Category:Semantic_Web 
dbp:Category:Web_Services 
dbp:Category:HTTP 
dbp:Category:Unitarian_Universalists 
first-level categories (dcterms:subject) 
dbp:Category:World_Wide_Web 
dbp:Category:Royal_Medal_winners 
DBLP 
Type-specific exploration of dataset categories
http://data-observatory.org/led-explorer/
Type-specific views on datasets/categories
“Document” (foaf:document)
“Person” (foaf:person)
“Course” (aaiso:course)
Currently applied to datasets in the LinkedUp Catalog only (as schema mappings are already available there)
Exploring type-specific topic profiles of datasets: a demo for educational linked data, Taibi, D., Dietze, S., Fetahu, B., Fulantelli, G., Demo at International Semantic Web Conference 2014 (ISWC2014) 
data.l3s.de – the L3S DataHub
KEYSTONE & PROFILES 2014
KEYSTONE: semantic keyword-based search on structured data sources (2013-2017)
http://www.keystone-cost.eu/
Research network focused on distributed search and dataset profiling, spanning Semantic Web, databases, etc.
Open to new members (beyond Europe)
PROFILES2014 - Dataset PROFIling & fEderated Search for Linked Data
http://www.keystone-cost.eu/profiles
Workshop co-located with ESWC2014
IJSWIS Special Issue on … LD search & profiling
http://www.ijswis.org/?q=node/51/
Deadline 8 December 2014
Summing up
Summary
Increasing amounts of data => require knowledge about nature and relationships of datasets
Profiling: scalable methods for extracting dataset metadata
Interlinking: connectivity of entities or datasets
What about LD evolution?
In RDF graphs (eg LOD Cloud), “all” nodes are connected
Impact of evolution on preservation, linking and enrichment?
Which parts of datasets to preserve (entity “neighbourhood”)? => semantic relatedness/relevance/entity retrieval
Link correctness in evolving LD?
…
Спасибо! Thank You! 
See also (general)
• http://purl.org/dietze
• http://linkedup-project.eu
• http://duraark.eu
• http://data.l3s.de
See also (data)
• http://data.l3s.de
• http://data.linkededucation.org
• http://lak.linkededucation.org
Acknowledgements
Besnik Fetahu (L3S)
Elena Demidova (L3S)
Bernardo Pereira Nunes (PUC Rio)
Marco Casanova (PUC Rio)
Luiz Andre Paes Leme (PUC Rio)
Giseli Lopes (PUC Rio)
Davide Taibi (CNR, IT)
Mathieu d’Aquin (Open University, UK)
and many more…

 
Linked Data for Architecture, Engineering and Construction (AEC)
Linked Data for Architecture, Engineering and Construction (AEC)Linked Data for Architecture, Engineering and Construction (AEC)
Linked Data for Architecture, Engineering and Construction (AEC)
 
LinkedUp - Linked Data Europe Workshop 2014
LinkedUp - Linked Data Europe Workshop 2014LinkedUp - Linked Data Europe Workshop 2014
LinkedUp - Linked Data Europe Workshop 2014
 
Knowledge Graph Introduction
Knowledge Graph IntroductionKnowledge Graph Introduction
Knowledge Graph Introduction
 
BrightTALK - Semantic AI
BrightTALK - Semantic AI BrightTALK - Semantic AI
BrightTALK - Semantic AI
 
Big Data in Learning Analytics - Analytics for Everyday Learning
Big Data in Learning Analytics - Analytics for Everyday LearningBig Data in Learning Analytics - Analytics for Everyday Learning
Big Data in Learning Analytics - Analytics for Everyday Learning
 
OSFair2017 training | Explore, model, analyze and visualize systematic resear...
OSFair2017 training | Explore, model, analyze and visualize systematic resear...OSFair2017 training | Explore, model, analyze and visualize systematic resear...
OSFair2017 training | Explore, model, analyze and visualize systematic resear...
 
A semantic framework and software design to enable the transparent integratio...
A semantic framework and software design to enable the transparent integratio...A semantic framework and software design to enable the transparent integratio...
A semantic framework and software design to enable the transparent integratio...
 
Dc 2014 baierer-droege
Dc 2014 baierer-droegeDc 2014 baierer-droege
Dc 2014 baierer-droege
 
Linked Data vs Open Educational Resources
Linked Data vs Open Educational ResourcesLinked Data vs Open Educational Resources
Linked Data vs Open Educational Resources
 
Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021
 
The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so far
 
Beyond research data infrastructures: exploiting artificial & crowd intellige...
Beyond research data infrastructures: exploiting artificial & crowd intellige...Beyond research data infrastructures: exploiting artificial & crowd intellige...
Beyond research data infrastructures: exploiting artificial & crowd intellige...
 
Linking Open Data with Drupal
Linking Open Data with DrupalLinking Open Data with Drupal
Linking Open Data with Drupal
 
Metadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data RepositoriesMetadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data Repositories
 
Presentation at MTSR 2012
Presentation at MTSR 2012Presentation at MTSR 2012
Presentation at MTSR 2012
 
Information Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudInformation Extraction and Linked Data Cloud
Information Extraction and Linked Data Cloud
 
Camp 4-data workshop presentation
Camp 4-data workshop presentationCamp 4-data workshop presentation
Camp 4-data workshop presentation
 


Turning Data into Knowledge (KESW2014 Keynote)

  • 1. Turning Data into Knowledge – profiling and interlinking Web datasets. Stefan Dietze, L3S Research Center. KESW2014, 30/09/14.
  • 2. Introduction. Recent work on Linked Data exploration/discovery/search: entity interlinking & dataset interlinking recommendation; dataset profiling; data consistency & conflicts. Research areas: Web science, Information Retrieval, Semantic Web & Linked Data, data & knowledge integration (mapping, classification, interlinking). Application domains: education/TEL, Web archiving, … Some projects: http://www.l3s.de/ See also: http://purl.org/dietze
  • 3. Linked Data is awesome, but... …why are there so few datasets actually used? „HTTP-accessibility“ (SPARQL, URI-dereferencing); „Structure“ & „Semantics“ (=> shared/linked vocabularies); „Interlinked“; „Persistent“. Hm, really? Data reuse and in-links are focused on trusted „reference graphs“ such as DBpedia, Freebase etc. There is a long tail of LD datasets which are neither reused nor linked to (LOD Cloud alone: 300+ datasets, 50 bn triples). Explanations?
  • 4. Linked data is more diverse (and messy) than we think. Accessibility of datasets? Less than 50% of all SPARQL endpoints are actually responsive at a given point in time [Buil-Aranda2013]. “THE” SPARQL protocol? No, but many variants & subsets. “Semantics”, links, quality? …data accuracy (eg DBpedia)? [Paulheim2013] …vocabulary reuse? [D’AquinWebSci13] …schema compliance (RDFS, schemas)? [HoganJWS2012] Figure: SPARQL endpoint availability over time [Buil-Aranda et al 2013]. References: SPARQL Web-Querying Infrastructure: Ready for Action?, Buil-Aranda, C., Hogan, A., Umbrich, J., Vandenbussche, P.-Y., International Semantic Web Conference 2013 (ISWC2013). Assessing the Educational Linked Data Landscape, D’Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013. Type Inference on Noisy RDF Data, Paulheim, H., Bizer, C., Semantic Web – ISWC 2013, Lecture Notes in Computer Science Volume 8218, 2013, pp 510-525. An empirical survey of Linked Data conformance, Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S., Journal of Web Semantics 14, 2012.
  • 5. What about data consistency? Analyzing Relative Incompleteness of Movie Descriptions in the Web of Data: A Case Study, Yuan, W., Demidova, E., Dietze, S., Zhu, X., International Semantic Web Conference 2014 (ISWC2014).
  • 6. Too many/diverse datasets, too little knowledge. Topics? Which datasets are useful & trustworthy for case XY (eg „learning about the solar system“)? Which topics are covered? Types? Which datasets describe statistics, videos, slides, publications etc? Quality? Currentness, dynamics, accessibility/reliability, data quantity & quality?
  • 7. Data mapping, linking and profiling. Dataset Catalog/Registry: contains Dataset Metadata (db:Astronomy, db:Astro. Objects; vocabularies BIBO, AAISO, FOAF). Schema mappings [WebSci13]: yov:Video, po:Programme, bibo:Film. Entity & dataset disambiguation & linking [ESWC13]. Topic profile extraction [WWW13, ESWC14]. BBC Programme: <po:Programme …> <po:Series>Wonders of the Solar System</.> <po:Actor>Brian Cox</…> </po:Programme…> Yovisto Video: <yo:Video …> <dc:title>Pluto & the Dwarf Planets</dc:title> … </yo:Video…>
  • 8. Schemas/vocabularies on the Web: XKCD 927 (schemas & vocabularies). https://xkcd.com/927/
  • 9. Schema assessment and mapping. Co-occurrence of data types (in 146 datasets: 144 vocabularies, 588 highly overlapping types, 719 properties). Example types: po:Programme, sioc:Item, yov:Video? Assessing the Educational Linked Data Landscape, D’Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
  • 10. Schema assessment and mapping. Co-occurrence of data types (in 146 datasets: 144 vocabularies, 588 highly overlapping types, 719 properties). Co-occurrence after mapping into most frequent schemas (201 frequent types mapped into 79 classes). Example types: bibo:Film, bibo:Document, po:Programme, sioc:Item, foaf:Document, yov:Video, typeX. Assessing the Educational Linked Data Landscape, D’Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
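The effect of mapping types into frequent schemas on type co-occurrence (slide 10) can be illustrated with a minimal Python sketch. The toy datasets and the mapping of po:Programme and yov:Video to bibo:Film are hypothetical examples chosen for illustration, not the actual mappings from the WebSci2013 study.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: dataset name -> set of types used in that dataset.
DATASET_TYPES = {
    "bbc": {"po:Programme", "foaf:Document"},
    "yovisto": {"yov:Video", "foaf:Document"},
}

# Hypothetical type mapping into a more frequent schema (illustration only).
TYPE_MAPPING = {
    "po:Programme": "bibo:Film",
    "yov:Video": "bibo:Film",
}


def type_cooccurrence(dataset_types, mapping=None):
    """Count how often two types occur together in the same dataset.

    If a mapping is given, each type is first replaced by its mapped class,
    so that equivalent types from different vocabularies are counted as one.
    """
    mapping = mapping or {}
    counts = Counter()
    for types in dataset_types.values():
        mapped = sorted({mapping.get(t, t) for t in types})
        counts.update(combinations(mapped, 2))
    return counts
```

Without the mapping, each of the two datasets contributes a distinct type pair; with the mapping, both collapse into the single pair (bibo:Film, foaf:Document) with count 2, which is the kind of consolidation the slide describes.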
  • 11. Application: LinkedUp Data Catalog in a nutshell. RDF (VoID) dataset catalog: browse & query distributed datasets; federated queries using type mappings; live information about endpoint accessibility. http://data.linkededucation.org/linkedup/catalog/ http://datahub.io/group/linked-education Figure label: DBpedia categories.
  • 12. Towards profiling: dataset disambiguation/linking. Relatedness of entities, meaningfulness of paths? [ESWC13] Extraction of “topics” & relatedness of datasets [ESWC14]. db:Astro. Objects or db:CartoonCharacters? BBC Programme (po:Programme): <po:Programme …> <po:Series>Wonders of the Solar System</.> <po:Actor>Brian Cox</…> </po:Programme…> Yovisto Video (yov:Video): <yo:Video …> <dc:title>Pluto & the Dwarf Planets</dc:title> … </yo:Video…>
  • 13. Dataset disambiguation/linking. Computation of connectivity scores between entities: combination of (i) a semantic (graph-based) connectivity score (SCS) with (ii) a Web co-occurrence-based measure (CBM) (similar to NGD). For (i): adaptation of the Katz index from SNA for (linked) data graphs (considering path number and path lengths of transversal properties). Example entities: db:Pluto (Dwarf Planet), db:Astronomical Objects, db:Sun, db:Astronomy; SCS = 0.32, CBM = 0.24. BBC Programme (po:Programme): <po:Programme …> <po:Series>Wonders of the Solar System</.> <po:Actor>Brian Cox</…> </po:Programme…> Yovisto Video (yov:Video): <yo:Video …> <dc:title>Pluto & the Dwarf Planets</dc:title> … </yo:Video…> Combining a co-occurrence-based and a semantic measure for entity linking, Nunes, B. P., Dietze, S., Casanova, M. A., Kawase, R., Fetahu, B., Nejdl, W., ESWC 2013 - 10th Extended Semantic Web Conference (May 2013).
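The Katz-style idea behind the semantic connectivity score on slide 13 (more and shorter paths between two entities mean a higher score) can be sketched in Python as follows. This is a simplified illustration, not the SCS from the ESWC 2013 paper: the damping factor `beta`, the path length limit `max_len`, the undirected treatment of links, the absence of normalisation and the toy graph are all assumptions, so the scores will not match the values shown on the slide.

```python
from collections import defaultdict

# Hypothetical toy graph of entity links (illustration only).
EDGES = [
    ("Pluto", "Astronomical_objects"),
    ("Sun", "Astronomical_objects"),
    ("Astronomical_objects", "Astronomy"),
    ("Sun", "Astronomy"),
]


def katz_connectivity(edges, source, target, beta=0.5, max_len=3):
    """Truncated Katz-style score: sum over path lengths l of
    beta**l * (number of simple paths of length l from source to target)."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)  # links are treated as undirected (assumption)
        graph[b].add(a)

    path_counts = defaultdict(int)  # path length -> number of simple paths

    def extend(node, visited, length):
        if node == target:
            path_counts[length] += 1
            return
        if length == max_len:
            return
        for nxt in graph[node]:
            if nxt not in visited:
                extend(nxt, visited | {nxt}, length + 1)

    extend(source, {source}, 0)
    return sum(beta ** length * n for length, n in path_counts.items())
```

In the toy graph, Pluto reaches Astronomy through one path of length 2 and one of length 3, so with `beta=0.5` the score is 0.25 + 0.125; an entity with no connecting path scores 0.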
  • 14. Entity linking: evaluation. Evaluation based on USA Today news items (80,000 entity pairs). Manually created gold standard (1,000 entity pairs). Baseline: Explicit Semantic Analysis (ESA) => CBM/SCS: „relatedness“; ESA: „similarity“. Figure: Precision/Recall/F1 for SCS, CBM, ESA. Combining a co-occurrence-based and a semantic measure for entity linking, Nunes, B. P., Dietze, S., Casanova, M. A., Kawase, R., Fetahu, B., Nejdl, W., ESWC 2013 - 10th Extended Semantic Web Conference (May 2013).
  • 15. „SCS Connector“ demo. http://lod2.inf.puc-rio.br/scs/SemConnectivities SCS Connector – Quantifying and Visualising Semantic Paths between Entity Pairs, Nunes, B. P., Herrera, J. E. T., Taibi, D., Lopes, G. R., Casanova, M. A., Dietze, S., Demo Paper at 11th Extended Semantic Web Conference (ESWC2014), Heraklion, Crete, Greece (2014). *BEST ESWC2014 DEMO AWARD*
  • 16. Dataset profiling: what‘s the data about? Extracting representative (DBpedia) categories („topic profile“) & entities for arbitrary datasets. Sounds easy? But how to do that for 300+ datasets with < 50 bn triples? Scalability vs representativeness: sampling & ranking for a good scalability/accuracy balance [ESWC2014] (applied to all responsive LOD datasets). Example: Yovisto Video (yov:Video) <yo:Video …> <dc:title>Pluto & the Dwarf Planets</dc:title> … </yo:Video…> linked to db:Pluto (Dwarf Planet); Dataset Catalog/Registry with Dataset Metadata (db:Astronomy, db:Astro. Objects). A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles, Fetahu, B., Dietze, S., Nunes, B. P., Casanova, M. A., Nejdl, W., 11th Extended Semantic Web Conference (ESWC2014), Crete, Greece (2014).
  • 17. Efficient dataset profiling: method. 1. Sampling of resource instances (random sampling, weighted sampling, resource centrality sampling). 2. Entity and topic extraction (NER via DBpedia Spotlight, category mapping and expansion). 3. Normalisation and ranking (using graphical models such as PageRank with Priors, HITS with Priors and K-Step Markov). Result: weighted dataset-topic profile graph. A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles, Fetahu, B., Dietze, S., Nunes, B. P., Casanova, M. A., Nejdl, W., 11th Extended Semantic Web Conference (ESWC2014), Crete, Greece (2014).
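The ranking step of slide 17 can be illustrated with a small pure-Python sketch of PageRank with priors (personalised PageRank) over a dataset-entity-category graph. It is only a sketch under stated assumptions: the toy graph, the damping value, the fixed iteration count and the choice to place all prior mass on the dataset node are illustrative and are not taken from the ESWC2014 paper.

```python
from collections import defaultdict

# Hypothetical toy graph: a dataset, two extracted entities, two categories.
PROFILE_EDGES = [
    ("dataset:yovisto", "db:Pluto"),
    ("dataset:yovisto", "db:Sun"),
    ("db:Pluto", "Category:Dwarf_planets"),
    ("db:Pluto", "Category:Astronomical_objects"),
    ("db:Sun", "Category:Astronomical_objects"),
]


def pagerank_with_priors(edges, priors, damping=0.85, iterations=50):
    """Power iteration where the restart mass goes to the prior nodes
    instead of being spread uniformly over all nodes."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)  # edges are treated as undirected (assumption)
        neighbours[b].add(a)
    nodes = sorted(neighbours)

    total = sum(priors.get(n, 0.0) for n in nodes)
    prior = {n: priors.get(n, 0.0) / total for n in nodes}

    rank = dict(prior)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * prior[n] for n in nodes}
        for n in nodes:
            share = damping * rank[n] / len(neighbours[n])
            for m in neighbours[n]:
                nxt[m] += share
        rank = nxt
    return rank


def rank_topics(edges, dataset):
    """Return category nodes ordered by their score relative to one dataset."""
    scores = pagerank_with_priors(edges, {dataset: 1.0})
    topics = [n for n in scores if n.startswith("Category:")]
    return sorted(topics, key=lambda n: scores[n], reverse=True)
```

In the toy graph, the category reached through both entities ranks above the one reached through a single entity, which is the intuition behind weighting topics in the dataset-topic profile graph.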
  • 18. Dataset profiling: exploring LOD datasets/topics in a nutshell. http://data-observatory.org/lod-profiles/ Automatic extraction of dataset "topics" [ESWC2014] => RDF/VoID dataset profiles. Visualisation and exploration of the dataset-topic graph (datasets, topics, relationships). Includes all (responsive) datasets of the LOD Cloud.
  • 19. Dataset profiling: evaluation. Datasets & ground truth: Yovisto, Oxpoints, LAK Dataset, Semantic Web Dogfood; crowd-sourced topic indicators from datasets (keywords, tags); manual mapping to entities and category extraction (ranking according to frequency). Baselines: 1) LDA, 2) tf/idf (applied to entire datasets); topic extraction according to our approach, weighting/ranking based on term weight. Measure: NDCG @ rank l; performance (time/NDCG) for different sampling strategies/sizes etc. [Figure: NDCG (averaged over all datasets).]
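For readers unfamiliar with the measure, NDCG at rank l can be computed as in the sketch below. It assumes the common formulation with linear gain and a log2 position discount, and it takes the ideal ranking to be the same relevance scores sorted in descending order; the slides do not state which variant was used, and the relevance scores are made up.

```python
import math


def dcg_at(relevances, l):
    """Discounted cumulative gain over the first l positions."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:l]))


def ndcg_at(ranked_relevances, l):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at(sorted(ranked_relevances, reverse=True), l)
    return dcg_at(ranked_relevances, l) / ideal if ideal > 0 else 0.0


# Hypothetical graded relevance of the top-ranked topics for one dataset.
perfect = ndcg_at([3, 2, 1], 3)
reversed_order = ndcg_at([1, 2, 3], 3)
```

Averaging `ndcg_at` over all evaluated datasets gives a single score per approach and rank l.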
  • 20. What (dataset) do these categories have in common? dbp:Category:1955_births, dbp:Category:People_from_London, dbp:Category:Buzzwords, dbp:Category:Semantic_Web, dbp:Category:Web_Services, dbp:Category:HTTP, dbp:Category:Unitarian_Universalists, dbp:Category:World_Wide_Web, dbp:Category:Royal_Medal_winners.
  • 21. Diversity of category profile for a single publication (DBLP): Berners-Lee, Tim; Hendler, James; Lassila, Ora (2001). "The Semantic Web". Scientific American Magazine. Types: foaf:Person, foaf:Document. Entities: dbp:Tim_Berners-Lee, dbp:Semantic_Web. First-level categories (dcterms:subject): dbp:Category:1955_births, dbp:Category:People_from_London, dbp:Category:Buzzwords, dbp:Category:Semantic_Web, dbp:Category:Web_Services, dbp:Category:HTTP, dbp:Category:Unitarian_Universalists, dbp:Category:World_Wide_Web, dbp:Category:Royal_Medal_winners.
  • 22. Type-specific exploration of dataset categories. http://data-observatory.org/led-explorer/ Type-specific views on datasets/categories: "Document" (foaf:Document), "Person" (foaf:Person), "Course" (aiiso:Course). Currently applied to datasets in the LinkedUp Catalog only (as schema mappings are already available there). Reference: Exploring type-specific topic profiles of datasets: a demo for educational linked data, Taibi, D., Dietze, S., Fetahu, B., Fulantelli, G., Demo at International Semantic Web Conference 2014 (ISWC2014).
  • 23. data.l3s.de: the L3S DataHub
  • 24. KEYSTONE & PROFILES 2014. KEYSTONE (http://www.keystone-cost.eu/): semantic keyword-based search on structured data sources (2013-2017). Research network spanning distributed search, dataset profiling, Semantic Web, databases, etc. Open to new members (beyond Europe). PROFILES2014 (http://www.keystone-cost.eu/profiles): Dataset PROFIling & fEderated Search for Linked Data, workshop co-located with ESWC2014. IJSWIS Special Issue on … LD search & profiling (http://www.ijswis.org/?q=node/51/): deadline 8 December 2014.
  • 25. Summing up. Summary: increasing amounts of data require knowledge about the nature and relationships of datasets. Profiling: scalable methods for extracting dataset metadata. Interlinking: connectivity of entities or datasets. What about LD evolution? In RDF graphs (e.g. LOD Cloud), "all" nodes are connected. Impact of evolution on preservation, linking and enrichment? Which parts of datasets to preserve (entity "neighbourhood")? => semantic relatedness/relevance/entity retrieval. Link correctness in evolving LD? …
  • 26. Thank you! See also (general): http://purl.org/dietze, http://linkedup-project.eu, http://duraark.eu, http://data.l3s.de. See also (data): http://data.l3s.de, http://data.linkededucation.org, http://lak.linkededucation.org. Acknowledgements: Besnik Fetahu (L3S), Elena Demidova (L3S), Bernardo Pereira Nunes (PUC Rio), Marco Casanova (PUC Rio), Luiz Andre Paes Leme (PUC Rio), Giseli Lopes (PUC Rio), Davide Taibi (CNR, IT), Mathieu d'Aquin (Open University, UK) and many more…