Analyzing Statistics with Background Knowledge from Linked Open Data

Petar Ristoski, Heiko Paulheim
10/22/13

Idea
• Background knowledge from LOD can help
– finding explanations
– creating more sophisticated visualizations

• Steps taken
– linking the statistics datasets to LOD datasets
  • DBpedia, Eurostat, GADM, Linked Geo Data
– extracting features

• Finding correlations with unemployment rate
– using only one target variable for demonstration purposes
– works for arbitrary target variables

Linking to LOD Datasets
• Linking to DBpedia
– using DBpedia Lookup
– restricting results to Place and AdministrativeArea
– selecting among multiple results by minimum edit distance

• Linking to Eurostat
– using SPARQL to query for labels
– querying for word 1-grams, 2-grams, … from original labels
– selecting by minimum edit distance
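The label-based linking above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the candidate list stands in for labels returned by a SPARQL query, and plain Levenshtein distance is assumed as the edit distance.

```python
from typing import Iterable, List, Optional, Tuple


def word_ngrams(label: str) -> List[str]:
    """Return all contiguous word n-grams of a label (1-grams, 2-grams, ...)."""
    words = label.split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, len(words) + 1)
        for i in range(len(words) - n + 1)
    ]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_match(label: str, candidates: Iterable[str]) -> Optional[Tuple[str, int]]:
    """Pick the candidate label closest to the original label (case-insensitive).

    `candidates` is a hypothetical stand-in for the labels a lookup service
    or SPARQL endpoint returned for the n-gram queries.
    """
    scored = [(c, edit_distance(label.lower(), c.lower())) for c in candidates]
    return min(scored, key=lambda pair: pair[1]) if scored else None
```

In this sketch, word_ngrams would supply the query terms, and best_match would then choose among whatever labels those queries return.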

Linking to LOD Datasets
• Linking to GADM
– searching by name turned out to be error-prone
– searching by coordinates (from DBpedia) is precise
  • but suffers from low recall
– two-stage approach:
  • searching by coordinates
  • searching by average coordinates of all linked objects (in DBpedia)

• Total figures:
– France regions: 27/27 DBpedia, 26/27 Eurostat, 27/27 GADM
– France departments: 101/101 DBpedia, 101/101 GADM
– Australia states: 8/9 DBpedia, 9/9 GADM
– Australia SA3/SA4: no satisfactory results, discarded
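The two-stage GADM lookup from this slide might look roughly like the sketch below. It assumes a hypothetical lookup callable that maps a coordinate to a GADM identifier (or None when nothing is found), and it treats the second stage as a fallback that is tried only when the first stage fails; the slides do not spell out either detail.

```python
from typing import Callable, Optional, Sequence, Tuple

Coord = Tuple[float, float]  # (latitude, longitude)


def mean_coordinate(coords: Sequence[Coord]) -> Optional[Coord]:
    """Arithmetic mean of coordinates; None if there are none.

    Naive averaging is an assumption made for brevity; it is inaccurate
    for regions that straddle the antimeridian.
    """
    if not coords:
        return None
    lat = sum(c[0] for c in coords) / len(coords)
    lon = sum(c[1] for c in coords) / len(coords)
    return (lat, lon)


def link_to_gadm(
    own_coord: Optional[Coord],
    linked_coords: Sequence[Coord],
    lookup: Callable[[Coord], Optional[str]],
) -> Optional[str]:
    """Two-stage linking: own coordinate first, then the average of linked objects."""
    # Stage 1: search by the entity's own coordinate (taken from DBpedia).
    if own_coord is not None:
        hit = lookup(own_coord)
        if hit is not None:
            return hit
    # Stage 2: search by the average coordinate of all linked objects.
    centroid = mean_coordinate(linked_coords)
    return lookup(centroid) if centroid is not None else None
```

Passing the lookup in as a parameter keeps the sketch independent of any particular GADM access method.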

Feature Extraction
• Once the links have been created
– get polygon shapes from GADM (for visualization)
– get datatype properties from Eurostat/DBpedia
– get direct types from DBpedia (incl. YAGO types)
– get qualified relations from DBpedia

• Using information from Linked Geo Data
– extract objects within GADM polygon, aggregate by type (e.g., region contains 125 police stations)
– spatial queries only possible with rectangles
– workaround: use minimum enclosing rectangle and filter afterwards
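The rectangle workaround can be illustrated with a small sketch. It assumes planar coordinates, a simple polygon given as a vertex list, and an in-memory list of (type, point) pairs that stands in for the results of a spatial query; the ray-casting test is one common way to do the post-filtering and is not necessarily what the authors used.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)
Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bounding_rectangle(polygon: Sequence[Point]) -> Rect:
    """Minimum enclosing axis-aligned rectangle of the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_rectangle(p: Point, rect: Rect) -> bool:
    min_x, min_y, max_x, max_y = rect
    return min_x <= p[0] <= max_x and min_y <= p[1] <= max_y


def in_polygon(p: Point, polygon: Sequence[Point]) -> bool:
    """Ray casting (even-odd rule): count edge crossings to the right of p."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_by_type(objects: Iterable[Tuple[str, Point]],
                  polygon: Sequence[Point]) -> Counter:
    """Rectangle pre-selection, polygon post-filter, then aggregation by type."""
    rect = bounding_rectangle(polygon)
    # This step stands in for the rectangle-only spatial query.
    candidates = [(t, p) for t, p in objects if in_rectangle(p, rect)]
    return Counter(t for t, p in candidates if in_polygon(p, polygon))
```

The resulting counts per object type are the kind of aggregate feature the slide mentions, such as the number of police stations in a region.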

Visualization with GADM Polygons
• Polygons from GADM allow for visualization of unemployment on maps

Finding Correlations
• Using extracted features to find interesting correlations

• Example correlations for unemployment in France:
– African islands, Islands in the Indian Ocean, Outermost regions of the EU (positive)
– GDP (negative)
– Disposable income (negative)
– Hospital beds/inhabitants (negative)
– R&D spending (negative)
– Energy consumption (negative)
– Population growth (positive)
– Casualties in traffic accidents (negative)
– Fast food restaurants (positive)
– Police stations (positive)
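A ranking like the one above could be produced along the following lines. The slides do not name the correlation measure, so Pearson's r is an assumption here, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def rank_features(features: Dict[str, Sequence[float]],
                  target: Sequence[float]) -> List[Tuple[str, float]]:
    """Score every feature against the target and sort by absolute correlation."""
    scored = [(name, pearson(values, target)) for name, values in features.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)
```

Each feature would be a column with one value per region, and the target would be the unemployment rate of the same regions in the same order. The sign of each score corresponds to the "(positive)" and "(negative)" labels on the slide.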

Visualizing Correlations
• e.g., unemployment rate ~ number of police stations

Tools
• FeGeLOD/Explain-a-LOD (ESWC 2012: best demo award)

Tools
• RapidMiner Linked Open Data Extension (2013)

Assets
• New modules for RapidMiner LOD extension
– DBpedia Lookup linker
– Label-based linker

• New links for DBpedia 3.9 release
– DBpedia – GADM (39,000 links)

Conclusions & Lessons Learned
• Linked Open Data provides useful background knowledge
– For finding explanations
– For creating visualizations

• Some data sources are more suitable than others
– official data sources (e.g., Eurostat) provide the best results

Conclusions & Lessons Learned
• Negative correlation: traffic accident casualties ~ unemployment
– Fight unemployment by increasing traffic accidents?

http://xkcd.com/552/

10/22/13

Petar Ristoski, Heiko Paulheim

13
Analyzing Statistics
with Background Knowledge
from Linked Open Data

10/22/13 Ristoski, Heiko Paulheim
Petar Ristoski, Heiko Paulheim
Petar

14

More Related Content

Similar to Analyzing Statistics with Background Knowledge from Linked Open Data

Mining the Web of Linked Data with RapidMiner
Mining the Web of Linked Data with RapidMinerMining the Web of Linked Data with RapidMiner
Mining the Web of Linked Data with RapidMinerHeiko Paulheim
 
Osm presentation workshop 19 sept 2018
Osm presentation workshop 19 sept 2018Osm presentation workshop 19 sept 2018
Osm presentation workshop 19 sept 2018osimod
 
INSPIRE Data harmonisation : methodology and tools
INSPIRE Data harmonisation : methodology and toolsINSPIRE Data harmonisation : methodology and tools
INSPIRE Data harmonisation : methodology and toolsGIM_nv
 
EMOS 2018 Big Data methods and techniques
EMOS 2018 Big Data methods and techniquesEMOS 2018 Big Data methods and techniques
EMOS 2018 Big Data methods and techniquesPiet J.H. Daas
 
PaNOSC Overview - ExPaNDS kick-off meeting - September 2019
PaNOSC Overview - ExPaNDS kick-off meeting - September 2019PaNOSC Overview - ExPaNDS kick-off meeting - September 2019
PaNOSC Overview - ExPaNDS kick-off meeting - September 2019PaNOSC
 
P. Struijs, Toward the Use of Big Data for European Statistics
P. Struijs, Toward the Use of Big Data for European StatisticsP. Struijs, Toward the Use of Big Data for European Statistics
P. Struijs, Toward the Use of Big Data for European StatisticsIstituto nazionale di statistica
 
Jean claude burgelman implications of open data
Jean claude burgelman implications of open dataJean claude burgelman implications of open data
Jean claude burgelman implications of open dataPlatforma Otwartej Nauki
 
Open Research Data: Present and planned EC Policy, Jean-Claude Burgelman impl...
Open Research Data: Present and planned EC Policy, Jean-Claude Burgelman impl...Open Research Data: Present and planned EC Policy, Jean-Claude Burgelman impl...
Open Research Data: Present and planned EC Policy, Jean-Claude Burgelman impl...Platforma Otwartej Nauki
 
08 wp7 progresses&results-20130221
08 wp7 progresses&results-2013022108 wp7 progresses&results-20130221
08 wp7 progresses&results-20130221fruitbreedomics
 
Statistics project outline team the oni
Statistics project outline team the oniStatistics project outline team the oni
Statistics project outline team the oniPark JunPyo
 
DS2014: Feature selection in hierarchical feature spaces
DS2014: Feature selection in hierarchical feature spacesDS2014: Feature selection in hierarchical feature spaces
DS2014: Feature selection in hierarchical feature spacesPetar Ristoski
 
4. empirical and practical issues
4. empirical and practical issues4. empirical and practical issues
4. empirical and practical issuesLandDegradation
 
Data accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphereData accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphereAlex Hardisty
 
Data in Switzerland: BFS at OKCon 2013
Data in Switzerland: BFS at OKCon 2013Data in Switzerland: BFS at OKCon 2013
Data in Switzerland: BFS at OKCon 2013CH_Bundesarchiv
 
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnaires
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnairesItem 2 : Results of the Spectral Soil Data - Needs and capacities questionnaires
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnairesSoils FAO-GSP
 
BigInsight seminar on Practical Privacy-Preserving Distributed Statistical Co...
BigInsight seminar on Practical Privacy-Preserving Distributed Statistical Co...BigInsight seminar on Practical Privacy-Preserving Distributed Statistical Co...
BigInsight seminar on Practical Privacy-Preserving Distributed Statistical Co...Statistisk sentralbyrå
 
TFMAP: Optimizing MAP for Top-N Context-aware Recommendation
TFMAP: Optimizing MAP for Top-N Context-aware RecommendationTFMAP: Optimizing MAP for Top-N Context-aware Recommendation
TFMAP: Optimizing MAP for Top-N Context-aware RecommendationAlexandros Karatzoglou
 
DBpediaNYD - A Silver Standard Benchmark Dataset for Semantic Relatedness in ...
DBpediaNYD - A Silver Standard Benchmark Dataset for Semantic Relatedness in ...DBpediaNYD - A Silver Standard Benchmark Dataset for Semantic Relatedness in ...
DBpediaNYD - A Silver Standard Benchmark Dataset for Semantic Relatedness in ...Heiko Paulheim
 

Similar to Analyzing Statistics with Background Knowledge from Linked Open Data (20)

Mining the Web of Linked Data with RapidMiner
Mining the Web of Linked Data with RapidMinerMining the Web of Linked Data with RapidMiner
Mining the Web of Linked Data with RapidMiner
 
Osm presentation workshop 19 sept 2018
Osm presentation workshop 19 sept 2018Osm presentation workshop 19 sept 2018
Osm presentation workshop 19 sept 2018
 
INSPIRE Data harmonisation : methodology and tools
INSPIRE Data harmonisation : methodology and toolsINSPIRE Data harmonisation : methodology and tools
INSPIRE Data harmonisation : methodology and tools
 
EMOS 2018 Big Data methods and techniques
EMOS 2018 Big Data methods and techniquesEMOS 2018 Big Data methods and techniques
EMOS 2018 Big Data methods and techniques
 
PaNOSC Overview - ExPaNDS kick-off meeting - September 2019
PaNOSC Overview - ExPaNDS kick-off meeting - September 2019PaNOSC Overview - ExPaNDS kick-off meeting - September 2019
PaNOSC Overview - ExPaNDS kick-off meeting - September 2019
 
P. Struijs, Toward the Use of Big Data for European Statistics
P. Struijs, Toward the Use of Big Data for European StatisticsP. Struijs, Toward the Use of Big Data for European Statistics
P. Struijs, Toward the Use of Big Data for European Statistics
 
Jean claude burgelman implications of open data
Jean claude burgelman implications of open dataJean claude burgelman implications of open data
Jean claude burgelman implications of open data
 
Open Research Data: Present and planned EC Policy, Jean-Claude Burgelman impl...
Open Research Data: Present and planned EC Policy, Jean-Claude Burgelman impl...Open Research Data: Present and planned EC Policy, Jean-Claude Burgelman impl...
Open Research Data: Present and planned EC Policy, Jean-Claude Burgelman impl...
 
08 wp7 progresses&results-20130221
08 wp7 progresses&results-2013022108 wp7 progresses&results-20130221
08 wp7 progresses&results-20130221
 
Statistics project outline team the oni
Statistics project outline team the oniStatistics project outline team the oni
Statistics project outline team the oni
 
DS2014: Feature selection in hierarchical feature spaces
DS2014: Feature selection in hierarchical feature spacesDS2014: Feature selection in hierarchical feature spaces
DS2014: Feature selection in hierarchical feature spaces
 
4. empirical and practical issues
4. empirical and practical issues4. empirical and practical issues
4. empirical and practical issues
 
Open statistics Belgium
Open statistics BelgiumOpen statistics Belgium
Open statistics Belgium
 
Data accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphereData accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphere
 
Data in Switzerland: BFS at OKCon 2013
Data in Switzerland: BFS at OKCon 2013Data in Switzerland: BFS at OKCon 2013
Data in Switzerland: BFS at OKCon 2013
 
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnaires
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnairesItem 2 : Results of the Spectral Soil Data - Needs and capacities questionnaires
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnaires
 
Querying Patent Data for Empirical Scholarship : Tools and Strategies
Querying Patent Data for Empirical Scholarship : Tools and StrategiesQuerying Patent Data for Empirical Scholarship : Tools and Strategies
Querying Patent Data for Empirical Scholarship : Tools and Strategies
 
BigInsight seminar on Practical Privacy-Preserving Distributed Statistical Co...
BigInsight seminar on Practical Privacy-Preserving Distributed Statistical Co...BigInsight seminar on Practical Privacy-Preserving Distributed Statistical Co...
BigInsight seminar on Practical Privacy-Preserving Distributed Statistical Co...
 
TFMAP: Optimizing MAP for Top-N Context-aware Recommendation
TFMAP: Optimizing MAP for Top-N Context-aware RecommendationTFMAP: Optimizing MAP for Top-N Context-aware Recommendation
TFMAP: Optimizing MAP for Top-N Context-aware Recommendation
 
DBpediaNYD - A Silver Standard Benchmark Dataset for Semantic Relatedness in ...
DBpediaNYD - A Silver Standard Benchmark Dataset for Semantic Relatedness in ...DBpediaNYD - A Silver Standard Benchmark Dataset for Semantic Relatedness in ...
DBpediaNYD - A Silver Standard Benchmark Dataset for Semantic Relatedness in ...
 

More from Heiko Paulheim

Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...
Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...Heiko Paulheim
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfHeiko Paulheim
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vecHeiko Paulheim
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vecHeiko Paulheim
 
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsKnowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsHeiko Paulheim
 
From Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsFrom Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsHeiko Paulheim
 
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Heiko Paulheim
 
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids  on the Knowledge Graph BlockBeyond DBpedia and YAGO – The New Kids  on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph BlockHeiko Paulheim
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...Heiko Paulheim
 
Machine Learning & Embeddings for Large Knowledge Graphs
Machine Learning & Embeddings  for Large Knowledge GraphsMachine Learning & Embeddings  for Large Knowledge Graphs
Machine Learning & Embeddings for Large Knowledge GraphsHeiko Paulheim
 
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge GraphFrom Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge GraphHeiko Paulheim
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...Heiko Paulheim
 
Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Heiko Paulheim
 
Machine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge GraphsMachine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge GraphsHeiko Paulheim
 
Weakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterWeakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterHeiko Paulheim
 
Towards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingTowards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingHeiko Paulheim
 
Knowledge Graphs on the Web
Knowledge Graphs on the WebKnowledge Graphs on the Web
Knowledge Graphs on the WebHeiko Paulheim
 
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and OntologyData-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and OntologyHeiko Paulheim
 
Fast Approximate A-box Consistency Checking using Machine Learning
Fast Approximate  A-box Consistency Checking using Machine LearningFast Approximate  A-box Consistency Checking using Machine Learning
Fast Approximate A-box Consistency Checking using Machine LearningHeiko Paulheim
 

More from Heiko Paulheim (20)

Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...
Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdf
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vec
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vec
 
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsKnowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
 
From Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsFrom Wikis to Knowledge Graphs
From Wikis to Knowledge Graphs
 
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
 
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids  on the Knowledge Graph BlockBeyond DBpedia and YAGO – The New Kids  on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph Block
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
 
Machine Learning & Embeddings for Large Knowledge Graphs
Machine Learning & Embeddings  for Large Knowledge GraphsMachine Learning & Embeddings  for Large Knowledge Graphs
Machine Learning & Embeddings for Large Knowledge Graphs
 
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge GraphFrom Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
 
Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Make Embeddings Semantic Again!
Make Embeddings Semantic Again!
 
How much is a Triple?
How much is a Triple?How much is a Triple?
How much is a Triple?
 
Machine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge GraphsMachine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge Graphs
 
Weakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterWeakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on Twitter
 
Towards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingTowards Knowledge Graph Profiling
Towards Knowledge Graph Profiling
 
Knowledge Graphs on the Web
Knowledge Graphs on the WebKnowledge Graphs on the Web
Knowledge Graphs on the Web
 
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and OntologyData-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
 
Fast Approximate A-box Consistency Checking using Machine Learning
Fast Approximate  A-box Consistency Checking using Machine LearningFast Approximate  A-box Consistency Checking using Machine Learning
Fast Approximate A-box Consistency Checking using Machine Learning
 

Recently uploaded

Generative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdfGenerative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdfalexjohnson7307
 
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxCyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxMasterG
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireExakis Nelite
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxFIDO Alliance
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGDSC PJATK
 
How to Check GPS Location with a Live Tracker in Pakistan
How to Check GPS Location with a Live Tracker in PakistanHow to Check GPS Location with a Live Tracker in Pakistan
How to Check GPS Location with a Live Tracker in Pakistandanishmna97
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxFIDO Alliance
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfdanishmna97
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Paige Cruz
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptxFIDO Alliance
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftshyamraj55
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...FIDO Alliance
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...marcuskenyatta275
 
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdfFrisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdfAnubhavMangla3
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform EngineeringMarcus Vechiato
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...ScyllaDB
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxjbellis
 
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties ReimaginedEasier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties Reimaginedpanagenda
 

Recently uploaded (20)

Generative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdfGenerative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdf
 
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxCyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - Questionnaire
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
How to Check GPS Location with a Live Tracker in Pakistan
How to Check GPS Location with a Live Tracker in PakistanHow to Check GPS Location with a Live Tracker in Pakistan
How to Check GPS Location with a Live Tracker in Pakistan
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoft
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdfFrisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
 
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties ReimaginedEasier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
 

Analyzing Statistics with Background Knowledge from Linked Open Data

  • 1. Analyzing Statistics with Background Knowledge from Linked Open Data 10/22/13 Ristoski, Heiko Paulheim Petar Ristoski, Heiko Paulheim Petar 1
  • 2. Idea
      • Background knowledge from LOD can help
          – finding explanations
          – creating more sophisticated visualizations
      • Steps taken
          – linking the statistics datasets to LOD datasets
          – DBpedia, Eurostat, GADM, Linked Geo Data
          – extracting features
      • Finding correlations with unemployment rate
          – using only one target variable for demonstration purposes
          – works for arbitrary target variables
  • 3. Linking to LOD Datasets
      • Linking to DBpedia
          – using DBpedia Lookup
          – restricting results to Place and AdministrativeArea
          – selecting from many results by minimum edit distance
      • Linking to Eurostat
          – using SPARQL to query for labels
          – querying for word 1-grams, 2-grams, … from original labels
          – selecting by minimum edit distance
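The label-based selection in slide 3 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`edit_distance`, `word_ngrams`, `best_match`) are hypothetical, the edit distance is assumed to be plain Levenshtein distance, and the candidate labels would in practice come from DBpedia Lookup or a SPARQL label query rather than a hard-coded list.

```python
from typing import List, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def word_ngrams(label: str) -> List[str]:
    """All contiguous word n-grams of a label: 1-grams, 2-grams, and so on."""
    words = label.split()
    return [" ".join(words[i:i + n])
            for n in range(1, len(words) + 1)
            for i in range(len(words) - n + 1)]


def best_match(label: str, candidates: List[str]) -> Optional[str]:
    """Candidate with the smallest edit distance to the original label."""
    if not candidates:
        return None
    return min(candidates,
               key=lambda c: edit_distance(label.lower(), c.lower()))
```

For example, `word_ngrams("Ile de France")` yields the query terms `Ile`, `de`, `France`, `Ile de`, `de France`, and `Ile de France`; whatever labels those queries return can then be passed to `best_match` together with the original label.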
  • 4. Linking to LOD Datasets
      • Linking to GADM
          – searching by name turned out to be error-prone
          – searching by coordinates (from DBpedia) is precise
              • but suffers from low recall
          – two-stage approach:
              • searching by coordinates
              • searching by average coordinates of all linked objects (in DBpedia)
      • Total figures:
          – France regions: 27/27 DBpedia, 26/27 Eurostat, 27/27 GADM
          – France departments: 101/101 DBpedia, 101/101 GADM
          – Australia states: 8/9 DBpedia, 9/9 GADM
          – Australia SA3/SA4: no satisfactory results, discarded
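The two-stage GADM linking in slide 4 might look roughly like the sketch below. Everything named here is an assumption for illustration: `lookup` stands in for whatever service maps a coordinate to a GADM region (it returns an identifier or `None`), and the "average coordinates" are taken to be a plain arithmetic mean of latitude and longitude, which is crude near the antimeridian or the poles.

```python
from typing import Callable, Iterable, Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude)


def mean_coordinate(coords: Iterable[Coord]) -> Optional[Coord]:
    """Arithmetic mean of the coordinates, or None if there are none."""
    pts = list(coords)
    if not pts:
        return None
    lat = sum(p[0] for p in pts) / len(pts)
    lon = sum(p[1] for p in pts) / len(pts)
    return (lat, lon)


def link_two_stage(own: Optional[Coord],
                   linked: Iterable[Coord],
                   lookup: Callable[[Coord], Optional[str]]) -> Optional[str]:
    """Try the entity's own coordinates first, then the mean of its neighbours.

    own    -- coordinates of the entity itself, if DBpedia has any
    linked -- coordinates of objects linked to the entity in DBpedia
    lookup -- hypothetical coordinate-to-region lookup
    """
    # Stage 1: search by the entity's own coordinates.
    if own is not None:
        hit = lookup(own)
        if hit is not None:
            return hit
    # Stage 2: fall back to the average coordinates of all linked objects.
    centre = mean_coordinate(linked)
    return lookup(centre) if centre is not None else None
```

The second stage only runs when the first finds nothing, so it can raise recall without overriding a match that was already found from the entity's own coordinates.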
  • 5. Feature Extraction
      • Once the links have been created
          – get polygon shapes from GADM (for visualization)
          – get datatype properties from Eurostat/DBpedia
          – get direct types from DBpedia (incl. YAGO types)
          – get qualified relations from DBpedia
      • Using information from Linked Geo Data
          – extract objects within GADM polygon, aggregate by type (e.g., region contains 125 police stations)
          – spatial queries only possible with rectangles
          – workaround: use minimum enclosing rectangle and filter afterwards
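The rectangle workaround in slide 5 can be illustrated with a small sketch. It assumes an axis-aligned minimum enclosing rectangle and a standard ray-casting point-in-polygon test; the rectangle query, which in the real setting would be sent to Linked Geo Data, is simulated here by filtering a local list, and all function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)


def bounding_box(polygon: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned minimum enclosing rectangle: (min_x, min_y, max_x, max_y)."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test; points exactly on the boundary may go either way."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_by_type(polygon: Sequence[Point],
                  objects: Iterable[Tuple[str, Point]]) -> Counter:
    """Count objects per type that lie inside the polygon."""
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    # Step 1: rectangle query (simulated locally in this sketch).
    in_rect = [(t, p) for t, p in objects
               if min_x <= p[0] <= max_x and min_y <= p[1] <= max_y]
    # Step 2: discard hits that are in the rectangle but outside the polygon.
    return Counter(t for t, p in in_rect if point_in_polygon(p, polygon))
```

With a triangular region `[(0, 0), (4, 0), (0, 4)]`, an object at `(3, 3)` passes the rectangle query but is dropped by the polygon filter, which is exactly the case the workaround has to handle.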
  • 6. Visualization with GADM Polygons
      • Polygons from GADM allow for visualization of unemployment on maps
  • 7. Finding Correlations
      • Using extracted features to find interesting correlations
      • Example correlations for unemployment in France:
          – African islands, Islands in the Indian Ocean, Outermost regions of the EU (positive)
          – GDP (negative)
          – Disposable income (negative)
          – Hospital beds/inhabitants (negative)
          – R&D spending (negative)
          – Energy consumption (negative)
          – Population growth (positive)
          – Casualties in traffic accidents (negative)
          – Fast food restaurants (positive)
          – Police stations (positive)
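Ranking extracted features by their correlation with a target variable, as in slide 7, could be sketched like this. The slides do not say which correlation measure was used; Pearson's coefficient is assumed here purely for illustration, and the function names and any numbers passed in are hypothetical rather than taken from the study.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        # A constant column has no defined correlation; treat it as uninformative.
        return 0.0
    return cov / (sx * sy)


def rank_features(features: Dict[str, Sequence[float]],
                  target: Sequence[float]) -> List[Tuple[str, float]]:
    """Features sorted by absolute correlation with the target, strongest first."""
    scored = [(name, pearson(values, target))
              for name, values in features.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)
```

Each feature column holds one value per region, in the same order as the target column (here, the unemployment rate). The sign of each coefficient corresponds to the "(positive)" and "(negative)" annotations in the list above.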
  • 8. Visualizing Correlations
      • e.g., unemployment rate ~ number of police stations
  • 9. Tools
      • FeGeLOD/Explain-a-LOD (ESWC 2012: best demo award)
  • 10. Tools
      • RapidMiner Linked Open Data Extension (2013)
  • 11. Assets
      • New modules for RapidMiner LOD extension
          – DBpedia Lookup linker
          – Label-based linker
      • New links for DBpedia 3.9 release
          – DBpedia – GADM (39,000 links)
  • 12. Conclusions & Lessons Learned
      • Linked Open Data provides useful background knowledge
          – for finding explanations
          – for creating visualizations
      • Some data sources are more suitable than others
          – official data sources (e.g., Eurostat) provide the best results
  • 13. Conclusions & Lessons Learned
      • Negative correlation: traffic accident casualties ~ unemployment
          – Fight unemployment by increasing traffic accidents? http://xkcd.com/552/
  • 14. Analyzing Statistics with Background Knowledge from Linked Open Data