Focused Exploration of Geospatial Context on Linked Open Data
Thomas Gottron, Johannes Schmitz, Stuart E. Middleton
20 October 2014
IESD workshop, Riva del Garda
Thomas Gottron · Institute for Web Science and Technologies · University of Koblenz-Landau, Germany
Challenge: Focused Exploration of LOD
• Linked Data entities
• (Semantic) link structure
• "Relevant" entities
• Seed entity
Classification: Which links lead to relevant entities?
Ranking: How probable is a link leading to a relevant entity?
Use Cases:
• Guided exploration
• Focused LOD crawler
Focused Exploration of Geospatial Context
Relevant entities: Locations semantically related to seed entities
[Figure: example with Rovereto and Bensheim (Germany)]
Focused Exploration: Formalisation
• E: set of entities (URIs)
• R: set of RDF triples (s,p,o)
– Restricted to s,o ∈ E
• L ⊆ E: relevant entities
– For us: Locations with coordinates
• Task: for given s' and all (s',p,o) ∈ R
– Classification: Predict which o are in L
– Ranking: Sort object entities o starting from the one presumed most probable to be relevant
[Figure: an entity s ∈ L with wgs84:lat 50.897 and wgs84:long -1.404]
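The task definition above can be written down as a small Python sketch. This is only an illustration under stated assumptions: the `ex:` identifiers, the toy triples and the helper names `candidates` and `ground_truth` are hypothetical and do not come from the slides, and L is simply given as a set.

```python
# Toy instance of the formalisation. R is a set of (s, p, o) triples over
# entities; L is the subset of entities that are locations with coordinates.
# All identifiers are made up for illustration.
R = {
    ("ex:SeedTown", "ex:twinCity", "ex:TwinTown"),
    ("ex:SeedTown", "ex:country", "ex:Country"),
    ("ex:SeedTown", "ex:mayor", "ex:Mayor"),
    ("ex:Other", "ex:mayor", "ex:Mayor"),
}
L = {"ex:SeedTown", "ex:TwinTown", "ex:Country"}


def candidates(seed, triples):
    """Return all (p, o) pairs with (seed, p, o) in R, in a stable order."""
    return sorted((p, o) for s, p, o in triples if s == seed)


def ground_truth(seed, triples, relevant):
    """Label every object entity linked from the seed: True iff o is in L."""
    return {o: (o in relevant) for _, o in candidates(seed, triples)}


labels = ground_truth("ex:SeedTown", R, L)
print(labels)
```

A classifier has to predict these labels from the links alone; a ranker has to order the object entities so that the ones in L come first.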
5 Approaches 
• Based on 3 paradigms: 
– Schema semantics (1 approach) 
– Supervised machine learning (2 approaches) 
– Information Retrieval inspired (2 approaches) 
Exploration based on Schema Semantics
• Exploit rdfs:range definitions of link predicates
• Follow links which lead to locations
[Figure: schema example with the labels dbponto:twinCity, rdfs:range, dbpedia:City, rdfs:subClassOf, dbpedia:Place]
Exploration based on Schema Semantics
Classification
• Range of any pi is a location? → Label = relevant
Ranking
• Re-use classification:
– Relevant before irrelevant
[Figure: seed s linked to object o by predicates p1, p2, ..., pm; is o a location?]
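The schema-based rule can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `ex:` class and predicate names, the dictionaries `RANGE` and `SUBCLASS_OF`, and the choice of `ex:Place` as the location class are all assumptions made for the example.

```python
# Hypothetical schema fragment; names are examples only.
RANGE = {  # predicate -> class declared via rdfs:range
    "ex:twinCity": "ex:City",
    "ex:birthPlace": "ex:Place",
    "ex:author": "ex:Person",
}
SUBCLASS_OF = {  # class -> direct superclass (rdfs:subClassOf)
    "ex:City": "ex:Place",
    "ex:Person": "ex:Agent",
}
LOCATION_CLASS = "ex:Place"  # assumed top-level class for locations


def is_location_class(cls):
    """Walk up the rdfs:subClassOf chain until LOCATION_CLASS or the top."""
    seen = set()
    while cls is not None and cls not in seen:
        if cls == LOCATION_CLASS:
            return True
        seen.add(cls)
        cls = SUBCLASS_OF.get(cls)
    return False


def classify(predicates):
    """Relevant if the range of any linking predicate is a location class."""
    return any(is_location_class(RANGE.get(p)) for p in predicates)


def rank(objects):
    """objects: dict mapping o to the predicates linking the seed to o.
    Re-uses the classification: relevant objects first, then irrelevant."""
    return sorted(objects, key=lambda o: (not classify(objects[o]), o))
```

Predicates without a declared range never fire in this sketch, so any entity reached only through such predicates is labelled irrelevant.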
Supervised Machine Learning
• Use incoming link predicates as features
– Learn which predicates typically lead to locations
• Train a classifier (e.g. Naive Bayes)
• 2 variations: use all or only observed predicates
[Figure: object entities o and o' with incoming link predicates p2, p3, p4, p6; location coordinates given by wgs84:lat and wgs84:long]
Supervised Machine Learning
Classification
• P(o ∈ L) > P(o ∉ L)? → Label = relevant
Ranking
• Rank by odds: O(o ∈ L) = P(o ∈ L) / P(o ∉ L)
[Figure: seed s linked to object o by predicates p1, p2, ..., pm; is o a location?]
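A Naive Bayes classifier over incoming link predicates, with both variations, might look like the sketch below. The slides do not give the exact model, so the Bernoulli-style features, the add-one smoothing, the toy training data and all function names are assumptions of this example.

```python
import math
from collections import Counter


def train(examples):
    """examples: list of (set_of_incoming_predicates, is_location)."""
    n = Counter()                            # class -> number of entities
    c = {True: Counter(), False: Counter()}  # class -> predicate -> count
    vocab = set()
    for preds, label in examples:
        n[label] += 1
        c[label].update(preds)
        vocab |= preds
    return n, c, vocab


def log_odds(preds, model, use_all=True):
    """log O(o in L) = log P(o in L) - log P(o not in L) under Naive Bayes.

    use_all=True : every known predicate is a feature (present or absent).
    use_all=False: only the predicates observed on this entity are used.
    Add-one smoothing is an assumption of this sketch.
    """
    n, c, vocab = model
    total = n[True] + n[False]
    score = 0.0
    for label, sign in ((True, 1.0), (False, -1.0)):
        lp = math.log((n[label] + 1) / (total + 2))   # smoothed class prior
        for p in (vocab if use_all else preds & vocab):
            q = (c[label][p] + 1) / (n[label] + 2)    # P(p present | class)
            lp += math.log(q if p in preds else 1.0 - q)
        score += sign * lp
    return score


# Hypothetical training data: predicates pointing at an entity, and its label.
data = [
    ({"ex:twinCity", "ex:locatedIn"}, True),
    ({"ex:locatedIn"}, True),
    ({"ex:capital"}, True),
    ({"ex:author"}, False),
    ({"ex:author", "ex:editor"}, False),
]
model = train(data)

# Classification: relevant iff P(o in L) > P(o not in L), i.e. log-odds > 0.
is_relevant = log_odds({"ex:locatedIn"}, model) > 0
# Ranking: sorting by log-odds gives the same order as sorting by odds.
ranking = sorted([{"ex:author"}, {"ex:locatedIn"}],
                 key=lambda preds: log_odds(preds, model), reverse=True)
```

Working in log space avoids numerical underflow when many predicates are multiplied together, and it leaves both the decision rule and the ranking unchanged.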
IR Inspired Approaches
• Discriminativeness of predicates (inspired by tf-idf)
• Property relevance frequency: prf = c(p, L)
• Inverse property frequency: ipf = log( c(∗,∗) / c(p,∗) )
• Combine into prf-ipf and prr-ipf
– 2nd variation: prr is a normalised prf
• Total score ρ: aggregate over all predicates
IR Inspired Approaches
Classification
• Determine threshold
– Nearest centroid
Ranking
• Rank by score ρ_prr-ipf(o)
[Figure: seed s linked to object o by predicates p1, p2, ..., pm; is o a location?]
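The IR-inspired scores can be sketched in Python as below. The slides define prf and ipf but do not spell out the remaining details, so several choices here are assumptions: the weights are formed as a product (by analogy with tf-idf), prr is taken to be c(p, L) / c(p, ∗), the total score ρ is a sum over predicates, and the nearest-centroid threshold is the midpoint of the two class means. All names and the toy data are hypothetical.

```python
import math
from collections import Counter


def fit(links):
    """links: list of (predicate, object_is_relevant) training observations."""
    c_p_L = Counter(p for p, rel in links if rel)   # c(p, L)
    c_p = Counter(p for p, _ in links)              # c(p, *)
    total = len(links)                              # c(*, *)
    weights = {}
    for p in c_p:
        ipf = math.log(total / c_p[p])
        prf = c_p_L[p]
        prr = c_p_L[p] / c_p[p]   # assumed normalisation of prf
        weights[p] = {"prf-ipf": prf * ipf, "prr-ipf": prr * ipf}
    return weights


def score(preds, weights, variant="prr-ipf"):
    """rho(o): sum of predicate weights over the links pointing to o.
    Summation is an assumed aggregate; unseen predicates contribute 0."""
    return sum(weights.get(p, {}).get(variant, 0.0) for p in preds)


def centroid_threshold(scores_relevant, scores_irrelevant):
    """Nearest centroid in one dimension: midpoint of the two class means."""
    mean_rel = sum(scores_relevant) / len(scores_relevant)
    mean_irr = sum(scores_irrelevant) / len(scores_irrelevant)
    return (mean_rel + mean_irr) / 2


# Hypothetical training links.
links = ([("ex:twinCity", True)] * 3 + [("ex:locatedIn", True)] * 2
         + [("ex:locatedIn", False)] + [("ex:author", False)] * 4)
weights = fit(links)
```

An object is then labelled relevant when its score lies above the threshold, and objects are ranked by descending score.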
Evaluation
• Metrics:
– Ranking: ROC curves, AUC
– Classification: Precision, Recall, F1, Accuracy
• Cross validation:
– 10-times / 10-fold
– Averages
[Figure: data set overview with the labels Seed, Exploration and owl:sameAs, and the counts 99,951 entities; 1,728,633 links; 425,338 entities; 128,171 relevant]
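For reference, the ranking and classification metrics can be computed as in the following sketch. It is a plain-Python illustration with hypothetical function names and toy inputs, not the evaluation code behind the reported numbers; it assumes both classes occur and that at least one item is predicted relevant.

```python
def auc(scores, labels):
    """Probability that a random relevant item is scored above a random
    irrelevant one (ties count one half); this equals the area under the
    ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall_accuracy(predicted, labels):
    """Classification metrics from the four confusion-matrix counts."""
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum(not p and y for p, y in zip(predicted, labels))
    tn = sum(not p and not y for p, y in zip(predicted, labels))
    return tp / (tp + fp), tp / (tp + fn), (tp + tn) / len(labels)


# Toy example: four objects with scores and true labels.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [True, False, True, False]
predicted = [s > 0.5 for s in scores]
```

AUC depends only on the order of the scores, which is why it evaluates the ranking task independently of any threshold.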
Performance (Ranking)
[Figure: ROC curves (both axes from 0 to 1), with an inset zooming into x from 0 to 0.05 and y from 0.95 to 1. Curves: random, Schema Semantics, NB (all predicates), NB (present predicates), prf-ipf, prr-ipf]
Performance (Classification & Ranking)
Table 2. Average performance of the approaches († marks a significant improvement over the second-best method at the 0.01 level)
Method                     Recall    Precision  F1        Accuracy  AUC
Schema Semantics           0.1188    0.8119     0.2073    0.7262    0.5552
NB (all predicates)        0.9906    0.9491     †0.9694   †0.9812   0.9970
NB (observed predicates)   0.9943    0.9436     0.9683    0.9804    0.9968
prf-ipf                    0.8512    †0.9754    0.9091    0.9487    0.9958
prr-ipf                    †0.9973   0.9240     0.9592    0.9745    0.9769
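As a quick sanity check on the table, F1 is the harmonic mean of precision and recall, so the F1 column can be recomputed from the other two. The sketch below does this for the reported averages; since the table lists averages over cross-validation runs, small deviations from the recomputed values would be expected. The dictionary and function names are hypothetical.

```python
# (recall, precision, reported F1) copied from the table above.
rows = {
    "Schema Semantics": (0.1188, 0.8119, 0.2073),
    "NB (all predicates)": (0.9906, 0.9491, 0.9694),
    "NB (observed predicates)": (0.9943, 0.9436, 0.9683),
    "prf-ipf": (0.8512, 0.9754, 0.9091),
    "prr-ipf": (0.9973, 0.9240, 0.9592),
}


def f1(recall, precision):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Absolute difference between recomputed and reported F1 per method.
deviation = {name: abs(f1(r, p) - reported)
             for name, (r, p, reported) in rows.items()}

for name, d in deviation.items():
    print(f"{name}: recomputed F1 differs from the table by {d:.5f}")
```

The low F1 of the schema-based approach follows directly from its recall of 0.1188: high precision cannot compensate for missing most relevant entities.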
Summary 
• Focused exploration feasible 
• ML approach performing best 
• Future work: 
– Other data sets 
– Generalise scenario (more than locations) 
– Better approaches using more features 
Questions? 
Thomas Gottron 
Institute for Web Science and Technologies 
Universität Koblenz-Landau 
gottron@uni-koblenz.de 

More Related Content

Similar to Focused Exploration of Geospatial Context on Linked Open Data

The Maze of Deletion in Ontology Stream Reasoning
The Maze of Deletion in Ontology Stream Reasoning The Maze of Deletion in Ontology Stream Reasoning
The Maze of Deletion in Ontology Stream Reasoning
Jeff Z. Pan
 
Data structures and algorithms
Data structures and algorithmsData structures and algorithms
Data structures and algorithms
Julie Iskander
 
Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositories
Valentina Paunovic
 
Personalised Search for the Social Semantic Web
Personalised Search for the Social Semantic WebPersonalised Search for the Social Semantic Web
Personalised Search for the Social Semantic Web
Oana Tifrea-Marciuska
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?
Frank van Harmelen
 
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North SeaMachine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
Yohanes Nuwara
 
GDSC SSN - solution Challenge : Fundamentals of Decision Making
GDSC SSN - solution Challenge : Fundamentals of Decision MakingGDSC SSN - solution Challenge : Fundamentals of Decision Making
GDSC SSN - solution Challenge : Fundamentals of Decision Making
GDSCSSN
 
Knowledge engg using & in fol
Knowledge engg using & in folKnowledge engg using & in fol
Knowledge engg using & in fol
chandsek666
 
machine-learning-with-large-networks-of-people-and-places
machine-learning-with-large-networks-of-people-and-placesmachine-learning-with-large-networks-of-people-and-places
machine-learning-with-large-networks-of-people-and-places
Tony Frame
 
10. Getting Spatial
10. Getting Spatial10. Getting Spatial
10. Getting Spatial
FAO
 
A Comparison of Propositionalization Strategies for Creating Features from Li...
A Comparison of Propositionalization Strategies for Creating Features from Li...A Comparison of Propositionalization Strategies for Creating Features from Li...
A Comparison of Propositionalization Strategies for Creating Features from Li...
Petar Ristoski
 
A Survey of Entity Ranking over RDF Graphs
A Survey of Entity Ranking over RDF GraphsA Survey of Entity Ranking over RDF Graphs
Labreport
LabreportLabreport
Labreport
AMR koura
 
Modeling and Querying Metadata in the Semantic Sensor Web: stRDF and stSPARQL
Modeling and Querying Metadata in the Semantic Sensor Web: stRDF and stSPARQLModeling and Querying Metadata in the Semantic Sensor Web: stRDF and stSPARQL
Modeling and Querying Metadata in the Semantic Sensor Web: stRDF and stSPARQL
Kostis Kyzirakos
 
Applied machine learning for search engine relevance 3
Applied machine learning for search engine relevance 3Applied machine learning for search engine relevance 3
Applied machine learning for search engine relevance 3
Charles Martin
 
Artificial intelligence for Social Good
Artificial intelligence for Social GoodArtificial intelligence for Social Good
Artificial intelligence for Social Good
Oana Tifrea-Marciuska
 
[AAAI-16] Tiebreaking Strategies for A* Search: How to Explore the Final Fron...
[AAAI-16] Tiebreaking Strategies for A* Search: How to Explore the Final Fron...[AAAI-16] Tiebreaking Strategies for A* Search: How to Explore the Final Fron...
[AAAI-16] Tiebreaking Strategies for A* Search: How to Explore the Final Fron...
Asai Masataro
 
Spatial data mining
Spatial data miningSpatial data mining
Spatial data mining
MITS Gwalior
 
Perplexity of Index Models over Evolving Linked Data
Perplexity of Index Models over Evolving Linked Data Perplexity of Index Models over Evolving Linked Data
Perplexity of Index Models over Evolving Linked Data
Thomas Gottron
 
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven RecipesReasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Ontotext
 

Similar to Focused Exploration of Geospatial Context on Linked Open Data (20)

The Maze of Deletion in Ontology Stream Reasoning
The Maze of Deletion in Ontology Stream Reasoning The Maze of Deletion in Ontology Stream Reasoning
The Maze of Deletion in Ontology Stream Reasoning
 
Data structures and algorithms
Data structures and algorithmsData structures and algorithms
Data structures and algorithms
 
Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositories
 
Personalised Search for the Social Semantic Web
Personalised Search for the Social Semantic WebPersonalised Search for the Social Semantic Web
Personalised Search for the Social Semantic Web
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?
 
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North SeaMachine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
 
GDSC SSN - solution Challenge : Fundamentals of Decision Making
GDSC SSN - solution Challenge : Fundamentals of Decision MakingGDSC SSN - solution Challenge : Fundamentals of Decision Making
GDSC SSN - solution Challenge : Fundamentals of Decision Making
 
Knowledge engg using & in fol
Knowledge engg using & in folKnowledge engg using & in fol
Knowledge engg using & in fol
 
machine-learning-with-large-networks-of-people-and-places
machine-learning-with-large-networks-of-people-and-placesmachine-learning-with-large-networks-of-people-and-places
machine-learning-with-large-networks-of-people-and-places
 
10. Getting Spatial
10. Getting Spatial10. Getting Spatial
10. Getting Spatial
 
A Comparison of Propositionalization Strategies for Creating Features from Li...
A Comparison of Propositionalization Strategies for Creating Features from Li...A Comparison of Propositionalization Strategies for Creating Features from Li...
A Comparison of Propositionalization Strategies for Creating Features from Li...
 
A Survey of Entity Ranking over RDF Graphs
A Survey of Entity Ranking over RDF GraphsA Survey of Entity Ranking over RDF Graphs
A Survey of Entity Ranking over RDF Graphs
 
Labreport
LabreportLabreport
Labreport
 
Modeling and Querying Metadata in the Semantic Sensor Web: stRDF and stSPARQL
Modeling and Querying Metadata in the Semantic Sensor Web: stRDF and stSPARQLModeling and Querying Metadata in the Semantic Sensor Web: stRDF and stSPARQL
Modeling and Querying Metadata in the Semantic Sensor Web: stRDF and stSPARQL
 
Applied machine learning for search engine relevance 3
Applied machine learning for search engine relevance 3Applied machine learning for search engine relevance 3
Applied machine learning for search engine relevance 3
 
Artificial intelligence for Social Good
Artificial intelligence for Social GoodArtificial intelligence for Social Good
Artificial intelligence for Social Good
 
[AAAI-16] Tiebreaking Strategies for A* Search: How to Explore the Final Fron...
[AAAI-16] Tiebreaking Strategies for A* Search: How to Explore the Final Fron...[AAAI-16] Tiebreaking Strategies for A* Search: How to Explore the Final Fron...
[AAAI-16] Tiebreaking Strategies for A* Search: How to Explore the Final Fron...
 
Spatial data mining
Spatial data miningSpatial data mining
Spatial data mining
 
Perplexity of Index Models over Evolving Linked Data
Perplexity of Index Models over Evolving Linked Data Perplexity of Index Models over Evolving Linked Data
Perplexity of Index Models over Evolving Linked Data
 
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven RecipesReasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
 

More from REVEAL - Social Media Verification

Geoparsing and Real-time Social Media Analytics - technical and social challe...
Geoparsing and Real-time Social Media Analytics - technical and social challe...Geoparsing and Real-time Social Media Analytics - technical and social challe...
Geoparsing and Real-time Social Media Analytics - technical and social challe...
REVEAL - Social Media Verification
 
Veracity & Velocity of Social Media Content during Breaking News
Veracity & Velocity of Social Media Content during Breaking NewsVeracity & Velocity of Social Media Content during Breaking News
Veracity & Velocity of Social Media Content during Breaking News
REVEAL - Social Media Verification
 
REVEAL Project - Trust and Credibility Analysis
REVEAL Project - Trust and Credibility AnalysisREVEAL Project - Trust and Credibility Analysis
REVEAL Project - Trust and Credibility Analysis
REVEAL - Social Media Verification
 
"Extracting Attributed Verification and Debunking Reports from Social Media: ...
"Extracting Attributed Verification and Debunking Reports from Social Media: ..."Extracting Attributed Verification and Debunking Reports from Social Media: ...
"Extracting Attributed Verification and Debunking Reports from Social Media: ...
REVEAL - Social Media Verification
 
Prix Italia 2015 - Verification in Social Newsgathering
Prix Italia 2015 - Verification in Social NewsgatheringPrix Italia 2015 - Verification in Social Newsgathering
Prix Italia 2015 - Verification in Social Newsgathering
REVEAL - Social Media Verification
 
Verification of UGC/Eyewitness Media: Challenges and Approaches
Verification of UGC/Eyewitness Media: Challenges and Approaches Verification of UGC/Eyewitness Media: Challenges and Approaches
Verification of UGC/Eyewitness Media: Challenges and Approaches
REVEAL - Social Media Verification
 
Web image size prediction for efficient focused image crawling
Web image size prediction for efficient focused image crawlingWeb image size prediction for efficient focused image crawling
Web image size prediction for efficient focused image crawling
REVEAL - Social Media Verification
 
News-oriented multimedia search over multiple social networks
News-oriented multimedia search over multiple social networksNews-oriented multimedia search over multiple social networks
News-oriented multimedia search over multiple social networks
REVEAL - Social Media Verification
 
WWW2015 - RDSM2015 Workshop - Trust and Credibility Analysis
WWW2015 - RDSM2015 Workshop - Trust and Credibility AnalysisWWW2015 - RDSM2015 Workshop - Trust and Credibility Analysis
WWW2015 - RDSM2015 Workshop - Trust and Credibility Analysis
REVEAL - Social Media Verification
 
Geotagging Social Media Content with a Refined Language Modelling Approach
Geotagging Social Media Content with a Refined Language Modelling ApproachGeotagging Social Media Content with a Refined Language Modelling Approach
Geotagging Social Media Content with a Refined Language Modelling Approach
REVEAL - Social Media Verification
 
Mediarevealr: A social multimedia monitoring and intelligence system for Web ...
Mediarevealr: A social multimedia monitoring and intelligence system for Web ...Mediarevealr: A social multimedia monitoring and intelligence system for Web ...
Mediarevealr: A social multimedia monitoring and intelligence system for Web ...
REVEAL - Social Media Verification
 
Cross-Media Konferenz "Think Cross - Change Media" in Magdeburg, Germany
 Cross-Media Konferenz "Think Cross - Change Media" in Magdeburg, Germany Cross-Media Konferenz "Think Cross - Change Media" in Magdeburg, Germany
Cross-Media Konferenz "Think Cross - Change Media" in Magdeburg, Germany
REVEAL - Social Media Verification
 
News Impact Summit - Verification, Investigation and Digital Ethics – Hamburg...
News Impact Summit - Verification, Investigation and Digital Ethics – Hamburg...News Impact Summit - Verification, Investigation and Digital Ethics – Hamburg...
News Impact Summit - Verification, Investigation and Digital Ethics – Hamburg...
REVEAL - Social Media Verification
 
TRIDEC and REVEAL projects: Geoparsing and Geosemantic knowledge model for tr...
TRIDEC and REVEAL projects: Geoparsing and Geosemantic knowledge model for tr...TRIDEC and REVEAL projects: Geoparsing and Geosemantic knowledge model for tr...
TRIDEC and REVEAL projects: Geoparsing and Geosemantic knowledge model for tr...
REVEAL - Social Media Verification
 
Reveal - Social Media Verification - poster
Reveal - Social Media Verification - posterReveal - Social Media Verification - poster
Reveal - Social Media Verification - poster
REVEAL - Social Media Verification
 
REVEAL - Social Media Verification - brochure
REVEAL - Social Media Verification - brochureREVEAL - Social Media Verification - brochure
REVEAL - Social Media Verification - brochure
REVEAL - Social Media Verification
 

More from REVEAL - Social Media Verification (16)

Geoparsing and Real-time Social Media Analytics - technical and social challe...
Geoparsing and Real-time Social Media Analytics - technical and social challe...Geoparsing and Real-time Social Media Analytics - technical and social challe...
Geoparsing and Real-time Social Media Analytics - technical and social challe...
 
Veracity & Velocity of Social Media Content during Breaking News
Veracity & Velocity of Social Media Content during Breaking NewsVeracity & Velocity of Social Media Content during Breaking News
Veracity & Velocity of Social Media Content during Breaking News
 
REVEAL Project - Trust and Credibility Analysis
REVEAL Project - Trust and Credibility AnalysisREVEAL Project - Trust and Credibility Analysis
REVEAL Project - Trust and Credibility Analysis
 
"Extracting Attributed Verification and Debunking Reports from Social Media: ...
"Extracting Attributed Verification and Debunking Reports from Social Media: ..."Extracting Attributed Verification and Debunking Reports from Social Media: ...
"Extracting Attributed Verification and Debunking Reports from Social Media: ...
 
Prix Italia 2015 - Verification in Social Newsgathering
Prix Italia 2015 - Verification in Social NewsgatheringPrix Italia 2015 - Verification in Social Newsgathering
Prix Italia 2015 - Verification in Social Newsgathering
 
Verification of UGC/Eyewitness Media: Challenges and Approaches
Verification of UGC/Eyewitness Media: Challenges and Approaches Verification of UGC/Eyewitness Media: Challenges and Approaches
Verification of UGC/Eyewitness Media: Challenges and Approaches
 
Web image size prediction for efficient focused image crawling
Web image size prediction for efficient focused image crawlingWeb image size prediction for efficient focused image crawling
Web image size prediction for efficient focused image crawling
 
News-oriented multimedia search over multiple social networks
News-oriented multimedia search over multiple social networksNews-oriented multimedia search over multiple social networks
News-oriented multimedia search over multiple social networks
 
WWW2015 - RDSM2015 Workshop - Trust and Credibility Analysis
WWW2015 - RDSM2015 Workshop - Trust and Credibility AnalysisWWW2015 - RDSM2015 Workshop - Trust and Credibility Analysis
WWW2015 - RDSM2015 Workshop - Trust and Credibility Analysis
 
Geotagging Social Media Content with a Refined Language Modelling Approach
Geotagging Social Media Content with a Refined Language Modelling ApproachGeotagging Social Media Content with a Refined Language Modelling Approach
Geotagging Social Media Content with a Refined Language Modelling Approach
 
Mediarevealr: A social multimedia monitoring and intelligence system for Web ...
Mediarevealr: A social multimedia monitoring and intelligence system for Web ...Mediarevealr: A social multimedia monitoring and intelligence system for Web ...
Mediarevealr: A social multimedia monitoring and intelligence system for Web ...
 
Cross-Media Konferenz "Think Cross - Change Media" in Magdeburg, Germany
 Cross-Media Konferenz "Think Cross - Change Media" in Magdeburg, Germany Cross-Media Konferenz "Think Cross - Change Media" in Magdeburg, Germany
Cross-Media Konferenz "Think Cross - Change Media" in Magdeburg, Germany
 
News Impact Summit - Verification, Investigation and Digital Ethics – Hamburg...
News Impact Summit - Verification, Investigation and Digital Ethics – Hamburg...News Impact Summit - Verification, Investigation and Digital Ethics – Hamburg...
News Impact Summit - Verification, Investigation and Digital Ethics – Hamburg...
 
TRIDEC and REVEAL projects: Geoparsing and Geosemantic knowledge model for tr...
TRIDEC and REVEAL projects: Geoparsing and Geosemantic knowledge model for tr...TRIDEC and REVEAL projects: Geoparsing and Geosemantic knowledge model for tr...
TRIDEC and REVEAL projects: Geoparsing and Geosemantic knowledge model for tr...
 
Reveal - Social Media Verification - poster
Reveal - Social Media Verification - posterReveal - Social Media Verification - poster
Reveal - Social Media Verification - poster
 
REVEAL - Social Media Verification - brochure
REVEAL - Social Media Verification - brochureREVEAL - Social Media Verification - brochure
REVEAL - Social Media Verification - brochure
 

Recently uploaded

UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
FODUU
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 

Recently uploaded (20)

UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU

Focused Exploration of Geospatial Context on Linked Open Data

  • 1. Focused Exploration of Geospatial Context on Linked Open Data. Thomas Gottron, Johannes Schmitz, Stuart E. Middleton. 20 October 2014, IESD workshop, Riva del Garda. Thomas Gottron, Institute for Web Science and Technologies · University of Koblenz-Landau, Germany
  • 2. Challenge: Focused Exploration of LOD • Linked Data entities
  • 3. Challenge: Focused Exploration of LOD • Linked Data entities • (Semantic) link structure
  • 4. Challenge: Focused Exploration of LOD • Linked Data entities • (Semantic) link structure • “Relevant” entities
  • 5. Challenge: Focused Exploration of LOD • Linked Data entities • (Semantic) link structure • “Relevant” entities • Seed entity
  • 6. Challenge: Focused Exploration of LOD • Linked Data entities • (Semantic) link structure • “Relevant” entities • Seed entity • Classification: Which links lead to relevant entities? • Ranking: How probable is a link leading to a relevant entity? • Use Cases: Guided exploration, Focused LOD crawler
  • 7. Focused Exploration of Geospatial Context • Relevant entities: Locations semantically related to seed entities [Figure labels: Rovereto; Bensheim (Germany)]
  • 8. Focused Exploration: Formalisation • E: set of entities (URIs) • R: set of RDF triples (s,p,o) – Restricted to s,o ∈ E • L⊆E: relevant entities – For us: Locations with coordinates • Task: for a given s′ and all (s′,p,o) ∈ R – Classification: Predict which o are in L – Ranking: Sort object entities o starting from the one presumed most probable to be relevant [Figure: an entity s ∈ L with wgs84:lat 50.897 and wgs84:long -1.404]
  • 9. 5 Approaches • Based on 3 paradigms: – Schema semantics (1 approach) – Supervised machine learning (2 approaches) – Information Retrieval inspired (2 approaches)
  • 10. Exploration based on Schema Semantics • Exploit rdfs:range definitions of link predicates • Follow links which lead to locations [Figure: dbponto:twinCity rdfs:range dbpedia:City; dbpedia:City rdfs:subClassOf dbpedia:Place]
  • 11. Exploration based on Schema Semantics • Classification: Range of any p_i is a location? → Label = relevant • Ranking: Re-use classification – Relevant before irrelevant [Figure: seed s linked via predicates p1, p2, ..., pm to object o; is o a location?]
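
The schema-based classification and ranking on slides 10 and 11 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionaries `RANGE` and `SUBCLASS_OF`, the predicate `dbponto:author`, the class `dbpedia:Person`, the `ex:` URIs and all function names are hypothetical. Only `dbponto:twinCity`, `dbpedia:City` and `dbpedia:Place` appear on the slide. The sketch also assumes each class has at most one superclass.

```python
from typing import Dict, Iterable, List, Optional, Set

# Toy schema (assumed for illustration; a real one would be read from the ontology).
RANGE: Dict[str, str] = {
    "dbponto:twinCity": "dbpedia:City",   # from the slide
    "dbponto:author": "dbpedia:Person",   # hypothetical non-location predicate
}
SUBCLASS_OF: Dict[str, str] = {
    "dbpedia:City": "dbpedia:Place",      # from the slide
}
LOCATION_CLASS = "dbpedia:Place"


def is_location_class(cls: Optional[str]) -> bool:
    """Follow rdfs:subClassOf upwards until the location class or the top is reached."""
    seen: Set[str] = set()
    while cls is not None and cls not in seen:
        if cls == LOCATION_CLASS:
            return True
        seen.add(cls)                      # guards against cycles in the hierarchy
        cls = SUBCLASS_OF.get(cls)
    return False


def classify(predicates: Iterable[str]) -> bool:
    """Label an object relevant if the range of any linking predicate is a location."""
    return any(p in RANGE and is_location_class(RANGE[p]) for p in predicates)


def rank(candidates: Dict[str, Set[str]]) -> List[str]:
    """Re-use the classification: relevant objects first, then irrelevant ones."""
    # sorted() is stable, so objects with the same label keep their input order.
    return sorted(candidates, key=lambda o: not classify(candidates[o]))


if __name__ == "__main__":
    links = {
        "ex:SomeAuthor": {"dbponto:author"},
        "ex:Bensheim": {"dbponto:twinCity"},
    }
    print(rank(links))
```

Because the ranking only re-uses a binary label, all relevant objects tie with each other, which limits how fine-grained such a ranking can be.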
  • 12. Supervised Machine Learning • Use incoming link predicates as features – Learn predicates which typically lead to locations • Train a classifier (e.g. Naive Bayes) • 2 Variations: Use all or only observed predicates [Figure: objects o and o′ with incoming predicates p2, p3, p4, p6; one object carries wgs84:lat and wgs84:long values]
  • 13. Supervised Machine Learning • Classification: $P(o \in L) > P(o \notin L)$? → Label = relevant • Ranking: Rank by odds: $O(o \in L) = \frac{P(o \in L)}{P(o \notin L)}$ [Figure: seed s linked via predicates p1, p2, ..., pm to object o; is o a location?]
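
One way to read slides 12 and 13 is as a Bernoulli Naive Bayes model over incoming predicates, sketched below in plain Python. Everything here is an assumption for illustration: the class name `PredicateNaiveBayes`, the Laplace smoothing, the toy predicates, and the interpretation of the two variations (`use_all_predicates=True` also counts absent predicates as evidence, `False` uses only the predicates observed on the object). The slides do not specify these details.

```python
import math
from typing import Dict, List, Set, Tuple

Example = Tuple[Set[str], bool]  # (incoming predicates of o, whether o is in L)


class PredicateNaiveBayes:
    """Bernoulli Naive Bayes over incoming link predicates (illustrative sketch)."""

    def __init__(self, use_all_predicates: bool = True) -> None:
        self.use_all = use_all_predicates
        self.vocab: Set[str] = set()
        self.n: Dict[bool, int] = {True: 0, False: 0}
        self.count: Dict[bool, Dict[str, int]] = {True: {}, False: {}}

    def fit(self, examples: List[Example]) -> "PredicateNaiveBayes":
        for preds, label in examples:
            self.n[label] += 1
            for p in preds:
                self.vocab.add(p)
                self.count[label][p] = self.count[label].get(p, 0) + 1
        return self

    def _p_present(self, p: str, label: bool) -> float:
        # Laplace-smoothed estimate of P(predicate present | class).
        return (self.count[label].get(p, 0) + 1) / (self.n[label] + 2)

    def log_odds(self, preds: Set[str]) -> float:
        """log O(o in L) = log P(o in L) - log P(o not in L)."""
        total = self.n[True] + self.n[False]
        score = math.log((self.n[True] + 1) / (total + 2))
        score -= math.log((self.n[False] + 1) / (total + 2))
        features = self.vocab if self.use_all else (preds & self.vocab)
        for p in features:
            pos, neg = self._p_present(p, True), self._p_present(p, False)
            if p in preds:
                score += math.log(pos) - math.log(neg)
            else:  # only reached in the "all predicates" variation
                score += math.log(1 - pos) - math.log(1 - neg)
        return score

    def classify(self, preds: Set[str]) -> bool:
        # P(o in L) > P(o not in L) is equivalent to positive log-odds.
        return self.log_odds(preds) > 0

    def rank(self, candidates: Dict[str, Set[str]]) -> List[str]:
        # Sorting by log-odds gives the same order as sorting by odds.
        return sorted(candidates, key=lambda o: self.log_odds(candidates[o]), reverse=True)


TRAIN: List[Example] = [
    ({"twinCity", "birthPlace"}, True), ({"birthPlace"}, True), ({"location"}, True),
    ({"author"}, False), ({"author", "publisher"}, False), ({"publisher"}, False),
]

if __name__ == "__main__":
    model = PredicateNaiveBayes(use_all_predicates=True).fit(TRAIN)
    print(model.rank({"ex:a": {"author"}, "ex:b": {"birthPlace"}}))
```

Working in log space avoids numerical underflow when many predicates are multiplied, and it leaves both the classification rule and the ranking order unchanged.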
  • 14. IR Inspired Approaches • Discriminativeness of predicates (inspired by tf-idf) • Property relevance frequency: $\mathrm{prf} = c(p, L)$ • Inverse property frequency: $\mathrm{ipf} = \log\left(\frac{c(*,*)}{c(p,*)}\right)$ • Combine into prf-ipf and prr-ipf (2nd variation: prr is the normalised prf) • Total score ρ: aggregate over all predicates
  • 15. IR Inspired Approaches • Classification: Determine threshold – Nearest centroid • Ranking: Rank by score, e.g. $\rho_{\text{prr-ipf}}(o)$ [Figure: seed s linked via predicates p1, p2, ..., pm to object o; is o a location?]
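
The prf-ipf idea on slides 14 and 15 can be sketched as below. The slides give prf and ipf but leave several details open, so the following choices are assumptions rather than the authors' definitions: c(p, L) is counted over training links whose object is relevant, prr is taken to be c(p, L) / c(p, ∗), the total score ρ is a plain sum over an object's incoming predicates, and the nearest-centroid threshold is the midpoint between the mean scores of the two classes. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, bool]  # (predicate p, whether the link's object is in L)


def predicate_weights(links: Iterable[Link], normalise: bool = False) -> Dict[str, float]:
    """Return prf-ipf weights (or prr-ipf weights if normalise=True) per predicate."""
    links = list(links)
    c_p = Counter(p for p, _ in links)                        # c(p, *)
    c_p_rel = Counter(p for p, relevant in links if relevant)  # c(p, L)
    total = len(links)                                        # c(*, *)
    weights: Dict[str, float] = {}
    for p, n in c_p.items():
        prf = c_p_rel[p] / n if normalise else float(c_p_rel[p])
        ipf = math.log(total / n)
        weights[p] = prf * ipf
    return weights


def score(predicates: Iterable[str], weights: Dict[str, float]) -> float:
    """Total score rho: here simply the sum of the weights of the incoming predicates."""
    return sum(weights.get(p, 0.0) for p in predicates)


def nearest_centroid_threshold(relevant: List[float], irrelevant: List[float]) -> float:
    """In one dimension, nearest centroid reduces to the midpoint of the class means."""
    mean_rel = sum(relevant) / len(relevant)
    mean_irr = sum(irrelevant) / len(irrelevant)
    return (mean_rel + mean_irr) / 2


TOY_LINKS: List[Link] = [
    ("twinCity", True), ("twinCity", True),
    ("birthPlace", True), ("birthPlace", False),
    ("author", False), ("author", False), ("author", False), ("author", False),
]

if __name__ == "__main__":
    w = predicate_weights(TOY_LINKS, normalise=True)
    for obj, preds in {"ex:a": {"author"}, "ex:b": {"twinCity", "birthPlace"}}.items():
        print(obj, score(preds, w))
```

An object would then be labelled relevant when its score lies above the threshold, assuming relevant objects score higher on average than irrelevant ones.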
  • 16. Evaluation • Metrics: – Ranking: ROC curves, AUC – Classification: Precision, Recall, F1, Accuracy • Cross validation: – 10-times / 10-fold – Averages [Figure labels: Seed; Exploration; owl:sameAs; 99,951 entities; 1,728,633 links; 425,338 entities; 128,171 relevant]
  • 17. Performance (Ranking) [Figure: ROC curves, with a zoomed inset covering 0 to 0.05 on one axis and 0.95 to 1 on the other, for: random, Schema Semantics, NB (all predicates), NB (present predicates), prf-ipf, prr-ipf]
  • 18. Performance (Classification & Ranking) Table 2. Average performance of approaches († indicates significant improvements at confidence level ρ = 0.01)

      Method                     Recall     Precision  F1         Accuracy   AUC
      Schema Semantics           0.1188     0.8119     0.2073     0.7262     0.5552
      NB (all predicates)        0.9906     0.9491     † 0.9694   † 0.9812   0.9970
      NB (observed predicates)   0.9943     0.9436     0.9683     0.9804     0.9968
      prf-ipf                    0.8512     † 0.9754   0.9091     0.9487     0.9958
      prr-ipf                    † 0.9973   0.9240     0.9592     0.9745     0.9769

    [Clipped paper text visible on the slide: "... performance in bold. Furthermore, we marked the results where we had a significant ... over the second best method at confidence level of ρ = 0.01. The aggregated ... basically confirm the observations made above. In general, when considering ..."]
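
As a quick plausibility check of the table on slide 18, the F1 column can be recomputed from the Recall and Precision columns with the standard formula F1 = 2PR / (P + R). The sketch below does this for the reported values. The names `REPORTED` and `f1` are made up for the example. Since the reported figures are averages over cross-validation runs, the harmonic mean of the averaged precision and recall need not match the averaged F1 exactly, so small deviations are expected.

```python
from typing import Dict, Tuple

# (recall, precision, reported F1) as listed in the table on slide 18.
REPORTED: Dict[str, Tuple[float, float, float]] = {
    "Schema Semantics": (0.1188, 0.8119, 0.2073),
    "NB (all predicates)": (0.9906, 0.9491, 0.9694),
    "NB (observed predicates)": (0.9943, 0.9436, 0.9683),
    "prf-ipf": (0.8512, 0.9754, 0.9091),
    "prr-ipf": (0.9973, 0.9240, 0.9592),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for method, (recall, precision, reported) in REPORTED.items():
        recomputed = f1(precision, recall)
        print(f"{method:26s} reported={reported:.4f} "
              f"recomputed={recomputed:.4f} diff={abs(recomputed - reported):.5f}")
```

The check also makes the weakness of the schema-based approach visible: its high precision cannot compensate for a recall of about 0.12, which pulls F1 down to roughly 0.21.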
  • 19. Summary • Focused exploration feasible • ML approach performing best • Future work: – Other data sets – Generalise scenario (more than locations) – Better approaches using more features
  • 20. Questions? Thomas Gottron, Institute for Web Science and Technologies, Universität Koblenz-Landau, gottron@uni-koblenz.de