SlideShare a Scribd company logo
1 of 18
Serving DBpedia with DOLCE
More than Just Adding a Cherry on Top
DBpedia Extraction Framework
Mappings Wiki
DOLCE
Heiko Paulheim and Aldo Gangemi
10/13/15 Heiko Paulheim and Aldo Gangemi 2
DOLCE Mappings: Served Since DBpedia 2014
10/13/15 Heiko Paulheim and Aldo Gangemi 3
More than Just a Cherry on Top
• DOLCE adds a layer of formalization
– high level axioms
– additional domain and range restrictions
– fundamental disjointness (e.g., physical object vs. social object)
• Enriches DBpedia ontology
• Can be used for consistency checking
10/13/15 Heiko Paulheim and Aldo Gangemi 4
More than Just a Cherry on Top
Tim Berners-Lee Royal Society
Award
award
range
Description
subclass of
Social Agent
disjoint
with
DBpedia
ontology
DBpedia
instances
DOLCE
ontology
Organisation
is a
Social Person
equivalent class
subclass of
10/13/15 Heiko Paulheim and Aldo Gangemi 5
DBpedia in a Nutshell
• Raw extraction from Wikipedia infoboxes
• Infobox types and keys mapped to ontology
– Crowdsourcing process (aka Mappings Wiki)
– 735 classes
– ~2,800 properties
• Almost no disjointness
– only 24 disjointness axioms
– many of those are corner cases
• MovingWalkway vs. Person
f
10/13/15 Heiko Paulheim and Aldo Gangemi 6
DOLCE in a Nutshell
• A top level ontology
• Defines top level classes and relations
• Including rich axiomatization
10/13/15 Heiko Paulheim and Aldo Gangemi 7
DOLCE in a Nutshell
• Original DOLCE ontologies were too heavy weight
– thus: hardly used on the semantic web
– remember: a little semantics goes a long way!
• DOLCE-Zero
– Simplified version
– Contains both DOLCE and D&S (Descriptions and Situations)
– D&S introduces some high level design patterns
10/13/15 Heiko Paulheim and Aldo Gangemi 8
Systematic vs. Individual Errors in DBpedia
• Systematic errors
– occur frequently following a pattern
– e.g., organizations are frequently used as objects of the relation award
• are likely to have a common root cause
– wrong mapping from infobox to ontology
– error in the extraction code
– ...
Tim Berners-Lee Royal Society
Award
award
range
Description
subclass of
Social Agent
disjoint
with
DBpedia
ontology
DBpedia
instances
DOLCE
ontology
Organisation
is a
Social Person
equivalent class
subclass of
10/13/15 Heiko Paulheim and Aldo Gangemi 9
Overall Workflow
• Overall workflow
• For each statement
– add the statement plus all subject/object types to the ontology
– check consistency
– present inconsistent statements and explanations for inspection
; Reasoning 2
RDF Graph
Statements
+ Types
Inconsistent
Statements
+ explanations
User
10/13/15 Heiko Paulheim and Aldo Gangemi 10
Overall Workflow
• Inspecting a single statement takes 2.6 seconds
– on a standard laptop
• DBpedia 2014 has 15,001,543 statements
– in the dbpedia-owl namespace
→ consistency checking would take 451 days
• Solution
– cache results for signatures (predicate + subject types + object types)
– there are only 34,554 different signatures!
; Reasoning 2
RDF Graph
Statements
+ Types
Inconsistent
Statements
+ explanations
UserCache
10/13/15 Heiko Paulheim and Aldo Gangemi 11
Overall Workflow
• Overall, we find 3,654,255 inconsistent statements (24.4%)
– cf.: only 97,749 (0.7%) without DOLCE
• Too much to inspect!
– We are looking for systematic errors
– Cluster explanations w/ DBSCAN
– Each cluster represents a systematic error
; Reasoning Clustering 2
RDF Graph
Statements
+ Types
Inconsistent
Statements
+ explanations
Clusters
UserCache
10/13/15 Heiko Paulheim and Aldo Gangemi 12
Clustering Inconsistent Statements
• Each explanation as a binary vector
– with 0/1 for all axioms involved in the explanations
– 1,467 axioms (dimensions) in total
• DBSCAN
– Manhattan distance (i.e., number of axioms added/removed)
– MinPts=100 (minimum frequency for a systematic error)
– ε=4 (explanations in a cluster differ by two axioms at most)
10/13/15 Heiko Paulheim and Aldo Gangemi 13
Major Systematic Errors
• Inspection of the top 40 clusters
– they contain 96% of all inconsistent statements
• Overcommitment (19) – using properties in different contexts
– e.g., dbo:team is defined as a relation between persons and
sports teams
– but also used for relating participating teams to events
– fix: relax domain/range constraints, or introduce new properties
• Metonymy (11) – ambiguous language terms
– e.g., instances of dbo:Species contain both species
as well as single animals
– hard to refactor
10/13/15 Heiko Paulheim and Aldo Gangemi 14
Major Systematic Errors
• Misalignment (5) – classes/properties mapped to
the wrong concept in DOLCE
– occasionally occurs if intended and actual use differ
– e.g., dbo:commander is more frequently used with events (e.g., battles)
than military units – d0:hasParticipant rather than d0:coparticipatesWith
– fix: change alignment
• Version branching (3) – semantics of dbo concepts have changed
– e.g., dbo:team in DBpedia 3.9: career stations and teams,
in DBpedia 2014: athletes and teams
– fix: change alignment
10/13/15 Heiko Paulheim and Aldo Gangemi 15
`
A Look at the Long Tail
• DBSCAN identifies clusters and “noise”
– i.e., statements that are not contained in clusters
• Manual inspection of a sample of 100 instances
– 64 are erroneous
– 30 are false negatives (i.e., correct statements)
– 6 are questionable
• Typical error sources in the long tail
– are expected to be cross-cutting, i.e.,
occurring with various classes and properties
10/13/15 Heiko Paulheim and Aldo Gangemi 16
A Look at the Long Tail
• Typical error sources in the long tail
• Link in longer text (23)
– e.g., dbr:Cosmo_Kramer dbo:occupation dbr:Bagel .
• Wrong link in Wikipedia (9)
– e.g., dbr#Stone_(band) dbo:associatedMusicArtist dbr#Dementia .
– Dementia should link to the band, not the disease
• Redirects (7)
– e.g., dbpedia#Ben_Casey dbo:company dbpedia#Bing_Crosby .
– The target Bing_Crosby_Productions redirects to Bing_Crosby
• Links with Anchors (6)
– e.g., dbr:Tim_Berners-Lee dbo:award dbr:Royal_Society .
– the original link target is Royal_Society#Fellows
– anchors are ignored by DBpedia
10/13/15 Heiko Paulheim and Aldo Gangemi 17
Conclusions
• We have shown that
– DOLCE helps identifying inconsistent statements
– Cluster analysis allows for identifying systematic errors
– User interaction is minimized
• we analyzed one statement each from 40 clusters
• corresponding to 3,497,068 affected statements
• Outcomes for future DBpedia versions
– DBpedia ontology changes
– Mapping changes
– DOLCE alignment changes
– Bug reports
Serving DBpedia with DOLCE
More than Just Adding a Cherry on Top
DBpedia Extraction Framework
Mappings Wiki
DOLCE
Heiko Paulheim and Aldo Gangemi

More Related Content

Similar to Serving DBpedia with DOLCE - More Than Just Adding a Cherry on Top

Semantic Web - Ontology 101
Semantic Web - Ontology 101Semantic Web - Ontology 101
Semantic Web - Ontology 101Luigi De Russis
 
ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+ ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+ Prateek Jain
 
Tutorial: Building and using ontologies - E.Simperl - ESWC SS 2014
 Tutorial: Building and using ontologies -  E.Simperl - ESWC SS 2014 Tutorial: Building and using ontologies -  E.Simperl - ESWC SS 2014
Tutorial: Building and using ontologies - E.Simperl - ESWC SS 2014eswcsummerschool
 
Building and using ontologies
Building and using ontologies Building and using ontologies
Building and using ontologies Elena Simperl
 
ESWC SS 2013 - Wednesday Tutorial Elena Simperl: Creating and Using Ontologie...
ESWC SS 2013 - Wednesday Tutorial Elena Simperl: Creating and Using Ontologie...ESWC SS 2013 - Wednesday Tutorial Elena Simperl: Creating and Using Ontologie...
ESWC SS 2013 - Wednesday Tutorial Elena Simperl: Creating and Using Ontologie...eswcsummerschool
 
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - IntroductionOntology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - IntroductionAldo Gangemi
 
The Semantic Web meets the Code of Federal Regulations
The Semantic Web meets the Code of Federal RegulationsThe Semantic Web meets the Code of Federal Regulations
The Semantic Web meets the Code of Federal Regulationstbruce
 
Ontologies: vehicles for reuse
Ontologies: vehicles for reuseOntologies: vehicles for reuse
Ontologies: vehicles for reuseGuus Schreiber
 
Grosof haley-talk-semtech2013-ver6-10-13
Grosof haley-talk-semtech2013-ver6-10-13Grosof haley-talk-semtech2013-ver6-10-13
Grosof haley-talk-semtech2013-ver6-10-13Brian Ulicny
 
RuleML2015 - Tutorial - Powerful Practical Semantic Rules in Rulelog - Funda...
RuleML2015 - Tutorial -  Powerful Practical Semantic Rules in Rulelog - Funda...RuleML2015 - Tutorial -  Powerful Practical Semantic Rules in Rulelog - Funda...
RuleML2015 - Tutorial - Powerful Practical Semantic Rules in Rulelog - Funda...RuleML
 
RDA, FRBR, and FRAD: Connecting the dots
RDA, FRBR, and FRAD: Connecting the dotsRDA, FRBR, and FRAD: Connecting the dots
RDA, FRBR, and FRAD: Connecting the dotsLouise Spiteri
 
20130622 okfn hackathon t2
20130622 okfn hackathon t220130622 okfn hackathon t2
20130622 okfn hackathon t2Seonho Kim
 
SolrSherlock: Linkfinding among Biomolecules with Literature-based Discovery
SolrSherlock: Linkfinding among Biomolecules with Literature-based DiscoverySolrSherlock: Linkfinding among Biomolecules with Literature-based Discovery
SolrSherlock: Linkfinding among Biomolecules with Literature-based DiscoveryJack Park
 
Aidan's PhD Viva
Aidan's PhD VivaAidan's PhD Viva
Aidan's PhD VivaAidan Hogan
 
Temple University Digital Scholarship Center: Model of the Month Club: Septem...
Temple University Digital Scholarship Center: Model of the Month Club: Septem...Temple University Digital Scholarship Center: Model of the Month Club: Septem...
Temple University Digital Scholarship Center: Model of the Month Club: Septem...Liz Rodrigues
 
RDA, FRBR, and FRAD: Connecting the dots
RDA, FRBR, and FRAD: Connecting the dotsRDA, FRBR, and FRAD: Connecting the dots
RDA, FRBR, and FRAD: Connecting the dotsLouise Spiteri
 
Bio ontologies and semantic technologies
Bio ontologies and semantic technologiesBio ontologies and semantic technologies
Bio ontologies and semantic technologiesProf. Wim Van Criekinge
 

Similar to Serving DBpedia with DOLCE - More Than Just Adding a Cherry on Top (20)

Semantic Web - Ontology 101
Semantic Web - Ontology 101Semantic Web - Ontology 101
Semantic Web - Ontology 101
 
ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+ ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+
 
Tutorial: Building and using ontologies - E.Simperl - ESWC SS 2014
 Tutorial: Building and using ontologies -  E.Simperl - ESWC SS 2014 Tutorial: Building and using ontologies -  E.Simperl - ESWC SS 2014
Tutorial: Building and using ontologies - E.Simperl - ESWC SS 2014
 
Building and using ontologies
Building and using ontologies Building and using ontologies
Building and using ontologies
 
ESWC SS 2013 - Wednesday Tutorial Elena Simperl: Creating and Using Ontologie...
ESWC SS 2013 - Wednesday Tutorial Elena Simperl: Creating and Using Ontologie...ESWC SS 2013 - Wednesday Tutorial Elena Simperl: Creating and Using Ontologie...
ESWC SS 2013 - Wednesday Tutorial Elena Simperl: Creating and Using Ontologie...
 
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - IntroductionOntology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
 
The Semantic Web meets the Code of Federal Regulations
The Semantic Web meets the Code of Federal RegulationsThe Semantic Web meets the Code of Federal Regulations
The Semantic Web meets the Code of Federal Regulations
 
Ontologies: vehicles for reuse
Ontologies: vehicles for reuseOntologies: vehicles for reuse
Ontologies: vehicles for reuse
 
Grosof haley-talk-semtech2013-ver6-10-13
Grosof haley-talk-semtech2013-ver6-10-13Grosof haley-talk-semtech2013-ver6-10-13
Grosof haley-talk-semtech2013-ver6-10-13
 
BT02.pptx
BT02.pptxBT02.pptx
BT02.pptx
 
RuleML2015 - Tutorial - Powerful Practical Semantic Rules in Rulelog - Funda...
RuleML2015 - Tutorial -  Powerful Practical Semantic Rules in Rulelog - Funda...RuleML2015 - Tutorial -  Powerful Practical Semantic Rules in Rulelog - Funda...
RuleML2015 - Tutorial - Powerful Practical Semantic Rules in Rulelog - Funda...
 
RDA, FRBR, and FRAD: Connecting the dots
RDA, FRBR, and FRAD: Connecting the dotsRDA, FRBR, and FRAD: Connecting the dots
RDA, FRBR, and FRAD: Connecting the dots
 
NLP & DBpedia
 NLP & DBpedia NLP & DBpedia
NLP & DBpedia
 
20130622 okfn hackathon t2
20130622 okfn hackathon t220130622 okfn hackathon t2
20130622 okfn hackathon t2
 
SolrSherlock: Linkfinding among Biomolecules with Literature-based Discovery
SolrSherlock: Linkfinding among Biomolecules with Literature-based DiscoverySolrSherlock: Linkfinding among Biomolecules with Literature-based Discovery
SolrSherlock: Linkfinding among Biomolecules with Literature-based Discovery
 
Aidan's PhD Viva
Aidan's PhD VivaAidan's PhD Viva
Aidan's PhD Viva
 
Temple University Digital Scholarship Center: Model of the Month Club: Septem...
Temple University Digital Scholarship Center: Model of the Month Club: Septem...Temple University Digital Scholarship Center: Model of the Month Club: Septem...
Temple University Digital Scholarship Center: Model of the Month Club: Septem...
 
RDA, FRBR, and FRAD: Connecting the dots
RDA, FRBR, and FRAD: Connecting the dotsRDA, FRBR, and FRAD: Connecting the dots
RDA, FRBR, and FRAD: Connecting the dots
 
Linked (Open) Data
Linked (Open) DataLinked (Open) Data
Linked (Open) Data
 
Bio ontologies and semantic technologies
Bio ontologies and semantic technologiesBio ontologies and semantic technologies
Bio ontologies and semantic technologies
 

More from Heiko Paulheim

Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...
Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...Heiko Paulheim
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfHeiko Paulheim
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vecHeiko Paulheim
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vecHeiko Paulheim
 
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsKnowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsHeiko Paulheim
 
From Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsFrom Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsHeiko Paulheim
 
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Heiko Paulheim
 
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids  on the Knowledge Graph BlockBeyond DBpedia and YAGO – The New Kids  on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph BlockHeiko Paulheim
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...Heiko Paulheim
 
Machine Learning & Embeddings for Large Knowledge Graphs
Machine Learning & Embeddings  for Large Knowledge GraphsMachine Learning & Embeddings  for Large Knowledge Graphs
Machine Learning & Embeddings for Large Knowledge GraphsHeiko Paulheim
 
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge GraphFrom Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge GraphHeiko Paulheim
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...Heiko Paulheim
 
Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Heiko Paulheim
 
Machine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge GraphsMachine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge GraphsHeiko Paulheim
 
Weakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterWeakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterHeiko Paulheim
 
Towards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingTowards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingHeiko Paulheim
 
Knowledge Graphs on the Web
Knowledge Graphs on the WebKnowledge Graphs on the Web
Knowledge Graphs on the WebHeiko Paulheim
 
Gathering Alternative Surface Forms for DBpedia Entities
Gathering Alternative Surface Forms for DBpedia EntitiesGathering Alternative Surface Forms for DBpedia Entities
Gathering Alternative Surface Forms for DBpedia EntitiesHeiko Paulheim
 
What the Adoption of schema.org Tells about Linked Open Data
What the Adoption of schema.org Tells about Linked Open DataWhat the Adoption of schema.org Tells about Linked Open Data
What the Adoption of schema.org Tells about Linked Open DataHeiko Paulheim
 

More from Heiko Paulheim (20)

Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...
Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdf
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vec
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vec
 
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsKnowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
 
From Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsFrom Wikis to Knowledge Graphs
From Wikis to Knowledge Graphs
 
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
 
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids  on the Knowledge Graph BlockBeyond DBpedia and YAGO – The New Kids  on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph Block
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
 
Machine Learning & Embeddings for Large Knowledge Graphs
Machine Learning & Embeddings  for Large Knowledge GraphsMachine Learning & Embeddings  for Large Knowledge Graphs
Machine Learning & Embeddings for Large Knowledge Graphs
 
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge GraphFrom Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
 
Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Make Embeddings Semantic Again!
Make Embeddings Semantic Again!
 
How much is a Triple?
How much is a Triple?How much is a Triple?
How much is a Triple?
 
Machine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge GraphsMachine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge Graphs
 
Weakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterWeakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on Twitter
 
Towards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingTowards Knowledge Graph Profiling
Towards Knowledge Graph Profiling
 
Knowledge Graphs on the Web
Knowledge Graphs on the WebKnowledge Graphs on the Web
Knowledge Graphs on the Web
 
Gathering Alternative Surface Forms for DBpedia Entities
Gathering Alternative Surface Forms for DBpedia EntitiesGathering Alternative Surface Forms for DBpedia Entities
Gathering Alternative Surface Forms for DBpedia Entities
 
What the Adoption of schema.org Tells about Linked Open Data
What the Adoption of schema.org Tells about Linked Open DataWhat the Adoption of schema.org Tells about Linked Open Data
What the Adoption of schema.org Tells about Linked Open Data
 

Recently uploaded

Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一F La
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 

Recently uploaded (20)

Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 

Serving DBpedia with DOLCE - More Than Just Adding a Cherry on Top

  • 1. Serving DBpedia with DOLCE More than Just Adding a Cherry on Top DBpedia Extraction Framework Mappings Wiki DOLCE Heiko Paulheim and Aldo Gangemi
  • 2. 10/13/15 Heiko Paulheim and Aldo Gangemi 2 DOLCE Mappings: Served Since DBpedia 2014
  • 3. 10/13/15 Heiko Paulheim and Aldo Gangemi 3 More than Just a Cherry on Top • DOLCE adds a layer of formalization – high level axioms – additional domain and range restrictions – fundamental disjointness (e.g., physical object vs. social object) • Enriches DBpedia ontology • Can be used for consistency checking
  • 4. 10/13/15 Heiko Paulheim and Aldo Gangemi 4 More than Just a Cherry on Top Tim Berners-Lee Royal Society Award award range Description subclass of Social Agent disjoint with DBpedia ontology DBpedia instances DOLCE ontology Organisation is a Social Person equivalent class subclass of
  • 5. 10/13/15 Heiko Paulheim and Aldo Gangemi 5 DBpedia in a Nutshell • Raw extraction from Wikipedia infoboxes • Infobox types and keys mapped to ontology – Crowdsourcing process (aka Mappings Wiki) – 735 classes – ~2,800 properties • Almost no disjointness – only 24 disjointness axioms – many of those are corner cases • MovingWalkway vs. Person f
  • 6. 10/13/15 Heiko Paulheim and Aldo Gangemi 6 DOLCE in a Nutshell • A top level ontology • Defines top level classes and relations • Including rich axiomatization
  • 7. 10/13/15 Heiko Paulheim and Aldo Gangemi 7 DOLCE in a Nutshell • Original DOLCE ontologies were too heavy weight – thus: hardly used on the semantic web – remember: a little semantics goes a long way! • DOLCE-Zero – Simplified version – Contains both DOLCE and D&S (Descriptions and Situations) – D&S introduces some high level design patterns
  • 8. 10/13/15 Heiko Paulheim and Aldo Gangemi 8 Systematic vs. Individual Errors in DBpedia • Systematic errors – occur frequently following a pattern – e.g., organizations are frequently used as objects of the relation award • are likely to have a common root cause – wrong mapping from infobox to ontology – error in the extraction code – ... Tim Berners-Lee Royal Society Award award range Description subclass of Social Agent disjoint with DBpedia ontology DBpedia instances DOLCE ontology Organisation is a Social Person equivalent class subclass of
  • 9. 10/13/15 Heiko Paulheim and Aldo Gangemi 9 Overall Workflow • Overall workflow • For each statement – add the statement plus all subject/object types to the ontology – check consistency – present inconsistent statements and explanations for inspection ; Reasoning 2 RDF Graph Statements + Types Inconsistent Statements + explanations User
  • 10. 10/13/15 Heiko Paulheim and Aldo Gangemi 10 Overall Workflow • Inspecting a single statement takes 2.6 seconds – on a standard laptop • DBpedia 2014 has 15,001,543 statements – in the dbpedia-owl namespace → consistency checking would take 451 days • Solution – cache results for signatures (predicate + subject types + object types) – there are only 34,554 different signatures! ; Reasoning 2 RDF Graph Statements + Types Inconsistent Statements + explanations UserCache
  • 11. 10/13/15 Heiko Paulheim and Aldo Gangemi 11 Overall Workflow • Overall, we find 3,654,255 inconsistent statements (24.4%) – cf.: only 97,749 (0.7%) without DOLCE • Too much to inspect! – We are looking for systematic errors – Cluster explanations w/ DBSCAN – Each cluster represents a systematic error ; Reasoning Clustering 2 RDF Graph Statements + Types Inconsistent Statements + explanations Clusters UserCache
  • 12. 10/13/15 Heiko Paulheim and Aldo Gangemi 12 Clustering Inconsistent Statements • Each explanation as a binary vector – with 0/1 for all axioms involved in the explanations – 1,467 axioms (dimensions) in total • DBSCAN – Manhattan distance (i.e., number of axioms added/removed) – MinPts=100 (minimum frequency for a systematic error) – ε=4 (explanations in a cluster differ by two axioms at most)
  • 13. 10/13/15 Heiko Paulheim and Aldo Gangemi 13 Major Systematic Errors • Inspection of the top 40 clusters – they contain 96% of all inconsistent statements • Overcommitment (19) – using properties in different contexts – e.g., dbo:team is defined as a relation between persons and sports teams – but also used for relating participating teams to events – fix: relax domain/range constraints, or introduce new properties • Metonymy (11) – ambiguous language terms – e.g., instances of dbo:Species contain both species as well as single animals – hard to refactor
  • 14. 10/13/15 Heiko Paulheim and Aldo Gangemi 14 Major Systematic Errors • Misalignment (5) – classes/properties mapped to the wrong concept in DOLCE – occasionally occurs if intended and actual use differ – e.g., dbo:commander is more frequently used with events (e.g., battles) than military units – d0:hasParticipant rather than d0:coparticipatesWith – fix: change alignment • Version branching (3) – semantics of dbo concepts have changed – e.g., dbo:team in DBpedia 3.9: career stations and teams, in DBpedia 2014: athletes and teams – fix: change alignment
  • 15. 10/13/15 Heiko Paulheim and Aldo Gangemi 15 ` A Look at the Long Tail • DBSCAN identifies clusters and “noise” – i.e., statements that are not contained in clusters • Manual inspection of a sample of 100 instances – 64 are erroneous – 30 are false negatives (i.e., correct statements) – 6 are questionable • Typical error sources in the long tail – are expected to be cross-cutting, i.e., occurring with various classes and properties
  • 16. 10/13/15 Heiko Paulheim and Aldo Gangemi 16 A Look at the Long Tail • Typical error sources in the long tail • Link in longer text (23) – e.g., dbr:Cosmo_Kramer dbo:occupation dbr:Bagel . • Wrong link in Wikipedia (9) – e.g., dbr#Stone_(band) dbo:associatedMusicArtist dbr#Dementia . – Dementia should link to the band, not the disease • Redirects (7) – e.g., dbpedia#Ben_Casey dbo:company dbpedia#Bing_Crosby . – The target Bing_Crosby_Productions redirects to Bing_Crosby • Links with Anchors (6) – e.g., dbr:Tim_Berners-Lee dbo:award dbr:Royal_Society . – the original link target is Royal_Society#Fellows – anchors are ignored by DBpedia
  • 17. 10/13/15 Heiko Paulheim and Aldo Gangemi 17 Conclusions • We have shown that – DOLCE helps identifying inconsistent statements – Cluster analysis allows for identifying systematic errors – User interaction is minimized • we analyzed one statement each from 40 clusters • corresponding to 3,497,068 affected statements • Outcomes for future DBpedia versions – DBpedia ontology changes – Mapping changes – DOLCE alignment changes – Bug reports
  • 18. Serving DBpedia with DOLCE More than Just Adding a Cherry on Top DBpedia Extraction Framework Mappings Wiki DOLCE Heiko Paulheim and Aldo Gangemi