An Analysis of the Quality Issues of the Properties Available in the Spanish DBpedia
Nandana Mihindukulasooriya, Mariano Rico,
Raúl García-Castro, and Asunción Gómez-Pérez
Ontology Engineering Group (OEG)
Departamento de Inteligencia Artificial
Escuela Técnica Superior de Ingenieros Informáticos
Universidad Politécnica de Madrid
Acknowledgments:
4V (TIN2013-46238-C4-2-R) and LIDER (EU FP7 610782) projects
http://loupe.linkeddata.es
Collaborative editing in Wikipedia
Ontology Engineering Group, Universidad Politécnica de Madrid
Spontaneous data model creation
Can spontaneous data models support us in data quality assessment?
But first, what is the quality of such spontaneous data models?
DBpedia – Exposing Wikipedia’s content as Linked Data
[Figure: diagram with the labels "RDF triple store" and "Rendering"]
esDBpedia – the Spanish DBpedia chapter
http://es.dbpedia.org/
Can esDBpedia’s spontaneous data model support us in data quality assessment?
But first, what is the quality of the properties of such a spontaneous data model?
Quality Dimensions for Datasets
A. Conciseness. A dataset does not contain redundant concepts with different identifiers.
B. Consistency. A dataset does not contain conflicting or contradictory data.
C. Syntactic Validity. Values belong to the legal value range for the represented domain and do not violate the syntactic rules.
D. Semantic Accuracy. Values correctly represent real-world facts.
Extraction and inspection of property statistics
http://loupe.linkeddata.es/
Information extracted about properties
Property statistics template          Example data

General information
  URI                                 http://es.dbpedia.org/property/edad
  Local name                          edad
  Namespace                           http://es.dbpedia.org/property/
  Number of triples                   4623

Subject analysis
  IRI subject count                   4623 (100 %)
  Extracted domain classes (i.e., ?subject a ?class)
    dbpedia-owl:Agent                 2611 (56.48 %)
    schema:Person                     1515 (32.77 %)
    …

Object analysis
  URI object count                    186 (4.02 %)
  Extracted range classes (i.e., ?object a ?class)
    skos:Concept                      17 (9.14 %)
    schema:Place                      2 (1.08 %)
    …
  Literal object count                4437 (95.98 %)
  Numerical object count              2491 (53.88 %)
  Integer object count                2382 (51.52 %)
  Average of numerics                 3.53
  Max numeric sample                  8.79E11, 1.5E8, 1.5E7, 8.2E6, 8121540
  Min numeric sample                  -5, 0, 1, 1.08, 1.2
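
The counts in the template above could be derived along the following lines. This is only a minimal sketch, not how Loupe computes them: the function names, the `(value, is_iri)` input pairs, and the regular expressions used to recognise numbers are all assumptions made for illustration, and the sample values are hypothetical.

```python
import re
from collections import Counter

# Assumed patterns for "numeric" and "integer" lexical forms.
NUMERIC = re.compile(r"^[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$")
INTEGER = re.compile(r"^[+-]?\d+$")


def profile_objects(objects):
    """Count IRI, literal, numeric and integer objects of one property.

    `objects` is an iterable of (value, is_iri) pairs; the flag stands in
    for the node type that an RDF store would report.
    """
    counts = Counter()
    for value, is_iri in objects:
        if is_iri:
            counts["iri"] += 1
            continue
        counts["literal"] += 1
        text = value.strip()
        if NUMERIC.match(text):
            counts["numeric"] += 1
            if INTEGER.match(text):
                counts["integer"] += 1
    return counts


def share(part, total):
    """Percentage rounded to two decimals, as shown in the table."""
    return round(100 * part / total, 2)


# Hypothetical toy objects for a single property.
sample = [
    ("http://es.dbpedia.org/resource/Madrid", True),
    ("25", False),
    ("1.08", False),
    ("unknown", False),
]
stats = profile_objects(sample)
print(stats["iri"], stats["literal"], stats["numeric"], stats["integer"])

# Re-deriving one percentage of the table from its raw counts.
print(share(2611, 4623))
```

The `share` helper reproduces the percentages in the table from the raw counts, which is a quick way to check a transcribed profile for typos.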
Properties in esDBpedia
Property prefix                               Properties          Property values
                                              #        %          #             %
http://es.dbpedia.org/property/               19,885   52.53      18,021,389    10.66
http://dbpedia.org/property/                  17,188   45.40      9,742,710     5.76
http://dbpedia.org/ontology/                  576      1.52       86,602,281    51.21
http://xmlns.com/foaf/0.1/                    12       0.03       8,132,328     4.81
http://www.w3.org/1999/02/22-rdf-syntax-ns#   8        0.02       12,298,451    7.27
http://www.w3.org/2000/01/rdf-schema#         7        0.02       5,366,982     3.17
http://www.w3.org/2002/07/owl#                6        0.02       16,523,751    9.77
http://purl.org/dc/terms/#                    4        0.01       4,148,399     2.45
http://www.w3.org/2004/02/skos/core#          4        0.01       1,153,685     0.68
http://purl.org/dc/elements/1.1/              3        0.01       3,346,874     1.98
http://www.w3.org/ns/prov#                    1        0.00       2,853,681     1.69
Other prefixes                                163      0.43       911,131       0.54
Total                                         37,857   100        169,101,662   100
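
As a worked check of the table, the short sketch below recomputes the property shares from the raw counts. The dictionary and the `percent` helper are illustrative names only. The combined share of the two `/property/` namespaces comes out at 97.93%, which appears to correspond to the auto-generated share quoted under Conciseness.

```python
# Property counts copied from the table above.
properties_by_prefix = {
    "http://es.dbpedia.org/property/": 19885,
    "http://dbpedia.org/property/": 17188,
    "http://dbpedia.org/ontology/": 576,
}
TOTAL_PROPERTIES = 37857


def percent(count, total=TOTAL_PROPERTIES):
    """Share of all properties, rounded to two decimals."""
    return round(100 * count / total, 2)


# Combined share of the two /property/ namespaces.
infobox_share = percent(
    properties_by_prefix["http://es.dbpedia.org/property/"]
    + properties_by_prefix["http://dbpedia.org/property/"]
)
print(infobox_share)
```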
A. Conciseness
• Many redundant properties in esDBpedia
• 97.93% of the properties are auto-generated
• Causes
• Capitalization (857): partidosEnPrimera, partidosenprimera
• Synonyms: causaDeMuerte, causaDeFallecimiento
• Prepositions: causaDeFallecimiento, causaFallecimiento
• Spelling (7,495): apeliido, apelldio, apellid
• Singular/plural: apellido, apellidos
• Gender: administrador, administradora
• Accent usage (1,252): administracion, administración
• Parsing (107): altitudMin/máx, residencia/trabajo, idioma/s
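
Two of the causes above, capitalization and accent usage, can be detected by grouping property local names under a normalized key. The following is a minimal sketch under that assumption, not the method used in the study; the function names are hypothetical. Synonyms, spelling errors and singular/plural variants would need other techniques (for example, edit distance or stemming) and are not covered here.

```python
import unicodedata
from collections import defaultdict


def normalize(local_name):
    """Fold case and strip accents so trivially different names collide."""
    decomposed = unicodedata.normalize("NFKD", local_name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return no_accents.casefold()


def redundant_groups(local_names):
    """Group property names that share the same normalized key."""
    groups = defaultdict(set)
    for name in local_names:
        groups[normalize(name)].add(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Local names taken from the examples on this slide.
names = [
    "partidosEnPrimera", "partidosenprimera",
    "administracion", "administración",
    "apellido", "apellidos",
]
groups = redundant_groups(names)
for key, members in groups.items():
    print(key, members)
```

Here `apellido` and `apellidos` stay in separate groups, since the normalization deliberately handles only case and accents.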
B. Consistency
• OWL properties with IRI and literal values
• 3,380 properties
• Strings and IRIs used interchangeably as values
• esdbpedia:lugarDeEntierro
• "Madrid"@es
• http://es.dbpedia.org/resource/Madrid
• Diverse and incorrect domain and range types
• esdbpedia:edad has range of type dbo:Place
• esdbpedia:lugarmuerte has range of type dbo:Person
• esdbpedia:pais has range of type dbo:Actor
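
The first inconsistency, properties whose values are sometimes IRIs and sometimes literals, can be sketched as a simple scan over triples. This is an illustration under stated assumptions: the four-element tuple layout, the boolean `is_iri` flag, and the subject identifiers are hypothetical and stand in for what an RDF library or SPARQL endpoint would provide.

```python
from collections import defaultdict


def mixed_value_properties(triples):
    """Return properties whose objects include both IRIs and literals.

    Each triple is (subject, property, object, object_is_iri).
    """
    kinds = defaultdict(set)
    for _subject, prop, _obj, is_iri in triples:
        kinds[prop].add("iri" if is_iri else "literal")
    return sorted(p for p, seen in kinds.items() if seen == {"iri", "literal"})


# Hypothetical triples built around the lugarDeEntierro example above.
triples = [
    ("es:Persona1", "esdbpedia:lugarDeEntierro", "Madrid", False),
    ("es:Persona2", "esdbpedia:lugarDeEntierro",
     "http://es.dbpedia.org/resource/Madrid", True),
    ("es:Persona1", "esdbpedia:edad", "42", False),
]
mixed = mixed_value_properties(triples)
print(mixed)
```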
C. Syntactic Validity
• IRIs represented as strings
• Many properties with IRI values and very few string values
• Common cause:
• IRIs encoded as strings -> “http://...”@es
• Numerical values represented as strings
• 3,675 properties with more than 99% integer objects and very few string literals
• Common cause:
• Numerics encoded as strings -> “2^^xsd:integer”
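
Both common causes above leave a recognisable pattern in the string value itself, so a first pass could classify string literals with regular expressions. The patterns and category labels below are assumptions of this sketch, not rules taken from the study.

```python
import re

# Assumed patterns: an http(s) IRI, an integer with its datatype fused
# into the string, and a bare integer.
IRI_LIKE = re.compile(r"^https?://\S+$")
TYPED_IN_STRING = re.compile(r"^[+-]?\d+\^\^xsd:integer$")
PLAIN_INTEGER = re.compile(r"^[+-]?\d+$")


def classify_string_literal(text):
    """Label a string literal as a disguised IRI, a disguised integer, or a string."""
    text = text.strip()
    if IRI_LIKE.match(text):
        return "iri-as-string"
    if TYPED_IN_STRING.match(text) or PLAIN_INTEGER.match(text):
        return "integer-as-string"
    return "string"


for value in ["http://es.dbpedia.org/resource/Madrid", "2^^xsd:integer", "Madrid"]:
    print(value, classify_string_literal(value))
```

A property where almost all objects are typed integers and the few remaining strings are classified as `integer-as-string` is a likely candidate for the cleaning step.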
D. Semantic Accuracy
• Outliers
• Numerical values allow an automatic analysis
• Properties such as diameter or edad with negative values
• Other inaccuracies are harder to detect automatically
• Our plan is to use data fusion approaches to compare values from multiple sources
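
For numerical properties, the automatic analysis can be as simple as a range check. The sketch below applies one to the minimum and maximum samples reported for `edad` in the property statistics slide. The bounds of 0 and 130 are assumptions chosen for illustration, and the function name is hypothetical.

```python
def out_of_range(values, low, high):
    """Return the values that fall outside the closed interval [low, high]."""
    return [v for v in values if v < low or v > high]


# Samples reported for esdbpedia:edad in the property statistics slide.
edad_min_sample = [-5, 0, 1, 1.08, 1.2]
edad_max_sample = [8.79e11, 1.5e8, 1.5e7, 8.2e6, 8121540]

# Assumed plausible range for an age in years.
suspicious = out_of_range(edad_min_sample + edad_max_sample, low=0, high=130)
print(suspicious)
```

A check like this only flags candidates. Whether a flagged value is actually wrong still depends on what the property means for each subject.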
Conclusions and future work
• DBpedia’s spontaneous data model can support quality assessment
• Errors in DBpedia are introduced in many stages
• Crowd-sourced data
• Mappings
• Extraction framework
• Some errors can be eliminated with pre-processing and cleaning
• Quality assessment is currently semi-automatic
• We are currently working towards its automation
• We plan to investigate whether the quality issues are the same in other DBpedia instances
Questions?
http://loupe.linkeddata.es/

More Related Content

What's hot

Making project data avalialble eNanomapper through Database
Making project data avalialble eNanomapper through  DatabaseMaking project data avalialble eNanomapper through  Database
Making project data avalialble eNanomapper through DatabaseNina Jeliazkova
 
Making Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data WarehouseMaking Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data WarehouseJustin Clark-Casey
 
Opportunities in chemical structure standardization
Opportunities in chemical structure standardizationOpportunities in chemical structure standardization
Opportunities in chemical structure standardizationValery Tkachenko
 
Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Valery Tkachenko
 
Using Public RDF Resources in Neo4j
Using Public RDF Resources in Neo4jUsing Public RDF Resources in Neo4j
Using Public RDF Resources in Neo4jNeo4j
 
Tools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesTools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesValery Tkachenko
 
osm.cs.byu.edu
osm.cs.byu.eduosm.cs.byu.edu
osm.cs.byu.edubutest
 

What's hot (10)

Making project data avalialble eNanomapper through Database
Making project data avalialble eNanomapper through  DatabaseMaking project data avalialble eNanomapper through  Database
Making project data avalialble eNanomapper through Database
 
Making Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data WarehouseMaking Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data Warehouse
 
Opportunities in chemical structure standardization
Opportunities in chemical structure standardizationOpportunities in chemical structure standardization
Opportunities in chemical structure standardization
 
Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0
 
The Catalan Research portal: ready to go
The Catalan Research portal: ready to goThe Catalan Research portal: ready to go
The Catalan Research portal: ready to go
 
Using Public RDF Resources in Neo4j
Using Public RDF Resources in Neo4jUsing Public RDF Resources in Neo4j
Using Public RDF Resources in Neo4j
 
Tools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesTools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databases
 
osm.cs.byu.edu
osm.cs.byu.eduosm.cs.byu.edu
osm.cs.byu.edu
 
OKE2018 Challenge @ ESWC2018
OKE2018 Challenge @ ESWC2018OKE2018 Challenge @ ESWC2018
OKE2018 Challenge @ ESWC2018
 
Geo linked data lstd10(v2-boris)
Geo linked data lstd10(v2-boris)Geo linked data lstd10(v2-boris)
Geo linked data lstd10(v2-boris)
 

Viewers also liked

Similarity in Wikipedia Articles (EDBT Summer School)
Similarity in Wikipedia Articles (EDBT Summer School)Similarity in Wikipedia Articles (EDBT Summer School)
Similarity in Wikipedia Articles (EDBT Summer School)dgarijo
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?Frank van Harmelen
 
Presentación de la red de excelencia de Open Data y Smart Cities
Presentación de la red de excelencia de Open Data y Smart CitiesPresentación de la red de excelencia de Open Data y Smart Cities
Presentación de la red de excelencia de Open Data y Smart CitiesOscar Corcho
 
2014 nicta-reproducibility
2014 nicta-reproducibility2014 nicta-reproducibility
2014 nicta-reproducibilityc.titus.brown
 
10 Recommendations from the Reproducibility Crisis in Psychological Science
10 Recommendations from the Reproducibility Crisis in Psychological Science10 Recommendations from the Reproducibility Crisis in Psychological Science
10 Recommendations from the Reproducibility Crisis in Psychological ScienceJimGrange
 
Leveraging Wikipedia-based Features for Entity Relatedness and Recommendations
Leveraging Wikipedia-based Features for Entity Relatedness and RecommendationsLeveraging Wikipedia-based Features for Entity Relatedness and Recommendations
Leveraging Wikipedia-based Features for Entity Relatedness and RecommendationsNitish Aggarwal
 
Using semantics and NLP in experimental protocols
Using semantics and NLP in experimental protocolsUsing semantics and NLP in experimental protocols
Using semantics and NLP in experimental protocolsOlga Ximena Giraldo
 
The role of annotation in reproducibility (Empirical 2014)
The role of annotation in reproducibility (Empirical 2014)The role of annotation in reproducibility (Empirical 2014)
The role of annotation in reproducibility (Empirical 2014)Oscar Corcho
 
Latinoamericana8(2) 4
Latinoamericana8(2) 4Latinoamericana8(2) 4
Latinoamericana8(2) 4VicenteMarMar
 
Das energiekonzept der bundesregierung (na) im vergleich
Das energiekonzept der bundesregierung (na) im vergleich Das energiekonzept der bundesregierung (na) im vergleich
Das energiekonzept der bundesregierung (na) im vergleich metropolsolar
 
Group Awareness Guidelines for Computer Supported Colllaborative Work & Learning
Group Awareness Guidelines for Computer Supported Colllaborative Work & LearningGroup Awareness Guidelines for Computer Supported Colllaborative Work & Learning
Group Awareness Guidelines for Computer Supported Colllaborative Work & LearningNiki Lambropoulos PhD
 
Tour Of England - Student Example (2)
Tour Of England - Student Example (2)Tour Of England - Student Example (2)
Tour Of England - Student Example (2)S Rackley
 
National Poetry Month 11
National Poetry Month 11National Poetry Month 11
National Poetry Month 11Neelima addanki
 
탄소배출 유료시대
탄소배출 유료시대탄소배출 유료시대
탄소배출 유료시대Ahnku Toh
 
Socialcast Return Over Influence Web2.0 Coutinho
Socialcast Return Over Influence Web2.0 CoutinhoSocialcast Return Over Influence Web2.0 Coutinho
Socialcast Return Over Influence Web2.0 CoutinhoMarcelo Coutinho Lima
 
ECナビ Lightning Talk(s)
ECナビ Lightning Talk(s)ECナビ Lightning Talk(s)
ECナビ Lightning Talk(s)moai kids
 

Viewers also liked (20)

Similarity in Wikipedia Articles (EDBT Summer School)
Similarity in Wikipedia Articles (EDBT Summer School)Similarity in Wikipedia Articles (EDBT Summer School)
Similarity in Wikipedia Articles (EDBT Summer School)
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?
 
Phd
PhdPhd
Phd
 
Presentación de la red de excelencia de Open Data y Smart Cities
Presentación de la red de excelencia de Open Data y Smart CitiesPresentación de la red de excelencia de Open Data y Smart Cities
Presentación de la red de excelencia de Open Data y Smart Cities
 
2014 nicta-reproducibility
2014 nicta-reproducibility2014 nicta-reproducibility
2014 nicta-reproducibility
 
10 Recommendations from the Reproducibility Crisis in Psychological Science
10 Recommendations from the Reproducibility Crisis in Psychological Science10 Recommendations from the Reproducibility Crisis in Psychological Science
10 Recommendations from the Reproducibility Crisis in Psychological Science
 
Leveraging Wikipedia-based Features for Entity Relatedness and Recommendations
Leveraging Wikipedia-based Features for Entity Relatedness and RecommendationsLeveraging Wikipedia-based Features for Entity Relatedness and Recommendations
Leveraging Wikipedia-based Features for Entity Relatedness and Recommendations
 
Using semantics and NLP in experimental protocols
Using semantics and NLP in experimental protocolsUsing semantics and NLP in experimental protocols
Using semantics and NLP in experimental protocols
 
The role of annotation in reproducibility (Empirical 2014)
The role of annotation in reproducibility (Empirical 2014)The role of annotation in reproducibility (Empirical 2014)
The role of annotation in reproducibility (Empirical 2014)
 
Latinoamericana8(2) 4
Latinoamericana8(2) 4Latinoamericana8(2) 4
Latinoamericana8(2) 4
 
MLPZ 04
MLPZ 04MLPZ 04
MLPZ 04
 
Das energiekonzept der bundesregierung (na) im vergleich
Das energiekonzept der bundesregierung (na) im vergleich Das energiekonzept der bundesregierung (na) im vergleich
Das energiekonzept der bundesregierung (na) im vergleich
 
Group Awareness Guidelines for Computer Supported Colllaborative Work & Learning
Group Awareness Guidelines for Computer Supported Colllaborative Work & LearningGroup Awareness Guidelines for Computer Supported Colllaborative Work & Learning
Group Awareness Guidelines for Computer Supported Colllaborative Work & Learning
 
Tour Of England - Student Example (2)
Tour Of England - Student Example (2)Tour Of England - Student Example (2)
Tour Of England - Student Example (2)
 
National Poetry Month 11
National Poetry Month 11National Poetry Month 11
National Poetry Month 11
 
National Poetry Month 9
National Poetry Month 9National Poetry Month 9
National Poetry Month 9
 
탄소배출 유료시대
탄소배출 유료시대탄소배출 유료시대
탄소배출 유료시대
 
Making a rainbow
Making a rainbowMaking a rainbow
Making a rainbow
 
Socialcast Return Over Influence Web2.0 Coutinho
Socialcast Return Over Influence Web2.0 CoutinhoSocialcast Return Over Influence Web2.0 Coutinho
Socialcast Return Over Influence Web2.0 Coutinho
 
ECナビ Lightning Talk(s)
ECナビ Lightning Talk(s)ECナビ Lightning Talk(s)
ECナビ Lightning Talk(s)
 

Similar to An analysis of the quality issues of the properties available in the Spanish DBpedia

Loupe API - A Linked Data Profiling Service for Quality Assessment
Loupe API - A Linked Data Profiling Service for Quality AssessmentLoupe API - A Linked Data Profiling Service for Quality Assessment
Loupe API - A Linked Data Profiling Service for Quality AssessmentNandana Mihindukulasooriya
 
Enabling ontology based streaming data access final
Enabling ontology based streaming data access finalEnabling ontology based streaming data access final
Enabling ontology based streaming data access finalJean-Paul Calbimonte
 
Metadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data RepositoriesMetadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data Repositoriesandrea huang
 
Linq 2013 plenary_keynote_sicilia
Linq 2013 plenary_keynote_siciliaLinq 2013 plenary_keynote_sicilia
Linq 2013 plenary_keynote_siciliaLINQ_Conference
 
Miso-McGill
Miso-McGillMiso-McGill
Miso-McGillmiso_uam
 
Knowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything ProjectKnowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything ProjectEnrico Daga
 
The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects Carole Goble
 
The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...João Rocha da Silva
 
Integrating a Domain Ontology Development Environment and an Ontology Search ...
Integrating a Domain Ontology Development Environment and an Ontology Search ...Integrating a Domain Ontology Development Environment and an Ontology Search ...
Integrating a Domain Ontology Development Environment and an Ontology Search ...Takeshi Morita
 
One day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic WebOne day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic WebVictor de Boer
 
Curriculum data enrichment with ontologies
Curriculum data enrichment with ontologiesCurriculum data enrichment with ontologies
Curriculum data enrichment with ontologiesILOT Project
 
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...Mariano Rodriguez-Muro
 
2020 Vision (Dubious Design Decisions)
2020 Vision (Dubious Design Decisions)2020 Vision (Dubious Design Decisions)
2020 Vision (Dubious Design Decisions)Alex Henderson
 
Big Data Analytics course: Named Entities and Deep Learning for NLP
Big Data Analytics course: Named Entities and Deep Learning for NLPBig Data Analytics course: Named Entities and Deep Learning for NLP
Big Data Analytics course: Named Entities and Deep Learning for NLPChristian Morbidoni
 
The ENCODE Portal REST API
The ENCODE Portal REST API The ENCODE Portal REST API
The ENCODE Portal REST API ENCODE-DCC
 
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...University of California, San Diego
 

Similar to An analysis of the quality issues of the properties available in the Spanish DBpedia (20)

Loupe API - A Linked Data Profiling Service for Quality Assessment
Loupe API - A Linked Data Profiling Service for Quality AssessmentLoupe API - A Linked Data Profiling Service for Quality Assessment
Loupe API - A Linked Data Profiling Service for Quality Assessment
 
Enabling ontology based streaming data access final
Enabling ontology based streaming data access finalEnabling ontology based streaming data access final
Enabling ontology based streaming data access final
 
Weso research group
Weso research groupWeso research group
Weso research group
 
Metadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data RepositoriesMetadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data Repositories
 
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...
 
Linq 2013 plenary_keynote_sicilia
Linq 2013 plenary_keynote_siciliaLinq 2013 plenary_keynote_sicilia
Linq 2013 plenary_keynote_sicilia
 
Miso-McGill
Miso-McGillMiso-McGill
Miso-McGill
 
Knowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything ProjectKnowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything Project
 
The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects
 
The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...
 
Integrating a Domain Ontology Development Environment and an Ontology Search ...
Integrating a Domain Ontology Development Environment and an Ontology Search ...Integrating a Domain Ontology Development Environment and an Ontology Search ...
Integrating a Domain Ontology Development Environment and an Ontology Search ...
 
CV Aritra 08-2020
CV Aritra 08-2020CV Aritra 08-2020
CV Aritra 08-2020
 
One day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic WebOne day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic Web
 
Curriculum data enrichment with ontologies
Curriculum data enrichment with ontologiesCurriculum data enrichment with ontologies
Curriculum data enrichment with ontologies
 
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
 
2020 Vision (Dubious Design Decisions)
2020 Vision (Dubious Design Decisions)2020 Vision (Dubious Design Decisions)
2020 Vision (Dubious Design Decisions)
 
Big Data Analytics course: Named Entities and Deep Learning for NLP
Big Data Analytics course: Named Entities and Deep Learning for NLPBig Data Analytics course: Named Entities and Deep Learning for NLP
Big Data Analytics course: Named Entities and Deep Learning for NLP
 
Miso
MisoMiso
Miso
 
The ENCODE Portal REST API
The ENCODE Portal REST API The ENCODE Portal REST API
The ENCODE Portal REST API
 
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
 

More from Nandana Mihindukulasooriya

A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...
A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...
A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...Nandana Mihindukulasooriya
 
Leveraging Semantic Parsing for Relation Linking over Knowledge Bases
Leveraging Semantic Parsing for Relation Linking over Knowledge BasesLeveraging Semantic Parsing for Relation Linking over Knowledge Bases
Leveraging Semantic Parsing for Relation Linking over Knowledge BasesNandana Mihindukulasooriya
 
A Distributed Transaction Model for Read-Write Linked Data Applications
A Distributed Transaction Model for Read-Write Linked Data ApplicationsA Distributed Transaction Model for Read-Write Linked Data Applications
A Distributed Transaction Model for Read-Write Linked Data ApplicationsNandana Mihindukulasooriya
 
Describing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core VocabularyDescribing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core VocabularyNandana Mihindukulasooriya
 
Learning W3C Linked Data Platform with examples
Learning W3C Linked Data Platform with examplesLearning W3C Linked Data Platform with examples
Learning W3C Linked Data Platform with examplesNandana Mihindukulasooriya
 
Linked data platform adapter for bugzilla poster
Linked data platform adapter for bugzilla posterLinked data platform adapter for bugzilla poster
Linked data platform adapter for bugzilla posterNandana Mihindukulasooriya
 
LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...Nandana Mihindukulasooriya
 
morph-LDP: An R2RML-based Linked Data Platform implementation
morph-LDP: An R2RML-based Linked Data Platform implementationmorph-LDP: An R2RML-based Linked Data Platform implementation
morph-LDP: An R2RML-based Linked Data Platform implementationNandana Mihindukulasooriya
 
Linked Data Platform as a novel approach for Enterprise Application Integra...
Linked Data Platform as a novel approach for Enterprise Application Integra...Linked Data Platform as a novel approach for Enterprise Application Integra...
Linked Data Platform as a novel approach for Enterprise Application Integra...Nandana Mihindukulasooriya
 
ALM iStack - Application Lifecycle Management using Linked Data
ALM iStack - Application Lifecycle Management using Linked Data ALM iStack - Application Lifecycle Management using Linked Data
ALM iStack - Application Lifecycle Management using Linked Data Nandana Mihindukulasooriya
 
Application integration with the W3C Linked Data standards
Application integration with the W3C Linked Data standardsApplication integration with the W3C Linked Data standards
Application integration with the W3C Linked Data standardsNandana Mihindukulasooriya
 

More from Nandana Mihindukulasooriya (20)

A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...
A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...
A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...
 
Leveraging Semantic Parsing for Relation Linking over Knowledge Bases
Leveraging Semantic Parsing for Relation Linking over Knowledge BasesLeveraging Semantic Parsing for Relation Linking over Knowledge Bases
Leveraging Semantic Parsing for Relation Linking over Knowledge Bases
 
ISWC 2020 - Semantic Answer Type Prediction
ISWC 2020 - Semantic Answer Type PredictionISWC 2020 - Semantic Answer Type Prediction
ISWC 2020 - Semantic Answer Type Prediction
 
Fitur - HackaTrips 2018!
Fitur - HackaTrips 2018!Fitur - HackaTrips 2018!
Fitur - HackaTrips 2018!
 
A Distributed Transaction Model for Read-Write Linked Data Applications
A Distributed Transaction Model for Read-Write Linked Data ApplicationsA Distributed Transaction Model for Read-Write Linked Data Applications
A Distributed Transaction Model for Read-Write Linked Data Applications
 
Repairing Hidden Links in Linked Data
Repairing Hidden Links in Linked DataRepairing Hidden Links in Linked Data
Repairing Hidden Links in Linked Data
 
Research Poster Design
Research Poster DesignResearch Poster Design
Research Poster Design
 
Hidden Gems
Hidden GemsHidden Gems
Hidden Gems
 
Erasmus+ promotional event - Kandy, Sri Lanka
Erasmus+ promotional event - Kandy, Sri LankaErasmus+ promotional event - Kandy, Sri Lanka
Erasmus+ promotional event - Kandy, Sri Lanka
 
4V - WP3 Progress Report (TIN2013-46238)
4V - WP3 Progress Report (TIN2013-46238)4V - WP3 Progress Report (TIN2013-46238)
4V - WP3 Progress Report (TIN2013-46238)
 
Introduction to W3C Linked Data Platform
Introduction to W3C Linked Data PlatformIntroduction to W3C Linked Data Platform
Introduction to W3C Linked Data Platform
 
Describing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core VocabularyDescribing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core Vocabulary
 
Learning W3C Linked Data Platform with examples
Learning W3C Linked Data Platform with examplesLearning W3C Linked Data Platform with examples
Learning W3C Linked Data Platform with examples
 
Linked data platform adapter for bugzilla poster
Linked data platform adapter for bugzilla posterLinked data platform adapter for bugzilla poster
Linked data platform adapter for bugzilla poster
 
LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...
 
morph-LDP: An R2RML-based Linked Data Platform implementation
morph-LDP: An R2RML-based Linked Data Platform implementationmorph-LDP: An R2RML-based Linked Data Platform implementation
morph-LDP: An R2RML-based Linked Data Platform implementation
 
Linked Data Platform as a novel approach for Enterprise Application Integra...
Linked Data Platform as a novel approach for Enterprise Application Integra...Linked Data Platform as a novel approach for Enterprise Application Integra...
Linked Data Platform as a novel approach for Enterprise Application Integra...
 
ALM iStack - Application Lifecycle Management using Linked Data
ALM iStack - Application Lifecycle Management using Linked Data ALM iStack - Application Lifecycle Management using Linked Data
ALM iStack - Application Lifecycle Management using Linked Data
 
morph-LDP Demo
morph-LDP Demomorph-LDP Demo
morph-LDP Demo
 
Application integration with the W3C Linked Data standards
Application integration with the W3C Linked Data standardsApplication integration with the W3C Linked Data standards
Application integration with the W3C Linked Data standards
 

An analysis of the quality issues of the properties available in the Spanish DBpedia

  • 1. An Analysis of the Quality Issues of the Properties Available in the Spanish DBpedia
       Nandana Mihindukulasooriya, Mariano Rico, Raúl García-Castro, and Asunción Gómez-Pérez
       Ontology Engineering Group (OEG), Departamento de Inteligencia Artificial, Escuela Técnica Superior de Ingenieros Informáticos, Universidad Politécnica de Madrid
       Acknowledgments: 4V (TIN2013-46238-C4-2-R) and LIDER (EU FP7 610782) projects
       http://loupe.linkeddata.es
  • 2. Collaborative editing in Wikipedia
       Ontology Engineering Group, Universidad Politécnica de Madrid
  • 3. Spontaneous data model creation
  • 4. Can spontaneous data models support us in data quality assessment? But, first, what is the quality of such spontaneous data models?
  • 5. DBpedia – Exposing Wikipedia’s content as Linked Data
       [Figure labels: RDF triple store; rendering]
  • 6. esDBpedia – the Spanish DBpedia chapter
       http://es.dbpedia.org/
  • 7. Can esDBpedia’s spontaneous data model support us in data quality assessment? But, first, what is the quality of the properties of such a spontaneous data model?
  • 8. Quality Dimensions for Datasets
       A. Conciseness. A dataset does not contain redundant concepts with different identifiers
       B. Consistency. A dataset does not contain conflicting or contradictory data
       C. Syntactic Validity. Values belong to the legal value range for the represented domain and do not violate the syntactic rules
       D. Semantic Accuracy. Values correctly represent real world facts
  • 9. Extraction and inspection of property statistics
       http://loupe.linkeddata.es/
  • 10. Information extracted about properties
       Property statistics template, with example data:
       General information
           URI: http://es.dbpedia.org/property/edad
           Local name: edad
           Namespace: http://es.dbpedia.org/property/
           Number of triples: 4623
       Subject analysis
           IRI subject count: 4623 (100 %)
           Extracted domain classes (i.e., ?subject a ?class): dbpedia-owl:Agent 2611 (56.48 %), schema:Person 1515 (32.77 %), …
       Object analysis
           URI object count: 186 (4.02 %)
           Extracted range classes (i.e., ?object a ?class): skos:Concept 17 (9.14 %), schema:Place 2 (1.08 %), …
           Literal object count: 4437 (95.98 %)
           Numerical object count: 2491 (53.88 %)
           Integer object count: 2382 (51.52 %)
           Average of numerics: 3.53
           Max numeric sample: 8.79E11, 1.5E8, 1.5E7, 8.2E6, 8121540
           Min numeric sample: -5, 0, 1, 1.08, 1.2
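The object-side counts in the template on slide 10 can be illustrated with a small sketch. This is not the tool the slides describe; the `Term` class, the `property_stats` function, and the toy data are hypothetical names and values introduced here only to show how such counts and percentages might be derived from the objects of one property.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical RDF object value; is_iri separates IRIs from literals."""
    value: str
    is_iri: bool


def parse_number(text):
    """Return the literal as a float, or None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def property_stats(objects):
    """Compute object-side counts and percentages for one property."""
    total = len(objects)
    literals = [o.value for o in objects if not o.is_iri]
    numbers = [n for n in map(parse_number, literals) if n is not None]
    counts = {
        "iri_objects": total - len(literals),
        "literal_objects": len(literals),
        "numeric_objects": len(numbers),
        "integer_objects": sum(1 for n in numbers if n.is_integer()),
    }
    # Percentages are relative to all objects of the property.
    return {name: (c, round(100 * c / total, 2) if total else 0.0)
            for name, c in counts.items()}


# Toy data (made up for illustration, not taken from esDBpedia).
sample = [
    Term("25", False),
    Term("-5", False),
    Term("1.08", False),
    Term("unknown", False),
    Term("http://es.dbpedia.org/resource/Madrid", True),
]
stats = property_stats(sample)
```

Each entry of `stats` is a `(count, percentage)` pair, mirroring the "count (percent)" layout of the template.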
  • 11. Properties in esDBpedia
       Property prefix | Properties (#) | Properties (%) | Property values (#) | Property values (%)
       http://es.dbpedia.org/property/ | 19,885 | 52.53 | 18,021,389 | 10.66
       http://dbpedia.org/property/ | 17,188 | 45.40 | 9,742,710 | 5.76
       http://dbpedia.org/ontology/ | 576 | 1.52 | 86,602,281 | 51.21
       http://xmlns.com/foaf/0.1/ | 12 | 0.03 | 8,132,328 | 4.81
       http://www.w3.org/1999/02/22-rdf-syntax-ns# | 8 | 0.02 | 12,298,451 | 7.27
       http://www.w3.org/2000/01/rdf-schema# | 7 | 0.02 | 5,366,982 | 3.17
       http://www.w3.org/2002/07/owl# | 6 | 0.02 | 16,523,751 | 9.77
       http://purl.org/dc/terms/# | 4 | 0.01 | 4,148,399 | 2.45
       http://www.w3.org/2004/02/skos/core# | 4 | 0.01 | 1,153,685 | 0.68
       http://purl.org/dc/elements/1.1/ | 3 | 0.01 | 3,346,874 | 1.98
       http://www.w3.org/ns/prov# | 1 | 0.00 | 2,853,681 | 1.69
       Other prefixes | 163 | 0.43 | 911,131 | 0.54
       Total | 37,857 | 100 | 169,101,662 | 100
  • 12. A. Conciseness
       • Many redundant properties in esDBpedia
       • 97.93% are auto-generated
       • Causes
           ◦ Capitalization (857): partidosEnPrimera, partidosenprimera
           ◦ Synonyms: causaDeMuerte, causaDeFallecimiento
           ◦ Prepositions: causaDeFallecimiento, causaFallecimiento
           ◦ Spelling (7,495): apeliido, apelldio, apellid
           ◦ Singular/plural: apellido, apellidos
           ◦ Gender: administrador, administradora
           ◦ Accent usage (1,252): administracion, administración
           ◦ Parsing (107): altitudMin/máx, residencia/trabajo, idioma/s
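Two of the causes listed on slide 12, capitalization and accent usage, lend themselves to a simple normalization check. The sketch below is an assumption about how such duplicates could be grouped, not the authors' method; `normalize` and `redundant_groups` are hypothetical helpers. It does not cover synonyms, spelling errors, plurals, or gender variants, which would need other techniques.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Fold case and strip accents so near-identical names collide."""
    decomposed = unicodedata.normalize("NFD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.casefold()


def redundant_groups(names):
    """Group property local names that share the same normalized key."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize(name)].add(name)
    # Keep only keys that map to more than one distinct spelling.
    return {key: sorted(variants)
            for key, variants in groups.items() if len(variants) > 1}


# Local names taken from the examples on the slide.
names = [
    "partidosEnPrimera", "partidosenprimera",
    "administracion", "administración",
    "apellido",
]
groups = redundant_groups(names)
```

Here `groups` holds one entry for the capitalization pair and one for the accent pair, while `apellido` has no variant in the input and is left out.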
  • 13. B. Consistency
       • OWL properties with IRI and literal values
           ◦ 3,380 properties
       • Use of strings and URLs interchangeably
           ◦ esdbpedia:lugarDeEntierro
           ◦ "Madrid"@es
           ◦ http://es.dbpedia.org/resource/Madrid
       • Diverse and incorrect domain and range types
           ◦ esdbpedia:edad has range of type dbo:Place
           ◦ esdbpedia:lugarmuerte has range of type dbo:Person
           ◦ esdbpedia:pais has range of type dbo:Actor
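The first consistency issue on slide 13, properties whose values are sometimes IRIs and sometimes literals, can be sketched as a grouping over (property, value kind) observations. The function name and the input shape are assumptions made for illustration; the second property in the toy data, `esdbpedia:nombre`, is made up to show a property that is not flagged.

```python
from collections import defaultdict


def mixed_value_properties(observations):
    """Return properties observed with both IRI and literal objects.

    `observations` is an iterable of (property, is_iri) pairs, a
    hypothetical simplification of the triples of a dataset.
    """
    kinds = defaultdict(set)
    for prop, is_iri in observations:
        kinds[prop].add("iri" if is_iri else "literal")
    return sorted(prop for prop, seen in kinds.items() if len(seen) == 2)


# Toy observations: one property used with both kinds, one with literals only.
observations = [
    ("esdbpedia:lugarDeEntierro", False),  # "Madrid"@es
    ("esdbpedia:lugarDeEntierro", True),   # http://es.dbpedia.org/resource/Madrid
    ("esdbpedia:nombre", False),
    ("esdbpedia:nombre", False),
]
mixed = mixed_value_properties(observations)
```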
  • 14. C. Syntactic Validity
       • IRIs represented as strings
           ◦ Many properties with IRI values and very few string values
           ◦ Common cause: IRIs encoded as strings -> “http://...”@es
       • Numerical values represented as strings
           ◦ 3,675 properties with more than 99% integer objects and very few string literals
           ◦ Common cause: numerics encoded as strings -> “2^^xsd:integer”
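A rough way to spot the two patterns on slide 14 is to test whether a string literal looks like an IRI or like an integer. The sketch below uses only the Python standard library; `suspicious_kind` and its labels are hypothetical, and restricting IRIs to the `http`/`https` schemes is a simplifying assumption.

```python
import re
from urllib.parse import urlparse

INTEGER_RE = re.compile(r"[+-]?\d+")


def suspicious_kind(literal):
    """Classify a string literal that probably should not be a string."""
    text = literal.strip()
    if INTEGER_RE.fullmatch(text):
        return "integer-as-string"
    parsed = urlparse(text)
    # Assumption: only http(s) values with a host count as IRIs here.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "iri-as-string"
    return None


checks = {
    value: suspicious_kind(value)
    for value in ["http://es.dbpedia.org/resource/Madrid", "2", "Madrid"]
}
```

A property whose string literals are almost all flagged this way would be a candidate for the cleaning step mentioned in the conclusions.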
  • 15. D. Semantic Accuracy
       • Outliers
           ◦ Numerical values allow an automatic analysis
           ◦ Properties such as diameter or edad with negative values
       • Harder to detect automatically
           ◦ Our plan is to use data fusion approaches to compare values from multiple sources
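For numerical properties, the outlier check on slide 15 can be as simple as a plausibility range. The sketch below applies it to values from the min and max numeric samples shown for `edad` on slide 10; the bounds of 0 and 150 are an assumption chosen for an age property, not something stated in the slides, and `out_of_range` is a hypothetical helper.

```python
def out_of_range(values, low=0.0, high=150.0):
    """Return the values that fall outside the plausible [low, high] range."""
    return [v for v in values if not (low <= v <= high)]


# Values from the min/max numeric samples listed for esdbpedia:edad.
edad_sample = [-5, 0, 1, 1.08, 1.2, 8121540, 8.79e11]
flagged = out_of_range(edad_sample)
```

Choosing the bounds per property is the hard part; a range check only catches values that are implausible on their face, which is why the slide points to data fusion for the remaining cases.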
  • 16. Conclusions and future work
       • DBpedia’s spontaneous data model can support quality assessment
       • Errors in DBpedia are introduced at many stages
           ◦ Crowd-sourced data
           ◦ Mappings
           ◦ Extraction framework
       • Some errors can be eliminated with pre-processing and cleaning
       • Quality assessment is currently semi-automatic
           ◦ Currently working towards its automation
       • We plan to investigate whether the quality issues are the same in other DBpedia instances