SlideShare a Scribd company logo
Managing and Consuming
Completeness Information for Wikidata
Using COOL-WD
KRDB Research Centre, Free University of Bozen-Bolzano
Radityo Eko Prasojo, Fariz Darari, Simon Razniewski, Werner Nutt
COLD 2016 @ Kobe, Japan
October 18, 2016
Supported by the project MAGIC, funded by the province of Bolzano
Web data is mostly incomplete
• Wikidata is missing the fact that Michael Sottile is a cast member of
the movie Reservoir Dogs.
• As per YAGO, the average number of children per person is 0.02.
• DBpedia contains currently only 6 out of 35 Dijkstra Prize winners.
1
Cantons of Switzerland in Wikidata
2
All Swiss cantons by Swiss constitution
3
Wikidata is complete for cantons of Switzerland!
4
Completeness Statements1
Syntax:
(s, p)
Semantics:
Graph G has completeness statement (s, p)
↓
G is complete for all p-values of s that exist in reality
Example:
Wikidata has completeness statement (Q39, P150)
↓
Wikidata is complete for all
administrative territorial divisions/cantons (= P150)
of Switzerland (= Q39)
1Darari et al. Enabling Fine-Grained RDF Data Completeness Assessment. ICWE
2016.
5
Completeness Statement in RDF
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix spv: <http://completeness.inf.unibz.it/sp-vocab#> .
@prefix coolwd: <http://cool-wd.inf.unibz.it/resource/> .
@prefix wdt: <http://www.wikidata.org/prop/direct/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
wd:Q2013 spv:hasSPStatement coolwd:statement-Q39-P150.
coolwd:statement-Q39-P150 a spv:SPStatement;
spv:subject wd:Q39;
spv:predicate wdt:P150;
prov:wasAttributedTo [foaf:name "Fariz Darari";
foaf:mbox <mailto:fariz.darari@stud-inf.unibz.it>];
prov:generatedAtTime "2016-05-19T10:45:52"^^xsd:dateTime;
prov:hadPrimarySource
<https://www.admin.ch/.../index.html#a1>.
6
COOL-WD
We have developed
a completeness management tool for Wikidata
The management feature comprises:
• browsing Wikidata entities enriched with completeness statements
• adding and removing completeness statements
• updating completeness provenance
As for now, we have more than 10000 real completeness statements.
7
COOL-WD interfaces
1. The Web interface, accessible at http://cool-wd.inf.unibz.it/
2. The COOL-WD Gadget, available for Wikidata users by importing
our cool-wd.js2
to their common.js page
2https://www.wikidata.org/wiki/User:Fadirra/coolwd.js
8
COOL-WD Web Interface: Architecture
SPARQL	Endpoint MediaWiki API
COOL-WD	
Engine
COOL-WD	
User	Interface
HTTP RequestData access Web browsing
SPARQL Queries API Calls
SP-Statements DB
9
Consuming completeness information using COOL-WD
• Completeness tracking of Wikidata entities
• Completeness analytics7/16/2016 COOL-WD
Class name #Objects Property
Completeness
percentage
Complete entities
Cantons of
Switzerland
26 official language 15.38%
 Canton of Geneva
 Canton of Bern  Ticino
 Canton of Zürich Show less
Cantons of
Switzerland
26 head of
government
3.85%
 Canton of Bern
10
Consuming completeness information using COOL-WD (2)
• Query completeness assessment
11
Conclusions
• Parts of information in Wikidata are complete, but so far there is no
way to capture them
• COOL-WD manages and consumes completeness information of
Wikidata
• Our framework can also be adopted by similar KBs like YAGO and
DBpedia
• If you want more details on extracting completeness information
from text: “How to Extract Cardinality Information from Text”
(Wednesday evening poster session).
12
Thank you!
13
Backup slides
How to create completeness statements?
KB contributors
Paid crowd workers
Web extraction
COOL-WD, which is also pre-populated using the three approaches
above.
Creating CS: KB contributors
• No-value statements
• Stating the non-existence of information:
Complete for all Elizabeth I’s children (in reality she had none)
• 7600 statements were imported
• among the top 15: “member of political party”, “spouse”, “child”,
and“country of citizenship”.
Creating KB: Paid crowd workers
• 900 SP-statements were crowd sourced
• Pricey
• Task is deemed too difficult for general crowd workers
Creating KB: Web extraction
• Mining cardinality information
• Extracting information in Wikipedia like:
Obama has two children
• Then checking if the cardinality matches with the facts in Wikidata
• 2200 statements were imported for the “child” relation

More Related Content

What's hot

Arkstore web ready2013
Arkstore web ready2013Arkstore web ready2013
Arkstore web ready2013
coldsnipe
 
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD CloudAnalyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
MOVING Project
 
Linked Data Notifications for RDF Streams
Linked Data Notifications for RDF StreamsLinked Data Notifications for RDF Streams
Linked Data Notifications for RDF Streams
Jean-Paul Calbimonte
 
Querying Linked Geospatial Data with Incomplete Information
Querying Linked Geospatial Data with  Incomplete InformationQuerying Linked Geospatial Data with  Incomplete Information
Querying Linked Geospatial Data with Incomplete Information
Charalampos (Babis) Nikolaou
 
Organising principles
Organising principlesOrganising principles
Organising principles
Intranätverk
 
Query Rewriting in RDF Stream Processing
Query Rewriting in RDF Stream ProcessingQuery Rewriting in RDF Stream Processing
Query Rewriting in RDF Stream Processing
Jean-Paul Calbimonte
 
Stream processing: The Matrix Revolutions
Stream processing: The Matrix RevolutionsStream processing: The Matrix Revolutions
Stream processing: The Matrix Revolutions
RomanaPernischov
 
Organising principles
Organising principlesOrganising principles
Organising principles
Fredric Landqvist
 
Publishing metadata provenance
Publishing metadata provenancePublishing metadata provenance
Publishing metadata provenance
Jana Hentschke
 
Dataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLSDataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLS
Alasdair Gray
 
What's new in Spark 2.0?
What's new in Spark 2.0?What's new in Spark 2.0?
What's new in Spark 2.0?
rerngvit yanggratoke
 
Summary of the Stream Reasoning workshop at ISWC 2016
Summary of the Stream Reasoning workshop at ISWC 2016Summary of the Stream Reasoning workshop at ISWC 2016
Summary of the Stream Reasoning workshop at ISWC 2016
Daniele Dell'Aglio
 
Wikidata as a linking hub for knowledge organization systems? Integrating an ...
Wikidata as a linking hub for knowledge organization systems? Integrating an ...Wikidata as a linking hub for knowledge organization systems? Integrating an ...
Wikidata as a linking hub for knowledge organization systems? Integrating an ...
Joachim Neubert
 
Rdf saturator
Rdf saturatorRdf saturator
Rdf saturator
INRIA-OAK
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
Giorgos Santipantakis
 
Web Archiving: Description and Access
Web Archiving: Description and AccessWeb Archiving: Description and Access
Web Archiving: Description and Access
Elizabeth (Lily) Pregill
 
Scalable Web Data Management using RDF
Scalable Web Data Management using RDF  Scalable Web Data Management using RDF
Scalable Web Data Management using RDF
Navid Sedighpour
 
An RDF Dataset Generator for the Social Network Benchmark with Real-World Coh...
An RDF Dataset Generator for the Social Network Benchmark with Real-World Coh...An RDF Dataset Generator for the Social Network Benchmark with Real-World Coh...
An RDF Dataset Generator for the Social Network Benchmark with Real-World Coh...
Holistic Benchmarking of Big Linked Data
 
Efficient analysis of large scale digital circuits and parasitic informations
Efficient analysis of large scale digital circuits and parasitic informationsEfficient analysis of large scale digital circuits and parasitic informations
Efficient analysis of large scale digital circuits and parasitic informations
Dimitris Akridas
 
Data pipelines observability: OpenLineage & Marquez
Data pipelines observability:  OpenLineage & MarquezData pipelines observability:  OpenLineage & Marquez
Data pipelines observability: OpenLineage & Marquez
Julien Le Dem
 

What's hot (20)

Arkstore web ready2013
Arkstore web ready2013Arkstore web ready2013
Arkstore web ready2013
 
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD CloudAnalyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
 
Linked Data Notifications for RDF Streams
Linked Data Notifications for RDF StreamsLinked Data Notifications for RDF Streams
Linked Data Notifications for RDF Streams
 
Querying Linked Geospatial Data with Incomplete Information
Querying Linked Geospatial Data with  Incomplete InformationQuerying Linked Geospatial Data with  Incomplete Information
Querying Linked Geospatial Data with Incomplete Information
 
Organising principles
Organising principlesOrganising principles
Organising principles
 
Query Rewriting in RDF Stream Processing
Query Rewriting in RDF Stream ProcessingQuery Rewriting in RDF Stream Processing
Query Rewriting in RDF Stream Processing
 
Stream processing: The Matrix Revolutions
Stream processing: The Matrix RevolutionsStream processing: The Matrix Revolutions
Stream processing: The Matrix Revolutions
 
Organising principles
Organising principlesOrganising principles
Organising principles
 
Publishing metadata provenance
Publishing metadata provenancePublishing metadata provenance
Publishing metadata provenance
 
Dataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLSDataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLS
 
What's new in Spark 2.0?
What's new in Spark 2.0?What's new in Spark 2.0?
What's new in Spark 2.0?
 
Summary of the Stream Reasoning workshop at ISWC 2016
Summary of the Stream Reasoning workshop at ISWC 2016Summary of the Stream Reasoning workshop at ISWC 2016
Summary of the Stream Reasoning workshop at ISWC 2016
 
Wikidata as a linking hub for knowledge organization systems? Integrating an ...
Wikidata as a linking hub for knowledge organization systems? Integrating an ...Wikidata as a linking hub for knowledge organization systems? Integrating an ...
Wikidata as a linking hub for knowledge organization systems? Integrating an ...
 
Rdf saturator
Rdf saturatorRdf saturator
Rdf saturator
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
Web Archiving: Description and Access
Web Archiving: Description and AccessWeb Archiving: Description and Access
Web Archiving: Description and Access
 
Scalable Web Data Management using RDF
Scalable Web Data Management using RDF  Scalable Web Data Management using RDF
Scalable Web Data Management using RDF
 
An RDF Dataset Generator for the Social Network Benchmark with Real-World Coh...
An RDF Dataset Generator for the Social Network Benchmark with Real-World Coh...An RDF Dataset Generator for the Social Network Benchmark with Real-World Coh...
An RDF Dataset Generator for the Social Network Benchmark with Real-World Coh...
 
Efficient analysis of large scale digital circuits and parasitic informations
Efficient analysis of large scale digital circuits and parasitic informationsEfficient analysis of large scale digital circuits and parasitic informations
Efficient analysis of large scale digital circuits and parasitic informations
 
Data pipelines observability: OpenLineage & Marquez
Data pipelines observability:  OpenLineage & MarquezData pipelines observability:  OpenLineage & Marquez
Data pipelines observability: OpenLineage & Marquez
 

Similar to Managing and Consuming Completeness Information for Wikidata Using COOL-WD

COOL-WD: A Completeness Tool for Wikidata
COOL-WD: A Completeness Tool for WikidataCOOL-WD: A Completeness Tool for Wikidata
COOL-WD: A Completeness Tool for Wikidata
Fariz Darari
 
LOD2 Webinar Series: CubeViz
LOD2 Webinar Series: CubeViz LOD2 Webinar Series: CubeViz
Presentation
PresentationPresentation
Presentation
L_Pavelescu
 
cold2014-ldvizwiz
cold2014-ldvizwizcold2014-ldvizwiz
cold2014-ldvizwiz
Ghislain Atemezing
 
Linked data and semantic wikis
Linked data and semantic wikisLinked data and semantic wikis
Linked data and semantic wikis
Sören Auer
 
Linked Data Usecases
Linked Data UsecasesLinked Data Usecases
Linked Data Usecases
Myungjin Lee
 
Statistical Linked Data
Statistical Linked DataStatistical Linked Data
Statistical Linked Data
Boris Villazón-Terrazas
 
Big Data LDN 2018: PROJECT HYDROGEN: UNIFYING AI WITH APACHE SPARK
Big Data LDN 2018: PROJECT HYDROGEN: UNIFYING AI WITH APACHE SPARKBig Data LDN 2018: PROJECT HYDROGEN: UNIFYING AI WITH APACHE SPARK
Big Data LDN 2018: PROJECT HYDROGEN: UNIFYING AI WITH APACHE SPARK
Matt Stubbs
 
BigDataEurope @BDVA Summit2016 1: The BDE Platform
BigDataEurope @BDVA Summit2016 1: The BDE PlatformBigDataEurope @BDVA Summit2016 1: The BDE Platform
BigDataEurope @BDVA Summit2016 1: The BDE Platform
BigData_Europe
 
Data Capacitor II at Indiana University
Data Capacitor II at Indiana UniversityData Capacitor II at Indiana University
Data Capacitor II at Indiana University
inside-BigData.com
 
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSABetter Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
PRBETTER
 
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Sujit Pal
 
OpenGeoData Italia 2014 - Marco Fago "Infrastrutture di dati territoriali, IN...
OpenGeoData Italia 2014 - Marco Fago "Infrastrutture di dati territoriali, IN...OpenGeoData Italia 2014 - Marco Fago "Infrastrutture di dati territoriali, IN...
OpenGeoData Italia 2014 - Marco Fago "Infrastrutture di dati territoriali, IN...
giovannibiallo
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
Ivan Ermilov
 
The Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open DataThe Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open Data
Ontotext
 
Accelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn CloudAccelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn Cloud
Sujit Pal
 
LOD2 webinar series: Virtuoso by OpenLink Software
LOD2 webinar series: Virtuoso by OpenLink SoftwareLOD2 webinar series: Virtuoso by OpenLink Software
LOD2 webinar series: Virtuoso by OpenLink Software
LOD2 Creating Knowledge out of Interlinked Data
 
BigDataEurope @BDVA Summit2016 2: Societal Pilots
BigDataEurope @BDVA Summit2016 2: Societal PilotsBigDataEurope @BDVA Summit2016 2: Societal Pilots
BigDataEurope @BDVA Summit2016 2: Societal Pilots
BigData_Europe
 
KEDL DBpedia 2019
KEDL DBpedia  2019KEDL DBpedia  2019
KEDL DBpedia 2019
Sebastian Hellmann
 
Decentralized Data Management for the Semantic Web
Decentralized Data Management for the Semantic WebDecentralized Data Management for the Semantic Web
Decentralized Data Management for the Semantic Web
hala Skaf
 

Similar to Managing and Consuming Completeness Information for Wikidata Using COOL-WD (20)

COOL-WD: A Completeness Tool for Wikidata
COOL-WD: A Completeness Tool for WikidataCOOL-WD: A Completeness Tool for Wikidata
COOL-WD: A Completeness Tool for Wikidata
 
LOD2 Webinar Series: CubeViz
LOD2 Webinar Series: CubeViz LOD2 Webinar Series: CubeViz
LOD2 Webinar Series: CubeViz
 
Presentation
PresentationPresentation
Presentation
 
cold2014-ldvizwiz
cold2014-ldvizwizcold2014-ldvizwiz
cold2014-ldvizwiz
 
Linked data and semantic wikis
Linked data and semantic wikisLinked data and semantic wikis
Linked data and semantic wikis
 
Linked Data Usecases
Linked Data UsecasesLinked Data Usecases
Linked Data Usecases
 
Statistical Linked Data
Statistical Linked DataStatistical Linked Data
Statistical Linked Data
 
Big Data LDN 2018: PROJECT HYDROGEN: UNIFYING AI WITH APACHE SPARK
Big Data LDN 2018: PROJECT HYDROGEN: UNIFYING AI WITH APACHE SPARKBig Data LDN 2018: PROJECT HYDROGEN: UNIFYING AI WITH APACHE SPARK
Big Data LDN 2018: PROJECT HYDROGEN: UNIFYING AI WITH APACHE SPARK
 
BigDataEurope @BDVA Summit2016 1: The BDE Platform
BigDataEurope @BDVA Summit2016 1: The BDE PlatformBigDataEurope @BDVA Summit2016 1: The BDE Platform
BigDataEurope @BDVA Summit2016 1: The BDE Platform
 
Data Capacitor II at Indiana University
Data Capacitor II at Indiana UniversityData Capacitor II at Indiana University
Data Capacitor II at Indiana University
 
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSABetter Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
 
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
 
OpenGeoData Italia 2014 - Marco Fago "Infrastrutture di dati territoriali, IN...
OpenGeoData Italia 2014 - Marco Fago "Infrastrutture di dati territoriali, IN...OpenGeoData Italia 2014 - Marco Fago "Infrastrutture di dati territoriali, IN...
OpenGeoData Italia 2014 - Marco Fago "Infrastrutture di dati territoriali, IN...
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
 
The Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open DataThe Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open Data
 
Accelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn CloudAccelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn Cloud
 
LOD2 webinar series: Virtuoso by OpenLink Software
LOD2 webinar series: Virtuoso by OpenLink SoftwareLOD2 webinar series: Virtuoso by OpenLink Software
LOD2 webinar series: Virtuoso by OpenLink Software
 
BigDataEurope @BDVA Summit2016 2: Societal Pilots
BigDataEurope @BDVA Summit2016 2: Societal PilotsBigDataEurope @BDVA Summit2016 2: Societal Pilots
BigDataEurope @BDVA Summit2016 2: Societal Pilots
 
KEDL DBpedia 2019
KEDL DBpedia  2019KEDL DBpedia  2019
KEDL DBpedia 2019
 
Decentralized Data Management for the Semantic Web
Decentralized Data Management for the Semantic WebDecentralized Data Management for the Semantic Web
Decentralized Data Management for the Semantic Web
 

Recently uploaded

PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
Rebecca Bilbro
 
Data Scientist Machine Learning Profiles .pdf
Data Scientist Machine Learning  Profiles .pdfData Scientist Machine Learning  Profiles .pdf
Data Scientist Machine Learning Profiles .pdf
Vineet
 
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
uevausa
 
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Marlon Dumas
 
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
oaxefes
 
Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
Márton Kodok
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
hyfjgavov
 
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
Timothy Spann
 
Digital Marketing Performance Marketing Sample .pdf
Digital Marketing Performance Marketing  Sample .pdfDigital Marketing Performance Marketing  Sample .pdf
Digital Marketing Performance Marketing Sample .pdf
Vineet
 
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
hqfek
 
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
z6osjkqvd
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
eoxhsaa
 
06-18-2024-Princeton Meetup-Introduction to Milvus
06-18-2024-Princeton Meetup-Introduction to Milvus06-18-2024-Princeton Meetup-Introduction to Milvus
06-18-2024-Princeton Meetup-Introduction to Milvus
Timothy Spann
 
一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理
zsafxbf
 
A gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented GenerationA gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented Generation
dataschool1
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
bmucuha
 
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
eudsoh
 
一比一原版悉尼大学毕业证如何办理
一比一原版悉尼大学毕业证如何办理一比一原版悉尼大学毕业证如何办理
一比一原版悉尼大学毕业证如何办理
keesa2
 
8 things to know before you start to code in 2024
8 things to know before you start to code in 20248 things to know before you start to code in 2024
8 things to know before you start to code in 2024
ArianaRamos54
 

Recently uploaded (20)

PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
 
Data Scientist Machine Learning Profiles .pdf
Data Scientist Machine Learning  Profiles .pdfData Scientist Machine Learning  Profiles .pdf
Data Scientist Machine Learning Profiles .pdf
 
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
 
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
 
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
 
Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
 
Digital Marketing Performance Marketing Sample .pdf
Digital Marketing Performance Marketing  Sample .pdfDigital Marketing Performance Marketing  Sample .pdf
Digital Marketing Performance Marketing Sample .pdf
 
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
 
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
 
06-18-2024-Princeton Meetup-Introduction to Milvus
06-18-2024-Princeton Meetup-Introduction to Milvus06-18-2024-Princeton Meetup-Introduction to Milvus
06-18-2024-Princeton Meetup-Introduction to Milvus
 
一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理
 
A gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented GenerationA gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented Generation
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
 
一比一原版悉尼大学毕业证如何办理
一比一原版悉尼大学毕业证如何办理一比一原版悉尼大学毕业证如何办理
一比一原版悉尼大学毕业证如何办理
 
8 things to know before you start to code in 2024
8 things to know before you start to code in 20248 things to know before you start to code in 2024
8 things to know before you start to code in 2024
 

Managing and Consuming Completeness Information for Wikidata Using COOL-WD

  • 1. Managing and Consuming Completeness Information for Wikidata Using COOL-WD KRDB Research Centre, Free University of Bozen-Bolzano Radityo Eko Prasojo, Fariz Darari, Simon Razniewski, Werner Nutt COLD 2016 @ Kobe, Japan October 18, 2016 Supported by the project MAGIC, funded by the province of Bolzano
  • 2. Web data is mostly incomplete • Wikidata is missing the fact that Michael Sottile is a cast member of the movie Reservoir Dogs. • As per YAGO, the average number of children per person is 0.02. • DBpedia contains currently only 6 out of 35 Dijkstra Prize winners. 1
  • 3. Cantons of Switzerland in Wikidata 2
  • 4. All Swiss cantons by Swiss constitution 3
  • 5. Wikidata is complete for cantons of Switzerland! 4
  • 6. Completeness Statements1 Syntax: (s, p) Semantics: Graph G has completeness statement (s, p) ↓ G is complete for all p-values of s that exist in reality Example: Wikidata has completeness statement (Q39, P150) ↓ Wikidata is complete for all administrative territorial divisions/cantons (= P150) of Switzerland (= Q39) 1Darari et al. Enabling Fine-Grained RDF Data Completeness Assessment. ICWE 2016. 5
  • 7. Completeness Statement in RDF @prefix wd: <http://www.wikidata.org/entity/> . @prefix spv: <http://completeness.inf.unibz.it/sp-vocab#> . @prefix coolwd: <http://cool-wd.inf.unibz.it/resource/> . @prefix wdt: <http://www.wikidata.org/prop/direct/> . @prefix prov: <http://www.w3.org/ns/prov#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . wd:Q2013 spv:hasSPStatement coolwd:statement-Q39-P150. coolwd:statement-Q39-P150 a spv:SPStatement; spv:subject wd:Q39; spv:predicate wdt:P150; prov:wasAttributedTo [foaf:name "Fariz Darari"; foaf:mbox <mailto:fariz.darari@stud-inf.unibz.it>]; prov:generatedAtTime "2016-05-19T10:45:52"^^xsd:dateTime; prov:hadPrimarySource <https://www.admin.ch/.../index.html#a1>. 6
  • 8. COOL-WD We have developed a completeness management tool for Wikidata The management feature comprises: • browsing Wikidata entities enriched with completeness statements • adding and removing completeness statements • updating completeness provenance As for now, we have more than 10000 real completeness statements. 7
  • 9. COOL-WD interfaces 1. The Web interface, accessible at http://cool-wd.inf.unibz.it/ 2. The COOL-WD Gadget, available for Wikidata users by importing our cool-wd.js2 to their common.js page 2https://www.wikidata.org/wiki/User:Fadirra/coolwd.js 8
  • 10. COOL-WD Web Interface: Architecture SPARQL Endpoint MediaWiki API COOL-WD Engine COOL-WD User Interface HTTP RequestData access Web browsing SPARQL Queries API Calls SP-Statements DB 9
  • 11. Consuming completeness information using COOL-WD • Completeness tracking of Wikidata entities • Completeness analytics7/16/2016 COOL-WD Class name #Objects Property Completeness percentage Complete entities Cantons of Switzerland 26 official language 15.38%  Canton of Geneva  Canton of Bern  Ticino  Canton of Zürich Show less Cantons of Switzerland 26 head of government 3.85%  Canton of Bern 10
  • 12. Consuming completeness information using COOL-WD (2) • Query completeness assessment 11
  • 13. Conclusions • Parts of information in Wikidata are complete, but so far there is no way to capture them • COOL-WD manages and consumes completeness information of Wikidata • Our framework can also be adopted by similar KBs like YAGO and DBpedia • If you want more details on extracting completeness information from text: “How to Extract Cardinality Information from Text” (Wednesday evening poster session). 12
  • 16. How to create completeness statements? KB contributors Paid crowd workers Web extraction COOL-WD, which is also pre-populated using the three approaches above.
  • 17. Creating CS: KB contributors • No-value statements • Stating the non-existence of information: Complete for all Elizabeth I’s children (in reality she had none) • 7600 statements were imported • among the top 15: “member of political party”, “spouse”, “child”, and“country of citizenship”.
  • 18. Creating KB: Paid crowd workers • 900 SP-statements were crowd sourced • Pricey • Task is deemed too difficult for general crowd workers
  • 19. Creating KB: Web extraction • Mining cardinality information • Extracting information in Wikipedia like: Obama has two children • Then checking if the cardinality matches with the facts in Wikidata • 2200 statements were imported for the “child” relation