Assessing Linkset Quality For
Complementing Third Party Datasets
Riccardo Albertoni1,2, Asunción Gómez Pérez1
1Ontology Engineering Group
Departamento de Inteligencia Artificial
Facultad de Informática
Universidad Politécnica de Madrid
2CNR-IMATI,
Via De Marini, 6, Torre di Francia, 16149 Genova, Italy
3RD INTERNATIONAL WORKSHOP ON LINKED WEB DATA
MANAGEMENT (LWDM 2013)
in conjunction with the 16th International Conference on Extending
Database Technology (EDBT 2013)
March 22, 2013 - Genoa, Italy
2
Motivations
LINKED DATA’s PROMISE:
Evolving the Web into a Global Data Space
It should help to overcome the data silo effect…
So many
bubbles there,
THAT’S SO
COOL!!
BUT ….
Can I exploit
that third party
data for my
OWN
ANALYSES?
3
Motivation
[Figure: an arrow linking the SWDF and DBLP datasets]
What does this arrow mean?
NO GROUND CONCEPT about what makes a linkset suitable for a target application.
There are well-founded works on quality for datasets, but linksets are not yet directly addressed!
4
What is Linkset Quality for?
Linked Data Publishers can check if a linkset they
have provided
• is good enough or needs to be improved;
• is still good enough after one of the two target
datasets is updated.
Linked Data Consumers can
• figure out whether they can rely on a linkset;
• get a first idea of the next move they can
take to improve the linkset;
• rank possible linkset alternatives.
5
Complementing a Dataset X via a Linkset L
[Figure: dataset Y (DBLP) holds researchers a and b with foaf:made links to Pub1, Pub2, Pub3 and Pub4; dataset X holds a’, b’ and c’ with foaf:member links to Affili3, Affili4 and Affili5; Linkset L contains a owl:sameAs a’ and b owl:sameAs b’; X_L is X complemented via L. Other labels: Yolanda Gil, Journal 1, ≠.]
Complementation might introduce some “missing data”.
The fewer “missing data” (like researcher c) are introduced, the more complete the linkset is.
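As a rough illustration of complementation, the sketch below treats datasets as sets of (subject, predicate, object) tuples and copies the statements of Y onto the linked entities of X. The data, the function names `complement` and `uncovered`, and the tuple representation are all assumptions made for this example; they are not taken from the prototype.

```python
# Illustrative data only: identifiers are hypothetical, not copied from the figure.
Y = {  # e.g. a bibliographic dataset: who made which publication
    ("a", "foaf:made", "Pub1"),
    ("a", "foaf:made", "Pub2"),
    ("b", "foaf:made", "Pub3"),
    ("b", "foaf:made", "Pub4"),
}
X = {  # the dataset to complement: who is a member of which affiliation
    ("a'", "foaf:member", "Affili4"),
    ("b'", "foaf:member", "Affili3"),
    ("c'", "foaf:member", "Affili5"),
}
# Linkset L as (entity in Y, entity in X) pairs, read as "y owl:sameAs x".
L = {("a", "a'"), ("b", "b'")}


def complement(x, y, linkset):
    """Return X plus the statements of Y rewritten onto the linked X entities."""
    y_to_x = dict(linkset)
    imported = {(y_to_x[s], p, o) for (s, p, o) in y if s in y_to_x}
    return x | imported


def uncovered(x, linkset):
    """Subjects of X that no link reaches, i.e. where data stays missing."""
    linked = {x_entity for (_, x_entity) in linkset}
    return {s for (s, _, _) in x} - linked


X_L = complement(X, Y, L)
print(sorted(X_L))
print(sorted(uncovered(X, L)))  # entities that gain nothing from Y
```

In this toy setting c’ receives no publications because no link reaches it, which is the kind of gap the completeness measures are meant to expose.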
6
What is a Linkset ? (http://vocab.deri.ie/void)
Every linkset is a special kind of dataset !!
Every linkset has two target datasets:
Subject and Object datasets
Every linkset should have only one
linking property
owl:sameAs linksets
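The three properties above can be captured in a small data structure. This is only a sketch: the class `Linkset` and its field names are hypothetical, chosen to echo the VoID terms void:subjectsTarget, void:objectsTarget and void:linkPredicate, and the entity identifiers are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Linkset:
    """A linkset: two target datasets and a single linking property."""
    subjects_target: str           # dataset the link subjects come from
    objects_target: str            # dataset the link objects come from
    link_predicate: str            # the one linking property, e.g. owl:sameAs
    links: set = field(default_factory=set)

    def add(self, subject, predicate, obj):
        # Enforce "only one linking property" per linkset.
        if predicate != self.link_predicate:
            raise ValueError(
                f"expected {self.link_predicate}, got {predicate}"
            )
        self.links.add((subject, predicate, obj))


# Hypothetical entity identifiers, for illustration only.
linkset = Linkset("DBLP", "SWDF", "owl:sameAs")
linkset.add("dblp:a", "owl:sameAs", "swdf:a")
print(len(linkset.links))
```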
7
Defining quality measures
Terminology adopted from
C. Bizer and R. Cyganiak. Quality-driven information filtering using the WIQA
policy framework. J. Web Sem., 7(1):1-10, 2009
What to define when providing a quality measure, and what this work provides for linkset quality:
• Quality Indicator: an aspect of a data item or data set that may give an indication to the user of the suitability of the data for some intended use.
  Provided: Entities Types; Number of Entities for Types; …
• Scoring Function: a function evaluating quality indicators to measure the suitability of the data for some intended use.
  Provided: Linkset Type Coverage; Linkset Type Completeness; Linkset Entity Coverage for Type
• Aggregate Metric: a user-specified assessment metric built upon scoring functions. These aggregations produce new assessment values through the average, sum, max, min or threshold functions applied to the set of scoring functions.
  Provided: Interpretation tables, an interpretation of the scoring functions that helps in figuring out which action to take next.
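To make the aggregate-metric idea concrete, here is a minimal sketch that applies average, sum, max, min or a threshold to a set of scoring-function values. The score values are hypothetical, and reading "threshold" as "every score reaches the threshold" is an assumption, since the slides do not spell out its semantics.

```python
from statistics import mean

# The aggregation functions named on the slide, keyed by a hypothetical label.
AGGREGATORS = {"average": mean, "sum": sum, "max": max, "min": min}


def aggregate(scores, how, threshold=None):
    """Combine scoring-function values into a single assessment value."""
    values = list(scores.values())
    if how == "threshold":
        # Assumption: the metric holds when every score reaches the threshold.
        return all(v >= threshold for v in values)
    return AGGREGATORS[how](values)


# Hypothetical scores for one linkset.
scores = {
    "type_coverage": 1.0,
    "type_completeness": 0.5,
    "entity_coverage": 0.25,
}

print(aggregate(scores, "average"))
print(aggregate(scores, "min"))
print(aggregate(scores, "threshold", threshold=0.5))
```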
9
INDICATORS: Examples on DBLP & SWDF
[Figure: the types exposed by DBLP, Type(DBLP), and by SWDF, Type(SWDF). Type labels shown: foaf:Organization, foaf:Person, foaf:Agent, foaf:Document, ro:FullPaper, ro:ShortPaper, ro:PosterPaper, swr:Proceedings, swrc:Proceedings]
#E4Type(foaf:Agent, DBLP) = 1000000
#E4Type(foaf:Document, DBLP) = 1984087
#E4Type(swrc:Proceedings, DBLP) = 1108400
11
INDICATORS: Examples on DBLP & SWDF
[Figure: Type(DBLP), Type(SWDF) and Type(L2), the types exposed by the linkset L2. Type labels shown: foaf:Organization, foaf:Person, foaf:Agent, foaf:Document, ro:FullPaper, ro:PosterPaper, swr:Proceedings, swrc:Proceedings]
#E4Type(foaf:Agent, L2) = 100
#E4Type(foaf:Person, L2) = 100
12
Quality indicators: Types
Type: Dataset/Linkset → power set of the possible user-defined types
(e.g. owl:Class, owl:Restriction, skos:Concept, skos:ConceptScheme)
Returns the types of entities exposed in a dataset or a linkset.
13
Quality indicators: # of Entity for a Type
#E4Type: (one of the possible user-defined types, Dataset/Linkset) → set of (positive) integers
Returns the number of entities exposed in a dataset/linkset for a given type.
Blank nodes are left out.
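The two indicators can be sketched over a set of (subject, predicate, object) tuples. The functions `types` and `e4type`, the `_:` prefix convention for blank nodes and the sample data are assumptions for illustration; every object of rdf:type is treated as a user-defined type here, which the slides do not confirm.

```python
RDF_TYPE = "rdf:type"


def is_blank(node):
    """Assumption: blank nodes are written with the usual '_:' prefix."""
    return node.startswith("_:")


def types(triples):
    """Types indicator: the set of types of the entities exposed."""
    return {o for (s, p, o) in triples if p == RDF_TYPE}


def e4type(type_, triples):
    """#E4Type indicator: number of entities of a given type, blank nodes left out."""
    return len({
        s for (s, p, o) in triples
        if p == RDF_TYPE and o == type_ and not is_blank(s)
    })


# Hypothetical sample data.
dataset = {
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("_:b0", RDF_TYPE, "foaf:Person"),
    ("ex:acme", RDF_TYPE, "foaf:Organization"),
    ("ex:alice", "foaf:name", "Alice"),
}

print(sorted(types(dataset)))
print(e4type("foaf:Person", dataset))
```

For a linkset, the same functions would be applied to the type statements of the linked entities, which have to be looked up in the target datasets first.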
16
SCORING FUNCTIONS: Linkset Type Coverage (1)
[Figure: Type(DBLP) and Type(SWDF) with linkset L1. Type labels shown: foaf:Organization, foaf:Person, foaf:Agent, swrc:Proceedings]
Complementing DBLP with L1, are we adding some new entities to DBLP?
DBLP_L1 (DBLP complemented via L1) “imports” organizations for the researchers (foaf:Agent) involved in the linkset.
17
SCORING FUNCTIONS: Linkset Type Coverage (2)
[Figure: Type(DBLP) and Type(SWDF) with linkset L2. Type labels shown: foaf:Organization, foaf:Person, foaf:Agent, swrc:Proceedings, swr:Proceedings]
Complementing SWDF with L2, we don’t add any new type of entities.
SWDF_L2 (SWDF complemented via L2) has exactly the same kinds of entities as SWDF.
18
Definition of Linkset Type Coverage
Arguments: a linkset and a target dataset.
Considering a dataset X, what percentage of the types of X are also covered by the linkset?
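The formula itself did not survive in the slide text, so the sketch below is only one literal reading of the question above: the share of the types of X that also appear among the types of the linkset. The function name `linkset_type_coverage` and the example type sets are assumptions, not the authors' definition.

```python
def linkset_type_coverage(linkset_types, dataset_types):
    """Fraction of the dataset's types that also occur in the linkset.

    Assumed reading: |Type(L) ∩ Type(X)| / |Type(X)|.
    """
    if not dataset_types:
        raise ValueError("the dataset exposes no types")
    return len(linkset_types & dataset_types) / len(dataset_types)


# Illustrative type sets; they are not claimed to be the full Type(...) sets.
dataset_types = {"foaf:Agent", "foaf:Document", "swrc:Proceedings"}
linkset_types = {"foaf:Agent", "foaf:Person"}

coverage = linkset_type_coverage(linkset_types, dataset_types)
print(f"{coverage:.2f}")
```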
19
SCORING FUNCTION: Ideas behind Type Completeness (1)
[Figure: Type(DBLP) and Type(SWDF) with linkset L1. Type labels shown: foaf:Organization, foaf:Person, foaf:Agent, swrc:Proceedings]
L1 is type complete.
It does not make sense to run a procedure (e.g., SILK) to discover
interlinks between the instances of swrc:Proceedings and foaf:Organization!
20
SCORING FUNCTION: Ideas behind Type Completeness(2)
[Figure: Type(DBLP) and Type(SWDF) with linkset L1. Type labels shown: foaf:Organization, foaf:Person, foaf:Agent, swrc:Proceedings, swr:Proceedings; annotation: “Alignment among classes”]
L1 is type incomplete.
We should run a procedure (e.g., SILK) to discover interlinks
between the instances of swrc:Proceedings and swr:Proceedings!
21
Formalization of Linkset Type Completeness
Arguments: the linkset, target dataset 1 and target dataset 2.
Annotations on the formula:
• the types in the subject dataset that are not considered in the linkset;
• a function that returns the set of types of X that have an equivalent in Y,
according to an equivalence relation among classes.
A linkset is complete with respect to types ⇔ LTCom = 1;
LTCom < 1 otherwise.
22
Example on Type Completeness
[Figure: Type(DBLP) and Type(SWDF) with linksets L1 and L2. Type labels shown: foaf:Organization, foaf:Person, foaf:Agent, swrc:Proceedings, swr:Proceedings]
LTCom(L1, DBLP, SWDF) = 1 - (|{swrc:Proceedings}| / |{swrc:Proceedings, foaf:Person}|) = 1/2
LTCom(L2, DBLP, SWDF) = 1 - (|{}| / |{swr:Proceedings, foaf:Person}|) = 1
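The worked example suggests the shape LTCom = 1 - |missing types| / |types with an equivalent|. The sketch below follows that shape; the function name, its two set-valued parameters and the linkset type sets are assumptions chosen so that the "missing" sets match the ones in the example.

```python
def linkset_type_completeness(equivalent_types, linkset_types):
    """1 - (types with an equivalent but absent from the linkset) / (types with an equivalent)."""
    if not equivalent_types:
        raise ValueError("no type has an equivalent in the other dataset")
    missing = equivalent_types - linkset_types
    return 1 - len(missing) / len(equivalent_types)


# L1: swrc:Proceedings has an equivalent class but is not considered by L1.
ltcom_l1 = linkset_type_completeness(
    equivalent_types={"swrc:Proceedings", "foaf:Person"},
    linkset_types={"foaf:Person"},          # assumed content of Type(L1)
)

# L2: no type with an equivalent is left out.
ltcom_l2 = linkset_type_completeness(
    equivalent_types={"swr:Proceedings", "foaf:Person"},
    linkset_types={"swr:Proceedings", "foaf:Person"},  # assumed content of Type(L2)
)

print(ltcom_l1, ltcom_l2)
```

A value below 1 points at concrete type pairs (here swrc:Proceedings and swr:Proceedings) where running a link discovery procedure is worthwhile.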
23
Linkset Entity Coverage for Type
[Figure: Type(DBLP) and Type(SWDF) with linksets L1 and L2. Type labels shown: foaf:Organization, foaf:Person, foaf:Agent, swrc:Proceedings, swr:Proceedings]
L1 and L2 are indistinguishable from the point of view of types.
Which is the most interesting? L1, L2, or L1 ∪ L2?
How good is a linkset providing 100 owl:sameAs links?
Linkset Entity Coverage for Type = (number of entities of type T in the linkset L) / (number of entities of type T in the dataset X)
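The ratio described on this slide can be tried with the counts reported earlier, #E4Type(foaf:Agent, L2) = 100 and #E4Type(foaf:Agent, DBLP) = 1000000. Pairing these two counts, and the function name `linkset_entity_coverage`, are choices made for this sketch only.

```python
def linkset_entity_coverage(entities_in_linkset, entities_in_dataset):
    """Entities of a type in the linkset divided by entities of that type in the dataset."""
    if entities_in_dataset <= 0:
        raise ValueError("the dataset exposes no entity of this type")
    return entities_in_linkset / entities_in_dataset


# Counts taken from the indicator examples for foaf:Agent.
coverage = linkset_entity_coverage(100, 1_000_000)
print(f"{coverage:.4%}")
```

Seen this way, 100 owl:sameAs links say little on their own: against a million entities of the same type they cover a tiny fraction of the dataset.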
26
Aggregate Metrics: Interpretation of
the presented scoring functions
The interpretation is summed up
as a “decision tree”.
27
Related work: (extended discussion in the paper)
• WIQA, an Information Quality Assessment Framework
  C. Bizer and R. Cyganiak. Quality-driven information filtering using the WIQA policy framework. J. Web Sem., 7(1):1-10, 2009.
  Contributes a policy language, an engine for interpreting such policies, and explanations of whether a piece of information satisfies a policy. Quality criteria are parameters of the system; it does not aim at proposing new quality measures.
• LOD2
  P. N. Mendes and C. Bizer. Survey report state of the art in mapping, quality assessment and data fusion. Technical report, LOD2 - Creating Knowledge out of Interlinked Data, Deliverable 4.3.1, 2011, http://static.lod2.eu/Deliverables
  Reviews quality dimensions. No indicators or criteria for completeness.
• PlanetData
  P. N. Mendes, C. Bizer, J. H. Young, Z. Miklos, J.-P. Calbimonte, and A. Moraru. Conceptual model and best practices for high-quality metadata publishing. Technical report, PlanetData, Deliverable 2.1, 2012, http://planet-data-wiki.sti2.at/web/File:D2.1.pdf.
  Intensional completeness: the schema contains all the necessary attributes. Extensional completeness: all required instances are present. LDS completeness: relevant properties have values.
• SIEVE
  P. N. Mendes, H. Muhleisen, and C. Bizer. Sieve: linked data quality assessment and fusion. In D. Srivastava and I. Ari, editors, LWDM EDBT/ICDT Workshops, pp. 116-123. ACM, 2012.
  SIEVE deploys some of the ideas developed in WIQA and LDS completeness.
None of them explicitly addresses quality for linksets.
28
Related work: (extended discussion in the paper)
• LINK-QA
  C. Gueret, P. T. Groth, C. Stadler, and J. Lehmann. Assessing linked data mappings using network measures. In E. Simperl, P. Cimiano, A. Polleres, O. Corcho, and V. Presutti, editors, ESWC, volume 7295 of Lecture Notes in Computer Science, pp. 87-102. Springer, 2012.
  Different approach: it applies classic network measures (degree, centrality, clustering coefficient) plus open sameAs chains and description richness to determine whether a set of links improves the overall dataset quality.
  It assesses the quality of interlinking, not of a linkset: LINK-QA works on links regardless of whether they belong to the same linkset.
  LINK-QA addresses correctness; it does not deal with completeness.
  LINK-QA ranks sets of links: it can say that one linkset is better than another, but it does not suggest the next move a consumer should take to improve a linkset.
29
Conclusions
Contribution: a quality measure on linksets
• The only measure explicitly addressing linkset
completeness for dataset complementation;
• Formalization of indicators, scoring functions and
aggregate metrics;
• A first proof-of-concept prototype (Java, Jena).
Ongoing and future work
• Validation on the LOD: how many “incomplete” linksets can we detect in the LOD?
• Extension to linksets other than owl:sameAs
ones (e.g., skos:exactMatch);
• Dimensions other than completeness (e.g.,
timeliness, availability, consistency).
30
THANKS for your ATTENTION!
riccardo.albertoni@ge.imati.cnr.it
