Statistics about Data Shape Use in RDF Data

Presentation of the poster paper "Statistics about Data Shape Use in RDF Data", given during the demo/poster session at the International Semantic Web Conference (ISWC) 2020.

Joint work with Ben De Meester, Anastasia Dimou and Ruben Verborgh.

The related video is available on YouTube: https://www.youtube.com/watch?v=6-OdjYdEpeU


  1. Statistics about Data Shape Use in RDF Data. Sven Lieber, Ben De Meester, Ruben Verborgh and Anastasia Dimou
  2. Could you ride such a bike? Would you ride such a bike?
  3. What are constraints? Why investigate the use of constraints? What did we analyze and how? What did we find and what does it mean?
  4. What are constraints? Why investigate the use of constraints? What did we analyze and how? What did we find and what does it mean?
  5. Represent restrictions in a knowledge model: “A bicycle has two wheels”. Possible with the Open World Assumption: the second wheel is somewhere in the world.
  6. Use a knowledge representation: “A bicycle needs to have two wheels”. Now we can validate constraints for all bikes in the picture (closed world).
  7. What are constraints? Why investigate the use of constraints? What did we analyze and how? What did we find and what does it mean?
  8. Why investigate the use of constraints? Different constraint types exist: min/max cardinality, datatype, min/max literal length, etc. Avoid a self-fulfilling prophecy: Use what is supported <--> Only support what is used.
  9. What are constraints? Why investigate the use of constraints? What did we analyze and how? What did we find and what does it mean?
  10. What did we analyze? Currently no large data shape repository. Searching for “SHACL” using the GitHub search. Selecting 19 repositories containing SHACL shapes.
  11. How did we analyze? Thus we get W3C Data Cube and PROV compliant statistics:

          [] a qb:Observation ;
              rls:detectorDimension rls:logicalDisjunctionLODStatsDetectorSHACLOr ;
              rls:detectorVersionDimension rls:logicalDisjunctionLODStatsDetectorSHACLOr-v1 ;
              rls:executionTimeDimension "2020-08-14T11:11:39.253575"^^xsd:dateTime ;
              rls:ontologyRepositoryDimension rls:noRepo ;
              rls:ontologyVersionDimension <https://github.com/SEMICeu/dcat-ap_shacl.git#> ;
              rls:restrictionTypeDimension rls:logicalDisjunction ;
              rls:restrictionTypeOccurrence 3 .

      Statistics at w3id.org/montolo/github-stats. [Diagram: LODStats extracts from data shapes and generates knowledge graph statistics, which are described using Montolo.]
  12. What are constraints? Why investigate the use of constraints? What did we analyze and how? What did we find and what does it mean?
  13. More than 60% of repositories define constraints for the basic structure of a Knowledge Graph. Statistics at w3id.org/montolo/github-stats
  14. More than 30% of repositories define constraints on specific values; even fewer define other literal-based constraints. Statistics at w3id.org/montolo/github-stats
  15. What does it mean? Class, cardinality, datatype, nodekind and disjunction constraints seem to be obvious choices. However, more detailed constraints on the values of such properties are possible but currently not used a lot. Tools should make it easy to define all possible constraints to avoid a self-fulfilling prophecy.
  16. What are constraints? Closed world restrictions for validation. Why investigate the use of constraints? Understand the use of different constraint types. What did we analyze and how? SHACL shapes from GitHub using Montolo. What did we find and what does it mean? Constraints for basic structure used a lot; unused potential for more detailed constraints.
  17. SvenLieber, sven-lieber.org, knows.idlab.ugent.be. Users: Use constraints to improve quality. Developers: Let users access all constraint types. Statistics at w3id.org/montolo/github-stats
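
Slides 5 and 6 contrast a restriction read under the Open World Assumption with a constraint validated under a closed world. The following is a minimal Python sketch of the closed-world reading of "a bicycle needs to have two wheels". It is not SHACL and not the authors' tooling: the triples, the `ex:` names and the `validate_cardinality` helper are all hypothetical and only illustrate how a min/max cardinality check treats unstated wheels as missing.

```python
from collections import Counter

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:bike1", "rdf:type", "ex:Bicycle"),
    ("ex:bike1", "ex:hasWheel", "ex:wheelA"),
    ("ex:bike1", "ex:hasWheel", "ex:wheelB"),
    ("ex:bike2", "rdf:type", "ex:Bicycle"),
    ("ex:bike2", "ex:hasWheel", "ex:wheelC"),
]


def validate_cardinality(triples, target_class, path, min_count, max_count):
    """Return {instance: value count} for instances violating the bounds.

    Closed-world reading: only the triples given here count, so a wheel
    that is not stated is treated as missing.
    """
    instances = [s for s, p, o in triples if p == "rdf:type" and o == target_class]
    counts = Counter(s for s, p, o in triples if p == path)
    return {
        s: counts[s]
        for s in instances
        if not (min_count <= counts[s] <= max_count)
    }


violations = validate_cardinality(TRIPLES, "ex:Bicycle", "ex:hasWheel", 2, 2)
print(violations)  # {'ex:bike2': 1}
```

Under the open-world reading of slide 5, `ex:bike2` would not be an error, because its second wheel could exist somewhere outside the data at hand.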

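
Slides 13 and 14 report findings as the percentage of repositories that define a given kind of constraint. The slides do not say how these percentages were computed, so the sketch below rests on an assumption: a repository counts as using a constraint type if it has at least one occurrence of it, in the spirit of the `rls:restrictionTypeOccurrence` value on slide 11. The repository names, counts and the `repository_share` helper are hypothetical and are not taken from the published statistics.

```python
# Hypothetical observations: (repository, constraint type, occurrences).
OBSERVATIONS = [
    ("repoA", "minCardinality", 12),
    ("repoA", "logicalDisjunction", 3),
    ("repoB", "minCardinality", 5),
    ("repoC", "datatype", 7),
    ("repoC", "minCardinality", 0),
    ("repoD", "datatype", 2),
]


def repository_share(observations, constraint_type):
    """Percentage of repositories with at least one occurrence of the type."""
    all_repos = {repo for repo, _, _ in observations}
    using = {
        repo
        for repo, ctype, count in observations
        if ctype == constraint_type and count > 0
    }
    return 100.0 * len(using) / len(all_repos)


print(repository_share(OBSERVATIONS, "minCardinality"))      # 50.0
print(repository_share(OBSERVATIONS, "logicalDisjunction"))  # 25.0
```

Counting repositories rather than raw occurrences keeps one very large shape file from dominating the result, which suits the per-repository phrasing used on the slides.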
