The RDF Report Card: Beyond the Triple Count

26th September 2011

SemTechBiz 2011
Leigh Dodds
@ldodds

http://kasabi.com
http://slideshare.net/ldodds

Triple counts are not a quality indicator

http://dbpedia.org/resource/London

6 triples for Population Density

Property                                                      Count  Value
http://dbpedia.org/ontology/PopulatedPlace/populationDensity  2      4807.0; 4806.971873853451
http://dbpedia.org/ontology/populationDensity                 2      4806.971874; 4807.000000
http://dbpedia.org/property/populationDensityKm               1      4807
http://dbpedia.org/property/populationDensitySqMi             1      12450
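
The redundancy in these six triples can be checked with a few lines of code. The following is a minimal sketch, not DBpedia tooling: the values are copied from the table above, while the unit labels (inferred from the property names), the `SQ_MI_IN_KM2` constant, and the choice to round to whole numbers are assumptions made for illustration.

```python
# Square kilometres in one square mile (assumed conversion constant).
SQ_MI_IN_KM2 = 2.589988110336

ONT = "http://dbpedia.org/ontology/"
PROP = "http://dbpedia.org/property/"

# (property, value, unit) - values from the table; units are assumptions
# inferred from the property names.
triples = [
    (ONT + "PopulatedPlace/populationDensity", 4807.0, "km2"),
    (ONT + "PopulatedPlace/populationDensity", 4806.971873853451, "km2"),
    (ONT + "populationDensity", 4806.971874, "km2"),
    (ONT + "populationDensity", 4807.000000, "km2"),
    (PROP + "populationDensityKm", 4807, "km2"),
    (PROP + "populationDensitySqMi", 12450, "sq_mi"),
]


def per_km2(value, unit):
    """Normalise a density to people per square kilometre."""
    return value / SQ_MI_IN_KM2 if unit == "sq_mi" else value


# Round each normalised value and keep only the distinct results.
distinct_facts = {round(per_km2(value, unit)) for _, value, unit in triples}

print(len(triples), "triples,", len(distinct_facts), "distinct fact(s)")
```

Under these assumptions all six triples reduce to a single rounded figure, which is the point of the slide: the triple count grows without the dataset saying anything new.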

12 triples for Location (1)


Property      Count  Value
georss:point  1      51.507222222222225 -0.1275
geo:geometry  1      POINT(-0.1275 51.5072)
geo:lat       1      51.507221
geo:long      1      -0.127500

12 triples for Location (2)

Property        Count  Value
dbpprop:latd    1      51
dbpprop:latm    1      30
dbpprop:lats    1      26
dbpprop:latns   1      N
dbpprop:longd   1      0
dbpprop:longm   1      7
dbpprop:longs   1      39
dbpprop:longew  1      W
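
To see that the eight degree/minute/second triples restate the same point as geo:lat and geo:long, the conversion can be sketched as follows. This is an illustrative check only: the function name `dms_to_decimal` and the comparison tolerance are assumptions, and the input values are copied from the two Location slides.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and west are negative in decimal notation.
    return -decimal if hemisphere in ("S", "W") else decimal


# Values from the dbpprop:lat* and dbpprop:long* properties.
lat = dms_to_decimal(51, 30, 26, "N")
lon = dms_to_decimal(0, 7, 39, "W")

# Values from geo:lat and geo:long.
GEO_LAT, GEO_LONG = 51.507221, -0.127500

# Assumed tolerance: treat differences below 0.00001 degrees as the same point.
TOLERANCE = 1e-5
same_point = abs(lat - GEO_LAT) < TOLERANCE and abs(lon - GEO_LONG) < TOLERANCE

print(f"lat {lat:.6f} vs {GEO_LAT}, long {lon:.6f} vs {GEO_LONG}")
print("same point:", same_point)
```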

Triple counts don't indicate utility

http://bbc.co.uk/programmes

2.5 million unique users per week, 60 req/s *

* http://www.guardian.co.uk/media/pda/2011/apr/06/bbc-yves-raimond

http://bbc.co.uk/programmes

Dataset is less than 50 million triples

Dataset Information Spectrum

Low Detail: Summary and overview of dataset content
High Detail: Detailed data model documentation & guides

More Information


Metadata
● Title, Description
● Provenance
● Publication dates
● Licensing
● Usage cues
● Related datasets


Scope
● What types of entity?
● How many of each type?
● Coverage
  ● Geographic
  ● Events (time)


Structure
● URI Scheme
● Vocabulary meshing
● How is a person described?


Internals
● List of Schemas & RDF terms
● Class/property usage counts
● Triple counts
● Named graph structure
● Source files
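
Class and property usage counts of the kind listed under Internals can be derived directly from the triples. The following is a minimal sketch using plain Python tuples rather than an RDF library; the `ex:` identifiers and the sample data are hypothetical, and only the `rdf:type` URI is a real term.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical sample data as (subject, predicate, object) tuples.
triples = [
    ("ex:london", RDF_TYPE, "ex:Place"),
    ("ex:paris", RDF_TYPE, "ex:Place"),
    ("ex:bbc", RDF_TYPE, "ex:Organization"),
    ("ex:london", "ex:name", "London"),
    ("ex:paris", "ex:name", "Paris"),
]


def usage_counts(triples):
    """Return (class usage, property usage) counters for a list of triples."""
    # A class is "used" once for every rdf:type statement that points at it.
    class_counts = Counter(o for _, p, o in triples if p == RDF_TYPE)
    # A property is "used" once for every triple it appears in.
    property_counts = Counter(p for _, p, _ in triples)
    return class_counts, property_counts


class_counts, property_counts = usage_counts(triples)

print("Triple count:", len(triples))
print("Class usage:", dict(class_counts))
print("Property usage:", dict(property_counts))
```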

Summarising Content of a Dataset
● Find all classes in all datasets in Kasabi
● Tag each class against a pre-defined set of categories
  ● Customized version of top-level schema.org classes
● Generate a report card for each dataset listing types of entity
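
The three steps above can be sketched in a few lines. This is an illustration under stated assumptions, not Kasabi's implementation: the `ex:` class identifiers, the instance counts, the `CLASS_CATEGORIES` mapping, and the "Uncategorised" fallback are all hypothetical, and the category names only loosely follow top-level schema.org types.

```python
from collections import Counter

# Hypothetical tagging of classes against a pre-defined set of categories.
CLASS_CATEGORIES = {
    "ex:Settlement": "Place",
    "ex:Artist": "Person",
    "ex:Album": "CreativeWork",
    "ex:Book": "CreativeWork",
}


def report_card(class_counts, categories=CLASS_CATEGORIES):
    """Sum per-class instance counts into per-category totals."""
    card = Counter()
    for cls, count in class_counts.items():
        # Classes with no tag are grouped under a fallback category.
        card[categories.get(cls, "Uncategorised")] += count
    return card


# Hypothetical per-class instance counts for a single dataset.
music_counts = {"ex:Artist": 120, "ex:Album": 300, "ex:Review": 45}

card = report_card(music_counts)
for category, total in card.most_common():
    print(f"{category}: {total}")
```

Keeping the tagging in a plain mapping means the category set can be revised without touching the counting logic.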

Ordnance Survey

http://beta.kasabi.com/dataset/ordnance-survey-linked-data

BBC Music

http://beta.kasabi.com/dataset/bbc-music

British National Bibliography

http://beta.kasabi.com/dataset/british-national-bibliography-bnb

NHS Performance Data

http://beta.kasabi.com/dataset/nhs-performance-data

Summary
● Triple counts tell us nothing
● Vital to present the quality & utility of our data
● Data publishing platforms should support this
● "Progressive disclosure"
  ● Right detail at the right time
● Dataset analysis can generate useful summaries
  ● e.g. an RDF report card
