The RDF Report Card: Beyond the Triple Count
My talk from the Semtech Biz conference in London.

I argued that it is time to move beyond discussing the size of datasets and to encourage a more nuanced view that helps people understand their quality and utility.

The RDF Report Card is offered as one simple, high-level visualization.

Published in: Technology, Education
Transcript

  • 1. The RDF Report Card: Beyond the Triple Count. 26th September 2011, SemTechBiz 2011. Leigh Dodds, @ldodds, http://kasabi.com, http://slideshare.net/ldodds
  • 2. Triple counts tell us nothing
  • 3. Triple counts are not a quality indicator
  • 4. http://dbpedia.org/resource/London
  • 5. 6 triples for Population Density
       Property | Count | Value
       http://dbpedia.org/ontology/PopulatedPlace/populationDensity | 2 | 4807.0, 4806.971873853451
       http://dbpedia.org/ontology/populationDensity | 2 | 4806.971874, 4807.000000
       http://dbpedia.org/property/populationDensityKm | 1 | 4807
       http://dbpedia.org/property/populationDensitySqMi | 1 | 12450
  • 6. 12 triples for Location (1)
       Property | Count | Value
       georss:point | 1 | 51.507222222222225 -0.1275
       geo:geometry | 1 | POINT(-0.1275 51.5072)
       geo:lat | 1 | 51.507221
       geo:long | 1 | -0.127500
  • 7. 12 triples for Location (2)
       Property | Count | Value
       dbpprop:latd | 1 | 51
       dbpprop:latm | 1 | 30
       dbpprop:lats | 1 | 26
       dbpprop:latns | 1 | N
       dbpprop:longd | 1 | 0
       dbpprop:longm | 1 | 7
       dbpprop:longs | 1 | 39
       dbpprop:longew | 1 | W
  • 8. ~4.6m redundant triples
  • 9. Triple counts don't indicate utility
  • 10. http://bbc.co.uk/programmes: 2.5 million unique users per week, 60 req/s * (* source: http://www.guardian.co.uk/media/pda/2011/apr/06/bbc-yves-raimond)
  • 11. http://bbc.co.uk/programmes: Dataset is less than 50 million triples
  • 12. Beyond the Triple Count
  • 13. Dataset Information Spectrum: Low Detail (summary and overview of dataset content) to High Detail (detailed data model documentation & guides)
  • 14. Dataset Information Spectrum: Low Detail (summary and overview of dataset content) to High Detail (detailed data model documentation & guides); More Information
  • 15. Dataset Information Spectrum (Low Detail to High Detail): Metadata
       ● Title, Description
       ● Provenance
       ● Publication dates
       ● Licensing
       ● Usage cues
       ● Related datasets
  • 16. Dataset Information Spectrum (Low Detail to High Detail): Scope
       ● What types of entity?
       ● How many of each type?
       ● Coverage: geographic, events (time)
  • 17. Dataset Information Spectrum (Low Detail to High Detail): Structure
       ● URI Scheme
       ● Vocabulary meshing
       ● How is a person described?
  • 18. Dataset Information Spectrum (Low Detail to High Detail): Internals
       ● List of Schemas & RDF terms
       ● Class/property usage counts
       ● Triple counts
       ● Named graph structure
       ● Source files
  • 19. RDF Report Card Example
  • 20. Summarising Content of a Dataset
       ● Find all classes in all datasets in Kasabi
       ● Tag each class against a pre-defined set of categories (a customized version of the top-level schema.org classes)
       ● Generate a report card for each dataset listing types of entity
  • 21. Report Card Categories
  • 22. Ordnance Survey: http://beta.kasabi.com/dataset/ordnance-survey-linked-data
  • 23. BBC Music: http://beta.kasabi.com/dataset/bbc-music
  • 24. British National Bibliography: http://beta.kasabi.com/dataset/british-national-bibliography-bnb
  • 25. NHS Performance Data: http://beta.kasabi.com/dataset/nhs-performance-data
  • 26. Summary
       ● Triple counts tell us nothing
       ● Vital to present the quality & utility of our data (data publishing platforms should support this)
       ● "Progressive disclosure": right detail at the right time
       ● Dataset analysis can generate useful summaries, e.g. an RDF report card
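
The redundancy shown on slides 5 to 7 can be checked with a little arithmetic. The sketch below is not from the talk: it takes the values quoted on those slides and shows that the degrees/minutes/seconds properties restate geo:lat and geo:long, and that the per-square-mile density restates the per-square-kilometre figure. The function name and the conversion constant are illustrative choices.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative in the usual signed convention.
    return -value if hemisphere in ("S", "W") else value


# Values copied from slide 7 (dbpprop:latd, latm, lats, latns and the long* equivalents).
lat = dms_to_decimal(51, 30, 26, "N")
lon = dms_to_decimal(0, 7, 39, "W")

# Values copied from slide 6 (geo:lat, geo:long).
GEO_LAT, GEO_LONG = 51.507221, -0.127500

# Slide 5: people per square kilometre versus people per square mile.
KM2_PER_SQ_MI = 2.589988  # 1 square mile = 1.609344 ** 2 square kilometres
density_sq_mi = 4807 * KM2_PER_SQ_MI

print("latitude difference:", abs(lat - GEO_LAT))
print("longitude difference:", abs(lon - GEO_LONG))
print("density per sq mi (rounded):", round(density_sq_mi))
```

Both coordinate differences come out at rounding level, and the converted density rounds to the 12450 quoted on slide 5, so these properties carry the same facts several times over.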

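
The three steps on slide 20 (find classes, tag them against categories, list entity types per dataset) might be sketched as follows. This is a minimal illustration and not Kasabi's implementation: the class-to-category table, the example.org URIs, and the `report_card` function are all hypothetical, and the real category list from the talk is not in the transcript.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical tagging of classes against a few top-level categories.
CLASS_CATEGORIES = {
    "http://xmlns.com/foaf/0.1/Person": "Person",
    "http://purl.org/ontology/mo/MusicArtist": "Person",
    "http://purl.org/ontology/mo/Record": "CreativeWork",
    "http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing": "Place",
}


def report_card(triples):
    """Count distinct typed entities per category; untagged classes go to 'Other'."""
    seen = set()
    counts = Counter()
    for subject, predicate, obj in triples:
        if predicate != RDF_TYPE:
            continue  # only rdf:type statements say what kind of entity this is
        category = CLASS_CATEGORIES.get(obj, "Other")
        if (subject, category) not in seen:
            seen.add((subject, category))
            counts[category] += 1
    return counts


# Made-up sample data, given as (subject, predicate, object) tuples.
sample = [
    ("http://example.org/artist/1", RDF_TYPE, "http://xmlns.com/foaf/0.1/Person"),
    ("http://example.org/artist/1", RDF_TYPE, "http://purl.org/ontology/mo/MusicArtist"),
    ("http://example.org/artist/2", RDF_TYPE, "http://purl.org/ontology/mo/MusicArtist"),
    ("http://example.org/record/1", RDF_TYPE, "http://purl.org/ontology/mo/Record"),
    ("http://example.org/artist/1", "http://xmlns.com/foaf/0.1/name", "Example Artist"),
    ("http://example.org/thing/1", RDF_TYPE, "http://example.org/schema/Widget"),
]

card = report_card(sample)
for category, count in sorted(card.items()):
    print(category, count)
```

Counting each entity once per category, instead of once per rdf:type triple, keeps an artist typed as both foaf:Person and mo:MusicArtist from inflating the Person total, which is the same concern the talk raises about raw triple counts.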