Linked Open Data Statistics:
Collection and Exploitation
Ivan Ermilov, Michael Martin,
Jens Lehmann, Sören Auer
Agenda
● Why LODStats?
● Architecture
● Core Module
● Web Interface
● RDF DataCube
Why LODStats?
● Evaluates RDF datasets
● Gathers 32 statistical criteria such as:
– Number of triples, entities, literals...
– Average string length
– Vocabularies, classes used
● Helps you understand your data
● Generates VoID descriptions
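To make the criteria above concrete, here is a minimal sketch of how a few of them could be computed in one pass over a dataset. It is not the LODStats implementation. The triple representation (plain strings for IRIs, `("literal", value)` tuples for literals), the helper names, and counting "entities" as distinct subjects are all assumptions made for illustration.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def namespace(iri):
    """Return the IRI up to and including its last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[: cut + 1]


def is_literal(term):
    # Assumption of this sketch: literals are ("literal", value) tuples.
    return isinstance(term, tuple) and term[0] == "literal"


def collect_stats(triples):
    """Compute a handful of statistics in a single pass over the triples."""
    triple_count = 0
    literal_count = 0
    subjects = set()
    classes = Counter()
    vocabularies = Counter()
    string_lengths = []
    for s, p, o in triples:
        triple_count += 1
        subjects.add(s)
        vocabularies[namespace(p)] += 1  # vocabulary = namespace of the property
        if is_literal(o):
            literal_count += 1
            string_lengths.append(len(o[1]))
        elif p == RDF_TYPE:
            classes[o] += 1  # class usage via rdf:type
    avg = sum(string_lengths) / len(string_lengths) if string_lengths else 0.0
    return {
        "triples": triple_count,
        "literals": literal_count,
        "entities": len(subjects),  # here: distinct subjects (an assumption)
        "classes": dict(classes),
        "vocabularies": dict(vocabularies),
        "avg_string_length": avg,
    }


RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SAMPLE = [
    ("http://example.org/leipzig", RDF_TYPE, "http://example.org/ns#City"),
    ("http://example.org/leipzig", RDFS_LABEL, ("literal", "Leipzig")),
    ("http://example.org/dresden", RDF_TYPE, "http://example.org/ns#City"),
    ("http://example.org/dresden", RDFS_LABEL, ("literal", "Dresden")),
]
sample_stats = collect_stats(SAMPLE)
```

Processing triples as a stream keeps memory use bounded by the number of distinct subjects, classes and vocabularies rather than by the dataset size.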
Why LODStats?
LODStats can answer the following questions:
● Which visualization is suitable for my dataset?
– Does the data contain RDF DataCubes?
– Does the data contain geospatial information?
– What is the class hierarchy depth?
● Is it suitable for my application?
– Is it linked to DBpedia?
– Is it the largest existing dataset in this domain?
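Some of these questions reduce to checking which namespaces a dataset uses. The sketch below illustrates that idea only and is not how LODStats answers them. It assumes triples are given as tuples with literals modelled as `("literal", value)`, treats the W3C WGS84 vocabulary as the indicator of geospatial data, and treats any object under `http://dbpedia.org/resource/` as a DBpedia link.

```python
GEO_NS = "http://www.w3.org/2003/01/geo/wgs84_pos#"  # W3C Basic Geo vocabulary
QB_NS = "http://purl.org/linked-data/cube#"          # RDF Data Cube vocabulary
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def answer_questions(triples):
    """Answer three yes/no questions from namespace usage alone."""
    predicates = {p for _, p, _ in triples}
    # IRIs are plain strings in this sketch; literals are tuples and are skipped.
    iri_objects = {o for _, _, o in triples if isinstance(o, str)}
    used_terms = predicates | iri_objects
    return {
        "has_datacube": any(t.startswith(QB_NS) for t in used_terms),
        "has_geo": any(p.startswith(GEO_NS) for p in predicates),
        "links_to_dbpedia": any(o.startswith(DBPEDIA_PREFIX) for o in iri_objects),
    }


SAMPLE = [
    ("http://example.org/leipzig",
     "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Leipzig"),
    ("http://example.org/leipzig", GEO_NS + "lat", ("literal", "51.34")),
    ("http://example.org/leipzig", GEO_NS + "long", ("literal", "12.37")),
]
answers = answer_questions(SAMPLE)
```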
LODStats Architecture
https://github.com/AKSW/LODStats
https://github.com/AKSW/LODStats_WWW
LODStats: Core Module
● Python module (tested on Ubuntu 12.04)
● Command line interface
● Does the actual work of computing the statistics
LODStats: Web Interface
http://stats.lod2.eu/
LODStats Features (1)
● General LOD cloud statistics
● Report of warnings and errors for each dataset
● Report on statistical criteria for each individual dataset
● Export as VoID and DataCube statistical metadata
● Dataset linkage explorer
● Search function for datasets, vocabularies, classes, properties, languages, datatypes
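As an illustration of the VoID export mentioned above, the following sketch serialises a few statistics as a VoID description in Turtle. The function name, the input dictionary layout and the example dataset IRI are hypothetical. Only the `void:` terms themselves come from the VoID vocabulary, and the actual LODStats export may differ.

```python
def to_void_turtle(dataset_iri, stats):
    """Render a small statistics dictionary as a VoID description in Turtle."""
    header = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "",
        f"<{dataset_iri}> a void:Dataset ;",
    ]
    # Integer counts map directly onto VoID statistics properties.
    counts = [
        ("void:triples", stats["triples"]),
        ("void:entities", stats["entities"]),
        ("void:classes", stats["classes"]),
        ("void:properties", stats["properties"]),
    ]
    body = [f"    {name} {value}" for name, value in counts]
    # One void:vocabulary statement per vocabulary used in the dataset.
    body += [f"    void:vocabulary <{v}>" for v in sorted(stats["vocabularies"])]
    return "\n".join(header) + "\n" + " ;\n".join(body) + " ."


void_doc = to_void_turtle(
    "http://example.org/dataset",  # hypothetical dataset IRI
    {
        "triples": 4,
        "entities": 2,
        "classes": 1,
        "properties": 2,
        "vocabularies": [
            "http://www.w3.org/2000/01/rdf-schema#",
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        ],
    },
)
```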
LODStats Features (2)
● REST interface for the above search functions
● Linked Data publication of statistics
● SPARQL endpoint to query all extracted statistics
● CubeViz installation for facet-based browsing and visualisation of the statistical metadata
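A client could query such a SPARQL endpoint over HTTP roughly as sketched below. The endpoint URL is a placeholder, not the real LODStats endpoint, and the query assumes only that the statistics are published as RDF Data Cube observations. The dimensions and measures actually used by LODStats are not shown here. The sketch builds the request URL following the SPARQL 1.1 Protocol (`query` parameter on a GET request) without sending it.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ENDPOINT = "http://example.org/sparql"  # placeholder, not the real endpoint

# List a few Data Cube observations together with the dataset they belong to.
QUERY = """
PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT ?obs ?dataset WHERE {
  ?obs a qb:Observation ;
       qb:dataSet ?dataset .
} LIMIT 10
"""


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)

# Decoding the URL again recovers the original query text.
decoded_query = parse_qs(urlparse(url).query)["query"][0]
```

The resulting URL can be fetched with any HTTP client, typically with an `Accept: application/sparql-results+json` header to request JSON results.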
LODStats: Class Stats
http://stats.lod2.eu/
LODStats RDF DataCube
https://github.com/AKSW/LODStats_WWW/blob/master/LODStats-Sparqlify/lodstats.sml
LODStats RDF DataCube
Thank you for your attention!
Questions?
Ivan Ermilov
iermilov@informatik.uni-leipzig.de

LODStats (Presentation for KESW2013 System Demo)