A Usage-dependent Life Cycle • Request to put away the “train” • toy train, made of plastic: SELECT * WHERE { ?t a:madeOf a:Plastic } • toy train, made of wood: SELECT * WHERE { ?t b:madeOf b:Wood } • Enter the room • Negotiate understanding • USAGE
Yet another… • OTK • NeOn • METHONTOLOGY • DILIGENT • … The Maintenance Black Box. Make it less a methodology, but support people in getting their “Things” done!
Who is hurt by that? • rather small/simple ontologies – min. effort for OE – “under-engineered” • unknown user requirements
Hey “LOD people”, do you think that ontology engineering matters? Usage-based ontology engineering
Survey covering approx. 25% of all cloud datasets • size • complexity • engineering methodology • … Publishers of 75% of the datasets do not feel 99% responsible for their data? (Survey ran in October 2010.)
Concrete Example of a Usage-based Approach: digging in log files
Usage? Yes*! But beyond? • Request to put away the “train” • SELECT * WHERE { ?t a:madeOf a:Plastic } • SELECT * WHERE { ?t b:madeOf b:Wood } • USAGE • What about the future of SPARQL endpoints on the WoD? * W.r.t. an architecture proposed by a famous “Web-Extremist”
You should have a query endpoint! • You get something valuable, which helps you to play your role on the WoD! [Background figure, partly obscured: “Effort Distribution between Publisher and Consumer”, with the labels “Publisher provides links” and “Links as hints”, from Christian Bizer: Pay-as-you-go Data Integration (21/9/2010)]
Usage Analysis • queries • patterns • triples • primitives • visualize as heat maps • zoom in and see details
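A minimal sketch of how the levels listed above (queries, patterns, triples, primitives) could be counted from a log, assuming the endpoint log has already been reduced to one SPARQL query string per entry. The whitespace-based matching, the sample log, the predicate `a:colour`, and the names `extract_patterns` and `count_primitives` are illustrative assumptions, not the tooling behind this talk; a real analysis would use a proper SPARQL parser.

```python
import re
from collections import Counter


def extract_patterns(query):
    """Return the (subject, predicate, object) triple patterns of one query.

    Assumption: terms contain no whitespace and patterns are separated
    by " . " inside a single { ... } block (no nesting, no property paths).
    """
    match = re.search(r"\{(.*)\}", query, re.S)
    if match is None:
        return []
    patterns = []
    for statement in re.split(r"\s\.(?:\s|$)", match.group(1)):
        terms = statement.split()
        if len(terms) == 3:
            patterns.append(tuple(terms))
    return patterns


def count_primitives(queries):
    """Count pattern shapes and predicates over a list of query strings."""
    counts = {"patterns": Counter(), "predicates": Counter()}
    for query in queries:
        for s, p, o in extract_patterns(query):
            # Collapse variable names so ?t and ?x give the same pattern shape.
            shape = tuple("?" if t.startswith("?") else t for t in (s, p, o))
            counts["patterns"][shape] += 1
            if not p.startswith("?"):
                counts["predicates"][p] += 1
    return counts


# Hypothetical log entries; a:colour is invented for illustration.
SAMPLE_LOG = [
    "SELECT * WHERE { ?t a:madeOf a:Plastic }",
    "SELECT * WHERE { ?t b:madeOf b:Wood }",
    "SELECT * WHERE { ?x a:madeOf a:Plastic . ?x a:colour ?c }",
]

usage = count_primitives(SAMPLE_LOG)
print(usage["predicates"].most_common())
```

The resulting counters are the kind of raw numbers a heat map could be drawn from: one cell per pattern or predicate, shaded by how often it was requested.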
Some Results (DBpedia Analysis) • ns:Band ns:instrument ?x • ns:Band ns:genre ?y • ns:Band ns:associatedBand ?z • ns:Band ns:knownFor ?x • ns:Band ns:nationality ?y • findings: inconsistent data, missing facts. Complete analysis can be found at http://page.mi.fu-berlin.de/mluczak/pub/visual-analysis-of-web-of-data-usage-dbpedia33/
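One way such findings could be derived is to compare what users ask for with what the dataset actually contains. The sketch below assumes that query counts per (class, property) pair and the set of pairs present in the data are already available; the function name `find_gaps`, the counts, and the availability set are invented toy values, not results from the DBpedia analysis.

```python
def find_gaps(queried, available, min_count=1):
    """Return (pair, count) for pairs requested at least min_count times
    but absent from the dataset, most requested first."""
    gaps = [
        (pair, count)
        for pair, count in queried.items()
        if count >= min_count and pair not in available
    ]
    return sorted(gaps, key=lambda item: (-item[1], item[0]))


# Toy input: the counts and the availability set are made up for illustration.
queried = {
    ("ns:Band", "ns:genre"): 40,
    ("ns:Band", "ns:instrument"): 25,
    ("ns:Band", "ns:knownFor"): 3,
}
available = {("ns:Band", "ns:genre")}

candidates = find_gaps(queried, available, min_count=5)
print(candidates)
```

Pairs that are requested often but never answered are candidates for missing facts or for a schema mismatch, and still need a human check before the data or vocabulary is changed.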
Some Thoughts about Benefit • usage analysis helps to acquire new knowledge – links between data – external schema • lightweight approach helps to bootstrap linked data • helps to increase the quality of data on the Web. It is not necessary to automate everything if the result has enough (business) value in a problem domain anyway.
Take Away • make it less a methodology: LOD vocabularies are specific ontologies and need specific life cycle support • provide (query) access to your data: endpoint and usage analysis can help to play your role on the WoD and to maintain them (and the data) • it is not implicitly necessary to automate things that enable automation • this is a benefit for the dataset publisher and the Web of data as a whole. Hey “LOD people”, do you think that dataset maintenance matters? Markus Luczak-Rösch (firstname.lastname@example.org-berlin.de), Freie Universität Berlin, Networked Information Systems (www.ag-nbi.de)
Actual Addition • “15,500,000 people in Germany are not willing to use the internet” – emphasis on the ESWC discussion: bridging the gap (directly or indirectly) between these people and the internet/Web has a high potential to influence societal transformation (they are not going to use a browser or an iPhone and they do not care for semantics). Source: ARD-Morgenmagazin, 08-07-2011