Open Data & coding data.gov.uk David Read Open Knowledge Foundation [email_address]
Contents The context: Linked Open Data
Our data catalogue: CKAN
data.gov.uk using CKAN
Discussion
Open Data "Data is expensive to create"
"But think of the mutual benefits of it being open"
Accessible
Allowed to use and republish
Without restriction
Science UEA criticised for a "culture of withholding information." CC-BY-SA http://commons.wikimedia.org/wiki/User:ChrisO
Geographic data
Public data
Linking data Dr. Hans Rosling, Professor of Global Health, Karolinska Institute, Sweden (TED talk)
Linking data 2
Linked data
Opening government data Transparency --> effectiveness
Labour and Conservatives agree (!) with Cambridge economists: making government datasets public will bring a £6bn boost to the UK economy (we have paid for it...)
Open Data and Open Software Zero cost
Good performance
Principles: Many hands make light work / natural selection / wisdom of crowds / on the shoulders of giants
Not a proprietary format
No supplier lock-in
Infrastructure: Software | Data
Licence: GPL | PDDL, ODbL, ODC-By (OKF 2007-), isitopendata.org (OKF 2009-)
Modules/Linking: Lib, egg | Spreadsheet, database, RDF/OWL
Human Discovery: CKAN (OKF 2008-)
Automatic Distribution: Apt-get, CPAN, easy_install | CKAN datapkg (OKF 2008-)
Hosting: Sourceforge, PyPI, bitbucket | archive.org / knowledgeforge.net
Community: freshmeat | data.gov.uk email list closest?
Open Knowledge Foundation Aim: promote Open Knowledge
Founded 2004 as a 'not for profit' organisation
Strong connections with Cambridge University
A key director: Rufus Pollock
Volunteer driven
Create software tools (CKAN, KnowledgeForge), organise conferences, licences, create visuals & mash-ups (Where Does My Money Go, Open Shakespeare), campaigns (Panton Principles)
Introducing... CKAN "Comprehensive Knowledge Archive Network" ...well... a fancy Data Catalog
"CKAN is a registry or catalogue system for datasets or other 'knowledge' resources. CKAN aims to make it easy to find, share and reuse open content and data, especially in ways that are machine automatable."
 
 
 
CKAN data model
Dataset: name, title, version, url, author, licence, notes, extras
Tag: name
Resource: url, format, description, hash
Group: name, title, description
(* marks the "many" end of the relationships between these entities in the diagram)
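The data model on this slide can be sketched as plain Python classes. This is only an illustration, not CKAN's actual implementation: the field names come from the slide, while the relationship directions (a dataset holding many tags and resources, a group holding many datasets), the choice of plain strings for tags, and all example values are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    """A downloadable file or endpoint belonging to a dataset."""
    url: str
    format: str = ""
    description: str = ""
    hash: str = ""


@dataclass
class Dataset:
    """Core catalogue entry; field names follow the slide."""
    name: str
    title: str = ""
    version: str = ""
    url: str = ""
    author: str = ""
    licence: str = ""
    notes: str = ""
    # 'extras' holds arbitrary additional key/value metadata (assumed shape).
    extras: Dict[str, str] = field(default_factory=dict)
    # Tags are modelled as bare names here, since a Tag has only a name.
    tags: List[str] = field(default_factory=list)
    resources: List[Resource] = field(default_factory=list)


@dataclass
class Group:
    """A named collection of datasets."""
    name: str
    title: str = ""
    description: str = ""
    datasets: List[Dataset] = field(default_factory=list)


# Hypothetical example values, for illustration only.
example = Dataset(
    name="example-road-traffic",
    title="Example road traffic counts",
    tags=["transport"],
    resources=[Resource(url="http://example.org/traffic.csv", format="csv")],
)
transport = Group(name="transport", title="Transport", datasets=[example])
```

Keeping resources as a list on the dataset reflects the idea that one catalogue entry can point at several files or formats of the same data.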
Wiki
 
 

Open Data and CKAN Data Catalogues

Editor's Notes

  • #6 Also Haiti, iPhone cycle map
  • #7 Allowing with JobCentresPlus, was highlighted as an innovative use of government data
  • #8 But linking data is even more powerful Health and economic data Dr. Hans Rosling, Professor of Global Health, Karolinska Institute, Sweden (TED talk)
  • #9 Geonames – lat/long of place names. DBpedia – munge of Wikipedia content, e.g. where do footballers in the Premiership come from?
  • #10 Note: Google Maps here – Google have built their business on being very good at not only search, but linking data too. Map has restaurants, travel directions, traffic, related ads. This profits them, but what about the rest of society?
  • #12 One way to achieve what hundreds of organised and motivated Google programmers do?
  • #13 Installing Linux packages – a really sophisticated system for downloading lots of modules that work together. Someone might combine a couple of datasets, may well do some cleaning, produce a graph, but doesn't give back the data. Also: Scraperwiki
  • #19 Core metadata based on the Debian package. No dependencies shown here, but we do have those too.
  • #23 Can also update via API. Also have Python, PHP, Drupal, WordPress and other clients to help access the API.
  • #25 Lobbying governments, or just to collect known datasets. Groups like ownership and personalisation of the site.
  • #26 clone/push/pull/merge/reject changes
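Note #23 mentions accessing the catalogue via an API. A minimal sketch of what a read request could look like is below. It assumes CKAN's later "Action API" (`/api/3/action/package_show`), which may postdate this talk; the base URL, dataset name and sample response body are hypothetical, and no network call is made.

```python
import json
from urllib.parse import urlencode


def package_show_url(base_url: str, dataset_name: str) -> str:
    """Build the URL that asks a CKAN site for one dataset's metadata."""
    query = urlencode({"id": dataset_name})
    return f"{base_url.rstrip('/')}/api/3/action/package_show?{query}"


def parse_action_response(body: str) -> dict:
    """Unwrap the Action API envelope and return the 'result' payload."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    return payload["result"]


# Hypothetical catalogue address and canned response, for illustration only.
url = package_show_url("http://catalogue.example.org/", "example-dataset")
sample_body = json.dumps(
    {"success": True, "result": {"name": "example-dataset", "title": "Example"}}
)
dataset = parse_action_response(sample_body)
```

In practice the body would come from an HTTP GET on `url`; separating URL building from response parsing keeps both parts easy to check without a live catalogue.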