Web2.0.2012 - lesson 10 - open data
Published in: Technology

  • 1. Web 2.0: blog, wiki, tag, social network: what they are, how to use them and why they are important
    Lesson 10: Open Government and Open Data
    Carlo Vaccari, vaccaricarlo@gmail.com
    Camerino University – 2011/2012
  • 2. This material is distributed under the Creative Commons "Attribution - NonCommercial - Share Alike - 3.0" license, available at .
  • 3. Origins
    Letter from Thomas Jefferson to Isaac McPherson (1813)
    If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it. Its peculiar character, too, is that no one possesses the less, because every other possesses the whole of it. He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me. That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature, when she made them, like fire, expansible over all space, without lessening their density in any point, and like the air in which we breathe, move, and have our physical being, incapable of confinement or exclusive appropriation.
  • 4. Origins
    1966: Freedom of Information Act (FOIA), consistent with the belief that people have the “right to know” about government records.
    The Act sets out rules that allow anyone to learn how the federal government works, including full or partial access to classified documents.
    The measure guarantees the transparency of public administration towards the citizen and the freedom of the press.
  • 5. Origins
    Since the 80s, campaigns in favor of free access to information have been decisive for the development and dissemination of new digital media.
    A key driver of these initiatives has been the movement for Free or Open Source Software (ecumenically called FLOSS), through the work of Richard Stallman and Linus Torvalds.
    Stallman, founder of the movement, coined the definition of free software, an expression that includes the freedom to run, copy, distribute, study and modify a program. Stallman also introduced the concept of copyleft, a term that literally means "permission to copy": a license by which the author grants the public some rights and sets the conditions under which the work can be used.
    Torvalds, creator of Linux, and the large community of programmers who have collaborated in its development have shown the feasibility of the model conceived by Stallman.
  • 6. Origins
    Outside of software, the concept of copyleft has spread to the field of content (text, music, video) through the work of Lawrence Lessig, founder of Creative Commons, and has gone on to reach the field of scientific research.
    The Open Access movement, born in 2004, has focused on the scientific literature.
    In 2008, the European Commission stated that 20% of the research funded by the Commission within FP7 must be published open access after an embargo of 6-12 months; this action was followed by the European Research Council (ERC, open access publishing after 6 months), and then by the European Science Foundation (ESF) and the European Heads of Research Councils (EuroHORCs).
  • 7. Origins
    The advent of so-called "social software", i.e. applications in which users become content producers, grouped under the banner of Web 2.0, has enabled the Internet to become a platform that allows interaction between different users, who produce and share content freely.
    These behaviors also engage the public sphere. In the past two years, the demands of the movement for open access to knowledge have also been addressed to public sector information (PSI).
    Encouraged by the changes underway and the results obtained, a new grassroots movement, known as Open Government Data, is spreading in industrialized countries with the aim of achieving proactive open access to data in a specific area: that of political institutions and public administration.
  • 8. Gov 2.0 pillars
    Government 2.0 three pillars:
    1. Leadership, policy and governance to achieve the necessary shifts in public sector culture and practice. Cultural change is at the heart of Government 2.0 and is more important than the development of policy or the technical challenges of adopting new technologies.
    2. Application of Web 2.0 collaborative tools and practices to the business of government. As they do outside of government, these tools and practices can increase productivity and efficiency. They are an opportunity to make representative democracy more responsive, participatory and informed.
    3. Open access to Public Sector Information (PSI)
  • 9. Gov 2.0 links
    Top 10 gov 2.0 web sites: Government govspace platform: state (au) Gov 2.0 action plan: gov 2.0 sites for Gov 2.0 from Web 2.0
  • 10. From Gov 2.0 to Open Gov
    Gov 2.0 is a fundamental step along the way to Open Government, bringing together the utilisation of emerging Web tools and mechanisms which enable multi-channel communications and information sharing.
    Much of the drive for “open” government comes from the “open organisation” and “open data” movements because essentially, as the Economist stated in February 2010, “the nation has always been a product of information management”.
  • 11. Open Government
    Barack Obama made heavy use of social media during his 2008 election campaign: he won with 2x the web traffic, 4x the YouTube views, 5x the Facebook friends and 10x the online staff of McCain.
    First act of Obama as President: the Memorandum on Transparency and Open Government, which starts:
    “My Administration is committed to creating an unprecedented level of openness in Government. We will work together to ensure the public trust and establish a system of transparency, public participation, and collaboration. Openness will strengthen our democracy and promote efficiency and effectiveness in Government”
    Then the Open Government Directive, “to direct executive departments and agencies to take specific actions to implement the principles of transparency, participation, and collaboration set forth in the President’s Memorandum”.
    The Open Government Initiative website lists all the actions.
  • 12. Open Government in Europe
    An Open Declaration on European Public Services proposes three core principles for European public services:
    1. Transparency:
    - public sector organisations “transparent by default”
    - clear, regularly-updated information on all processes
    - citizens able to highlight areas where transparency should be increased
    - open, standard and reusable formats
    2. Participation:
    - citizens' input in all government activities
    - collaboration with citizens as a core competence of government
    3. Empowerment:
    - public institutions as platforms for public value creation
    - data and services available in ways that others can build on
    - providing resources to enable citizens to solve problems
    - citizens as owners of their own personal data, enabled to monitor and control how these data are shared
    The Declaration was taken up in the 2009 Malmö Ministerial Declaration.
  • 13. Open Government
    In Europe: Council of Europe Convention on Access to Official Documents, Tromsø, 18.VI.2009 yet signed )
    Gov 2.0 examples: 10, 9, 2, 1)
  • 14. Open Government examples
  • 15. Practical Steps for Government Agencies
    Recommendations by Tim O'Reilly: Government as a Platform (see)
    - Issue your own open government directive
    - Create “a simple, reliable and publicly accessible infrastructure that ‘exposes’ the underlying data” from your city, county, state, or agency
    - Build your own websites and applications using the same open systems for accessing the underlying data as they make available to the public at large
    - Share those open APIs with the public, using for federal APIs and creating state and local equivalents
    - Share your work with other cities, counties, states, or agencies. Provide your work as open source software, work with other bodies to standardize web services, build a common cloud computing platform, or simply share best practices (see Code for America)
  • 16. Practical Steps for Government Agencies
    - Don’t reinvent the wheel: support existing open standards and use open source software whenever possible (e.g. Open311 is a great example of an open standard being adopted by many cities)
    - Create a list of software applications that can be reused by your government employees without procurement
    - Create an “app store” that features applications created by the private sector as well as those created by your own government unit (see )
    - Create permissive social media guidelines that allow government employees to engage the public without having to get pre-approval from superiors
    - Sponsor meetups, code camps, and other activity sessions to actually put citizens to work on civic issues
  • 17. OECD PSI Principles
    OECD recommendations about PSI principles:
    1. Openness. Maximize the availability of public sector information for use and re-use, with openness as the default rule.
    2. Access and transparent conditions for re-use. In principle all accessible information would be open to re-use by all.
    3. Asset lists. Strengthening awareness of what public sector information is available for access and re-use.
    4. Quality. Ensuring methodical practices to enhance data quality through cooperation of various government bodies.
    5. Integrity. Protect information from unauthorized modification or from denial of authorized access.
    6. New technologies. Storage technologies, open formats, multiple languages, technological obsolescence and long term preservation.
  • 18. OECD PSI Principles
    7. Copyright. Intellectual property rights should be respected, exercising copyright in ways that facilitate re-use. Public sector information must be copyright-free.
    8. Pricing. PSI provided free of charge, or information pricing as transparent as possible.
    9. Competition. PSI open to all possible users and re-users on non-exclusive terms.
    10. Transparent redress mechanisms.
    11. Facilitate public private partnerships.
    12. International access and use. Support international co-operation for commercial re-use and non-commercial use.
    13. Best practices. Encouraging the wide sharing of best practices and exchange of information on implementation, training, copyright and monitoring.
  • 19. From PSI to Open Data
    PSI in Europe: EPSIplus platform
    Public Sector Information (PSI) is not necessarily open
    Many different rules (national/regional laws!) about PSI re-use
    For PSI to be open for re-use it needs to be
    - discoverable
    - legally open
    - technically open
    - free of charge
    Rules for Open Government Data:
    - if it can’t be spidered or indexed, it doesn’t exist
    - if it isn’t available in open and machine readable format, it can’t engage
    - if a legal framework doesn’t allow it to be re-used, it doesn’t empower
  • 20. Value of the Data
    Opening data can have a big economic value, and that value lies in the possibility of their use and reuse.
    What has real value is what you develop from them, and the fact that they are available.
    Data are so ubiquitous that they are becoming a commodity, like electricity and water.
    In the political field, data have value only if a critical mass of people forms who know them and use them to form opinions and participate in public activities.
    The more data are used, the greater their value, because use increases the number of decisions, goods, products and valuable services based on them.
  • 21. Value of the Data
    2006 European Commission MEPSIR Study (Measuring European Public Sector Information Resources): estimates for the overall market size for public sector information in the European Union range from €10 to €48 billion, with a mean value around €27 billion.
    2011 Vickery's Study: a first estimate puts the economic gains from opening up public information and providing access for free at up to €40 billion for the EU27.
    But PSI can be used in direct and indirect applications across the economy, and the direct and indirect economic impacts from PSI applications and use across the whole EU economy are of the order of €140 billion annually.
    Tim Berners-Lee: Raw Data Now! TED 2009
  • 22. Recent EU moves
    December 2011: Digital Agenda – turning government data into gold
    Open Data Strategy for Europe, expected to deliver a €40 billion boost to the EU's economy each year.
    Open Data Package consisting of:
    - A proposal for a revision of the Directive 2003
    - A Communication on Open Data
    - New Commission rules on re-use of the documents it holds
    Three actions to overcome barriers and fragmentation:
    - Adapt the legal framework for data re-use
    - Financial resources in favor of open data and European data portals
    - Facilitate coordination between European countries, in particular through:
      - PSI group and PSI platform (exchange of good practices)
      - LAPSI network (legal issues related to PSI)
      - ISA action Interoperability Solutions for EU PA (€164 m)
  • 23. Open Definition
    From : what is “open”?
    1. Access
    Available as a whole and at a reasonable reproduction cost, preferably downloading via the Internet without charge. The work must be available in a convenient and modifiable form.
    2. Redistribution
    The license shall not restrict any party from selling or giving away the work either on its own or as part of a package made from derived work. License without royalty or other fee.
    3. Reuse
    The license must allow for modifications and must allow them to be distributed under the terms of the original work.
    4. Absence of Technological Restriction
    The work must be provided in such a form that there are no technological obstacles to the performance of the above activities (e.g. open data format)
  • 24. Open Definition
    5. Attribution
    The license may require the attribution of the contributors and creators to the work.
    6. Integrity
    The license may require as a condition for the work being distributed in modified form that the resulting work carry a different name or version number from the original work.
    7. No Discrimination Against Persons or Groups
    The license must not discriminate against any person or group of persons.
    8. No Discrimination Against Fields of Endeavor
    The license must not restrict anyone from using the work in a specific field of endeavor. For example, it may not restrict the work from being used in a business, or for genetic research.
  • 25. Open Definition
    9. Distribution of License
    The rights attached to the work must apply to all to whom the work is redistributed without the need for execution of an additional license by those parties.
    10. License Must Not Be Specific to a Package
    The rights attached to the work must not depend on the work being part of a particular package. If the work is extracted from that package, all parties to whom the work is redistributed should have the same rights as those granted with the original package.
    11. License Must Not Restrict the Distribution of other Works
    The license must not place restrictions on other works that are distributed along with the licensed work. For example, the license must not insist that all other works distributed on the same medium are open.
    Many items are similar to the Open Source Definition (see)
  • 26. Open/Public (Government) Data
    Source
  • 27. How to Open Up data?
    Key rules in opening up data:
    - Keep it simple (KISS). Start out fast, small and simple. Not every dataset must become open right now. Moving as rapidly as possible is good because it means you can learn from experience.
    - Engage early and engage often with actual and potential users and reusers of the data: citizens, businesses, developers. Much of the data will reach ultimate users via infomediaries who take the data and transform and remix them: users don’t need a large vector database but are interested in the map. Thus, the primary users to engage are the infomediary reusers.
    - Address common fears and misunderstandings, especially if you are working with large institutions such as government. In opening up data one encounters plenty of questions and concerns, and it is important to (a) identify the main ones and (b) address them at as early a stage as possible.
  • 28. Steps to Open Data
    There are 4 main steps in making data open (unsorted, sometimes recursive):
    - Choose the dataset(s) you plan to make open, though note you may need to return to this step if you encounter problems, especially at step 2.
    - Apply an open license, suitable for all rights existing on the data (Legal Openness)
    - Make the data available, in bulk and in a useful format, sometimes via API (Technical Openness)
    - Make them discoverable: post on the web and perhaps organize a central catalogue to list your open datasets (or put them in existing catalogues)
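The four steps can be read as fields that a catalogue entry must carry before a dataset is listed. The following is a minimal sketch under that assumption: the field names, the `missing_fields` helper and the example URL are hypothetical and do not come from any particular catalogue software.

```python
import json

# Hypothetical field names for a catalogue entry; real catalogues define their own schema.
REQUIRED_FIELDS = ("title", "license", "download_url", "format")


def missing_fields(record):
    """Return the required fields that are absent or empty in a catalogue record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


record = {
    "title": "Example municipal budget 2011",           # step 1: the chosen dataset
    "license": "ODC-By",                                 # step 2: legal openness
    "download_url": "https://example.org/budget.csv",   # step 3: bulk download (placeholder URL)
    "format": "CSV",                                     # step 3: open, machine-readable format
}

# An empty list means the entry carries everything needed to be listed.
print(missing_fields(record))
# Step 4: the JSON text that a catalogue could publish for this dataset.
print(json.dumps(record, indent=2))
```

A record that lacks a license or a download location fails the check, which mirrors the note that step 1 may have to be revisited when step 2 runs into problems.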
  • 29. How to choose the datasets
    - Ask the ‘community’ (i.e. actual or potential users of the data) what they want
    - Put up a web page with details of this request for data suggestions and a simple way to submit data requests (e.g. via email or a simple webform). Some tips:
      - Avoid registration
      - Prepare a short (5-20 items) list of datasets as a prompt
      - This list should be a quick process that identifies what datasets could be made open
    - Circulate the request to relevant mailing lists, forums and individuals, pointing back to the main webpage
    - Run a consultation event, but make sure you run it at a convenient time when the average business person and ‘data hacker’ can attend
  • 30. Apply an Open License
    If you are planning to make your data available you should put a license on it, and if you want your data to be open this is even more important.
    For licensing purposes you must distinguish:
    - Data (the collection)
    - Contents (individual items, part of the collection, rows/columns)
    - Structure (schema, metadata, Data Definition)
  • 31. OpenData Licenses
    Many licenses proposed:
    OpenData Commons proposes three licenses:
    - Public Domain Dedication and License (PDDL)
    - Attribution License (ODC-By)
    - Open Database License (ODC-ODbL): like the GPL (or CC Attribution Share-Alike), it requires public reusers of your data to share back changes (and attribute)
    Opendefinition gives a list of licences conformant or not to the “open” definition
    Many national licenses:
    - Canada
    - UK
    - Norway
    - Italy (now IODL 2.0)
  • 32. Open Data organisations
    Open Knowledge Foundation: “From sonnets to statistics, genes to geodata”
    Founded in 2004, a not-for-profit organization promoting open knowledge: any kind of data and content (sonnets to statistics, genes to geodata) that can be freely used, reused, and redistributed.
    OKF created standards like the Open Definition, organizes events like OKCon and Open Government Data Camp, runs projects like “Where Does My Money Go” and Open Shakespeare, and develops tools like CKAN to help people share open material.
    Italian organisations: SpaghettiOpendata,, OPenPolis,
  • 33. Software for Open Data
    CKAN (Comprehensive Knowledge Archive Network) is open-source “data hub” software designed to make it easier to find, share, reuse and collaboratively develop data and content, especially open data and content.
    In and Italy: code released as OSS (a modified Drupal version), used also for India → Open Government Platform
  • 34. CKAN features
    - Free/Open-Source software, written in Python
    - Core catalog based around Resources (Files and APIs) and groupings of those (Packages)
      - Tagging
      - Package Groups
      - Ratings
      - Arbitrary metadata
      - Package relationships
    - Web user interface (WUI)
      - Package adding, editing, listing etc
      - Wiki features such as “Recent Changes”, edit histories, purging of changes etc
      - User management and user home pages
    - Full JSON-based REST API with clients in Python, PHP, Perl …
      - RDF version also available
      - CKAN is easy to use as your “catalogue” backend
    - An Extension and Plugin system
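As an illustration of the JSON-based API, the sketch below builds a dataset search request without sending it. It assumes a CKAN instance that exposes the version 3 Action API with its `package_search` action (later CKAN releases do; the API available when these slides were written may have differed). The host `ckan.example.org` and the `package_search_url` helper are placeholders.

```python
from urllib.parse import urlencode


def package_search_url(base_url, query, rows=5):
    """Build the URL of a CKAN Action API dataset search (the request is not sent)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


# Placeholder host: replace it with the address of a real CKAN catalogue.
url = package_search_url("https://ckan.example.org", "public transport")
print(url)
```

Fetching that URL from a real catalogue would return a JSON document describing the matching packages, which is what makes CKAN usable as a catalogue backend for other applications.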
  • 35. Make the data available
    Tim Berners-Lee: Linked Data as part of a continuum of web publishing activities associated with gold stars, like the ones you got in school.
    Here they are:
    ★ make your stuff available on the web (whatever format, see here)
    ★★ make it available as structured data (e.g. excel instead of image scan of a table)
    ★★★ non-proprietary format (e.g. csv instead of excel)
    ★★★★ use URLs to identify things, so that people can point at your stuff
    ★★★★★ link your data to other people’s data to provide context
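Because each star presupposes the previous ones, the scheme can be sketched as a small scoring function. This is only an illustration: the function name and its boolean parameters are invented here and are not part of Berners-Lee's note.

```python
def star_rating(on_web, structured=False, open_format=False, uses_uris=False, linked=False):
    """Count the stars earned; a star only counts if all the previous ones are earned."""
    stars = 0
    for criterion in (on_web, structured, open_format, uses_uris, linked):
        if not criterion:
            break
        stars += 1
    return stars


# A scanned table published on the web, a CSV file, and a fully linked dataset.
scanned_table = star_rating(on_web=True)
csv_file = star_rating(on_web=True, structured=True, open_format=True)
linked_dataset = star_rating(True, True, True, True, True)
print(scanned_table, csv_file, linked_dataset)
```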
  • 36. Make the data available
    Open data needs to be ‘technically’ open as well as legally open. Specifically the data needs to be:
    - Available: at no more than a reasonable cost of reproduction, preferably for free download on the Internet. Publish your information on the Internet wherever possible
    - In bulk: the data should be available as a whole (a web service may also be useful but is not a substitute for bulk access)
    - In an open, machine-readable format: machine-readability is important because it facilitates reuse (e.g. figures in a PDF are read easily by humans but are very hard for a computer)
    The key point to keep in mind here is: keep it simple, move fast and be pragmatic. In particular it is better to give out raw data now than perfect data in six months' time.
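To show why a machine-readable format facilitates reuse, here is a minimal sketch that writes a tiny table as CSV and reads it back with nothing but the Python standard library. The figures and column names are invented for illustration.

```python
import csv
import io

# Invented example figures; a real publisher would export these from its own systems.
rows = [
    {"year": "2010", "spending_eur": "1200"},
    {"year": "2011", "spending_eur": "1350"},
]


def to_csv(records, fieldnames):
    """Serialise a list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows, ["year", "spending_eur"])
print(csv_text)

# A reuser can load the same text back and compute on it directly.
total = sum(int(row["spending_eur"]) for row in csv.DictReader(io.StringIO(csv_text)))
print(total)
```

The same numbers locked inside a PDF table would first have to be extracted by hand or with error-prone tools before any such computation.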
  • 37. Make the data discoverable
    - Tell the world!
      - Contact prominent organisations or individuals interested in this area
      - Contact relevant mailing lists or social networking groups
      - Contact prospective users you know may be interested in this data
    - Getting folks in a room: Unconferences, Meetups and Barcamps. Face-to-face events can be a very effective way to encourage others to use your data
    - Making things! Hackdays, prizes and prototypes, conferences, barcamps, ...
  • 38. Open Linked Data
    Linked Data: a method of publishing structured data, so that it can be interlinked and become more useful
    Built upon standard Web technologies (HTTP and URIs), but it extends them to share information in a way that can be read automatically by computers (this enables data from different sources to be connected and queried)
    Tim Berners-Lee, W3C director, coined the term in a design note discussing the Semantic Web project
    4 rules:
    - Use URIs as names for things
    - Use HTTP URIs so that people can look up those names
    - When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
    - Include links to other URIs, so that they can discover more things
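The four rules can be made concrete with two RDF statements written in the N-Triples syntax. This is a hand-rolled sketch that avoids any RDF library: the `example.org` subject URI is a placeholder, and the DBpedia resource used as link target is only illustrative. The `rdfs:label` and `owl:sameAs` property URIs are the standard ones.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def uri_triple(subject, predicate, obj):
    """Format a statement whose object is another URI (rule 4: link to other URIs)."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def literal_triple(subject, predicate, text):
    """Format a statement whose object is a plain string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}" .'


# Rules 1 and 2: an HTTP URI names the thing (placeholder domain).
town = "http://example.org/id/camerino"

# Rule 3: useful information about the thing, expressed as RDF.
label = literal_triple(town, RDFS_LABEL, "Camerino")
# Rule 4: a link to somebody else's data (illustrative DBpedia resource).
link = uri_triple(town, OWL_SAME_AS, "http://dbpedia.org/resource/Camerino")

print(label)
print(link)
```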
  • 39. Open Linked Data
    A site that exists to provide a home for, or pointers to, resources from across the Linked Data community:
    DBPedia: dataset from Wikipedia, see description, full Ontology and example
    DBPedia SPARQL example: query
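The query linked from the slide is not preserved in this transcript, so the following is a stand-in rather than the original example. It prepares a SPARQL request for the public DBpedia endpoint without sending it. The `query` parameter comes from the SPARQL protocol; the `format` parameter is assumed to be honoured by the endpoint's server software, and the chosen resource and helper name are illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query: English labels of one DBpedia resource.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Italy> rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
"""


def sparql_url(endpoint, query):
    """Encode a SPARQL query as a GET request URL (the request is not sent)."""
    params = {"query": query.strip(), "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)


url = sparql_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again recovers the query text unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY.strip())
```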
  • 40. OpenData links
    Examples: (since October 2011),%20UK#crimetypes (best practices!) generation data.gov