The Road to Open Data Enlightenment Is Paved With Nice Excuses

The road to open data enlightenment is paved with nice excuses! These slides include 11 open data revenue models for government agencies that 'pragmatically' need to keep generating revenue as 'authentic sources'. This presentation was delivered by Toon Vanagt from https://data.be as the opening keynote of the 'opening-up' conference in Brussels on 3 December 2014.

Published in: Government & Nonprofit

  • “Data is the new oil” turns out to have meant: cost for storage & compute cycles will go down faster than you can imagine!
    This feels like preaching to the Open Data Choir / So I’m keeping this short
  • Austerity. More efficient gov. Do more with less money
    Good intentions on gov site…
    Basic infrastructure & service: like roads & parks were in the past
    Bandwidth & storage are cheap nowadays
  • 1. = Ideal
    2. & 3. are pragmatic
  • Hurdle and limit
  • Xbrl + pdf
  • Free monthly dataset
  • Loopholes and unofficial or undocumented access & backdoors…
  • The P word is out…Privacy
  • Q&A

    1. THE ROAD TO OPEN DATA ENLIGHTENMENT IS PAVED WITH NICE EXCUSES
        3rd Dec 2014, Toon Vanagt, CEO data.be, @Toon
    2. Official Belgian company info sources
    3. Some data.be features
        • Autocomplete
        • Mashing up gov sources
        • Data enrichment
        • Financial ratios
        • OCR in PDFs
        • Entity recognition
        • Alerts
    4. Police forces are top users of data.be
        On the internet you must always remember: if something of value is free, you’re the product!
    5. Definitions
        • ‘Open knowledge’ is any content, information or data that people are free to use, re-use and redistribute, without any legal, technological or social restriction. (okfn.org)
        • ‘Open data’ and ‘open content’ mean anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness). (opendefinition.org)
    6. Open Data Enlightenment vs Buzz
        The Age of Enlightenment is the era from the 1650s to the 1780s in which cultural and intellectual forces emphasized reason, analysis and individualism rather than traditional lines of authority…
        The current open data philosophy redefines ‘authority’ too and appeals to the analytical power of citizens, hackers, journalists and entrepreneurs to put data to good use.
        Open data:
        • fosters a “bottom up” approach
        • stimulates users to get more out of the data sets
        • delivers unexpected results & insights
        Beware of fancy alchemy headlines:
        • Open Data Is The New Oil
        • Unlocking The Gold Mine
        • Turning Government Data Into Gold
        • €40 Billion boost to the EU's economy each year…
    7. Excuse 1: But how will we make money?
        • Does your government (department) really have to make money with open data?
        • Open data quickly evolved into primary state infrastructure & service.
        • Open data benefits society as a whole, so why tax usage separately?
        • If you still want or have to charge users, limit the cost, in the spirit of PSI, to your marginal data delivery expense (extra bandwidth).
    8. Who pays for open data gov cost?
        1. Government subsidizes underlying open data department costs as a primary service. Government covers the open data related cost as part of its general expenses.
        2. Government agencies charge each other for the cost of data usage between federal, regional and city level departments.
        3. 11 open data revenue models for government agencies as authentic sources:
            • 3 options at input side
            • 8 options at output side
    9. Charging the INPUT side
        Government makes the user pay for (legally required!) data mutations:
        1. Creation of data sets (company creation, alarm system registration, publication of annual accounts,…)
        2. Change of data (address move, new stakeholder in company, name changes, corrections,…)
        3. Deletion of a data set (inactive company, bankruptcy,…)
    10. Downsides of INPUT based revenue model
        • Introduces financial hurdles
        • Removes incentives to keep data up to date
        • Results in lower data quality
        • Requires higher ‘enforcement’ cost
        • Requires cost to clean up outdated data sets
    11. Charging the OUTPUT side
        1. User pays for individual consultation
        2. Basic data are free, but user has to pay to consult extended data or metadata
        3. User pays for use of structured data sets (csv, xml, batch, API,…)
        4. User pays for real-time data sets, which reflect current state in authentic data source (daily update versus monthly update)
        5. User pays for removed data (from archive) or for change log (historic overview)
        6. User pays for a Service Level Agreement (e.g. guaranteed bandwidth or outside business hours)
        7. User pays for monitoring keywords (or events) in (or about) certain data sets to receive alerts (push notifications, e-mails, SMS,…)
        8. User pays for custom benchmarking, segmentations, ratios or advanced filtering options
    12. Downsides of OUTPUT based revenue model
        • Financial hurdle for ‘newcomers’
        • Reduces innovation and consolidates ‘status quo’
        • Inequality (more for those who can pay, higher service through faster access, better informed)
        • Results in limited usage and applications
        • Requires costs for billing & payment system with back office operations
    13. Gazette / Belgisch Staatsblad / Moniteur
        Input based:
        1. Creation of data sets (company creation, publication of annual accounts,…)
        2. Change of data (address move, name changes, capital changes, new stakeholders,…)
    14. Belgian example 2: National Bank Balance sheets
        Input
        • Pay for publication of annual accounts (274 EUR for BVBA/SPRL = limited liability company)
        Output
        • User pays for use of structured data sets via a webservice (roughly between 1.850 EUR and 15.000 EUR per year).
        • User pays for old archived data sets which are no longer shown on the National Bank’s website
        • User pays for custom industry benchmarking and ratios of competitors, customers or prospects (but benchmarking of one self-owned company remains free)
    15. Belgian example 3: Crossroads bank for enterprises
        Input
        • Creation of data sets
        • Change of data, such as address move or registering extra business entity,…
        Output
        • User pays for use of structured data sets (copy of public part of database with names of company stakeholders and self-employed persons) at 75.000 EUR/year
        • User pays for real-time data sets, which reflect current state in authentic data source (daily update versus monthly update) via API (2.000 API requests for 50 EUR in prepaid balance)
        • User pays for removed data or for change log (historic overview)
        • User pays for a Service Level Agreement (e.g. guaranteed bandwidth or outside business hours)
    16. Avoid conflict of interest for gov agencies
        • Battle for budget: creates competition between government agencies
        • Inequality in support services and quality between paying and non-paying customers or agencies
        • Battle to secure authentic source as single gatekeeper and extend reach
        • Creates competition with the private sector, due to government agencies acting as commercial data brokers selling wholesale personal contact details to intermediaries
    17. Excuse 2: Our data quality is too low to release
        • Open Data is not your real challenge, you have much bigger data quality issues…
        • Accuracy: Is the data correctly representing the real-world entity or event?
        • Completeness: Does the data include all data items representing the entity or event?
        • Conformance: Is the data following accepted standards?
        • Consistency: Is the data free of contradictions?
        • Credibility: Is the data based on trustworthy sources?
        • Processability: Is the data machine-readable?
        • Relevance: Does the data include an appropriate amount of data?
        • Timeliness: Is the data representing the actual situation and is it published soon enough?
    18. the process and partner chain is not…
        • Document data process partners
        • Describe steps in information chain upward of your authentic source (data.be had to reverse engineer processes)
    19. some privacy sensitive data elements…
        • Keep the lawyers out of your open data project if you want to make a fast start
        • It’s complicated
        • It’s personal
        • Privacy concept evolves over time and is culturally defined
        • Many grey zones
        • Don’t forget to try to anonymise your unstructured data too… accidents will happen
        • We can technologically do much more than we are permitted to culturally, morally or legally…
        • Beware that very few data points are needed to identify a person in this big data era. Eloquently phrased by Jonathan Mayer: “The idea of personally identifiable information not being identifiable is completely laughable in computer-science circles”.
    20. Excuse 5: On second thought, we’re not that open…
        Availability: Can the data be accessed now and over time?
        Be consistent and offer long term commitments and stable data set formats (integration mapping).
        Data.be received a ‘Cease & Desist’ after a government hackathon: “Our government website is the only authentic source for air quality measurement. Stop using our data immediately or …”
    21. Excuse 6: We opened the data in a layer on our WMS…
        • Web Map Service (WMS) is a standard protocol for serving geo-referenced map images over the Internet that are generated by a map server using data from a GIS database.
        • It is very hard to share the layer data… in other applications
    22. Next frontiers for Open Data
        • Linked & graph data
        • Metadata
        • Unstructured data
        • Structured feedback loops
    23. Barriers to open data reuse
        © 2013 European Commission training manual
    24. Gatekeepers to the rescue
        • Don’t just ‘input’ the data which are presented
        • Inform general public on long term use of their ‘public’ data.
        • Once online, always online…
        • Evangelise the use of open data inside and outside your organisation
    25. Open up your organisation
        • Invite a data scientist to work.
        • Share insights internally, learn, optimize quality of data sets
        • Be open about quality and refresh rates
        • Specify the license under which the data may be re-used.
        • Provide a feedback loop (now data.be often is feedback for outdated company data…)
        • Maintenance of metadata and data is critical!
    26. THANK YOU
        Toon Vanagt, CEO, toon@data.be, @Toon
        3rd Dec 2014, #OUP14, Opening up conference in Brussels
    27. Picture copyright & attribution
        • The brick laying machine pictures can be found at Tiger Stone: http://www.tiger-stone.nl/index.php?option=com_content&view=article&id=47&Itemid=55
        • Keep calm cup: http://www.keepcalm-o-matic.co.uk/product/mug/keep-calm-and-open-up-67/
        • Storify with pictures of opening-up.eu event: https://storify.com/openingup_eu/opening-up-final-conference-1
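
Slide 15 quotes two prices for the same Crossroads Bank data: 2.000 API requests for 50 EUR in prepaid balance, and 75.000 EUR per year for a copy of the public part of the database. The arithmetic behind that comparison can be sketched as follows. This is only an illustration: it assumes the dots are thousands separators, and it assumes API pricing is linear with no bundle rounding or volume discount, neither of which the slides state. The function names are made up for the example.

```python
# Figures quoted on slide 15 (dots read as thousands separators: an assumption).
PREPAID_BUNDLE_EUR = 50          # price of one prepaid bundle
REQUESTS_PER_BUNDLE = 2_000      # API requests included in one bundle
BULK_COPY_EUR_PER_YEAR = 75_000  # yearly price of the database copy


def api_cost_eur(requests: int) -> float:
    """Cost of a number of API requests, assuming strictly linear pricing."""
    return requests * PREPAID_BUNDLE_EUR / REQUESTS_PER_BUNDLE


def break_even_requests() -> int:
    """Yearly request volume at which the API costs as much as the bulk copy."""
    return BULK_COPY_EUR_PER_YEAR * REQUESTS_PER_BUNDLE // PREPAID_BUNDLE_EUR


if __name__ == "__main__":
    print(f"Price per request: {api_cost_eur(1):.3f} EUR")
    print(f"Break-even volume: {break_even_requests():,} requests per year")
    print(f"That is about {break_even_requests() / 365:,.0f} requests per day")
```

Under these assumptions one request costs 0.025 EUR, and the API only becomes more expensive than the yearly copy beyond 3,000,000 requests per year. Either way, a newcomer faces the financial hurdle listed on slide 12.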

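Slide 21 notes that a WMS serves rendered map images, which is why layer data is hard to reuse elsewhere. A minimal sketch of what a WMS 1.3.0 GetMap request looks like may make this concrete. The endpoint and layer name below are hypothetical placeholders, not a real government service; only the query parameter names come from the WMS standard.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and layer name, for illustration only.
WMS_ENDPOINT = "https://example.org/wms"
LAYER = "some_layer"


def getmap_url(endpoint, layer, bbox, width=512, height=512):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_lat, min_lon, max_lat, max_lon): with CRS EPSG:4326,
    WMS 1.3.0 expects latitude before longitude.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",            # empty value asks for the default style
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",   # the response is a picture, not a table
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Rough bounding box around Belgium (approximate values).
    print(getmap_url(WMS_ENDPOINT, LAYER, (49.5, 2.5, 51.5, 6.4)))
```

The `FORMAT=image/png` parameter is the crux: the client receives pixels, so the underlying measurements or geometries cannot be filtered, joined or recomputed. Reusable feature data would typically need a separate download or a feature-oriented service alongside the map layer.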