I’m going to talk about a project Teknologirådet did last year on the use of public data. The title of the presentation refers to the fact that the public sector collects and stores enormous amounts of information. And why should they keep it all to themselves?
Over the last few years the amount of information available has exploded. Those of you who were here yesterday afternoon heard Google give us some pretty impressive figures for the amount of new information generated each year. This does not only go for information that people and businesses put on the internet – it is also the case for the public sector. Increased demands to make the public sector leaner, more efficient and more customer-oriented mean that more information is collected, stored and analysed in public computer systems. And because we – the citizens – are used to doing practically everything on the net: banking, booking plane tickets, buying Christmas presents… we also expect to be able to handle our dealings with the public sector on the internet: fill in our tax returns, apply for kindergarten, check if the trains run on time… So on the one hand there is a drive for digitization. On the other hand there is of course a directive (this is Europe, after all). The PSI (Public Sector Information) directive states that the public sector should make data available to the public. The idea is that by making public data more accessible, it can be used in new and innovative ways. This could mean better services for citizens, better oversight of how the authorities use their funds and power, and overall a more open and transparent government. And better access to public data may also lead to business opportunities – which again will feed back into the public sector in the form of taxes. So if the directive isn’t exactly new – why are we doing something on it now? It’s because all of a sudden a lot of governments seem to have realised what the directive means, and are trying to figure out what data they should share and how they should do it. They may have been inspired by this: data.gov, which was launched by the Obama administration soon after it took office.
Denmark is working on its own portal for making data available.
And in the UK Gordon Brown and Tim Berners-Lee were the poster boys for data.gov.uk.
In Norway, the portal has just come up in a beta version. But it was another application that pioneered sharing of public data – namely the weather service. Some of you might have seen this in the elevator of the Thon hotel here in Brussels: it’s detailed weather information from the Norwegian Meteorological Institute. This kind of detailed weather information used to be for sale – but it didn’t sell very well. After they started giving it away for free, interest exploded. Suddenly people are checking the weather several times a day – and of course – there’s an app for that. Another popular service is the possibility to check public transport. If you have a phone with GPS, you can get info on your nearest stop and when a bus or train is expected. In real time. The last two examples I gave you are examples of very popular applications based on public data – so that must mean that sharing public data is a good thing, right?
Unfortunately, it isn’t all that simple. Which of course is why Teknologirådet had a project on the hows, whens and whats of public data. Some of the questions we asked were: “Is it self-evident that data financed by the public should be publicly available?” “Is it OK for someone to make money from something they were given for free?” “What role should the government play and how should it relate to commercial businesses?” And of course: “How can we balance transparency against privacy?” I’m not going to look at all of these today – because that is a much longer presentation, but I’m going to share some of the thoughts that we have on privacy. I’m not sure how this has been in other countries, but in Norway, the people who are engaged in free public data are mostly the social media people – the ones who just want to put EVERYTHING on the internet. And much as I love the internet myself, I’m not convinced that this is ALWAYS the solution.
So when we made our recommendations we first looked at what data should be made available. And we DO agree that as much data as possible should be accessible, but with two important exceptions: personal data, and data to do with national security. We also gave some recommendations on how to make the data available: the public sector should first and foremost deliver «raw data» in machine-readable form, preferably in an open format. For real-time data, there should be a programming interface (API). Data quality is often used as an excuse not to publish data. We think it is better to publish the data, but to label it clearly, so that the public knows that the data has quality issues. That way the quality may even improve, because the public can provide feedback when they identify errors in the data set.
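To make the labelling idea a bit more concrete, here is a minimal sketch in Python of what "publish raw data, but label the quality" could look like. Everything in it is an assumption made for illustration: the function name `package_dataset`, the three quality labels and the field names are hypothetical and do not come from the project or from any real portal.

```python
import json

# Hypothetical quality labels; a real publisher would define its own scheme.
ALLOWED_QUALITY = {"verified", "unverified", "known_issues"}


def package_dataset(records, quality, quality_note=""):
    """Wrap raw records in a machine-readable envelope with a quality label."""
    if quality not in ALLOWED_QUALITY:
        raise ValueError(f"unknown quality label: {quality!r}")
    envelope = {
        "metadata": {
            "quality": quality,            # tells users what to expect
            "quality_note": quality_note,  # free-text explanation of known problems
        },
        "records": records,                # the raw data itself, unchanged
    }
    # JSON is used here only as one example of an open, machine-readable format.
    return json.dumps(envelope, ensure_ascii=False, indent=2)


# Illustrative, made-up records for a bus stop data set.
published = package_dataset(
    [{"stop": "Sentrum", "line": "20", "minutes_to_arrival": 4}],
    quality="known_issues",
    quality_note="Arrival times may lag behind during network outages.",
)
```

The point of the sketch is only that the label travels together with the data, so anyone who downloads the file also gets the warning.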
So – not personal data. That sounds easy, right? I’ve tried to make an example – and as you can see my crash course in web design is from the early 90s :) All the information here is information that is not confidential according to law and that can be found in public databases, combined with web-based tools, such as ones that calculate the current market value of your property, and with information that can be found openly on the internet. I’ve also imagined some sort of integration with a social network. When you get full data sets in machine-readable form – which is sort of the point of this – you can get a lot more information than you could when you had to call or write to an office to ask for data for a specific place or person. It’s easy to think that it’s obvious what is personal information and should be kept safely in storage, but with ever more information available on the internet, and ever more tools to search for and combine information, this is actually quite difficult. While working on this project I was consulted by the Norwegian road authority. They have had several requests for their accident database from a major Norwegian newspaper. The database does not contain any personal information, but it contains detailed information about accidents on Norwegian roads. The information provided includes a status on whether this was thought to be an actual accident, or if the accident was «willed» (i.e. a suicide attempt). If you couple this information with, for instance, articles in local newspapers from the place where an accident happened, it may not be difficult to uncover the identities of the victims. And then of course this would be VERY sensitive data.
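One simple way to get a feeling for this kind of risk before releasing a data set is to count how many records share the same combination of seemingly harmless fields. The Python sketch below is an illustration only, not something the project used: the function name `rare_combinations`, the threshold `k` and the sample fields are all hypothetical, and the sample rows are made up.

```python
from collections import Counter


def rare_combinations(records, fields, k=2):
    """Return the field-value combinations shared by fewer than k records.

    A combination that occurs only once is a candidate for being linked
    to an outside source, such as a local newspaper article.
    """
    counts = Counter(tuple(record[f] for f in fields) for record in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up rows with no names or other direct identifiers.
sample = [
    {"municipality": "A", "month": "2010-01", "road": "E6"},
    {"municipality": "A", "month": "2010-01", "road": "E6"},
    {"municipality": "B", "month": "2010-03", "road": "Rv4"},
]

flagged = rare_combinations(sample, ["municipality", "month"])
# flagged holds the single combination ("B", "2010-03"), which occurs once
```

A check like this does not prove that a data set is safe. It only points at the rows where a closer look is needed, which is the kind of evaluation that goes beyond checking for names and ID numbers.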
Another brilliant design from the mid-nineties: this is what we would like to see more of. Here I have imagined a local bank that has put up a web site to encourage people to settle in the community. In addition to advertising its own services, the bank provides information on local farms that are currently without an owner, what the charges for public services (like waste collection) are in the different local municipalities, and a list of local kindergartens with data on available places and a user rating of the quality. We think that data on public efficiency and quality are important data to put out there – but strangely government agencies don’t seem to be that keen… Other data sets that we approve of are of course the types of data that we mentioned earlier: weather data, real-time data on public transport, where I can find public toilets, parks, schools etc. The public sector has a lot of data that probably never will infringe on anyone’s privacy. Our message with the previous example is just that it’s not always self-evident when that is the case. And while we think it is important to encourage the sharing of data, we think the people working in the public sector will need guidelines on how to evaluate the impact of releasing a specific data set. Because just checking that there is no personally identifiable data in it is not enough. So – in case it wasn’t already clear: it should be like this “web site”
Not like this.
Thank you. All our reports can be found on our website, but unfortunately most of them are in Norwegian. Our English material can be found at: http://teknologiradet.no/default1.aspx?m=3
Public data for democracy and innovation - but what about privacy?
Collecting to share
Public data for democracy and innovation - but what about privacy?
Christine Hafskjold
CPDP, January 27th 2011
From «Freedom of information» to «active sharing»?
Technology has made storage and sharing cheap
… and raised public expectations for information and services
Government transparency is important for democracy and control
… both for citizens and the press
Re-use of public data can foster innovation
Easy exchange of data between public bodies can lead to a more effective and efficient public sector
Is it self-evident that data financed by the public should be publicly available?
Is it OK for someone to make money from something they were given for free?
What role should the government play and how should it relate to commercial businesses?
How can we balance transparency against privacy?
What data?
As much as possible – but with some exceptions!
Personal data
Data to do with national security
Machine-readable data, API for real-time data
Don’t use data quality as an excuse
… just make sure the data sets are labelled properly
All about ME
Christine Hafskjold (dd.mm.yyyy)
Married to Petter Hafskjold (dd.mm.yyyy)
Children: Tiney Hafskjold (dd.mm.yyyy), Junior Hafskjold (dd.mm.yyyy)
Source: Folkereg.
PROPERTY
House: Boligveien X. Price: X mill (19XX). Current value (estimated): XX mill
Other property: Fjellet. Price: X mill (19XX). Current value (estimated): XX mill
Fortune: XXX kr
Shares: Company A, nnn shares á XXX; Company B, nnn shares á XXX
Vehicle(s): AB XXXXX (20XX). Price: XXXXXX. Current value (estimated): XXXXX
EDUCATION
Graduated NTH (NTNU) 1994. Grade: n,n
Get full certificate
WORK
Current workplace: Teknologirådet (from 2004)
Position: Project manager
Get full history
Total income: XXXXXX
Source: Arbeidstakerreg.
FRIENDS
Name    Grade   Income
Kari    X       xxxxxx
Tore    Y       xxxxxx
Hild    Z       xxxxxx
Source: Facebook, Samordna opptak, Skatt
MORE PHOTOS…
Smallville Savings and Loans
Interest rate if you move to Smallville: 2,0%
SAVINGS | LOAN | PENSION | CUSTOMER RELATIONS |
Farms need owners:
Torpet: Charming smallholding with room for two cows and a pig
Plassen: You can become a grain farmer even if you are allergic to animals
Steinrøysa Neri bakken
Public charges
                                  Byen    Grenda   Bygda
Waste collection                  3000    3500     4000
Building application processing   9000    7500     550
SFO (after-school care)           2000    1800     1500
Day care
              Available   Quality (1-5)
Tertitten     24          3,7             Apply
Knerten       32          4,1             Apply
Havnehagen    80          4,3             Apply
Start saving today!
All about ME
Christine Hafskjold (dd.mm.yyyy)
Married to Petter Hafskjold (dd.mm.yyyy)
Children: Tiney Hafskjold (dd.mm.yyyy), Junior Hafskjold (dd.mm.yyyy)
Source: Folkereg.
PROPERTY
House: Boligveien X. Price: X mill (19XX). Current value (estimated): XX mill
Other property: Fjellet. Price: X mill (19XX). Current value (estimated): XX mill
Fortune: XXX kr
Shares: Company A, nnn shares á XXX; Company B, nnn shares á XXX
Vehicle(s): AB XXXXX (20XX). Price: XXXXXX. Current value (estimated): XXXXX
EDUCATION
Graduated NTH (NTNU) 1994. Grade: Y
Get full certificate
WORK
Current workplace: Teknologirådet (from 2004)
Position: Project manager
Get full history
Income: XXXXXX
Source: Arbeidstakerreg.
FRIENDS
Name    Grade   Income
Kari    X       xxxxxx
Tore    Y       xxxxxx
Hild    Z       xxxxxx
Source: Facebook, Samordna opptak, Skatt
MORE PHOTOS…