We started in August. Looking for a name for the project blog, where we would post news about open data issues and our own progress with the project, I quickly came to think of the famous quote "Comment is free, but facts are sacred". The first part of that quote was used as the title of The Guardian's commentary website, "Comment is free"; that's where I had originally heard it.
The quote is, of course, from an article published in 1921 by C. P. Scott, the legendary editor and owner of The Guardian. "Facts are sacred" would be perfect as a title for our project, but a direct translation into Norwegian does not work very well. "Facts first" resonates well with the traditional press ideology of keeping facts and opinion separate. "Facts are sacred" is also the motto of The Guardian's "Datablog". This emphasis on journalism reflects our department's primary interest in taking on the topic of open public sector data: contributing to the development of journalism and to understanding the conditions for innovation in journalism. In this perspective we look at how facts can be used in journalism. As structured facts, data can be presented in ways that provide more insight into most areas of society, and they can empower citizens to act on issues that are important to them. Politics and the economy are obvious examples where data as structured facts can have an impact, but there are many more areas.
It quickly became clear that we needed to start with the most basic questions, because very little information is available:
- What and where are the government (public sector) data sources?
- Are data available for re-use?
- If not, what are the obstacles to making data available?
I will briefly run through the main results and our recommendations. We have published our project report today on the web magazine Vox Publica.no, so you can check that out if you want to see all the findings in detail. The report is ONLY in Norwegian, but there is an English summary.
FOUR MAIN ASPECTS OR FINDINGS we want to emphasize. First: There is generally a lack of information about data sources. We trawled 125 state agency websites. Only one third of the agencies have links to data source information on the homepage. We have been "nice" to them: any information about statistics, data or numbers landed them a "yes" here. When we looked for any information at all, anywhere on the website, it turned out that two thirds had such information. But in many cases it was really hard to find. Ideally, information about data sources should be visible on the homepage, and it should be possible to act on the information: the data should be downloadable in formats developers want, with an API where that is relevant. The lesson here must be that a clear government policy on how to inform about data sources is sorely needed. As it is now, users, if they find something, have to deal with all kinds of different ways of informing about and presenting data. If, indeed, there are data available at all.
THE SECOND MAIN ASPECT 2. No datastore: Some state agencies stand out by informing very well about their data sources, but these are exceptions. There is one necessary solution: a Norwegian data.gov. This would be a central website where re-users can find data sources from all government agencies, and where the agencies themselves can register their data. It should be supplemented with regional and local datastores, which cities and others can establish themselves. As you can see from that slide, there are enough examples to learn from, and there are more than these. The most recent here is London, which launched its datastore last week.
What we did was simply to start our own datastore using an open Google Spreadsheet. In a way, this was to demonstrate that it doesn't have to be so difficult! With the help of a community of users we have registered some 130 sources there. That's information the government can re-use itself when it gets its own data.gov up and running. We really think this is one of the most pressing issues now. In fact, in just the last few months there have been more initiatives to collect Norwegian data sources, and all are run by users on a voluntary basis.
THIRD MAIN ASPECT 3. Great potential: We did a survey among a selection of state agencies from across different sectors, from research institutes to environmental agencies to the parliament. We asked a set of questions about their data policies. The answers made us quite optimistic, actually. I want to highlight two tendencies. THIS SLIDE: Two out of three agencies say that they have data sets with a potential for re-use that they have not made available yet.
And six out of ten say that they plan to make more data available within the next year. Judging from the comments that we received in this survey and in interviews, it's reasonable to say that in many areas of the public sector there are well-informed people working actively to open up their data.
OK, THERE ARE DATA SOURCES, BUT WHAT ABOUT THE OBSTACLES? We asked about that in the survey. Costs were the obstacle chosen by most. Clearly direct and indirect costs play a role, and this should be addressed if and when there are political initiatives. One other, very important obstacle is also mentioned here: personal data and privacy. In interviews that we had with local government agencies here in Bergen, this was THE major concern. We believe that this topic needs to be addressed as well. Ideally, agencies should have clear, practical guidelines so they can open up data at the necessary aggregated levels. At present there may be a tendency to keep data closed because agencies want to be on the safe side. That's understandable, but also a problem.
OUR FOURTH MAIN POINT. 4. Knowledge gaps: Interest in open data, and the competence to do it right, vary widely across the agencies we have surveyed and talked to. A lot needs to be done to improve conditions here. One of our suggestions is that the responsible government body, the ministry, formulates clear rules and guidelines about how to open up data in the correct way. Here we can learn from a Dutch project where they wrote a handbook in the form of a wiki and made this great poster that the promising data-bureaucrat can frame and hang above her desk! Don't worry, it does of course have a Creative Commons license! We have included this in our report.
I have mentioned some of our recommendations already; there are ten in all in the report. But here are the most important ones (SAY MORE ABOUT EACH):
- Datastores: Data.gov, state/regional/local
- Principles, licenses, guidelines, handbook
- Personal data: special attention
- Define and fund pilot projects: there are many examples of how this is done, for example with the London datastore. We will also hear more about this later from Denmark.
A general point about knowledge transfer: When I follow this field internationally now, I see new initiatives and ideas almost daily that can inspire and be copied here. From what I can understand, it has never been easier to transfer knowledge and experience between countries. I'm pretty certain we will leave today with more good ideas.
RETURN TO WIDER CONTEXT OF OPEN DATA. I have taken for granted the underlying premise: that opening up government data is A Good Thing. When we talk about this in the most general terms, I think that is acceptable. The principle of re-use of government data has actually been implemented in law across Europe as a result of an EU directive. In the Nordic countries, opening up data fits seamlessly into the tradition of transparency in government, with quite far-reaching freedom of information legislation. The right of access to database information in public sector agencies has indeed been added to the freedom of information act. It was a logical step. But advancing from these very basic principles, there are plenty of other issues to address. Once we get as far as having more data at our disposal, what will we do with them?
This mashup was created by an IT developer in Britain immediately after membership lists of the BNP were leaked and posted online a little more than a year ago. By using a standard Google Maps template, it makes it look like the BNP has taken over the UK completely. It was criticized for various reasons, and the creator decided to take it down again. The data were stolen, confidential and not government data, and posting them online was definitely a breach of these people's right to privacy. Some people, when they got hold of these data, were so eager to use them in their fight against the BNP that they forgot all about the usual privacy concerns. BUT: The public sphere worked. This is an extreme example, but interesting in our context. In less controversial areas, we risk ending up with misleading information instead of gaining insight from the mashup of different data sources. The concern that "giving us our data" will result in confusion was also frequently mentioned by the civil servants we talked to. But is it desirable or even possible to lock up data sources? It is hard to find convincing arguments for that. Just as the means of producing and distributing news and information are now available to all, data will also be at our fingertips. Our job must be to work for good tools, good practices and good results.
And we must keep the debates going. This is one way to frame them. GOOD PRACTICES: Create good examples, display them, criticize the bad. EMERGING COMMUNITY: You will notice that there is already a very active community; how can we knit it closer together? I hope we will cover some of these questions today, and I'm sure we will hear about many good examples of what can be done with public sector data. This is a question of opening up not only the public sector and politics, but also journalism itself: opening up by letting other groups perform journalism-related work, and opening up by displaying our own data sources and inspiring readers, users and citizens to work together with us.