I think it’s worth noting that in BC we are very lucky. Our access to data resources is excellent in comparison to many other jurisdictions. So, first up, I want to congratulate the leaders in our community for making this happen. Now, being a Scotsman, I’m never actually happy or satisfied with anything, so I will tell you a story…
It starts with a hackathon up in Prince George. We were looking specifically at open data from our City and Regional District. We had various teams, but one team in particular had a problem. They took the simple idea of open data a little further: they wanted to compare the financials of different municipalities to see which would give the biggest “bang for the buck” in terms of tax dollars. The idea was that they would be able to give consumers, citizens, an idea of the “best value” municipality in which to live.
This turned out to be a very difficult exercise, mainly because, as it turns out, no one is talking the same language.
And by language I’m not talking about spoken, written or programming languages, or even data transfer formats. I’m talking about raw data. The data products being published by different municipalities didn’t actually support any kind of comparative analysis. That hackathon team were left comparing apples with oranges.
This got me thinking. The real value of open data is OF COURSE the data: not the database housing it, or the software supporting its distribution, but the *actual* data. How long will it be before you move on to the next piece of software for serving or disseminating or dropboxing or whatever it is we will be doing a year from now? When will we be adopting the next high-speed web data transfer format? When will our love affair with JSON and the XML-naugts end? With this in mind, it’s worth considering the process of just publishing an “open data website” simply because you have a new tool to do so.
The value in those tables, and the value of each data point, increases every day with its temporal depth. This happens entirely independently of technology; the value is in the data itself. Every day more data is added to that repository, it becomes more valuable. So, we should make sure we are capturing and publishing the right data. If we’re not, then we are facing an opportunity cost on our investment in *data*.
OK, then, back to that hackathon: let’s think about context.
Well, without context, we can get a skewed impression of what our world looks like. Unless we have a good idea of what is happening elsewhere, we might miss the bigger picture. Because another way to add enormous value to our data is to publish it in commonly understood ways. Take cats, for instance.
There are estimated to be 14 billion images of cats on the internet (2.7% of which are pictures of cats with bread around their faces). (University of abster) Indeed, there are estimated to be only 220 million domestic cats in the world.
But what’s the point here? Well, this massively popular phenomenon is derived through a combination of cuteness, convenience and compatibility. Think about it this way: each “cat picture data point” is commonly understood by both the computer and the person. There are only a few popular image formats, and for the most part they are well documented. The ability to take a cat picture is somewhat ubiquitous now, too. So these are easy to share, they are easy to manipulate and they are easy to reuse. Oh wait, that’s what we want from open data too, yeah?
So, what’s my point here? Well, the key thing is this. Every individual municipality’s data becomes more valuable the more it can be commonly understood within the context of other municipalities. Every Province, Territory, State or Entity’s data becomes more useful the more it can be placed within a bigger context.
In short, I propose that we congratulate ourselves on making a huge leap forward in publishing data. But now we start to think about what to publish. And ideally, we try and publish the same thing.
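To make the “publish the same thing” idea a bit more concrete, here is a minimal sketch in Python of what that hackathon team would have needed. Everything in it is an assumption made up for illustration: the two towns, their field names, the shared schema (`population`, `operating_spend`) and all of the numbers. None of it comes from real municipal data.

```python
# Hypothetical illustration: every name and number below is invented.
# Two municipalities publish the "same" facts under different field names
# and units; mapping both onto one shared schema makes them comparable.

# For each publisher: its own field names mapped to the shared (assumed)
# schema, plus a multiplier that converts its figures to plain dollars.
FIELD_MAPS = {
    "town_a": {
        "fields": {"pop": "population", "opex_thousands": "operating_spend"},
        "scale": {"operating_spend": 1000},  # town_a reports thousands of dollars
    },
    "town_b": {
        "fields": {"residents": "population", "total_operating_cost": "operating_spend"},
        "scale": {},  # town_b already reports plain dollars
    },
}

# The raw records as each town might publish them (made-up values).
RAW = {
    "town_a": {"pop": 10_000, "opex_thousands": 25_000},
    "town_b": {"residents": 4_000, "total_operating_cost": 12_000_000},
}


def to_common(source, record):
    """Rename a publisher's fields to the shared schema and fix the units."""
    mapping = FIELD_MAPS[source]
    common = {}
    for raw_name, value in record.items():
        name = mapping["fields"][raw_name]
        common[name] = value * mapping["scale"].get(name, 1)
    return common


def spend_per_resident(common):
    """A simple comparable metric: operating dollars spent per resident."""
    return common["operating_spend"] / common["population"]


comparison = {
    source: spend_per_resident(to_common(source, record))
    for source, record in RAW.items()
}
print(comparison)
```

The point of the sketch is where the effort goes: the comparison itself is one line, while the `FIELD_MAPS` table is work that someone has to redo for every publisher. If both towns published to the same schema in the first place, that table would not need to exist.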
Disparate Data, technology fiefdoms (and 65 pictures of your cat).
Open data is awesome
(give yourselves a clap, come on!)
There are estimated to be 14,308,667,560 pictures of cats on the internet
There are only thought to be 220,000,000 domestic cats in the world
So, every cat has had 65
pictures of it posted on the