We spend a lot of time working with open data – this time about public toilets in London for a project run by @gaillyk. This particular project had us and the redoubtable @symroe taking various files, in various formats, and turning them into an API, which we then wrote a web app to display. http://greatbritishpublictoiletmap.rca.ac.uk/
And we made this working with @timdavies and R4D. It's a neat little thing that asks you for a DfID project number, and spits it out into a widget that references all the publications related to that project. http://r4d.herokuapp.com/
We've been to a fair few Rewired State hack days, and we've mentored for Young Rewired State since its inception. http://rewiredstate.org/
And I'm here today to talk about how to prepare your data for a hackathon – in a pretty wide sense. Not much about formatting your data, a bit more about how to make it a bit more useful and a bit more interesting to developers. I'm not a developer myself, but I work with them and I've organised hack events, and made a hack at a hack day myself...
This presso makes extensive use of metaphor. Please treat with caution!
Let's imagine this is the developer community. Buzz buzz buzz, work work work.
And you have data. (Lists of locations of things. Service outcomes. Information about artefacts. Event listings. Nothing necessarily personally identifiable...)
And here's where we'd like to end up at the end of a hack day. Tools, proofs-of-concept, mashups, a visualisation or two...
I told you there'd be metaphor... The first time I gave this talk, there was an entomologist in the room...
It's all about effective collaboration. You and the bee. The pollen is your data. You are the flower: scent, colour, patterns, nectar and all...
Geeks are scared too, on hack days. They have to perform in a day. They could do with help with some preparation... Writing about your data will make them aware of it before the hack day, and that might help them choose your data and make something better with it when they do. Don't have set ideas, but try and give a scent of inspiration... Use the event hashtag to help get the word out – the event organiser will help boost your signal. This is a long-range kind of communication, before the actual event.
A bit closer in... Colour and pattern! Enthuse! Look what's over here! “Here are our problems...” “This is what inspires us!” Your mission statement and objectives? Landing strips... pointers to the data, structures within the data. Some idea of what's available, where to start. “This is how to understand what's here.” You know what the jargon means, you know your TLAs. Help get that information out. If you've not got an open data page, get one. If you go to the hack day, enthuse... Docs and code examples if you can - helps devs no end. That's a Yellow Bee Orchid - Ophrys lutea – by the way.
This is a bee-friendly flower! It's very obvious where the data is, how to use it. Developer friendly data is easy to work with. Data that's easy to hack with - dates and places. Mapping is easy, but a bit old these days. Data that changes over time is the new(er) thing. (Take a look at Hans Rosling's TED talks – this for example - http://www.ted.com/talks/hans_rosling_at_state.html ) Locational data that changes over time... Yum. Nectar. (Yes, I'm pretty sure this flower has no scent. I did say the metaphor should be approached with caution...)
Cool hacks use your data along with someone else's. So make your data play nice with other organisations' - use standards, uniform identifiers: Postcodes. Charity numbers. ISO 8601 for dates. (Excel speaks ISO 8601...) It's great to formally link your data to other people's, but it is some extra work, and it can be technical. It's good to have an API, particularly one that's standards-based, but hackathon people are used to working without them. “Save as... CSV” from Excel will do for a start. Upload to a Google Spreadsheet and you've got a simple API. It's interesting data they can work with that they'll be looking for. Data that make good stories. Look at Jeni Tennison's work with crime data – patterns in bigamy rates during war time - or Anna Powell-Smith's recent work with baby names at http://darkgreener.com/baby-name-data. Interesting stuff.
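For the developers in the room, here's roughly what "play nice" might look like before a hack day. This is only a sketch in Python, using nothing but the standard library. The column names, the sample rows and the helper names (`to_iso_date`, `tidy_postcode`, `clean_rows`, `to_csv`) are all made up for illustration, and it assumes the dates in your spreadsheet are written day/month/year. It rewrites dates as ISO 8601 and puts postcodes into one consistent shape, so they're easier to match against someone else's data.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample data: the columns and values are invented for this sketch.
RAW_CSV = """name,postcode,opened
Park Lane toilets,sw1a 1aa,03/02/2012
Market Square toilets,EC1A1BB,17/11/2011
"""


def to_iso_date(text, fmt="%d/%m/%Y"):
    """Turn a day/month/year string into an ISO 8601 date (YYYY-MM-DD)."""
    return datetime.strptime(text.strip(), fmt).date().isoformat()


def tidy_postcode(text):
    """Upper-case a UK postcode and put one space before its last three characters.

    This only tidies the layout; it does not check the postcode is real.
    """
    compact = "".join(text.split()).upper()
    return compact[:-3] + " " + compact[-3:]


def clean_rows(csv_text):
    """Read CSV text and return rows with tidied postcodes and ISO 8601 dates."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["postcode"] = tidy_postcode(row["postcode"])
        row["opened"] = to_iso_date(row["opened"])
        rows.append(row)
    return rows


def to_csv(rows):
    """Write the cleaned rows back out as CSV text, header first."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(clean_rows(RAW_CSV)))
```

Nothing clever, and that's the point: a plain CSV with predictable dates and identifiers is something a developer can pick up and join to another dataset in the first hour of the day.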
I tried to make a slide about the waggle-dance, simply because I love it as a piece of animal behaviour, but my metaphor was going too far as it was. Maybe something about devs. telling each other about good data sets. They do. There's something about giving clear signals on the day, maybe wearing a name badge with your organisation and data information on it, being there to answer questions about it, something like that. Maybe something about the fact that you can't make devs. work on your data at a hackathon, you have to inspire them. (Yes, it's a bit of a competition between datasets...)
Importantly: make it as open as you can. See, this data is closed up. Won't attract any but the most determined bee.
That's pretty open...
But look, this is it. Flowers should be as open as possible... Open as in licensed.
This is a Creative Commons license. Might be appropriate for some of your data. There are better folk than I who can advise you on this kind of thing. Loosely, if it isn't a key revenue stream, get it open.
See, an excited bee. (NB: I have no basis for knowing this is an excited bee beyond it was the first image on a Flickr search for “excited bee” that was available under the right license, and in the correct aspect ratio.)
This would be too excited.
It's all about effective collaboration. You and the bee. You are the flower, and the pollen is your data.
Thanks. I'm Harry Harrold, firstname.lastname@example.org, @harryharrold