Talk: "Using Open Data and Crowdsourcing to develop CycleStreets"
Using Open Data and Crowdsourcing to develop CycleStreets. Martin Lucas-Smith, CycleStreets.net, @CycleStreets
What does CycleStreets do?
• Cycle journey planner: online service, 2m journeys so far
• Photomap / Cyclescape: campaigning tool
CycleStreets: who?
• Simon Nuttall: Routemaster
• Martin Lucas-Smith: Webmaster
• … and various people helping out in various ways!
CycleStreets: history
• Cambridge-only cycle journey planner
• Originally written for Cambridge Cycling Campaign
• Launched June 2006
• Google Maps-based
• 5,000 lines drawn over satellite imagery
• Google doesn’t give you data: just cartography
• 47,000 journeys planned
• 15,000 photos added
CycleStreets: history
• Lots of requests for the same thing in other places around the UK
• Result is CycleStreets
• We are using OpenStreetMap for our data
  • We don’t have money for an OS license
  • Community aspect really important anyway
• OpenCycleMap cartography
• Went to public beta in March 2009
• 2m journeys so far
• Mainly word-of-mouth so far
“OpenStreetMap (OSM) is a collaborative project to create a free editable map of the world.” - Wikipedia
• Collaborative: Jul 2007: 9,000 people; April 2012: almost 600,000
• Project: Not just a map - mass of ideas, processes, data, outputs
• Free: Free financially and Free as in open
• Editable: Constantly changing
• Of the world: Global, not just UK where it started
OpenStreetMap
“OpenStreetMap creates and provides free geographic data such as street maps to anyone who wants them.
“The project was started because most maps you think of as free actually have legal or technical restrictions on their use, holding back people from using them in creative, productive, or unexpected ways.”
OpenStreetMap
UK – Ordnance Survey:
• Very high quality, but ...
• Cost can be prohibitive (particularly voluntary sector)
• Derivative data restrictions
  • Ordnance Survey has claimed derived data rights when you place something over one of their maps
  • Incompatible with direction of the Internet, where data is being ‘mashed’ together to make useful information and visualisations
• Central control – change slower
Crowdsourcing principle
“Crowdsourcing is the act of taking a job traditionally performed by a designated agent (usually an employee) and outsourcing it to an undefined, generally large group of people in the form of an open call.” http://crowdsourcing.typepad.com/
Everyone knows a little bit about something in their area. Put that together and you get:
OpenStreetMap Marikina Mapping Party cake (4th Mapping Party in the Philippines)
Data collection
• Structured ground surveys – main source
  • Ground surveys, performed by a mapper
  • On foot, bicycle or in a car or boat
  • Usually collected using a GPS unit
• Government data sources
  • Landsat 7, US TIGER data, OS OpenData
• Commercial data sources
  • AND from Netherlands
• Traced from satellite imagery
  • e.g. Yahoo!, Microsoft Bing have donated
Objective data
• OSM is a store of objective data
• Everything must be verifiable
• Subjective data is not welcome
• Subjective assessment is the realm of the consumer of the data
  • E.g. Cycle journey planner decides on the likely niceness of a street based on objective attributes like speed limit, width, surface quality
  • My cycle to work would be different to my mum’s: we have different preferences for a ‘good’ route
OpenStreetMap ITO World animation OSM 2008 - A Year of Edits
Data collection Mapping takes place individually or in groups
Ground surveys
• Individuals or groups survey using GPS and taking notes
• Made easier by GPS technology
  • 2000: Bill Clinton switches on wider GPS availability
  • Mid-2001: GPS units available for $100
  • 2004: GPX standard (GPS data transfer) widespread
Photo attribution unknown – please do contact us if you know
Mapping parties
• A group of openstreetmappers and novices
• Go to area & map it exhaustively, usually over a weekend
• Dividing up an area between participants and mapping it
• Mapping by car, cycle or walking
• Social aspect important: people can meet up and talk (usually at a pub) between mapping sessions
Photo: David Earl
Photo attribution unknown – please do contact us if you know
Mapping parties
• e.g. Walking Papers: print current state, annotate, load back in http://walking-papers.org/
Photo attribution unknown – please do contact us if you know
Social context
• Social context important
• Community decides on data collection and structure norms appropriate to their situation
• The Map Kibera project is training local people of Kibera, Nairobi to create a map with OpenStreetMap
• Technologies used depend on circumstances
Map Kibera
Social context
• Importing other people’s data?
• Massive debate within the OpenStreetMap community
• (Assumes donated data is compatibly licensed)
• One view: importing data gives the impression that an area doesn’t need to be mapped in person and reduces volunteer input
  • TIGER data import in US very problematic http://www.slideshare.net/harrywood/wherecampeu-session-state-of-the-states-in-openstreetmap
• Another view: importing data gives a massive head-start and means we can get into much more detailed mapping
• Data creators vs data consumers have different perspectives
  • CycleStreets needs a reasonably complete map!
Social context
• Is objectivity always possible?
• WikiProject Gaza
• Practical issues
  • How do you represent a location where only some people can enter/exit?
Photo attribution unknown – please do contact us if you know
Social context
• Crisis Mapping: WikiProject Haiti
• Before January 12, 2010
• Then NOAA, GeoEye, DigitalGlobe flew planes over the area, and donated their imagery for tracing purposes
• People around the world at their computers contributed to the effort
• Roads, buildings and refugee camps of Port-au-Prince mapped in just two days
• “The most complete digital map of Haiti’s roads”
Haiti The resulting data & maps have been used by several organisations providing relief aid, such as the World Bank, the European Commission Joint Research Centre, the Office for the Coordination of Humanitarian Affairs, UNOSAT, others
Informal data structure
• No formal specification of how to represent things
• No database schema – just key-value pairs
• Reflects the social context of the users
• Users make it up as they go along
• Communities of interest norms
  • Conventions established, then stability
  • User/collector cycle embeds the convention
Informal data structure
• Nodes & Ways, Tags
• http://wiki.openstreetmap.org/wiki/Map_Features describes the (many) conventions formed so far
• Examples
  • Motorway represented as: “highway=motorway”
  • Local street: “highway=residential”
  • Guided bus! “highway=bus_guideway”
  • Fence: “barrier=fence”
  • Cycleway: “highway=cycleway”. But what type? “cycleway=lane” “cycleway=track” “cycleway=opposite_lane”
  • POIs: “amenity=post_box”, “shop=charity”
  • Not to forget... “amenity=pub”
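To make the "just key-value pairs" idea concrete, here is a minimal Python sketch of how a data consumer might read such tags. The sample ways and the describe() helper are hypothetical illustrations, not part of OSM or CycleStreets; only the tag keys and values come from the examples above.

```python
# Hypothetical sample ways: each is nothing more than a free-form bag of tags.
ways = [
    {"highway": "motorway"},
    {"highway": "residential"},
    {"highway": "residential", "cycleway": "lane"},
    {"highway": "cycleway"},
    {"barrier": "fence"},
]


def describe(tags: dict) -> str:
    """Return a rough, illustrative label for a way based on its tags."""
    highway = tags.get("highway")
    if highway is None:
        # No schema: a consumer must cope with keys being absent.
        return "not a highway"
    if highway == "cycleway":
        return "dedicated cycleway"
    if "cycleway" in tags:
        return f"{highway} road with cycle {tags['cycleway']}"
    return f"{highway} road"


labels = [describe(tags) for tags in ways]
```

Because nothing enforces the conventions, the consumer has to decide what to do with missing or unexpected keys; here the fallback is simply a generic label.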
Adding data Potlatch 2 – www.openstreetmap.org (www.geowiki.com)
Adding data
The ArcGIS Editor provides:
• Simple tools to upload and download OSM data
• An OSM-compatible geodatabase schema to locally store OSM data
• An OSM symbology template for faster editing
• Conflict-resolution tools for reconciling data back to the OSM database
ArcGIS plugin for OpenStreetMap (free)
OSM / Google Maps
• Google doesn’t provide any data – just a picture
• Also doesn’t always have information needed by cyclists/walkers – park paths, cut-throughs, pubs! (Though is improving)
(Image labels: OSM, Google maps)
OSM vs Ordnance Survey
• Depends what scale
• Question is intended use
• “Good enough” notion
  • OSM will never be good enough for utility companies needing exact location of pipes
  • But for many other uses, OSM appropriate and good enough
Sutton Coldfield B72:
OSM vs Ordnance Survey
• Costs money – not free
• Big difference is the license – not free (libre)
  • Plot points on a map and the OS claim some rights to that
• Derivative data issues
  • Major problem in the age of the internet, where data is being shared, mixed, repurposed
• By contrast, OSM uses a Creative Commons license
Challenge to traditional mapping agencies
• Ordnance Survey seeing more competition
• OSM and internet sharing more generally forcing a change in business models?
  • Lowering data use costs
  • Lowering data collection costs
  • Forcing derivative data restrictions to be removed
• Challenge in the small-scale map data area
Opens new opportunities
• Businesses like Microsoft, Google and others presumably spend a small fortune on mapping data
• Bing Maps (Microsoft) and MapQuest (AOL) now actively putting money and resources into OSM project
• Perhaps speculatively OSM will provide them with a cheaper way of providing data with far fewer restrictions in future?
Quality assurance issues
• Can we trust open data?
  • Depends whether it’s ‘good enough’ for your use
• Can we trust formalised data?
  • Tales of lorry satnavs for instance
• Balance between accuracy and speed/volume
  • Arbury Park in OSM as it was built – others slower
• Quality around the country variable
  • How can we ascertain this?
• Vandalism
  • But there’s the ability to watch an area for changes
  • More people = more vigilance or more vandalism?
Difficulties we face with OSM
• Coverage not uniform
• Lack of quality control: makes it harder to engage Local Authorities
• Vandalism a concern for some though not in practice
• Subjective data?
• Maybe in future: lack of static IDs – unique numbers for features change
• Ability to engage local mappers when an area is deficient
• Many of these problems will go away as OSM matures
Challenge to traditional cartography
• Cartography is a major area of interest within the OpenStreetMap community
• Cartography is becoming more automated as Web 2.0 steams ahead
• http://maps.cloudmade.com/
• 3D vector rendering – send data to device, not bitmaps
Cloudmade map renderer demo
[Quick demo] http://maps.cloudmade.com/
• Click ‘Edit map style’
• Click on a design to start from
• Click ‘Clone Style’ in the bottom-right
• Use the ‘Object Visibility’ box on the right to remove/add features
OpenStreetMap ecosystem
At the heart of the OpenStreetMap project is a database holding all the map data that people work with.
• Left: editors people use to enter data into the database
• Right: all sorts of interesting uses for the data, e.g. ...
OpenStreetMap uses
• Non-commercial
• Commercial / profit-making use absolutely fine
  • As long as people adhere to the license, i.e. give attribution and allow downstream users to share/re-use the data
• Maps of very many kinds
• Web routing
• SatNav devices
• Data analysis (e.g. accessibility analysis)
• Placefinding
• GPS background
• Humanitarian
• ...
OpenStreetMap: Summary
• Applies the Wikipedia approach of crowd-sourcing
• Extremely flexible
• Free (cost) and Free (libre)
• Challenging traditional map agencies / business models and government funding models
• Communities of interest and norms
• Much scope for research
• Varied uses: maps, electronic devices, humanitarian, ...
• CycleStreets using it
• As more data goes in, more uses, so more people add data, so more people use it, so ...
Journey planner: features
• Plan route from A-B, anywhere in UK
• Simple user interface (we hope!)
  • Click-click-plan, and simple Namefinder
• Gives set of route choices (fastest, quietest, balanced)
• Takes account of hills
• Turn-by-turn directions
• Photos-en-route
Journey planner: features
• Distance, time, CO2 avoided, Calories
• Google Street View at any point
• Localised versions for easy linking
  • E.g. cambridge.cyclestreets.net
• Link methods
  • E.g. www.cyclestreets.net/journey/to/cb1+2py/
• ‘Fly in Google Earth’
• Export to GPS
• Feedback system
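As a small illustration of the link method, this Python sketch builds a link in the same shape as the example on the slide. The function name is hypothetical, and whether the site accepts arbitrary destinations in this form is an assumption drawn only from that single example.

```python
from urllib.parse import quote_plus


def journey_link(destination: str) -> str:
    """Build a 'journey to' link shaped like the example on the slide."""
    # quote_plus turns spaces into '+', as in 'cb1+2py'.
    return f"www.cyclestreets.net/journey/to/{quote_plus(destination.lower())}/"


link = journey_link("CB1 2PY")  # 'www.cyclestreets.net/journey/to/cb1+2py/'
```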
Photomap: features
• Icons on map (per type of feature)
• Click to view image and info
• Add photo
• Crowdsourcing: lots of people, but each donating a small effort
• Categorisation
  • E.g. “Show me all the cycle parking problems in Cambridge”
Mobile
• Key features on small screen
• iPhone app
• Android apps
• Mobile HTML5 app
• All open source – help welcomed!
• Jakob Nielsen: “Best Application Designs” - April 2012 (Lightweight Applications category)
Mobile
• Other apps now incorporating our routing API - data interface
• Bike Hub – great world-first iPhone bike real-SatNav
• In the leading Boris Bike app, ‘London Cycle’
Why?
Fundamentally, we want to see “More people cycling, more safely, more often”
New cycle users face many challenges in UK:
• Poor infrastructure, traffic hostility
• Confidence cycling (address with training)
• Cultural/identity issues: not yet mainstream
• Lack of utility bikes in shops
• Routes – different to car routes!
We try to tackle the last problem ... and the first (through the Photomap)
How a routing engine works
• Find route with lowest score, i.e. least ‘friction’
• ‘Shortest path algorithm’ - standard problem in computer science; we use the A* method
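For readers unfamiliar with A*, the following is a minimal Python sketch of the method on a toy network. The node positions, way scores and function names are all hypothetical; it is not the CycleStreets engine. It assumes every way's score is at least its straight-line length, so that straight-line distance is a safe (never overestimating) guide towards the goal.

```python
import heapq
import math

# Hypothetical toy network: node -> (x, y) position.
coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}
# node -> list of (neighbour, score); higher score = more 'friction'.
graph = {
    "A": [("B", 1.0), ("C", 3.0)],
    "B": [("A", 1.0), ("C", 1.0), ("D", 4.0)],
    "C": [("A", 3.0), ("B", 1.0), ("D", 1.0)],
    "D": [("B", 4.0), ("C", 1.0)],
}


def heuristic(node, goal):
    """Straight-line distance to the goal (assumed never above the true score)."""
    (x1, y1), (x2, y2) = coords[node], coords[goal]
    return math.hypot(x2 - x1, y2 - y1)


def a_star(start, goal):
    """Return (total score, path) for the lowest-score route, or None."""
    frontier = [(heuristic(start, goal), 0.0, start, [start])]
    best = {start: 0.0}  # lowest score found so far for each node
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best.get(node, math.inf):
            continue  # stale queue entry; a cheaper way here was already found
        for neighbour, score in graph[node]:
            new_cost = cost + score
            if new_cost < best.get(neighbour, math.inf):
                best[neighbour] = new_cost
                priority = new_cost + heuristic(neighbour, goal)
                heapq.heappush(frontier, (priority, new_cost, neighbour, path + [neighbour]))
    return None


route = a_star("A", "D")  # (3.0, ['A', 'B', 'C', 'D'])
```

The direct-looking way B to D is skipped because its score of 4.0 is worse than going round via C.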
How it works (briefly)
1. Data comes from people collecting data on-street for OpenStreetMap
   • Remember: Is factual data only – e.g. presence of road, surface, type
   • NOT “I think this is a nice cycle route”
2. We take OSM data ‘off the shelf’
   • Though we’re part of the community in practice
   • Import several times a week: fresh data
   • Conversion process is complex – interpreting the data
How it works (briefly)
3. Score each type of path:
4. Take account of hills (add/remove penalty)
5. Account for turn delays (work ongoing)
6. Take account of detailed cyclist behaviour (ditto)
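Steps 3 and 4 might look something like the Python sketch below. Every number in it is an invented placeholder: the slides do not give the real multipliers or hill penalties, and turn delays and cyclist behaviour (steps 5 and 6) are left out.

```python
# Hypothetical friction multipliers per highway type (1.0 = ideal for cycling).
FRICTION = {"cycleway": 1.0, "residential": 1.25, "primary": 2.5}
DEFAULT_FRICTION = 2.0  # assumed fallback for types not listed above
CLIMB_PENALTY = 10.0    # assumed: each metre climbed costs like 10 m of flat
DESCENT_BONUS = 2.0     # assumed: each metre descended saves 2 m of flat


def score(length_m, highway, rise_m=0.0):
    """Score one way: length weighted by type, then adjusted for hills."""
    base = length_m * FRICTION.get(highway, DEFAULT_FRICTION)
    if rise_m >= 0:
        hill = rise_m * CLIMB_PENALTY   # uphill adds a penalty
    else:
        hill = rise_m * DESCENT_BONUS   # downhill (negative rise) removes some
    # Never score below the physical length, so scores stay positive.
    return max(base + hill, float(length_m))


flat_cycleway = score(100, "cycleway")                # 100.0
main_road = score(100, "primary")                     # 250.0
uphill_street = score(100, "residential", rise_m=5)   # 175.0
```

Flooring the score at the physical length is a design choice of this sketch: it keeps a straight-line distance estimate usable by an A*-style search.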
How it works (briefly)
7. Compress the network, to make the system much faster (system called ‘Cello’)
[Figure: a park with nodes A, B, C, D. Park: 4 nodes & 7 ways (scores 9, 8, 4, 10, 3, 6, 9). After: 3 nodes & 3 ways (9: AC; 7: AD,BD; 6: BC)]
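The slides do not describe how Cello works internally, so the Python sketch below shows only one plausible form of network compression: drop an interior node and keep the single cheapest link between each pair of remaining nodes. The assignment of the seven scores to particular ways is a guess chosen to be consistent with the numbers in the figure.

```python
import heapq

# Guessed layout of the park: (node, node, score). Only the scores and the
# 'after' result come from the slide; which way carries which score is assumed.
park_ways = [
    ("A", "C", 9), ("A", "C", 10), ("A", "B", 8), ("A", "D", 4),
    ("D", "B", 3), ("B", "C", 6), ("D", "C", 9),
]


def shortest_scores(source, ways):
    """Dijkstra: lowest total score from source to every reachable node."""
    adjacency = {}
    for u, v, s in ways:
        adjacency.setdefault(u, []).append((v, s))
        adjacency.setdefault(v, []).append((u, s))
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale entry
        for nxt, s in adjacency.get(node, []):
            if d + s < dist.get(nxt, float("inf")):
                dist[nxt] = d + s
                heapq.heappush(queue, (d + s, nxt))
    return dist


def compress(ways, keep):
    """Replace the network by one cheapest link per pair of kept nodes."""
    keep = sorted(keep)
    compressed = []
    for i, u in enumerate(keep):
        dist = shortest_scores(u, ways)
        for v in keep[i + 1:]:
            if v in dist:
                compressed.append((u, v, dist[v]))
    return compressed


after = compress(park_ways, {"A", "B", "C"})
# [('A', 'B', 7), ('A', 'C', 9), ('B', 'C', 6)]
```

With this guessed layout the result matches the figure: A to B costs 7 via D (4 + 3), beating the direct way scored 8, and node D disappears from the network the router has to search.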
How it works (briefly)
So each path / road / shortcut / etc. now has a score
Higher score = worse for cycling (more ‘friction’)
User comes to the site
8. Find the lowest total score from A to B
9. Route is found
10. Repeat for quietest, fastest modes – each has different scores
11. Routes shown to user
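Steps 8 to 11 can be sketched in Python as the same search run once per mode, each time reading a different score from every way. The network and scores are invented for illustration, and plain Dijkstra is used here instead of A* purely to keep the example short.

```python
import heapq

# Hypothetical network: (from, to, {mode: score}).
WAYS = [
    ("A", "B", {"fastest": 2, "quietest": 9}),  # direct but busy main road
    ("A", "C", {"fastest": 3, "quietest": 3}),  # back street
    ("C", "B", {"fastest": 3, "quietest": 3}),  # back street
]


def plan(start, goal, mode):
    """Return (total score, path) with the lowest total score for one mode."""
    adjacency = {}
    for u, v, scores in WAYS:
        adjacency.setdefault(u, []).append((v, scores[mode]))
        adjacency.setdefault(v, []).append((u, scores[mode]))
    queue = [(0, start, [start])]
    done = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == goal:
            return total, path
        if node in done:
            continue
        done.add(node)
        for nxt, s in adjacency.get(node, []):
            if nxt not in done:
                heapq.heappush(queue, (total + s, nxt, path + [nxt]))
    return None


# Repeat the search per mode, then show all routes to the user.
routes = {mode: plan("A", "B", mode) for mode in ("fastest", "quietest")}
# {'fastest': (2, ['A', 'B']), 'quietest': (6, ['A', 'C', 'B'])}
```

The two modes disagree because the main road is cheap in the "fastest" scores but heavily penalised in the "quietest" ones.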
Draw over the cartography
• We are using OpenCycleMap by Andy Allan
• ‘Tiles’ which form a static background once a route has been planned – i.e. we just put this behind a line we have calculated
Getting involved: open sourcing
• All 3 mobile apps now open-sourced
• Main journey planner being open-sourced
  • Latest update at http://cycle.st/b2221
  • Codebase currently harder to install than it should be
  • Currently modularising more heavily
  • Converting to Git
• Cyclescape open-source
• Very keen for greater involvement
• www.github.com/cyclestreets
Back in January 2011...
Transport Direct CJP (www.transportdirect.info/Web2/JourneyPlanning/FindCycleInput.aspx)
• £2.4 million (from tax)
• 92,000 journeys planned (dated Jan 2011, total now = ??)
• £26.09 per journey
• £1m – budget for 2011
• 32 areas (professionally surveyed)
CycleStreets (www.cyclestreets.net)
• £28k
• 458,000 journeys planned (dated Jan 2011, reached 2m as of now)
• 6p per journey
• £130k needed
• UK-wide (but depends on OSM completeness)
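The per-journey figures follow directly from dividing each total cost by the journeys planned. A quick Python check using only the numbers on the slide (the function name is just for illustration):

```python
def cost_per_journey(total_cost_gbp, journeys):
    """Total spend divided by the number of journeys planned."""
    return total_cost_gbp / journeys


transport_direct = cost_per_journey(2_400_000, 92_000)  # about 26.09 pounds
cyclestreets = cost_per_journey(28_000, 458_000)         # about 0.061 pounds

transport_direct_rounded = round(transport_direct, 2)  # 26.09
cyclestreets_pence = round(cyclestreets * 100)         # 6
```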
UKGov... BUT things have now moved on a bit
• We’re working with the DfT through their data contractor to get data into OSM – funded project
• DfT have been very receptive to the open data potential
• We think cycle journey planning is most effective when done by local people using Open Data
• Merging tool created
• We continue to work to ensure that CycleStreets is the solution of choice
Big Society – compliant
We tick all the boxes:
• Collaborative: involves local people
• Low cost: datasets have no license fee, agile delivery
• Trusted: for the people, by the people
• Citizen involvement: combines skills and input of large numbers of people (collecting data)
• Quality delivery: problems can be fixed easily
• Transparency: more people oversee the data and spot problems or potential improvements
Open Data http://www.green-alliance.org.uk
Cabinet Office
Local Authorities www.cyclestreets.net/localauthorities http://cyclejourneyplanner.westsussex.gov.uk/ www.cyclingscotland.org
UK Collision Map www.cyclestreets.net/collisions
Cyclescape
• Campaigning toolkit
• For campaign groups around UK
• Match, using geography, who is interested in what
• blog.cyclescape.org
• Ruby on Rails (new for us!)
• Really would welcome coders
• github.com/cyclestreets/toolkit/
David Earl Martin Lucas-Smith, www.CycleStreets.net Twitter: @cyclestreets email@example.com