Talk I gave to the British Computer Society about OpenStreetMap.

OpenStreetMap : Open Licensed GeoData

  1. 1. Harry Wood OpenStreetMap : Open Licensed Geo Data British Computer Society : Monday 27th April 2009
  2. 2. Topics ●OpenStreetMap purpose and premise ●Data structures: Nodes, Ways, Tags etc ●Editor demo ●OpenStreetMap servers and architecture ●Rendering and map displays ●The license ●CloudMade products and services ●Imports and other mapping techniques ●Getting Involved NOTE: Ran out of time for these topics on the day. The slides for these have also been removed from this deck. Could present them on another occasion!
  3. 3. Free as in Freedom ●Open license: –Creative Commons Attribution Share-alike ●“Open Content” like “Open Source” ●Contributors retain ownership of copyright ●People and commercial companies can use the maps for free under this license. –Details of license requirements?... coming up
  4. 4. Getting an Open Licensed Map ●Can't copy copyrighted maps ● Not allowed to import copyrighted data ● Not allowed to copy from copyrighted maps ● Not allowed to trace over copyrighted maps ● Not allowed to “derive” ●Can copy some maps, but only... ● Public domain. Unrestricted (incl. relicensing) ● Get permission to release with an open license (big ask) ●Can create maps completely from scratch ● crazy idea?
  5. 5. GPS traces ● How it started. Gadgets! ● Cheap consumer GPS units or location-aware mobiles ● Record a line of dots
  6. 6. Record many lines of dots.... Looks like a street map. Kind of
  7. 7. Recording data Names of streets Types of streets (trunk, residential, motorway) One-way restrictions Footpaths, tracks, pedestrian, rivers, railways Parks, woodland, industrial areas, cemeteries POI (pubs, cash machines, post offices, post boxes, bus stops, toilets, supermarkets, restaurants, kebab shops, monuments, hotels, picnic sites, barriers, lighthouses, piers, sports centres, petrol stations, playgrounds, cinemas, car parks, universities, bicycle parking, tourist information, etc etc etc)
  8. 8. Mapping Techniques ●Photo Mapping (geo-located photos) ●Audio mapping ●Taking notes ●Ditch the GPS ● Taking notes ● Local knowledge! ● Yahoo Aerial Imagery
  9. 9. Mapping: A lot of effort ●Gather data ● GPS traces and other information ●Input data ● using OSM “editor” software ●Requires a lot of effort ● Requires a lot of people!
  10. 10. Community Contribution ●Built by a large online community ●Many hands make light work ●Openly editable (and easy) ●Poor quality contributions? ● Gradual refinement ● Assume good faith ● Monitoring and correction ....Remarkably it works! Sounds familiar?
  11. 11. Wikipedia ●Large community coming together to build something great! ●Wikipedia Principles ● Openly editable ● Open content license ● Gradual refinement ● Assume good faith ● “Soft Security” Monitoring and correction
  12. 12. Looks like Wikipedia. OpenStreetMap = The Wikipedia of maps
  13. 13. Community It's big. 100,000 registered users
  14. 14. Community Increasing editing activity
  15. 15. Community [Chart: editing activity by user rank (0 to 9000), from very active to less active] Lopsided. Long tail of less active users
  16. 16. Cambridge
  17. 17. Global Project!
  18. 18. Open Licensed Data ● A copyrighted map is a justification for OpenStreetMap (it can't be used freely, therefore OSM is better) ● ...but it cannot be a source for OpenStreetMap ● Existing maps are very rarely free
  19. 19. Ordnance Survey ●Wonderful data in the UK ●OS license use of maps (and charge ££££££) ●Never allow re-distributing with a different license ●Very strict about copying and their definition of “derived” work
  20. 20. Guardian 'Free Our Data' Campaign ● Lobbying government ● Tax paid for data collection ● Tax still pays indirectly ● Economic benefits of free ● Slow progress ● OS might release ● low quality data first ● less-than-free license ● Or might be privatised! ● Whine about it or take action?
  21. 21. OSM and Ordnance Survey [Chart: cost (£0 to £many) against quality (low to high), with OS and OSM plotted]
  22. 22. Google Maps ● We can't use Google Maps ● They license their data from Tele Atlas ● ...who license data from Navteq / Tele Atlas ● ...Ordnance Survey! ● No access to underlying data ● Google terms & conditions ● Don't allow deriving data from their maps ● Don't allow copying & re-distributing with a different license ● Wonderful hi-res aerial imagery ● T&Cs do not allow deriving maps (tracing) ● Bought in (licensed) from multiple suppliers
  23. 23. Why not use Google Maps? Wonderful “free” (beer) mash-up API but... ●Errors and omissions ●Car centric. Footpaths and other details ●Cycle routes and Pistes ●Colours / branding - Google maps fatigue ●SVG export. Custom cartography ●Underlying data access! ● Details of OSM map access coming up ●Help OpenStreetMap!
  24. 24. Oxford University Website OSM has better maps of Oxford Encourages the OSM community Other uses coming up...
  25. 25. Nodes, Ways, Relations Node Has latitude and longitude Can stand alone, or form part of a way Way Joins together several nodes Direction sometimes matters Can form a 'closed way' (area) Relation For complex things such as routes
  26. 26. Tags Applied to the Nodes, Ways, & Relations Key value pairs amenity=pub name=Hare & Hounds highway=residential name=Court Street
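The node, way and tag model from the last two slides can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how the OSM server stores data: the class names, field names and ids are assumptions made for the example, and relations are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A point with a latitude and longitude. Can stand alone or be part of a way."""
    id: int
    lat: float
    lon: float
    tags: dict[str, str] = field(default_factory=dict)


@dataclass
class Way:
    """An ordered list of node ids. The order matters when direction matters."""
    id: int
    node_refs: list[int]
    tags: dict[str, str] = field(default_factory=dict)

    def is_closed(self) -> bool:
        # A 'closed way' (area) starts and ends on the same node.
        return len(self.node_refs) > 2 and self.node_refs[0] == self.node_refs[-1]


# Tags are plain key/value pairs, as on the Tags slide (ids here are made up).
pub = Node(id=1, lat=53.548223, lon=-2.0056012,
           tags={"amenity": "pub", "name": "Hare & Hounds"})
street = Way(id=2, node_refs=[10, 11, 12, 13],
             tags={"highway": "residential", "name": "Court Street"})
```

A closed way would simply repeat its first node id at the end of `node_refs`.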
  27. 27. Data Browser demo
  28. 28. Data Browser
  29. 29. Permalink
  30. 30. JOSM demo
  31. 31. JOSM demo
  32. 32. OpenStreetMap Servers Hosted in UCL Loads of bandwidth ~10 servers: Where does the data go?
  33. 33. OpenStreetMap Foundation Custodian of servers and sysadmin access Oversees funding and vehicle for fund raising Protection from copyright and liability suits
  34. 34. Database Server Motherboard Supermicro X7DWN+ motherboard with Intel 5400 (Seaburg) Chipset CPU 2x Intel Xeon Processor E5420 Quad Core 2.5GHz Memory 32GB DDR2 667 ECC Disk 2x 73GB (3.5) SAS 15K 10x 450GB (3.5) SAS 15K Raised £10,000 in 2 days
  35. 35. API ● REST web service ● HTTP GET & PUT ● Get elements at URLs ● No bloated request payloads ● Ruby on Rails
  36. 36. Ruby on Rails ● It's easy. Web + REST ● Fashionable. Developers like it ● Developers are our most limited resource. ● It's what SteveC used ● Problems? ● Can't stream data from db ● Memory hungry and leaks somewhere ● Maybe use something else for core API
  37. 37. Nodes <node id="297556642" lat="53.548223" lon="-2.0056012" version="2" changeset="648346" user="Guy" uid="10983" visible="true" timestamp="2008-09-16T20:42:44Z"> <tag k="name" v="Hare &amp; Hounds"/> <tag k="created_by" v="Potlatch 0.10b"/> <tag k="amenity" v="pub"/> </node>
  38. 38. Ways <way id="27120827" visible="true" timestamp="2008-09-19T13:19:53Z" version="2" changeset="664390" user="Guy" uid="10983"> <nd ref="298116100"/> <nd ref="297555192"/> <nd ref="297555193"/> <nd ref="297555194"/> <tag k="name" v="Court Street"/> <tag k="created_by" v="Potlatch 0.10b"/> <tag k="highway" v="residential"/> </way>
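As a rough illustration of how a client might read the XML on the two slides above, here is a small sketch using Python's standard `xml.etree.ElementTree`. The XML is copied from the slides; the helper name `read_tags` is an assumption for the example and not part of any OSM tool.

```python
import xml.etree.ElementTree as ET

NODE_XML = """<node id="297556642" lat="53.548223" lon="-2.0056012" version="2"
      changeset="648346" user="Guy" uid="10983" visible="true"
      timestamp="2008-09-16T20:42:44Z">
  <tag k="name" v="Hare &amp; Hounds"/>
  <tag k="created_by" v="Potlatch 0.10b"/>
  <tag k="amenity" v="pub"/>
</node>"""

WAY_XML = """<way id="27120827" visible="true" timestamp="2008-09-19T13:19:53Z"
     version="2" changeset="664390" user="Guy" uid="10983">
  <nd ref="298116100"/>
  <nd ref="297555192"/>
  <nd ref="297555193"/>
  <nd ref="297555194"/>
  <tag k="name" v="Court Street"/>
  <tag k="created_by" v="Potlatch 0.10b"/>
  <tag k="highway" v="residential"/>
</way>"""


def read_tags(element):
    """Collect the <tag k=... v=...> children of an element into a dict."""
    return {tag.get("k"): tag.get("v") for tag in element.findall("tag")}


node = ET.fromstring(NODE_XML)
way = ET.fromstring(WAY_XML)

node_tags = read_tags(node)                   # the XML parser decodes &amp; to &
node_position = (float(node.get("lat")), float(node.get("lon")))
way_tags = read_tags(way)
way_node_refs = [int(nd.get("ref")) for nd in way.findall("nd")]  # ordered ids
```

The way carries no coordinates of its own: it only lists node ids, so drawing it means looking up each referenced node.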
  39. 39. Other API calls GET a map All elements within a bounding box,48.14,11.543,48.145 PUT elements Now requires “changeset open” request Various other operations History and changeset access Get GPS points/tracks
  40. 40. Some database details Switched from MySQL to Postgres last weekend! Rails migrations in theory In practice. C++ scripts running all weekend Why the switch? Lots of other planned restructuring including new DB hardware Good time to do it
  41. 41. MySQL Generally fast and scalable enough ● Quadtile indexing extension Several annoying flaws: ● schema changes cause table copies ● different features on different db engines ● (transactions on InnoDB, spatial on MyISAM) ● silently accepts invalid utf8 ● constraints can't be deferred ● some non-standard SQL syntax
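The slide only names quadtile indexing, so the following is a sketch of the general idea and not the project's actual code. It assumes latitude and longitude are each scaled to a 16-bit integer and their bits interleaved into one integer, so that points close together on the map tend to get numerically close index values that an ordinary database index can range-scan. The function names are made up for the example.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y, most significant bit first."""
    tile = 0
    for i in range(bits - 1, -1, -1):
        tile = (tile << 1) | ((x >> i) & 1)
        tile = (tile << 1) | ((y >> i) & 1)
    return tile


def quadtile(lat: float, lon: float, bits: int = 16) -> int:
    """Map a position to a single integer index (assumed 16 bits per axis)."""
    scale = (1 << bits) - 1
    x = round((lon + 180.0) * scale / 360.0)
    y = round((lat + 90.0) * scale / 180.0)
    return interleave_bits(x, y, bits)
```

Each pair of bits in the result picks one quarter of the area selected by the bits before it, which is where the name "quadtile" comes from.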
  42. 42. Postgres ● Addresses a lot of MySQL flaws: ● Faster schema changes ● Better support for transactions, utf8, etc ● Personal preference of our sysops
  43. 43. Full Revision History Store a full history of edits to elements Essential wiki-like feature Ideally provide simple roll-back Access old versions of an element Difficult to reconstruct old version of a map
  44. 44. Changesets ● Brand new feature ● Every edit belongs in a change set ● Every numbered version of every object belongs in one particular changeset ● Changesets have comments ● Great for monitoring
  45. 45. Changeset Displays
  46. 46. Changeset revert? ● Reverting is still a difficult problem ● Changesets are not atomic ● Changeset 1 User:Sam Node 12345 v1 ● Changeset 2 User:Sally Node 12345 v2 ● Changeset 3 User:Sid Node 12345 v3 ● Changeset 4 User:Sally Node 12345 v3 ● Changeset 1 User:Sam Node 12345 v4 ● Changeset 1 User:Sam Node 12345 v5 ● Many interlinked elements
  47. 47. Conflicts ● Two users editing the same element – Rarely happens actually ● Version mismatch now reported – “Optimistic locking” ● Editors (should) do CVS style conflict resolution ● Download reveals conflict ● Upload not allowed until resolved
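The "optimistic locking" on the Conflicts slide can be sketched as follows. This is a toy in-memory store, assuming each upload states the version it was based on and is rejected when that version is no longer current. The class and method names are hypothetical and are not the API's own.

```python
class VersionConflict(Exception):
    """Raised when an upload is based on an out-of-date version."""


class ElementStore:
    def __init__(self):
        self._elements = {}  # element id -> (version, tags)

    def create(self, element_id, tags):
        self._elements[element_id] = (1, dict(tags))
        return 1

    def get(self, element_id):
        version, tags = self._elements[element_id]
        return version, dict(tags)

    def update(self, element_id, based_on_version, tags):
        current_version, _ = self._elements[element_id]
        if based_on_version != current_version:
            # Someone else uploaded first: report the mismatch, change nothing.
            raise VersionConflict(
                f"element {element_id}: upload based on v{based_on_version}, "
                f"server has v{current_version}"
            )
        new_version = current_version + 1
        self._elements[element_id] = (new_version, dict(tags))
        return new_version


store = ElementStore()
store.create(12345, {"amenity": "pub"})

# Two editors download the same version...
sam_version, sam_tags = store.get(12345)
sally_version, sally_tags = store.get(12345)

# ...Sam uploads first and succeeds, Sally's upload is then rejected.
store.update(12345, sam_version, {**sam_tags, "name": "Hare & Hounds"})
try:
    store.update(12345, sally_version, {**sally_tags, "real_ale": "yes"})
except VersionConflict as conflict:
    print("Conflict:", conflict)
```

Nothing is locked while a user is editing. The cost is only paid in the rare case of a clash, when the second editor has to download again and resolve the conflict before uploading.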
  48. 48. (Watch nice video) OSM 2008: A Year of Edits
  49. 49. planet.osm ● Snapshot of the OpenStreetMap database ● Entire planet. Every node, way, relation, tag ● Only 'current' data. Not history ● XML formatted .osm file ● 5.2 GB with bzip2 compression ● Uncompressed... 150 GB ● Takes several hours to dump. Every Wednesday ● Important part of Openness. Ensures longevity.
  50. 50. Osmosis ● Java toolkit for OpenStreetMap ● Various data transformations ● Minutely, Hourly, Daily diffs .osc.gz files ● Created by Osmosis. Consumed by osmosis ● Streamable changes
  51. 51. Open Tagging ● Mentioned tags briefly – amenity=pub highway=residential ● Free-form open tagging. Any tags you like! ● Agree on standards ● Main map rendering uses one set of tags ● Other map renderers, other tools, can use other tagging schemes
  52. 52. 'Map Features' wiki page ● BIG list of tags Which tags go on this page? ● Wiki proposal process ● Wiki discussion and voting ● Wiki debates (& blazing rows!) – Different ways of tagging the same thing. – Things which should not be tagged ● Wiki documentation
  53. 53. Smoothness Debate ● Vehement Objections – Too subjective – Verifiability – Poor English ● Disruption – Disregarding vote – Wiki fiddlers vs Mappers – Wiki edit wars – New process? ● Lock down?
  54. 54. The wrong way to think about tags ● Come up with lots of ideas for new tags ● Submit proposals, organise votes, generally fiddle with the OSM wiki a lot ● Pester people to use tags in map renderings ...oh and maybe do a bit of actual mapping
  55. 55. The right way to think about tags ● Do mapping! ● Found something without a documented tag? – Search thoroughly (in mailing list too) – Use a less specific tag and qualify with type= – Use a note= tag – Just invent a tag ● Do more mapping! ● Discuss politely. Improve existing docs. ● maybe... possibly.... do a proposal ● Focus on mapping. Don't worry about rendering
  56. 56. TagWatch ● Tag usage stats ● Split by country ● Tags used in conjunction
  57. 57. Rendering ● Topic follows on although... tagging is not just about rendering ● Go from geodata (nodes, ways, relations & tags) to rasterized map images Rendering
  58. 58. Which tags to render? ● Thousands of different tags in the DB ● Can't show them all ● Choose features to show at different zoom levels – Cartography! ● What do you want to emphasise?
  59. 59. Rendering Toolchain Slippy Map Display
  60. 60. Mapnik ● Open Source rendering software ● Fast! ● C++ ● Requires PostGIS database
  61. 61. Mapnik Stylesheet ● XML format ● 'styles', 'filters' and 'rules' ● >7000 lines long ● Pre-processing steps – Cascadenik – and also...
  62. 62. osm2pgsql ● Step before using Mapnik (& stylesheet) ● load OSM data into a Postgres database ● Lossy conversion. Only take tags of interest ● nodes and ways → linestrings and polygons Slippy Map Display
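To make "lossy conversion" concrete, here is a much simplified sketch of that kind of step. It is not osm2pgsql's real logic: the tag whitelist, the function name and the rule that any closed way becomes a polygon are all assumptions for illustration (a real converter also looks at the tags before deciding what is an area).

```python
# Hypothetical list of tags the stylesheet cares about. Everything else is dropped.
TAGS_OF_INTEREST = {"highway", "name", "amenity", "landuse"}


def convert_way(tags, node_refs, node_coords):
    """Turn a way into a linestring or polygon record, keeping only wanted tags."""
    kept = {k: v for k, v in tags.items() if k in TAGS_OF_INTEREST}
    if not kept:
        return None  # nothing the renderer would draw
    coords = [node_coords[ref] for ref in node_refs]  # (lon, lat) pairs
    closed = len(node_refs) > 3 and node_refs[0] == node_refs[-1]
    return {
        "geometry": "polygon" if closed else "linestring",
        "coords": coords,
        "tags": kept,
    }


node_coords = {1: (-2.006, 53.548), 2: (-2.005, 53.548), 3: (-2.005, 53.549)}
row = convert_way(
    {"highway": "residential", "name": "Court Street", "created_by": "Potlatch 0.10b"},
    [1, 2, 3],
    node_coords,
)
```

In this example the `created_by` tag does not survive the conversion, which is the sense in which the rendering database is lossy and cannot replace the main one.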
  63. 63. ● Open Source JavaScript library ● Dynamic slippy map on your website ● WMS layers ● Tile based map layers ● Transparent overlay layers ● Markers, Boxes, Polygons, Click events In the end we want a map display...
  64. 64. Tiles ● Small map images ● Cacheable ● Fast loading ● Sized to optimize speed – Too big. Unneeded map area – Too small. Too many requests – 256x256pixels
  65. 65. Tile Naming ● Slice the world into tiles at each zoom level ● Tiles are always 256x256 pixels ● Represent different sized area of the world at different zoom levels
  66. 66. Tile Naming Zoom level 0 has only one tile (whole world):
  67. 67. Tile Naming Zoom level 1 has 2x2 tiles
  68. 68. Tile Naming ● Zoom level 2 has 4x4 tiles ● Zoom level 3 has 8x8 tiles ● Zoom level 4 has 16x16 tiles ... ● Zoom level n has 2^n x 2^n tiles ... ● Zoom level 18 has 262144 x 262144 tiles
  69. 69. Tile Naming ● Every tile has a URL built from the zoom level (0-18), x and y ● Tile naming scheme followed by OpenLayers ● Same scheme used by Google Maps ● Looks like a filesystem URL
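The tile naming slides can be turned into a short worked example. The sketch below uses the usual spherical Mercator arithmetic for slippy maps to find which tile contains a given position. The `zoom/x/y.png` path layout is the commonly used convention and is an assumption here, since the slide's own URL is not reproduced in these notes. Function names are made up for the example.

```python
import math


def latlon_to_tile(lat: float, lon: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) tile containing a position at the given zoom level."""
    n = 2 ** zoom  # tiles per side at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that the extreme edges of the map still land on a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_path(zoom: int, x: int, y: int) -> str:
    """Path part of a tile URL, assuming the common zoom/x/y.png layout."""
    return f"{zoom}/{x}/{y}.png"


# At zoom 1 the world is 2x2 tiles. London falls in the top-left one.
x, y = latlon_to_tile(51.5, -0.1, zoom=1)
print(tile_path(1, x, y))
```

Because x counts from the west edge and y from the north edge, doubling the zoom level splits every tile into four children, which is what keeps the naming scheme consistent across zoom levels.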
  70. 70. Tiles =High Performance Computing 262144 x 262144 = 68,719,476,736 tiles inode problem! 5 kB each = 320 terabytes But then there's zoom 17.... another 80 terabytes etc...
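The arithmetic on this slide can be checked with a few lines of Python. The 5 kB average tile size is the slide's own figure. Reading "kB" and "terabytes" as binary units (1024-based) is an assumption that makes the slide's round numbers come out exactly; with decimal units zoom 18 would be closer to 344 TB.

```python
KIB = 1024
TIB = 1024 ** 4


def tiles_at_zoom(zoom: int) -> int:
    """Zoom level n has 2^n x 2^n tiles."""
    return (2 ** zoom) ** 2


def storage_tib(zoom: int, avg_tile_bytes: int = 5 * KIB) -> float:
    """Storage needed for every tile at one zoom level, in binary terabytes."""
    return tiles_at_zoom(zoom) * avg_tile_bytes / TIB


for zoom in (18, 17):
    print(f"zoom {zoom}: {tiles_at_zoom(zoom):,} tiles, {storage_tib(zoom):.0f} TiB")
# zoom 18: 68,719,476,736 tiles, 320 TiB
# zoom 17: 17,179,869,184 tiles, 80 TiB
```

Each zoom level has four times the tiles of the one before it, so the deepest level dominates the total, and the sheer number of files is what causes the inode problem as well.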
  71. 71. Tiles =High Performance Computing OpenStreetMap updates? ● Apply diffs ● Re-render tile images! ● CPU problem!
  72. 72. Caching and mod_tile ● mod_tile – Apache module. Very fast – Render-on-demand if necessary – Clever caching – Serves old cached images and labels as dirty – Dirty tiles get re-rendered by render daemon Slippy Map Display
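The caching behaviour described for mod_tile can be sketched as a toy cache. This is not mod_tile's implementation (which is an Apache module working with a separate render daemon). It only shows the idea of rendering on demand, serving a stale image while it is marked dirty, and re-rendering dirty tiles later. All names are hypothetical.

```python
from collections import deque


class TileCache:
    def __init__(self, render):
        self._render = render    # function: tile key -> image data
        self._tiles = {}         # tile key -> cached image data
        self._dirty = set()      # cached tiles known to be out of date
        self._queue = deque()    # work list for the background renderer

    def get(self, key):
        """Serve a tile. Render on demand only if it was never rendered."""
        if key not in self._tiles:
            self._tiles[key] = self._render(key)
        return self._tiles[key]  # may be stale if the tile is dirty

    def mark_dirty(self, key):
        """Called when map data under a cached tile has changed."""
        if key in self._tiles and key not in self._dirty:
            self._dirty.add(key)
            self._queue.append(key)

    def rerender_next(self):
        """One step of the render daemon: refresh the oldest dirty tile."""
        if not self._queue:
            return None
        key = self._queue.popleft()
        self._tiles[key] = self._render(key)
        self._dirty.discard(key)
        return key
```

The point of the design is that a map viewer never waits for a re-render of a tile that already exists: it gets the old image immediately and the fresh one on a later request.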
  73. 73. Bandwidth ● Serving terabytes of tile data. High bandwidth ● UCL
  74. 74. ● Using OpenStreetMap – Presenting special interest map – Same data. Different cartographic choices ● Toolchain running on another server – Updates fed in – Passionate sub-set of the OSM community
  75. 75. Route relations, Cycle Parking, Bike Shops, Drink
  76. 76. Relief maps!
  77. 77. SRTM ● NASA - Shuttle Radar Topography Mission ● Public Domain ● Problems – Spot heights – not contours – Coarse grid – Voids and other anomalies
  78. 78. CycleMap tool chain ● Downloads weekly planet dump ● SRTM. More steps in the chain! ● Bandwidth problems. Now hosted by CloudMade
  79. 79. OpenPisteMap
  80. 80. Hiking Map
  81. 81. Whitewater Map
  82. 82. Bus map
  83. 83. Kosmos ● .NET (windows only) ● Desktop app ● Can generate tiles ● wiki based style config
  84. 84. osmarender ● First good OSM renderer ● Used to be the only way to get SVG ● Complex perl XSLT ● Generates SVG (XML vector graphics format) ● Feed in .osm file and style config ● Can't be used to generate tiles.... or can it?
  85. 85. 'osmarender' layer
  86. 86. tiles@home ● Distributed tile rendering – Instructions dished out from tiles@home server – Many clients download via API and upload images ● 'osmarender' layer – Used to provide the fastest updates ● XSLT transforms & inkscape SVG rendering – Eats massive amounts of CPU – Mapnik more sensible. need to distribute
  87. 87. Other renderers? ● Plenty of scope to develop but.. ● high performance problem ● Complex graphics problem ● e.g. phprender Needs a bit of work!
  88. 88. We want people to be free to use our maps! OSM License Requirements ● Free to bring maps into “collective” works – Must give “attribution” ● Free to create “derivative” works – must share-alike ● Awkward complications: – What exactly counts as “derivative work”? – How do you give credit to the “authors”?
  89. 89. ODbL + ODC-Factual ● Open Data Commons ● Open Database License ● Factual Information License ● Benefits: – copyright, database right, and contract – Expressly written for data – More strict about underlying data (forcing sharing), but less strict about end products
  90. 90. Commercial use is allowed! ● OSM destroys business models ...or does it? – Destroys monopolies on geo data ● Allowed to charge for distribution – Can't disallow further distribution – Monetary value tends towards zero ● Allowed to charge for services – Distribute different formats / renderings ● Solve difficult problems (+time-dependent problems) ● Hosting – Consulting services ● Just use maps. Core business not in geo-data
  91. 91. Flickr
  92. 92. Nestoria
  93. 93. 'Trails' iPhone app
  94. 94. Get Involved! irc:// #osm
  95. 95. Harry Wood worked as an enterprise integration consultant for 8 years, but led a secret double-life as an addicted contributor to Wikipedia and other collaborative open content projects. He got involved in OpenStreetMap three years ago, as a mapper, wiki gardener, and developer. Since January this year (2009) he has worked for CloudMade as a full-time OpenStreetMap developer. CloudMade is a company providing products and services around OpenStreetMap. More information at These slides are (of course) freely re-usable under the Creative Commons Attribution-ShareAlike 2.0 License