OS MasterMap it's not a map - but data



In 2001 Ian Painter led the team responsible for OS MasterMap. This ground-breaking project took Ordnance Survey’s Land-Line product and created the world’s first database of ‘real world objects’. Now, some 10 years on, this talk will look back at the original premise of its creation and how it’s been used over the years.
First and foremost, OSMM was designed as seamless data, and a big hope for the product was that, unlike Land-Line, it would no longer be used as a backdrop map. Many organisations use OSMM as a backdrop map, but by doing so they’re missing a huge amount of its value. It’s now time to put the map aside and use the data. Welcome to Big Data.
The second half of this talk will introduce Big Data concepts for analysis and query: how a Big Data platform scales on commodity hardware, makes extensive use of parallel processing and works with GI data in a manner totally different from anything we’ve seen before. Not just that, but how Big Data offers all this at a fraction of the cost of traditional GIS and relational databases. Big Data will finally realise the original vision of OSMM – and when that happens OS will need to change the name!
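
The parallel processing mentioned above is commonly done with MapReduce, which slide 14 names. The following is a minimal single-process sketch of that pattern, not how any particular platform implements it. The feature records, the tile size and the function names are all hypothetical; a real platform such as Hadoop would run the map and reduce steps on many machines.

```python
from collections import defaultdict

# Hypothetical feature records: (id, descriptive group, centroid x, centroid y).
# These are illustrative values, not real OS MasterMap data.
FEATURES = [
    ("f1", "Building", 1200.0, 3400.0),
    ("f2", "Road", 1900.0, 3100.0),
    ("f3", "Building", 2500.0, 3900.0),
    ("f4", "Building", 1100.0, 3050.0),
]

TILE_SIZE = 1000.0  # assumed tile size in metres


def map_feature(feature):
    """Map step: emit ((tile_x, tile_y, group), 1) for one feature."""
    _fid, group, x, y = feature
    key = (int(x // TILE_SIZE), int(y // TILE_SIZE), group)
    return key, 1


def shuffle(pairs):
    """Shuffle step: collect all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one key."""
    return key, sum(values)


def count_by_tile(features):
    """Count features per (tile, group) using map, shuffle, reduce."""
    pairs = map(map_feature, features)
    grouped = shuffle(pairs)
    return dict(reduce_counts(k, v) for k, v in grouped.items())
```

Each map call looks at one feature in isolation and each reduce call looks at one key in isolation, which is what lets a cluster spread the work over commodity machines.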

Published in: Technology


  1. 1. OS MasterMap it’s not a map – but data Ian Painter Snowflake Software
  2. 2. About me• I don’t represent Ordnance Survey,• I worked there for 10 years• My opinions are my own
  3. 3. First Some History• County Series was Ordnance Survey’s large scale paper product• Landline was Ordnance Survey’s first digital product• Built to print paper maps quicker• Blind digitised … so plenty of
  4. 4. And then came OS MasterMap Miracle Occurs
  5. 5. Fastrack• Take Landline as input• Clean it up … like really clean it up• Stitch it together (edgematching)• Polygonise it• Restructure the road network• Beef up the attribution• Multi-level structures• Classify all the polygons• Associate all the cartographic text• Only 1% manual editing• Complete in a year
  6. 6. Geospatial Object Server All very well creating all this stuff but we need somewhere to put it:• Store it all seamlessly• Maintain the topology• Store change• Seamless ordering• Huge data volumes• From 100 features to 450 million• Built on an object oriented database called ObjectStore
  7. 7. Fastrack + GOS =• Unique identifiers• Real world• Seamless Product• Change-only-update• Delivered as GML
  8. 8. Impacts on the Industry• Data management – From files to databases• Large data volumes• Complex data models – From simple features to table joins, multiple geometries• GML – XML rather than proprietary• Change only update – Individual feature update rather than file replacements
  9. 9. Key Market Selling Points• A data product – Clean – Structured – Rich attribution – Unique identifiers• Seamless - designed for query and analysis• Change Intelligence – Change triggers – Historical archives
  10. 10. How did we do 10 years on?• Map vs data – Coloured backdrop map – Cloud web mapping is a step backwards• Change Only – Slow start but most now applying COU – Little use of COU for change intelligence queries• Identifiers – Core referencing hasn’t really worked• Seamless – Very little spatial analysis
  11. 11. But Why?• Data model capabilities of GIS are very limiting• Too much focus on web mapping – Even more so with online mapping portals• Proprietary nature of GIS – The limitation of its formats – Preference for file based data management – Lack of integration with mainstream IT• Functionality is focused on the map, not the data• Huge hardware requirements for spatial analytics
  12. 12. So what’s going to change all this? When are we going to drop the map? Isn’t it just data? Big Data?
  13. 13. Well it’s not Big Data … but I needed a link!• Big Data is a buzzword!• A technology paradigm to massage your ego• Everybody likes to have something … BIG
  14. 14. Seriously … a Big Data 101• Big Data looks at data in two ways 1. Structured Data – think schema, data models 2. Unstructured Data – free text, insurance claim, transaction log• Structured Data tends to be stored in a database – But not any old database … a NOSQL database• Unstructured Data tends to be stored on a file system – But not any old file system … a distributed filesystem – And then processed through a paradigm called MapReduce
  15. 15. So what’s all the tech?• NOSQL Databases: – Columnar, Document, Key-Value and Graph – Netezza, Vertica, Teradata, MongoDB• Unstructured data: – Hadoop: HDFS, MapReduce, Amazon Elastic MapReduce• Can also be hardware – Lots of hardware
  16. 16. Who uses this stuff?• Yahoo created Hadoop, it runs at: – Facebook, Twitter and eBay – Heavy use in Telco, Finance and retail• Facebook runs the largest Hadoop cluster in existence – 21PB of storage, – 2000 (8 core) machines, – 12TB per machine, – 32GB of RAM
  17. 17. But what about Geo support?• IBM Netezza has native spatial• MongoDB• ESRI ArcGIS 10.1 has native Netezza support• But that’s only required if we think of geo as map• Geo won’t be just maps for much longer …• Geo is just a collection of relationships – This is next to this, this is inside that• NOSQL databases are far better suited to graphs
  18. 18. A Big Data problem …• NFC – Near Field Communication will put a spatial component on every cash transaction• Think of running a spatial query against national OSMM to check if the NFC transaction happened in the correct location – So that’s 1 x,y against 440 million features• NFC has the potential to overtake cash in less than 10 years – We’re talking 100s of millions vs 100s of millions a minute … now that’s a Big Data problem
  19. 19. GIS ain’t gonna cut it• Step up – Big Data – Geo as graphs and relationships• Take a seat – GIS and rest your tired data model capabilities – Maps and cartography – Relational databases• Use Geo to give me an answer but don’t show me a map
  20. 20. In summary• Despite OS MasterMap being over 10 years old – It’s still ahead of GIS capabilities – A lot of inherent value still isn’t used• Spatial is not special, it’s just another data type• Too much focus on a map to solve all• Big Data is coming, Geo will be very different• Some sectors will skip GIS altogether• Think data, not maps• Graphs, not geometry
  21. 21. Ian Painter Snowflake Software ian.painter@snowflakesoftware.com http://www.snowflakesoftware.com @iapainter Ya scallway whut deserves the black spot. Come t’ our stand an’ be seein’ some great demos.
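
Slide 18 describes checking one x,y against hundreds of millions of features. A minimal sketch of that location check follows, assuming premises are simplified to axis-aligned rectangles and indexed on a uniform grid so a query only touches candidates in one cell. The premises, the cell size and the function names are hypothetical; real OSMM polygons are arbitrary shapes and would need a proper point-in-polygon test.

```python
from collections import defaultdict

CELL = 100.0  # assumed grid cell size in metres

# Hypothetical premises: id -> (min_x, min_y, max_x, max_y).
# Illustrative rectangles, not real OS MasterMap features.
PREMISES = {
    "shop_a": (120.0, 80.0, 180.0, 140.0),
    "shop_b": (400.0, 400.0, 450.0, 460.0),
}


def build_index(premises):
    """Map each grid cell to the ids of the premises whose box overlaps it."""
    index = defaultdict(set)
    for fid, (x0, y0, x1, y1) in premises.items():
        for cx in range(int(x0 // CELL), int(x1 // CELL) + 1):
            for cy in range(int(y0 // CELL), int(y1 // CELL) + 1):
                index[(cx, cy)].add(fid)
    return index


def transaction_in_premises(index, premises, fid, x, y):
    """Return True if point (x, y) falls inside the claimed premises."""
    cell = (int(x // CELL), int(y // CELL))
    # Only premises registered in this cell can contain the point.
    if fid not in index.get(cell, set()):
        return False
    x0, y0, x1, y1 = premises[fid]
    return x0 <= x <= x1 and y0 <= y <= y1
```

For example, after `index = build_index(PREMISES)`, a transaction claimed by `shop_a` at (150.0, 100.0) passes the check, while the same point claimed by `shop_b` is rejected without comparing against any other premises.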