Open Spatial Data
Progress towards a reusable gazetteer
Open Data Group, 16th April 2012
@ianibbo
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
Overview
Original Problem
- How to transition a central govt funded aggregation of childcare and positive activities with a budget of >2m / year to an open data* model running on £60/month hardware
- Retaining security (of a certain level)
- Retaining functionality
(See http://www.madwdata.org.uk/blog/id/394)
2 Major Costs To Mitigate
- Large cluster of proprietary OS hosts, ~12 front end web servers, hot backup SQL server
- Migrated to 1 × Pound Host server at ~£60/month; server has 2 hard drives, hot backup, off-site rsync
- Data costs: BPH Address-Point data, used for geocoding incoming records and lookups on search terms. OS Boundary Line ???
Open Spatial Data
- Ordnance Survey Open Data: http://www.ordnancesurvey.co.uk/oswebsite/products/os-lo
- Code Point Open: postcodes to Northing/Easting
- OS Locator: gazetteer of road names (and other features)
- Obtained by registering on website, requesting, getting email, following link, …..
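As a rough illustration of the "postcodes to Northing/Easting" lookup, the sketch below loads Code Point Open style CSV rows into an in-memory index. The column positions (postcode in column 0, easting in column 2, northing in column 3) are an assumption about the file layout and should be checked against the header file shipped with the release; the function names and the sample rows are hypothetical, not taken from the project.

```python
import csv
import io

# Assumed column positions in a headerless Code Point Open CSV row:
# 0 = postcode, 2 = easting, 3 = northing. Verify against the release's header file.
POSTCODE_COL, EASTING_COL, NORTHING_COL = 0, 2, 3


def normalise_postcode(raw):
    """Upper-case a postcode and remove all whitespace so lookups ignore formatting."""
    return "".join(raw.split()).upper()


def load_postcodes(csv_file):
    """Build a dict mapping normalised postcode -> (easting, northing)."""
    index = {}
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        index[normalise_postcode(row[POSTCODE_COL])] = (
            int(row[EASTING_COL]),
            int(row[NORTHING_COL]),
        )
    return index


# Invented sample rows for illustration only; not real Code Point Open data.
SAMPLE = io.StringIO(
    '"ZZ1 1AA",10,400000,300000,"E92000001"\n'
    '"ZZ1 1AB",10,400150,300220,"E92000001"\n'
)

postcodes = load_postcodes(SAMPLE)
print(postcodes.get(normalise_postcode("zz1 1aa")))
```

Normalising the postcode on both load and lookup means "ZZ1 1AA", "zz11aa" and "ZZ1  1AA" all resolve to the same entry.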
The reality of CodePoint Open
- The core data is “Open”
- Missing the one vital link between CodePoint Open and OS Locator: PostCode → Road Names / Identifiers.
- If you're happy to display postcodes without road names, it's ideal.
- Last Mile Problem.
- Finding an automated way to link the two is hard!
- Licensed data is now open, but out of date
Address Point
- Still licensed
- Expensive
- Probably not that useful anyway for most projects
Problem with focus on “Open Data”
- Everyone ends up implementing their own gazetteer
- Large scale providers have rate limits and introduce external dependencies / speed issues
- People want local geo-coding (for lots of different reasons).
- Having rolled your own gazetteer, you discover you need to handle updates (full replacements)
- It's not an end in itself
Vision
A stand-alone gazetteer web app designed for local network use, with features for importing updates from OS, reconciling multiple data sources and performing geo-coding lookups.
Available Tools
- Apache SOLR
  - Long-standing stalwart of the open data and search community
  - Schemas slightly clunky
  - Several spatial options, all with different strengths / weaknesses. Multiple points a problem in some.
- ElasticSearch
  - Schema free, apparently solid spatial, multi points
  - Good integration with Mongo via Rivers
Problems / Issues
- ES spatial search is hard to do directly via a COOL URL
- Spatial query syntax is expressive, but complex and needs JSON sub-documents
- Need service wrappers
- But that's easily done
- Updates!
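A minimal sketch of what such a service wrapper might do: take the query string of a simple URL and expand it into the nested JSON body that an ElasticSearch geo-distance search expects. The parameter names (`lat`, `lon`, `dist`) and the field name `location` are hypothetical, the field is assumed to be mapped as a geo_point, and the `bool`/`filter` layout follows the form documented for recent ElasticSearch releases rather than the 2012 syntax.

```python
import json
from urllib.parse import parse_qs


def geo_query_from_url(query_string, field="location"):
    """Translate 'lat=..&lon=..&dist=..' into an ElasticSearch geo_distance query body.

    `field` is a hypothetical geo_point field name; `dist` defaults to 1km.
    """
    params = parse_qs(query_string)
    lat = float(params["lat"][0])
    lon = float(params["lon"][0])
    distance = params.get("dist", ["1km"])[0]
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        field: {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


# The caller sees a flat URL; the wrapper hides the JSON sub-documents.
body = geo_query_from_url("lat=53.38&lon=-1.47&dist=2km")
print(json.dumps(body, indent=2))
```

The wrapper only builds the request body; sending it to a search endpoint is left out so the sketch stays independent of any particular client library.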
Missed Level of Abstraction
(Common to many open data sets?)
[Diagram. Box labels: Source, Local Copy, Compare, Processing. Annotations: "NOSQL like Mongo is ideal for this"; "ES ideal for this".]
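One possible reading of the compare step, given that updates arrive as full replacements: diff each new source release against the local copy and pass only the differences on for processing. The sketch below assumes both sides can be keyed by a stable record identifier; the identifiers and field names in the sample data are invented for illustration.

```python
def diff_release(local, incoming):
    """Compare the local copy with a full-replacement release.

    Both arguments are dicts keyed by a record identifier.
    Returns (added, changed, removed) dicts.
    """
    added = {k: v for k, v in incoming.items() if k not in local}
    removed = {k: v for k, v in local.items() if k not in incoming}
    changed = {
        k: v for k, v in incoming.items() if k in local and local[k] != v
    }
    return added, changed, removed


# Invented records for illustration only.
local_copy = {
    "r1": {"name": "HIGH STREET", "easting": 400000, "northing": 300000},
    "r2": {"name": "MILL LANE", "easting": 400500, "northing": 300900},
}
new_release = {
    "r1": {"name": "HIGH STREET", "easting": 400010, "northing": 300000},
    "r3": {"name": "STATION ROAD", "easting": 401200, "northing": 299800},
}

added, changed, removed = diff_release(local_copy, new_release)
print(sorted(added), sorted(changed), sorted(removed))
```

Only the added, changed and removed records would then need re-indexing, rather than the whole data set on every release.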
Progress
- Starting to extract code from existing services into a generic spatial app
- https://github.com/ianibo/AnOpenGazetteerFramewo
- Work progressing under aegis of GIST Mobile group / Open Data group
- Workable gazetteer now, but with only a command line interface for importing.
Some supporting info
Original Project: FOI request to DfE
[Chart: "Total costs - First 3 years". Vertical axis 0 to 7000000; horizontal axis 2008-09, 2009-10, 2010-11. Series: Local Authority Revenue, Local Authority Capital, Central Office of Information, Qi Consulting, Redhouse, DfE Staff Costs, Consultation seminars, Methods Consulting, Engine Group, Digital Public, Tribal Education.]
[Chart: "First 3 years - Non LA costs". Vertical axis 0 to 2500000; horizontal axis 2008-09, 2009-10, 2010-11. Series: Central Office of Information, Qi Consulting, Redhouse, DfE Staff Costs, Consultation seminars, Methods Consulting, Engine Group, Digital Public, Tribal Education.]