Who am I? Richard Cantwell, Senior GIS Consultant / Technical Manager with GAMMA. I’ve been in GIS since 1991, so just over 20 years of experience working with GIS. I’m here to talk to you today about the forthcoming revolution in GeoCoding in Ireland, and why 2012 is going to be the most important year yet for GIS Managers in Irish organisations. But first of all, what are the challenges that face users of Irish address databases?
What is GeoCoding? At its simplest it can be considered the Art (or Science) of turning an address into a point on a map.
The geocoding process in Ireland, as it currently stands, is quite complex, for a number of reasons. There are a number of “Do’s and Don’ts” we’ve learned over the years; let’s look at these now.
Firstly, a lot of addresses are ‘ad-hoc’ as in this example – people in the same building recording their addresses quite differently.
This is related to the second issue – vanity addresses. People like to live in more ‘upmarket’ areas, and sometimes they accomplish this by changing their address without moving house!
Then there is the issue of townlands. Over 30% of Irish addresses are not unique; typically they are rural addresses where the townland name is used as the address. Most roads in rural areas don’t have official names. Take the townland of Roskeen. What is its address? Roskeen, Co. Laois? But it’s a very small place, only 11 houses, and there are 50,000 townlands in the country. So people often put the name of the nearest large town, like Tullamore in this case. But if you’re cleaning, or ‘sanitising’, your address database, the process can throw out Tullamore, Co. Laois, because Tullamore is in Offaly.
Here’s an example of a lookup for Roskeen. GeoDirectory stores it as Roskeen, Tullamore, Co. Laois. There are 11 addresses in this townland.
We can look at them on Google’s StreetView.
And/or we can see them on a map. We could present this to a user / the public and ask them to pick the correct building. But this is an extra step – and that leads me on to…
When dealing with address databases our primary concern as GI professionals is accuracy, so we are willing to concede simplicity or speed, making things complex and/or slow for the user. However, the end user usually wants things to be fast and simple and is not usually concerned with accuracy at all. So we need to make a trade-off..
..based on usage metrics.
What purpose are you using the address for? If it’s for the classic case of ambulance routing, for example, you need a precise match to a building / address. If you don’t get this level of match then you need to handle that – presenting the user with a map to click on might be an option, but this is an extra step and is another hoop for the user to jump through. Or perhaps you’re looking at something like usage of rural transport initiatives, in which case you don’t necessarily need building level matches – a townland match will suffice.
So the GeoCoding process needs to be informed by the use to which you will be putting the outputs. Do you need building level matches? Or will Area based matching suffice? Perhaps a mix of the two – an example might be flooding, where you need building level matching in some parts of the county, but area based will be sufficient in other parts.
Here’s an example of a return showing differing levels of match. The ‘L’ indicates a locality level match, ‘A’ an address point, ‘B’ a building, ‘T’ a town and ‘S’ a street (or thoroughfare, as GeoDirectory calls it). In this case we could assign certain match levels for further processing if required. This tagging with MatchLevel is key.
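To show why tagging with MatchLevel is so useful, here is a minimal sketch that routes geocoded records by match level. The single-letter codes are the ones in the sample return; the record layout, the `route_by_match_level` function and the bucket names are assumptions made for illustration, not part of GeoDirectory or any geocoder's API.

```python
from collections import defaultdict

# Match-level codes as described for the sample return.
MATCH_LEVELS = {
    "A": "address point",
    "B": "building",
    "S": "street / thoroughfare",
    "L": "locality",
    "T": "town",
}


def route_by_match_level(records, accepted_levels):
    """Split records into usable, needs-review and unmatched groups.

    records: iterable of dicts with a 'match_level' key (hypothetical layout).
    accepted_levels: set of codes precise enough for the intended use.
    """
    routed = defaultdict(list)
    for record in records:
        level = record.get("match_level")
        if level in accepted_levels:
            routed["accepted"].append(record)
        elif level in MATCH_LEVELS:
            # A genuine match, but too coarse for this use: flag for follow-up.
            routed["review"].append(record)
        else:
            routed["unmatched"].append(record)
    return dict(routed)


# Illustrative records only; real returns carry many more fields.
sample = [
    {"id": 1, "match_level": "B"},
    {"id": 2, "match_level": "L"},
    {"id": 3, "match_level": None},
]

# A use that needs a precise match accepts only address-point and building levels.
result = route_by_match_level(sample, accepted_levels={"A", "B"})
```

A use that only needs area based matching could pass a wider set, such as `{"A", "B", "S", "L", "T"}`, and far fewer records would land in the review group.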
It will be tempting to simply set up address capture forms that mirror your existing infrastructure. Instead, from the user’s point of view it is much simpler to add a couple of address lines, and maybe a county field – the user then doesn’t have to figure out what to put where, and this can be worked out afterwards and then reformatted to fit the existing database. It is also good practice to retain the address which the user entered.
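A minimal sketch of this capture-first, structure-later approach might look like the following. The `CapturedAddress` class and the `structure_later` function are hypothetical names, and splitting on commas is only a stand-in for whatever rules a real address matcher would apply.

```python
from dataclasses import dataclass, field


@dataclass
class CapturedAddress:
    lines: list                 # free-form address lines, exactly as typed
    county: str = ""            # optional county field
    parts: list = field(default_factory=list)  # structured elements, filled in later


def structure_later(captured):
    """Derive cleaned address elements without altering what the user entered."""
    parts = []
    for line in captured.lines:
        for element in line.split(","):
            cleaned = " ".join(element.split())  # collapse stray whitespace
            if cleaned:
                parts.append(cleaned)
    county = " ".join(captured.county.split())
    if county:
        parts.append(county)
    captured.parts = parts
    return captured


entry = CapturedAddress(lines=["  Roskeen ,", "Tullamore"], county="Laois")
structure_later(entry)
```

After the call, `entry.parts` holds the cleaned elements ready for reformatting to fit the existing database, while `entry.lines` still holds the address exactly as the user typed it.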
In summary, some do’s and don’ts:
The end result of the geocoding process is a mappable table, ready for mapping / analysis.
But there is a bigger opportunity here – each department in the organisation maintains its own address database, which is not efficient. Breaking down these data silos is the future. Joined up thinking is needed, and a centralised address database is a key step in achieving this.
Having a single view of the citizen, or customer, can lead to big savings – reduced duplication of effort, reduced data management and can enable “joined up government” with the left hand knowing what the right hand is doing.
Why is this important? Because there is a huge opportunity here for GIS managers to put their data and expertise at the heart of the organisation – to break out from the IT department and into other parts of the organisation – wherever address data is used.
The value of GI is plain to us here, but is often underappreciated in your organisations. Properly GeoCoded data is a key tool for changing this.
The quality of GeoCoding in Ireland has improved. GeoDirectory underpins GeoCoding in Ireland.
It has been under constant refinement and improvement since its introduction 12 years ago, and allows GI professionals to conduct a very wide range of analysis – from simple exercises like this – empty Ireland..
..to more detailed analyses, like penetration rates – so you can map, say, the proportion of local authority owned housing not at DED level but by much smaller units – like these 250m grid squares, or whatever geography you need.
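As a rough sketch of how a penetration rate per grid square can be computed, the snippet below bins building points into 250m cells and works out the proportion flagged in each. It assumes coordinates are eastings and northings in metres on a projected grid; the `penetration_by_grid` function and the sample points are illustrative only and are not GeoDirectory data.

```python
from collections import defaultdict


def penetration_by_grid(points, cell_size=250):
    """Return {cell: proportion of flagged points} for square grid cells.

    points: iterable of (easting, northing, is_flagged), coordinates in metres.
    cell_size: side length of each grid square in metres.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for easting, northing, is_flagged in points:
        # Integer cell index along each axis identifies the grid square.
        cell = (int(easting // cell_size), int(northing // cell_size))
        totals[cell] += 1
        if is_flagged:
            flagged[cell] += 1
    return {cell: flagged[cell] / totals[cell] for cell in totals}


# Made-up points: the flag could mean "local authority owned", for example.
buildings = [
    (100.0, 100.0, True),
    (200.0, 240.0, False),
    (260.0, 100.0, True),
]
rates = penetration_by_grid(buildings)
```

Changing `cell_size` gives a coarser or finer grid, which is the point: once addresses are geocoded to building level, the reporting geography becomes a choice rather than a constraint.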
It is a complex database..
..which is regularly updated. It has just moved to 6 updates per annum, up from 4 previously.
Underlying technology has changed too
Data volumes are rising exponentially
Data is migrating to the cloud
One of the biggest changes to GeoCoding in Ireland is the proposed introduction of Postcodes. Indications at this stage are that the specification will be agreed by ‘late summer’ and it is probably going to be a building level system, rather than an area based one like that in NI/UK, because this is the only way of solving the non-unique address problem without naming every road in the country and numbering every house.
Area based postcodes are all well and good, but for some use cases you really need building level matching.
The Census has changed too, from this single pager of 1911..
..to over 1,100 variables collected in 2011. Key here is that GeoDirectory underpinned Census 2011, with corrections being fed back into GeoDirectory from the CSO.
There is also an impact on census data itself – Small Areas are coming. We expect to see data available at this level by December 2012. Moving from 3,400 DEDs to approx. 19,000 SAs will make much more detailed and granular data analysis possible.
The applications built upon GeoDirectory data, encompassing rules etc., have been under constant refinement too.
The end result is an ever increasing level of match accuracy.
So we are now at the point where geocoding to building level from within MapInfo achieves acceptable rates of accuracy.
Our partners, AutoAddress, have built a GeoCoding tool for MapInfo – it’s baked right into the interface, and can do address lookups or batch geocoding.
Autocomplete is on for lookups.
The lookup returns a point.
Set up the batch geocoder by picking fields in the base table.
Process is very fast – over 30,000 records per hour single threaded. If you have more records than that there are other options.
The end result is addresses converted to points, with a match rate exceeding 90%.
AutoAddress is the leading geocoder in Ireland and is gaining customers rapidly.
When working with addresses, Data Protection is a primary concern.
Individuals care about the data which they have made available to you – you must consider this at all stages.
Other changes include the ever increasing range of source data – social media, geotagged photos, etc.
There are also location based Social Media services like FourSquare et al. Whether these are a flash in the pan remains to be seen, but their POI database is huge and they recently passed 1.5 billion check-ins. This data can then be easily consumed in a range of formats and displayed on an ever widening range of mapping platforms, many of which are based on OpenStreetMap.
There are also open data initiatives like Dublinked..
..and changes in how you manage and deliver your spatial data to your organisation, via products like MapInfo Spatial Server..
..which is a fully featured spatial data management system that puts your spatial data at the fingertips of everyone in your organisation.
Thanks for your time. If you have any questions we’ll take them at the end of this session, but you can also contact me at this email address, and I’ll be at the GAMMA stand in the exhibition hall for the rest of the day.
Transcript of "The GeoCoding Revolution"
The GeoCoding Revolution Richard Cantwell email@example.com www.gamma.ie
The Art (or Science) of converting an address to a point on a map
It isn’t uncommon for neighbours along a street to write their addresses differently:
56 Woodbrook Sq., Castleknock, Dublin 15
42 Woodbrook Sq., Diswellstown Rd., Clonsilla, Dublin 15
34 Woodbrook Sq., Carpenterstown, Dublin 15
Know what you’re measuring
Sample Return, with multiple Match Levels.
Impose Order on Freeform Data.... After time of capture.
Some “Do’s and Don’ts”
• Do expect a variety of address formats and contents, even for the same address point.
• Don’t argue with the users about the one they use.
• Do anticipate that non-unique addresses will be common.
• Do trade off address process completeness for speed and clarity.
• Do know what level of match you need: building level, county level etc.
• Do record the match level for each address.
• Do capture addresses in free form, sort out structure later.
• Don’t let your back-end IT systems determine how you store an address.
End Result
Sample output, with Match Level, Grid Refs, Lat-Long and other identifiers
Key points:
• Ongoing developments in data and systems mean it is now possible to geocode in desktop GIS with acceptable match rates, without interactive processing, although interactive processing is recommended.
• Postcodes are coming; they represent a major opportunity to widen the remit of GIS Managers.
New sources of GeoData
http://www.flickr.com/photos/walkingsf/4672160490
New Kinds of Address Data & Platforms
FourSquare data (via GeoRSS) on a map from http://www.dotspotting.org
Key Points:
• Data Protection and Privacy concerns are paramount.
• The range and depth of available spatial data is exploding.
• New systems and platforms to address these are available.
2012 is going to be a revolutionary year:
• Advances in GeoCoding.
• Postcodes are coming.
• New datasets and sources of data.
• New technologies.
Together these mean that this year has huge opportunities for GIS Managers.