Cleansing land ownership data, an FME use case
David Eagle
Principal Consultant
david.eagle@1spatial.com
@david_eagle
Agenda
• 1Spatial
• Asset management, the case for good data
• The data challenge
• Technical solution
– Regex and Lists
• Benefits
• Founded in 1969
– Part of the Cambridge Tech Cluster

• Headquarters in Cambridge, UK
– International offices in Australia, Ireland,
Belgium & France
• A group of innovative, market leading technology
companies:
Our Customers
•A specialist provider to National Mapping and Charting
Agencies, Government, Defence and Utilities
Our Partners
Customer Case Study
• Fisher German
– Multi-discipline firm of Chartered Surveyors, Town
Planners, Property Consultants & Specialist Engineers
– Management of:
• 4000km of high pressure oil pipeline
• 2500km fibre network

– Creators of:
• www.linesearchbeforeudig.co.uk a free to use enquiry tool used
by BT, HA, Utilities, Local Gov’t etc
• >45 members with protected assets such as:
Linear Asset Management
• Key role is management and protection of
buried and overhead assets:
– High pressure oil and gas pipelines
– Fibre optics
– Overhead power lines

• Need to ensure access to assets for inspection,
maintenance, upgrade and safety.
• Document, maintain and manage details of
land ownership in the vicinity of assets.
Why is Linear Asset Management Important?


Hunton Hill – Birmingham
• Shop - New gas supply connection
• 25mm PE connection to a 150mm cast iron main
• 1hr job!
• Found 300mm steel pipe
• Drilled anyway
• 3hrs later…
A close call…
Cut-out cross section showing the carrier pipe and the epoxy shell repair
• 5mm wall
• 0.5mm left
• Petrol pressure 100 Bar (1400psi)
• Gas main is 100psi
The importance of accurate data
• Ownership rights – Gas pipe and pond in Dorset
• Incorrect grantor was on the mailing list
• Land Registry data saves the day
The systems
Before
• Asset management system – UDB
• Desktop GIS – Spatial data managed and edited
– No synchronisation and some duplication

After
• Database extended to support ‘spatial’
• Single data source served to UDB and desktop
• Addition of web client for view only
• Data editing via WFS-T
Mitigating the risk
• New project = New desk exercise
• Data is purchased from the Land Registry
• Known ownership along alignment is collated
• Site visits enhance ownership details
– Access points
– Difficult access
– Tenants
– Where is asset exactly?
– Dogs!
Data to feed the systems
• At the start of a project it’s necessary to collate a number of datasets
• Project inputs:
1. Existing asset data and records
2. Route Corridor
3. Land Registry Shape and CSV
4. On site inspection data
5. Constraints mapping – Environmental Stewardship, Commonland Register
6. Other External Datasets
The process
• Manual QA and formatting steps:
1. Processing of the CSVs into the required schema
2. Merge with the cleaned and aggregated geospatial data
3. Import into online management systems
• The manual process could take several days and involve 2 or 3 people
– Each project can have over 10,500 title deeds & 7,000 grantors
• 300 grantors = 2 days of manual effort
Land Registry - Attributes
• Fundamental but presents some challenges
• The deed address details are supplied in a CSV
– Title Number – Title reference number
– Tenure – Freehold etc
– Proprietor – Full name and address
– Address – Description of position of address/land
• Extra fee to get a ‘slightly’ better structure
• It still requires significant manual effort to format
Land Registry - Geometry
• All geometry (each title polygon) is held in an ESRI Shapefile
• Many polygons are split into a number of pieces
• The Land Registry holds and exports the data tiled
• Features are not aggregated on export
• The geometry must first be joined to the attributes using the primary key (PK)
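
As a rough illustration of the aggregation and join described on this slide (in Python rather than FME, and not the workflow Fisher German actually built), the sketch below groups tiled polygon pieces by a shared key and attaches the CSV attributes. Treating the Title Number as the PK is an assumption, and the title numbers and coordinates are hypothetical.

```python
from collections import defaultdict


def aggregate_and_join(pieces, attributes):
    """Group polygon pieces by title number, then attach the CSV attributes.

    pieces: iterable of (title_number, ring) tuples, one per exported piece.
    attributes: dict mapping title_number to a dict of CSV columns.
    """
    grouped = defaultdict(list)
    for title_number, ring in pieces:
        grouped[title_number].append(ring)

    joined = {}
    for title_number, rings in grouped.items():
        joined[title_number] = {
            "parts": rings,
            # None flags a geometry with no matching CSV row.
            "attributes": attributes.get(title_number),
        }
    return joined


# Hypothetical sample data: title "K100" was split across two tiles.
pieces = [
    ("K100", [(0, 0), (1, 0), (1, 1)]),
    ("K100", [(1, 0), (2, 0), (2, 1)]),
    ("K200", [(5, 5), (6, 5), (6, 6)]),
]
attributes = {"K100": {"Tenure": "Freehold"}}
titles = aggregate_and_join(pieces, attributes)
```

Keeping unmatched titles (with `None` attributes) rather than dropping them makes missing CSV rows easy to spot during QA.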
What is FME?
• Industry standard translation and transformation software
• Supports >300 formats
• Allows manipulation of many data types:
The case for FME
• FME is often bought for a specific task.
• The value comes when it’s used for tasks not previously considered
– Fisher German’s initial impetus was loading their database
• They turned to FME to clean and conflate their data later
• Building a case for FME wasn’t necessary
– Re-use the flexible technology and get a better ROI
Automate and re-use
• Automate out the mundane with FME
• Avoid hours of Excel copy/paste
• Allow staff to focus on the analysis
• First task, process 6 linear asset project files
• 24,000 Land Registry records processed in 30 seconds with FME
– Previously this would have taken >6 days.
• Subsequent steps clean up the geometry and merge the attributes – but this is a classic FME task!
Automate and re-use

• Lots of Testers/TestFilters
• Popular Transformers: http://goo.gl/4rOGf
– Adopt an “if, then, else” approach
• FME 2013 SP1 is more capable with ‘Conditional Mapping’
– http://evangelism.safe.com/fmeevangelist113/
• The success of the process relies on two capabilities:
1. Lists
2. Regex
Lists

• A list is a method by which FME permits a single
attribute to hold multiple values
Polygon contains 12 trees

tree.Species{0} oak
tree.Species{1} ash
tree.Species{2} birch
tree.Species{3} oak
tree.Species{4} birch
tree.Species{5} birch
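
For readers more familiar with scripting, a list attribute behaves much like an indexed array attached to a feature. The Python sketch below is only an analogy (it is not how FME stores lists internally); the `feature` dictionary and helper names are hypothetical.

```python
from collections import Counter

# One feature carrying a list attribute, mirroring tree.Species{0..5}.
feature = {
    "geometry": "Polygon",
    "tree.Species": ["oak", "ash", "birch", "oak", "birch", "birch"],
}


def list_element(feature, name, index):
    """Return one element of a list attribute, like tree.Species{index}."""
    return feature[name][index]


def element_count(feature, name):
    """Count the elements in a list attribute."""
    return len(feature[name])


second_species = list_element(feature, "tree.Species", 1)
species_totals = Counter(feature["tree.Species"])
```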
Challenge 1: Split the ‘Proprietor’ into ‘Name’ & ‘Address’
“ SOUTH EASTERN POWER NETWORKS PLC Newington House, 99 Southwark Bridge Street, London SN1 1AB ”

• Tester – Pass: If Proprietor begins with <space>
• AttributeSetter: It’s a Commercial business
• AttributeSplitter: Split on 2 <spaces> and trim whitespace
– proprietor.Proprietor{0} SOUTH EASTERN POWER NETWORKS PLC
– proprietor.Proprietor{1} Newington House, 99 Southwark Bridge Street, London SN1 1AB
• AttributeRenamer:
– Name = SOUTH EASTERN POWER NETWORKS PLC
– Address = Newington House, 99 Southwark Bridge Street, London SN1 1AB
Challenge 1: Split the ‘Proprietor’ into ‘Name’ & ‘Address’
“JOHN EDMUND SMITH

Big Farm, Preston, Canterbury, Kent ” *

• Tester – Fail: (Proprietor did NOT begin with <space>)
• AttributeSetter: It’s a Residential property
• AttributeSplitter: Split on 4 <spaces> and trim whitespace
– proprietor.Proprietor{0} JOHN EDMUND SMITH
– proprietor.Proprietor{1} Big Farm, Preston, Canterbury, Kent
• AttributeRenamer:
– Name = JOHN EDMUND SMITH
– Address = Big Farm, Preston, Canterbury, Kent
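
The two branches of Challenge 1 can be sketched outside FME as a single function. This is a minimal Python approximation of the Tester / AttributeSetter / AttributeSplitter / AttributeRenamer chain, not the actual workspace; the function name is hypothetical, and the exact spacing in the sample strings (two spaces for commercial records, four for residential) is assumed from the rules stated on the slides.

```python
def split_proprietor(proprietor: str) -> dict:
    """Split a raw Proprietor value into Type, Name and Address."""
    # Tester: a leading space marks a commercial record on the slides.
    if proprietor.startswith(" "):
        kind, delimiter = "Commercial", "  "     # split on 2 spaces
    else:
        kind, delimiter = "Residential", "    "  # split on 4 spaces

    # AttributeSplitter: split once on the delimiter, trimming whitespace.
    parts = [p.strip() for p in proprietor.strip().split(delimiter, 1)]
    name = parts[0]
    address = parts[1] if len(parts) > 1 else ""

    # AttributeRenamer: expose the list elements under their final names.
    return {"Type": kind, "Name": name, "Address": address}


commercial = split_proprietor(
    " SOUTH EASTERN POWER NETWORKS PLC  Newington House, "
    "99 Southwark Bridge Street, London SN1 1AB "
)
residential = split_proprietor(
    "JOHN EDMUND SMITH    Big Farm, Preston, Canterbury, Kent "
)
```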
Challenge 2: Split the Address into appropriate parts
“Newington House, 99 Southwark Bridge Street, London SN1 1AB”

• AttributeSplitter: Split on , and trim whitespace
– proprietor.Address{0} Newington House
– proprietor.Address{1} 99 Southwark Bridge Street
– proprietor.Address{2} London SN1 1AB
• ListElementCounter = 3
• AttributeRenamer:
– Address1 = Newington House
– Address2 = 99 Southwark Bridge Street
– Town = London SN1 1AB
• Depending on data, 3 elements may or may not include a postcode!?
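
A minimal Python sketch of the same split-count-rename sequence, for illustration only. The function name is hypothetical, and only the three-element case shown on the slide is mapped to named fields; how other element counts were handled is not covered in the slides, so they are left unmapped here.

```python
def split_address(address: str) -> dict:
    """Split an address on commas and name the parts when there are three."""
    # AttributeSplitter: split on "," and trim whitespace.
    parts = [p.strip() for p in address.split(",")]

    # ListElementCounter: record how many elements the split produced.
    result = {"parts": parts, "element_count": len(parts)}

    # AttributeRenamer: only the 3-element layout from the slide is mapped.
    if len(parts) == 3:
        result["Address1"], result["Address2"], result["Town"] = parts
    return result


three_part = split_address(
    "Newington House, 99 Southwark Bridge Street, London SN1 1AB"
)
four_part = split_address("Big Farm, Preston, Canterbury, Kent")
```

Note that `Town` still carries the postcode at this point, which is the problem Challenge 3 picks up.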
Regex
• Regular Expressions are a language used for:
– Pattern matching
– String searching
– String parsing
– String replacement

/FME/
“We love FME 2013!”
“FME is great!”

/^FME/
“We love FME 2013!”
“FME is great!”

/colou?r/
“FME is colourful!”
“FME is colorful!”

^ at start
$ at end
? optional char.
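
The three patterns on this slide can be tried directly with Python's standard `re` module. This is just a quick check of which sample sentences each pattern matches; the variable names are illustrative.

```python
import re

sentences = ["We love FME 2013!", "FME is great!"]
spellings = ["FME is colourful!", "FME is colorful!"]

# /FME/ matches anywhere in the string.
anywhere = [bool(re.search(r"FME", s)) for s in sentences]

# /^FME/ matches only when the string starts with FME.
at_start = [bool(re.search(r"^FME", s)) for s in sentences]

# /colou?r/ makes the "u" optional, so both spellings match.
either_spelling = [bool(re.search(r"colou?r", s)) for s in spellings]
```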
Challenge 3: Spot the Postcode

• Regex = pattern matching and string manipulation
• http://rubular.com/ - Helps you test!
String: AGI NORTH
Regex: ([A-Z]*)[ ]([A-Z]*)

String: London SN1 1AB
Regular Expression: ^(.*\S)\s+(\S{2,4}\s\S{3})\s*$

• Use StringSearcher = Matched output port provides…
– _matched_parts{0} London
– _matched_parts{1} SN1 1AB
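
The postcode pattern can be checked in Python as well as on rubular.com. The sketch below assumes the intended pattern uses the `\s` (whitespace) and `\S` (non-whitespace) escapes, and returns the two capture groups in the same order as the `_matched_parts` list on the slide; the function name is hypothetical.

```python
import re

# Town text, whitespace, then a 2-4 character block, a space, and a 3 character block.
POSTCODE_AT_END = re.compile(r"^(.*\S)\s+(\S{2,4}\s\S{3})\s*$")


def split_town_postcode(text: str):
    """Return (town, postcode); postcode is None when the pattern fails."""
    match = POSTCODE_AT_END.match(text)
    if match is None:
        return text.strip(), None
    return match.group(1), match.group(2)


with_postcode = split_town_postcode("London SN1 1AB")
without_postcode = split_town_postcode("Kent")
```

The pattern tests only the shape of the trailing text, not whether it is a valid postcode, so a value such as "Isle of Man" would be split into "Isle" and "of Man". Some additional validation is likely to be needed on real data.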
There were lots more challenges on a similar theme…
Other tasks: Structure and Schema
• Remove duplicate records
• Apply common format to names e.g. A A Smith to A.A. Smith
• Resolve addresses listed twice in the same string
– Common where 2 partners live at same address
– “2, High Street, Leicester 2 High Street Leicester”
• Apply Title Case to names & tidy up use of hyphens
• Add extra columns and fixed values for target schema
• Split first names and last name into 2 columns – more Regex!
• Validate the County names against a list of allowed Counties & resolve abbreviations - AttributeValueMapper
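
Three of these tasks are sketched below in Python to show the kind of rule involved. None of this is taken from the actual workspace: the function names, the county list and the abbreviation table are hypothetical, and the name-splitting rule (last token is the surname) is a simplifying assumption.

```python
import re

# Hypothetical lookup tables standing in for an AttributeValueMapper.
ALLOWED_COUNTIES = {"Kent", "Leicestershire"}
COUNTY_ABBREVIATIONS = {"Leics": "Leicestershire"}


def format_initials(name: str) -> str:
    """Turn leading single-letter tokens into dotted initials."""
    tokens = name.split()
    initials = []
    while tokens and len(tokens[0]) == 1 and tokens[0].isalpha():
        initials.append(tokens.pop(0).upper() + ".")
    if initials:
        tokens.insert(0, "".join(initials))
    return " ".join(tokens)


def split_name(full_name: str):
    """Split into (first names, last name), assuming the last token is the surname."""
    match = re.match(r"^(.*\S)\s+(\S+)$", full_name.strip())
    if match is None:
        return "", full_name.strip()
    return match.group(1), match.group(2)


def resolve_county(value: str):
    """Expand known abbreviations; return None for counties not on the list."""
    county = COUNTY_ABBREVIATIONS.get(value.strip(), value.strip())
    return county if county in ALLOWED_COUNTIES else None
```

For example, `format_initials("A A Smith")` gives "A.A. Smith" and `resolve_county("Leics")` gives "Leicestershire", while an unrecognised county returns `None` so it can be routed for manual review.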
Summary

• Saves time
– Before: >1 day of data prep per project
– After: Using FME, a few seconds to do 80% of the work
• Saves money
– No extra fee to the Land Registry to restructure the data
– No unnecessary staff time on mundane formatting tasks
• Increased ROI
– Fisher German already had FME
– Just consider what else you could adapt FME to do…
Thank you

David Eagle
Principal Consultant
david.eagle@1spatial.com
@david_eagle

Bits & Pixels using AI for Good.........
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 

Cleansing land ownership data, an FME use case - David Eagle

  • 1. Cleansing land ownership data, an FME use case David Eagle Principal Consultant david.eagle@1spatial.com @david_eagle
  • 2. Agenda • 1Spatial • Asset management, the case for good data • The data challenge • Technical solution – Regex and Lists • Benefits
  • 3. • Founded in 1969 – Part of the Cambridge Tech Cluster • Headquarters in Cambridge, UK – International offices in Australia, Ireland, Belgium & France
  • 4. • A group of innovative, market leading technology companies:
  • 5. Our Customers • A specialist provider to National Mapping and Charting Agencies, Government, Defence and Utilities
  • 7. Customer Case Study • Fisher German – Multi-discipline firm of Chartered Surveyors, Town Planners, Property Consultants & Specialist Engineers – Management of: • 4000km of high pressure oil pipeline • 2500km fibre network – Creators of: • www.linesearchbeforeudig.co.uk a free to use enquiry tool used by BT, HA, Utilities, Local Gov’t etc • >45 members with protected assets such as:
  • 8. Linear Asset Management • Key role is management and protection of buried and overhead assets: – High pressure oil and gas pipelines – Fibre optics – Overhead power lines • Need to ensure access to assets for inspection, maintenance, upgrade and safety. • Document, maintain and manage details of land ownership in the vicinity of assets.
  • 9. Why is Linear Asset Management Important? • Hunton Hill – Birmingham • Shop - New gas supply connection • 25mm PE connection to a 150mm cast iron main • 1hr job! • Found 300mm steel pipe • Drilled anyway • 3hrs later…
  • 10.
  • 11. A close call… Cut-out showing carrier pipe and epoxy shell repair. Cross section highlighting carrier pipe and epoxy shell repair. • 5mm wall • 0.5mm left • Petrol pressure 100 Bar (1400psi) • Gas main is 100psi
  • 12. The importance of accurate data • Ownership rights – Gas pipe and pond in Dorset • Incorrect grantor was on the mailing list • Land Registry data saves the day
  • 13. The systems Before: • Asset management system – UDB • Desktop GIS – Spatial data managed and edited – No synchronisation and some duplication After: • Database extended to support ‘spatial’ • Single data source served to UDB and desktop • Addition of web client for view only • Data editing via WFS-T
  • 14. Mitigating the risk • New project = New desk exercise • Data is purchased from the Land Registry • Known ownership along alignment is collated • Site visits enhance ownership details – Access points – Difficult access – Tenants – Where is asset exactly? – Dogs!
  • 15. Data to feed the systems • At the start of a project it’s necessary to collate a number of datasets • Project inputs: 1. Existing asset data and records 2. Route Corridor 3. Land Registry Shape and CSV 4. On site inspection data 5. Constraints mapping – Environmental Stewardship, Commonland Register 6. Other External Datasets
  • 16. The process • Manual QA and formatting steps: 1. Processing of the CSVs into the required schema 2. Merge with the cleaned and aggregated geospatial data 3. Import into online management systems • The manual process could take several days and involve 2 or 3 people – Each project can have over 10,500 title deeds & 7,000 grantors • 300 grantors = 2 days of manual effort
  • 17. Land Registry - Attributes • Fundamental but presents some challenges • The deed address details are supplied in a CSV – Title Number: Title reference number – Tenure: Freehold etc – Proprietor: Full name and address – Address: Description of position of address/land • Extra fee to get a ‘slightly’ better structure • It still requires significant manual effort to format
  • 18. Land Registry - Geometry • All geometry (each title polygon) is held in an ESRI Shapefile • Many polygons are split into a number of pieces • The Land Registry holds and exports the data tiled • Features are not aggregated on export • The geometry needs joining to the attributes beforehand using the primary key (PK)
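The aggregation and join described on this slide can be sketched in plain Python. This is only an illustration of the idea, not the FME workspace used in the project: the record layout, the sample title numbers, and the choice of the title number as the join key are all assumptions (the slide only says "PK").

```python
from collections import defaultdict

# Hypothetical stand-ins for the Shapefile pieces and the CSV rows.
# Using the title number as the key is an assumption; the slide only says "PK".
pieces = [
    {"title_no": "K100001", "ring": [(0, 0), (1, 0), (1, 1), (0, 1)]},
    {"title_no": "K100001", "ring": [(1, 0), (2, 0), (2, 1), (1, 1)]},
    {"title_no": "K100002", "ring": [(5, 5), (6, 5), (6, 6), (5, 6)]},
]
attributes = [
    {"title_no": "K100001", "tenure": "Freehold"},
    {"title_no": "K100002", "tenure": "Leasehold"},
]


def aggregate_pieces(pieces):
    """Collect every polygon piece that shares a key into one multi-part entry."""
    grouped = defaultdict(list)
    for piece in pieces:
        grouped[piece["title_no"]].append(piece["ring"])
    return dict(grouped)


def join_attributes(grouped, attributes):
    """Attach the aggregated parts to the attribute row with the same key."""
    joined = []
    for row in attributes:
        parts = grouped.get(row["title_no"], [])
        joined.append({**row, "parts": parts})
    return joined


titles = join_attributes(aggregate_pieces(pieces), attributes)
```

Each entry in `titles` then carries one attribute row together with all of the tiled pieces that belong to it.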
  • 19. What is FME? • Industry standard translation and transformation software • Supports >300 formats • Allows manipulation of many data types:
  • 20. The case for FME • FME is often bought for a specific task. • The value comes when it’s used for tasks not previously considered – Fisher German’s initial impetus was loading their database • They turned to FME to clean and conflate their data later • Building a case for FME wasn’t necessary – Re-use the flexible technology and get a better ROI
  • 21. Automate and re-use • Automate out the mundane with FME • Avoid hours of Excel copy/paste • Allow staff to focus on the analysis • First task, process 6 linear asset project files • 24,000 Land Registry records processed in 30 seconds with FME • Previously this would have taken >6 days. • Subsequent steps clean up the geometry and merge the attributes, but this is a classic FME task!
  • 22. Automate and re-use • Lots of Testers/TestFilters • Popular Transformers: http://goo.gl/4rOGf • Adopt an “if, then, else” approach. • FME 2013 SP1 more capable with ‘Conditional Mapping’ • http://evangelism.safe.com/fmeevangelist113/ • The success of the process relies on two capabilities: 1. Lists 2. Regex
  • 23. Lists • A list is a method by which FME permits a single attribute to hold multiple values • Polygon contains 12 trees: tree.Species{0} oak, tree.Species{1} ash, tree.Species{2} birch, tree.Species{3} oak, tree.Species{4} birch, tree.Species{5} birch
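As a rough analogy, an FME list attribute behaves like an ordered Python list stored on a single feature. The sketch below is plain Python rather than FME, and the helper `flatten_list_attribute` is a hypothetical name; it only shows how the six values shown on the slide map onto the `tree.Species{n}` naming.

```python
from collections import Counter

# The six species values that appear on the slide, in index order.
species = ["oak", "ash", "birch", "oak", "birch", "birch"]


def flatten_list_attribute(name, values):
    """Expand one multi-valued attribute into name{0}, name{1}, ... entries."""
    return {f"{name}{{{index}}}": value for index, value in enumerate(values)}


flat = flatten_list_attribute("tree.Species", species)

# Because the values stay together, they can be summarised in one step.
species_counts = Counter(species)
```

Here `flat["tree.Species{0}"]` holds `oak`, and `species_counts` tallies how often each species occurs in the list.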
  • 24. Challenge 1: Split the ‘Proprietor’ into ‘Name’ & ‘Address’ “ SOUTH EASTERN POWER NETWORKS PLC Newington House, 99 Southwark Bridge Street, London SN1 1AB ” • Tester – Pass: If Proprietor begins with <space> • AttributeSetter: It’s a Commercial business • AttributeSplitter: Split on 2 <spaces> and trim whitespace – proprietor.Proprietor{0} SOUTH EASTERN POWER NETWORKS PLC – proprietor.Proprietor{1} Newington House, 99 Southwark Bridge Street, London SN1 1AB • AttributeRenamer: – Name = SOUTH EASTERN POWER NETWORKS PLC – Address = Newington House, 99 Southwark Bridge Street, London SN1 1AB
  • 25. Challenge 1: Split the ‘Proprietor’ into ‘Name’ & ‘Address’ “JOHN EDMUND SMITH Big Farm, Preston, Canterbury, Kent ” * • Tester – Fail: (Proprietor did NOT begin with <space>) • AttributeSetter: It’s a Residential property • AttributeSplitter: Split on 4 <spaces> and trim whitespace – proprietor.Proprietor{0} JOHN EDMUND SMITH – proprietor.Proprietor{1} Big Farm, Preston, Canterbury, Kent • AttributeRenamer: – Name = JOHN EDMUND SMITH – Address = Big Farm, Preston, Canterbury, Kent
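The two branches of Challenge 1 can be sketched as a single Python function. This is an illustration of the rule stated on the slides, not the FME workspace itself; the function name is hypothetical, and the runs of spaces in the sample strings are reconstructed from the slide text (2 spaces for the commercial record, 4 for the residential one), since the transcript collapses them.

```python
def split_proprietor(proprietor):
    """Split a raw Proprietor value into a type, a name and an address.

    Rule taken from the slides: a leading space marks a commercial record,
    which is split on 2 spaces; anything else is treated as residential and
    split on 4 spaces. Both parts are trimmed afterwards.
    """
    if proprietor.startswith(" "):
        kind, separator = "Commercial", " " * 2
    else:
        kind, separator = "Residential", " " * 4
    parts = [part.strip() for part in proprietor.split(separator, 1)]
    name = parts[0]
    address = parts[1] if len(parts) > 1 else ""
    return {"Type": kind, "Name": name, "Address": address}


# Sample values; the spacing is an assumption rebuilt from the slide rules.
commercial = (
    " SOUTH EASTERN POWER NETWORKS PLC"
    + " " * 2
    + "Newington House, 99 Southwark Bridge Street, London SN1 1AB "
)
residential = "JOHN EDMUND SMITH" + " " * 4 + "Big Farm, Preston, Canterbury, Kent "

commercial_result = split_proprietor(commercial)
residential_result = split_proprietor(residential)
```

Splitting only on the first separator keeps the whole address in one piece even if it happens to contain another run of spaces.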
  • 26. Challenge 2: Split the Address into appropriate parts “Newington House, 99 Southwark Bridge Street, London SN1 1AB” • AttributeSplitter: Split on , and trim whitespace – proprietor.Address{0} Newington House – proprietor.Address{1} 99 Southwark Bridge Street – proprietor.Address{2} London SN1 1AB • ListElementCounter = 3 • AttributeRenamer: – Address1 = Newington House – Address2 = 99 Southwark Bridge Street – Town = London SN1 1AB • Depending on data, 3 elements may or may not include a postcode!?
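A minimal Python sketch of the same three steps (split on commas, count the elements, rename) is shown below. The function names are hypothetical, and returning `None` for any element count other than 3 is an assumption made for the example; the slide does not say how other counts were handled.

```python
def split_address(address):
    """Split on commas and trim whitespace, like the AttributeSplitter step."""
    return [part.strip() for part in address.split(",")]


def map_three_part_address(parts):
    """Rename a 3-element list to Address1, Address2 and Town.

    Other element counts are left for separate handling (an assumption here).
    """
    if len(parts) != 3:
        return None
    return {"Address1": parts[0], "Address2": parts[1], "Town": parts[2]}


parts = split_address("Newington House, 99 Southwark Bridge Street, London SN1 1AB")
element_count = len(parts)  # plays the role of the ListElementCounter
mapped = map_three_part_address(parts)
```

In this sample the `Town` value still carries the postcode, which is the problem the slide's closing remark points at.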
  • 27. Regex • Regular Expressions are a language used for: – Pattern matching – String searching – String parsing – String replacement • /FME/ tested against “We love FME 2013!” and “FME is great!” • /^FME/ tested against “We love FME 2013!” and “FME is great!” • /colou?r/ tested against “FME is colourful!” and “FME is colorful!” • ^ at start, $ at end, ? optional char.
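The three patterns on this slide can be tried out with Python's standard `re` module. This is just a small sketch using the slide's sample sentences; FME's own regex handling may differ in detail.

```python
import re

fme_samples = ["We love FME 2013!", "FME is great!"]
colour_samples = ["FME is colourful!", "FME is colorful!"]

# /FME/ matches anywhere in the string.
matches_anywhere = [bool(re.search(r"FME", text)) for text in fme_samples]

# /^FME/ only matches when the string starts with FME.
matches_at_start = [bool(re.search(r"^FME", text)) for text in fme_samples]

# /colou?r/ treats the "u" as optional, so both spellings match.
matches_colour = [bool(re.search(r"colou?r", text)) for text in colour_samples]
```

With these inputs `matches_anywhere` is `[True, True]`, `matches_at_start` is `[False, True]`, and `matches_colour` is `[True, True]`.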
  • 28. Challenge 3: Spot the Postcode • Regex = pattern matching and string manipulation • http://rubular.com/ - Helps you test! • String: AGI NORTH Regex: ([A-Z]*)[ ]([A-Z]*) • String: London SN1 1AB Regular Expression: ^(.*\S)\s+(\S{2,4}\s\S{3})\s*$ • Use StringSearcher = Matched output port provides… – _matched_parts{0} London – _matched_parts{1} SN1 1AB
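Both expressions from this slide can be checked with Python's `re` module. The sketch assumes the postcode pattern is `^(.*\S)\s+(\S{2,4}\s\S{3})\s*$`, that is, with the backslashes in front of `S` and `s` that the transcript appears to have dropped. The function name is hypothetical, and returning `None` when nothing matches is a choice made for this example.

```python
import re

# Assumed form of the slide's expression, with the backslashes restored.
POSTCODE_TAIL = re.compile(r"^(.*\S)\s+(\S{2,4}\s\S{3})\s*$")

# The simpler two-word example from the same slide.
TWO_WORDS = re.compile(r"([A-Z]*)[ ]([A-Z]*)")


def split_town_postcode(text):
    """Return (town, postcode); postcode is None when the pattern does not match."""
    match = POSTCODE_TAIL.match(text)
    if match is None:
        return text.strip(), None
    return match.group(1), match.group(2)


town, postcode = split_town_postcode("London SN1 1AB")
no_postcode = split_town_postcode("Canterbury")
agi_parts = TWO_WORDS.search("AGI NORTH").groups()
```

The pattern only checks the shape of the trailing text (2 to 4 characters, a space, then 3 characters), so a value that ends in two short words could be mistaken for a postcode and would need a stricter rule.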
  • 29. There were lots more challenges on a similar theme…
  • 30. Other tasks: Structure and Schema • Remove duplicate records • Apply common format to names e.g. A A Smith to A.A. Smith • Resolve addresses listed twice in the same string – Common where 2 partners live at same address – “2, High Street, Leicester 2 High Street Leicester” • Apply Title Case to names & tidy up use of hyphens • Add extra columns and fixed values for target schema • Split first names and last name into 2 columns – more Regex! • Validate the County names against a list of allowed Counties & resolve abbreviations - AttributeValueMapper
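A few of these clean-up tasks can be sketched in Python to show the kind of rule involved. None of this is taken from the project's FME workspace: the function names, the exact rules, and the small county lookup table are assumptions made for illustration.

```python
import re


def format_initials(name):
    """Turn leading single-letter tokens into dotted initials: A A Smith -> A.A. Smith."""
    tokens = name.split()
    initials = []
    while len(tokens) > 1 and len(tokens[0]) == 1 and tokens[0].isalpha():
        initials.append(tokens.pop(0).upper() + ".")
    if not initials:
        return " ".join(tokens)
    return "".join(initials) + " " + " ".join(tokens)


def tidy_name(name):
    """Apply Title Case, collapse repeated spaces and remove spaces around hyphens."""
    name = re.sub(r"\s*-\s*", "-", name)
    return " ".join(name.split()).title()


def collapse_repeated_address(text):
    """If the same address appears twice in one string, keep a single copy.

    Commas are ignored for the comparison, so the result has none.
    """
    tokens = text.replace(",", " ").split()
    half = len(tokens) // 2
    if half and len(tokens) % 2 == 0 and tokens[:half] == tokens[half:]:
        return " ".join(tokens[:half])
    return text


# Hypothetical lookup in the spirit of an AttributeValueMapper.
COUNTY_ALIASES = {
    "leics": "Leicestershire",
    "leicestershire": "Leicestershire",
    "kent": "Kent",
}


def resolve_county(value):
    """Return the allowed county name, or None when the value is not recognised."""
    return COUNTY_ALIASES.get(value.strip().rstrip(".").lower())


initials_example = format_initials("A A Smith")
name_example = tidy_name("JOHN  SMITH - JONES")
address_example = collapse_repeated_address(
    "2, High Street, Leicester 2 High Street Leicester"
)
county_example = resolve_county("Leics.")
```

Python's `str.title()` is a blunt tool for names (it would lower-case the "D" in "McDonald", for instance), so a production rule set would likely need exceptions.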
  • 31. Summary • Saves time – Before: >1 day of data prep per project – After: Using FME, a few seconds to do 80% of the work • Saves money – No extra fee to the Land Registry to restructure the data – No unnecessary staff time on mundane formatting tasks • Increased ROI – Fisher German already had FME • Just consider what else you could adapt FME to do…
  • 32. Thank you David Eagle Principal Consultant david.eagle@1spatial.com @david_eagle