Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Ons households july 17 addressing ac mj
1. Addresses Vs Households - RSS July ‘17
Building an address index for
census and beyond
Alistair Calder
Head of Addressing
Data Architecture - ONS
Mike James
Head of Address Research
Data Architecture - ONS
2. Addresses
• ONS Requirements – and why it has now become easy
• Issues – and why it is still really hard
• Addressing in Government - joining up
• Addresses and Admin data – building quality
• Demo
(& an annexe)
5. The requirement (tbc)
• A ‘complete’ household frame
>99% of household spaces ( addresses)
• Minimal over-coverage
duplicates / commercial / demolished etc < 2 or 3% ?
• A brilliant (integrated) communal frame
• Residential, communal & business (& non postal)
• Up to date, correctly located etc etc …. And more
11. The challenge ..... Why it’s hard this time
• We have an excellent starting point but addresses are
complicated and change a lot. There will be error & error
clusters itself in the areas we care about the most – Very
difficult to check quality
• Extracting the right ones is difficult. Small errors can be
significant – and cause trauma
• Communals are important and particularly challenging
• We plan to do MUCH more with addresses than post-out – huge
opportunity but attribute thinking is new
• Addresses are complex so matching is really hard
20. The challenge ..... Why it’s hard this time
• We have an excellent starting point but addresses are
complicated and change a lot. There will be error & error
clusters itself in the areas we care about the most – Very
difficult to check quality
• Extracting the right ones is difficult. Small errors can be
significant – and cause trauma
• Communals are important and particularly challenging
• We plan to do MUCH more with the register than post-out –
huge opportunity but attribute thinking is new
• Addresses are complex so matching is really hard
21.
22.
23.
24.
25. The challenge ..... Why it’s hard this time
• We have an excellent starting point but addresses are
complicated and change a lot. There will be error & error
clusters itself in the areas we care about the most – Very
difficult to check quality
• Extracting the right ones is difficult. Small errors can be
significant – and cause trauma
• Communals are important and particularly challenging
• We plan to do MUCH more with addresses than post-out – huge
opportunity but attribute thinking is new
• Addresses are complex so matching is really hard
27. The challenge ..... Why it’s hard this time
• We have an excellent starting point but addresses are
complicated and change a lot. There will be error & error
clusters itself in the areas we care about the most – Very
difficult to check quality
• Extracting the right ones is difficult. Small errors can be
significant – and cause trauma
• Communals are important and particularly challenging
• We plan to do MUCH more with addresses than post-out – huge
opportunity but attribute thinking is new
• Addresses are complex so matching is really hard
28. A probabilistic address frame
Probability of
• Existence of address
• type - HH/B/CE
• HH Size / structure
• Change / churn
• Hard to countness / category
• (multivariate >> categorisation
• Eg possible holiday home, carehome, student
accommodation
Address
Register
HH
Structure
2011
Census
HH structure,
churn, names
Activity data
Energy, utilities,
broadband, health,
house sales
Admin data
HH structure, churn,
names, house
prices, phone
numbers
Other
Shape / pattern
recognition
Survey paradata
Geoplace
And other CE sources
CE
New definition / schema
Inform field planning / targetting
Intelligent stratification
Prioritise follow up (address level)
Inform estimation & modelling
B
Business Reg
Business structure,
type, churn
Conceptually – all subject to ethical and privacy discussion !
Potentially
29. The challenge ..... Why it’s hard this time
• We have an excellent starting point but addresses are
complicated and change a lot. There will be error & error
clusters itself in the areas we care about the most – Very
difficult to check quality
• Extracting the right ones is difficult. Small errors can be
significant – and cause trauma
• Communals are important and particularly challenging
• We plan to do MUCH more with addresses than post-out – huge
opportunity but attribute thinking is new
• Addresses are complex so matching is really hard
•
33. ONS or
citizen
servicesingle
address UPRN
10 High St PO15 5RR 1234567891011
batch of
addresses
addresses
UPRNs
batch
match
Addressbase load
UPRNs
addresses
classifications
Feedback
to source
(improving quality)
api
api
ONS Data Library
Address
Index
Business
Index
Address Matching - Beta
34. correct match rate
virtually zero false positives
balance between automatic & clerical
flexibility of match tuned but not limited
fast
scalable
accessible via api
non proprietary code -> open
Searching and matching – what we want
35. Avenue Cars Limted
1st Floor
St. William of York House
22-24 First Road,
Street, Somerset
ZE1ODW
synonyms
thesaurus
aliases
lookups
Parsing
Rules based +
Machine learning /
Natural language
Source
input address
address
components
how we are going to do it
36. Informed
decision –
clerical
intervention
HOPPER SCORE
Confidence rank
of options
ES
Fuzzy matching
Distance measures
synonyms
thesaurus
aliases
lookups
Parsing
Rules based +
Machine learning /
Natural language
Source
input address
address
components
AddressBase
hierarchies
ESindexes
39. Alpha – Address
Index build
2015 2016 2017 2018 2019 2020 2021 2022 2023April July October April July October
On-line Survey
transformation
Admin
Data
Admin Data –
Processing
Platform
Alpha
EDC – eQ Alpha EDC – eQ Beta
EDC – Response and Respondent Management Beta
Admin Data – Processing Platform Beta
EDC – Service enhancement
Admin/Survey
Integration
Discovery
Admin/Survey
Integration –
Alpha
Admin/Survey Integration – Beta
Alpha - Business
Index build
Beta - Business index
build
Beta - Address index
build
Registers
2019
Census
Rehearsal
Admin
Data for
Census
Census
Register / Index Platform for ONS
Live services
Decision to
proceed to
beta Develop data migration and data loader for new
BIS data source
IDBR Service Migration
IDBR Migration
Roadmap
Business Statistics
Decision(s) to go
live
2021
Census
Life Events, Social Survey etc etc
40. The Address Register in an Admin Data Census
• What is the role of the Address Register
• Address Register Quality
• Address Matching Demo (what could possibly go
wrong…?)
A perfect address register won't overcome all the
issues of moving from HH to address definition
But it sure would be helpful…
41. The Address Register in an Admin Data Census
People on Admin Data
Address Register
44. Matching Addresses to the Address Register
People on Admin Data
Address Register
Under Coverage
Over CoverageAddress Matching
45. Citizen Address Search
Citizens Identifying Their Address in Admin Data
Address Register
Under Coverage
Over Coverage
I Live There
46. Strategy for Delivering Quality
• Using AddressBase Premium (ABP)
– 2.2M more residential addresses than PAF
• Close partnership with Geoplace
• Lots of LA engagement
• Supporting the use of ABP in Government – embed Unique Property
Reference Number (UPRN) throughout government data
• Understanding types of error, their causes and
impacts
– Over coverage (duplication, misclassifications)
• Non-existent annexes (included by some idiot…..)
– Under coverage (missed HMO, missing new builds,
misclassifications)
– Single instance or clustered?
47. Methods & Evidence of Quality
• Over-Coverage
– Social survey outcomes (does the sample include non-
residential addresses?)
• Using ABP to clean PAF removes majority of non-residential addresses
– Analysing Census tests
• Number and cause on non-deliveries (non-residential, not yet built)
– Within 1% error target
– Can improve through GeoPlace/LA collaborative working
• Under-Coverage
– Admin data – are there addresses we can’t find on ABP?
• Sample of 100K – only 2 addresses we can’t find
– Social surveys – are there addresses we might misclassify as
non-residential?
• Sample of 120K – only 135 address we might misclassify (and these
are uncertain)
48. Communal Establishments
• Really important to Census
– Care homes, university halls, sheltered housing, etc
– Enumeration challenges
– Impact on statistics
• Really important to Admin Data Census!
– Working with ADC to understand their requirements
• Our approach:
– Create CE QA Pack for each CE type
• Definitions, data sources, risks, mitigations, LA risk analysis
– Provide a framework for identifying, monitoring and
improving CE data
50. Summary
• AddressBase at the core – need to confirm & ensure quality
• Linked and integrated indexes
• addresses, communals, businesses, attribution
• No separate national address register (except temp / operational)
• it is all about improving the national source
• Increased use of source >>> linking >>> feedback to improve the national hub
• Local Authority liaison critical to the plan
• Share approach and lists much earlier than before
• – but coding of AddressBase & LLPGs the key
• ONS highly supportive of openness / open data
– but not dependant upon it
• Matching Service Talking to GDS, OS, HMRC, BEIS, DWP , Wales … etc.
• Love to share and talk about addresses and matching
addresses@ons.gov.uk
51.
52. Questions?
(& come and talk to us)
alistair.calder@ons.gov.uk; @alistaircalder_
michael.james@ons.gov.uk
addresses@ons.gov.uk