Although definitions have changed, the concept of the household has always been fundamental to the census and social statistics in the UK. The increasing move to the use of administrative data and other sources brings obvious opportunities, but also challenges in defining households. This event will provide an opportunity to discuss the issues and help inform ONS's future research in this area. It will be of interest to anyone who uses or cares about household statistics or is interested in the future of the statistical system.
Slides that were presented at ONS’ household income statistics user event in October. The slides cover
Developments in household income stats (Dominic Webber, ONS)
Administrative data research (Matthew Greenaway, ONS)
Methodological choices in the analysis of the Effects of Taxes and Benefits – (Tom Waters, IFS)
Future research (Dominic Webber, ONS)
This presentation covers the key question: Why dashboards? Local authorities and other public bodies have largely ended publishing reports and now produce dashboards. What are the factors that have contributed to this change?
This is the first presentation from our Workshop on 21 September 2023 on Dashboards, APIs and PowerBI.
ONS Local has been established by the Office for National Statistics (ONS) to support evidence-based decision-making at the local level. We aim to host insightful events that connect our users with exciting developments happening in subnational statistics and analysis at the ONS and across other organisations.
In April 2022, as the impact of increases in the Cost of Living really came to the forefront, Public Health & Communities, Suffolk County Council published a Cost of Living profile as part of the Joint Strategic Needs Assessment.
Alongside a written Cost of Living report ‘Making ends meet: The cost of living in Suffolk’, an interactive dashboard was also created using Power BI. In addition to internal data flows, publicly available data from sources such as the ONS have been used to provide a rich picture of the current situation for the local community.
The dashboard was developed in order to:
• Provide up to date data and information on the Cost of Living for Suffolk County Council, partner organisations, and members of the public.
• Deliver an interactive tool to allow users to focus on areas most relevant to them.
• Demonstrate that, while increases in the cost of living affect everyone, impact will be greatest for those who are already under financial pressure, exacerbating inequalities.
• Provide a source of actionable insight to support the system with the evidence base needed to support project development, drive change and really make a difference in the community.
Features of the dashboard:
• Place-focused - published at smaller geographies where possible
• Collaborative - Includes local data from across the system such as data shared by Citizens Advice and other system partners.
• Automated - Most data sources have automated connections, meaning there is little manual intervention required.
• Self-Service - Making the report publicly available puts data at the fingertips of colleagues, system partners and members of the public.
• Live - The dashboard is a living report which is frequently updated.
This session will:
• Provide a demonstration of Suffolk County Council’s Cost of Living dashboard
• Give an overview of data sources
• Explore opportunities for automation using Power BI
• Discuss how the data dashboard is used locally
This event is open to all; however, we anticipate it will be of most interest to anyone working on cost of living dashboards at the local level.
If you have any questions, please contact ons.local@ons.gov.uk.
ONS Local has been established by the Office for National Statistics (ONS) to promote evidence-based decision-making at the local level. We aim to host insightful workshops which will provide practical, technical support to help users make the most of ONS data. The Cross-Government Data Science Community brings together data scientists and analysts to build data science capability across the UK governments and public sector.
We are delighted to welcome you to our inaugural Workshop in our new series, entitled: 'How to use APIs'. The session will cover what Application Programming Interfaces (APIs) are, the advantages in using them and a practical demonstration of how they can be used. The journey of two Local Authority analysts as they begin using APIs in place of manual processes will be showcased to the audience. The session will conclude by explaining the plan for the forthcoming series of Workshops that will begin in September and introducing the Slack channel that ONS Local and Cross-Government DS community will be using to support users' technical questions going forward.
This event is open to all; however, we anticipate it will be of most interest to anyone working at a local level on creating data dashboards for internal or external use.
If you have any questions, please contact ons.local@ons.gov.uk.
ONS Local has been established by the Office for National Statistics (ONS) to support evidence-based decision-making at the local level. We aim to host insightful events that connect our users with exciting developments happening in subnational statistics and analysis at the ONS and across other organisations.
From 1 August 2019, the Secretary of State for Education delegated responsibility for the commissioning, delivery and management of London’s Adult Education Budget (AEB) to the Mayor of London. The AEB helps Londoners to get the skills they need to progress both in life and work. The overarching aim of London’s AEB is to make adult education in London even more accessible, impactful and locally relevant.
In this presentation, the Greater London Authority will be going through the results of the pioneering 2021/22 London Learner Survey (LLS). The survey’s objective is to gain insight into the outcomes of learners to inform and improve policy. The LLS consists of two linked surveys of learners who participated in GLA-funded Adult Education Budget (AEB) learning in the academic year 2021/22.
In the LLS, learners are surveyed before and 5-7 months after completing their course, to estimate the economic and social changes they experience following an AEB course.
In particular, the presentation will show the economic impact broken down by:
• Progression into employment
• Progression within work
• Progression into further learning
The social impact will be explored by looking at changes in:
• Health and wellbeing
• Improved self-efficacy
• Improved social integration
• Participation in volunteering
The presentation will also cover how outcomes vary by funding type, breaking down the results by Community Learning and Adult Skills.
This event is open to all; however, we anticipate it will be of most interest to anyone working at a local level on skills, education and employment.
If you have any questions, please contact ons.local@ons.gov.uk.
2. What do we mean by an Admin Data
Census?
• Aiming to replicate as many census outputs as possible
using admin data (and surveys) by 2021 to compare with
2021 Census
Recommendation in 2023
• Three key types of Census outputs:
• Size of population
• Number and structure of households
• Characteristics of housing and the population
• Lot of potential with admin data alone but it will not provide
the complete solution.
• Need access to range of admin data and combine with
surveys. Likely to need two new surveys:
• Annual 1% coverage survey to help measure size of
population and households
• Annual characteristics survey – size tbc
3. Census, population and migration
statistics system – the future
Current model – Census every ten years
• Lots of detail every ten years, down to small-
areas
• Less detail at regional and local authority levels
in the interim
Future model – Admin Data Census
Opportunities – more frequent statistics,
longitudinal analysis, new outputs
better statistics, better decisions
• For example, use of mobile phone data to produce more
frequent travel-to-work statistics, alternative population
bases (daytime populations)
4. How will we know if we’re ready to
move to an ADC?
• Research outputs every Autumn (first: 22
October 2015)
• expanding the accuracy and/or breadth and/or
detail each year
• Progress made on size of population, number of
households, income
• Assessment every Spring (first: 16 May 2016)
• Using five high level criteria
• where we are now
• where we expect to be by 2023
5. Impact of moving from households to
addresses?
• Analysed 2011 Census data
• 1.57% of households were at an address (UPRN) containing more than
one household
[Chart: More than one household in each UPRN by composition (%). Y-axis: % of households, 0.0 to 3.0. Categories: One person household; One family only; One family only: With dependent children/No children; One family only: All children non-dependent; Other household types: With dependent children; Other household types: all others.]
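The headline figure on this slide is a share of households that sit at a multi-household UPRN. A minimal sketch of that calculation follows, assuming one record per household carrying its UPRN; the function name and the toy data are hypothetical and are not taken from the ONS analysis.

```python
from collections import Counter


def share_in_multi_household_uprns(household_uprns):
    """Percentage of households whose UPRN holds more than one household.

    `household_uprns` is a list with one UPRN per household (an assumed
    input layout, not the actual census file format).
    """
    if not household_uprns:
        return 0.0
    households_per_uprn = Counter(household_uprns)
    # Count every household that shares its UPRN with at least one other.
    shared = sum(n for n in households_per_uprn.values() if n > 1)
    return 100.0 * shared / len(household_uprns)


# Hypothetical toy data: five households, two of them at UPRN "100".
toy = ["100", "100", "200", "300", "400"]
print(share_in_multi_household_uprns(toy))  # 40.0
```

The same grouping could be broken down by household composition to reproduce the categories in the chart, given a composition field on each record.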
6. House rules
We want to have lots of discussion today and
hear your views.
So that we can get the best out of the day:
• Please be constructive
• Before you speak, please tell us who you are
and where you’re from
• Sli.do – how to use it
7. Agenda
Time | Session | Lead
11.00-11.10 | Welcome and introduction | Becky Tinsley (ONS)
11.10-11.50 | Addresses – progress and plans | Alistair Calder/Mike James (ONS)
11.50-12.10 | Household definitions | Dave Martin (Southampton Uni)
12.10-12.25 | Questions and discussion | Alistair Calder (ONS)
12.25-12.30 | Intro to afternoon | Becky Tinsley/Alistair Calder (ONS)
12.30-13.30 | LUNCH |
13.30-14.30 | User needs – who needs household statistics? | Rachel Leeser (GLA)
14.30-15.15 | ONS Admin Data Census research progress & future plans; Royal Mail | Claire Pereira, Marcus Lewin (ONS); Tony Lamb (Royal Mail)
15.15-15.45 | Panel session | Sarah Henry, Becky Tinsley, Dave Martin, Rachel Leeser
15.45-16.00 | Wrap up, next steps | Becky/Alistair (ONS)
8. Addresses Vs Households - RSS July ‘17
Building an address index for
census and beyond
Alistair Calder
Head of Addressing
Data Architecture - ONS
Mike James
Head of Address Research
Data Architecture - ONS
9. Addresses
• ONS Requirements – and why it has now become easy
• Issues – and why it is still really hard
• Addressing in Government - joining up
• Addresses and Admin data – building quality
• Demo
(& an annexe)
12. The requirement (tbc)
• A ‘complete’ household frame
>99% of household spaces ( addresses)
• Minimal over-coverage
duplicates / commercial / demolished etc < 2 or 3% ?
• A brilliant (integrated) communal frame
• Residential, communal & business (& non postal)
• Up to date, correctly located etc etc …. And more
18. The challenge ..... Why it’s hard this time
• We have an excellent starting point but addresses are
complicated and change a lot. There will be error, and error
clusters in the areas we care about the most, so quality is very
difficult to check
• Extracting the right ones is difficult. Small errors can be
significant – and cause trauma
• Communals are important and particularly challenging
• We plan to do MUCH more with addresses than post-out – huge
opportunity but attribute thinking is new
• Addresses are complex so matching is really hard
27. The challenge ..... Why it’s hard this time
• We have an excellent starting point but addresses are
complicated and change a lot. There will be error, and error
clusters in the areas we care about the most, so quality is very
difficult to check
• Extracting the right ones is difficult. Small errors can be
significant – and cause trauma
• Communals are important and particularly challenging
• We plan to do MUCH more with the register than post-out –
huge opportunity but attribute thinking is new
• Addresses are complex so matching is really hard
35. A probabilistic address frame
Probability of
• Existence of address
• type - HH/B/CE
• HH Size / structure
• Change / churn
• Hard-to-count-ness / category
• (multivariate >> categorisation)
• E.g. possible holiday home, care home, student
accommodation
[Diagram: sources feeding the Address Register and its HH Structure, and potential uses]
• 2011 Census: HH structure, churn, names
• Activity data: energy, utilities, broadband, health, house sales
• Admin data: HH structure, churn, names, house prices, phone numbers
• Other: shape / pattern recognition; survey paradata
• CE: Geoplace and other CE sources; new definition / schema
• B: Business Reg (business structure, type, churn)
• Potentially: inform field planning / targeting; intelligent stratification; prioritise follow up (address level); inform estimation & modelling
Conceptually – all subject to ethical and privacy discussion!
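40. Address Matching - Beta
[Diagram: an ONS or citizen service sends a single address to the api and receives a UPRN (example: 10 High St PO15 5RR gives 1234567891011). A batch of addresses goes through batch match via the api and comes back as addresses with UPRNs. Both routes use the Address Index and Business Index held in the ONS Data Library, which is fed by the AddressBase load (UPRNs, addresses, classifications). Feedback goes to source (improving quality).]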
41. Searching and matching – what we want
• correct match rate
• virtually zero false positives
• balance between automatic & clerical
• flexibility of match tuned but not limited
• fast
• scalable
• accessible via api
• non proprietary code -> open
42. How we are going to do it
Source input address:
Avenue Cars Limted
1st Floor
St. William of York House
22-24 First Road,
Street, Somerset
ZE1ODW
Parsing (rules based + machine learning / natural language), supported by synonyms, thesaurus, aliases and lookups, turns the input address into address components.
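To illustrate the rules-based side of the parsing step, here is a minimal sketch, assuming a tiny hand-made synonym table and a simplified postcode pattern. Both are hypothetical stand-ins for illustration; this is not the ONS parser and the pattern does not cover every valid UK postcode format.

```python
import re

# Hypothetical synonym / alias lookup; a real thesaurus would be far larger.
SYNONYMS = {"RD": "ROAD", "LTD": "LIMITED", "LIMTED": "LIMITED"}

# Simplified postcode pattern (an assumption, not a full specification).
POSTCODE = re.compile(r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b")


def normalise_tokens(line):
    """Upper-case a line, split it into tokens and apply the synonym lookup."""
    tokens = re.findall(r"[A-Z0-9]+", line.upper())
    return [SYNONYMS.get(token, token) for token in tokens]


def extract_postcode(text):
    """Return the first postcode-shaped string in `text`, or None."""
    match = POSTCODE.search(text.upper())
    return f"{match.group(1)} {match.group(2)}" if match else None


print(normalise_tokens("Avenue Cars Limted"))
print(extract_postcode("10 High St PO15 5RR"))
print(extract_postcode("Street, Somerset ZE1ODW"))
```

The last call returns None because the slide's sample postcode contains the letter O where a digit is expected, and "St." is left untouched because it could mean Saint or Street. Cases like these are where fixed rules run out and the machine learning / natural language side of the approach would be needed.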
43. [Diagram: the source input address is parsed (rules based + machine learning / natural language, supported by synonyms, thesaurus, aliases and lookups) into address components. These are matched against ES indexes built from AddressBase hierarchies, using fuzzy matching and distance measures. The output is a HOPPER SCORE, a confidence rank of options, which supports an informed decision or clerical intervention.]
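The idea of ranking candidate matches by confidence and routing weak matches to clerical review can be sketched as below. This uses Python's standard-library `difflib.SequenceMatcher` as a stand-in similarity measure; it is not the hopper score or the index-based matching shown on the slide, and the 0.9 threshold and candidate list are hypothetical.

```python
from difflib import SequenceMatcher


def rank_candidates(query, candidates, auto_threshold=0.9):
    """Rank candidate addresses by string similarity to the query.

    Returns ("auto" or "clerical", [(score, candidate), ...]) with the
    best-scoring candidate first. The threshold is an assumed tuning knob.
    """
    scored = sorted(
        ((SequenceMatcher(None, query.upper(), c.upper()).ratio(), c)
         for c in candidates),
        reverse=True,
    )
    best_score = scored[0][0]
    decision = "auto" if best_score >= auto_threshold else "clerical"
    return decision, scored


# Hypothetical register entries to match against.
register = ["10 HIGH STREET", "10 HIGH ROAD", "1 HIGH STREET"]
decision, ranked = rank_candidates("10 High St", register)
print(decision, ranked[0][1])
```

Raising or lowering `auto_threshold` shifts the balance between automatic and clerical matching, which is the trade-off listed among the requirements on the earlier slide.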
46. Roadmap
[Gantt chart covering 2015 to 2023, with April, July and October markers in the early years. Items shown:
• On-line Survey transformation: EDC – eQ Alpha; EDC – eQ Beta; EDC – Response and Respondent Management Beta; EDC – Service enhancement
• Admin Data: Admin Data – Processing Platform Alpha; Admin Data – Processing Platform Beta
• Admin/Survey Integration: Discovery; Alpha; Beta
• Registers: Alpha – Address Index build; Beta – Address index build; Alpha – Business Index build; Beta – Business index build; Register / Index Platform for ONS; Live services
• IDBR Migration: Decision to proceed to beta; Develop data migration and data loader for new BIS data source; IDBR Service Migration
• Business Statistics; Life Events, Social Survey etc.
• Milestones: Decision(s) to go live; 2019 Census Rehearsal; 2021 Census; Admin Data for Census]
47. The Address Register in an Admin Data Census
• What is the role of the Address Register
• Address Register Quality
• Address Matching Demo (what could possibly go
wrong…?)
A perfect address register won't overcome all the
issues of moving from HH to address definition
But it sure would be helpful…
48. The Address Register in an Admin Data Census
People on Admin Data
Address Register
51. Matching Addresses to the Address Register
People on Admin Data
Address Register
Under Coverage
Over Coverage
Address Matching
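The overlap pictured here can be read as set arithmetic between the addresses on the register and the addresses where admin data places people. A minimal sketch follows, assuming both sources have already been reduced to comparable address keys; the keys and the function name are hypothetical.

```python
def coverage_report(register_keys, admin_keys):
    """Compare register addresses with addresses seen in admin data.

    Both inputs are iterables of comparable address keys (an assumption;
    in practice the admin addresses must first be matched to the register).
    """
    register = set(register_keys)
    admin = set(admin_keys)
    return {
        "matched": register & admin,
        # Seen in admin data but missing from the register.
        "possible_under_coverage": admin - register,
        # On the register with nobody found; may be vacant rather than wrong.
        "possible_over_coverage": register - admin,
    }


report = coverage_report(["A1", "A2", "A3"], ["A2", "A3", "A4"])
print(sorted(report["possible_under_coverage"]))  # ['A4']
print(sorted(report["possible_over_coverage"]))   # ['A1']
```

Both difference sets are only candidates for error: an unmatched admin address may reflect a failed match rather than a missing register entry.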
52. Citizen Address Search
Citizens Identifying Their Address in Admin Data
Address Register
Under Coverage
Over Coverage
I Live There
53. Strategy for Delivering Quality
• Using AddressBase Premium (ABP)
– 2.2M more residential addresses than PAF
• Close partnership with Geoplace
• Lots of LA engagement
• Supporting the use of ABP in Government – embed Unique Property
Reference Number (UPRN) throughout government data
• Understanding types of error, their causes and
impacts
– Over coverage (duplication, misclassifications)
• Non-existent annexes (included by some idiot…..)
– Under coverage (missed HMO, missing new builds,
misclassifications)
– Single instance or clustered?
54. Methods & Evidence of Quality
• Over-Coverage
– Social survey outcomes (does the sample include non-
residential addresses?)
• Using ABP to clean PAF removes majority of non-residential addresses
– Analysing Census tests
• Number and cause of non-deliveries (non-residential, not yet built)
– Within 1% error target
– Can improve through GeoPlace/LA collaborative working
• Under-Coverage
– Admin data – are there addresses we can’t find on ABP?
• Sample of 100K – only 2 addresses we can’t find
– Social surveys – are there addresses we might misclassify as
non-residential?
• Sample of 120K – only 135 addresses we might misclassify (and these
are uncertain)
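Expressed as rates, the two under-coverage samples above work out as follows. The sample sizes and counts are those quoted on the slide, with 100K and 120K read as 100,000 and 120,000; the helper function is just for illustration.

```python
def rate_percent(count, sample_size):
    """Return count as a percentage of sample_size."""
    return 100.0 * count / sample_size


not_found = rate_percent(2, 100_000)        # addresses not found on ABP
misclassified = rate_percent(135, 120_000)  # possible misclassifications

print(f"{not_found:.4f}%")      # 0.0020%
print(f"{misclassified:.4f}%")  # 0.1125%
```

Both rates sit well below the 1% error target mentioned under over-coverage, although that target is stated on the slide for non-deliveries rather than for these checks.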
55. Communal Establishments
• Really important to Census
– Care homes, university halls, sheltered housing, etc
– Enumeration challenges
– Impact on statistics
• Really important to Admin Data Census!
– Working with ADC to understand their requirements
• Our approach:
– Create CE QA Pack for each CE type
• Definitions, data sources, risks, mitigations, LA risk analysis
– Provide a framework for identifying, monitoring and
improving CE data
57. Summary
• AddressBase at the core – need to confirm & ensure quality
• Linked and integrated indexes
• addresses, communals, businesses, attribution
• No separate national address register (except temp / operational)
• it is all about improving the national source
• Increased use of source >>> linking >>> feedback to improve the national hub
• Local Authority liaison critical to the plan
• Share approach and lists much earlier than before
• – but coding of AddressBase & LLPGs the key
• ONS highly supportive of openness / open data
– but not dependent upon it
• Matching Service: talking to GDS, OS, HMRC, BEIS, DWP, Wales … etc.
• Love to share and talk about addresses and matching
addresses@ons.gov.uk
69. Questions?
(& come and talk to us)
alistair.calder@ons.gov.uk; @alistaircalder_
michael.james@ons.gov.uk
addresses@ons.gov.uk
70. Are you the householder?
David Martin
Deputy Director, UK Data Service
University of Southampton
Addresses vs households: who needs
household statistics?
13 July 2017
71. Are you the householder?
• ONS advice “Before you start”
• What is the census household definition?
• The importance of having something else in common
• Question time: households, spaces, dwellings and
related matters
• Household questions, derived variables and what we use
them for
• A new hierarchy of entities
• Matters for discussion
78. What is the census household definition? One person living alone or…
2011: a group of people (not necessarily related) living at the same address who share cooking facilities and share a living room or sitting room or dining area
2001: a group of people (not necessarily related) living at the same address with common housekeeping - sharing either a living room or sitting room, or at least one meal a day
1991: A group of people not necessarily related, living at the same address with common housekeeping, that is, sharing at least one meal a day or sharing a living room or sitting room
1981, 1971: A group of persons (not necessarily related) living at the same address with common housekeeping
https://census.ukdataservice.ac.uk/use-data/censuses/forms
79. What is the census household definition? One person living alone or…
A group of people (not necessarily related) living at the same address
1971-2011
With (variously) something else in common!
80. What is the census household definition? One person living alone or…
A group of people (not necessarily related) living at the same address
• This much we should be able to do pretty well from admin data, so it all comes down to which “something else in common” we need
• (Results currently moderated by respondents’ interpretation of the secondary guidance phrase)
81. Question time 1
• Q. What is the term for accommodation used or available for use by an individual household?
• A. A household space
• Vacant household spaces and household spaces used as second addresses are also classified as household spaces
Photo: David Martin
82. Question time 2
• Q. What is the term for a unit of accommodation which may comprise one or more household spaces?
• A. A dwelling
• A dwelling may be classified as shared or unshared
Photo: David Martin
83. Question time 3
• Q. Are all units in sheltered accommodation where half or more of the units possess their own facilities for cooking classified as households?
• A. Yes!
• If fewer than half the units possess their own cooking facilities, they are classified as Communal Establishments
Photo: David Martin
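The sheltered accommodation rule in this question is a simple threshold test, sketched below. The function name and return labels are hypothetical; only the "half or more" rule comes from the slide.

```python
def classify_sheltered_accommodation(total_units, units_with_own_cooking):
    """Apply the 'half or more units have their own cooking facilities' rule."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    if not 0 <= units_with_own_cooking <= total_units:
        raise ValueError("units_with_own_cooking must be between 0 and total_units")
    # Integer comparison avoids any rounding at exactly one half.
    if 2 * units_with_own_cooking >= total_units:
        return "households"
    return "communal establishment"


print(classify_sheltered_accommodation(10, 5))  # households
print(classify_sheltered_accommodation(10, 4))  # communal establishment
```

The rule depends on knowing cooking facilities for every unit, which is the kind of detail that may not be available from administrative sources.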
84. Question time 4
• Q. Are university owned student houses that were difficult to identify and not clearly located with other student residences classified as households?
• A. Yes!
• Accommodation provided solely for students (during term-time) otherwise classified as Communal Establishments
Photo: David Martin
86. Household questions, derived variables and
what we use them for
• Usual residents and visitors >> vacancy, second homes
• Family (or not) relationships between household
members >> household formation and dissolution,
parenting, kinship, “hidden households”
• Accommodation type, number of rooms* >> housing
stock, overcrowding, deprivation
• Central heating >> household amenities, deprivation
• Tenure >> home ownership, wealth
• Availability of cars or vans
*If derived from admin data, some of these relate to
addresses, not households
87. (Non-exhaustive) hierarchy of census
entities
Persons
Families
Households
Household spaces
Dwellings
Addresses
Communal Establishments
88. Potential hierarchy of administrative entities
Persons
New construct A
Addresses
Communal Establishments
New construct B
New construct C
89. Possible new construct A: “household-
dwelling unit” (Statistics Finland)
• Consists of the permanent occupants of a dwelling
• Related concepts include: building, dwelling,
consumption unit, residential home, structure of
household-dwelling unit
• Concept adopted in 1980 census. In earlier years the
concept of household was used, which consisted of
family members and other persons living together who
made common provision for food
http://www.stat.fi/til/asuolo/kas_en.html
90. Matters for discussion
• We need to admit that although census is great, it offers
neither a stable nor unambiguous household definition
• The most consistent element is people living at the same
address, which we can probably still estimate
• What other “things in common” really matter and could
they be derived from admin data?
• Shared electricity meter?
• Common wheelie bin?
• Which are the key household statistics and how would
we obtain them without census households?
• Consequences for definitions of dwelling, communal
establishment, etc.
92. Household spaces
A household space is the accommodation used or available
for use by an individual household. Household spaces are
identified separately in census results as those with at least
one usual resident, and those that do not have any usual
residents.
• A household space with no usual residents may still be used by
short-term residents, visitors who were present on census night, or a
combination of short-term residents and visitors.
• Vacant household spaces and household spaces that are used as
second addresses are also classified in census results as household
spaces with no usual residents.
http://webarchive.nationalarchives.gov.uk/20160105160709/http://www.ons.gov.uk/ons/guide-method/census/2011/census-data/2011-census-data/2011-first-release/2011-census-definitions/2011-census-glossary.pdf
93. Dwellings
A dwelling is a unit of accommodation which may comprise
one or more household spaces (a household space is the
accommodation used or available for use by an individual
household).
A dwelling may be classified as shared or unshared. A dwelling is
shared if:
• the household spaces it contains have the accommodation type
“part of a converted or shared house”,
• not all of the rooms (including kitchen, bathroom and toilet, if any)
are behind a door that only that household can use, and
• there is at least one other such household space at the same
address with which it can be combined to form the shared dwelling.
Dwellings that do not meet these conditions are unshared dwellings.
http://webarchive.nationalarchives.gov.uk/20160105160709/http://www.ons.gov.uk/ons/guide-method/census/2011/census-data/2011-census-data/2011-first-release/2011-census-definitions/2011-census-glossary.pdf
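The three conditions for a shared dwelling can be sketched as a small check. This is only an illustration of the glossary wording above: the `HouseholdSpace` record and its field names are hypothetical, not an ONS data structure.

```python
from dataclasses import dataclass

# Accommodation type named in the glossary definition of a shared dwelling.
SHARED_TYPE = "part of a converted or shared house"


@dataclass
class HouseholdSpace:
    """Hypothetical record for one household space at an address."""
    accommodation_type: str
    all_rooms_behind_own_door: bool


def is_shareable(space):
    """Conditions 1 and 2: right accommodation type, and not fully self-contained."""
    return (space.accommodation_type == SHARED_TYPE
            and not space.all_rooms_behind_own_door)


def shared_household_spaces(spaces_at_address):
    """Condition 3: at least one other such space at the same address.

    Returns the spaces that combine into a shared dwelling, or an empty
    list if every dwelling at the address is unshared.
    """
    candidates = [s for s in spaces_at_address if is_shareable(s)]
    return candidates if len(candidates) >= 2 else []
```

A single qualifying household space with no partner at the address stays an unshared dwelling under this reading.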
94. Communal establishments
http://webarchive.nationalarchives.gov.uk/20160105160709/http://www.ons.gov.uk/ons/guide-method/census/2011/census-data/2011-census-data/2011-first-release/2011-census-definitions/2011-census-glossary.pdf
A communal establishment is an establishment providing
managed residential accommodation; “managed” in this context
means full-time or part-time supervision of the accommodation.
Types of communal establishment include:
• Sheltered accommodation units where fewer than 50 per cent of the
units… have their own cooking facilities… If half or more possess
their own facilities for cooking (regardless of use) all units in the
whole establishment are treated as separate households
• Small hotels, guest houses, bed & breakfasts and inns and pubs
with…
• All accommodation provided solely for students (during term-time)…
(University owned student houses that were difficult to identify and
not clearly located with other student residences are treated as
households, and houses rented to students by private landlords are
also treated as households)…
• Accommodation available only to nurses…
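The sheltered accommodation rule is a simple threshold, sketched below. The function name and arguments are illustrative only; the rule itself is the "half or more" wording quoted above.

```python
def sheltered_units_are_households(units_with_cooking, total_units):
    """Return True if all units are treated as separate households.

    Per the glossary wording: if half or more of the units have their own
    cooking facilities (regardless of use), every unit in the establishment
    is a household; otherwise it is a communal establishment.
    """
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    # Integer comparison avoids rounding issues at exactly 50 per cent.
    return 2 * units_with_cooking >= total_units
```

Note that the classification applies to the whole establishment, so one extra unit with a cooker can move every resident from the communal establishment count to the household count.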
95. Houses in multiple occupation
Your home is a house in multiple occupation (HMO) if both of the
following apply:
• at least 3 tenants live there, forming more than 1 household
• you share toilet, bathroom or kitchen facilities with other tenants
Your home is a large HMO if all of the following apply:
• it’s at least 3 storeys high
• at least 5 tenants live there, forming more than 1 household
• you share toilet, bathroom or kitchen facilities with other tenants
A household is either a single person or members of the same family
who live together. A family includes people who are:
• married or living together - including people in same-sex
relationships
• relatives or half-relatives, for example grandparents, aunts, uncles,
siblings
• step-parents and step-children
https://www.gov.uk/house-in-multiple-occupation
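For comparison with the census definition, the HMO tests quoted above can be written as two predicates. This is a sketch of the wording on the slide only; the argument names are made up for illustration, and working out how many households the tenants form (single people or families) is assumed to be done elsewhere.

```python
def is_hmo(tenants, households, shares_facilities):
    """Both conditions: at least 3 tenants forming more than 1 household,
    and shared toilet, bathroom or kitchen facilities."""
    return tenants >= 3 and households > 1 and shares_facilities


def is_large_hmo(storeys, tenants, households, shares_facilities):
    """All three conditions: at least 3 storeys, at least 5 tenants forming
    more than 1 household, and shared facilities."""
    return (storeys >= 3 and tenants >= 5
            and households > 1 and shares_facilities)
```

Here a "household" is a single person or a family, so three unrelated sharers count as three households, whereas the census definition could treat them as one.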
97. Definitions
A household is defined as:
• one person living alone, or
• a group of people (not necessarily related) living at the same address who share cooking facilities and share a living room or sitting room or dining area.
98. ADCP aims for household outputs
Produce household statistics as part of Research Outputs 2016.
Three types of statistics over the next few years:-
• Number of households
• Household size
• Household composition
Household numbers released in February 2017
Derived from the same SPD as population estimates.
Replicate a similar output package as the population estimates -
time series
Can be produced at various levels of geography
Multiple versions from SPD versions.
SPD: Statistical Population Dataset
100. Challenges
Our three biggest challenges for producing household
numbers
Definition – household/address is not a one to one
relationship
Correct address allocation
• data lags
• high churn
• people not deregistering
• poor AddressBase matching/allocation
101. What data can we use?
• AddressBase
• Population Coverage Survey
• Tax and Benefits data
102. Definitions
There are some important distinctions between the household
estimates produced in these research outputs and those
published in official statistics:
The definition of ‘households’ used in these research outputs is
based on identifying occupied addresses in administrative data
Occupied addresses on administrative data include those with
at least one ‘usual resident’ included in our Statistical
Population Dataset (SPD V2.0)
Only occupied addresses that have been successfully linked to a
Unique Property Reference Number (UPRN) on AddressBase
have been included in these research outputs
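A minimal sketch of this occupied-address definition, assuming each SPD record can be reduced to a person identifier and a linked UPRN (or `None` where linkage failed). The record layout is an assumption for illustration, not the actual SPD V2.0 schema.

```python
from collections import Counter


def occupied_address_sizes(spd_records):
    """Count usual residents per linked UPRN.

    spd_records: iterable of (person_id, uprn) pairs; uprn is None when the
    record could not be linked to AddressBase. Unlinked records are left out,
    as in the research outputs.
    """
    sizes = Counter()
    for _person_id, uprn in spd_records:
        if uprn is not None:
            sizes[uprn] += 1
    return sizes


def count_occupied_addresses(spd_records):
    """Number of addresses with at least one usual resident on the SPD."""
    return len(occupied_address_sizes(spd_records))
```

Because unlinked records drop out entirely, any address whose residents all fail to link is missed, which is one route to the underestimation discussed later.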
103. Allocating address at SPD record level
Using many data sources to find our
‘best’ address.
Benefits
Enables aggregation at different
levels and cross tabulation with other
variables.
Can weight certain data sources for
different demographic groups, e.g.
students
Notes:
A non-valid UPRN may occur when the address given cannot be
matched to one on reference data, or is not in England and Wales
4% of SPD V2.0 records could not be assigned to a UPRN (i.e.
‘residual’)
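One way to picture choosing a "best" address from several sources is a weighted vote. The slides do not describe the actual allocation rules, so the sketch below is an assumption throughout: the source names, the weights and the tie-break are all hypothetical.

```python
from collections import defaultdict


def best_address(candidates, weights):
    """Pick the UPRN with the highest total source weight.

    candidates: list of (source_name, uprn) pairs for one SPD record;
                uprn is None if that source gave no valid UPRN.
    weights:    dict of source_name -> weight; unlisted sources count 1.0.
                Weights could differ by demographic group, e.g. students.
    Returns None when no source supplies a valid UPRN ('residual').
    """
    scores = defaultdict(float)
    for source, uprn in candidates:
        if uprn is not None:
            scores[uprn] += weights.get(source, 1.0)
    if not scores:
        return None
    # Sorting first makes ties deterministic (smallest UPRN wins).
    return max(sorted(scores), key=scores.get)
```

For example, giving a higher weight to an education source for records flagged as students would let a term-time address outvote two older registrations elsewhere.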
104. Underestimations
When comparing SPD V2.0 household estimates with official estimates, there is a
clear tendency to underestimate the number of households using this
methodology. Reasons for this can be summarised as follows:
UPRN assignment - Not all records on SPD V2.0 can be assigned to a
UPRN, due to missing address information or failures to link addresses
Complex residential addresses – Addresses with ‘parent’ and ‘child’ UPRN
hierarchies are unlikely to have full coverage on the administrative data we are
using for these research outputs
SPD V2.0 inclusion rules – The rules used to determine usual residence in
our SPD V2.0 population estimates may have resulted in the incorrect exclusion
of some households from our population base
105. England and Wales –
Comparing with Census for 2011 :-
Outcomes – Numbers of Households
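The comparisons that follow are reported as percentage differences. The slides do not spell out the formula, so the sketch below assumes the usual convention of dividing by the benchmark (census or official estimate); a negative value then means the admin-based figure is lower.

```python
def percent_difference(estimate, benchmark):
    """Percentage difference of an admin-based estimate from a benchmark.

    Assumed convention: 100 * (estimate - benchmark) / benchmark.
    """
    if benchmark == 0:
        raise ValueError("benchmark must be non-zero")
    return 100.0 * (estimate - benchmark) / benchmark
```

Under this convention an area with 90 occupied addresses against 100 census households would show -10.0.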
107. Outcomes – Numbers of Households
England and Wales – comparing with Census for 2011 and DAU figures for 2011 and 2015
[Chart: Regional Percent Differences - 2011 and 2015. Bars for 2011 and 2015 for England and Wales, East Midlands, East of England, London, North East, North West, South East, South West, Wales, West Midlands, and Yorkshire and The Humber; axis from -14 to 0]
DAU: Demographics Analysis Unit at ONS
108. Outcomes – Numbers of Households
Top Tens – largest differences
2011
LA Name                        Region            % difference
Kensington and Chelsea         London            -34.6
Westminster, City of London    London            -32.3
Islington                      London            -22.2
Gwynedd                        Wales             -21.4
Hammersmith and Fulham         London            -18.6
Camden                         London            -17.4
Tower Hamlets                  London            -16.8
Wandsworth                     London            -16.0
Haringey                       London            -15.6
Brent                          London            -14.5
2015
LA Name                        Region            % difference
Gwynedd                        Wales             -25.4
Westminster, City of London    London            -23.6
Kensington and Chelsea         London            -20.2
Cambridge                      East of England   -20.0
Camden                         London            -18.5
Broxbourne                     East of England   17.5
South Ribble                   North West        16.1
Watford                        East of England   -15.4
Gravesham                      South East        -14.5
Forest Heath                   East of England   -14.4
110. Household Sizes
To investigate whether we can counteract the
definitional differences between census
households and addresses/UPRNs, using
SPREE (Structure Preserving Estimator)
Uses Annual Population Survey (APS)
proportions of household sizes to adjust SPD
estimates.
111. Challenges - sizes
Some categories vary more than others across
geographies, so are harder to estimate.
Some geographies are affected by certain
missingness e.g. armed forces data, so may need to
be treated differently
Some geographies are affected by usual residence
variations, so may need to be treated differently.
If an area is extremely different from the national
distribution, it may be harder to estimate using those
distributions.
112. Adjustment using SPREE
Structure Preserving Estimator (SPREE) method uses survey data to support admin data.
Adjusting the proportions of each category, rather than numbers.
Source: Office for National Statistics Notes: 1. Statistical Population Dataset 2. Annual Population Survey 3. SPREE - Structure Preserving Estimator
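SPREE is commonly described in terms of iterative proportional fitting (IPF): the admin-based table supplies the structure, and it is rescaled until its margins agree with more reliable totals. The slides give no implementation detail, so the following is a generic IPF sketch under that assumption, with rows standing for areas and columns for household size categories.

```python
def ipf(table, row_targets, col_targets, iterations=100):
    """Rescale a two-way table to match target row and column totals.

    table:       list of rows, e.g. SPD household counts by area and size.
    row_targets: desired total for each row (e.g. households per area).
    col_targets: desired total for each column (e.g. survey-based totals
                 by household size).
    The targets should share the same grand total. Scaling rows and columns
    alternately keeps the association structure (the odds ratios) of the
    original table while moving its margins to the targets.
    """
    fitted = [list(row) for row in table]
    for _ in range(iterations):
        # Scale each row to its target total.
        for i, target in enumerate(row_targets):
            total = sum(fitted[i])
            if total > 0:
                fitted[i] = [v * target / total for v in fitted[i]]
        # Scale each column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in fitted)
            if total > 0:
                for row in fitted:
                    row[j] *= target / total
    return fitted
```

Dividing each fitted row by its total gives adjusted proportions by household size, which matches the slide's point about adjusting proportions rather than numbers.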
115. Effects of estimation
SPD¹ difference from census percentages versus SPREE² adjustment, 2011
[Charts: Kensington and Chelsea; Hastings. Difference from census percentages by household size (1, 2, 3, 4, 5 plus) for two series: SPD¹ - Census and SPREE² - Census]
Source: Office for National Statistics
Notes: 1. SPD - Statistical Population Dataset
2. SPREE - Structure Preserving Estimator
116. Effects of estimation
SPD¹ difference from census percentages versus SPREE² adjustment, 2011
[Charts: Newham; Richmondshire. Difference from census percentages by household size (1, 2, 3, 4, 5 plus) for two series: SPD¹ - Census and SPREE² - Census]
Source: Office for National Statistics
Notes: 1. SPD - Statistical Population Dataset
2. SPREE - Structure Preserving Estimator
118. Classification
Census KS105EW:
One person household
    Aged 65 and over
    Other
One family household
    All aged 65 and over
    Married or same-sex civil partnership couple
        No children
        Dependent children
        All children non-dependent
    Cohabiting couple
        No children
        Dependent children
        All children non-dependent
    Lone parent
        Dependent children
        All children non-dependent
Other household types
    With dependent children
    All full-time students
    All aged 65 and over
    Other
Annual UK estimates from Labour Force Survey:
One person household
    Under 65
    65 or over
Two or more unrelated adults
One family households
    Couple
        No children
        1-2 dependent children
        3 or more dependent children
        Non-dependent children only
    Lone parent
        Dependent children
        Non-dependent children only
Multi-family households
119. Using admin data
To create household composition we need:
1. Population base of usual residents – SPD V2.0
2. Usual residents assigned to an address to create
households base
Issues with SPD and household base described earlier
impact household composition
Other information used for household composition
1. Age, sex, surnames of occupants
2. Relationships from other admin data - ONS now has
access to some admin data containing relationships
120. Other work and methods
Register based countries: Austria
• Social security, child allowance and tax sources
• Couple, parent-child, sibling, grandparent-grandchild
relationships
• Still have to use imputation method for some relationships
UK: Harper and Mayhew (2015)
• No relationships available
• Count people in broad age groups to assign household type
• Children (0-19), Working age (20-64), Older adults (65+)
ONS method falls between these
• Use the relationships available in admin data where possible
• Use demographic information to infer others
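The age-band approach attributed to Harper and Mayhew starts from a count of people in each broad age group at an address. The sketch below covers only that counting step, using the bands listed above; how a profile is then mapped to a household type is not given on the slide and is not attempted here. Function names are illustrative.

```python
from collections import Counter


def age_group(age):
    """Broad age bands from the slide: 0-19, 20-64, 65+."""
    if age <= 19:
        return "child"
    if age <= 64:
        return "working_age"
    return "older_adult"


def age_profile(ages):
    """Return (children, working-age adults, older adults) at an address."""
    counts = Counter(age_group(a) for a in ages)
    return (counts["child"], counts["working_age"], counts["older_adult"])
```

An address with occupants aged 3, 35, 36 and 70 gives the profile (1, 2, 1); the ONS method described here would add admin-data relationships on top of such counts.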
121. Relationships in admin data
Couple relationships:
1. Housing Benefit
• Partner ID available where applicable
2. National Benefits Database
• Partner ID available for State Pension claimants
If not available, need to infer a couple relationship
[Chart: Age of people with partner ID. x-axis: Age (15 to 95); y-axis: 0 to 300,000]
122. Relationships in admin data
Child Benefit data
• Contains a National Insurance ID for one of the parents
• High coverage of dependent children
• Eligible up to age 16, then up to 19 if in approved education
or training
• Identify whether 16-18 year olds are dependent children, to
match census definition
Non-dependent children
• No longer on Child Benefit dataset
• Infer a relationship to a parent using additional information
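The step for 16-18 year olds can be sketched as a dependency test. It assumes the census-style rule that a dependent child is aged 0 to 15, or aged 16 to 18 and in full-time education; the `in_full_time_education` flag is a hypothetical input, since the slide does not say which admin source would supply it.

```python
def is_dependent_child(age, in_full_time_education):
    """Census-style dependent child test for someone living with a parent.

    Assumption: 0-15 always dependent; 16-18 dependent only when in
    full-time education; 19 and over never dependent.
    """
    if age <= 15:
        return True
    if age <= 18:
        return in_full_time_education
    return False
```

A 19-year-old still covered by Child Benefit would therefore count as a non-dependent child under this reading.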
123. Algorithm
• Use all possible relationships at address to assign the household to a major category:
1) Single person
2) All students
3) Lone parent
4) Couple
• a) With Partner ID
• b) No Partner ID
5) Other
• a) More than 2 generations
• b) Unrelated adult
124. Algorithm
1. Single person households – one person in UPRN
2. Student – all people have HESA record
3. Lone parent families:
[Diagram: persons 1, 2 and 3; surname Smith shown twice; ages 18 and 16 shown; labels ‘Parent ID’ and ‘> 18 years’]
125. Couple families
4. Couple families:
[Diagram: persons 1 to 4; surname Smith shown twice; ages 18 and 16 shown; labels ‘Partner ID’ with ‘≤ 12 years’ and ‘Parent ID’ with ‘> 18 years’]
126. Other households
• Contain more than one family
• More than two generations: Person 3 too old to be child of 1 or 2
[Diagrams: one with persons 1 to 5 by age, labelled ‘> 50 years’; one with persons 1 to 3 by age, labelled ‘< 15 years’]
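The major-category assignment can be pictured as a cascade of checks on the people at one UPRN. The sketch below is heavily simplified and is not the ONS algorithm: it uses only recorded Partner ID and Parent ID links, leaves out inferred couples (4b) and the 'Other' rules (5a, 5b), and returns 'Missing' for anything it cannot place. The `Person` record and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    """Hypothetical linked admin record for one usual resident."""
    person_id: int
    age: int
    has_hesa_record: bool = False
    partner_id: Optional[int] = None  # from benefits data, if present
    parent_id: Optional[int] = None   # from Child Benefit data, if present


def classify_household(people):
    """Assign the residents of one address to a major category."""
    if len(people) == 1:
        return "Single person"
    if all(p.has_hesa_record for p in people):
        return "Student"
    ids = {p.person_id for p in people}
    # Couple: a recorded Partner ID pointing at someone at the same address.
    if any(p.partner_id in ids for p in people):
        return "Couple"
    # Lone parent: everyone except one adult is that adult's recorded child.
    children = [p for p in people if p.parent_id in ids]
    parents = {p.parent_id for p in children}
    if children and len(parents) == 1 and len(children) == len(people) - 1:
        return "Lone parent"
    # Inferred couples and 'Other' households are not modelled here.
    return "Missing"
```

In this simplified form two unrelated adults with no recorded links fall into 'Missing', which is the kind of case the age-difference and generation rules on the slides are meant to resolve.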
127. Results
[Chart: % of households (0 to 60) by major category: Single, Student, Couple, Lone parent, Other, Missing; series: Census and SPD]
• Percentage distribution to remove household undercount effect
• ‘Missing’ – does not meet any current category criteria
128. Minor categories
Single person
    Aged 65+
    Other
Lone parent
    With dependent children
    All children non-dependent
Couple
    No children
    With dependent children
    All children non-dependent
Other
    With dependent children
    All aged 65+
    Other
129. Minor categories results
[Chart: % of households (0 to 20) by minor category, series Census and SPD. Single: Aged 65 and over, Other. Couple: All aged 65 and over, No children, Dependent children, All children non-dependent. Lone parent: Dependent children, All children non-dependent. Student. Other: With dependent children, All aged 65 and over, Other. Missing]
130. Local authorities
• Very nearly all LAs have undercount for ‘Couple’ and ‘Other’
• Low level of ‘Missing’ in areas with high proportion of couple
households and low ‘Other’
• Older population = high proportion of couples with Partner ID
Ranges of values for local authorities:
[Chart: Comparison with Census. SPD % - Census % (-15 to 10) for Single, Student, Couple, Lone parent, Other]
[Chart: Missing and Partner ID. % of households (0 to 40) for Missing and Couple with Partner ID]
135. Next Steps
• Assign addresses with ‘Missing’ household
composition to a category
• Many couples but age difference outside current range
• Some are ‘Other’ households, e.g. unrelated adults
• Possibly use imputation method similar to Austria
• Use households containing a Partner ID as donors
• All other relationships in these are ‘non-couple’
• Evaluate effectiveness of algorithm
• Compare to record level census data
136. Future Plans
Publish Research outputs: occupied address (household)
estimates by size, 2011 – 24th July
Improve estimates of household numbers – output early next
year
Adjust numbers using a coverage survey
Research removal of communal establishments
Use more data e.g. Council Tax to identify students/one person
households
Household Composition – output early next year
Unoccupied addresses - do we need them?
138. Over 17bn annual mail and other interactions with
UK citizens builds a view of the individual and
household
139. My mail event activity - individual
Data insights
• Strongest mail
profiles reside at
the address
• Name variants
need to be linked
to strengthen the
signal
• Error needs to be
managed
140. Insight derived from mail interactions
SN5 summary insights
• Represents 8 properties, covering
circa 31 individuals
• 13 individuals received SCV
parcels, at 4 addresses over a 10
week period
3rd party data insights
• Average age of 56
• Even male to female split
• Northern European names
• Average Zoopla property price
estimate of £244k
• Mostly 3 - 4 bedroom properties
• Mainly professional, retired and
married with medium income
Parcel Volumes
0 = red
1-3 = orange
4+ = green
RM demographics opportunities
• Channel preference
• eCommerce activity
• Residency
• Interest type
• Property type………etc.
By combining third party data and building analytic profiles of the mail interactions, a
new postcode view of the household can be built based on actual interactions