Opportunities for alternative
data sources
Chair: Helen James, Head of
Department of Law, University of
Winchester
Opportunities for alternative
data sources to support the
Census
Jane Naylor, 29th June
Data sources for official statistics
• Surveys – eg of businesses and households
• Census – every 10 years
• Administrative data – by-product of
Government process
• Big Data?
‘Data that is difficult to collect, store or process within
the conventional systems of statistical organizations.
Either, their volume, velocity, structure or variety
requires the adoption of new statistical software
processing techniques and/or IT infrastructure to
enable cost-effective insights to be made.’
(UNECE, 2013)
Demographics, population flows:
mobile phone data
Twitter
Rationale: Using geo-located
Twitter to gain new insights
into mobility and migration
• 7 months of geo-located tweets
within Great Britain (about 100
million data points)
• Methodology to infer place of usual
residence:
- Identify user ‘anchor points’ by
clustering tweets using a DBSCAN
algorithm
- Identify residential anchor points using
AddressBase and nearest neighbour
analysis
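The two-step methodology above can be sketched as follows. This is a minimal illustration only, not the ONS implementation: the slides do not give the clustering parameters, so the 200 m radius, the minimum of 3 tweets, the coordinates and the address records (standing in for AddressBase) are all hypothetical. In practice a library implementation such as scikit-learn's DBSCAN would normally be used instead of the hand-written one here.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def dbscan(points, eps_m, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        return [j for j in range(len(points))
                if haversine_m(points[i], points[j]) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels

def anchor_points(points, labels):
    """Centroid (lat, lon) and tweet count for every cluster."""
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(p)
    return {lab: (sum(p[0] for p in ps) / len(ps),
                  sum(p[1] for p in ps) / len(ps),
                  len(ps))
            for lab, ps in clusters.items()}

def nearest_address(anchor, addresses):
    """Nearest-neighbour lookup of an anchor point against address records."""
    return min(addresses, key=lambda rec: haversine_m(anchor[:2], rec["location"]))

# Hypothetical tweets for one user: two frequently visited places and one stray tweet.
tweets = [(51.0600, -1.3100), (51.0602, -1.3101), (51.0601, -1.3098), (51.0599, -1.3102),
          (50.9350, -1.3960), (50.9351, -1.3962), (50.9349, -1.3959),
          (51.5000, -0.1200)]
# Hypothetical address records standing in for AddressBase.
addresses = [{"location": (51.0600, -1.3100), "use": "residential"},
             {"location": (50.9350, -1.3960), "use": "education"}]

labels = dbscan(tweets, eps_m=200, min_pts=3)
anchors = anchor_points(tweets, labels)
for lab, anchor in anchors.items():
    print(lab, anchor[2], nearest_address(anchor, addresses)["use"])
```

An anchor point whose nearest address is residential would then be a candidate for the user's place of usual residence.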
Geolocated
penetration rates
by local
authority
Use case: Student mobility
Twitter - sentiment analysis
Electricity: smart meter
Half hourly electricity consumption over 7 days at one meter,
through 28 consecutive 7 day periods.
Housing websites: Zoopla
Aerial imagery
Opportunities and challenges
Current Census priorities:
•Mobile phone data for population mobility
•Intelligence on addresses
Using data from property websites and
aerial imagery to support the Census
Karen Gask
ONS Big Data team
Plan for today
• How property website data can be used to
improve statistics in areas with caravan parks
• What analysis of aerial imagery can tell us
about caravan parks
• Your feedback
Challenges for Census enumeration -
where Big Data can help
• Address intelligence is required to effectively
plan enumeration resources
• Understanding where knowledge gaps exist
• Help identify where there are access issues
or new builds
Property website data
Property websites
Potential benefits of property data for
Census
Improve understanding of small areas by identifying:
• High proportions of rental properties
• Unusual properties which may not be captured well in the
Address Register (house boats, caravan homes, beach
huts)
• Areas where there may be access issues or new builds
Provide some limited information on tenure of private
sector housing for Administrative Data Census
Work undertaken
Aim: investigate methods of machine learning
which could accurately identify, or distinguish
between, traits of interest within property data
in an automated way
Rationale: automated classification could allow
targeted field work and inform enumeration
resource allocation
Collecting data from these websites
• Could have ‘web scraped’ html code behind
websites to capture this data
− But many websites prohibit this
• Zoopla provides (limited) data for free via an
API (Application Programming Interface)
− Successfully collected data about 60,000
properties for sale or for rent
Early results
• Identified caravan homes with good accuracy
using price, property description, number of
bedrooms and property type
• Distinguished between holiday and residential
caravan homes with reasonable accuracy
(although sample size is small)
• Currently working on analysing property
description to identify gated communities
Identifying caravan homes (1)
• Developed machine learning methods such
as logistic regression, decision trees and
support vector machines
[Figure: decision tree with two yes/no splits. Split 1: Is property type “mobile/park home”? Split 2: Does property description contain “holiday park”? Leaves: Caravan home, Caravan home, Not a caravan home. Leaf sizes shown: n=180, n=55, n=17,501.]
Identifying caravan homes (2)
• Support vector machines were
the most accurate method
[Figure: two x-y plots, one showing a linear hyperplane and support vectors, the other a non-linear hyperplane.]
Unseen testing set of 7,230:
Actual \ Predicted      Not a caravan home    Caravan home
Not a caravan home      7,113                 18
Caravan home            0                     99
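As a worked check on the testing-set results above, the usual summary measures can be derived directly from the four counts. The counts come from the slide; the choice of measures (accuracy, precision, recall) is ours and is not stated in the source.

```python
# Counts from the slide: rows are actual classes, columns are predicted classes.
true_neg, false_pos = 7113, 18   # actual "not a caravan home"
false_neg, true_pos = 0, 99      # actual "caravan home"

total = true_neg + false_pos + false_neg + true_pos
accuracy = (true_pos + true_neg) / total
precision = true_pos / (true_pos + false_pos)   # share of predicted caravans that are real
recall = true_pos / (true_pos + false_neg)      # share of real caravans that were found

print(f"total={total} accuracy={accuracy:.4f} "
      f"precision={precision:.3f} recall={recall:.3f}")
```

On these counts no caravan home is missed, and all of the errors are ordinary properties predicted to be caravan homes.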
Distinguishing between holiday and
residential caravan homes
• Classified 500 caravan
home descriptions
• Split descriptions into
words then correlated
each word with holiday
/ residential
classification
• Small sample size so
there is some
overfitting
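The word-level step above might look something like the sketch below. The slides do not say which correlation measure was used, so this assumes a Pearson correlation between word presence (0/1) and the class label (1 = holiday, 0 = residential); the four descriptions are invented for illustration.

```python
import math
import re

def word_label_correlation(descriptions, labels):
    """Correlate the presence of each word with the holiday (1) / residential (0) label."""
    docs = [set(re.findall(r"[a-z]+", d.lower())) for d in descriptions]
    n = len(docs)
    mean_y = sum(labels) / n
    var_y = sum((y - mean_y) ** 2 for y in labels)
    result = {}
    for word in set().union(*docs):
        x = [1 if word in doc else 0 for doc in docs]
        mean_x = sum(x) / n
        var_x = sum((xi - mean_x) ** 2 for xi in x)
        if var_x == 0 or var_y == 0:
            continue  # word appears in every description, or only one class is present
        cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, labels))
        result[word] = cov / math.sqrt(var_x * var_y)
    return result

# Invented example descriptions, manually classified.
descriptions = [
    "Static caravan on a holiday park with site fees",
    "Holiday lodge near the beach, 11 month season",
    "Residential park home for the over 50s",
    "Park home on a residential site, all year occupancy",
]
labels = [1, 1, 0, 0]

corr = word_label_correlation(descriptions, labels)
for word in sorted(corr, key=corr.get, reverse=True)[:3]:
    print(word, round(corr[word], 2))
```

With so few descriptions, many words correlate perfectly with one class by chance, which is the overfitting risk noted above.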
Gated communities
• Currently exploring
use of Natural
Language Processing
on property
description
• Want “set in a private
gated development”
but not “…gated side
access to the garden”
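One simple baseline for separating the two phrasings above is a pattern that requires "gated" to be followed closely by a noun describing the whole site. This is an assumed illustration only, not the Natural Language Processing approach being explored; the list of site nouns is hypothetical.

```python
import re

# "gated", optionally one intervening word, then a noun for the whole site.
# The noun list is an assumption and would need tuning on real descriptions.
GATED_COMMUNITY = re.compile(
    r"\bgated\s+(?:\w+\s+)?(?:development|community|communities|estate|complex)\b",
    re.IGNORECASE,
)

def is_gated_community(description):
    """Return True if the description suggests the property is in a gated community."""
    return GATED_COMMUNITY.search(description) is not None

examples = [
    "Set in a private gated development",
    "Benefits from gated side access to the garden",
    "Apartment within a secure gated residential community",
]
for text in examples:
    print(is_gated_community(text), "-", text)
```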
Limitations
• Only provides data
about properties for
sale or rent
• Zoopla does not cover
all properties for sale or
rent
• Some properties have
no description or a very
small description
• Sample of data we have
collected is small for
unusual properties
Next steps
• Data shows promise but we have collected all
the free data we can (nearly 60,000 records)
• Soon to issue a tender to purchase data for
Census Test areas to test methods in 2017
• Understand how this data could improve
statistics in areas with caravan parks
Aerial imagery
Potential benefits of image data for
Census
• Similar to property data – image data could
help fill knowledge gaps by identifying:
• the number of properties in a given area
• properties which are similar / different
• properties with particular features
• Images can be more timely than field
intelligence
• Images can provide more cost effective
insight than field intelligence
Work undertaken
Aim: To explore the utility of aerial and satellite
imagery for official statistics through a pilot
study of caravan site images
Rationale: This could improve statistics in areas
with caravan parks, which are historically
considered 'hard to count' within the Census
Address Register
Collecting data
• Data are obtained from Google's API
(Application Programming Interface) for free
• There is a limit on image dimension and data
amount one can get for free (e.g. download of
images of New Forest took 2 days)
• All downloaded images have the same
dimensions and the same level of magnification
• Google takes care of some pre-processing:
blending images together, adjusting colours
Pre-processing
Machine learning requires
‘training data’ where the objects
of interest are correctly labelled
- Circa 60 images were
manually labelled before
analysis
To artificially increase the size
of the dataset images were
augmented by
• rotation,
• flipping and
• translation.
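The three augmentations above can be illustrated on a tiny image held as a list of pixel rows. This is a sketch of the idea only; the slides do not say which tools, rotation angles or shift sizes were used, so the 90-degree rotation and the one-pixel shift are assumptions.

```python
def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def translate_right(img, dx, fill=0):
    """Shift the image dx pixels to the right, padding the left edge with `fill`."""
    return [[fill] * dx + row[:len(row) - dx] for row in img]

def augment(img):
    """Return the original image plus its rotated, flipped and shifted copies."""
    return [img, rotate_90(img), flip_horizontal(img), translate_right(img, 1)]

image = [[1, 2],
         [3, 4]]
for variant in augment(image):
    print(variant)
```

Each labelled image yields several training examples this way, and the label (caravan or not) is unchanged by the transformation.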
Object recognition
Used the following machine learning techniques
(plus others):
• Logistic regression
• Random Forests
• Support Vector Machines
• But artificial neural networks worked best
Output
• Heat map of
probabilities
that there is a
caravan at a
given
spot/patch of
the image
• Accuracy (for
single patches)
97%
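A heat map of this kind can be produced by sliding a window over the image and scoring each patch. The sketch below assumes a hypothetical stand-in scorer (the share of bright pixels) in place of the trained neural network, and a hypothetical patch size and stride.

```python
def patch_heat_map(image, patch, stride, score_patch):
    """Score every patch x patch window of the image and return the grid of scores."""
    rows, cols = len(image), len(image[0])
    heat = []
    for r in range(0, rows - patch + 1, stride):
        heat_row = []
        for c in range(0, cols - patch + 1, stride):
            window = [row[c:c + patch] for row in image[r:r + patch]]
            heat_row.append(score_patch(window))
        heat.append(heat_row)
    return heat

def bright_fraction(window):
    """Hypothetical stand-in for a trained classifier: share of bright pixels."""
    pixels = [p for row in window for p in row]
    return sum(1 for p in pixels if p > 200) / len(pixels)

# Toy 4 x 4 greyscale image with a bright object in the top-right corner.
image = [[0, 0, 255, 255],
         [0, 0, 255, 255],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]

heat = patch_heat_map(image, patch=2, stride=2, score_patch=bright_fraction)
print(heat)
```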
Limitations
Limitations of the free data:
• Quality of the images, consistency of colours (white
balance, season)
• Timeliness of the data (e.g. Google satellite imagery
is up to 3 years old)
Algorithm limitations:
• Humans can't get it 100% right, 97% seems good
• But even small error rates lead to a large number of
false positives when the classification is deployed to
a large area
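The point about false positives can be made concrete with a small calculation. The figures below are assumptions for illustration: one million patches, caravans present in 0.1% of them, and the 97% patch accuracy applied equally to patches with and without a caravan.

```python
def expected_false_positives(n_patches, prevalence, sensitivity, specificity):
    """Expected true positives, false positives and precision over a large area."""
    positives = n_patches * prevalence          # patches that really contain a caravan
    negatives = n_patches - positives           # patches that do not
    true_pos = positives * sensitivity
    false_pos = negatives * (1 - specificity)
    precision = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, precision

# Hypothetical deployment: 1,000,000 patches, 0.1% containing a caravan.
true_pos, false_pos, precision = expected_false_positives(
    n_patches=1_000_000, prevalence=0.001, sensitivity=0.97, specificity=0.97)

print(f"true positives  ~ {true_pos:,.0f}")
print(f"false positives ~ {false_pos:,.0f}")
print(f"precision       ~ {precision:.1%}")
```

Under these assumptions false positives outnumber true detections by roughly 30 to 1, because patches without caravans are so much more common.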
So what can we do?
• Identify deficiencies in the Address Register
used for Census
• Maybe the accuracy is not good enough for
individual caravans, but it can still help with
caravan parks
• Focus on large clusters and compare them
with Address Register
For example these sites
Address Register: 21 caravans
Algorithm: 188 caravans
Address Register: 3 caravans
Algorithm: 121 caravans
Or these sites
Address Register: 0 caravans
Algorithm: 61 caravans
Address Register: 0 caravans
Algorithm: 21 caravans?
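The comparison in these examples can be expressed as a simple ranking of sites by the gap between the two counts. The counts are those shown on the slides (the last algorithm count carries a question mark there); the site labels A to D and the minimum cluster size of 50 are hypothetical.

```python
# Counts from the four example sites; labels A-D are placeholders.
sites = [
    {"site": "A", "address_register": 21, "algorithm": 188},
    {"site": "B", "address_register": 3, "algorithm": 121},
    {"site": "C", "address_register": 0, "algorithm": 61},
    {"site": "D", "address_register": 0, "algorithm": 21},  # uncertain count on the slide
]

MIN_CLUSTER = 50  # hypothetical: only large clusters are compared

def flag_discrepancies(sites, min_cluster):
    """Keep large clusters and rank them by how far the algorithm exceeds the register."""
    flagged = [dict(s, gap=s["algorithm"] - s["address_register"])
               for s in sites if s["algorithm"] >= min_cluster]
    return sorted(flagged, key=lambda s: s["gap"], reverse=True)

flagged = flag_discrepancies(sites, MIN_CLUSTER)
for s in flagged:
    print(s["site"], s["gap"])
```

Sites at the top of the list would be the first candidates for checking against the Address Register.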
Similar housing
• Find similar
buildings (e.g.
terraced
houses)
• Heat map of
similarity of
each spot/patch
to the central
one
Next steps
Short term:
• Use discrepancies between algorithm and
Address Register in Census Tests
• Include more data sources, e.g. LIDAR which
captures the height of mapped objects
Long term / other applications:
• Land use classification (sustainable
development, crop types)
• Population density estimation
Your feedback
Your feedback please
• Questions or comments on this work
• Can you think of other applications for this
type of data or methods?
• Is there similar work happening elsewhere?
• Can you think of other ‘big data’ sources we
haven’t considered yet for Census?
Comments / questions are welcome now or
Jane and I will be around for lunch and the
rest of the day (or
ons.big.data.project@ons.gov.uk)
What do we understand about public acceptability
of using administrative and other data for
research?
Vanessa Cuthill, Deputy Director, ESRC
Outline
• Context
• ESRC Dialogues
• Other reports
• Views from participants
Context
Administrative Data Research
Network
Dialogues – what have we done?
• ESRC
2013 Public dialogues on using admin data (ESRC,
ONS, Ipsos Mori)
2015 Big Data: Public views on the use of private sector
data for social research (ESRC, Hopkins Van Mil)
• Others:
2014 Powers and perils of data (Ipsos Mori)
2014 Public attitudes to the use and sharing of their
data (RSS and Ipsos Mori)
2015 Private Lives? (MRS Report)
Public dialogues on using admin data
Why did ESRC embark on the dialogues on
data in 2013?
• Rapidly evolving data landscape - Administrative
Data Taskforce
• We wanted to:
Better understand people’s views on the linking of admin
data
Begin the process of creating a terminology describing the
re-use of administrative data and data linking that is
understandable to the general public
Help inform the development of the governance and
operational procedures of the ADRCs and provide data
on public attitudes to inform their future strategies and
priorities
Background
• Throughout October - November 2013 public dialogues
were held in 7 locations across the UK (led by Ipsos MORI)
• The aims were to:
Better understand the cultural barriers around linking
administrative data
Begin the process of creating a language that is
meaningful and accessible to the public
Test the public perceptions of the rules that ESRC
ADRCs will be subject to and provide the ADRCs
with data on public attitudes and appetite for
engagement
(Provide ONS with more detailed evidence on public
views of their current front-running option for Beyond
2011)
Support for initiative IF:
1. The data is linked for socially beneficial purposes
“As long as it’s used for good, like to develop things,
improve services, improve knowledge.” Belfast
2. It is fully de-identified – partial vs full postcodes
3. It is kept secure at all times – concerns around remote access
4. No commercial gain for business including commercial access
BUT participants needed extensive information and discussions with
experts and researchers in order to be satisfied that these conditions
would be met under the ADRN plans.
So - simply publicising these conditions may not be enough to
ensure that the general public are reassured about or support the work
of the ADRN.
The ADRN Initiative
Impact of this dialogue
• Report shared with ADRN
• Informed decisions of the ADRN Management Committee for the
ADRN policies and procedures
▶ Lay membership in ADRN Board and Approvals Panel
• Short animated videos on ADRN website to help explain:
Data linkage https://www.youtube.com/watch?v=E3e4D2bHxa8&feature=youtu.be
Protecting Privacy https://www.youtube.com/watch?v=nnxz3_XGMAE&feature=youtu.be
• Clear policies and the ‘5 Safes’ protocols: Safe People, Safe Projects,
Safe Settings, Safe Outputs, Safe Data
Big Data: Public views on the use of
private sector data for social research
A Findings Report for the Economic and Social
Research Council
Aim and objectives of the dialogue
To explore public views on access to and
the use of data from private sector
organisations for research purposes in the
context of three Data Research Centres
funded by the ESRC.
o To identify areas of public concern about
confidentiality and privacy impact
o To start creating a language around private
sector data and the use for research purposes
o To test public understanding of: data ownership,
data acquisition, data access, using/ re-using
private sector data, data storage and preservation
The journey
Public views on data collection by
the private sector
“It’s just the way we live”
Internet
GPS tracking devices
Cards
Public places
No way you can opt out of giving data unless you live like
a hermit in the middle of an island.
Particular Concerns
Lack of
Transparency, &
Information
Passing/Selling data
to others
Keeping data safe Linking data
Intentions for data
collection and use
DPA: Principles and
Sanctions
Examples of public concerns
“What people are worried about is that it’s not going to be kept just
within. It might get sold to insurance companies, employers and this
is where people want to know that it’s going to be safe.”
“The more data you add the more it’s creating this sense of identity
of each person, so it’s almost like everyone’s got this data avatar
that’s building up as we get older.”
“I try to avoid using the internet for purchases specifically because I
don’t want my data collected and then used for sales purposes
afterwards.”
(on DPA) “Things like ‘used in a way that is adequate, relevant and
not excessive’ what does that mean? Who decides what that is? Do
you get to decide that yourself when you’re doing research?”
Acceptability and red lines:
Data acquisition
Only acquire
data from
trustworthy
companies
Ensure
accuracy and
relevance of
data
Work as much
as possible with
anonymised
data
…. Test the
provenance of
data sets
Little support for
payment for
data sets
Improved
information
….
Accurate
and up to
date data is
vital
It is acceptable
to share data for
the good of
society – but we
need to know
how it will be
used
I don’t think it’s
right that they
[the Data
Centres]
should buy
data, it’s public
money so they
should be
spending it on
the public
Data access
Trust in
processes
Approval for
access
procedures
researchers
Avoid clash of
interests
.... Secure setting
favoured over
virtual
environment
Consent for
use of personal
data
Clear
communication
about purpose
of research
....
Big data is a
broad term,
at what
point is it
not personal
anymore,
when does
it lose your
name?
The process
to secure it
and who’s
actually
getting to
use it and
stuff like that
[…] seems to
be set up
that it’s pretty
secure and
that not just
anybody can
walk in and
get this
information.
Data storage
More
information:
what/ / for how
long/ security
Physical storage
favoured over
virtual
environment
Ensure systems
are accessible
in future
Storage
We don’t know where the information’s stored
and who’s in control of it.
Data ownership
Low
familiarity
More
information:
type of data
owned
More
information:
data you
don’t use
Ownership
They are going to
have data coming
in, they’re going to
be processing it,
so although it
originally came
from a source, the
new information
that’s been
collected, is that
owned by them or
does [ownership]
still go back to the
original?
If it’s
about
me,
surely I
own the
data?
Transparency – why it matters?
“Just educate the public about the Data
Centres. If the public are aware of what’s
happening then they may not mind so much”
“A lot of the stigma that comes with data
sharing comes with people not knowing and not
being educated about the facts of how the
data’s being used.”
Value of private sector data for knowledge
and society
“Show the greater good of using my data – the
benefits of my data for the greater good.”
“I’d be really willing to sign up for this only if I
saw the benefits that my data provided. […] If
you see the impact that it has not only on your
life but on the life of the NHS as well and then
they are going to change their services, that’s
the greater good of it”.
Communication
Increased buy-in
Reassurances
and benefits
Demonstrate
impact
Data
management
processes
Turn the small print into big print!
Plain English please!
What is ‘good use’ of data for the benefit of society?
Before we came here, we had absolutely no idea what
the Data Centre was
Conclusions from the dialogue
o Sharing of information about how the Data Centres operate
significantly increased levels of understanding of the benefits of
social research using private sector data.
o Improved communication and education about the processes by
which the Data Centres acquire, store, own and access private sector
data is vitally important to establish greater credibility.
o Data Centres should use case studies to demonstrate how the use of
private sector data in social research can lead to policy or service
improvements.
o The more reassurances are given about the Data Centre processes
and the benefits of using private sector data the more buy-in can be
expected from members of the public.
Other reports (1)
Other reports (2)
[Figure: The “data trust deficit”. Chart plotting, for each organisation, “High trust in organisation generally (score 8-10)” against “High trust in data (score 8-10)”, with axis ticks from 0% to 50%. Organisations shown: British government, the ONS, GP surgery, the NHS, the police, academic researchers, charities, online retailers, the media / the press, insurance companies, telecommunications companies, local authority, supermarkets, internet companies.]
Questions asked:
“Please tell me on a score of 0-10 how much you personally trust each of the institutions below.”
“Please tell me on a score of 0-10 how much you personally trust each of the institutions below to use
your data appropriately.”
Base: 2,019 GB adults, aged 16-75
Source: Ipsos MORI. RSS Ipsos MORI Report (2014) Public attitudes to the use and sharing of their
data
Other reports (3)
Giving the last word to the participants…
Primary contact at ESRC:
Maria.Sigala@esrc.ac.uk
Thank you!
Vanessa Cuthill ESRC Deputy Director
(Evidence, Impact and Strategic Partnerships)

More Related Content

What's hot

Welcome to the Census Transformation Research Conference 2016
Welcome to the Census Transformation Research Conference 2016Welcome to the Census Transformation Research Conference 2016
Welcome to the Census Transformation Research Conference 2016
Office for National Statistics
 
Transforming the census to 2021 and beyond estonia
Transforming the census   to 2021 and beyond estoniaTransforming the census   to 2021 and beyond estonia
Transforming the census to 2021 and beyond estonia
Office for National Statistics
 
The integration of statistical and administrative data sources to increase po...
The integration of statistical and administrative data sources to increase po...The integration of statistical and administrative data sources to increase po...
The integration of statistical and administrative data sources to increase po...
Office for National Statistics
 
Plans for the online 2021 Census with increased use of administrative and sur...
Plans for the online 2021 Census with increased use of administrative and sur...Plans for the online 2021 Census with increased use of administrative and sur...
Plans for the online 2021 Census with increased use of administrative and sur...
UKDSCensus
 
Delivering early benefits and trial outputs using administrative data
Delivering early benefits and trial outputs using administrative dataDelivering early benefits and trial outputs using administrative data
Delivering early benefits and trial outputs using administrative data
UKDSCensus
 
Evaluating the feasibility of using administrative data in the context of cen...
Evaluating the feasibility of using administrative data in the context of cen...Evaluating the feasibility of using administrative data in the context of cen...
Evaluating the feasibility of using administrative data in the context of cen...
UKDSCensus
 
ONS presentation at RSS South Wales poverty & inequality stats event
ONS presentation at RSS South Wales poverty & inequality stats eventONS presentation at RSS South Wales poverty & inequality stats event
ONS presentation at RSS South Wales poverty & inequality stats event
Richard Tonkin
 
Webometrics analysis localmedia(1aug2011).pptx
Webometrics analysis localmedia(1aug2011).pptxWebometrics analysis localmedia(1aug2011).pptx
Webometrics analysis localmedia(1aug2011).pptxHan Woo PARK
 
Webometrics analysis localmedia
Webometrics analysis localmediaWebometrics analysis localmedia
Webometrics analysis localmediaHan Woo PARK
 
Webometrics analysis localmedia
Webometrics analysis localmediaWebometrics analysis localmedia
Webometrics analysis localmediaHan Woo PARK
 
Leicester City Covid-19 Testing Programme webinar
Leicester City Covid-19 Testing Programme webinarLeicester City Covid-19 Testing Programme webinar
Leicester City Covid-19 Testing Programme webinar
Association for Project Management
 
Big data and macroeconomic nowcasting from data access to modelling
Big data and macroeconomic nowcasting from data access to modellingBig data and macroeconomic nowcasting from data access to modelling
Big data and macroeconomic nowcasting from data access to modelling
Dario Buono
 
American Libraries Live: Raise Your Data Analytics Aptitude (September 2019)
American Libraries Live: Raise Your Data Analytics Aptitude (September 2019)American Libraries Live: Raise Your Data Analytics Aptitude (September 2019)
American Libraries Live: Raise Your Data Analytics Aptitude (September 2019)
ALAeLearningSolutions
 
IAOS 2018 - The changing role of the Census in Australia's integrated data la...
IAOS 2018 - The changing role of the Census in Australia's integrated data la...IAOS 2018 - The changing role of the Census in Australia's integrated data la...
IAOS 2018 - The changing role of the Census in Australia's integrated data la...
StatsCommunications
 
Linking PEPFAR Data To Gis Inglis
Linking PEPFAR Data To Gis   InglisLinking PEPFAR Data To Gis   Inglis
Linking PEPFAR Data To Gis InglisMEASURE Evaluation
 
Going Digital: Using Mobile Data Collection to Monitor ECD in South Africa by...
Going Digital: Using Mobile Data Collection to Monitor ECD in South Africa by...Going Digital: Using Mobile Data Collection to Monitor ECD in South Africa by...
Going Digital: Using Mobile Data Collection to Monitor ECD in South Africa by...
Khulisa Management Services
 
IAOS 2018 - Keeping official stats fit for building a wider evidence system, ...
IAOS 2018 - Keeping official stats fit for building a wider evidence system, ...IAOS 2018 - Keeping official stats fit for building a wider evidence system, ...
IAOS 2018 - Keeping official stats fit for building a wider evidence system, ...
StatsCommunications
 
Arrimage de données sociodémographiques et de santé pour un portrait micro‐te...
Arrimage de données sociodémographiques et de santé pour un portrait micro‐te...Arrimage de données sociodémographiques et de santé pour un portrait micro‐te...
Arrimage de données sociodémographiques et de santé pour un portrait micro‐te...
National Housing Research Committee - Comité national du recherche sur le logement
 
Ta4.05 mac gillivray.unwdf_macgillivray_ta4_05
Ta4.05 mac gillivray.unwdf_macgillivray_ta4_05Ta4.05 mac gillivray.unwdf_macgillivray_ta4_05
Ta4.05 mac gillivray.unwdf_macgillivray_ta4_05
Statistics South Africa
 

What's hot (19)

Welcome to the Census Transformation Research Conference 2016
Welcome to the Census Transformation Research Conference 2016Welcome to the Census Transformation Research Conference 2016
Welcome to the Census Transformation Research Conference 2016
 
Transforming the census to 2021 and beyond estonia
Transforming the census   to 2021 and beyond estoniaTransforming the census   to 2021 and beyond estonia
Transforming the census to 2021 and beyond estonia
 
The integration of statistical and administrative data sources to increase po...
The integration of statistical and administrative data sources to increase po...The integration of statistical and administrative data sources to increase po...
The integration of statistical and administrative data sources to increase po...
 
Plans for the online 2021 Census with increased use of administrative and sur...
Plans for the online 2021 Census with increased use of administrative and sur...Plans for the online 2021 Census with increased use of administrative and sur...
Plans for the online 2021 Census with increased use of administrative and sur...
 
Delivering early benefits and trial outputs using administrative data
Delivering early benefits and trial outputs using administrative dataDelivering early benefits and trial outputs using administrative data
Delivering early benefits and trial outputs using administrative data
 
Evaluating the feasibility of using administrative data in the context of cen...
Evaluating the feasibility of using administrative data in the context of cen...Evaluating the feasibility of using administrative data in the context of cen...
Evaluating the feasibility of using administrative data in the context of cen...
 
ONS presentation at RSS South Wales poverty & inequality stats event
ONS presentation at RSS South Wales poverty & inequality stats eventONS presentation at RSS South Wales poverty & inequality stats event
ONS presentation at RSS South Wales poverty & inequality stats event
 
Webometrics analysis localmedia(1aug2011).pptx
Webometrics analysis localmedia(1aug2011).pptxWebometrics analysis localmedia(1aug2011).pptx
Webometrics analysis localmedia(1aug2011).pptx
 
Webometrics analysis localmedia
Webometrics analysis localmediaWebometrics analysis localmedia
Webometrics analysis localmedia
 
Webometrics analysis localmedia
Webometrics analysis localmediaWebometrics analysis localmedia
Webometrics analysis localmedia
 
Leicester City Covid-19 Testing Programme webinar
Leicester City Covid-19 Testing Programme webinarLeicester City Covid-19 Testing Programme webinar
Leicester City Covid-19 Testing Programme webinar
 
Big data and macroeconomic nowcasting from data access to modelling
Big data and macroeconomic nowcasting from data access to modellingBig data and macroeconomic nowcasting from data access to modelling
Big data and macroeconomic nowcasting from data access to modelling
 
American Libraries Live: Raise Your Data Analytics Aptitude (September 2019)
American Libraries Live: Raise Your Data Analytics Aptitude (September 2019)American Libraries Live: Raise Your Data Analytics Aptitude (September 2019)
American Libraries Live: Raise Your Data Analytics Aptitude (September 2019)
 
IAOS 2018 - The changing role of the Census in Australia's integrated data la...
IAOS 2018 - The changing role of the Census in Australia's integrated data la...IAOS 2018 - The changing role of the Census in Australia's integrated data la...
IAOS 2018 - The changing role of the Census in Australia's integrated data la...
 
Linking PEPFAR Data To Gis Inglis
Linking PEPFAR Data To Gis   InglisLinking PEPFAR Data To Gis   Inglis
Linking PEPFAR Data To Gis Inglis
 
Going Digital: Using Mobile Data Collection to Monitor ECD in South Africa by...
Going Digital: Using Mobile Data Collection to Monitor ECD in South Africa by...Going Digital: Using Mobile Data Collection to Monitor ECD in South Africa by...
Going Digital: Using Mobile Data Collection to Monitor ECD in South Africa by...
 
IAOS 2018 - Keeping official stats fit for building a wider evidence system, ...
IAOS 2018 - Keeping official stats fit for building a wider evidence system, ...IAOS 2018 - Keeping official stats fit for building a wider evidence system, ...
IAOS 2018 - Keeping official stats fit for building a wider evidence system, ...
 
Arrimage de données sociodémographiques et de santé pour un portrait micro‐te...
Arrimage de données sociodémographiques et de santé pour un portrait micro‐te...Arrimage de données sociodémographiques et de santé pour un portrait micro‐te...
Arrimage de données sociodémographiques et de santé pour un portrait micro‐te...
 
Ta4.05 mac gillivray.unwdf_macgillivray_ta4_05
Ta4.05 mac gillivray.unwdf_macgillivray_ta4_05Ta4.05 mac gillivray.unwdf_macgillivray_ta4_05
Ta4.05 mac gillivray.unwdf_macgillivray_ta4_05
 

Similar to Opportunities for alternative data sources

Geospatial Intelligence Middle East 2013_Big Data_Steven Ramage
Geospatial Intelligence Middle East 2013_Big Data_Steven RamageGeospatial Intelligence Middle East 2013_Big Data_Steven Ramage
Geospatial Intelligence Middle East 2013_Big Data_Steven Ramage
Steven Ramage
 
Bigdata and Hadoop with applications
Bigdata and Hadoop with applicationsBigdata and Hadoop with applications
Bigdata and Hadoop with applications
Padma Metta
 
Big Data World
Big Data WorldBig Data World
Big Data World
Hossein Zahed
 
Big Data with IOT approach and trends with case study
Big Data with IOT approach and trends with case studyBig Data with IOT approach and trends with case study
Big Data with IOT approach and trends with case study
Sharjeel Imtiaz
 
The data we want
The data we wantThe data we want
The data we want
Elena Simperl
 
Power Decision-making at Scale with Address-based Spatial Data Science
Power Decision-making at Scale with Address-based Spatial Data SciencePower Decision-making at Scale with Address-based Spatial Data Science
Power Decision-making at Scale with Address-based Spatial Data Science
Precisely
 
Get Started with Data Science by Analyzing Traffic Data from California Highways
Get Started with Data Science by Analyzing Traffic Data from California HighwaysGet Started with Data Science by Analyzing Traffic Data from California Highways
Get Started with Data Science by Analyzing Traffic Data from California HighwaysAerospike, Inc.
 
Data analytics introduction
Data analytics introductionData analytics introduction
Data analytics introduction
amiyadash
 
MUNICIPAL3 Mobile Recycling Inventory, Jerrard Whitten
MUNICIPAL3 Mobile Recycling Inventory, Jerrard WhittenMUNICIPAL3 Mobile Recycling Inventory, Jerrard Whitten
MUNICIPAL3 Mobile Recycling Inventory, Jerrard Whitten
MassRecycleR32014
 
00-01 DSnDA.pdf
00-01 DSnDA.pdf00-01 DSnDA.pdf
00-01 DSnDA.pdf
SugumarSarDurai
 
Interesting ways Big Data is used today
Interesting ways Big Data is used todayInteresting ways Big Data is used today
Interesting ways Big Data is used today
Daniel Sârbe
 
Big Data Real Time Training in Chennai
Big Data Real Time Training in ChennaiBig Data Real Time Training in Chennai
Big Data Real Time Training in Chennai
Vijay Susheedran C G
 
Big Data 101 - An introduction
Big Data 101 - An introductionBig Data 101 - An introduction
Big Data 101 - An introductionNeeraj Tewari
 
Data science and visualization lab presentation
Data science and visualization lab presentationData science and visualization lab presentation
Data science and visualization lab presentation
iHub Research
 
open-data-presentation.pptx
open-data-presentation.pptxopen-data-presentation.pptx
open-data-presentation.pptx
DennicaRivera
 
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
IT Network marcus evans
 
Data Science At Zillow
Data Science At ZillowData Science At Zillow
Data Science At Zillow
Nicholas McClure
 
Computational intelligence for big data analytics bda 2013
Computational intelligence for big data analytics   bda 2013Computational intelligence for big data analytics   bda 2013
Computational intelligence for big data analytics bda 2013
oj08
 
What’s State of the Data?
What’s State of the Data?What’s State of the Data?
What’s State of the Data?
National Geospatial-Intelligence Agency
 
High Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run TimeHigh Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run Time
Geoffrey Fox
 

Opportunities for alternative data sources

  • 1. Opportunities for alternative data sources Chair: Helen James, Head of Department of Law, University of Winchester
  • 2. Opportunities for alternative data sources to support the Census Jane Naylor, 29th June
  • 3. Data sources for official statistics • Surveys – e.g. of businesses and households • Census – every 10 years • Administrative data – by-product of Government process • Big Data? ‘Data that is difficult to collect, store or process within the conventional systems of statistical organizations. Either, their volume, velocity, structure or variety requires the adoption of new statistical software processing techniques and/or IT infrastructure to enable cost-effective insights to be made.’ (UNECE, 2013)
  • 4. Demographics, population flows: mobile phone data
  • 5. Twitter Rationale: Using geo-located Twitter data to gain new insights into mobility and migration • 7 months of geo-located tweets within Great Britain (about 100 million data points) • Methodology to infer place of usual residence: - Identify user ‘anchor points’ by clustering tweets using the DBSCAN algorithm - Identify residential anchor points using AddressBase and nearest neighbour analysis Figure: Geolocated penetration rates by local authority
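The first step of the methodology on slide 5, clustering a user's tweets into 'anchor points' with DBSCAN, can be sketched roughly as follows. This is a minimal pure-Python illustration, not the ONS implementation: the coordinates, the `eps` radius, the `min_pts` threshold and the helper name `anchor_points` are all assumptions made for the example, and the second step (matching anchor points to AddressBase) is not shown.

```python
from math import hypot


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # All points (including i itself) within eps of point i.
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


def anchor_points(points, labels):
    """Hypothetical helper: centroid and tweet count for each cluster."""
    sums = {}
    for (x, y), label in zip(points, labels):
        if label == -1:
            continue
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n, n) for label, (sx, sy, n) in sums.items()}


# Made-up tweet locations for one user, in metres on a projected grid.
tweets = [(0, 0), (10, 0), (0, 10), (10, 10),
          (1000, 1000), (1005, 1000), (1000, 1005),
          (5000, 5000)]
labels = dbscan(tweets, eps=20, min_pts=3)
anchors = anchor_points(tweets, labels)
```

In this toy input the two dense groups become two anchor points and the isolated tweet is treated as noise. A density-based method suits this task because the number of places a user tweets from is not known in advance.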
  • 6. Use case: Student mobility
  • 9. Electricity: smart meter. Half-hourly electricity consumption over 7 days at one meter, through 28 consecutive 7-day periods.
  • 12. Opportunities and challenges Current Census priorities: • Mobile phone data for population mobility • Intelligence on addresses
  • 13. Using data from property websites and aerial imagery to support the Census Karen Gask ONS Big Data team
  • 14. Plan for today • How property website data can be used to improve statistics in areas with caravan parks • What analysis of aerial imagery can tell us about caravan parks • Your feedback
  • 15. Challenges for Census enumeration - where Big Data can help • Address intelligence is required to effectively plan enumeration resources • Understanding where knowledge gaps exist • Help identify where there are access issues or new builds
  • 18. Potential benefits of property data for Census Improve understanding of small areas by identifying: • High proportions of rental properties • Unusual properties which may not be captured well in the Address Register (house boats, caravan homes, beach huts) • Areas where there may be access issues or new builds Provide some limited information on tenure of private sector housing for Administrative Data Census
  • 19. Work undertaken Aim: investigate methods of machine learning which could accurately identify, or distinguish between, traits of interest within property data in an automated way Rationale: automated classification could allow targeted field work and inform enumeration resource allocation
  • 20. Collecting data from these websites
  • 21. Collecting data from these websites • Could have ‘web scraped’ html code behind websites to capture this data − But many websites prohibit this • Zoopla provides (limited) data for free via an API (Application Programming Interface) − Successfully collected data about 60,000 properties for sale or for rent
  • 22. Early results • Identified caravan homes with good accuracy using price, property description, number of bedrooms and property type • Distinguished between holiday and residential caravan homes with reasonable accuracy (although sample size is small) • Currently working on analysing property description to identify gated communities
  • 23. Identifying caravan homes (1) • Developed machine learning methods such as logistic regression, decision trees and support vector machines • Example decision tree: Is property type “mobile/park home”? (yes / no) Does property description contain “holiday park”? (no / yes) Leaves: Caravan home (n=180), Caravan home (n=55), Not a caravan home (n=17,501)
  • 24. Identifying caravan homes (2) • Support vector machines were the most accurate method [Figure: linear hyperplane with support vectors, and non-linear hyperplane, on x and y axes] • Confusion matrix for an unseen testing set of 7,230 (predicted vs actual; classes in the order not a caravan home, caravan home): Not a caravan home: 7,113, 18; Caravan home: 0, 99
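The figures in the slide 24 confusion matrix can be turned into summary rates with a few lines of arithmetic. The sketch below assumes that rows are actual classes and columns are predicted classes, which the flattened slide does not state outright. Overall accuracy is unaffected by that choice, but precision and recall would swap if the orientation were the other way round.

```python
# Counts copied from slide 24 (unseen testing set of 7,230 properties).
# ASSUMPTION: rows are actual classes, columns are predicted classes.
tn, fp = 7113, 18   # actual "not a caravan home"
fn, tp = 0, 99      # actual "caravan home"

total = tn + fp + fn + tp

# Share of all test properties classified correctly (orientation-independent).
accuracy = (tn + tp) / total

# Share of predicted caravan homes that really are caravan homes.
precision = tp / (tp + fp)

# Share of actual caravan homes that the classifier found.
recall = tp / (tp + fn)

print(f"total={total} accuracy={accuracy:.4f} "
      f"precision={precision:.3f} recall={recall:.3f}")
```

Because caravan homes are rare in the test set, accuracy alone is dominated by the majority class, so the per-class rates are worth reporting alongside it.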
  • 25. Distinguishing between holiday and residential caravan homes • Classified 500 caravan home descriptions • Split descriptions into words then correlated each word with holiday / residential classification • Small sample size so there is some overfitting
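The word-level approach on slide 25 (split descriptions into words, then correlate each word with the holiday / residential label) might look something like the sketch below. The four descriptions and their labels are invented for illustration, and the choice of a Pearson correlation on word presence is an assumption, since the slide does not say which correlation measure was used.

```python
from math import sqrt


def word_label_correlations(descriptions, labels):
    """Pearson correlation between each word's presence (0/1) and the label (0/1)."""
    docs = [set(text.lower().split()) for text in descriptions]
    vocab = sorted(set().union(*docs))
    n = len(docs)
    mean_y = sum(labels) / n
    var_y = sum((y - mean_y) ** 2 for y in labels)
    correlations = {}
    for word in vocab:
        x = [1 if word in doc else 0 for doc in docs]
        mean_x = sum(x) / n
        var_x = sum((xi - mean_x) ** 2 for xi in x)
        if var_x == 0 or var_y == 0:
            continue  # word appears everywhere or nowhere: correlation undefined
        cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, labels))
        correlations[word] = cov / sqrt(var_x * var_y)
    return correlations


# Hypothetical examples; 1 = holiday caravan home, 0 = residential.
descriptions = [
    "holiday park caravan with sea views",
    "residential park home for over 50s",
    "holiday caravan on family park",
    "residential home in quiet park",
]
labels = [1, 0, 1, 0]
correlations = word_label_correlations(descriptions, labels)
```

With only a handful of labelled descriptions, many words correlate perfectly with the label by chance, which is the overfitting risk the slide mentions for a sample of 500.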
  • 26. Gated communities • Currently exploring use of Natural Language Processing on property description • Want “set in a private gated development” but not “…gated side access to the garden”
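The distinction drawn on slide 26 can be illustrated with a simple pattern match. This is a deliberately crude stand-in for the Natural Language Processing the slide refers to, not the method under exploration: the pattern and the list of nouns (development, community, estate, complex) are assumptions chosen for the example.

```python
import re

# "gated", optionally one intervening word, then a noun that suggests a whole
# site rather than a single garden gate. The noun list is an assumption.
GATED_COMMUNITY = re.compile(
    r"\bgated\s+(?:\w+\s+)?(?:development|community|estate|complex)\b",
    re.IGNORECASE,
)


def mentions_gated_community(description):
    """Return True if the description appears to describe a gated development."""
    return GATED_COMMUNITY.search(description) is not None


wanted = mentions_gated_community("Set in a private gated development")
unwanted = mentions_gated_community("Benefits from gated side access to the garden")
```

A fixed pattern like this will miss unexpected phrasings, which is one reason to look at more general language-processing methods.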
  • 27. Limitations • Only provides data about properties for sale or rent • Zoopla does not cover all properties for sale or rent • Some properties have no description or a very small description • Sample of data we have collected is small for unusual properties
  • 28. Next steps • Data shows promise but we have collected all the free data we can (nearly 60,000 records) • Soon to issue a tender to purchase data for Census Test areas to test methods in 2017 • Understand how this data could improve statistics in areas with caravan parks
  • 30. Potential benefits of image data for Census • Similar to property data – image data could help fill knowledge gaps by identifying: • the number of properties in a given area • properties which are similar / different • properties with particular features • Images can be more timely than field intelligence • Images can provide more cost effective insight than field intelligence
  • 31. Work undertaken Aim: To explore the utility of aerial and satellite imagery for official statistics through a pilot study of caravan site images Rationale: This could improve statistics in areas with caravan parks, which are historically considered 'hard to count' within the Census Address Register
  • 32. Collecting data • Data are obtained from Google's API (Application Programming Interface) for free • There is a limit on the image dimensions and the amount of data available for free (e.g. downloading images of the New Forest took 2 days) • All downloaded images have the same dimensions and the same level of magnitude • Google takes care of some pre-processing: blending images together, adjusting colours
  • 33. Pre-processing Machine learning requires ‘training data’ where the objects of interest are correctly labelled - Circa 60 images were manually labelled before analysis To artificially increase the size of the dataset images were augmented by • rotation, • flipping and • translation.
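The three augmentation operations listed on slide 33 (rotation, flipping and translation) can be sketched on a tiny image represented as a list of pixel rows. This is an illustrative pure-Python version under assumed choices: 90-degree rotations, a horizontal flip, a one-pixel shift with zero fill, and the function names are all hypothetical rather than taken from the pilot study.

```python
def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def translate_right(image, shift, fill=0):
    """Shift pixels right by `shift` columns, padding the left edge with `fill`."""
    width = len(image[0])
    shift = min(shift, width)
    return [[fill] * shift + row[:width - shift] for row in image]


def augment(image):
    """Return the original plus rotated, flipped and shifted copies."""
    variants = [image]
    rotated = image
    for _ in range(3):                 # 90, 180 and 270 degrees
        rotated = rotate90(rotated)
        variants.append(rotated)
    variants.append(flip_horizontal(image))
    variants.append(translate_right(image, 1))
    return variants


patch = [[1, 2],
         [3, 4]]
variants = augment(patch)
```

Each labelled image yields several training examples that keep the same label, which is how around 60 manually labelled images can be stretched into a larger training set.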
  • 34. Object recognition Used the following machine learning techniques (plus others): • Logistic regression • Random Forests • Support Vector Machines • But artificial neural networks worked best
  • 35. Output • Heat map of probabilities that there is a caravan at a given spot/patch of the image • Accuracy (for single patches) 97%
  • 36. Limitations Limitations of the free data: • Quality of the images, consistency of colours (white balance, season) • Timeliness of the data (e.g. Google satellite imagery is up to 3 years old) Algorithm limitations: • Humans can't get it 100% right, so 97% seems good • But even small error rates lead to a large number of false positives when the classification is deployed over a large area
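The last point on slide 36 is a base-rate effect, and a short calculation makes it concrete. The sketch below assumes the 97% patch accuracy applies equally to patches with and without caravans. The one million patches and the 0.1% prevalence are hypothetical figures chosen only to show the scale of the problem.

```python
def expected_detections(n_patches, prevalence, sensitivity, specificity):
    """Expected true positives, false positives and precision for a deployment."""
    positives = n_patches * prevalence
    negatives = n_patches - positives
    true_pos = positives * sensitivity
    false_pos = negatives * (1 - specificity)
    precision = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, precision


# ASSUMPTIONS: 1,000,000 patches, 0.1% of which contain a caravan,
# and 97% accuracy on both classes.
true_pos, false_pos, precision = expected_detections(
    n_patches=1_000_000, prevalence=0.001, sensitivity=0.97, specificity=0.97
)
print(f"true positives ~{true_pos:.0f}, false positives ~{false_pos:.0f}, "
      f"precision ~{precision:.1%}")
```

Under these assumed figures the false alarms far outnumber the real caravans, which is why the following slides focus on large clusters rather than individual detections.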
  • 37. So what can we do? • Identify deficiencies in the Address Register used for Census • Maybe the accuracy is not good enough for individual caravans, but it can still help with caravan parks • Focus on large clusters and compare them with Address Register
  • 38. For example these sites Address Register: 21 caravans Algorithm: 188 caravans Address Register: 3 caravans Algorithm: 121 caravans
  • 39. Or these sites Address Register: 0 caravans Algorithm: 61 caravans Address Register: 0 caravans Algorithm: 21 caravans?
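The comparison shown on slides 38 and 39 amounts to ranking sites by the gap between the algorithm's count and the Address Register count. A minimal sketch follows, using the four pairs of counts from those slides. The site names, the `min_gap` threshold of 50 and the function name are hypothetical.

```python
# (site name, Address Register count, algorithm count); counts from slides 38-39,
# site names are placeholders.
sites = [
    ("site A", 21, 188),
    ("site B", 3, 121),
    ("site C", 0, 61),
    ("site D", 0, 21),
]


def flag_discrepancies(sites, min_gap):
    """Return (name, gap) for sites where the algorithm finds at least
    `min_gap` more caravans than the Address Register, largest gap first."""
    flagged = [(name, algorithm - register)
               for name, register, algorithm in sites
               if algorithm - register >= min_gap]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


flagged = flag_discrepancies(sites, min_gap=50)
```

Setting a minimum gap keeps the focus on large clusters, where individual false positives matter less than they do for isolated detections.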
  • 40. Similar housing • Find similar buildings (e.g. terraced houses) • Heat map of similarity of each spot/patch to the central one
  • 41. Next steps Short term: • Use discrepancies between algorithm and Address Register in Census Tests • Include more data sources, e.g. LIDAR which captures the height of mapped objects Long term / other applications: • Land use classification (sustainable development, crop types) • Population density estimation
  • 43. Your feedback please • Questions or comments on this work • Can you think of other applications for this type of data or methods? • Is there similar work happening elsewhere? • Can you think of other ‘big data’ sources we haven’t considered yet for Census? Comments / questions are welcome now or Jane and I will be around for lunch and the rest of the day (or ons.big.data.project@ons.gov.uk)
  • 44. What do we understand about public acceptability of using administrative and other data for research? Vanessa Cuthill, Deputy Director, ESRC
  • 45. Outline • Context • ESRC Dialogues • Other reports • Views from participants
  • 47. Dialogues – what have we done? • ESRC: 2013, Public dialogues on using admin data (ESRC, ONS, Ipsos Mori); 2015, Big Data: Public views on the use of private sector data for social research (ESRC, Hopkins Van Mil) • Others: 2014, Powers and perils of data (Ipsos Mori); 2014, Public attitudes to the use and sharing of their data (RSS and Ipsos Mori); 2015, Private Lives? (MRS Report)
  • 48. Public dialogues on using admin data
  • 49. Why did ESRC embark on the dialogues on data in 2013? • Rapidly evolving data landscape - Administrative Data Taskforce • We wanted to: - Better understand people’s views on the linking of admin data - Begin the process of creating a terminology describing the re-use of administrative data and data linking that is understandable to the general public - Help inform the development of the governance and operational procedures of the ADRCs and provide data on public attitudes to inform their future strategies and priorities
  • 50. Background • Throughout October - November 2013, public dialogues were held in 7 locations across the UK (led by Ipsos MORI) • The aims were to: - Better understand the cultural barriers around linking administrative data - Begin the process of creating a language that is meaningful and accessible to the public - Test the public perceptions of the rules that ESRC ADRCs will be subject to and provide the ADRCs with data on public attitudes and appetite for engagement - (Provide ONS with more detailed evidence on public views of their current front-running option for Beyond 2011)
  • 51. Support for initiative IF: 1. The data is linked for socially beneficial purposes “As long as it’s used for good, like to develop things, improve services, improve knowledge.” Belfast 2. It is fully de-identified – partial vs full postcodes 3. It is kept secure at all times – concerns around remote access 4. No commercial gain for business including commercial access BUT participants needed extensive information and discussions with experts and researchers in order to be satisfied that these conditions would be met under the ADRN plans. So simply publicising these conditions may not be enough to ensure that the general public are reassured about or support the work of the ADRN.
  • 53. Impact of this dialogue • Report shared with ADRN • Informed decisions of the ADRN Management Committee for the ADRN policies and procedures ▶ Lay membership in ADRN Board and Approvals Panel • Short animated videos on ADRN website to help explain: Data linkage https://www.youtube.com/watch?v=E3e4D2bHxa8&feature=youtu.be Protecting Privacy https://www.youtube.com/watch?v=nnxz3_XGMAE&feature=youtu.be • Clear policies and ‘5 Safes’ protocols: Safe People, Safe Projects, Safe Settings, Safe Outputs, Safe Data
  • 54. Big Data: Public views on the use of private sector data for social research A Findings Report for the Economic and Social Research Council
  • 55. Aim and objectives of the dialogue To explore public views on access to and the use of data from private sector organisations for research purposes in the context of three Data Research Centres funded by the ESRC. o To identify areas of public concern about confidentiality and privacy impact o To start creating a language around private sector data and the use for research purposes o To test public understanding of: data ownership, data acquisition, data access, using/ re-using private sector data, data storage and preservation
  • 57. Public views on data collection by the private sector “It’s just the way we live” • Internet • GPS tracking devices • Cards • Public places • “No way you can opt out of giving data unless you live like a hermit in the middle of an island.”
  • 58. Particular Concerns • Lack of transparency & information • Passing/selling data to others • Keeping data safe • Linking data • Intentions for data collection and use • DPA: Principles and Sanctions
  • 59. Examples of public concerns “What people are worried about is that it’s not going to be kept just within. It might get sold to insurance companies, employers and this is where people want to know that it’s going to be safe.” “The more data you add the more it’s creating this sense of identity of each person, so it’s almost like everyone’s got this data avatar that’s building up as we get older.” “I try to avoid using the internet for purchases specifically because I don’t want my data collected and then used for sales purposes afterwards.” (on DPA) “Things like ‘used in a way that is adequate, relevant and not excessive’ what does that mean? Who decides what that is? Do you get to decide that yourself when you’re doing research?”
  • 61. Data acquisition • Only acquire data from trustworthy companies • Ensure accuracy and relevance of data • Work as much as possible with anonymised data …. • Test the provenance of data sets • Little support for payment for data sets • Improved information …. • Accurate and up to date data is vital • It is acceptable to share data for the good of society – but we need to know how it will be used • I don’t think it’s right that they [the Data Centres] should buy data, it’s public money so they should be spending it on the public
  • 62. Data access • Trust in processes • Approval for access procedures researchers • Avoid clash of interests .... • Secure setting favoured over virtual environment • Consent for use of personal data • Clear communication about purpose of research .... • Big data is a broad term, at what point is it not personal anymore, when does it lose your name? • The process to secure it and who’s actually getting to use it and stuff like that […] seems to be set up that it’s pretty secure and that not just anybody can walk in and get this information.
  • 63. Data storage • More information: what/ / for how long/ security • Physical storage favoured over virtual environment • Ensure systems are accessible in future • We don’t know where the information’s stored and who’s in control of it.
  • 64. Data ownership • Low familiarity • More information: type of data owned • More information: data you don’t use • They are going to have data coming in, they’re going to be processing it, so although it originally came from a source, the new information that’s been collected, is that owned by them or does [ownership] still go back to the original? • If it’s about me, surely I own the data?
  • 65. Transparency – why it matters? “Just educate the public about the Data Centres. If the public are aware of what’s happening then they may not mind so much” “A lot of the stigma that comes with data sharing comes with people not knowing and not being educated about the facts of how the data’s being used.”
  • 66. Value of private sector data for knowledge and society “Show the greater good of using my data – the benefits of my data for the greater good.” “I’d be really willing to sign up for this only if I saw the benefits that my data provided. […] If you see the impact that it has not only on your life but on the life of the NHS as well and then they are going to change their services, that’s the greater good of it”.
  • 67. Communication • Increased buy-in • Reassurances and benefits • Demonstrate impact • Data management processes • Turn the small print into big print! • Plain English please! • What is ‘good use’ of data for the benefit of society? • Before we came here, we had absolutely no idea what the Data Centre was
  • 68. Conclusions from the dialogue o Sharing of information about how the Data Centres operate significantly increased levels of understanding of the benefits of social research using private sector data. o Improved communication and education about the processes by which the Data Centres acquire, store, own and access private sector data is vitally important to establish greater credibility. o Data Centres should use case studies to demonstrate how the use of private sector data in social research can lead to policy or service improvements. o The more reassurances are given about the Data Centre processes and the benefits of using private sector data, the more buy-in can be expected from members of the public.
  • 71. The “data trust deficit” (RSS / Ipsos MORI Report, 2014, Public attitudes to the use and sharing of their data) [Chart: high trust in organisation generally (score 8-10) plotted against high trust in data (score 8-10), both axes running from 0% to 50%, for: British government, the ONS, GP surgery, the NHS, the police, academic researchers, charities, online retailers, the media / the press, insurance companies, telecommunications companies, local authority, supermarkets, internet companies] Questions asked: “Please tell me on a score of 0-10 how much you personally trust each of the institutions below.” “Please tell me on a score of 0-10 how much you personally trust each of the institutions below to use your data appropriately.” Base: 2,019 GB adults, aged 16-75. Source: Ipsos MORI
  • 73. Giving the last word to the participants… Primary contact at ESRC: Maria.Sigala@esrc.ac.uk Thank you! Vanessa Cuthill ESRC Deputy Director (Evidence, Impact and Strategic Partnerships)