BigSurv18 – A Conference
and a Monograph
Lars Lyberg, Inizio
Presentation at Frimis, November 28, 2018
Top stories - Read all about them
1. Anyone can take pictures of you from a
satellite and there is nothing you can
do about it
2. Better data can drive travel and
conference savings
3. The Health and Human Services
Department will mine data from its
internal social network
4. Vietnam taps Big Data to avoid China’s
traffic chaos
5. Tweets can foretell votes
A black swan
A black swan is an undirected and unpredicted event.
It is rare and has an extreme impact, but in retrospect we saw it coming
• Internet - yes
• 9/11 - yes
• The Lehman Brothers crash - yes
• Decreasing response rates - yes
• The advent of Big Data - not really
• New data sources other than BD - yes
• Nonprobability sampling - yes and no
Monograph Contents
● The new survey landscape
● Total error and data quality
● Big data in official statistics
● Combining big data with survey statistics: methods and
applications
● Combining big data with survey statistics: tools
● Regulations, ethics, privacy
A Couple of Giants
Sir Ronald Fisher and Jerzy Neyman
A Stunning Statement Made by
One of the Keynote Speakers
Surveys are the last resort
Design, Measurement, Inference: Adaptation to Available Data Resource
[Figure: four approaches arranged along the axis “Information Content of Available Data”, with ends labelled High and Low]
● Traditional Survey Design, Measurement and Inference
● Model-Assisted Survey Design, Inference
● Survey-Assisted Modeling
● Modeling, Data Mining, “Analytics”
Source: Heeringa 2018
Why Did Data Become “BIG”
● Technological advances associated with data science and
computational tools and methods.
● Information-based Decision Making
– “Evidence-based”, “Data-Driven”, “Analytics”, “Machine Learning”
● Focus on short-run prediction
– Business decision making
– Health risks (e.g. Google Flu)
– Financial markets
– Political processes
● Style points: “Tail Fins”
Source: Heeringa 2018
The “V” Taxonomy for Big Data: 3, 4, 5 (?, Variability, Value)
Some Concepts
● Artificial Intelligence - machines being able to carry out
tasks in a smart way
● Machine Learning - application of AI where we give
machines access to data and let them learn for
themselves via neural networks and natural language
processing
● Data Mining - builds intuition about what is really
happening in some data
● Data Science - combines the application of computer
science, statistics, programming and business
management
Hype of Big Data
Gartner’s hype curve
Source: Wikipedia
An example of Big Data analytics
● Evolv has a database containing information on 984 000
hourly workers in 20 companies
● Data sources: online employee background checks, time
and attendance tracking software, and performance
ranking programs
● Selected results:
– Employees with a criminal record performed slightly
better than others
– Experienced employees did no better than the
inexperienced
An example of Big Data analytics (cont’d)
● Potential problems:
– Weak conceptualization
– Not clear if data sources are compatible across clients
– No background variables
– No questions asked to real persons
– Inference is based on big data rather than theory
– The analysis might be business-driven
Having said that:
How can data on 984 000 workers be used effectively?
Happiness and Well-being
The common survey question: How satisfied are
you with your life?
BD alternative
• 10 million tweets that are coded for happiness
(rainbow, love, beauty, hope, wonderful,
wine…) and non-happiness (damn, boo, ugly,
smoke, hate, lied,…)
• Happiest states: Hawaii, Utah, Idaho, Maine,
Washington
• Saddest states: Louisiana, Mississippi,
Maryland, Michigan, Delaware
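The word-list coding described above can be sketched as follows. This is a minimal illustration, not the method used in the study behind the slide: the two word sets contain only the examples given on the slide, and the function names, the (state, text) input structure, and the sample tweets are all assumptions made for the sketch.

```python
import re

# Word lists taken from the slide; a real lexicon would be far larger.
HAPPY = {"rainbow", "love", "beauty", "hope", "wonderful", "wine"}
UNHAPPY = {"damn", "boo", "ugly", "smoke", "hate", "lied"}


def happiness_score(tweet: str) -> int:
    """Count happy words minus unhappy words in one tweet."""
    words = re.findall(r"[a-z']+", tweet.lower())
    return sum(w in HAPPY for w in words) - sum(w in UNHAPPY for w in words)


def mean_score_by_state(tweets):
    """Average the per-tweet score within each state.

    `tweets` is an iterable of (state, text) pairs; this structure and
    the sample data below are hypothetical.
    """
    totals, counts = {}, {}
    for state, text in tweets:
        totals[state] = totals.get(state, 0) + happiness_score(text)
        counts[state] = counts.get(state, 0) + 1
    return {state: totals[state] / counts[state] for state in totals}


# Invented sample tweets, for illustration only.
sample = [
    ("Hawaii", "Rainbow over the beach, wonderful day"),
    ("Hawaii", "I love this place"),
    ("Louisiana", "Damn traffic, I hate it"),
]
scores = mean_score_by_state(sample)
```

A score of this kind measures word use among people who tweet, not life satisfaction in the population, which is one reason to compare it against the survey question rather than treat it as a replacement.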
The Potential Use of Big
Data in Statistics
Production
● Produce statistics based on BD that
can replace surveys
● Combine BD with admin data, sample
surveys, and nonprobability sources
in order to improve statistics
● Explore new topics and concepts
● Data mining to identify new patterns
and models
Examples of Sources of Data
● Censuses
● Other survey programs
● Administrative data systems
● Medical records systems
● Commercially compiled data
● Financial data
● Satellite imagery
● GPS and GIS
● Social media
● Mobile devices
● Wearable measurement
devices
● Sensors (Internet of
Things)
● Visual data: pictures and
video
● Genetic profile data
● Transactional data systems
Adapted from Heeringa 2018
AIS data
● AIS - Automatic Identification System
● Data can be used to follow all vessels
● Messages from vessels are transmitted
at high speed
● Monitor marine traffic in real time
Source: www.marinetraffic.com
Improve current statistics
and produce new statistics
(de Wit et al. 2017)
● Statistics on marine traffic
to estimate emissions and
identify areas with heavy
traffic
● Port statistics to monitor
how vessels move between
different ports
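One way to picture the step from raw AIS messages to a traffic statistic is to count distinct vessels per grid cell. The sketch below is only an illustration under stated assumptions: the (vessel_id, lat, lon) message layout, the one-degree cell size, and the sample positions are invented, and real AIS messages carry many more fields.

```python
import math
from collections import defaultdict


def vessels_per_cell(messages, cell_deg=1.0):
    """Count distinct vessels seen in each latitude/longitude grid cell.

    `messages` is an iterable of (vessel_id, lat, lon) tuples, a
    simplified and hypothetical stand-in for real AIS position reports.
    """
    seen = defaultdict(set)
    for vessel_id, lat, lon in messages:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        seen[cell].add(vessel_id)
    return {cell: len(ids) for cell, ids in seen.items()}


# Invented sample positions, for illustration only.
messages = [
    ("A", 57.7, 11.9),
    ("A", 57.8, 11.8),  # same vessel reporting again in the same cell
    ("B", 57.6, 11.7),
    ("C", 59.3, 18.1),
]
counts = vessels_per_cell(messages)
busiest = max(counts, key=counts.get)  # cell with the most distinct vessels
```

Counting distinct vessels rather than messages keeps a single vessel that reports often from inflating the traffic measure for its cell.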
Combining different data
sources - example from Statistics
Sweden
The left map shows green areas in Lidingö, Stockholm,
Sweden. Combining these data with data from the real estate
register gives the green areas that are accessible to the
general public (the right map).
Source: SCB. Bakgrundskarta © Lantmäteriet
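The combination described above amounts to a join between a land-cover layer and a register. The following is a minimal sketch, not Statistics Sweden's actual procedure: the parcel identifiers, the areas, the owner categories, and the rule that publicly owned parcels count as accessible are all assumptions made for illustration.

```python
# Hypothetical land-cover layer: parcel id -> green area in hectares.
green_area_ha = {"P1": 4.0, "P2": 1.5, "P3": 2.5}

# Hypothetical real estate register: parcel id -> owner category.
register = {"P1": "municipality", "P2": "private", "P3": "state"}

# Assumption for this sketch: publicly owned parcels count as accessible.
PUBLIC_OWNERS = {"municipality", "state"}


def accessible_green_area(green, owners):
    """Keep only green parcels whose registered owner is public.

    Parcels missing from the register are treated as not accessible.
    """
    return {
        parcel: area
        for parcel, area in green.items()
        if owners.get(parcel) in PUBLIC_OWNERS
    }


accessible = accessible_green_area(green_area_ha, register)
total_accessible = sum(accessible.values())
```

The result depends on how well the two sources share a common key and on the accessibility rule chosen, so both are worth documenting alongside the statistic.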
Combining Different Data Sources -
Example from Statistics Netherlands
● Solar energy - power use estimates:
– transmission grid load,
– meteorological data,
– aerial images,
– electricity meter readings, and
– energy-efficient home
improvements.
● How to combine these data sources?
That’s the question.
Satellite imagery: LANDSAT Crop
Layer
80-acre sunflower field. Fargo, North Dakota.
Source: USDA
New set of legal and ethical
issues in the big data era
● Data are often collected for one purpose but combined with
other data sources and used for another purpose
● Risk of privacy and confidentiality breaches
● The old way – access to survey and administrative data is granted
through statistical disclosure control techniques and
controlled access via research data centers.
● The new way – unclear legal situation – who owns the new type
of data?
● Consent statements that foresee all potential future use of data -
too complex for anyone to grasp
● There are no data stewards controlling access to individual data
● GDPR does not explicitly mention BD
Take-away points
• New survey developments are taking place
• Our industry needs innovations, less fighting
and more collaboration
• We need to merge with other research cultures
• We need to know more about combining data
sources
• We need to account for all major sources of
uncertainty that are associated with data
collection and analysis of data
• We need to develop new theories for handling
error structures and combining data sources
Over and Out

 
Gregory Harris - Cycle 2 - Civics Presentation
Gregory Harris - Cycle 2 - Civics PresentationGregory Harris - Cycle 2 - Civics Presentation
Gregory Harris - Cycle 2 - Civics Presentation
 

Lars Lyberg, Inizio: Report from the BigSurv18 conference

  • 1. BigSurv18 – A Conference and a Monograph. Lars Lyberg, Inizio. Presentation at Frimis, November 28, 2018
  • 3. Top stories - Read all about them
      1. Anyone can take pictures of you from a satellite and there is nothing you can do about it
      2. Better data can drive travel and conference savings
      3. The Health and Human Services Department will mine data from its internal social network
      4. Vietnam taps Big Data to avoid China’s traffic chaos
      5. Tweets can foretell votes
  • 4. A black swan
      A black swan is an undirected and unpredicted event. It is rare and has an extreme impact, but in retrospect we saw it coming.
      • Internet - yes
      • 9/11 - yes
      • The Lehman Brothers crash - yes
      • Decreasing response rates - yes
      • The advent of Big Data - not really
      • New data sources other than BD - yes
      • Nonprobability sampling - yes and no
  • 5. Monograph Contents
      ● The new survey landscape
      ● Total error and data quality
      ● Big data in official statistics
      ● Combining big data with survey statistics: methods and applications
      ● Combining big data with survey statistics: tools
      ● Regulations, ethics, privacy
  • 6. A Couple of Giants: Sir Ronald Fisher, Jerzy Neyman
  • 7. A Stunning Statement Made by One of the Keynote Speakers: “Surveys are the last resort”
  • 8. Design, Measurement, Inference: Adaptation to Available Data Resource
      [Figure. Axis: Information Content of Available Data (High, Low)]
      ● Traditional Survey Design, Measurement and Inference
      ● Model-Assisted Survey Design, Inference
      ● Survey-Assisted Modeling
      ● Modeling, Data Mining, “Analytics”
      Source: Heeringa 2018
  • 9. Why Did Data Become “BIG”
      ● Technological advances associated with data science and computational tools and methods
      ● Information-based decision making
        – “Evidence-based”, “Data-Driven”, “Analytics”, “Machine Learning”
      ● Focus on short-run prediction
        – Business decision making
        – Health risks (e.g. Google Flu)
        – Financial markets
        – Political processes
      ● Style points: “Tail Fins”
      Source: Heeringa 2018
  • 10. The “V” Taxonomy for Big Data: 345(?, Variability, Value)
  • 11. Some Concepts
      ● Artificial Intelligence: machines being able to carry out tasks in a smart way
      ● Machine Learning: an application of AI where we give machines access to data and let them learn for themselves via neural networks and natural language processing
      ● Data Mining: builds intuition about what is really happening in some data
      ● Data Science: combines the application of computer science, statistics, programming and business management
  • 12. Hype of Big Data: Gartner’s hype curve. Source: Wikipedia
  • 14. An example of Big Data analytics
      ● Evolv has a database containing information on 984 000 hourly workers in 20 companies
      ● Data sources: online employee background checks, time and attendance tracking software, and performance ranking programs
      ● Selected results:
        – Employees with a criminal record performed slightly better than others
        – Experienced employees did no better than the inexperienced
  • 15. An example of Big Data analytics (cont’d)
      ● Potential problems:
        – Weak conceptualization
        – Not clear if data sources are compatible across clients
        – No background variables
        – No questions asked to real persons
        – Inference is based on big data rather than theory
        – The analysis might be business-driven
      Having said that: how can data on 984 000 workers be used effectively?
  • 16. Happiness and Well-being
      The common survey question: How satisfied are you with your life?
      BD alternative:
      • 10 million tweets that are coded for happiness (rainbow, love, beauty, hope, wonderful, wine…) and non-happiness (damn, boo, ugly, smoke, hate, lied,…)
      • Happiest states: Hawaii, Utah, Idaho, Maine, Washington
      • Saddest states: Louisiana, Mississippi, Maryland, Michigan, Delaware
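
The word-list coding of tweets on slide 16 can be illustrated with a minimal sketch. The slide does not describe the actual coding procedure, so everything here is an assumption: the function names `happiness_score` and `mean_score_by_state` are hypothetical, the two word lists contain only the example words from the slide, and the score (happy minus unhappy matches, divided by all matches) is one simple choice among many.

```python
import re
from collections import Counter

# Hypothetical lexicons seeded with the slide's example words only;
# a real study would rely on a far larger, validated word list.
HAPPY_WORDS = {"rainbow", "love", "beauty", "hope", "wonderful", "wine"}
UNHAPPY_WORDS = {"damn", "boo", "ugly", "smoke", "hate", "lied"}


def happiness_score(text):
    """Return (happy - unhappy) / matched words, or None if no lexicon word occurs."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    happy = sum(counts[w] for w in HAPPY_WORDS)
    unhappy = sum(counts[w] for w in UNHAPPY_WORDS)
    matched = happy + unhappy
    if matched == 0:
        return None  # the tweet carries no signal under this lexicon
    return (happy - unhappy) / matched


def mean_score_by_state(tweets):
    """Average the scores per state; tweets is an iterable of (state, text) pairs."""
    totals = {}
    for state, text in tweets:
        score = happiness_score(text)
        if score is None:
            continue  # skip tweets without any lexicon word
        total, n = totals.get(state, (0.0, 0))
        totals[state] = (total + score, n + 1)
    return {state: total / n for state, (total, n) in totals.items()}
```

Ranking states by such an average shows why the approach is fragile: the result depends entirely on the word list, on who tweets, and on how tweets are assigned to a state, which are the kinds of error sources the presentation goes on to discuss.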
  • 17. The Potential Use of Big Data in Statistics Production
      ● Produce statistics based on BD that can replace surveys
      ● Combine BD with admin data, sample surveys, and nonprobability sources in order to improve statistics
      ● Explore new topics and concepts
      ● Data mining to identify new patterns and models
  • 18. Examples of Sources of Data
      ● Censuses
      ● Other survey programs
      ● Administrative data systems
      ● Medical records systems
      ● Commercially compiled data
      ● Financial data
      ● Satellite imagery
      ● GPS and GIS
      ● Social media
      ● Mobile devices
      ● Wearable measurement devices
      ● Sensors (Internet of Things)
      ● Visual data: pictures and video
      ● Genetic profile data
      ● Transactional data systems
      Adapted from Heeringa 2018
  • 19. AIS data
      ● AIS - Automatic Identification System
      ● Data can be used to follow all vessels
      ● Messages from vessels are transmitted with high speed
      ● Monitor marine traffic in real time
      Source: www.marinetraffic.com
  • 20. Improve current statistics and produce new statistics (de Wit et al 2017)
      ● Statistics on marine traffic to estimate emissions and identify areas with heavy traffic
      ● Port statistics to monitor how vessels move between different ports
  • 21. Combining Different Data Sources: Example from Statistics Sweden
      The left map shows green areas in Lidingö, Stockholm, Sweden. Combining these data with data from the real estate register gives the green areas that are accessible to the general public (the right map).
      Source: SCB. Background map © Lantmäteriet
  • 22. Combining Different Data Sources: Example from Statistics Netherlands
      ● Solar energy - power use estimates:
        – transmission grid load,
        – meteorological data,
        – aerial images,
        – electricity meter readings, and
        – energy-efficient home improvements.
      ● How to combine these data sources? That’s the question.
  • 23. Satellite imagery: LANDSAT Crop Layer. 80-acre sunflower field, Fargo, North Dakota. Source: USDA
  • 24. New set of legal and ethical issues in the big data era
      ● Data are often collected for one purpose but combined with other data sources and used for another purpose
      ● Risk of privacy and confidentiality breaches
      ● The old way: access to survey and administrative data through statistical disclosure control techniques and controlled access via research data centers
      ● The new way: unclear legal situation. Who owns the new type of data?
      ● Consent statements that foresee all potential future use of data are too complex for anyone to grasp
      ● There are no data stewards controlling access to individual data
      ● GDPR does not explicitly mention BD
  • 25. Take-away points
      • New survey developments are taking place
      • Our industry needs innovations, less fighting and more collaboration
      • We need to merge with other research cultures
      • We need to know more about combining data sources
      • We need to account for all major sources of uncertainty that are associated with data collection and analysis of data
      • We need to develop new theories for handling error structures and combining data sources