http://statbel.fgov.be
Marc Debusschere
Coordinator Administrative & Big Data
Statbel and big data
Data Science Meetup Leuven, 27 June 2018
Overview
❑ Context
Statbel, big data and the third data revolution in statistics
❑ Big data and Statbel
Projects, accomplishments and problems
❑ European collaboration
❑ Big data and official statistics
Provisional conclusions and way forward
❑ Big data, Statbel and data science
Statbel =
❑ Statistics Belgium
❑ The institute formerly known as Nationaal Instituut voor de
Statistiek (NIS) / Institut national de Statistique (INS)
❑ Administratively part of the FPS (‘ministry’) Economy
❑ Member of the European Statistical System (ESS) =
Eurostat + 32 EU & EFTA national statistical institutes + associated statistics producers
Big data
❑ = data impossible to process in a ‘normal’ way
‘normal’ is relative …
❑ 3 v’s: volume, velocity, variety
❑ Result of societal and technological changes
Satellites, cameras and sensors, internet and e-mail, social media, mobile
phones and tablets, e-business, e-government, machine-to-machine
(internet of things, IoT)
❑ Result: data explosion, data deluge
Big data and statistics
❑ Big data = ‘digital footprint’
❑ Containing valuable information, statistically exploitable
(but also commercially …)
❑ Resulting in the third data revolution in statistics
After surveys (>1846) and administrative data (>2000), now: big data!
❑ Possible data sources – list far from exhaustive!
❖ Scanner data, electronic payments, credit card data
❖ Webscraping for job vacancies, enterprise characteristics
❖ Traffic cameras and detection loops
❖ Smart meters (electricity, gas, water)
❖ Last but not least: mobile phone data!
The future of statistics …

… big data!
Instant statistics based on big data, complemented and/or
validated by administrative data and small and specific ad hoc
surveys.
Also known as: smart statistics …
Big data and Statbel: Big Data Team
❑ Start at the end of 2015
❑ Restricted group, operating informally and ad hoc
❑ Focus on mobile phone data, webscraping job vacancies
❑ Tasks:
❖ Reflection on strategy, priorities
❖ External contacts concerning big data, with data owners, potential users,
federal and regional authorities, academia, EU, international
organisations, …
❖ Analysing big datasets and connecting them to statistical ones – see
below
Big data and Statbel: in production
❑ For consumer price index (CPI)
❖ Scanner data from supermarkets and retail chains
❖ Webscraping prices (e.g. airplane tickets, webshops)
Big data and Statbel: not planned (at present)
❑ Social media, internet search results, text analytics, …
= ‘high-hanging fruit’, access and interpretation very problematic!
❑ Smart meters
Political decision of regions (2012) not to deploy
=> no data (about to change in Flanders)
❑ Traffic cameras, traffic loops
Regional competence and data
Big data and Statbel: projects
❑ Mobile phone data
❖ Project Statbel-Proximus-Eurostat
❖ Border Region Data Collection (BRDC)
❖ City data from LFS and Big Data
❑ Webscraping
❖ Job vacancies
❑ Satellite data and aerial photography
❖ Deep Solaris
Big data in production
❑ Scanner data for CPI
❖ Based on agreement with data owners, facilitated by political pressure
❖ Legal basis (HICP regulation) but cooperative model
❖ Being expanded gradually with new supermarket and retail chains
❖ Extremely smooth and cost-efficient after initial set-up
❑ Webscraping prices for CPI
❖ Collecting prices on webshops (e.g. airplane tickets)
❖ For efficiency but also out of necessity: e-commerce fast expanding!
❖ Legal issues possible
Big data almost in production
❑ Webscraping job vacancies: about to go into production …
❖ Methodological and practical issues
❖ Stand-alone results not sufficient, need to combine with existing Job
vacancy survey (JVS) and ‘administrative’ data from regional
employment agencies (VDAB, FOREM, Actiris)
❖ Linked to European project (ESSnet Big Data, see below):
https://webgate.ec.europa.eu/fpfis/mwikis/essnetbigdata/index.php/WP1_Webscraping_job_vacancies
Project Statbel, Proximus, Eurostat
❑ Start December 2015, first results April 2016
❑ Step by step approach:
❖ First: actual present population
❖ Basis for: resident population (via place of residence), workplace,
‘usual environment’, tourism, labour migration, migration, time use, …
❑ Innovative:
❖ First collaboration NSI/operator in EU => ‘real’ data
❖ No ‘call detail records’ but network signals: 10 x more frequent!
❖ Combining mobile phone data with statistical datasets
❑ Via geo-coupling of aggregates: no privacy issues!
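A minimal sketch of what geo-coupling of aggregates could look like, assuming both sources have already been aggregated to a shared grid of cells. The cell identifiers, counts and function names are hypothetical and are not taken from the project; the point is only that the join happens on cell-level totals, never on individual records.

```python
from collections import defaultdict


def aggregate_by_cell(records):
    """Sum counts per grid cell; records are (cell_id, count) pairs."""
    totals = defaultdict(float)
    for cell_id, count in records:
        totals[cell_id] += count
    return dict(totals)


def couple_aggregates(mobile_counts, census_counts):
    """Join two per-cell aggregates on the cell id (inner join).

    Only cells present in both sources are kept, sorted by cell id.
    """
    shared = sorted(mobile_counts.keys() & census_counts.keys())
    return [(cell, mobile_counts[cell], census_counts[cell]) for cell in shared]


# Hypothetical inputs: device counts per cell and census residents per cell.
mobile = aggregate_by_cell([("A1", 120), ("A1", 80), ("B2", 50), ("C3", 10)])
census = {"A1": 210, "B2": 45, "D4": 300}

coupled = couple_aggregates(mobile, census)
for cell, devices, residents in coupled:
    print(cell, devices, residents)
```

Because only aggregates per cell are exchanged, neither side needs to see the other's microdata.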
Project Statbel, Proximus, Eurostat

Results and next steps
❑ Results
❖ 10 publications: see De Meersman et al., Debusschere et al., Demunter et al., Reis et al., Seynaeve et al.,
Wirthmann et al., all at https://webgate.ec.europa.eu/fpfis/mwikis/essnetbigdata/index.php/WP5_Documentation
❑ Our objectives:
❖ Further exploration for statistical and commercial applications
❖ Concrete use cases with a view to statistical production lines
❑ Unfortunately …
❖ March 2017: data blocked for everyone
❖ In the meantime: new initiatives via MIT/Univ. Newcastle, Eurostat
Some results
Population density per km²:
Mobile phone data (left) versus Census 2011 (right)
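One way to put a number on such a visual comparison is a correlation between the two density estimates per grid cell. The sketch below assumes two equally long lists of densities for the same cells; the values are invented for illustration and are not project results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical densities per km² for the same four grid cells.
mobile_density = [100, 200, 300, 400]
census_density = [110, 190, 310, 390]

r = pearson(mobile_density, census_density)
print(f"correlation between mobile and census density: {r:.3f}")
```

A high correlation only says the spatial patterns agree; it says nothing about differences in level between the two sources.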
Some results, continued
Weekday cells identified as ‘residential’, ‘commuting’ or ‘work’, with geographical
representation
Some results, continued
‘Residential’, ‘work’ and ‘commuting’-cells in the Brussels-Leuven area
Other projects
❑ Border Region Data Collection (BRDC)
❖ Grant EC DG Regio, 1 year, July 2017 - July 2018
❖ Cross-border living place-workplace mobility through
combining Labour force survey (LFS), administrative data and
mobile phone data
❖ With CBS Netherlands (lead), Destatis Germany, Insee
France, GUS Poland, SURS Slovenia
Other projects, continued
❑ Deep Solaris
❖ Grant Eurostat, 1 year, kickoff 19 Feb. 2018
❖ Detecting solar panels on the basis of satellite data
and aerial photography
❖ Via machine learning
❖ With CBS Netherlands (lead), Destatis Germany,
IT.NRW (Düsseldorf, DE) and BISS (Heerlen, NL)
Other projects, continued
❑ City data from LFS and Big Data
❖ Grant EC DG Regio, 1 year, Jan. – Dec. 2018
❖ Mapping metropolitan areas on the basis of Labour force
survey (LFS) and mobile phone data
❖ With CBS Netherlands (lead), Destatis Germany, Insee
France, Statistics Austria
From exploration to exploitation

Developing use cases
❑ Scanner data and webscraping for CPI
❑ Webscraping job vacancies
❑ Validation living place and workplace Census
❑ Matrix living place-workplace
❑ … (population, migration, tourism, mobility, transport, time use,
environment, agriculture, …)
Big data in the ESS (European Statistical System)

Initiatives and projects
Big data and official statistics

Provisional conclusions
❑ Size and complexity of datasets
❖ Less problematic than anticipated (because of pre-processing, at
least some structure)
❖ Focus consequently less on IT infrastructure and software
Big data and official statistics

Provisional conclusions, continued
❑ Biggest obstacle: access!
❖ Data owned by private enterprises: profit-oriented
❖ Fail to see any advantage in collaborating, on the contrary
(mistakenly!)
❖ Imposing legal obligation seems unavoidable …
❑ Link to privacy issues: fear of reputational damage
Big data and official statistics

Provisional conclusions, continued
❑ Additional challenge: methodology
❖ All ancient headaches are still there …
… with a lot of new ones added!
Big data and official statistics

The next stage: smart statistics
❑ Monitoring systems which are:
❖ integrated
❖ flexible
❖ multi-source
❖ real-time and highly detailed
❑ Some examples:
❖ continuous tracking of air quality
❖ highly granular actual present population (time, location, characteristics)
❖ smart farming statistics
For discussion:

big data, Statbel and data science
❑ Statbel owns numerous geocoded datasets (population,
employment, income, lodgings, …)
❑ and might gain access to big data sources …
❑ … but lacks data science expertise, i.e. the capability to analyse big data
❑ Two possible solutions:
❖ collaboration with academia, researchers
❖ hiring …
Questions?
Comments?

More Related Content

What's hot

Wikidata and performing_arts_20170811
Wikidata and performing_arts_20170811Wikidata and performing_arts_20170811
Wikidata and performing_arts_20170811
Beat Estermann
 
LandCity Revolution - L'evoluzione del segmento di terra per sostenere l'era ...
LandCity Revolution - L'evoluzione del segmento di terra per sostenere l'era ...LandCity Revolution - L'evoluzione del segmento di terra per sostenere l'era ...
LandCity Revolution - L'evoluzione del segmento di terra per sostenere l'era ...
giovanni biallo
 
Open Cultural Data in Switzerland
Open Cultural Data in SwitzerlandOpen Cultural Data in Switzerland
Open Cultural Data in Switzerland
Beat Estermann
 
Estermann Linked Data Ecosystem for Heritage Data - 29 Feb 2020
Estermann Linked Data Ecosystem for Heritage Data - 29 Feb 2020Estermann Linked Data Ecosystem for Heritage Data - 29 Feb 2020
Estermann Linked Data Ecosystem for Heritage Data - 29 Feb 2020
Beat Estermann
 
Wikidata Introductory Workshop
Wikidata Introductory WorkshopWikidata Introductory Workshop
Wikidata Introductory Workshop
Beat Estermann
 
Open Data in Serbian Government Institutions
Open Data in Serbian Government InstitutionsOpen Data in Serbian Government Institutions
Open Data in Serbian Government Institutions
Observatory of Social Innovations
 
Wikidata Introduction, Linked Digital Future Initiative, August 2019
Wikidata Introduction, Linked Digital Future Initiative, August 2019Wikidata Introduction, Linked Digital Future Initiative, August 2019
Wikidata Introduction, Linked Digital Future Initiative, August 2019
Beat Estermann
 
L. Quattrociocchi - The information system of thematic "immigrants and new ci...
L. Quattrociocchi - The information system of thematic "immigrants and new ci...L. Quattrociocchi - The information system of thematic "immigrants and new ci...
L. Quattrociocchi - The information system of thematic "immigrants and new ci...
Istituto nazionale di statistica
 
Open Government Data in Europe
Open Government Data in EuropeOpen Government Data in Europe
Open Government Data in Europeokfn
 
Wikidata and performing_arts_20180116
Wikidata and performing_arts_20180116Wikidata and performing_arts_20180116
Wikidata and performing_arts_20180116
Beat Estermann
 
EDF2014: Talk of European Data Innovator Award Winner: Johann Mittheisz, form...
EDF2014: Talk of European Data Innovator Award Winner: Johann Mittheisz, form...EDF2014: Talk of European Data Innovator Award Winner: Johann Mittheisz, form...
EDF2014: Talk of European Data Innovator Award Winner: Johann Mittheisz, form...
European Data Forum
 
Semantically Mapping Science (SMS)
Semantically Mapping Science (SMS)Semantically Mapping Science (SMS)
Semantically Mapping Science (SMS)
Ali Khalili
 
Migration flows: data and measurement - Discussion
Migration flows: data and measurement - DiscussionMigration flows: data and measurement - Discussion
Migration flows: data and measurement - Discussion
Giampaolo Lanzieri
 
Eduworks kick-off presentation: USAL
Eduworks kick-off presentation: USALEduworks kick-off presentation: USAL
Eduworks kick-off presentation: USAL
Eduworks Network
 
Open geodata in Finland
Open geodata in FinlandOpen geodata in Finland
Open geodata in Finland
Antti Rainio
 
Aurh-geo weastflows-Angl
Aurh-geo weastflows-AnglAurh-geo weastflows-Angl
Open data, what's cooking at the federal level
Open data, what's cooking at the federal levelOpen data, what's cooking at the federal level
Open data, what's cooking at the federal level
Bart Hanssens
 
Curse of Dimensionality and Big Data
Curse of Dimensionality and Big DataCurse of Dimensionality and Big Data
Curse of Dimensionality and Big Data
Stephane Marchand-Maillet
 
Austrian Experience in Building Data Value Chain
Austrian Experience in Building Data Value ChainAustrian Experience in Building Data Value Chain
Austrian Experience in Building Data Value Chain
Anna Fensel
 
Smart accelerator
Smart acceleratorSmart accelerator
Smart accelerator
ClusteriX20
 

What's hot (20)

Wikidata and performing_arts_20170811
Wikidata and performing_arts_20170811Wikidata and performing_arts_20170811
Wikidata and performing_arts_20170811
 
LandCity Revolution - L'evoluzione del segmento di terra per sostenere l'era ...
LandCity Revolution - L'evoluzione del segmento di terra per sostenere l'era ...LandCity Revolution - L'evoluzione del segmento di terra per sostenere l'era ...
LandCity Revolution - L'evoluzione del segmento di terra per sostenere l'era ...
 
Open Cultural Data in Switzerland
Open Cultural Data in SwitzerlandOpen Cultural Data in Switzerland
Open Cultural Data in Switzerland
 
Estermann Linked Data Ecosystem for Heritage Data - 29 Feb 2020
Estermann Linked Data Ecosystem for Heritage Data - 29 Feb 2020Estermann Linked Data Ecosystem for Heritage Data - 29 Feb 2020
Estermann Linked Data Ecosystem for Heritage Data - 29 Feb 2020
 
Wikidata Introductory Workshop
Wikidata Introductory WorkshopWikidata Introductory Workshop
Wikidata Introductory Workshop
 
Open Data in Serbian Government Institutions
Open Data in Serbian Government InstitutionsOpen Data in Serbian Government Institutions
Open Data in Serbian Government Institutions
 
Wikidata Introduction, Linked Digital Future Initiative, August 2019
Wikidata Introduction, Linked Digital Future Initiative, August 2019Wikidata Introduction, Linked Digital Future Initiative, August 2019
Wikidata Introduction, Linked Digital Future Initiative, August 2019
 
L. Quattrociocchi - The information system of thematic "immigrants and new ci...
L. Quattrociocchi - The information system of thematic "immigrants and new ci...L. Quattrociocchi - The information system of thematic "immigrants and new ci...
L. Quattrociocchi - The information system of thematic "immigrants and new ci...
 
Open Government Data in Europe
Open Government Data in EuropeOpen Government Data in Europe
Open Government Data in Europe
 
Wikidata and performing_arts_20180116
Wikidata and performing_arts_20180116Wikidata and performing_arts_20180116
Wikidata and performing_arts_20180116
 
EDF2014: Talk of European Data Innovator Award Winner: Johann Mittheisz, form...
EDF2014: Talk of European Data Innovator Award Winner: Johann Mittheisz, form...EDF2014: Talk of European Data Innovator Award Winner: Johann Mittheisz, form...
EDF2014: Talk of European Data Innovator Award Winner: Johann Mittheisz, form...
 
Semantically Mapping Science (SMS)
Semantically Mapping Science (SMS)Semantically Mapping Science (SMS)
Semantically Mapping Science (SMS)
 
Migration flows: data and measurement - Discussion
Migration flows: data and measurement - DiscussionMigration flows: data and measurement - Discussion
Migration flows: data and measurement - Discussion
 
Eduworks kick-off presentation: USAL
Eduworks kick-off presentation: USALEduworks kick-off presentation: USAL
Eduworks kick-off presentation: USAL
 
Open geodata in Finland
Open geodata in FinlandOpen geodata in Finland
Open geodata in Finland
 
Aurh-geo weastflows-Angl
Aurh-geo weastflows-AnglAurh-geo weastflows-Angl
Aurh-geo weastflows-Angl
 
Open data, what's cooking at the federal level
Open data, what's cooking at the federal levelOpen data, what's cooking at the federal level
Open data, what's cooking at the federal level
 
Curse of Dimensionality and Big Data
Curse of Dimensionality and Big DataCurse of Dimensionality and Big Data
Curse of Dimensionality and Big Data
 
Austrian Experience in Building Data Value Chain
Austrian Experience in Building Data Value ChainAustrian Experience in Building Data Value Chain
Austrian Experience in Building Data Value Chain
 
Smart accelerator
Smart acceleratorSmart accelerator
Smart accelerator
 

Similar to Statbel and big data

ICT observatories for better governance
ICT observatories for better governanceICT observatories for better governance
ICT observatories for better governancezalisova
 
IAOS 2018 - Remote sensing data for better statistics, N. Rosenski
IAOS 2018 - Remote sensing data for better statistics, N. RosenskiIAOS 2018 - Remote sensing data for better statistics, N. Rosenski
IAOS 2018 - Remote sensing data for better statistics, N. Rosenski
StatsCommunications
 
Commission studies on eaccessibility
Commission studies on  eaccessibilityCommission studies on  eaccessibility
Commission studies on eaccessibility
Jose Angel Martinez Usero
 
DISCOVERY DAY 2017: MAKE IT HAPPEN!
DISCOVERY DAY 2017: MAKE IT HAPPEN!DISCOVERY DAY 2017: MAKE IT HAPPEN!
DISCOVERY DAY 2017: MAKE IT HAPPEN!
FAO
 
P. Struijs, Toward the Use of Big Data for European Statistics
P. Struijs, Toward the Use of Big Data for European StatisticsP. Struijs, Toward the Use of Big Data for European Statistics
P. Struijs, Toward the Use of Big Data for European Statistics
Istituto nazionale di statistica
 
sylviane toporkoff one conference prague 2013
sylviane toporkoff one conference  prague 2013sylviane toporkoff one conference  prague 2013
sylviane toporkoff one conference prague 2013
Iñaki Zaragüeta Arrizabalaga
 
20140902 LinDa Workshop Semantincs2014 - Bringing LOD to SMEs
20140902 LinDa Workshop Semantincs2014 - Bringing LOD to SMEs20140902 LinDa Workshop Semantincs2014 - Bringing LOD to SMEs
20140902 LinDa Workshop Semantincs2014 - Bringing LOD to SMEs
LinDa_FP7
 
OECD GOV digitisation of the public sector
OECD GOV digitisation of the public sectorOECD GOV digitisation of the public sector
OECD GOV digitisation of the public sectoradamlerouge
 
Carlo Amati - Ex post evaluation of cohesion policy
Carlo Amati - Ex post evaluation of cohesion policyCarlo Amati - Ex post evaluation of cohesion policy
Carlo Amati - Ex post evaluation of cohesion policy
OpenCoesione
 
DPS and CPT eXplorer: connecting data & policy
DPS and CPT eXplorer: connecting data & policyDPS and CPT eXplorer: connecting data & policy
DPS and CPT eXplorer: connecting data & policy
carloamati
 
"Towards Value-Centric Big Data" e-SIDES Workshop - "Responsible Research: An...
"Towards Value-Centric Big Data" e-SIDES Workshop - "Responsible Research: An..."Towards Value-Centric Big Data" e-SIDES Workshop - "Responsible Research: An...
"Towards Value-Centric Big Data" e-SIDES Workshop - "Responsible Research: An...
e-SIDES.eu
 
beyond PSI; INSPIRE infrastructure to share public data.
beyond PSI; INSPIRE infrastructure to share public data.beyond PSI; INSPIRE infrastructure to share public data.
beyond PSI; INSPIRE infrastructure to share public data.
Marc Leobet
 
From E-Government to Open Government
From E-Government to Open GovernmentFrom E-Government to Open Government
From E-Government to Open Government
Johann Höchtl
 
Tomasz Nadolny: Open Data in Gdańsk
Tomasz Nadolny: Open Data in GdańskTomasz Nadolny: Open Data in Gdańsk
Tomasz Nadolny: Open Data in Gdańsk
AnalyticsConf
 
Open Gdansk - Analitics Conf - Gdansk
Open Gdansk - Analitics Conf - GdanskOpen Gdansk - Analitics Conf - Gdansk
Open Gdansk - Analitics Conf - Gdansk
Tomasz Nadolny
 
Open Data in Gdansk
Open Data in GdanskOpen Data in Gdansk
Open Data in Gdansk
Krzysztof Garski
 
Methodologies for Addressing Risks and Opportunities Engendered by Big Data T...
Methodologies for Addressing Risks and Opportunities Engendered by Big Data T...Methodologies for Addressing Risks and Opportunities Engendered by Big Data T...
Methodologies for Addressing Risks and Opportunities Engendered by Big Data T...
BYTE Project
 
Using gamification to generate citizen input for public transport planning
Using gamification to generate citizen input for public transport planningUsing gamification to generate citizen input for public transport planning
Using gamification to generate citizen input for public transport planning
Marius Rohde Johannessen
 
Ordnance Survey and Linked Data
Ordnance Survey and Linked Data Ordnance Survey and Linked Data
Ordnance Survey and Linked Data
Talis Consulting
 

Similar to Statbel and big data (20)

ICT observatories for better governance
ICT observatories for better governanceICT observatories for better governance
ICT observatories for better governance
 
IAOS 2018 - Remote sensing data for better statistics, N. Rosenski
IAOS 2018 - Remote sensing data for better statistics, N. RosenskiIAOS 2018 - Remote sensing data for better statistics, N. Rosenski
IAOS 2018 - Remote sensing data for better statistics, N. Rosenski
 
Commission studies on eaccessibility
Commission studies on  eaccessibilityCommission studies on  eaccessibility
Commission studies on eaccessibility
 
DISCOVERY DAY 2017: MAKE IT HAPPEN!
DISCOVERY DAY 2017: MAKE IT HAPPEN!DISCOVERY DAY 2017: MAKE IT HAPPEN!
DISCOVERY DAY 2017: MAKE IT HAPPEN!
 
Keynote: Stefano Bertolo
Keynote: Stefano BertoloKeynote: Stefano Bertolo
Keynote: Stefano Bertolo
 
P. Struijs, Toward the Use of Big Data for European Statistics
P. Struijs, Toward the Use of Big Data for European StatisticsP. Struijs, Toward the Use of Big Data for European Statistics
P. Struijs, Toward the Use of Big Data for European Statistics
 
sylviane toporkoff one conference prague 2013
sylviane toporkoff one conference  prague 2013sylviane toporkoff one conference  prague 2013
sylviane toporkoff one conference prague 2013
 
20140902 LinDa Workshop Semantincs2014 - Bringing LOD to SMEs
20140902 LinDa Workshop Semantincs2014 - Bringing LOD to SMEs20140902 LinDa Workshop Semantincs2014 - Bringing LOD to SMEs
20140902 LinDa Workshop Semantincs2014 - Bringing LOD to SMEs
 
OECD GOV digitisation of the public sector
OECD GOV digitisation of the public sectorOECD GOV digitisation of the public sector
OECD GOV digitisation of the public sector
 
Carlo Amati - Ex post evaluation of cohesion policy
Carlo Amati - Ex post evaluation of cohesion policyCarlo Amati - Ex post evaluation of cohesion policy
Carlo Amati - Ex post evaluation of cohesion policy
 
DPS and CPT eXplorer: connecting data & policy
DPS and CPT eXplorer: connecting data & policyDPS and CPT eXplorer: connecting data & policy
DPS and CPT eXplorer: connecting data & policy
 
"Towards Value-Centric Big Data" e-SIDES Workshop - "Responsible Research: An...
"Towards Value-Centric Big Data" e-SIDES Workshop - "Responsible Research: An..."Towards Value-Centric Big Data" e-SIDES Workshop - "Responsible Research: An...
"Towards Value-Centric Big Data" e-SIDES Workshop - "Responsible Research: An...
 
beyond PSI; INSPIRE infrastructure to share public data.
beyond PSI; INSPIRE infrastructure to share public data.beyond PSI; INSPIRE infrastructure to share public data.
beyond PSI; INSPIRE infrastructure to share public data.
 
From E-Government to Open Government
From E-Government to Open GovernmentFrom E-Government to Open Government
From E-Government to Open Government
 
Tomasz Nadolny: Open Data in Gdańsk
Tomasz Nadolny: Open Data in GdańskTomasz Nadolny: Open Data in Gdańsk
Tomasz Nadolny: Open Data in Gdańsk
 
Open Gdansk - Analitics Conf - Gdansk
Open Gdansk - Analitics Conf - GdanskOpen Gdansk - Analitics Conf - Gdansk
Open Gdansk - Analitics Conf - Gdansk
 
Open Data in Gdansk
Open Data in GdanskOpen Data in Gdansk
Open Data in Gdansk
 
Methodologies for Addressing Risks and Opportunities Engendered by Big Data T...
Methodologies for Addressing Risks and Opportunities Engendered by Big Data T...Methodologies for Addressing Risks and Opportunities Engendered by Big Data T...
Methodologies for Addressing Risks and Opportunities Engendered by Big Data T...
 
Using gamification to generate citizen input for public transport planning
Using gamification to generate citizen input for public transport planningUsing gamification to generate citizen input for public transport planning
Using gamification to generate citizen input for public transport planning
 
Ordnance Survey and Linked Data
Ordnance Survey and Linked Data Ordnance Survey and Linked Data
Ordnance Survey and Linked Data
 

More from Data Science Leuven

Distributed Deep Learning Using Java on the Client and in the Cloud
Distributed Deep Learning Using Java on the Client and in the CloudDistributed Deep Learning Using Java on the Client and in the Cloud
Distributed Deep Learning Using Java on the Client and in the Cloud
Data Science Leuven
 
Learning from positive and unlabeled data
Learning from positive and unlabeled dataLearning from positive and unlabeled data
Learning from positive and unlabeled data
Data Science Leuven
 
Lighthouse - an open-source library to build data lakes - Kris Peeters
Lighthouse - an open-source library to build data lakes - Kris PeetersLighthouse - an open-source library to build data lakes - Kris Peeters
Lighthouse - an open-source library to build data lakes - Kris Peeters
Data Science Leuven
 
Recommender systems for job search - Michael Reusens
Recommender systems for job search - Michael ReusensRecommender systems for job search - Michael Reusens
Recommender systems for job search - Michael Reusens
Data Science Leuven
 
VITO WatchItGrow - Jeroen Dries
VITO WatchItGrow - Jeroen DriesVITO WatchItGrow - Jeroen Dries
VITO WatchItGrow - Jeroen Dries
Data Science Leuven
 
How to build a search engine in 2 days
How to build a search engine in 2 daysHow to build a search engine in 2 days
How to build a search engine in 2 days
Data Science Leuven
 
Uplift models
Uplift modelsUplift models
Uplift models
Data Science Leuven
 
Value from health data
Value from health dataValue from health data
Value from health data
Data Science Leuven
 
Computing power and algorithms? In people we trust
Computing power and algorithms? In people we trustComputing power and algorithms? In people we trust
Computing power and algorithms? In people we trust
Data Science Leuven
 
Trumania , a realistic scenario-based data-generator
Trumania , a realistic scenario-based data-generatorTrumania , a realistic scenario-based data-generator
Trumania , a realistic scenario-based data-generator
Data Science Leuven
 
Recommender systems, optimizing least squares or user experience
Recommender systems, optimizing least squares or user experienceRecommender systems, optimizing least squares or user experience
Recommender systems, optimizing least squares or user experience
Data Science Leuven
 
Replicability and questionable research practices
Replicability and questionable research practicesReplicability and questionable research practices
Replicability and questionable research practices
Data Science Leuven
 
Predicting Eurosong with Google Predicting Eurosong with Google and data visu...
Predicting Eurosong with Google Predicting Eurosong with Google and data visu...Predicting Eurosong with Google Predicting Eurosong with Google and data visu...
Predicting Eurosong with Google Predicting Eurosong with Google and data visu...
Data Science Leuven
 
Storytelling for impactful predictive models - Gert De Geyter
Storytelling for impactful predictive models - Gert De GeyterStorytelling for impactful predictive models - Gert De Geyter
Storytelling for impactful predictive models - Gert De Geyter
Data Science Leuven
 
Lessons from driving analytics projects
Lessons from driving analytics projectsLessons from driving analytics projects
Lessons from driving analytics projects
Data Science Leuven
 
Geospatial visual analytics
Geospatial visual analyticsGeospatial visual analytics
Geospatial visual analytics
Data Science Leuven
 
Open-Source Data Science Crossing The Chasm
Open-Source Data Science Crossing The ChasmOpen-Source Data Science Crossing The Chasm
Open-Source Data Science Crossing The Chasm
Data Science Leuven
 
Probabilistic machine learning for optimization and solving complex
Probabilistic machine learning for optimization and solving complexProbabilistic machine learning for optimization and solving complex
Probabilistic machine learning for optimization and solving complex
Data Science Leuven
 
Closing
ClosingClosing
Welcome
WelcomeWelcome

More from Data Science Leuven (20)

Distributed Deep Learning Using Java on the Client and in the Cloud
Distributed Deep Learning Using Java on the Client and in the CloudDistributed Deep Learning Using Java on the Client and in the Cloud
Distributed Deep Learning Using Java on the Client and in the Cloud
 
Learning from positive and unlabeled data
Learning from positive and unlabeled dataLearning from positive and unlabeled data
Learning from positive and unlabeled data
 
Lighthouse - an open-source library to build data lakes - Kris Peeters
Lighthouse - an open-source library to build data lakes - Kris PeetersLighthouse - an open-source library to build data lakes - Kris Peeters
Lighthouse - an open-source library to build data lakes - Kris Peeters
 
Recommender systems for job search - Michael Reusens
Recommender systems for job search - Michael ReusensRecommender systems for job search - Michael Reusens
Recommender systems for job search - Michael Reusens
 
VITO WatchItGrow - Jeroen Dries
VITO WatchItGrow - Jeroen DriesVITO WatchItGrow - Jeroen Dries
VITO WatchItGrow - Jeroen Dries
 
How to build a search engine in 2 days
How to build a search engine in 2 daysHow to build a search engine in 2 days
How to build a search engine in 2 days
 
Uplift models
Uplift modelsUplift models
Uplift models
 
Value from health data
Value from health dataValue from health data
Value from health data
 
Computing power and algorithms? In people we trust



Statbel and big data

  • 1. http://statbel.fgov.be Marc Debusschere Coordinator Administrative & Big Data Statbel and big data Data Science Meetup Leuven, 27 June 2018
  • 2. Overview ❑ Context Statbel, big data and the third data revolution in statistics ❑ Big data and Statbel Projects, accomplishments and problems ❑ European collaboration ❑ Big data and official statistics Provisional conclusions and way forward ❑ Big data, Statbel and data science
  • 3. Statbel = ❑ Statistics Belgium ❑ The institute formerly known as Nationaal Instituut voor de Statistiek (NIS) / Institut national de Statistique (INS) ❑ Administratively part of the FPS (‘ministry’) Economy ❑ Member of the European Statistical System (ESS) = Eurostat + 32 EU & EFTA national statistical institutes + associated statistics producers
  • 4. Big data ❑ = data impossible to process in a ‘normal’ way ‘normal’ is relative … ❑ 3 v’s: volume, velocity, variety ❑ Result of societal and technological changes Satellites, cameras and sensors, internet and e-mail, social media, mobile phones and tablets, e-business, e-government, machine-to-machine (internet of things, IoT) ❑ Result: data explosion, data deluge
  • 5. Big data and statistics ❑ Big data = ‘digital footprint’ ❑ Containing valuable information, statistically exploitable (but also commercially …) ❑ Resulting in the third data revolution in statistics After surveys (>1846) and administrative data (>2000), now: big data! ❑ Possible data sources – list far from exhaustive! ❖ Scanner data, electronic payments, credit card data ❖ Webscraping for job vacancies, enterprise characteristics ❖ Traffic cameras and detection loops ❖ Smart meters (electricity, gas, water) ❖ Last but not least: mobile phone data!
  • 6. The future of statistics …
 … big data! Instant statistics based on big data, complemented and/or validated by administrative data and small and specific ad hoc surveys. Also known as: smart statistics …
  • 7. Big data and Statbel: Big Data Team ❑ Start at the end of 2015 ❑ Restricted group, operating informally and ad hoc ❑ Focus on mobile phone data, webscraping job vacancies ❑ Tasks: ❖ Reflection on strategy, priorities ❖ External contacts concerning big data, with data owners, potential users, federal and regional authorities, academia, EU, international organisations, … ❖ Analysing big datasets and connecting them to statistical ones – see below
  • 8. Big data and Statbel: in production ❑ For consumer price index (CPI) ❖ Scanner data supermarkets and retail chains ❖ Webscraping prices (e.g. airplane tickets, webshops)
  • 9. Big data and Statbel: not planned (at present) ❑ Social media, internet search results, text analytics, … = ‘high-hanging fruits’, access and interpretation very problematic! ❑ Smart meters Political decision of regions (2012) not to deploy => no data (about to change in Flanders) ❑ Traffic cameras, traffic loops Regional competence and data
  • 10. Big data and Statbel: projects ❑ Mobile phone data ❖ Project Statbel-Proximus-Eurostat ❖ Border Region Data Collection (BRDC) ❖ City data from LFS and Big Data ❑ Webscraping ❖ Job vacancies ❑ Satellite data and aerial photography ❖ Deep Solaris
  • 11. Big data in production ❑ Scanner data for CPI ❖ Based on agreement with data owners, facilitated by political pressure ❖ Legal basis (HICP regulation) but cooperative model ❖ Being expanded gradually with new supermarket and retail chains ❖ Extremely smooth and cost-efficient after initial set-up ❑ Webscraping prices for CPI ❖ Collecting prices on webshops (e.g. airplane tickets) ❖ For efficiency but also out of necessity: e-commerce fast expanding! ❖ Legal issues possible
  • 12. Big data almost in production ❑ Webscraping job vacancies: about to go into production … ❖ Methodological and practical issues ❖ Stand-alone results not sufficient, need to combine with existing Job vacancy survey (JVS) and ‘administrative’ data from regional employment agencies (VDAB, FOREM, Actiris) ❖ Linked to European project (ESSnet Big Data, see below): https://webgate.ec.europa.eu/fpfis/mwikis/essnetbigdata/index.php/WP1_Webscraping_job_vacancies
  • 13. Project Statbel, Proximus, Eurostat ❑ Start December 2015, first results April 2016 ❑ Step by step approach: ❖ First: actual present population ❖ Basis for: resident population (via place of residence), workplace, ‘usual environment’, tourism, labour migration, migration, time use, … ❑ Innovative: ❖ First collaboration NSI/operator in EU => ‘real’ data ❖ No ‘call detail records’ but network signals: 10 x more frequent! ❖ Combining mobile phone data with statistical datasets ❑ Via geo-coupling of aggregates: no privacy issues!
  • 14. Project Statbel, Proximus, Eurostat
 Results and next steps ❑ Results ❖ 10 publications - see De Meersman et al., Debusschere et al., Demunter et al., Reis et al., Seynaeve et al., Wirthmann et al., all at https://webgate.ec.europa.eu/fpfis/mwikis/essnetbigdata/index.php/WP5_Documentation ❑ Our objectives: ❖ Further exploration for statistical and commercial applications ❖ Concrete use cases with a view to statistical production lines ❑ Unfortunately … ❖ March 2017: data blocked for everyone ❖ In the meantime: new initiatives via MIT/Univ. Newcastle, Eurostat
  • 15. Some results Population density per km²: Mobile phone data (left) versus Census 2011 (right)
  • 16. Some results, continued Weekday cells identified as ‘residential’, ‘commuting’ or ‘work’, with geographical representation
  • 17. Some results, continued ‘Residential’, ‘work’ and ‘commuting’-cells in the Brussels-Leuven area
  • 18. Other projects ❑ Border Region Data Collection (BRDC) ❖ Grant EC DG Regio, 1 year, July 2017 - July 2018 ❖ Cross-border living place-workplace mobility through combining Labour force survey (LFS), administrative data and mobile phone data ❖ With CBS Netherlands (lead), Destatis Germany, Insee France, GUS Poland, SURS Slovenia
  • 19. Other projects, continued ❑ Deep Solaris ❖ Grant Eurostat, 1 year, kickoff 19 Feb. 2018 ❖ Detecting solar panels on the basis of satellite data and aerial photography ❖ Via machine learning ❖ With CBS Netherlands (lead), Destatis Germany, IT.NRW (Düsseldorf, DE) and BISS (Heerlen, NL)
  • 20. Other projects, continued ❑ City data from LFS and Big Data ❖ Grant EC DG Regio, 1 year, Jan. – Dec. 2018 ❖ Mapping metropolitan areas on the basis of Labour force survey (LFS) and mobile phone data ❖ With CBS Netherlands (lead), Destatis Germany, Insee France, Statistics Austria
  • 21. From exploration to exploitation
 Developing use cases ❑ Scanner data and webscraping for CPI ❑ Webscraping job vacancies ❑ Validation living place and workplace Census ❑ Matrix living place-workplace ❑ … (population, migration, tourism, mobility, transport, time use, environment, agriculture, …)
  • 22. Big data in the ESS (European Statistical System)
 Initiatives and projects
  • 23. Big data and official statistics
 Provisional conclusions ❑ Size and complexity of datasets ❖ Less problematic than anticipated (because of pre-processing, at least some structure) ❖ Focus consequently less on IT infrastructure and software
  • 24. Big data and official statistics
 Provisional conclusions, continued ❑ Biggest obstacle: access! ❖ Data owned by private enterprises: profit-oriented ❖ Fail to see any advantage in collaborating, on the contrary (mistakenly!) ❖ Imposing legal obligation seems unavoidable … ❑ Link to privacy issues: fear of reputational damage
  • 25. Big data and official statistics
 Provisional conclusions, continued ❑ Additional challenge: methodology ❖ All the old headaches are still there … with a lot of new ones added!
  • 26. Big data and official statistics
 The next stage: smart statistics ❑ Monitoring systems which are: ❖ integrated ❖ flexible ❖ multi-source ❖ real-time and highly detailed ❑ Some examples: ❖ continuous tracking of air quality ❖ highly granular actual present population (time, location, characteristics) ❖ smart farming statistics
  • 27. For discussion:
 big data, Statbel and data science ❑ Statbel owns numerous geocoded datasets (population, employment, income, lodgings, …) ❑ and might gain access to big data sources … ❑ … but lacks the data science capability to analyse big data ❑ Two possible solutions: ❖ collaboration with academia, researchers ❖ hiring …