Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScience. The perspective of European official statistics. Fernando Reis, Task-Force Big Data, European Commission (Eurostat)

Where we are and are going for Big Data in OpenScience
Keynote talk at the Big Data Europe SC6 Workshop on 11.9.2017 in Amsterdam co-located with SEMANTiCS2017: The perspective of European official statistics by Fernando Reis, Task-Force Big Data, European Commission (Eurostat).

Published in: Data & Analytics

  1. Where we are and are going for Big Data in OpenScience: The perspective of European official statistics
     Fernando Reis, Task-Force Big Data, European Commission (Eurostat)
     Big Data Europe Workshop, Amsterdam, 11 September 2017
  2. Where we are
     • Public use files for Eurostat micro-data
     • They are public
     • Used for training purposes and data discovery prior to getting access to scientific use files
     • May be randomly generated from the real microdata so as to preserve the statistical properties of the real data
     • EU-SILC and EU-LFS
     • https://ec.europa.eu/eurostat/cros/content/public-use-files-eurostat-microdata-0_en
     • Not big data (for now)!
  3. Where we are
     • Scientific use files for Eurostat micro-data
     • Microdata
       - Confidential data, for statistical purposes: national statistical production; secure data exchange (within ESS)
       - Confidential data, for scientific purposes: scientific use files (pseudonymised datasets sent to researchers on DVD); secure use files (datasets accessed in Eurostat's Safe Centre, outputs checked for confidentiality)
       - Public use files (anonymised datasets; identification of statistical units is not possible)
  4. Where we are
     • Scientific use files for Eurostat micro-data
     • Access is provided to entities that do research; Eurostat checks whether the entity can be considered a research entity (according to predefined criteria)
     • The entity signs an agreement that it will use the data properly; this is a prerequisite for access
     • Individual researchers submit projects in which they explain why they need access to microdata
     • These projects are verified by Eurostat and by the national statistical offices; if a country disagrees, its data are removed from the file
     • Around 1200 projects running using our microdata
     • Not big data (for now)!
  5. Where we are
     • Legal study on access to big data for statistics
     • Purpose
       - Identify obstacles and enabling factors in current and upcoming relevant legislation (MS and EU) regarding the access and use of Big Data for official statistics (incl. production and dissemination) for four private data sources: telecom, internet, utilities, payment
     • Analysis
       - Statistical legislation at EU level
       - National legal framework for production of official statistics, including provisions that may prevent or limit use of big data sources
       - EU data protection legislation (Directive 95/46/EC and GDPR)
       - National legal framework for personal data protection, including derogations in case of processing for statistical purposes
       - Other relevant legislation (copyright, database legislation)
       - National legal framework for traffic and location data
       - Existing practices at NSIs
  6. Where we are
     • Legal study on access to big data for statistics
     • Legal obstacles in Member States?
       - Not that many true legal obstacles, neither in statistics legislation, nor in sector legislation
       - But there are concerns both for NSI and data sources (mainly for personal data and confidential business information)…
       - Issues: retention period in mobile network data, data minimisation (burden), transparency towards data subjects
       - Statistical confidentiality sufficiently guaranteed? Recital 162 GDPR: The statistical purpose implies that the result of processing for statistical purposes is not personal data
       - Yet… the potential of big data is currently not being fully exploited
  7. Where we are
     • Legal study on access to big data for statistics
     • NSI can often compel big data sources to communicate data to the NSI, but…
       - For data sources the rules may not be clear enough
       - For NSI the rules may not be strong enough
       - Adopting the required legal instrument can require substantial time and effort (e.g. part of annual program)
       - The national DPA may need to be consulted first and may lay down access modalities and restrictions
       - Communication of aggregated data by data sources may not be possible if the aggregates identify subgroups that are too small (Belgian DPA: at least 30 users in case of location data from MNO)
       - Need for continuous, flexible and reliable access not guaranteed by current legal provisions
       - Voluntary partnerships are concluded, mainly with MNOs and retail trade chains
  8. Where we are going
     • Legislative initiative for data access?
     • Separate law on data access?
       - Obligation on private sources to license the data they hold for use by public (statistical) offices
       - Right balance between the public interest and citizens' need for privacy protection
     • Inclusion into specific statistical domain legislation?
       - Regulation 2016/792 on consumer prices indexes: “upon the request of the national bodies responsible for compiling the harmonised indices, the statistical units shall provide, where available, electronic records of transactions, such as scanner data, and at the level of detail necessary in order to produce harmonised indices and to evaluate compliance with the comparability requirements and the quality of the harmonised indices”
  9. Where we are going
     • Open Algorithms (OPAL) Project
     • Open suite of software and open algorithms providing access to statistical information extracted from anonymized, secured and formatted data
     • Will start with APIs to access indicators such as population density and mobility, based on mobile network data
     • Library of certified open algorithms to extract these indicators in a governed and trustworthy manner
     • http://www.opalproject.org
  10. Where we are going
     • From Internet of Things to …
     • A set of sensors, actuators, smart objects, data communications and interface technologies that
       - allow information to be collected, tracked and processed across local and global network infrastructures,
       - enabling the future hyper-connected society
  11. Where we are going
     • … Smart statistics
     • Data capturing, processing and analysis will be embedded in the system itself
     • Intelligence along data life-cycle enhanced with cognitive processes
  12. Where we are going
     • Smart statistics proof-of-concept
     • Proofs-of-concept
       - Give life to an idea
       - Provide evidence that IoT data (eco)systems can be used for official statistics
       - Sandbox infrastructure
       - …
     • Prototypes
       - Functional model of producing statistics leveraging big data
       - Monitored use
       - Sandbox infrastructure
       - Methodology under construction
       - Quality under evaluation
       - Limited number of NSI
     • Working products
       - Fully operational
       - Up-sized prototype
       - Unmonitored use
       - UI
       - IT infrastructure
       - Methodology
       - Quality
       - Integration with other statistics
       - ESS
  13. Thank you for your attention
     Fernando Reis
     Eurostat Task Force on Big Data
     https://github.com/reisfe/
     https://twitter.com/reisfe/
     https://linkedin.com/in/reisfe/
     fernando.reis@ec.europa.eu
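
Slide 7 mentions a minimum group size for aggregated location data (Belgian DPA: at least 30 users), and slide 9 mentions indicators such as population density derived from mobile network data. A minimal Python sketch of how such a rule might be applied to an aggregated indicator is given below. It is an illustration only, not the method used by Eurostat, the OPAL project or any DPA: the function name `users_per_area`, the record layout (pseudonymous user id, area id) and the choice to publish `None` for suppressed cells are all assumptions made for this example.

```python
from collections import defaultdict

# Threshold quoted on slide 7 for MNO location data (Belgian DPA).
MIN_USERS = 30


def users_per_area(records, min_users=MIN_USERS):
    """Count distinct users per area and suppress small groups.

    `records` is an iterable of (pseudonymous_user_id, area_id) pairs;
    this layout is hypothetical. Areas with fewer than `min_users`
    distinct users get None instead of a count.
    """
    users = defaultdict(set)
    for user_id, area_id in records:
        users[area_id].add(user_id)  # sets ignore repeated sightings
    return {
        area: (len(ids) if len(ids) >= min_users else None)
        for area, ids in users.items()
    }


# Made-up example data: 45 distinct users in area A (one seen twice),
# 12 distinct users in area B.
records = (
    [(f"u{i}", "A") for i in range(45)]
    + [("u0", "A")]
    + [(f"u{i}", "B") for i in range(12)]
)
result = users_per_area(records)
print(result)
```

In this sketch area A is published with its count, while area B falls below the threshold and is suppressed. Suppressing the cell rather than rounding it is only one possible design choice; the slides do not say how a data source should handle groups below the threshold.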