Setting the Scene for Big Data in Europe, Looking Ahead to the Case Studies


Author: Guillermo Vega-Gorgojo, Universitetet i Oslo
Presented at: BYTE WP2 Workshop, Lyon, 11 Sept 2014

Published in: Technology

  1. 1. BYTE: Big data roadmap and cross-disciplinary community for addressing societal Externalities. Setting the scene for Big Data in Europe, looking ahead to the case studies. Guillermo Vega-Gorgojo – Universitetet i Oslo
  2. 2. So far, what have we learned in BYTE?
     ◦ Big data, more than “the 3Vs”
     ◦ Definition, dimensions, activities, applications, data flows, policies
     ◦ Big data initiatives
     ◦ Technologies and infrastructures for big data
     ◦ Positive and negative societal externalities
     ◦ Economic, legal, social, ethical, political…
     @BYTE_EU
  3. 3. What do we expect to learn through the case studies?
     1. Which positive and negative societal externalities organizations create through the use of big data
     2. How they have worked to amplify positive externalities
     3. How they have addressed the negative externalities they have encountered
  4. 4. A template for the case studies in BYTE
     CASE STUDY OVERVIEW: 1. Organization; 2. Sector; 3. Case study motto; 4. Executive summary; 5. Business processes; 6. Relation to big data initiatives; 7. Illustrative user stories
     TECHNICAL PERSPECTIVE: 8. Data sources; 9. Data flows; 10. Relevant big data policies; 11. Main technical challenges; 12. Big data dimensions
     SOCIETAL EXTERNALITIES: 13. Positive societal externalities; 14. Negative societal externalities; 15. Amplifying positive externalities; 16. Addressing negative externalities
     SOURCES OF INFORMATION: ◦ Semi-structured interviews ◦ Organization documents
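
The 16-item template above can be read as a simple record structure. The following is only a minimal sketch under that reading, not something taken from the BYTE project: the class name `CaseStudy`, the field names, the field types and the `missing_items` helper are all hypothetical. The sample values come from the Statoil overview slide.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class CaseStudy:
    """Hypothetical record mirroring the 16 template items (names are assumptions)."""

    # Case study overview (items 1-7)
    organization: str = ""
    sector: str = ""
    motto: str = ""
    executive_summary: str = ""
    business_processes: List[str] = field(default_factory=list)
    big_data_initiatives: List[str] = field(default_factory=list)
    user_stories: List[str] = field(default_factory=list)
    # Technical perspective (items 8-12)
    data_sources: List[str] = field(default_factory=list)
    data_flows: List[str] = field(default_factory=list)
    big_data_policies: List[str] = field(default_factory=list)
    technical_challenges: Dict[str, str] = field(default_factory=dict)
    big_data_dimensions: Dict[str, bool] = field(default_factory=dict)
    # Societal externalities (items 13-16)
    positive_externalities: List[str] = field(default_factory=list)
    negative_externalities: List[str] = field(default_factory=list)
    amplifying_positive: List[str] = field(default_factory=list)
    addressing_negative: List[str] = field(default_factory=list)

    def missing_items(self) -> List[str]:
        """Return the names of template items that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Partially filled record, using values from the Statoil overview slide.
statoil = CaseStudy(
    organization="Statoil",
    sector="Energy",
    motto=(
        "Improve decision making in oil & gas exploration in the presence "
        "of partial information and limited time."
    ),
    business_processes=["Oil & gas exploration decision-making"],
    big_data_initiatives=["OPTIQUE"],
)

# Lists the template items that interviews and documents still need to cover.
print(statoil.missing_items())
```

A flat record like this makes it easy to see which template items remain open for a given case study; a real study would likely need richer types for data sources and data flows.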
  5. 5. A model for the societal externalities: Citizens, Public Sector, Private Sector
  6. 6. Examples of positive and negative societal externalities
     Actors: Citizens, Public Sector, Private Sector
     + support communities
     - continuous and invisible surveillance
     + commercialization of new goods and services
     + data-driven employment offerings
     + innovative business models
     - inequalities to data access
     - need to reconcile different laws and agreements
     + economic growth through community building
     - competitive disadvantage of newer businesses and SMEs
     - private data misuse
     - invasive use of information
     + accelerate scientific progress
     + transparency and accountability
     - distrust of government data-based activities
  7. 7. The case studies
     Case study     Organization                   Contact partner
     Environment    ESA and others                 CNR
     Crime          XXX                            TRI
     Smart cities   Siemens                        Siemens
     Culture        Europeana                      TRI
     Energy         Statoil                        UiO
     Health         Institute of Child Health      TRI
     Transport      Rolls Royce/Farstad shipping   DNV
  8. 8. Preliminary case study analysis for Statoil: Case study overview
     1. Organization: Statoil
     2. Sector: ENERGY
     3. Case study motto: Improve decision making in oil & gas exploration in the presence of partial information and limited time.
     4. Executive summary: In the early phases of oil and gas exploration, many prospects, i.e. potential projects, are under evaluation at any time in order to select just a few of them for further investigation. These decisions are often of critical importance for Statoil. However, in most cases prospects have to be selected on short notice and on the basis of only partial information. Typically, exploration experts in these very early phases of an exploration project spend just a few days collecting relevant information before they embark on further analyses; data that is not found within this time frame is simply ignored, and hence does not influence the important selection of prospects. If the geophysics and geology (G&G) experts utilize all the data available, this will reduce the risk factor in the selection process, and hence also increase the chances that the ‘right’ prospects are selected. In the end this will in all likelihood increase the number of successful exploration projects for Statoil.
     5. Business processes: Oil & gas exploration decision-making
     6. Relation to big data initiatives: Research projects: OPTIQUE
  9. 9. Preliminary case study analysis for Statoil: Technical description
     8. Data sources
        Name: Subsurface
        Short description:
        ◦ Seismic survey
        ◦ Seismic & geophysical data
        ◦ Well and wellbore data
        ◦ Acquisition reports
        Domain: geophysics and geology
        How it is collected:
        ◦ Seismic shots
        ◦ Well data from drilling operations
        ◦ Reports from value-adding analysis
        Size: ~8 PB
        …
     11. Main technical challenges
        Data storage and access: VERY CHALLENGING
        ◦ G&G experts in exploration spend 16% of their time on finding the relevant data sets and documents (internal survey of Statoil in 2005)
        ◦ There is a plethora of tools to access and process the different kinds of data, amplified by the segregation into silos
        Data integration: CHALLENGING
        ◦ There is a clear need to integrate the data scattered across different repositories and databases from multiple vendors. For instance, the provided user story reflects that the Subsurface database was not up to date due to limited integration with the OpenWorks project databases
        …
     12. Big data dimensions
        Volume: YES
        ◦ Some datasets are at a scale of PBs
        ◦ Extremely complex queries that can involve more than 30 joins
        Velocity: NO
        ◦ No streaming data processing
        Variety: YES
        ◦ Need for different data models to reflect the views of Drilling Engineers, Petrophysicists, Geophysicists, Geologists and Reservoir Engineers
        ◦ Very complex data models: ~K of tables and ~10K columns
        Veracity: YES
        ◦ Some of the employed data sources are more trustworthy than others
  10. 10. Preliminary case study analysis for Statoil: Societal externalities
      Statoil – Citizens
      + Reduced risk for environment
      + Demand for hiring big data analysts
      Statoil – Other corporations
      + New work processes and vendor ecosystems
      - Data lock-in, contracts prohibit access to data for third parties
      - Increased risk of exposing confidential data
      Statoil – Public sector
      + Better informed decisions for drilling operations based on open government data (FactPages)
      - Competitive advantage of the private sector w.r.t. open data (Statoil doesn’t have to open their data, while it has access to public data)
  11. 11. Societal externalities (1-3)
      Public sector – Citizens
      + Gather public insight by identifying social trends and statistics
      + Accelerate scientific progress
      + Tracking environmental challenges
      + Transparency and accountability of the public sector
      + Increased citizen participation
      + Foster innovation, e.g. new applications, from government data
      + Better services, e.g. health care and education, through data sharing and analysis
      + More targeted services for citizens, through profiling populations
      + Cost-effectiveness of services
      + Crime prevention and detection, including fraud
      - Distrust of government data-based activities
      - Unnecessary surveillance
      - Compromise to government security and privacy
      - Private data misuse, especially sharing with third parties without consent
      - Threats to data protection and personal privacy
      - Threats to intellectual property rights (including scholars' rights and contributions)
      - Public reluctance to provide information (especially personal data)
  12. 12. Societal externalities (2-3)
      Private sector – Citizens
      + Rapid commercialization of new goods and services
      + Free use of services, e.g. email, search engines
      + Enhancements in data-driven R&D
      + Making society energy efficient
      + Optimization of utilities through data analytics
      + Data-driven employment offerings
      + Marketing improvement
      + Increased insight into goods (more transparency)
      + Increased transparency in commercial decision making
      + Fostering innovation from opening data
      + Increased awareness about privacy violations and ethical issues of big data
      + Time-saving in transactions if personal data were already held
      - Employment losses for certain job categories
      - Invasive use of information
      - Risk of informational rent-seeking
      - Discriminatory practices and targeted advertising
      - Distrust of commercial data-based activities
      - Unethical exploitation of data
      - Reduced market competition
      - Consumer manipulation
      - Creation of data-based monopolies (platforms and services)
      - Private data accumulation and ownership
      - Private data leakage
      - Private data misuse, especially sharing with third parties without consent
      - Privacy threats even with anonymized data and with data mining
      - Threats to intellectual property rights
      - Public reluctance to provide information (especially personal data)
      - "Sabotaged" data practices
  13. 13. Societal externalities (3-3)
      Citizens – Citizens
      + Support communities
      - Continuous and invisible surveillance
      Private sector – Private sector
      + Opportunities for economic growth
      + Innovative business models
      - Barriers to market entry
      - Inequalities to data access
      - Market manipulation
      - Challenge of traditional non-digital services
      - Dependency on external data sources, platforms and services
      - Competitive disadvantage of newer businesses and SMEs
      - Reduced growth and profit among all businesses
      - Threats to commercially valuable information
      Public sector – Private sector
      + Opportunities for economic growth
      + Innovative business models
      + Support communities
      - Open data puts the private sector at a competitive advantage
      - Inequalities to data access, especially in research
      - Taxation leakages
      - Lack of norms for data storage and processing
      Public sector – Public sector
      - Geopolitical tensions due to surveillance beyond the boundaries of states
      - Need to reconcile different laws and agreements, e.g. "right to be forgotten"