Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse


Although there is no well-established definition of big data, its main characteristic is its sheer volume. Large volumes of data are generated by people (e.g., via social media) and by technology, including sensors (e.g., cameras, microphones), trackers (e.g., RFID tags, web surfing behavior) and other devices (e.g., mobile phones, wearables for self-surveillance/quantified self), whether or not they are connected to the Internet of Things. However, the large volumes of data needed to capitalize on the benefits of big data can to some extent also be established by the reuse of existing data, a source that is sometimes overlooked.

Data can be reused for purposes similar to those for which it was initially collected, but also beyond these purposes. Similarly, data can be reused in its original context, but also beyond this context. However, such repurposing and recontextualizing of data may lead to privacy issues. For instance, data reuse may lead to issues regarding informed consent and informational self-determination. When the data is used for profiling and other types of predictive analytics, issues regarding stigmatization and discrimination may also arise. This presentation by Bart Custers, Head of Research, eLaw – Center for Law and Digital Technologies at Leiden University, The Netherlands, focuses on the privacy issues of big data sharing and reuse and how these issues could be addressed.

Published in: Technology

  1. Bart Custers PhD MSc LLM, Associate professor/head of research, eLaw – Center for Law and Digital Technologies, Leiden University, The Netherlands. Cyber Summit 2016 – Banff, Canada, October 27th 2016, 2:15 pm – 2:45 pm
  2. Introduction: big data and data reuse; Eudeco project; generating new data vs data reuse. Legal and ethical issues: privacy, security; discrimination, stigmatization, polarization; consent, autonomy, self-determination; transparency, integrity, trust. Suggestions for solutions. Conclusions. More data => more opportunities. This calls for data sharing and reuse.
  3. The Eudeco project (3 years): five partners, four countries. Modeling the European Data Economy. Focus on big data and data reuse. Legal, societal, economic and technological perspectives. Big Data: • Volume (big) • Velocity (fast) • Variety (unstructured)
  4. People: social media; user generated content. Devices (Internet of Things): sensors (cameras, microphones); trackers (RFID tags, web surfing behavior); other (mobile phones, wearables, self-surveillance/quantified self).
  5. Data sharing: active role of data subjects (hence: consent). Data reuse (with/without consent): data recycling (data reuse for the same purpose); data repurposing (data reuse for new purposes); data recontextualisation (data reuse in a new context). Data reuse may: • be more efficient • be more effective (e.g., larger volumes, more completeness) • include historical data • not always match purposes and context • be difficult: technological (e.g., interoperability, data portability), legal (e.g., privacy laws), economic (e.g., competition). • Right to data portability • Right to be forgotten
  6. Facebook likes can predict: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender (Kosinski et al. 2013). Legal perspective: violations of privacy depend on your definition of privacy. Ethical perspective: violations of privacy depend on your expectations (subjective: personal expectations; objective: reasonable expectations). Unwanted disclosure of information: security (hacking, leaking); predictions. Unwanted use of information: transparency regarding decision-making; function creep. Informational privacy: Which data are used? For which purposes?
  7. (no text on this slide)
  8. Data may be discriminating: when police surveillance focuses on black neighborhoods, people in the database will be black (selective sampling). Patterns may be discriminating: a database may show top managers are male (self-fulfilling prophecy); people causing car accidents are >16 years old (non-novel pattern). Discrimination may be concealed/indirect: selection on zip code instead of ethnic background (redlining); selection on legitimate attributes correlated to discriminating attributes (masking). Discrimination, stigmatisation, polarisation.
  9. Privacy policies/Terms & Conditions. People do not read policies: reading everything would take 244 hours annually; users are willing to spend 1-5 minutes on this; Facebook: 9,500 words (>1 hour), LinkedIn: 7,500 words (~1 hour). People do not understand policies: policies are often highly legalistic, technical, or both; the devil is in the details. People do not grasp consequences. Preferred option is not available: take-it-or-leave-it decisions (check the box). Informational self-determination (Westin, 1967): people control who gets their data and for which purposes.
  10. Past, current, future? Big data is used for a lot of decision-making. Based on what data? Based on which analyses? Do you know how many databases you are in?
  11. Limiting Access to Sensitive Data. The basic idea is that if sensitive data are absent from the database/cloud, the resulting decisions/selections cannot be discriminating. However, restricting access is very difficult. According to information theory, the dissemination of data follows the laws of entropy: ▪ Information can easily be copied and multiplied ▪ Information can easily be distributed ▪ This process is irreversible
  12. Analyze the problem: Privacy Impact Assessments. Customize the solution: Privacy by Design ▪ Privacy enhancing tools ▪ Privacy preserving big data analytics ▪ Discrimination aware data mining. Since there is not one problem, there is no single solution. Combinations of smart solutions are required.
  13. New perspectives. Focus less on: limiting access to data; restricting use of data. Focus more on: transparency; responsibility. Restricting data access and use limits big data opportunities and is difficult to enforce.
  14. We need data sharing and data reuse. There are risks, however, regarding privacy, discrimination, consent, transparency. These risks can be addressed via responsible innovation: Privacy Impact Assessments; Privacy by Design ▪ Privacy enhancing tools ▪ Privacy preserving big data analytics ▪ Discrimination aware data mining. New approaches: focus less on limitations of access to data and use restrictions; focus more on transparency and responsibility.
  15. Thank you for your attention! Or contact me later:
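
The point made on slides 8 and 11, that leaving the sensitive attribute out of a selection does not by itself prevent discrimination when a correlated attribute such as zip code is used (redlining), can be illustrated with a small sketch. It is not taken from the presentation: the records, the zip codes "1000" and "2000", the groups "A" and "B", and the functions `decide` and `selection_rate_by_group` are all hypothetical and chosen only to make the effect visible.

```python
from collections import defaultdict

# Hypothetical, hand-made records of (zip_code, group).
# "group" is the sensitive attribute; "zip_code" is the seemingly
# legitimate attribute that happens to be correlated with it.
RECORDS = (
    [("1000", "A")] * 40 + [("1000", "B")] * 10
    + [("2000", "A")] * 10 + [("2000", "B")] * 40
)


def decide(zip_code):
    """Hypothetical selection rule that looks only at the zip code."""
    return zip_code == "1000"


def selection_rate_by_group(records):
    """Return the share of selected people per group.

    The group is used only to audit the outcome; it is never
    passed to the decision rule.
    """
    selected = defaultdict(int)
    total = defaultdict(int)
    for zip_code, group in records:
        total[group] += 1
        if decide(zip_code):
            selected[group] += 1
    return {group: selected[group] / total[group] for group in total}


rates = selection_rate_by_group(RECORDS)
print(rates)  # {'A': 0.8, 'B': 0.2}
```

In this made-up data set the rule never sees the group, yet 80% of group A and 20% of group B are selected, because zip code acts as a proxy. Auditing outcomes per group in this way is one simple form of the transparency that the later slides argue for; it still requires the sensitive attribute to be available for the audit.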