Shadid Chowdhury
GDPR and Data Lake
• Understanding GDPR
• GDPR from Data Lake perspective
• Solving Data Controller’s responsibility
• Solving Data Subject’s right
• Process recommendation
• Final thoughts
Disclaimer: This is not legal advice!
Goal: GDPR compliant Data Lake
• GDPR is the most important change in data privacy regulation in 20 years
• Enforced from 25 May 2018
• Fines of up to 4% of annual global turnover or €20 million (whichever is greater)
General Data Protection Regulation
GDPR from Data Lake perspective
[Figure: keywords: Aggregation, Pseudonymization, Anonymization, Consent, Legitimate Interest, Vendor’s Solution]
• The EU General Data Protection Regulation (GDPR) is the most important change in data privacy regulation in 20 years
• 99 Articles
• Data controller’s responsibility
• Data subject’s right
GDPR
• Data Controller
• Lawfulness of processing based on consent
• Records of processing activities and personal data
• Data protection by design and default
• Cooperation with supervisory authority
Data Controller’s Responsibility
• Data Subject, consumer
• Right of access
• Data portability
• Right to be forgotten
• Right to object, rectify
Data Subject’s Right
• Data Controller
• Lawfulness of processing based on consent
• Records of processing activities and personal data
• Data protection by design and default
• Cooperation with supervisory authority
• Data Subject, consumer
• Right of access
• Data portability
• Right to be forgotten
• Right to object, rectify
GDPR from Data Lake Perspective
Understanding Data Lake
• Disjoint files
• Easy to replicate
• Different teams
• No built-in governance
Data Lake
GDPR & Data Lake
Image Source: https://mindfulmvmnt.org/2016/08/09/sciatica-piriformis-syndrome-condition-breakdown-w-corrective-yoga/
Solution
• There is no silver-bullet solution
• The right approach depends on the use case
Solution approach
• Data Controller
• Lawfulness of processing based on consent
• Records of processing activities and personal data
• Data protection by design and default
• Cooperation with supervisory authority
Recap: Data Controller’s Responsibility
Lawfulness of processing
• Anonymization – re-identification is NOT possible
• Pseudonymization – re-identification is possible
• Personal data – identifies a person directly or indirectly
• Special category of personal data – ethnic origin, political or religious views, health, etc.
Rest of the talk assumes: Personal data
True Anonymisation?
[Figure: Anonymization plotted against Value, with both axes running from Low to High]
• Anonymized
• Pseudonymized
• Personal Data
• Special category of personal data
Personal Data Minimisation
[Figure label: Lake]
Anonymize everything
[Diagram: Sources (batch sources, streaming source), Ingestion, Transient Storage, Raw Storage, Aggregated Storage, Consumer Channels (Analytics, BI)]
Personal data: pseudonymized
[Diagram: Sources (batch sources, streaming source), Ingestion, Transient Storage, Raw Storage, Aggregated Storage, Consumer Channels (Analytics, BI)]
Pseudonymization techniques
• For each data source
• Direct identifiers
  – Encryption
    1. Symmetric/asymmetric
    2. Per person/per purpose
  – Hashing ID + salt
  – Save the hash/key mapping in a lookup table (consent, legal, or legitimate interest)
• Indirect identifiers
  – Aggregation, generalization, etc.
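The techniques listed above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses HMAC-SHA-256 with the salt as key as one possible way to combine "ID + salt", keeps the lookup table in memory, and generalizes age into 10-year ranges. The names `pseudonymize`, `generalize_age`, `lookup_table` and the field names are hypothetical and do not come from the talk.

```python
import hashlib
import hmac
import secrets

# Hypothetical lookup table: pseudonym -> original identifier.
# In a real lake this would sit in a separate, access-controlled store.
lookup_table: dict[str, str] = {}


def pseudonymize(identifier: str, salt: bytes) -> str:
    """Replace a direct identifier with a keyed SHA-256 digest and record the mapping."""
    digest = hmac.new(salt, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    lookup_table[digest] = identifier
    return digest


def generalize_age(age: int, bucket: int = 10) -> str:
    """Generalize an indirect identifier (age) into a range such as '30-39'."""
    low = (age // bucket) * bucket
    return f"{low}-{low + bucket - 1}"


# Assumption: one secret salt per data source, stored away from the lake.
salt = secrets.token_bytes(32)

record = {"msisdn": "+467308080", "age": 34}
safe_record = {
    "msisdn": pseudonymize(record["msisdn"], salt),
    "age": generalize_age(record["age"]),
}
```

Because the digest is deterministic for a given salt, the same person maps to the same pseudonym across files, so joins still work while the raw identifier stays out of the lake.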
Personal data: in a single place
[Diagram: Sources (batch sources, streaming source), Ingestion, Transient Storage, Raw Storage, Aggregated Storage, Consumer Channels (Analytics, BI)]
Personal data: pseudonymized
[Diagram: Sources (batch sources, streaming source), Ingestion, Transient Storage, Consent, Consumer Channels (Analytics, BI)]
Pseudonymized data, separated
[Diagram: Sources (batch sources, streaming source), Ingestion, Transient Storage, Consent, Consumer Channels (Analytics, BI)]
Personal Data: Log Access
[Diagram: Sources (batch sources, streaming source), Ingestion, Transient Storage, Consent, Consumer Channels (Analytics, BI)]
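Access logging for personal data could look roughly like the sketch below. It is a minimal illustration, not the talk's design: the `with_access_log` wrapper, the in-memory `access_log` list and the field names are all hypothetical, and a real lake would write to a durable audit store.

```python
import json
from datetime import datetime, timezone
from typing import Callable

# Hypothetical audit sink; stands in for a durable, append-only store.
access_log: list[str] = []


def with_access_log(
    reader: Callable[[str], dict], user: str, purpose: str
) -> Callable[[str], dict]:
    """Wrap a read function so every access to personal data leaves a log entry."""

    def wrapper(record_id: str) -> dict:
        access_log.append(
            json.dumps(
                {
                    "at": datetime.now(timezone.utc).isoformat(),
                    "user": user,
                    "purpose": purpose,
                    "record_id": record_id,
                }
            )
        )
        return reader(record_id)

    return wrapper


store = {"c-1": {"msisdn": "+467308080"}}
read_for_care = with_access_log(store.__getitem__, user="agent-7", purpose="customer_care")
row = read_for_care("c-1")
```

Recording the purpose next to the reader's identity is a design choice here: it lets the log answer both who accessed a record and on what basis.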
• If a user withdraws consent later
• How would you restrict processing?
Multiple consent for same data source
User         Marketing Campaign   Customer Care
+467308080   Yes                  Yes
+467000601   Yes                  Yes

User         Marketing Campaign   Customer Care
+467308080   Yes                  Yes
+467000601   [one remaining Yes; column unclear in source]
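One way to answer the "how would you restrict processing?" question is a per-purpose consent check applied before any job reads the data. The sketch below is an assumption-laden illustration: `ConsentRegistry`, `restrict` and the purpose names are hypothetical, and which consent the second user withdraws is chosen arbitrarily for the example.

```python
class ConsentRegistry:
    """Tracks which purposes each user has consented to."""

    def __init__(self) -> None:
        self._granted: dict[str, set[str]] = {}

    def grant(self, user: str, purpose: str) -> None:
        self._granted.setdefault(user, set()).add(purpose)

    def withdraw(self, user: str, purpose: str) -> None:
        self._granted.get(user, set()).discard(purpose)

    def allows(self, user: str, purpose: str) -> bool:
        return purpose in self._granted.get(user, set())


def restrict(records: list[dict], registry: ConsentRegistry, purpose: str) -> list[dict]:
    """Keep only records whose user currently consents to this purpose."""
    return [r for r in records if registry.allows(r["user"], purpose)]


registry = ConsentRegistry()
for user in ("+467308080", "+467000601"):
    registry.grant(user, "marketing_campaign")
    registry.grant(user, "customer_care")

# Hypothetical withdrawal, picked only for illustration.
registry.withdraw("+467000601", "marketing_campaign")

records = [
    {"user": "+467308080", "event": "call"},
    {"user": "+467000601", "event": "sms"},
]
marketing_view = restrict(records, registry, "marketing_campaign")
care_view = restrict(records, registry, "customer_care")
```

The check runs at read time, so a withdrawal takes effect on the next job without rewriting the stored files.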
• Model around purpose
• Pros
  – Simplifies GDPR compliance
• Cons
  – Increases storage
Multiple consent for same data source
[Diagram: purposes p1, p2, …, pn]
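Modelling around purpose can be read as materializing one dataset per purpose, each containing only the users who consented to it. The sketch below illustrates that reading under assumptions: `split_by_purpose` and the sample data are hypothetical, and the per-purpose copies are exactly where the extra storage comes from.

```python
from collections import defaultdict


def split_by_purpose(
    records: list[dict], consents: dict[str, set[str]]
) -> dict[str, list[dict]]:
    """Return purpose -> records of users who consented to that purpose."""
    by_purpose: dict[str, list[dict]] = defaultdict(list)
    for record in records:
        for purpose in sorted(consents.get(record["user"], ())):
            # Each purpose gets its own copy, so storage grows with the
            # number of purposes a user has consented to.
            by_purpose[purpose].append(dict(record))
    return dict(by_purpose)


consents = {"u1": {"p1", "p2"}, "u2": {"p1"}}
records = [{"user": "u1", "value": 10}, {"user": "u2", "value": 20}]
datasets = split_by_purpose(records, consents)
```

A job for purpose `p2` then reads only `datasets["p2"]` and never sees users who did not consent to it.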
• Minimization of personal data
• Lawfulness of processing
• Traceability of processing
• Data protection by design and by default
Data Controller’s Responsibility: Solution
Principles
• Data Subject, consumer
• Right of access
• Data portability
• Right to be forgotten
• Right to object, rectify
Recap: Data Subject’s Right
Right of Data Subject
• On the lake, removing the data subject's entry (mapped key or hashed ID) from the lookup table is sufficient to implement the right to be forgotten
Right to be forgotten
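The idea can be illustrated with a small sketch: rows in the lake keep their pseudonym, and forgetting a person means deleting the entry that links the pseudonym back to them. Everything here is hypothetical (the `forget` and `resolve` helpers, the made-up pseudonyms and rows); it only shows the mechanics, not whether the result meets the legal bar in a given case.

```python
# Hypothetical lookup table: pseudonym -> original identifier.
lookup_table = {"a1f3": "+467308080", "9c2e": "+467000601"}

# Rows in the lake carry only the pseudonym.
lake_rows = [{"user": "a1f3", "calls": 12}, {"user": "9c2e", "calls": 3}]


def forget(table: dict[str, str], identifier: str) -> int:
    """Delete every mapping that points at the identifier; return how many were removed."""
    pseudonyms = [p for p, original in table.items() if original == identifier]
    for pseudonym in pseudonyms:
        del table[pseudonym]
    return len(pseudonyms)


def resolve(table: dict[str, str], pseudonym: str):
    """Return the original identifier, or None once the mapping is gone."""
    return table.get(pseudonym)


removed = forget(lookup_table, "+467000601")
```

After the call, the row with pseudonym `9c2e` is still in the lake but can no longer be resolved through the table.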
Keep metadata & lineage
[Diagram: Sources (batch sources, streaming source), Ingestion, Transient Storage, Consent, Consumer Channels (Analytics, BI)]
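Keeping lineage makes it possible to find every dataset derived from a given source, for example when a data subject's request has to be propagated downstream. The sketch below assumes lineage is stored as a simple parent-to-children mapping; the `downstream` function and the dataset names are hypothetical.

```python
from collections import deque


def downstream(lineage: dict[str, list[str]], start: str) -> list[str]:
    """Breadth-first walk returning every dataset derived from `start`."""
    seen = {start}
    order: list[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Hypothetical lineage graph: dataset -> datasets derived from it.
lineage = {
    "raw/calls": ["clean/calls"],
    "clean/calls": ["agg/daily_usage", "agg/churn_features"],
    "agg/churn_features": ["bi/churn_dashboard"],
}
affected = downstream(lineage, "raw/calls")
```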
Self service: Automated Reports
[Diagram: Sources (batch sources, streaming source), Ingestion, Transient Storage, Consent, Consumer Channels (Analytics, BI)]
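An automated report for a data subject could be generated from a metadata catalog, roughly as sketched below. The catalog structure (`DatasetEntry`), the `subject_report` function and the sample entries are assumptions made for illustration; the talk does not specify a format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """Hypothetical catalog entry describing one dataset in the lake."""

    name: str
    purpose: str
    derived_from: tuple
    subjects: frozenset  # pseudonyms present in the dataset


def subject_report(catalog: list, pseudonym: str) -> list:
    """List every dataset holding the subject's data, with purpose and origin."""
    return [
        {
            "dataset": entry.name,
            "purpose": entry.purpose,
            "derived_from": list(entry.derived_from),
        }
        for entry in sorted(catalog, key=lambda e: e.name)
        if pseudonym in entry.subjects
    ]


catalog = [
    DatasetEntry("raw/calls", "customer_care", (), frozenset({"a1f3", "9c2e"})),
    DatasetEntry(
        "agg/campaign_targets", "marketing_campaign", ("raw/calls",), frozenset({"a1f3"})
    ),
]
report = subject_report(catalog, "9c2e")
```

The same lookup could back both a right-of-access response and a portability export, since it already names each dataset and why it exists.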
• Governance in a single place
• Rich metadata
• Self service
Right of Data Subject: Solution Principles
• Apply a Privacy Impact Assessment (PIA) to each data source, involving the DPO
• Develop tests for anonymization together with statisticians and scientists
• Test the anonymization level against existing data sources
• Solutions need to be applied to Data Processors as well
Process
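An anonymization-level test could take many forms; the talk does not name one. As one possible example, the sketch below measures k-anonymity, a technique chosen here for illustration: the smallest number of rows that share the same combination of quasi-identifiers. The function name, column names and sample rows are hypothetical.

```python
from collections import Counter


def k_anonymity(rows: list, quasi_identifiers: list) -> int:
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


rows = [
    {"age": "30-39", "zip": "111", "plan": "A"},
    {"age": "30-39", "zip": "111", "plan": "B"},
    {"age": "40-49", "zip": "112", "plan": "A"},
]
k_before = k_anonymity(rows, ["age", "zip"])

# Adding a second row to the lone group raises k from 1 to 2.
rows.append({"age": "40-49", "zip": "112", "plan": "C"})
k_after = k_anonymity(rows, ["age", "zip"])
```

A pipeline test could then fail a dataset whose k falls below a threshold agreed with the statisticians and the DPO.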
GDPR is a blessing in disguise!


Editor's Notes

  1. Broadens the definition of personal data. More responsibility on the Data Controller. Lawfulness of processing. Data Subject’s rights, for example the right to be forgotten or the right to portability. Heavy fines.
  2. Vendors or products won't solve everything. There is no one-size-fits-all solution.
  3. Recommended for GDPR: processing and processors do not need to identify individuals. Remember that pseudonymized data is still considered personal data, even if the mapping is written down on paper or locked in a vault. The GDPR defines pseudonymisation in Article 4(5) as “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person”. Pseudonymization does not remove all identifying information from the data but merely reduces the linkability of a dataset with the original identity of an individual (e.g., via an encryption scheme).
  4. Track all metadata and lineage, and keep the whole graph based on that lineage. Provide services that track and build a report of each user's data, processing, etc. Keep metadata, lineage, and tags in a single source of governance on the lake. Use tag-based dynamic security.