SlideShare a Scribd company logo
1 of 3
Download to read offline
Why is the EU GDPR so Misunderstood and Unfamiliar to so Many? Page 1 of 3
Why Compliance isn’t just reducing intrusions. What every DPO, CEO, CIO should know about GDPR
In a Dell Press Release “Dell announced results of a global survey on the European Union’s new
General Data Protection Regulation (GDPR).
• More than 80 percent of global respondents know few details or nothing about GDPR
• Less than one in three companies feel they are prepared for GDPR today
• 97 percent of companies don’t have a plan to be ready for GDPR
• Only 9 percent of IT & business professionals are confident they will be
fully ready for GDPR”
“To give you some idea of the punitive impact this can have, in the UK in 2016, Tesco Bank
experienced a data security breach that affected 9,000 of its customers. Had GDPR been in force
when this breach took place, Tesco Bank would have been fined
£1.9 billion ($2.3 billion)”cmswire
GDPR Key Points: 10 key facts businesses need to note about the GDPR – ComputerWeekly.com
1. GDPR applies to all – The GDPR applies to all companies worldwide that process personal data of
European Union (EU) citizens. See it in its entirety at GDPR.Institute
2. The GDPR widens the definition of personal data
The GDPR considers any data that can be used to identify an individual as personal data. It
includes, for the first time, things such as genetic, mental, cultural, economic or social
information.
3. The GDPR tightens the rules for obtaining valid consent to using personal information
The GDPR requires all organizations collecting personal data to be able to prove clear and
affirmative consent to process that data
4. The GDPR makes the appointment of a DPO mandatory for certain organizations
The GDPR requires public authorities processing personal information to appoint a data
protection officer (DPO)
5. The GDPR introduces mandatory PIAs
The inclusion of mandatory privacy impact assessments (PIAs) in the GDPR is mainly due to the
influence of the UK’s Information Commissioner’s Office, which has worked a lot with PIAs in the
past. The GDPR requires data controllers to conduct PIAs where privacy breach risks are high to
minimize risks to data subjects.
6. The GDPR introduces a common data breach notification requirement
The GDPR harmonizes the various data breach notification laws in Europe and is aimed at
ensuring organizations constantly monitor for breaches of personal data.
The regulation requires organizations to notify the local data protection authority of a data
breach within 72 hours of discovering it.
7. The GDPR introduces the right to be forgotten
The GDPR introduces very restrictive, enforceable data handling principles.
One of these is the data minimization principle that requires organizations not to hold data for
any longer than absolutely necessary, and not to change the use of the data from the purpose for
which it was originally collected, while – at the same time – they must delete any data at
the request of the data subject.
8. The GDPR expands liability beyond data controllers
In the past, only data controllers were considered responsible for data processing activities, but
the GDPR extends liability to all organizations that touch personal data.
The GDPR also covers any organization that provides data processing services to the data
controller, which means that even organizations that are purely service providers that work with
personal data will need to comply with rules such as data minimization.
9. The GDPR requires privacy by design
The GDPR requires that privacy is included in systems and processes by design.
This means that software, systems and processes must consider compliance with the principles of
data protection. However, the proper erasure of information, for example, is not something often
seen in software. But in the future, all software will be required to be capable of completely
erasing data, which will be a challenge for a lot of software engineers.
10. The GDPR introduces the concept of a one-stop shop
GDPR, which allows any European data protection authority to take action against organizations,
regardless of where in the world the company is based. GDPR enforcement is also backed by
significant fines of up to €20m or 4% of group annual global turnover. Page 1 of 3
Methodology Page 2
About Methodology:
As you can see from the above statements, intrusion protection is not the main focus of GDPR regulations.
They reach deeper into the true meaning of Data Governance and give the EU citizen the right to remain
anonymous, if desired, even from individuals within your own organization.
How do you become compliant with EU GDPR and other Regulatory Agencies without disrupting or
degrading your legacy systems or existing Hadoop Clusters? The days of using old generation Data Profiling
and Discovery tools are not the desired approach for sensitive production systems and in many cases not
even allowed.
BigDataRevealed (BDR) is a software product specifically designed to address the same issues that GDPR is
now mandating. BDR identifies and encrypts Personal Data by using Pattern Recognition algorithms and
features of our Intelligent Catalogue for maximum Personal Data discovery. BDR is written entirely within
the Hadoop framework for enhanced efficiency and allows BDR to process streaming data.
BDR is able to use the massive processing power of Hadoop to discover exposed PII data in legacy systems
by following the easy to use instructions attached to this document.
For a more in-depth description of BigDataRevealed’s (BDR) features please read the following.
What I suggest is to create a BigDataRevealed Hadoop Cluster by exporting data to Hadoop using either
sqoop or our Spark/Kafka connectors. Name the Folders and Files as closely to the names of the source
files brought into Hadoop.
As data gets loaded into Hadoop;
1. Allow BDR to discover and build Catalog/Metadata from the Hadoop Files copied from your legacy
data.
2. BDR will identify the most likely business columnar classifications and even give you the
percentage of data in that column that matches a pattern. As an example, a column may contain
the following:
a. Email 85%
b. Phone 10%
c. Unknown 2%
3. Then Run detailed Discovery on the files that searches for Personally Identifiable Information
using algorithms found in BDR’s Collaborative Library. The following patterns represent a small
sample of the many contained in BDR’s Library:
a. Names
b. Addresses
c. Zip Code
d. Social Security
e. Credit Card Info
f. User-Defined / Collaborative
g. And so on ….
4. Store the Folder/File/Column path and the Discovery Pattern Name where found
5. Use BDR’s Drilling capabilities to view the source data and verify results.
6. BDR will allow history to be kept every time these processes are run along with time stamps. This
will help identify when other jobs where run that may have contained PII and other sensitive data
that subsequently entered your databases and file systems.
Page 2 of 3
For Hadoop, BDR-SecureSequesterEncryption:
1. If you wish to secure Data within Hadoop, follow the process described below to Secure and
Sequester PII and other sensitive data:
a. Create a safe and secure Encrypted Hadoop Cluster off the Grid, off of the Web, so that
NO outside hacker can reach the data during this process. Or if protecting a current
Hadoop Cluster, create a large Encrypted Zone and use this during the
BDR-SecureSequesterEncryption Processes.
b. Copy the original files with Personally Identifiable Information into a Hadoop Encrypted
Zone
c. Instruct BDR to encrypt the PII and sensitive data and write a new safe, secure compliant
file.
d. BDR will Store the “AES 256” Encryption keys off the Grid and outside of the Edge Node
and use User Role Based Authority to access those keys when needed.
e. Delete the old files from HDFS and Hbase that contained PII and other sensitive data.
f. Now you have preserved your Original Files in an Encrypted Zone if desired.
g. And you have created new files with encrypted PII data while leaving other data in the
files for Data Scientist and others to use.
h. And you have deleted all files and iterations of files that contained exposed PII data from
HDFS and Hbase.
- This complete series of 1-6 above and 1 a-h can be continually processed at whatever latency
you desire to maintain ongoing compliance.
2. For Legacy Databases such as SQL, Mainframe, AS400, Documents and more …,
BDR-Secure-Sequester-Encrypt:
a. Using the Intelligent-Catalog/Metadata created in steps 1-6 above and stored by BDR.
a. Review each file processed by BDR in your Hadoop Cluster.
i. BDR identifies each column containing PII or other sensitive data.
ii. Identify what Files and columns PII or other sensitive data was found.
iii. Backup and secure Original Files/Data.
iv. Use SQL tools or queries to encrypt or remove PII or sensitive data from
legacy system.
No matter to what degree you feel ready to pass GDPR audits, you can quickly and easily use BDR to verify
your readiness and determine your real exposure. Just a few resources and several days could give that
peace of mind you are looking for.
Email or call us and feel comfortable you are on your way to GDPR compliance. For more information
about our discovery process and the BDR-SecureSequesterEncryption process that is facilitated using our
intelligent catalog and Metadata, contact us at info@bigdatarevealed.com or 847-791-7838.
https://vimeo.com/216064822 Video of recent Webinar
Page 3 of 3

More Related Content

More from Steven Meister

I have listed 3 informative youtube videos on the eu gdpr
I have listed 3 informative youtube videos on the eu gdprI have listed 3 informative youtube videos on the eu gdpr
I have listed 3 informative youtube videos on the eu gdprSteven Meister
 
Eu gdpr technical workflow and productionalization neccessary w privacy ass...
Eu gdpr technical workflow and productionalization   neccessary w privacy ass...Eu gdpr technical workflow and productionalization   neccessary w privacy ass...
Eu gdpr technical workflow and productionalization neccessary w privacy ass...Steven Meister
 
Gdpr questions for compliance difficulties
Gdpr questions for compliance difficultiesGdpr questions for compliance difficulties
Gdpr questions for compliance difficultiesSteven Meister
 
The U.S. Privacy Shield Frameworks is coming to America as is EU GDPR– It’s t...
The U.S. Privacy Shield Frameworks is coming to America as is EU GDPR– It’s t...The U.S. Privacy Shield Frameworks is coming to America as is EU GDPR– It’s t...
The U.S. Privacy Shield Frameworks is coming to America as is EU GDPR– It’s t...Steven Meister
 
BigDataRevealed SecureSequesterEncrypt - iot easy as 1-2-3 - catalog-metadata...
BigDataRevealed SecureSequesterEncrypt - iot easy as 1-2-3 - catalog-metadata...BigDataRevealed SecureSequesterEncrypt - iot easy as 1-2-3 - catalog-metadata...
BigDataRevealed SecureSequesterEncrypt - iot easy as 1-2-3 - catalog-metadata...Steven Meister
 
Big datarevealed hadoop catalog
Big datarevealed hadoop catalogBig datarevealed hadoop catalog
Big datarevealed hadoop catalogSteven Meister
 

More from Steven Meister (6)

I have listed 3 informative youtube videos on the eu gdpr
I have listed 3 informative youtube videos on the eu gdprI have listed 3 informative youtube videos on the eu gdpr
I have listed 3 informative youtube videos on the eu gdpr
 
Eu gdpr technical workflow and productionalization neccessary w privacy ass...
Eu gdpr technical workflow and productionalization   neccessary w privacy ass...Eu gdpr technical workflow and productionalization   neccessary w privacy ass...
Eu gdpr technical workflow and productionalization neccessary w privacy ass...
 
Gdpr questions for compliance difficulties
Gdpr questions for compliance difficultiesGdpr questions for compliance difficulties
Gdpr questions for compliance difficulties
 
The U.S. Privacy Shield Frameworks is coming to America as is EU GDPR– It’s t...
The U.S. Privacy Shield Frameworks is coming to America as is EU GDPR– It’s t...The U.S. Privacy Shield Frameworks is coming to America as is EU GDPR– It’s t...
The U.S. Privacy Shield Frameworks is coming to America as is EU GDPR– It’s t...
 
BigDataRevealed SecureSequesterEncrypt - iot easy as 1-2-3 - catalog-metadata...
BigDataRevealed SecureSequesterEncrypt - iot easy as 1-2-3 - catalog-metadata...BigDataRevealed SecureSequesterEncrypt - iot easy as 1-2-3 - catalog-metadata...
BigDataRevealed SecureSequesterEncrypt - iot easy as 1-2-3 - catalog-metadata...
 
Big datarevealed hadoop catalog
Big datarevealed hadoop catalogBig datarevealed hadoop catalog
Big datarevealed hadoop catalog
 

Recently uploaded

Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in collegessuser7a7cd61
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 

Recently uploaded (20)

Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in college
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 

Why is the eu gdpr so misunderstood and unfamiliar to so many

  • 1. Why is the EU GDPR so Misunderstood and Unfamiliar to so Many? Page 1 of 3 Why Compliance isn’t just reducing intrusions. What every DPO, CEO, CIO should know about GDPR In a Dell Press Release “Dell announced results of a global survey on the European Union’s new General Data Protection Regulation (GDPR). • More than 80 percent of global respondents know few details or nothing about GDPR • Less than one in three companies feel they are prepared for GDPR today • 97 percent of companies don’t have a plan to be ready for GDPR • Only 9 percent of IT & business professionals are confident they will be fully ready for GDPR” “To give you some idea of the punitive impact this can have, in the UK in 2016, Tesco Bank experienced a data security breach that affected 9,000 of its customers. Had GDPR been in force when this breach took place, Tesco Bank would have been fined £1.9 billion ($2.3 billion)”cmswire GDPR Key Points: 10 key facts businesses need to note about the GDPR – ComputerWeekly.com 1. GDPR applies to all – The GDPR applies to all companies worldwide that process personal data of European Union (EU) citizens. See it in its entirety at GDPR.Institute 2. The GDPR widens the definition of personal data The GDPR considers any data that can be used to identify an individual as personal data. It includes, for the first time, things such as genetic, mental, cultural, economic or social information. 3. The GDPR tightens the rules for obtaining valid consent to using personal information The GDPR requires all organizations collecting personal data to be able to prove clear and affirmative consent to process that data 4. The GDPR makes the appointment of a DPO mandatory for certain organizations The GDPR requires public authorities processing personal information to appoint a data protection officer (DPO) 5. The GDPR introduces mandatory PIAs The inclusion of mandatory privacy impact assessments (PIAs) in the GDPR is mainly due to the influence of the UK’s Information Commissioner’s Office, which has worked a lot with PIAs in the past. The GDPR requires data controllers to conduct PIAs where privacy breach risks are high to minimize risks to data subjects. 6. The GDPR introduces a common data breach notification requirement The GDPR harmonizes the various data breach notification laws in Europe and is aimed at ensuring organizations constantly monitor for breaches of personal data. The regulation requires organizations to notify the local data protection authority of a data breach within 72 hours of discovering it. 7. The GDPR introduces the right to be forgotten The GDPR introduces very restrictive, enforceable data handling principles. One of these is the data minimization principle that requires organizations not to hold data for any longer than absolutely necessary, and not to change the use of the data from the purpose for which it was originally collected, while – at the same time – they must delete any data at the request of the data subject. 8. The GDPR expands liability beyond data controllers In the past, only data controllers were considered responsible for data processing activities, but the GDPR extends liability to all organizations that touch personal data. The GDPR also covers any organization that provides data processing services to the data controller, which means that even organizations that are purely service providers that work with personal data will need to comply with rules such as data minimization. 9. The GDPR requires privacy by design The GDPR requires that privacy is included in systems and processes by design. This means that software, systems and processes must consider compliance with the principles of data protection. However, the proper erasure of information, for example, is not something often seen in software. But in the future, all software will be required to be capable of completely erasing data, which will be a challenge for a lot of software engineers. 10. The GDPR introduces the concept of a one-stop shop GDPR, which allows any European data protection authority to take action against organizations, regardless of where in the world the company is based. GDPR enforcement is also backed by significant fines of up to €20m or 4% of group annual global turnover. Page 1 of 3 Methodology Page 2
  • 2. About Methodology: As you can see from the above statements, intrusion protection is not the main focus of GDPR regulations. They reach deeper into the true meaning of Data Governance and give the EU citizen the right to remain anonymous, if desired, even from individuals within your own organization. How do you become compliant with EU GDPR and other Regulatory Agencies without disrupting or degrading your legacy systems or existing Hadoop Clusters? The days of using old generation Data Profiling and Discovery tools are not the desired approach for sensitive production systems and in many cases not even allowed. BigDataRevealed (BDR) is a software product specifically designed to address the same issues that GDPR is now mandating. BDR identifies and encrypts Personal Data by using Pattern Recognition algorithms and features of our Intelligent Catalogue for maximum Personal Data discovery. BDR is written entirely within the Hadoop framework for enhanced efficiency and allows BDR to process streaming data. BDR is able to use the massive processing power of Hadoop to discover exposed PII data in legacy systems by following the easy to use instructions attached to this document. For a more in-depth description of BigDataRevealed’s (BDR) features please read the following. What I suggest is to create a BigDataRevealed Hadoop Cluster by exporting data to Hadoop using either sqoop or our Spark/Kafka connectors. Name the Folders and Files as closely to the names of the source files brought into Hadoop. As data gets loaded into Hadoop; 1. Allow BDR to discover and build Catalog/Metadata from the Hadoop Files copied from your legacy data. 2. BDR will identify the most likely business columnar classifications and even give you the percentage of data in that column that matches a pattern. As an example, a column may contain the following: a. Email 85% b. Phone 10% c. Unknown 2% 3. Then Run detailed Discovery on the files that searches for Personally Identifiable Information using algorithms found in BDR’s Collaborative Library. The following patterns represent a small sample of the many contained in BDR’s Library: a. Names b. Addresses c. Zip Code d. Social Security e. Credit Card Info f. User-Defined / Collaborative g. And so on …. 4. Store the Folder/File/Column path and the Discovery Pattern Name where found 5. Use BDR’s Drilling capabilities to view the source data and verify results. 6. BDR will allow history to be kept every time these processes are run along with time stamps. This will help identify when other jobs where run that may have contained PII and other sensitive data that subsequently entered your databases and file systems. Page 2 of 3
  • 3. For Hadoop, BDR-SecureSequesterEncryption: 1. If you wish to secure Data within Hadoop, follow the process described below to Secure and Sequester PII and other sensitive data: a. Create a safe and secure Encrypted Hadoop Cluster off the Grid, off of the Web, so that NO outside hacker can reach the data during this process. Or if protecting a current Hadoop Cluster, create a large Encrypted Zone and use this during the BDR-SecureSequesterEncryption Processes. b. Copy the original files with Personally Identifiable Information into a Hadoop Encrypted Zone c. Instruct BDR to encrypt the PII and sensitive data and write a new safe, secure compliant file. d. BDR will Store the “AES 256” Encryption keys off the Grid and outside of the Edge Node and use User Role Based Authority to access those keys when needed. e. Delete the old files from HDFS and Hbase that contained PII and other sensitive data. f. Now you have preserved your Original Files in an Encrypted Zone if desired. g. And you have created new files with encrypted PII data while leaving other data in the files for Data Scientist and others to use. h. And you have deleted all files and iterations of files that contained exposed PII data from HDFS and Hbase. - This complete series of 1-6 above and 1 a-h can be continually processed at whatever latency you desire to maintain ongoing compliance. 2. For Legacy Databases such as SQL, Mainframe, AS400, Documents and more …, BDR-Secure-Sequester-Encrypt: a. Using the Intelligent-Catalog/Metadata created in steps 1-6 above and stored by BDR. a. Review each file processed by BDR in your Hadoop Cluster. i. BDR identifies each column containing PII or other sensitive data. ii. Identify what Files and columns PII or other sensitive data was found. iii. Backup and secure Original Files/Data. iv. Use SQL tools or queries to encrypt or remove PII or sensitive data from legacy system. No matter to what degree you feel ready to pass GDPR audits, you can quickly and easily use BDR to verify your readiness and determine your real exposure. Just a few resources and several days could give that peace of mind you are looking for. Email or call us and feel comfortable you are on your way to GDPR compliance. For more information about our discovery process and the BDR-SecureSequesterEncryption process that is facilitated using our intelligent catalog and Metadata, contact us at info@bigdatarevealed.com or 847-791-7838. https://vimeo.com/216064822 Video of recent Webinar Page 3 of 3