GDPR = General Data Protection Regulation, or GDPR = Get Demand Payment Ready when you're hacked or audited.
A realistic project plan for GDPR compliance. The reality is that 95% of companies are not ready, and even the 5% who say they are will not like what they see in this plan on the road to GDPR compliance. There is simply not enough time or staff to get it done in the next 8 months, or even in 2 years. This is a harsh reality, and without software technology and strict yet flexible, repeatable methodologies, it just won't happen. Look at this project plan of what needs to be done, do the math, see the complexity of the data movement, code and programs needed, then give us a call.
Gdpr ccpa steps to near as close to compliancy as possible with low risk of f... (Steven Meister)
How to become GDPR & CCPA compliant. See the complete 5-page GDPR/CCPA compliance plan.
Here is the CCPA / GDPR 3 Day Training PowerPoint - https://www.slideshare.net/StevenMeister/ccpa-and-gdpr-three-day-training-with-actual-deliverables-and-the-whys-and-hows-to-do-so
847-440-4439 https://www.youtube.com/channel/UC3F-qrvOIOwDj4ZKBMmoTWA?view_as=subscriber
GDPR 16 page PPT Plan - https://www.slideshare.net/StevenMeister/gdpr-ccpa-automated-compliance-spark-java-application-features-and-functions-of-big-datarevealed-april-version-35
https://youtu.be/JGoQwoicUxw
Comprehensive Metadata Catalog Video for GDPR / CCPA - https://youtu.be/xryESgfzRcc
Gdpr ccpa automated compliance - spark java application features and functi... (Steven Meister)
GDPR – CCPA Automated Technology: a 16-page PowerPoint with features, functions, architecture and our reasons for choosing them. Be on your way to compliance with technology created with compliance as its goal. Without technology built specifically for compliance regimes such as GDPR, CCPA, HIPAA and others, expect to add years of development.
After scrolling through this PowerPoint you will realize just what is required, and you will be able to better estimate the effort it will take for your company to meet these regulatory requirements, first with this technology and then without it.
Spend just 5-10 minutes; it might save your company, and your customers, all the negative ramifications of the roughly two breaches a year a company can expect to suffer.
This PowerPoint covers the critical aspects and needs that are present in any project designed to meet regulatory requirements for GDPR, CCPA and many others.
Complete Channel of Videos on BigDataRevealed
https://www.youtube.com/watch?v=3rLcQF5Wsgc&list=UU3F-qrvOIOwDj4ZKBMmoTWA
847-440-4439
#CCPA #GDPR #BigData #DataCompliance #PII #Facebook #Hadoop #AWS #Spark #IoT #California
Big Data Everywhere Chicago: The Big Data Imperative -- Discovering & Protect... (BigDataEverywhere)
Today, no industry is immune from a potential data breach and the havoc it can create. According to a 2013 Global Data Breach study by the Ponemon Institute, the average cost of data loss exceeds $5.4 million per breach, and the average per person cost of lost data is approaching $200 per record in the US. Protecting sensitive data in Hadoop is now the imperative for IT and business. With the emergence of Hadoop as a business-critical data platform, Hadoop offers organizations opportunities to improve performance, better understand customers and develop a competitive advantage. But reaching these desirable analytic outcomes depends on the ability to use data without exposing the organization to unnecessary risk. This presentation will cover best practices for a data-centric security, compliance and data governance approach, with a particular focus on two customer use cases within the financial services and insurance industries. You'll learn how these companies are reducing their security exposure through automated data-centric protection of sensitive data in Hadoop.
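The "automated data-centric protection" described here typically means transforming the sensitive values themselves rather than only fencing off the cluster. Below is a minimal, illustrative Spark (Java) sketch of one such transformation, salted hashing of PII columns before data lands in the analytics zone; the file path, column names and salt handling are assumptions for the example, not details from the talk.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.*;

public class ProtectPii {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("data-centric-protection")
                .master("local[*]")
                .getOrCreate();

        // Hypothetical input: a customer extract landed in Hadoop with
        // columns name, ssn, email plus non-sensitive attributes.
        Dataset<Row> raw = spark.read().option("header", "true")
                .csv("hdfs:///landing/customers.csv");

        // Data-centric protection: replace each sensitive value with a
        // salted SHA-256 digest so analytics can still join/count on the
        // column without ever seeing the cleartext.
        String salt = "change-me-per-deployment";
        Dataset<Row> protectedDs = raw
                .withColumn("ssn",   sha2(concat_ws("|", lit(salt), col("ssn")), 256))
                .withColumn("email", sha2(concat_ws("|", lit(salt), col("email")), 256));

        protectedDs.write().mode("overwrite")
                .parquet("hdfs:///secure/customers_protected");
        spark.stop();
    }
}
```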
In today's increasingly competitive world, quickly identifying relevant and hidden knowledge, internal expertise and experience is critical to meeting client demands, securing new clients and cases, reviewing precedents and outcomes, and leveraging collective IP for strategic advantage. OpenText Decisiv instantly finds, organizes, and helps you gain insights from your data for competitive advantage. To learn more, email salt@opentext.com
It is almost impossible to escape the topic of Data Science. While the core of Data Science has remained the same over the last decade, its emergence to the forefront is spurred by both the availability of new data types and a true realization of the value that it delivers. In this session, we will provide an overview of data science and the different classes of machine learning algorithms, and deliver an end-to-end demonstration of performing machine learning using Hadoop. Audience: Developers, Data Scientists, Architects and System Engineers.
Recording: https://hortonworks.webex.com/hortonworks/lsr.php?RCID=4175a7421d00257f33df146f50c41af8
RDBMS gave us table schemas. A table schema, which is an essential metadata component, gave us the power to validate data types and enforce constraints. In the age of varying data and schema-less data stores, how can we enforce these rules, and how can we leverage metadata (even in RDBMS) to empower data validity, code checks, and automation?
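As one concrete illustration of the question above, a metadata catalog entry can carry the type and format rules a table schema would otherwise enforce, and an ingest job can apply them to schema-less records. The sketch below is a minimal, hypothetical Java example; the field names and patterns are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// A minimal sketch: a metadata catalog entry as a map of field name ->
// validation pattern, applied to a schema-less record at ingest time.
public class MetadataValidator {
    // Hypothetical rules that a table schema would otherwise enforce.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("customer_id", Pattern.compile("\\d{1,10}"));
        RULES.put("email",       Pattern.compile("[^@\\s]+@[^@\\s]+\\.[^@\\s]+"));
        RULES.put("birth_date",  Pattern.compile("\\d{4}-\\d{2}-\\d{2}"));
    }

    static boolean isValid(Map<String, String> record) {
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            String value = record.get(rule.getKey());
            // Enforce presence and format, as a NOT NULL + CHECK would.
            if (value == null || !rule.getValue().matcher(value).matches()) {
                System.err.println("Violation on field: " + rule.getKey());
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> rec = new LinkedHashMap<>();
        rec.put("customer_id", "42");
        rec.put("email", "jane@example.com");
        rec.put("birth_date", "1980-07-14");
        System.out.println("valid = " + isValid(rec));
    }
}
```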
This is a brief background on big data (the data lake) to put in context the importance of metadata from a governance perspective, especially in today's heterogeneous big data platforms.
Learn about custom data capture or imaging tool development from DocuFi. Whether you need a mobile, desktop or cloud application, driver or plug-in, DocuFi has the building blocks and experience to build your customization quickly and efficiently.
The January call will focus on introducing the concepts of open development, software lifecycle and upcoming open projects. We have a number of projects on the roadmap and would like to give the community an opportunity to help prioritize the list.
We'll discuss the upcoming GT.M Integration project to more tightly couple OpenVista and GT.M. You can read the proposals and discuss this project at Medsphere.org, see the project homepage here: http://medsphere.org/community/roadmap/gtm
Please feel free to invite any colleagues that might find this topic relevant or interesting.
When: January 15, 12:30 - 2pm Pacific
Where: Dial-in: (888) 346-3950 // Participant Code: 1302465
Web conference: http://www.medsphere.com/infinite/
What: Open Development
- Ecosystems at work
- Open Development Introduction
- Community Project Overview
- GT.M Project Introduction
- Project Review
- Medsphere.org: Tip of the Month
===
The community calls are listed on the Medsphere.org event calendar (http://medsphere.org/community-events/) and we will update each month's call as the agenda is solidified.
Details and Recording available here: http://medsphere.org/blogs/events/2009/01/15/community-call-january-2009
Best Practices: Data Virtualization Perspectives and Best Practices (Denodo)
These are the slides from a presentation given by Rajeev Rangachari, Senior Technology Architect, Infosys, at the Fast Data Strategy Roadshow in San Francisco. Infosys was the official co-sponsor of this event.
For more information about our partners Infosys, follow this link: https://goo.gl/wVy5j4
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB (Denodo)
Data integration is paramount. This presentation compares three different paradigms: client-side tools, traditional data warehouses, and the data virtualization solution (the logical data warehouse), positioning data virtualization as an integral part of any future-proof IT infrastructure.
This presentation is part of the Fast Data Strategy Conference, and you can watch the video here goo.gl/1q94Ka.
Azure data catalog your data your way eugene polonichko dataconf 21 04 18 (Olga Zinkevych)
Topic of presentation: Azure Data Catalog: your data, your way
The main points of the presentation: It's a fully managed service that lets anyone (from analyst to data scientist to data developer) register, enrich, discover, understand, and consume data sources.
http://dataconf.com.ua/speaker-page/eugene-polonichko.php
https://www.youtube.com/watch?v=wceGzcQcPOo&list=PL5_LBM8-5sLjbRFUtXaUpg84gtJtyc4Pu&t=0s&index=4
Learn how intelligent data capture has replaced scanning for archival. Understand how recognition technologies and capture software including advanced OCR, barcodes and regex, combine to extract your important data seamlessly from scans and existing files. The time is now to truly turn your content into data.
Why Data Virtualization? An Introduction by Denodo (Justo Hidalgo)
Data Virtualization means Real-time Data Access and Integration. But why do I need it? This presentation tries to answer it in a simple yet clear way.
By Alberto Pan, CTO of Denodo, and Justo Hidalgo, VP Product Management.
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap (Denodo)
Watch the full session: Denodo DataFest 2016 sessions: https://goo.gl/ptGwp7
Curious about product roadmap? In this session, we will review some of the new key features introduced this year in the Denodo Platform in areas such as performance, self-service, security and monitoring. We will also take a sneak peek at the most exciting features in the roadmap for Denodo 7.0.
In this session, you will learn:
• New performance-related features in big data scenarios
• New governance and self-service features
• New connectivity, data transformation, and enterprise-wide deployment features
This session is part of the Denodo DataFest 2016 event. You can also watch more Denodo DataFest sessions on demand here: https://goo.gl/VXb6M6
Persistent Identifiers in EUDAT services (EUDAT, www.eudat.eu)
The EUDAT data domain handles registered data. Each digital object should have a persistent identifier. This persistent identifier is used for: replica identification; identification of the repository of record (in the case of replication); querying of additional information; checksum (time-stamped)...
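For illustration only, the sketch below builds the kind of PID record the description lists (identifier, repository of record, time-stamped checksum) in plain Java. The handle prefix, repository name and record layout are invented; real EUDAT PIDs are minted by a handle service, and this is not EUDAT's API.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.time.Instant;

// A minimal sketch of a persistent-identifier record for a replica:
// identifier, repository of record, and a time-stamped checksum. The
// record layout is illustrative, not EUDAT's actual handle schema.
public class PidRecord {
    public static void main(String[] args) throws Exception {
        byte[] data = Files.readAllBytes(Paths.get(args[0]));

        // Time-stamped checksum, used later to verify replica integrity.
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));

        // Hypothetical handle prefix/suffix; real PIDs are minted by a
        // handle service, not derived locally.
        System.out.println("PID:        11100/" + hex.substring(0, 16));
        System.out.println("REPOSITORY: repo-of-record.example.org");
        System.out.println("CHECKSUM:   sha256:" + hex);
        System.out.println("STAMPED_AT: " + Instant.now());
    }
}
```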
Learn about ChronoScan for document scanning, data extraction and integration into your ECM, CMIS compliant, or line of business database. ChronoScan's software provides a comprehensive set of features for all your data capture needs. Viewers will be able to answer "What is ChronoScan".
Learn more about Hitachi Content Platform Anywhere by visiting http://www.hds.com/products/file-and-content/hitachi-content-platform-anywhere.html
and more information on the Hitachi Content Platform is at http://www.hds.com/products/file-and-content/content-platform
Introduction to Data Virtualization (session 1 from Packed Lunch Webinar Series) (Denodo)
This first session in a series of six ‘Packed Lunch’ webinars provides an overview of Data Virtualization technology, its applications and how it is adding business value to organizations around the world.
More information and FREE registrations to this webinar: http://goo.gl/z7mq2S
Landing page for the entire Packed Lunch webinar series: http://goo.gl/NATMHw
Attend & get unique insights into:
What Data Virtualization is and what sets it apart from traditional integration tools
How it both complements and leverages existing enterprise architectures
The Denodo Data Virtualization platform and its capabilities
Big Data Tools: A Deep Dive into Essential Tools (FredReynolds2)
Today, practically every firm uses big data to gain a competitive advantage in the market. With this in mind, freely available big data tools for analysis and processing are a cost-effective and beneficial choice for enterprises. Hadoop is the sector's leading open-source initiative and the driving force of the big data wave. And it is not the final chapter: numerous other projects follow Hadoop's free and open-source path.
This new solution from Capgemini, implemented in partnership with Informatica, Cloudera and Appfluent, optimizes the ratio between the value of data and storage costs, making it easy to take advantage of new big data technologies.
XA Secure | Whitepaper on data security within Hadoop (balajiganesan03)
Enterprises adopting Hadoop and other big data tools need to ensure that the data they are storing and processing is internally protected through strong access control, auditing and governance. This whitepaper speaks to current challenges with Hadoop, the initiatives within the open source community, and how XA Secure can help with its approach.
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group (Scott Mitchell)
This presentation was presented at the July 8th 2014 user group meeting for BI Reporting for Bay Area Start Ups
Content - Creation Infocepts/DWApplications
Presented by: Scott Mitchell - DWApplications
Five Things to Consider About Data Mesh and Data Governance (DATAVERSITY)
Data mesh was among the most discussed and controversial enterprise data management topics of 2021. One of the reasons people struggle with data mesh concepts is we still have a lot of open questions that we are not thinking about:
Are you thinking beyond analytics? Are you thinking about all possible stakeholders? Are you thinking about how to be agile? Are you thinking about standardization and policies? Are you thinking about organizational structures and roles?
Join data.world VP of Product Tim Gasper and Principal Scientist Juan Sequeda for an honest, no-bs discussion about data mesh and its role in data governance.
In the healthcare sector, data security, governance, and quality are crucial for maintaining patient privacy and ensuring the highest standards of care. At Florida Blue, the leading health insurer of Florida serving over five million members, there is a multifaceted network of care providers, business users, sales agents, and other divisions relying on the same datasets to derive critical information for multiple applications across the enterprise. However, maintaining consistent data governance and security for protected health information and other extended data attributes has always been a complex challenge that did not easily accommodate the wide range of needs for Florida Blue’s many business units. Using Apache Ranger, we developed a federated Identity & Access Management (IAM) approach that allows each tenant to have their own IAM mechanism. All user groups and roles are propagated across the federation in order to determine users’ data entitlement and access authorization; this applies to all stages of the system, from the broadest tenant levels down to specific data rows and columns. We also enabled audit attributes to ensure data quality by documenting data sources, reasons for data collection, date and time of data collection, and more. In this discussion, we will outline our implementation approach, review the results, and highlight our “lessons learned.”
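Ranger enforces such entitlements declaratively through policies, not application code, but the effect can be sketched in a few lines of Spark (Java). The example below is purely conceptual: the dataset, column names and group names are hypothetical, and it stands in for what a Ranger row-filter and column-masking policy would apply transparently.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.*;

// Illustrates the *effect* of the row- and column-level entitlements the
// abstract describes; Apache Ranger enforces this via policies, not
// application code, so treat this as a conceptual sketch only.
public class EntitlementDemo {
    static Dataset<Row> applyEntitlements(Dataset<Row> claims, String group) {
        // Row-level filter: a regional business unit sees only its rows.
        Dataset<Row> scoped = "south-florida-unit".equals(group)
                ? claims.filter(col("region").equalTo("south-florida"))
                : claims;
        // Column-level mask: only the privacy office sees member SSNs.
        return "privacy-office".equals(group)
                ? scoped
                : scoped.withColumn("member_ssn", lit("***MASKED***"));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("federated-entitlements").master("local[*]").getOrCreate();
        Dataset<Row> claims = spark.read().option("header", "true")
                .csv("hdfs:///phi/claims.csv");  // hypothetical PHI dataset
        applyEntitlements(claims, "south-florida-unit").show(5);
        spark.stop();
    }
}
```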
The Cloudera Impala project is pioneering the next generation of Hadoop capabilities: the convergence of interactive SQL queries with the capacity, scalability, and flexibility of a Hadoop cluster. In this webinar, join Cloudera and MicroStrategy to learn how Impala works, how it is uniquely architected to provide an interactive SQL experience native to Hadoop, and how you can leverage the power of MicroStrategy 9.3.1 to easily tap into more data and make new discoveries.
Personium - Open Source PDS envisioning the Web of MyData (暁生 下野)
How can we citizens maximize the benefits of the new right to data portability, which is now rapidly being recognized globally?
Personal Data Store is a technology that will receive all "My Data" from hundreds of services. It aggregates and integrates them, and at times discloses a portion of them to others, under the user's control, to create new value.
This talk will introduce an open-source Personal Data Store (PDS) server “Personium”, providing details on its technical implementation, the underpinning business models, and the actual implemented and future use cases.
Solving the Really Big Tech Problems with IoT (Eric Kavanagh)
The Briefing Room with Dr. Robin Bloor and HPE Security
The Internet of Things brings new technological problems: sensor communications are bi-directional, the scale of data generation points has no precedent and, in this new world, security, privacy and data protection need to go out to the edge. Likely, most of that data lands in Hadoop and Big Data platforms. With the need for rapid analytics never greater, companies try to seize opportunities in tighter time windows. Yet, cyber-threats are at an all-time high, targeting the most valuable of assets—the data.
Register for this episode of The Briefing Room to hear Analyst Dr. Robin Bloor explain the implications of today's divergent data forces. He’ll be briefed by Reiner Kappenberger of HPE, who will discuss how a recent innovation -- NiFi -- is revolutionizing the big data ecosystem. He’ll explain how this technology dramatically simplifies data flow design, enabling a new era of business-driven analysis, while also protecting sensitive data.
Gdpr CCPA Why Benchmarks of Billions of rows are as meaningful as compliance ... (Steven Meister)
GDPR/CCPA …, Fortune C-levels: what has been communicated to you is no longer accurate. Data compliance at your volumes is now viable! BigDataRevealed's architecture and methodologies, combined with the latest Spark and Apache releases, have broken the compliance/scalability code. Billions of rows can now be processed for compliance in minutes to hours. Video benchmarks spreadsheet & demo: https://youtu.be/VTZ16LcgLmU
GDPR, CCPA, analytics & big data applications. Beta-test this comprehensive regulatory compliance & analytics accelerator engine, delivering results on laptops, servers & AWS / clouds. Analytics and extensive metadata catalogs assist companies in developing marketing strategies, increasing profits, and understanding their customers and data protection regulations.
Steven Meister GDPR and Regulatory Compliance and Big Data Excelerator Profes... (Steven Meister)
Steven Meister Cover Letter and CV
My expertise is in data regulatory compliance, such as the EU GDPR, California cyber-security law and most every country's data privacy and security regulations, and in accelerating the building of big data frameworks and platforms in Hadoop and AWS S3.
Recent Accomplishments: https://youtu.be/roPC1NSgRGg
https://youtu.be/nwwqZTY_6Gc https://youtu.be/ZcNGXR2eLT0
Privacy Assurance Initiative
Description:
Much has been written about the importance of adopting a consumer data privacy program that can withstand the scrutiny of regulators mindful of enforcing the General Data Protection Regulation (GDPR), enforced in the European Union from 2018. Many have developed solutions that go to great lengths to protect consumer data that has been identified as falling within the guidance of GDPR. But few have devised the means of identifying that data in the first place: the data housed within your four walls, within the cloud solutions you employ, and within the platforms you use to perform functions of your commercial ventures that involve consumer data.
GDPR BigDataRevealed Readiness Requirements and Evaluation (Steven Meister)
This GDPR methodology can evaluate your GDPR readiness. If you feel GDPR-ready, you may still uncover complex issues that are often neglected. If you have waited, you can gain knowledge that provides for a more successful GDPR outcome.
https://youtu.be/uE4Q7u0LatU https://youtu.be/R37S9mIiVAk https://youtu.be/AQf3if7DnuM
Are you prepared for eu gdpr indirect identifiers? what are indirect identifi... (Steven Meister)
What is your solution for GDPR’s Indirect Identifiers? Many aren’t sure what they are and will probably be unsuccessful when attempting to become GDPR compliant. Allow me to explain.
As a software development manager, I must confess that the Discovery & Remediation of Indirect Identifiers was the most complex project I have managed in my 33 years in the industry.
First, let me explain what an Indirect Identifier is. According to the Privacy Technical Assistance Center of the U.S. Department of Education, "Indirect identifiers include information that can be combined with other information to identify specific individuals, including, for example, a combination of gender, birth date, geographic indicator and other descriptors."
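A quick way to see the risk is to count how many people share each combination of such fields; any combination held by exactly one person is effectively an identifier. Below is a minimal Spark (Java) sketch of that check, with hypothetical file and column names.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.*;

// A minimal sketch of why indirect identifiers matter: count how many
// people share each (gender, birth_date, zip) combination. Combinations
// held by exactly one person re-identify that person even though no
// single column is a direct identifier. Column names are hypothetical.
public class IndirectIdentifierCheck {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("indirect-identifier-check").master("local[*]").getOrCreate();

        Dataset<Row> people = spark.read().option("header", "true")
                .csv("hdfs:///staging/members.csv");

        Dataset<Row> risky = people
                .groupBy(col("gender"), col("birth_date"), col("zip"))
                .agg(count(lit(1)).alias("group_size"))
                .filter(col("group_size").equalTo(1));  // unique => re-identifiable

        System.out.println("re-identifiable combinations: " + risky.count());
        spark.stop();
    }
}
```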
I have listed 3 informative YouTube videos on the EU GDPR (Steven Meister)
I have listed three of what I consider very informative yet very different viewpoints on the EU GDPR, each expressed quite differently by its presenters.
For every executive with a big data Hadoop cluster, and their staff, this is a must-see: getting your big data house in order.
Misalignment and clutter waste much of the precious time needed for critical decisions.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... (pchutichetpong)
M Capital Group ("MCG") expects demand to keep evolving alongside supply, as institutional investment rotates out of offices and into work-from-home ("WFH") infrastructure, while the need for data storage keeps expanding with global internet usage, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, as seen in the recent second bankruptcy filing of Sungard, which blames "COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services", the industry has seen key adjustments, and MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues (see the sketch at the end of this section).
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
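As a minimal sketch of the automated data validation idea in point 4 above, the Spark (Java) job below computes a per-column null rate at the source and fails the load when any column exceeds a threshold, so bad data never reaches downstream consumers. The path and threshold are illustrative.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.*;

// A data-quality gate: reject the load when any column's null/empty
// rate exceeds a tolerance, catching errors at the source.
public class DataQualityGate {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("dq-gate").master("local[*]").getOrCreate();
        Dataset<Row> df = spark.read().option("header", "true")
                .csv("hdfs:///landing/orders.csv");

        double threshold = 0.05;  // tolerate at most 5% nulls per column
        long total = df.count();
        for (String c : df.columns()) {
            long nulls = df.filter(col(c).isNull().or(col(c).equalTo(""))).count();
            double rate = total == 0 ? 0.0 : (double) nulls / total;
            if (rate > threshold) {
                throw new IllegalStateException(
                        "DQ gate failed: column " + c + " null rate " + rate);
            }
        }
        System.out.println("DQ gate passed for " + total + " rows");
        spark.stop();
    }
}
```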
EU GDPR technical workflow and productionalization necessary, with Privacy Assurance Calculator, September 2017
1. What you must ensure
Regulators have begun passing and enforcing legislation which protects their citizenry from cyberthreats and mandates a right to be forgotten. GDPR is one of the first and most stringent of these regulations. The EU is set to make examples of companies either unwilling or unable to comply with GDPR; this deck gives a sense of what you must consider doing to comply.
ENABLERS: a repeatable, auditable solution; enable the Right to be Forgotten; the Intelligent Catalog; productionalized discovery; a single platform; a single PII discovery zone; insulate in-flight transactions from cyber-attacks; functionally rich rules maintainable by non-technicians; a Road Map to Compliance; the Privacy Assurance Calculator.
Click on each process and execute it, and you will have accomplished your role in GDPR and in protecting your customers' trusted private information, leaving the hackers with meaningless, highly encrypted information.
2. What must you do to comply with GDPR
• Identify information that uniquely identifies a consumer, either directly or indirectly, residing anywhere in your portfolio of files and systems maintained within your four walls, within cloud environments you use, within your backups or within partner environments.
• Develop a plan to secure personally identifiable information. Document that plan sufficiently well so that regulators can be satisfied. And finally, execute the plan while documenting milestones achieved, again to satisfy regulators.
• Take the necessary steps to protect personally identifiable information from being stolen or altered inappropriately, through a well-devised program.
• Have a facility that is able to locate information for a particular consumer everywhere in your environment, so that a request from any consumer to be forgotten can be fulfilled within 48 hours.
3. Apache™ Hadoop® Staging ODS
Centralize legacy, structured, unstructured and live streaming data into a file system that supports all data types, including binary files such as PDF, Office documents, XML, pictures and more.
Think about the value of mainframe data, AS/400 data, Oracle, SQL Server, DB2, Teradata, PDFs, Office documents, emails and other data being processed by one application using a single platform. Imagine the exponential improvement in BI reporting, demonstration of regulatory preparedness, collaboration among data scientists and the ease of adjusting to new data entering your environment every day.
A new way of addressing data management will be required to keep up with the pace of data growth and complexity, especially in the coming age of data regulation where Personally Identifiable Information (PII) must be protected, even if pieces of PII are scattered across multiple files of various originations.
You may choose Apache™ Hadoop® over a vendor's Hadoop as it is 100% open source and free. BigDataRevealed offers a completely configured VM with Apache Hadoop, and an installation file to run on your current or new Hadoop environment, or even on Amazon or other clouds.
4. Intelligent Catalog / Metadata
Centralize legacy, structured, unstructured and live streaming data into a file system that supports all data types, including binary files such as PDF, Office documents, XML, pictures and more.
BigDataRevealed's primary advantage over other solutions is the creation of our Intelligent Catalog, which provides knowledge of every field in every file and gives the ability to logically join files of varied structures when conducting Indirect Identifier discovery.
We have combined a callable, accurate set of processes that are repeatable and collaborative with a methodology that stores the necessary information in the BDR Intelligent Catalog metadata repository. Our Intelligent Catalog and related processes store the necessary information developed in earlier steps for use in subsequent processes, thereby avoiding the need to repeat heavy algorithm processing.
You need metadata developed from actual pattern recognition algorithms designed to identify PII that includes Direct and Indirect Identifiers, and you need it developed for all your data, no matter what its source or structure may be.
5. Intelligent Catalog / Metadata – Cont.
We store the metadata necessary for you to respond to regulators and consumers.
• The privacy regulations becoming law around the world (GDPR, PrivacyShield, IDP, Cloud-A) call for an ability to prove to regulators that you can administer the right to be forgotten. That means you can identify where a customer's data exists at a moment's notice.
• We create a functionally rich metadata layer as part of the Intelligent Catalog, which we use for what would otherwise be an administrative nightmare.
• We also provide you with the ability to enrich our metadata by augmenting it with metadata from other sources and tools, such as your technical data integration (ETL) tools, your compliance engines, your regulatory reporting engines and any other source you feel aids in your ability to respond to consumers and regulators in meeting the obligation to execute the right to erasure now endowed upon consumers.
7. Quick Discovery
Step One: Run Discovery Stats
• Have a means of identifying Personally Identifiable Information (PII) on file at a moment's notice.
• Quickly searches the most vulnerable Personally Identifiable Information.
• Displays all found results for each column and the percentage that each pattern represents of the total entries.
Share the Intelligent Catalog and Metadata
• View, share and export the results for use in other catalog / metadata repositories.
• Users can add or modify metadata based on the user's authority.
User Collaboration, Sharing and Validating / Updating Metadata
• Adjust columnar business classification naming to reflect the actual contents of the column.
• Data scientists and management (stewards) will benefit from this accurate columnar naming created by actual data patterns.
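A minimal Spark (Java) sketch of the discovery statistic described on this slide, the percentage of a column's entries matching a PII pattern, follows. The patterns and input path are illustrative stand-ins, not BigDataRevealed's actual rule set.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.*;

// For each column, report the share of values matching known PII
// patterns (here: US SSN and email), as a quick-discovery statistic.
public class QuickDiscoveryStats {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("quick-discovery").master("local[*]").getOrCreate();
        Dataset<Row> df = spark.read().option("header", "true")
                .csv("hdfs:///staging/unknown_file.csv");

        String ssn   = "^\\d{3}-\\d{2}-\\d{4}$";
        String email = "^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$";
        for (String c : df.columns()) {
            // avg of a 0/1 flag == fraction of entries matching the pattern
            Row r = df.select(
                    coalesce(avg(col(c).rlike(ssn).cast("int")), lit(0.0)),
                    coalesce(avg(col(c).rlike(email).cast("int")), lit(0.0)))
                    .first();
            System.out.printf("%-20s ssn=%.1f%%  email=%.1f%%%n",
                    c, r.getDouble(0) * 100, r.getDouble(1) * 100);
        }
        spark.stop();
    }
}
```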
8. Quick Discovery: Data Discovery Graphical Viewer with results and drill-down into actual domain data values for additional discovery and validation.
9. Complete Discovery & Remediation
Step One: Run Discovery Stats
• Quickly searches requested Personally Identifiable Information for Direct Identifiers and cross-file Indirect Identifiers.
• Displays results by each column for every found pattern and what percentage that pattern represents.
Share the Intelligent Catalog and Metadata
• View, share and even export the results for use in other catalog / metadata repositories.
• Users can add or modify metadata based on the user's authority.
User Collaboration, Sharing and Validating / Updating Metadata
• Assign the suggested / validated columnar business classification naming to the columnar headings of the files.
• Data scientists and management (stewards) will benefit from this accurate columnar naming created by actual data patterns.
Prepare files and/or file columns for encryption remediation, or full sequestering of the file to an Encrypted Zone
• Be able to meet the demands of a citizen exercising their GDPR right to be forgotten from all points of identification, whether on your premises, on a backup environment or in a partner's environment storing PII data on your behalf.
• Provide a means to protect PII data (that directly or indirectly identifies an individual) from cyber-threats.
• Make the choice to either sequester a complete file or to encrypt one or more columns of data in a file. In the case of Indirect Identifiers and the Right to be Forgotten, you may need to encrypt one or more column / row combinations that are found in multiple files.
If you have any Data Subjects (consumers, employees, suppliers, partners, citizens, patients, etc.) sharing their identity with you from Europe, or while traveling in Europe, you have to take the new GDPR regulations seriously. They require you to be able to identify all the information you have about a Data Subject, whether identified by key data items (national insurance number, taxpayer id, name, address, credit card numbers, etc.) or by indirect information that allows a Data Subject to be identified when multiple fields are grouped together (address, professional affiliations, HIPAA and other medical records, or any other information that allows the identification of an individual).
10. Complete Discovery (screenshot)
11. Complete Discovery (screenshot)
12. Complete Discovery (screenshot)
13. How we approached the "Right to erasure ('right to be forgotten')" and Indirect Identifiers
• We use Hadoop as a staging area for interrogating data for privacy concerns, as Hadoop removes the challenges that hide privacy concerns due to the complexity of data.
• We use our Intelligent Catalog mechanisms to codify rules that define patterns of data that represent privacy concerns, and to schedule the interrogation of files and streams of data.
• We use pattern detection to identify potential privacy concerns, then use a process we call sequester, encrypt and secure, which encrypts the exposed data so that private information is not in harm's way and sequesters it into a highly protected environment.
• We allow the creation of false-positive lists so you do not have to revisit potential issues as new data enters your environment.
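The "sequester, encrypt and secure" step can be pictured with the following Spark (Java) sketch: values matching an SSN pattern are encrypted in place so the cleartext never flows downstream. It is illustrative only; the hard-coded ECB key is unsafe outside a demo, and the path and column names are assumptions, not BigDataRevealed's implementation.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import static org.apache.spark.sql.functions.*;

// Detect-then-encrypt: PII values flagged by a pattern are replaced with
// ciphertext. Production code would use a KMS and an authenticated
// cipher mode, never a hard-coded ECB key as in this demo.
public class EncryptDetectedPii {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("sequester-encrypt-secure").master("local[*]").getOrCreate();

        spark.udf().register("aesEncrypt", (UDF1<String, String>) value -> {
            SecretKeySpec key = new SecretKeySpec(
                    "0123456789abcdef".getBytes(StandardCharsets.UTF_8), "AES");
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);
            return Base64.getEncoder().encodeToString(
                    cipher.doFinal(value.getBytes(StandardCharsets.UTF_8)));
        }, DataTypes.StringType);

        Dataset<Row> df = spark.read().option("header", "true")
                .csv("hdfs:///staging/hr_extract.csv");

        // Encrypt only the values the pattern flags; leave everything else.
        Dataset<Row> remediated = df.withColumn("ssn",
                when(col("ssn").rlike("^\\d{3}-\\d{2}-\\d{4}$"),
                        callUDF("aesEncrypt", col("ssn")))
                .otherwise(col("ssn")));

        remediated.write().mode("overwrite").csv("hdfs:///secure/hr_extract");
        spark.stop();
    }
}
```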
14. Indirect Identifiers: First we must establish joinable columns by Direct Identifiers (patterns in the data and business metadata).
Now we see which files are joinable by the domain (key) value email, and what percentage of each file's column contains email.
15. Indirect Identifiers: Now we select the Indirect Identifier basket of patterns to find across multiple tables, to discover Indirect Identifiers or to assist in the right to erasure, the "Right to be Forgotten".
Now, for each unique domain column value, we can see where it exists across all the files for a specific value, entity or person.
16. Indirect Identifiers: Now we select the key domain for the file column and row we need to encrypt or sequester, so as not to violate GDPR on Indirect Identifiers.
17. Here we can see we encrypted the proper row and column of this file for dmorris6@narod.ru.
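A minimal Spark (Java) sketch of the cross-file step shown in these screenshots follows: two files of different shapes are joined on a shared direct identifier (email) to reveal which indirect identifiers co-occur for one person. File and column names are invented for the example.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.*;

// Cross-file indirect-identifier discovery: join files of varied
// structure on a direct identifier used as a key, then inspect the
// combined fields for one data subject.
public class CrossFileJoin {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("indirect-identifier-join").master("local[*]").getOrCreate();

        Dataset<Row> crm = spark.read().option("header", "true")
                .csv("hdfs:///staging/crm.csv");      // email, name, address
        Dataset<Row> claims = spark.read().option("header", "true")
                .csv("hdfs:///staging/claims.csv");   // email, birth_date, diagnosis

        // Join by the domain (key) value; the combined row now groups
        // enough fields to identify an individual across both files.
        Dataset<Row> joined = crm.join(claims, "email");
        joined.filter(col("email").equalTo("dmorris6@narod.ru")).show();
        spark.stop();
    }
}
```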
Consent – Request for Erasure: graphical interface and APIs for the "Right to erasure ('right to be forgotten')"
GDPR requires companies to maintain verification that a customer has given consent for you to use their personal information, and also requires you to provide a graphical interface for them to later request erasure of that information. Consent and erasure can be unlimited or selective in nature, meaning a customer may wish to receive email coupons but not allow use of their home phone or address.
18. "Right to erasure ('right to be forgotten')"
(1) The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay where one of the following grounds applies:
(a) the personal data are no longer necessary in relation to the purposes for which they were collected or otherwise processed;
(b) the data subject withdraws consent on which the processing is based according to point (a) of Article 6(1), or point (a) of Article 9(2), and where there is no other legal ground for the processing;
(c) the data subject objects to the processing pursuant to Article 21(1) and there are no overriding legitimate grounds for the processing, or the data subject objects to the processing pursuant to Article 21(2);
(d) the personal data have been unlawfully processed;
(e) the personal data have to be erased for compliance with a legal obligation in Union or Member State law to which the controller is subject;
(f) the personal data have been collected in relation to the offer of information society services referred to in Article 8(1).
19. "Right to erasure ('right to be forgotten')" Cont.
(2) Where the controller has made the personal data public and is obliged pursuant to paragraph 1 to erase the personal data, the controller, taking account of available technology and the cost of implementation, shall take reasonable steps, including technical measures, to inform controllers which are processing the personal data that the data subject has requested the erasure by such controllers of any links to, or copy or replication of, those personal data.
(3) Paragraphs 1 and 2 shall not apply to the extent that processing is necessary:
(a) for exercising the right of freedom of expression and information;
(b) for compliance with a legal obligation which requires processing by Union or Member State law to which the controller is subject or for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;
(c) for reasons of public interest in the area of public health in accordance with points (h) and (i) of Article 9(2) as well as Article 9(3);
(d) for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) in so far as the right referred to in paragraph 1 is likely to render impossible or seriously impair the achievement of the objectives of that processing; or
(e) for the establishment, exercise or defence of legal claims.
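Below is a minimal Java sketch of the selective consent and erasure model described before the Article 17 text above: a data subject may grant some uses (email coupons) while refusing others. The record layout and use categories are invented for illustration; the Regulation requires that consent be verifiable and revocable, not any particular data structure.

```java
import java.time.Instant;
import java.util.EnumSet;
import java.util.Set;

// Selective consent: each permitted use is tracked separately, with a
// timestamp as evidence of when consent was captured.
public class ConsentRecord {
    enum Use { EMAIL_COUPONS, HOME_PHONE, POSTAL_MAIL, PROFILING }

    final String subjectId;
    final Set<Use> granted;
    final Instant capturedAt = Instant.now();  // proof of when consent was given

    ConsentRecord(String subjectId, Set<Use> granted) {
        this.subjectId = subjectId;
        this.granted = granted;
    }

    boolean allows(Use use) { return granted.contains(use); }

    // Selective erasure: withdrawing one use does not revoke the others.
    void withdraw(Use use) { granted.remove(use); }

    public static void main(String[] args) {
        ConsentRecord c = new ConsentRecord("subject-42",
                EnumSet.of(Use.EMAIL_COUPONS, Use.POSTAL_MAIL));
        c.withdraw(Use.POSTAL_MAIL);             // request for partial erasure
        System.out.println("coupons ok? " + c.allows(Use.EMAIL_COUPONS));
        System.out.println("mail ok?    " + c.allows(Use.POSTAL_MAIL));
    }
}
```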
20. Calculating Benchmarks
Things to consider when calculating benchmarks and latency:
Benchmarks:
1. Number of files
2. Average number of columns
3. Folder configurations
4. Type of data: structured, unstructured, binary, email, others
5. The various processes to be run:
   1. Quick column classification with the Intelligent Catalog and Metadata
   2. Adding column headers from the Intelligent Catalog
   3. Complete discovery run of all data
   4. Review of results
   5. Deciphering through the graphical interface which findings are false positives versus real risks and violations
   6. Deciding if a complete file should be sequestered, or if specific columns / rows should be encrypted
   7. Search for Indirect Identifiers contained in numerous files of various formats and structures, by logically joining them using other Direct Identifiers as keys:
      1. Review results that identify which files, when joined together, contain enough Indirect Identifiers to identify an individual.
21. Calculating Benchmarks
Things to consider when calculating benchmarks and latency, continued:
      2. Determine which columns or rows need to be encrypted, or which files need to be sequestered, even if the fields reside in files of unlike structure or type.
      3. Use the list of individuals exercising their 'Right to be Forgotten' to further select fields to encrypt or files to sequester.
6. Volume test using a variety of files, file types, file sizes / rows and data types, with server clusters containing various numbers of nodes. Document the time these processes took to run, in a repeatable manner, as close to zero latency as possible.
Our benchmarking for jobs that performed discovery and remediation by encryption on one node, with 50 million rows, has executed in under two minutes for most processes. On a 4-node box, 1 billion rows were processed in just a few hours. Indirect Identifier runs will depend on the number of files and columns and the total permutations needed to get 100% of the results. We found that 10 million rows across 3-5 files on one node still runs in under an hour.
Our product is 100% Spark / Java 8 jobs running as part of the Hadoop framework and ecosystem, so that data never leaves the Hadoop ecosystem. We scale 100% in parallel with the nodes and power of your Hadoop platform.
Also consider running live streaming data and performing discovery and remediation on the fly, so that you are comfortable data will be secured before it is even loaded into HDFS or HBase.
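For a sense of the arithmetic, the sketch below extrapolates the single-process figure quoted above (50 million rows in about two minutes on one node), assuming the near-linear Spark scaling the text claims. It yields roughly 10 minutes per process for 1 billion rows on 4 nodes, which is consistent with a full multi-step discovery-and-remediation run taking a few hours.

```java
// Back-of-envelope extrapolation from the quoted single-process
// benchmark; assumes near-linear scaling, which your own cluster
// should verify before relying on these numbers.
public class BenchmarkEstimate {
    public static void main(String[] args) {
        double rowsPerMinutePerNode = 50_000_000.0 / 2.0;  // 25M rows/min/node
        long targetRows = 1_000_000_000L;                  // 1B-row estate
        int nodes = 4;
        double minutes = targetRows / (rowsPerMinutePerNode * nodes);
        System.out.printf("~%.0f minutes per process for %,d rows on %d nodes%n",
                minutes, targetRows, nodes);
    }
}
```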
22. BigDataRevealed Architecture for GDPR / Legacy, for all Regulatory Compliances, Powered by Apache™ Hadoop®
[Architecture diagram] Sources (HBase, MapReduce, MySQL, databases such as Oracle, DB2 and SQL Server, Teradata, Mainframe / AS400) feed an Apache™ Hadoop® Staging ODS, on premises or in the cloud, via Apache Sqoop or BDR Spark Streaming; the BDR Apache Hadoop engines and the BDR Intelligent Catalog / Metadata then process either the Hadoop data lake or the staged legacy data.
Keep your existing legacy systems functioning without disruption or degradation and begin meeting the demands of GDPR and other regulatory compliances, using Apache Hadoop as your central data / operational file store.
With BigDataRevealed, for existing Hadoop data lakes or staged legacy data, create catalog / metadata for legacy compliance usage:
- Discover Personally Identifiable Information (PII) by searching every column in every row; we don't use a randomizing algorithm that searches only a fraction of your columns looking for undiscovered PII data
- Encrypt PII wherever it is found
- Process streaming data
- Allow data scientists to drill down and view suspected PII data in any column
- Sequester the original file in a Hadoop-managed Encrypted Zone for reference
- Remove files and historical versions where PII was discovered
- Provide workflow management screens for task assignment and completion control
- Provide an Intelligent Catalog / Metadata for collaborative efforts and file / columnar naming
- Discovery for people's right to be forgotten
- Provide an Intelligent Catalog / Metadata for indirect file matching to determine Indirect Identifiers
- Use Hadoop Encrypted Zones for additional sequestering of sensitive data
Use the BDR Intelligent Catalog / Metadata to revert back to your legacy data for remediation for GDPR and other regulatory compliances.
Slay the GDPR Dragon for both Hadoop & legacy systems with BigDataRevealed: the intelligent and quickest path to jump-start your GDPR & regulatory compliance.
23. Productionalizing Methodology
[Workflow diagram] Sources (office documents; databases such as Oracle, DB2, SQL Server, Teradata and flat files; MySQL; Mainframe / AS400; the cloud; live streams via Spark) push new or complete data, at the shortest latency possible, into a centralized audit / assessment repository (Apache Hadoop recommended), where the BigDataRevealed processes run:
• Quick pattern business classifications with the Intelligent Catalog / Metadata: simply run quick classification on folders; view, export and share results.
• Review of quick pattern analysis of metadata: based on results, prepare a plan for remediation; know how many files / columns need processing.
• Setup and run complete pattern PII processes: based on analysis, select compliance algorithms and files; run discovery and remediation and schedule the best latency.
• Remediation, encryption, sequester: encryption of columns / rows, complete columns or files; sequester binary and other files as needed in Encrypted Zones.
• Right of erasure / to be forgotten: select keys to join files to complete Indirect Identifier detection; join all file types, including binary files, by selected keys and process cross-file Indirect Identifier discovery; encrypt file / column / row, or sequester files, to be compliant.
• Setup live streaming processes: complete the above processing for live streams.
• Calculate benchmarks and best latency: calculate total files / columns and rows to be encrypted; calculate baseline latency and timing for production.
24. The roadmap to compliance
BDR is the engine that houses the Intelligent Catalog and the GDPR enablement kit for discovery and protection of PII data as mandated by GDPR. It also publishes a searchable, secured catalog, which allows for compliance within the tight deadlines provisioned to respond to consumers wishing to be forgotten.
We recommend starting with a Privacy Self-Assessment, which we will help analyze and benchmark against organizations similar to yours during an intensive 3-day session. The $5K cost of this session will be waived when engaging to install BigDataRevealed.
Our current list price for the Privacy Assurance solution, initially geared for GDPR, is an annual recurring cost of $20K for the master node and $1K for any additional nodes. We are discounting the investment for early adopters who install and provide us with testimonials demonstrating how BigDataRevealed helped in their journey to privacy assurance.
To encourage compliance and make examples of non-compliant companies, the EU has promised fines as stiff as 4% of your prior year's revenues. We believe a sound GDPR solution armed with a holistic approach to privacy assurance is the only prudent means to demonstrate to the EU that you have your compliance house in order.
We encourage you to start the journey with us by clicking on the Privacy Assurance self-assessment: CLICK HERE (Privacy Assurance Calculator Excel)