Privacera’s Vice President of Product Management Srikanth Venkat and Nauman Fakhar, Director, ISV Solutions at Databricks, survey the current and future data privacy landscape, what it means for enterprises like yours, and what you can do to ensure compliance. The webinar includes an in-depth CCPA compliance demonstration on Azure Databricks with Privacera, based on Apache Ranger.
CCPA Compliance for Analytics and Data Science Use Cases with Databricks and Privacera
1. Powered by Apache Ranger™
WEBINAR
CCPA Compliance for Analytics and Data Science
with Privacera and Databricks
February 19, 2020
2. Today’s Presenters
Srikanth Venkat, Vice President of Product Management, Privacera
Nauman Fakhar, Director of ISV Solutions, Databricks
3. Lost Value: Competing Mandates to Comply and Democratize Data
[Chart: y-axis DATA VOLUME; x-axis 2018 to 2020; labels: Total data, Data that meets compliance, Captured value, Lost value]
4. CCPA: Businesses & Consumers
FOR PROFIT BUSINESS (meets any one of the following)
● Collects, shares, buys, or sells PI data of 50K+ CA consumers
● Annual revenue over $25M
● 50%+ of annual revenue from selling CA consumers' PI
CONSUMER
● Right to receive a copy of the specific personal information collected about them during the 12 months preceding their request
● Right to have their personal information deleted (with exceptions) within a reasonable timeframe (45 days or less)
● Right to know what categories of personal information are collected, the source and use of the information, and to whom it is disclosed
● Right to know a firm’s data sale practices and to request that their PI not be sold to 3rd parties
● Right not to be discriminated against for exercising CCPA consumer rights
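The business thresholds above can be read as a simple "any one of three" test. The sketch below is only an illustration of that reading, not legal guidance: the `BusinessProfile` structure and the function name are hypothetical, and the third threshold is interpreted as the share of annual revenue derived from selling consumers' PI.

```python
from dataclasses import dataclass

# Threshold figures taken from the slide; illustrative only.
REVENUE_THRESHOLD_USD = 25_000_000
CONSUMER_THRESHOLD = 50_000
PI_REVENUE_SHARE_THRESHOLD = 0.50


@dataclass
class BusinessProfile:
    """Hypothetical summary of a business, for illustration."""
    for_profit: bool
    annual_revenue_usd: float
    ca_consumers_with_pi: int             # CA consumers whose PI is handled
    revenue_share_from_selling_pi: float  # fraction between 0.0 and 1.0


def is_ccpa_covered(b: BusinessProfile) -> bool:
    """Return True if a for-profit business meets at least one threshold."""
    if not b.for_profit:
        return False
    return (
        b.annual_revenue_usd > REVENUE_THRESHOLD_USD
        or b.ca_consumers_with_pi >= CONSUMER_THRESHOLD
        or b.revenue_share_from_selling_pi >= PI_REVENUE_SHARE_THRESHOLD
    )


if __name__ == "__main__":
    small_shop = BusinessProfile(True, 1_000_000, 60_000, 0.0)
    print(is_ccpa_covered(small_shop))  # covered via the consumer-count threshold
```

A small business can still be covered through the consumer-count threshold alone, which is why the check uses `or` rather than `and`.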
5. CCPA: Personal Information & Obligations
● CCPA defines personal information broadly
○ “identifies, relates to, describes, is capable of being
associated with, or could reasonably be linked,
directly or indirectly, with a particular consumer or
household”
○ Inferences drawn to create a profile about the
individual to reflect preferences, attitudes, etc.
● Obligation of firms to:
○ Expand and annually update their privacy policy
disclosures
○ Provide requested information to consumers on demand within 45 days
○ Delete personal information upon request
○ Stop selling personal information of consumers upon
request
Personal Information (PI) Examples:
➢ Name
➢ Address
➢ Internet protocol address
➢ Email address
➢ Account name
➢ SSN, driver’s license number, passport numbers
➢ Protected classifications under CA or U.S. law
➢ Commercial activity (personal property, products or
services purchased)
➢ Biometric information
➢ Browsing history
➢ Geolocation data
➢ Audio, electronic, visual, thermal, olfactory or similar
information
➢ Professional or employment information
➢ Education information
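The 45-day response obligation on this slide lends itself to simple deadline tracking. The following is a minimal sketch under assumptions: the `ConsumerRequest` structure and the request kinds are hypothetical names, and the window is counted in calendar days from receipt with no extensions modeled.

```python
from dataclasses import dataclass
from datetime import date, timedelta

RESPONSE_WINDOW_DAYS = 45  # figure from the slide


@dataclass(frozen=True)
class ConsumerRequest:
    """Hypothetical record of a consumer request."""
    consumer_id: str
    kind: str        # e.g. "access", "delete", "opt_out" (assumed labels)
    received: date


def response_due(request: ConsumerRequest) -> date:
    """Last calendar day on which a response is still within the window."""
    return request.received + timedelta(days=RESPONSE_WINDOW_DAYS)


def is_overdue(request: ConsumerRequest, today: date) -> bool:
    """True once the response window has passed."""
    return today > response_due(request)


if __name__ == "__main__":
    req = ConsumerRequest("c-001", "delete", date(2020, 2, 19))
    print(response_due(req))                    # 2020-04-04
    print(is_overdue(req, date(2020, 4, 5)))    # True
```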
6. CCPA: Non-Compliance Consequences
● Up to $750 in damages per
consumer per incident or
actual damages, whichever
is greater
● Civil action with fines by the
Attorney General's Office of up to
$7,500 for each intentional
violation, if the offense is not
remedied within 30 days!
● Fines and penalties
● Litigation costs
● Product modification cost
● Restriction of operations
● Loss of revenue
● Increased insurance
coverage costs
● Loss of Brand image
● Loss of customer trust
● Customer churn
● Loss of employee trust
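To make the per-consumer and per-violation figures concrete, here is a rough upper-bound calculation. It is a sketch only: the function names are hypothetical, the dollar amounts are the ones quoted on this slide, and real exposure depends on facts and legal interpretation that this arithmetic ignores.

```python
STATUTORY_DAMAGES_MAX_USD = 750   # per consumer per incident (slide figure)
CIVIL_FINE_MAX_USD = 7_500        # per intentional violation (slide figure)


def consumer_damages_upper_bound(
    consumers: int,
    incidents: int = 1,
    actual_damages_per_consumer: float = 0.0,
) -> float:
    """Statutory maximum or actual damages per consumer, whichever is greater."""
    per_consumer = max(
        STATUTORY_DAMAGES_MAX_USD * incidents,
        actual_damages_per_consumer,
    )
    return consumers * per_consumer


def civil_fines_upper_bound(intentional_violations: int) -> int:
    """Maximum civil fines for unremedied intentional violations."""
    return intentional_violations * CIVIL_FINE_MAX_USD


if __name__ == "__main__":
    # One incident affecting 10,000 consumers.
    print(consumer_damages_upper_bound(10_000))   # 7500000
    print(civil_fines_upper_bound(100))           # 750000
```

Even a single incident scales linearly with the number of affected consumers, which is why the direct fines are often the smaller part of the total cost listed above.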
7. Unified data analytics platform for accelerating innovation across
data science, data engineering, and business analytics
Original creators of popular data and machine learning open source projects
Global company with 5,000 customers and 450+ partners
8. UNIFIED DATA ANALYTICS PLATFORM
Accelerating data-driven innovation across data science, data engineering, and business analytics
DATA SCIENTISTS, ML ENGINEERS, DATA ANALYSTS, DATA ENGINEERS
DATA SCIENCE WORKSPACE: Collaboration across the lifecycle
BI INTEGRATIONS: Access all your data
UNIFIED DATA SERVICE: High quality data with great performance
ENTERPRISE CLOUD SERVICE: A simple, scalable, and secure managed service
RAW DATA LAKE
9. Delta Lake: Adds Reliability & Performance
Reliability - High Quality Data
• Schema enforcement makes data consistent
• Transactions ensure only completed writes are committed
• Time travel maintains versions of data
Performance - Fast queries at scale
• Compaction to optimize file sizes
• Data skipping reads only the relevant data
• Caching increases read throughput by up to 15x
Compliance - Transactions
• Ability to delete/update specific rows of data from a cloud
native data lake
• Transaction log tracks history of operations on every Delta
table
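The interplay of row-level deletes, a transaction log, and time travel can be illustrated with a toy model. To be clear, this is not the Delta Lake API: `ToyVersionedTable` and its methods are hypothetical, and the model keeps full in-memory snapshots purely to show the concepts.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]


class ToyVersionedTable:
    """Toy model of a versioned table. NOT the Delta Lake API."""

    def __init__(self) -> None:
        self._versions: List[List[Row]] = [[]]  # version 0 is the empty table
        self._log: List[str] = ["CREATE"]

    def _commit(self, rows: List[Row], operation: str) -> None:
        # A new version becomes visible only once the whole write is ready.
        self._versions.append(rows)
        self._log.append(operation)

    def append(self, rows: List[Row]) -> None:
        new_rows = self.read() + [dict(r) for r in rows]
        self._commit(new_rows, f"WRITE {len(rows)} rows")

    def delete(self, predicate: Callable[[Row], bool], description: str) -> None:
        kept = [r for r in self.read() if not predicate(r)]
        self._commit(kept, f"DELETE WHERE {description}")

    def read(self, version: Optional[int] = None) -> List[Row]:
        # "Time travel": read any committed version, latest by default.
        index = len(self._versions) - 1 if version is None else version
        return [dict(r) for r in self._versions[index]]

    def history(self) -> List[Tuple[int, str]]:
        return list(enumerate(self._log))


if __name__ == "__main__":
    table = ToyVersionedTable()
    table.append([
        {"id": 1, "email": "a@example.com"},
        {"id": 2, "email": "b@example.com"},
    ])
    table.delete(lambda r: r["email"] == "a@example.com", "email = 'a@example.com'")
    print(table.read())            # only the row with id 2 remains
    print(len(table.read(1)))      # 2: the earlier version still holds both rows
    print(table.history())
```

In this model the deleted row is still readable from the earlier version, so how long prior versions are retained is itself a consideration when honoring deletion requests.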
10. Privacera: Leaders in Big Data and Cloud
2012: XA Secure founded.
2014: XA Secure acquired by Hortonworks, open sourced as Apache Ranger.
2015: Apache Atlas, data governance project, incubated.
2016: Privacera founded.
2017: Privacera platform Generally Available.
2020: Customers include multiple Fortune 100 companies.
Founded in 2016 by the creators of Apache Ranger and Apache Atlas.
Experienced and accomplished innovators in data and cloud governance.
Partner of Amazon Web Services, Microsoft, and Databricks.
11. Privacera: Centralized Data Access Governance for the Hybrid Cloud
● Centralized data access governance platform.
● Works across heterogeneous on-premises and cloud data services.
● Based on open source Apache Ranger project.
● Breaks data silos and simplifies data access governance.
13. Benefits of Privacera Data Access Governance for Hybrid Cloud
For IT and data teams
✓ Single, centralized environment.
✓ Automated sensitive data discovery and
tagging.
✓ Consistent policy creation and
automated enforcement across services.
✓ Comprehensive monitoring, auditing and
compliance reporting.
14. Benefits of Privacera Data Access Governance for Hybrid Cloud
For data scientists and analysts
✓ Faster, safer access to more data and
data services.
✓ Transparent governance for improved
user experience.
✓ Reduced privacy, security and
compliance risk.
✓ More use cases, better insights, smarter
decisions.
16. Privacera: Data Access Governance Features
DISCOVER (stages: DISCOVER, DEFINE, ENFORCE, REPORT)
○ Diverse Compatibility: Quickly connect to cloud
storage & databases.
○ Scan & Tag Sensitive Data: Use machine learning
and rules to scan and tag sensitive data.
○ Scalable Metadata Storage: Store tags in a truly
scalable metadata store or integrate with 3rd party
data catalogs and associated tags.
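As a rough illustration of the rules-based half of scanning and tagging, the sketch below tags columns whose sampled values match simple patterns. The patterns, tag names, and `tag_columns` function are hypothetical and deliberately simplistic; they do not represent how Privacera's discovery works.

```python
import re
from typing import Dict, List, Set

# Hypothetical, deliberately simple detection rules.
RULES = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def tag_columns(
    rows: List[Dict[str, object]],
    min_match_ratio: float = 0.5,
) -> Dict[str, Set[str]]:
    """Tag a column when enough of its sampled values match a rule."""
    tags: Dict[str, Set[str]] = {}
    columns = {col for row in rows for col in row}
    for col in columns:
        values = [str(r[col]) for r in rows if r.get(col) is not None]
        if not values:
            continue
        for tag, pattern in RULES.items():
            hits = sum(1 for v in values if pattern.search(v))
            if hits / len(values) >= min_match_ratio:
                tags.setdefault(col, set()).add(tag)
    return tags


if __name__ == "__main__":
    sample = [
        {"contact": "ann@example.com", "ssn": "123-45-6789", "note": "hello"},
        {"contact": "bob@example.org", "ssn": "987-65-4321", "note": "n/a"},
    ]
    print(tag_columns(sample))
```

The match-ratio threshold is one way to tolerate a few dirty values in a column without tagging a column because of a single stray match.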
17. Privacera: Data Access Governance Features
DEFINE
○ Centralized Management: Manage access control
policies for all data sources in a central portal.
○ FGAC: Create fine-grained access control policies
down to the file, row, and column level.
○ Robust Policy Definition: Create role-based,
attribute-based, and tag-based access control
policies.
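A combined role-based and tag-based policy can be sketched as a small data structure plus a decision function. Everything here is hypothetical (the `Policy` and `User` types, the action names, and the "most restrictive matching policy wins, default deny" rule); it is not Apache Ranger's or Privacera's policy model.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Assumed ordering: lower number = more restrictive.
PRECEDENCE = {"deny": 0, "mask": 1, "allow": 2}


@dataclass(frozen=True)
class Policy:
    name: str
    roles: FrozenSet[str]          # role-based: who the policy applies to
    resource_tags: FrozenSet[str]  # tag-based: which tagged data it covers
    action: str                    # "allow", "mask", or "deny"


@dataclass(frozen=True)
class User:
    name: str
    roles: FrozenSet[str]


def decide(user: User, column_tags: FrozenSet[str],
           policies: List[Policy], default: str = "deny") -> str:
    """Return the most restrictive action among matching policies."""
    matching = [
        p.action for p in policies
        if p.roles & user.roles and p.resource_tags & column_tags
    ]
    if not matching:
        return default
    return min(matching, key=PRECEDENCE.__getitem__)


if __name__ == "__main__":
    pii = frozenset({"EMAIL", "SSN"})
    policies = [
        Policy("analysts-mask-pii", frozenset({"analyst"}), pii, "mask"),
        Policy("privacy-officers-pii", frozenset({"privacy_officer"}), pii, "allow"),
    ]
    ann = User("ann", frozenset({"analyst"}))
    print(decide(ann, frozenset({"EMAIL"}), policies))  # mask
```

Because the policy targets tags rather than table or column names, newly discovered columns carrying the same tag are covered without writing another policy.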
18. Privacera: Data Access Governance Features
ENFORCE
○ Heterogeneous Compatibility: Configure
enforcement points across on-premises and cloud
data and analytics services.
○ Simple, Immediate Enforcement: Automate
enforcement of access control policies for all users
across all environments.
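At an enforcement point, access decisions typically translate into row filtering and column masking on the data a user actually sees. The sketch below assumes per-column decisions ("allow", "mask", "deny") have already been computed; the `enforce` and `mask_value` helpers are hypothetical and only illustrate the idea.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def mask_value(value: object, keep: int = 4) -> str:
    """Replace all but the last `keep` characters with asterisks."""
    text = str(value)
    if len(text) <= keep:
        return "*" * len(text)
    return "*" * (len(text) - keep) + text[-keep:]


def enforce(
    rows: List[Row],
    column_decisions: Dict[str, str],
    row_filter: Callable[[Row], bool] = lambda row: True,
) -> List[Row]:
    """Apply a row filter, then allow, mask, or drop each column."""
    result: List[Row] = []
    for row in rows:
        if not row_filter(row):
            continue
        visible: Row = {}
        for column, value in row.items():
            decision = column_decisions.get(column, "deny")  # default deny
            if decision == "allow":
                visible[column] = value
            elif decision == "mask":
                visible[column] = mask_value(value)
            # "deny": the column is omitted entirely
        result.append(visible)
    return result


if __name__ == "__main__":
    data = [
        {"state": "CA", "ssn": "123-45-6789", "name": "Ann"},
        {"state": "NY", "ssn": "987-65-4321", "name": "Bob"},
    ]
    decisions = {"state": "allow", "ssn": "mask"}
    print(enforce(data, decisions, lambda r: r["state"] == "CA"))
    # [{'state': 'CA', 'ssn': '*******6789'}]
```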
19. Privacera: Data Access Governance Features
REPORT
○ Instant Visibility: Quickly generate reports to help
teams get instant visibility on data assets.
○ Seamless Compliance: Generate custom reports to
prove compliance to outside regulators.
○ Comprehensive View of Sensitive Data Risks:
Monitor and audit data access behavior and get alerts
when sensitive data is moved.
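A compliance report over access audit records can be as simple as counting allowed and denied accesses to sensitive data per user. This is a minimal sketch with a hypothetical `AccessEvent` record and report shape; real audit schemas will differ.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class AccessEvent:
    """Hypothetical audit record for one data access attempt."""
    user: str
    resource: str
    tags: FrozenSet[str]
    allowed: bool


def sensitive_access_report(
    events: Iterable[AccessEvent],
    sensitive_tags: FrozenSet[str],
) -> Dict[str, Dict[str, int]]:
    """Count allowed and denied accesses to sensitive resources per user."""
    allowed: Counter = Counter()
    denied: Counter = Counter()
    for event in events:
        if not (event.tags & sensitive_tags):
            continue  # not a sensitive resource
        if event.allowed:
            allowed[event.user] += 1
        else:
            denied[event.user] += 1
    return {"allowed": dict(allowed), "denied": dict(denied)}


if __name__ == "__main__":
    log = [
        AccessEvent("ann", "crm.customers.email", frozenset({"EMAIL"}), True),
        AccessEvent("bob", "crm.customers.ssn", frozenset({"SSN"}), False),
        AccessEvent("ann", "sales.orders.total", frozenset(), True),
    ]
    print(sensitive_access_report(log, frozenset({"EMAIL", "SSN"})))
    # {'allowed': {'ann': 1}, 'denied': {'bob': 1}}
```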
20. CCPA: PI Handling & Processing Best Practices
● PI DATA INVENTORY: CCPA compliance starts with knowing what PI you have
=> An accurate, complete, and up-to-date sensitive data inventory is the foundation for compliance!
○ Review areas where any type of PI can reside (e.g. website, forms at retail locations, mail, email, employment
applications, HR documents, call center recordings, agreements and contracts with vendors, service providers,
landlords or tenants, marketing, CCTV, chatbot data etc.)
○ Identify and categorize or classify all PI with sources
○ Identify purposes for collecting PI data and uses
○ Identify retention period for each category of information to honor deletion requests
○ Identify who has been given access to the information, including 3rd parties via contracts and their use of the
information
○ Identify location of PI in data stores, storage format, and the owner or person(s) responsible for maintaining it
● For Managing Consumer Rights
○ Use de-identified PI data where possible to minimize exfiltration and attribution risk
○ Provide methods for record level deletion and updates across data stores in cloud and data center
○ Use masking, encryption (with removal of keys) and redaction on PI data where complete deletion is not
possible due to legal exceptions or other processing requirements
○ Centralize entitlement, access control, and consent management
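The de-identification, masking, and redaction practices above can be sketched with the Python standard library. The helper names below are hypothetical, and a keyed hash is only one possible pseudonymization technique; whether the result counts as de-identified depends on context this sketch does not address.

```python
import hashlib
import hmac
from typing import Dict, Set


def pseudonymize(value: str, key: bytes) -> str:
    """Replace a value with a keyed HMAC-SHA256 token (stable per key)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep the first character and the domain; mask the rest of the local part."""
    local, sep, domain = email.partition("@")
    if not sep:
        return "*" * len(email)
    return local[:1] + "*" * (len(local) - 1) + "@" + domain


def redact(record: Dict[str, object], pi_fields: Set[str]) -> Dict[str, object]:
    """Replace the values of the listed PI fields with a fixed marker."""
    return {k: ("[REDACTED]" if k in pi_fields else v) for k, v in record.items()}


if __name__ == "__main__":
    key = b"example-secret-key"  # assumption: managed in a real secret store
    record = {"email": "ann@example.com", "ssn": "123-45-6789", "plan": "gold"}
    print(mask_email(record["email"]))             # a**@example.com
    print(redact(record, {"ssn"}))
    print(pseudonymize(record["email"], key)[:16])  # token prefix, varies by key
```

Because the token depends on the key, discarding the key means tokens can no longer be recomputed from raw values, which loosely parallels the "removal of keys" idea for encrypted data mentioned above.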
22. CCPA as an Opportunity!
● Automated data security and privacy controls help:
○ reduce risk of manual errors
○ reduce operational complexity
○ improve response times to critical privacy and security incidents
○ avoid costly penalties, positively impacting bottom line
● Integrating a robust privacy program into your business processes
○ Helps build deeper customer engagement and improves business outcomes
○ Improves employee, partner, and customer trust and enhances brand image
and reputation
○ Improves data management practices to enable better and faster insights to
generate top line benefits
23. Questions?
Submit your questions now or email follow-up
questions to info@privacera.com and either
Srikanth or Nauman will follow up with you.
For more information about Privacera and
Databricks, visit www.privacera.com/databricks.