© 2016 MapR Technologies | © 2016 Dataguise, Inc.
Best Practices for Protecting Sensitive Data
Across the Big Data Platform
Mitesh Shah
MapR | Product Manager
Security & Data Governance
Venkat Subramanian
CTO | Dataguise
Business Intelligence Trend for 2016 onwards…
IT-led, System-of-Record
• Limited access
• Glacial speed of response
Pervasive, Business-led, Self-service Analytics
• Near Real-time
• Agile BI & Analytics
• Deeper Insights into Diverse Data
Rita Sallam (Gartner)*
Big Data Paradox
Data is the Biggest Asset
Data is also the Biggest Vulnerability
Secure Business Execution
The ability of an Enterprise to safely and responsibly
leverage the value of all of their data assets to gain new
business insights, maximize competitive advantage,
and drive revenue growth.
MapR and Dataguise…
Enable SECURE BUSINESS EXECUTION
Through
Trusted Platform and Sensitive Data Management
…
Big Data Platform Needs to be Trusted (not just secure)
Can we properly identify users?
Can we authorize access to data?
Can we plug in existing enterprise systems?
Is my data highly available?
Is there a proper paper trail?
Have others done this before?
Is multi-tenancy supported?
Are apps supported across geographies and data centers?
Is my data governed?
TRUSTED
SECURE
Questions to Ask of Your Big Data Vendor.
Verify the Platform is Trusted.
MapR Trust Model
Credibility
VulnMgmt
Detection
Response
Compliance
AA
DPA
Governance
Resilience
Four Pillars
of Security
Auditing
Authorization
Data
Protection
Authentication
What’s the (Big) Difference?
Flexibility
• Multiple execution engines: Hive, Spark, MapReduce, Drill…
Scale
• 1000s of users, groups and applications sharing the same cluster
• 100s of data sources
• PBs of data
Multi-Structured Data
• Multiple data formats: Parquet, JSON, CSV, MapR-DB tables
MapR Trust Model (Product Security)
Granular Authorization
• Access Control Expressions (ACEs)
• Protect files, tables, column families, columns, and management objects
• Extend to role-based access control (RBAC) with custom role functions
• Drill Views
Ubiquitous Data Protection
• Encryption for data in motion
  - Within a cluster
  - Between clusters
  - Between client and cluster
• Encryption for data at rest
  - LUKS
  - Self-encrypting disk
  - Partners
• NSA-level cryptographic algorithms
Robust Auditing
• All events recorded immediately in JSON log files, with minimal performance impact
• Includes data access and administrative actions
• Ad hoc queries and custom reports on audit logs via SQL and standard BI tools
Flexible Authentication
• Ticket-based authentication for all services in the cluster
• Integration with LDAP, Active Directory and other third-party directory services
• Kerberos or username/password authentication
Granular Authorization with MapR
The Problem with POSIX Permissions
POSIX Permissions: -rw-rw---- bruce dev-team (user, group, other)
Scenario 1:
Sally needs to read the file.
Options:
1. Change ownership of file to Sally.
2. Add Sally to dev-team group, even if she’s not a developer.
3. Allow ‘others’ to read the file.
Scenario 2:
Groups ‘QA’ and ‘Support’ need to read the file.
Options:
1. Allow ‘others’ to read the file.
2. Create a supergroup ‘Tech’, and include all members from dev, QA, and Support in that group.
chgrp Tech <filename>
Scenario 3:
All members that belong to both dev_team and managers.
Options:
???
POSIX Permissions Are Limiting
AUTHORIZATION
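A minimal Python sketch of why the scenarios above are awkward: a POSIX mode has exactly one owner, one group, and one "other" class, so every reader has to fit one of those three slots. The helper `posix_can_read` and the sample users are hypothetical; only the standard-library `stat` module is used.

```python
import stat


def posix_can_read(mode, owner, group, user, user_groups):
    """Return True if `user` may read a file under plain POSIX permission bits."""
    if user == owner:
        return bool(mode & stat.S_IRUSR)
    if group in user_groups:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# The file from the slide: -rw-rw---- bruce dev-team
MODE = stat.S_IFREG | 0o660

if __name__ == "__main__":
    print(stat.filemode(MODE))
    # Sally is a manager, not a developer: she falls into "other" and is denied.
    print(posix_can_read(MODE, "bruce", "dev-team", "sally", {"managers"}))
    # Option 2 from Scenario 1: add her to dev-team even though she is not a developer.
    print(posix_can_read(MODE, "bruce", "dev-team", "sally", {"managers", "dev-team"}))
```

There is no mode that expresses "members of both dev_team and managers", which is the gap Scenario 3 points at.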
POSIX ACLs vs ACEs
Access Control Lists
MapR Access Control Expressions
r : user:sally | (group:dev_team & group:managers)
AUTHORIZATION
Which one is easier to set and understand?
Which one allows for higher granularity?
MapR Has ACEs for Files and MapR-DB Records
Example: user:mary | (group:admins & group:VP) & user:!bob
Permissions on files, tables, column families, columns, JSON documents and sub-documents
AUTHORIZATION
Use Access Control Expressions (ACEs) to set granular permissions.
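To make the boolean semantics concrete, here is a small Python sketch that evaluates an expression written the way the slide writes it. It is not MapR's implementation: the grammar, the `user:!name` negation form, and the precedence (`!` binds tighter than `&`, which binds tighter than `|`) are assumptions made for illustration, and `tokenize` and `evaluate` are hypothetical names.

```python
import re

# One token: a parenthesis, an operator, or a user:/group: term (optionally negated).
TOKEN = re.compile(r"\s*(\(|\)|&|\||!|(?:user|group):!?[\w.-]+)")


def tokenize(expr):
    """Split an ACE string into parentheses, operators, and user:/group: terms."""
    tokens, pos = [], 0
    expr = expr.strip()
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}: {expr[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(expr, user, groups):
    """Return True if `user` (a member of `groups`) satisfies the expression."""
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "!":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        token = take()
        if token == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        kind, _, name = token.partition(":")
        negated = name.startswith("!")
        name = name.lstrip("!")
        if kind == "user":
            matched = name == user
        elif kind == "group":
            matched = name in groups
        else:
            raise ValueError(f"unexpected token {token!r}")
        return matched != negated

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


if __name__ == "__main__":
    ace = "user:mary | (group:admins & group:VP) & user:!bob"
    print(evaluate(ace, "mary", set()))
    print(evaluate(ace, "bob", {"admins", "VP"}))
```

Under the assumed precedence, mary is always allowed, while bob is denied even when he belongs to both groups.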
File ACEs – Key Features
Intuitive
Inheritance: Subdirectories and files inherit perms from parent directory
Whole-Volume ACEs: Volume-level filter – useful in multitenant environments.
Roles: Arbitrary grouping of users according to your business needs
High Performance: No performance hit
Boolean Operators: Allowing for ultra fine-grain permissions
AUTHORIZATION
File ACEs: Whole Volume ACE Example
Whole-Volume ACE: r: group:finance
File: /finance/final_report.csv
r: user:bob
Jane (Finance) grants read access to Bob (Developer).
Bob cannot read the file /finance/final_report.csv because the whole-volume ACE is set to allow read-access to finance only.
AUTHORIZATION
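The whole-volume filter can be pictured as an extra condition that must hold in addition to the file's own ACE. The Python sketch below models each ACE as a plain predicate; the function names are hypothetical and this only illustrates the "both must pass" behavior described on the slide, not MapR's implementation.

```python
def finance_only(user, groups):
    """Whole-volume ACE from the slide: r: group:finance."""
    return "finance" in groups


def bob_only(user, groups):
    """File ACE from the slide: r: user:bob."""
    return user == "bob"


def can_read(user, groups, volume_ace, file_ace):
    """A read succeeds only if the whole-volume ACE and the file ACE both allow it."""
    return volume_ace(user, groups) and file_ace(user, groups)


if __name__ == "__main__":
    # Bob is a developer, so the volume-level filter blocks him despite the file ACE.
    print(can_read("bob", {"developers"}, finance_only, bob_only))
```

In a multitenant cluster this means a tenant's users cannot be granted access to another tenant's volume by accident through a file-level grant.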
ACEs for Streams Too
AUTHORIZATION
Robust Auditing with MapR
MapR Audits
• Who touched customer records outside of
business hours?
• What actions did users take in the days
before leaving the company?
• What operations were performed without
following change control?
• Are users accessing sensitive files from
protected/secured source IPs?
• Why do my reports look different, despite
sourcing from same underlying data?
Monitoring Incident
Response
Security
AUDITING
Serving Security Analysts
MapR Audits – Key Features
Data Access
• Files
• MapR-DB Tables
Cluster Operations
• Administrative Operations
• Maprcli commands
Authentication Requests
Secure High Performance
Flexible
• Retention Period
• Maxsize
• Coalesce Interval
• Selective Auditing
JSON Format
{"timestamp":"{$date=2015-06-01T05:24:58.231Z}","operation":"GETATTR",
"user":"root","uid":"0","ipAddress":"10.10.x.x","nfsServer":"10.10.x.x",
"srcPath":"/dbtest.0/","srcFid":"2147.16.2",
"VolumeName":"mktg_files","volumeId":"mktg_files","status":"0"}
AUDITING
Querying Audit Logs with SQL
Example: detect suspicious, failed commands
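The slide runs this kind of check in SQL over the audit logs; the query itself is not in the transcript. As a rough Python equivalent, the sketch below counts failed operations per user from JSON audit lines. It assumes one JSON record per line with the `user`, `operation`, and `status` fields shown earlier, and it assumes a `status` of "0" means success and anything else means failure. The function name and sample records are hypothetical.

```python
import json
from collections import Counter


def failed_operations_by_user(lines):
    """Count (user, operation) pairs whose audit record has a non-zero status."""
    failures = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if str(record.get("status", "0")) != "0":
            failures[(record.get("user"), record.get("operation"))] += 1
    return failures


if __name__ == "__main__":
    sample = [
        '{"user": "root", "operation": "GETATTR", "status": "0"}',
        '{"user": "eve", "operation": "DELETE", "status": "13"}',
        '{"user": "eve", "operation": "DELETE", "status": "13"}',
    ]
    for (user, operation), count in failed_operations_by_user(sample).most_common():
        print(user, operation, count)
```

Sorting by count puts users with repeated failures first, which is the pattern worth a closer look.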
Auditing After-Hours Access
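A sketch of the after-hours check in Python, under stated assumptions: audit records carry a `timestamp` containing an ISO-8601 time and a `srcPath`, business hours are 09:00 to 18:00 in the timestamp's own time zone, and the helper names and sample paths are hypothetical. The timestamp is pulled out with a regular expression so that either a plain string or a nested `$date` value is accepted.

```python
import re
from datetime import datetime, time

ISO_SECONDS = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def event_time(record):
    """Extract the event time (to the second) from an audit record's timestamp."""
    match = ISO_SECONDS.search(str(record["timestamp"]))
    if match is None:
        raise ValueError(f"no ISO timestamp in {record['timestamp']!r}")
    return datetime.strptime(match.group(0), "%Y-%m-%dT%H:%M:%S")


def is_after_hours(record, start=time(9, 0), end=time(18, 0)):
    """True if the event happened outside the [start, end) business window."""
    return not (start <= event_time(record).time() < end)


def after_hours_access(records, path_prefix):
    """Return records that touched paths under `path_prefix` outside business hours."""
    return [
        r for r in records
        if r.get("srcPath", "").startswith(path_prefix) and is_after_hours(r)
    ]


if __name__ == "__main__":
    records = [
        {"timestamp": "{$date=2015-06-01T05:24:58.231Z}", "user": "root",
         "srcPath": "/customers/records.csv"},
        {"timestamp": "{$date=2015-06-01T10:00:00.000Z}", "user": "jane",
         "srcPath": "/customers/records.csv"},
    ]
    for record in after_hours_access(records, "/customers/"):
        print(record["user"], record["srcPath"], event_time(record))
```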
Data Protection with MapR
Encryption at Rest (Today)
Many Options for Block-Level, Disk-Level, and Field-Level Encryption
Sensitive Data: SSN, Credit Card #, Health Records, Name + Age + Address
Volume
Self-Encrypting Disk
Use Partners for Masking, Tokenization, Format Preserving Encryption
DATA PROTECTION
Sensitive Data Management with Dataguise
Cost of a Data Breach
“Hackers and criminal insiders cause the most data
breaches…malicious attacks can take an average of
256 days to identify…The most costly breaches
continue to occur in the US and Germany at $217 and
$211 per compromised record…If a healthcare
organization has a breach, the average cost could be
as high as $363.”
Time and Financial Impact on Organizations
Ponemon Institute’s 2015
Secure Environment
Perimeter Security
• Physical security, Firewalls, IDS/IPS…
Volume/File-level Encryption
• Control over data access
• Meeting regulatory compliance…
Aren’t these enough?
YOU NEED BOTH…AND MORE
PHI: Guidance for Data De-Identification
Sensitive/Privacy Data
• Name
• Address
• Dates – Birth, Death...
• Telephone Numbers
• Device Identifiers and Serial Numbers
• Email Addresses
• SSN
• Medical Record Numbers
• Account Numbers
….
….
What Should We Do?
At a Granular (cell) Level:
• Precisely locate sensitive content across ALL repositories
• Protect those assets appropriately – masking, encryption
• Provide “controlled” access to data
• Enable employees, trusted partners to make data-driven decisions
RISKS
BREACH
SECURITY
COMPLIANCE
VALUE
REVENUE
DATA DRIVEN DECISIONS
BUSINESS INTELLIGENCE
DgSecure
DETECT: Where sensitive content is present in structured, unstructured & semi-structured data
AUDIT: Who has access to which sensitive data & identify misalignments and risk factors
PROTECT: Sensitive data at the element level – encrypt/decrypt with RBAC, mask or redact
MONITOR: Based on alert policies, track sensitive data access through a 360° dashboard
Across Hadoop, RDBMS,
Files, NoSQL DB
On Premise, in the
Cloud, or Hybrid
How do we do that in DgSecure?
Complex Sensitive Data Detection
SENSITIVE DATA DISCOVERY FOR COMPLEX ENVIRONMENTS
Patterns in “Strings”
• Digit Patterns: 4451 3340 0023 1200 8/16 B7127157 Expires 04-19-15
Patterns in “Grammar”
• August Thomson vs 1240 August Ave vs 12 August 1994
Patterns in Context (Dependent)
• Other data elements in horizontal or vertical vicinity: ‘94538’ near address elements
Patterns in Combination (Composite)
• CCN & Name; CCN, Name & Expiry; not just CCN
Patterns in Knowledge
• Ontologies: HL7 Encoding, Financial Market Data
DISCOVERY FOR:
Data at Rest
• Hadoop (HDFS)
• DBMS
• Teradata
• Files
• SharePoint
Data in Motion
• Flume (into HDFS)
• FTP (into HDFS or between file
systems)
• Sqoop (into HDFS)
• Kafka (Q3 2016)
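As a rough illustration of "patterns in strings", the Python sketch below finds 16-digit candidates in free text and keeps only those that pass the Luhn checksum, a common way to cut false positives for card numbers. This is a generic technique, not DgSecure's detection engine; the function names and the sample text are hypothetical, and real discovery would also use the context, composite, and ontology signals listed above.

```python
import re

# Sixteen digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){15}\d\b")


def luhn_valid(digits):
    """Return True if a string of decimal digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text):
    """Return normalized 16-digit numbers in `text` that pass the Luhn check."""
    found = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group(0))
        if luhn_valid(digits):
            found.append(digits)
    return found


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test number, not a real account.
    note = "Customer paid with 4111 1111 1111 1111, expires 08/16."
    print(find_card_numbers(note))
```

A checksum alone cannot tell a card number from any other 16-digit value that happens to pass, which is why the slide pairs string patterns with context and combination rules.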
Sensitive Data Protection
Masking
• Obfuscation, one-way operation
• Multiple options in DgSecure – fictitious but realistic values, X’ing out part of the content…
• Consistent masking to retain statistical distribution of data
Encryption
• Encrypted cell/row
• Accessible by authorized users only – Hive, bulk, via App
• Granular protection
Redaction
• X’ing out entire sensitive data cell
• Nullifying
Masking & Encryption in Hadoop
Data Masking multiple Options - Examples
Masking Option Applied | Original Value | Masked Value
Telephone – Random (realistic, fictitious) | (508) 850-0058 | (325) 418-0131
Telephone – Character (hide digits) | (508) 850-0058 | XXX-XXX-0058
Telephone – Intellimask (replace first 3 digits) | (508) 850-0058 | (451) 850-0058
Telephone – FPM, Format Preserving (replace chars & digits with same type) | (508) 850-0058 / 508 850 0058 / 508-850-0058 | (729) 432-9647 / 729 432 9647 / 729-432-9647
Telephone – Static Masking (replace all with (111) 222-3333) | (508) 850-0058 | (111) 222-3333
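Three of the options in the table can be sketched in a few lines of Python. These are illustrative stand-ins, not DgSecure's algorithms: the function names are hypothetical, and the format-preserving variant is a keyed one-way masking (HMAC-seeded pseudo-random substitution), not reversible format-preserving encryption. Seeding on the alphanumeric characters only makes the masking consistent, so the same number written in different formats maps to the same masked digits.

```python
import hashlib
import hmac
import random
import re
import string


def mask_characters(phone):
    """Character masking: hide everything except the last four digits."""
    digits = re.sub(r"\D", "", phone)
    return f"XXX-XXX-{digits[-4:]}"


def mask_static(_phone):
    """Static masking: every value becomes the same placeholder."""
    return "(111) 222-3333"


def mask_format_preserving(value, key):
    """Replace digits with digits and letters with letters, keeping punctuation.

    The replacement sequence is derived from HMAC-SHA256(key, alphanumerics),
    so the same input and key always give the same masked output.
    """
    normalized = "".join(ch for ch in value if ch.isalnum())
    digest = hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    masked = []
    for ch in value:
        if ch.isdigit():
            masked.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            masked.append(rng.choice(pool))
        else:
            masked.append(ch)
    return "".join(masked)


if __name__ == "__main__":
    key = b"demo-key-not-for-production"  # hypothetical key for the example
    for original in ("(508) 850-0058", "508 850 0058", "508-850-0058"):
        print(original, "->", mask_format_preserving(original, key))
    print(mask_characters("(508) 850-0058"))
    print(mask_static("(508) 850-0058"))
```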
Unstructured Data – Any Sensitive Elements?
RAW DATA
Masking Data in Hadoop (Cell Level)
RAW DATA
Masking Data in Hadoop (Cell Level)
MASKED DATA
Encrypting Data in Hadoop (Cell Level)
MASKED DATA
ENCRYPTED DATA
Masking Data in Hadoop (Cell Level)
Decryption through Hive Queries
User WITHOUT access privileges for Names and SSN
Decryption through Hive Queries
User WITH access privileges for Names and SSN
BI Use Cases and Sensitive Elements
Brand Sentiment
Log Analysis
Customer Retention
Clinical Trial Analysis
Payments Risk Mgmt.
Trading System Perf.
Risk Modeling
Supply Chain Optimization
Smart Metering
Insurance Premiums
Process Efficiency
Person of Interest Discovery
Dynamic Pricing
IT Security Intelligence
Real-time Upsell
Monitoring Sensors
Analytic
Transactional
Name
Address
Email Address
Customer Lifetime Value
IP Address
URL
Medical Record Number
Social Security Number
Telephone Number
Date of Birth (DOB)
IP Addresses
Credit Card Number
Credit Limit
Purchase Amount
VIN
Device ID
Protection Policy: Encryption, Masking
Brand Sentiment
Log Analysis
Customer Retention
Clinical Trial Analysis
Payments Risk Mgmt.
Trading System Perf.
Risk Modeling
Supply Chain Optimization
Smart Metering
Insurance Premiums
Process Efficiency
Person of Interest Discovery
Dynamic Pricing
IT Security Intelligence
Real-time Upsell
Monitoring Sensors
Analytic
Transactional
Name
Address
Email Address
Customer Lifetime Value
IP Address
URL
Medical Record Number
Social Security Number
Telephone Number
Date of Birth (DOB)
Medical Test Results
Credit Card Number
Credit Limit
Purchase Amount
VIN
Device ID
Transaction Date
Mask
Encrypt
DgSecure Solution Workflow
DgSecure for Hadoop: Policy
DETECT AUDIT PROTECT REPORT
• Policy
• Per Data Feed?
• Protection Options
• Custom Elements
• Singleton
• Composite
• Dependent
• Domain Definition
• Key Management
DgSecure for Hadoop: Detection
In-Flight
Within HDFS
Full vs. Incremental
Structured, Semi,
Unstructured
Quick Scan
Element Count
DgSecure for Hadoop: Access Audit
Files/Directories
- Sensitive Elements
- Protected?
- Who has access?
Users
- What can they
access?
DgSecure for Hadoop: Protection
Domain Based
Masking
Redaction
Encryption
- Field or Record
- AES or FPE
DgSecure for Hadoop: Reports
Job Level
- Sensitive elements
- Directories & Files
- Remediation applied
Dashboard
- Directory or by policy
- Drill-down
Audit report
- User actions
Notifications
DgSecure Monitor
Precisely Focused on Monitoring Sensitive Data
• Where is the sensitive content, and how much of it (density)
• How is it protected
• What data is accessed
• Who is accessing it
Across All Enterprise Repositories
• Hadoop and Cassandra
• Cloud support (AWS S3 and Azure Blob)
Continuous, Near-real-time Anomalous Behavior Detection
• Using machine learning to build user profiles
• Complex event processing to detect breaches
“Out of the Box” Templates
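The slides do not say which models DgSecure Monitor uses. As a deliberately simple stand-in for a "user profile", the Python sketch below keeps a history of a user's daily sensitive-data access counts and flags a day that deviates from the mean by more than a chosen number of standard deviations. The function name, threshold, and sample counts are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, today, threshold=3.0):
    """Flag `today` if it lies more than `threshold` standard deviations from the mean.

    `history` is a list of past daily access counts for one user. With fewer
    than two data points there is no baseline, so nothing is flagged.
    """
    if len(history) < 2:
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return today != baseline
    return abs(today - baseline) / spread > threshold


if __name__ == "__main__":
    usual_week = [10, 12, 11, 9, 10]  # hypothetical daily counts for one user
    print(is_anomalous(usual_week, 11))
    print(is_anomalous(usual_week, 50))
```

A production system would profile more than one signal (time of day, repository, element types accessed), but the shape is the same: learn a baseline per user, then score new events against it.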
DgSecure Monitor
NoSQL
ON PREMISE
Sensitive Info
RDBMS
Hadoop
DgSECURE
CLOUD
DATASTORES
S3
RDBMS
Blob Storage
Hadoop
DgSecure
Repository
Monitoring
Metadata
Monitoring Metadata Manager
Detection
Data Access Information
Monitoring Engine
Secure Business Workflow
Enterprise Data Marketplace Use Case
Data Marketplace End-to-End Workflow
Multiple Data Feeds with their own Policies
Data Asset Marketplace: Data Assets (Indexed)
Access Granted upon Request per policy & compliance
1 SOURCES | 2 LANDING ZONE | 3 DATA PROCESS
4 COLLECT METADATA | 5 ANNOTATE & PUBLISH | 6 BROWSE & REQUEST | 7 APPROVE | 8 ANALYZE & REPORT
Data Admin | Data Scientist | Data Admin | Data Scientist
Set policy per feed
Data Lake
Data Feed 1
Data Feed 2
Data Feed 3
Data Feed 4
Set Access Control
Metadata Repository
Q1 Region 1 Data Set 1
Q2 Region 2 Data Set 2
Q3 Region 3 Data Set 3
Q4 Region 4 Data Set 4
Access Given
Access Denied
WORKFLOW
SECURE BUSINESS EXECUTION
CISO/CPO: Set policy per data feed type
Data Asset Owner: Provenance metadata
Run Discovery to detect sensitive data
Metadata to repository
Mask/Encrypt to protect sensitive data
Metadata incl. lineage to repository
IT/Set Process: Use Metadata to set access control
Data Asset owner adds annotations & adds to Data Asset Index
Data Scientist browses available data sets and makes access request
Data owner approves request
Sets access control in Ranger
Data Scientist runs data mining/BI/Analytics
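The browse, request, and approve steps of the workflow can be sketched as a small decision function over the metadata collected earlier. Everything below is hypothetical: the `DataAsset` and `AccessRequest` types, the clearance idea, and the approval rule itself are assumptions chosen to show how discovery metadata could drive an "Access Given" or "Access Denied" outcome. The slides leave the actual policy to the data owner and to the access-control tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    name: str
    sensitive_elements: frozenset  # element types found by discovery, e.g. {"SSN"}
    protected: bool                # True once masking/encryption has been applied


@dataclass(frozen=True)
class AccessRequest:
    user: str
    asset: DataAsset
    cleared_for: frozenset         # element types this user may see in the clear


def decide(request):
    """Hypothetical rule: grant if nothing sensitive is exposed to an uncleared user."""
    asset = request.asset
    if not asset.sensitive_elements or asset.protected:
        return "Access Given"
    if asset.sensitive_elements <= request.cleared_for:
        return "Access Given"
    return "Access Denied"


if __name__ == "__main__":
    raw = DataAsset("Q1 Region 1 Data Set 1", frozenset({"SSN", "Name"}), protected=False)
    masked = DataAsset("Q2 Region 2 Data Set 2", frozenset({"SSN", "Name"}), protected=True)
    analyst = frozenset({"Name"})
    print(decide(AccessRequest("data_scientist", raw, analyst)))
    print(decide(AccessRequest("data_scientist", masked, analyst)))
```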
Other Data Sources
MapR + Dataguise: Comprehensive Data Security
Active
Directory
Disk
Auditing
Incident
Response
Authentication
Authorization
Data Protection
Data Protection
Compliance
Vulnerability Management
Q&A

More Related Content

What's hot

3 guiding priciples to improve data security
3 guiding priciples to improve data security3 guiding priciples to improve data security
3 guiding priciples to improve data security
Keith Braswell
 
Open Source Innovations in the MapR Ecosystem Pack 2.0
Open Source Innovations in the MapR Ecosystem Pack 2.0Open Source Innovations in the MapR Ecosystem Pack 2.0
Open Source Innovations in the MapR Ecosystem Pack 2.0
MapR Technologies
 
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR Technologies
 
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications
MapR Technologies
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR Technologies
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
Carol McDonald
 
Insight Platforms Accelerate Digital Transformation
Insight Platforms Accelerate Digital TransformationInsight Platforms Accelerate Digital Transformation
Insight Platforms Accelerate Digital Transformation
MapR Technologies
 
Build a Time Series Application with Apache Spark and Apache HBase
Build a Time Series Application with Apache Spark and Apache  HBaseBuild a Time Series Application with Apache Spark and Apache  HBase
Build a Time Series Application with Apache Spark and Apache HBase
Carol McDonald
 
Evolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLEvolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQL
MapR Technologies
 
MapR-DB – The First In-Hadoop Document Database
MapR-DB – The First In-Hadoop Document DatabaseMapR-DB – The First In-Hadoop Document Database
MapR-DB – The First In-Hadoop Document Database
MapR Technologies
 
Azure Cafe Marketplace with Hortonworks March 31 2016
Azure Cafe Marketplace with Hortonworks March 31 2016Azure Cafe Marketplace with Hortonworks March 31 2016
Azure Cafe Marketplace with Hortonworks March 31 2016
Joan Novino
 
Real Time and Big Data – It’s About Time
Real Time and Big Data – It’s About TimeReal Time and Big Data – It’s About Time
Real Time and Big Data – It’s About Time
MapR Technologies
 
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA DeploymentsCisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
MapR Technologies
 
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Codemotion
 
A New "Sparkitecture" for modernizing your data warehouse
A New "Sparkitecture" for modernizing your data warehouseA New "Sparkitecture" for modernizing your data warehouse
A New "Sparkitecture" for modernizing your data warehouse
DataWorks Summit/Hadoop Summit
 
20140228 - Singapore - BDAS - Ensuring Hadoop Production Success
20140228 - Singapore - BDAS - Ensuring Hadoop Production Success20140228 - Singapore - BDAS - Ensuring Hadoop Production Success
20140228 - Singapore - BDAS - Ensuring Hadoop Production SuccessAllen Day, PhD
 
Dchug m7-30 apr2013
Dchug m7-30 apr2013Dchug m7-30 apr2013
Dchug m7-30 apr2013
jdfiori
 
Hadoop from Hive with Stinger to Tez
Hadoop from Hive with Stinger to TezHadoop from Hive with Stinger to Tez
Hadoop from Hive with Stinger to Tez
Jan Pieter Posthuma
 
Format Wars: from VHS and Beta to Avro and Parquet
Format Wars: from VHS and Beta to Avro and ParquetFormat Wars: from VHS and Beta to Avro and Parquet
Format Wars: from VHS and Beta to Avro and Parquet
DataWorks Summit
 

What's hot (20)

3 guiding priciples to improve data security
3 guiding priciples to improve data security3 guiding priciples to improve data security
3 guiding priciples to improve data security
 
Open Source Innovations in the MapR Ecosystem Pack 2.0
Open Source Innovations in the MapR Ecosystem Pack 2.0Open Source Innovations in the MapR Ecosystem Pack 2.0
Open Source Innovations in the MapR Ecosystem Pack 2.0
 
MapR 5.2: Getting More Value from the MapR Converged Data Platform
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
 

Best Practices for Protecting Sensitive Data Across the Big Data Platform

  • 1. © 2016 MapR Technologies | © 2016 Dataguise, Inc. Best Practices for Protecting Sensitive Data Across the Big Data Platform. Mitesh Shah, Product Manager, Security & Data Governance, MapR. Venkat Subramanian, CTO, Dataguise.
  • 2. Business Intelligence Trend for 2016 Onwards. IT-led, System-of-Record: limited access; glacial speed of response. Pervasive, Business-led, Self-service Analytics: near real-time; agile BI & analytics; deeper insights into diverse data. Rita Sallam (Gartner)*
  • 3. Big Data Paradox: Data is the Biggest Asset. Data is also the Biggest Vulnerability.
  • 4. Secure Business Execution: The ability of an enterprise to safely and responsibly leverage the value of all of its data assets to gain new business insights, maximize competitive advantage, and drive revenue growth.
  • 5. MapR and Dataguise enable SECURE BUSINESS EXECUTION through a Trusted Platform and Sensitive Data Management.
  • 6. Big Data Platform Needs to be Trusted (not just secure). Questions to ask of your big data vendor: Can we properly identify users? Can we authorize access to data? Can we plug in existing enterprise systems? Is my data highly available? Is there a proper paper trail? Have others done this before? Is multi-tenancy supported? Are apps supported across geographies and data centers? Is my data governed? Verify the platform is trusted.
  • 7. MapR Trust Model: Credibility, Vulnerability Management, Detection, Response, Compliance, Governance, Resilience. Four Pillars of Security: Authentication, Authorization, Auditing, Data Protection.
  • 8. What’s the (Big) Difference? Flexibility: multiple execution engines (Hive, Spark, MapReduce, Drill…). Scale: 1000s of users, groups and applications sharing the same cluster; 100s of data sources; PBs of data. Multi-Structured Data: multiple data formats (Parquet, JSON, CSV, MapR-DB tables).
  • 9. A MapR Trust Model (Product Security). Granular Authorization: Access Control Expressions (ACEs); protect files, tables, column families, columns, and management objects; extend to role-based access control (RBAC) with custom role functions; Drill views. Ubiquitous Data Protection: encryption for data in motion (within a cluster, between clusters, between client and cluster); encryption for data at rest (LUKS, self-encrypting disk, partners); NSA-level cryptographic algorithms. Robust Auditing: all events recorded immediately in JSON log files, with minimal performance impact; includes data access and administrative actions; ad hoc queries and custom reports on audit logs via SQL and standard BI tools. Flexible Authentication: ticket-based authentication for all services in the cluster; integration with LDAP, Active Directory and other third-party directory services; Kerberos or username/password authentication.
  • 10. Granular Authorization with MapR
  • 11. The Problem with POSIX Permissions. File: -rw-rw---- bruce dev-team (user, group, other). Scenario 1: Sally needs to read the file. Options: 1. Change ownership of the file to Sally. 2. Add Sally to the dev-team group, even if she’s not a developer. 3. Allow ‘others’ to read the file. Scenario 2: Groups ‘QA’ and ‘Support’ need to read the file. Options: 1. Allow ‘others’ to read the file. 2. Create a supergroup ‘Tech’, include all members from dev, QA, and Support in that group, and run chgrp Tech <filename>. Scenario 3: All members that belong to both dev_team and managers. Options: ??? POSIX Permissions Are Limiting.
  • 12. POSIX ACLs vs ACEs. Access Control Lists compared with MapR Access Control Expressions, for example r : user:sally | (group:dev_team & group:managers). Which one is easier to set and understand? Which one allows for higher granularity?
  • 13. MapR Has ACEs for Files and MapR-DB Records. Use Access Control Expressions (ACEs) to set granular permissions on files, tables, column families, columns, JSON documents and sub-documents. Example: user:mary | (group:admins & group:VP) & user:!bob
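The ACE examples on slides 12 and 13 combine `user:` and `group:` terms with `|`, `&`, `!` and parentheses. The following is a minimal Python sketch of how such an expression could be evaluated for one principal. It is not MapR's implementation: the grammar, the operator precedence (`!` binds tighter than `&`, which binds tighter than `|`), and the names `Principal` and `evaluate_ace` are assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Principal:
    """Hypothetical description of the caller: a user name plus group names."""
    user: str
    groups: set = field(default_factory=set)


# One token is an operator, a parenthesis, or a term such as user:!bob.
TOKEN = re.compile(r"\s*([|&()!]|(?:user|group):!?[\w.-]+)")


def tokenize(expr):
    tokens, pos = [], 0
    expr = expr.strip()
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected text at offset {pos}: {expr[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate_ace(expr, who):
    """Return True if `who` satisfies `expr` (assumed precedence: ! > & > |)."""
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()  # always parse the right side, then combine
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "!":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        tok = take() if peek() is not None else None
        if tok == "(":
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing ')'")
            take()
            return value
        if tok is None or ":" not in tok:
            raise ValueError(f"expected a user: or group: term, got {tok!r}")
        kind, name = tok.split(":", 1)
        negate = name.startswith("!")  # the slide writes negation as user:!bob
        name = name.lstrip("!")
        matched = (who.user == name) if kind == "user" else (name in who.groups)
        return matched != negate

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

Under the assumed precedence, the slide 13 example reads as "mary, or anyone in both admins and VP who is not bob".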
  • 14. File ACEs – Key Features. Intuitive Inheritance: subdirectories and files inherit permissions from the parent directory. Whole-Volume ACEs: volume-level filter, useful in multitenant environments. Roles: arbitrary grouping of users according to your business needs. High Performance: no performance hit. Boolean Operators: allow for ultra-fine-grained permissions.
  • 15. File ACEs: Whole-Volume ACE Example. The whole-volume ACE is r: group:finance. Jane (Finance) grants read access to Bob (Developer) on the file /finance/final_report.csv with r: user:bob. Bob cannot read /finance/final_report.csv because the whole-volume ACE allows read access to finance only.
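The whole-volume example on slide 15 can be read as two checks that must both pass. The sketch below models that reading in Python; treating the whole-volume ACE as a filter that is AND-ed with the file ACE is an assumption drawn from the slide's wording ("volume-level filter"), and `can_read` and its parameters are hypothetical names.

```python
def can_read(user, groups, volume_read_groups, file_read_users):
    """Sketch: read access requires passing the volume filter AND the file ACE."""
    passes_volume = bool(set(groups) & set(volume_read_groups))
    passes_file = user in file_read_users
    return passes_volume and passes_file


# Slide 15: volume ACE is r: group:finance, file ACE is r: user:bob.
bob_allowed = can_read("bob", {"developers"}, {"finance"}, {"bob"})
```

Here `bob_allowed` is False: Bob passes the file ACE but not the volume filter, which matches the outcome described on the slide.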
  • 16. ACEs for Streams Too
  • 17. Robust Auditing with MapR
  • 18. MapR Audits: Serving Security Analysts (Monitoring, Incident Response, Security). Who touched customer records outside of business hours? What actions did users take in the days before leaving the company? What operations were performed without following change control? Are users accessing sensitive files from protected/secured source IPs? Why do my reports look different, despite sourcing from the same underlying data?
  • 19. MapR Audits – Key Features. Data Access: files, MapR-DB tables. Cluster Operations: administrative operations, maprcli commands. Authentication Requests. Secure. High Performance. Flexible: retention period, maxsize, coalesce interval, selective auditing. JSON Format: {"timestamp":"{$date=2015-06-01T05:24:58.231Z}","operation":"GETATTR","user":"root","uid":"0","ipAddress":"10.10.x.x","nfsServer":"10.10.x.x","srcPath":"/dbtest.0/","srcFid":"2147.16.2","VolumeName":"mktg_files","volumeId":"mktg_files","status":"0"}
  • 20. Querying Audit Logs with SQL. Example: detect suspicious, failed commands
  • 21. Auditing After-Hours Access
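Slides 20 and 21 show SQL queries over the audit logs, but the query text did not survive in this transcript. As a stand-in, here is a minimal Python sketch of the same two questions (failed operations and after-hours access) over JSON records shaped like the sample on slide 19. Several things are assumptions: that a `status` of "0" means success, that business hours are 09:00 to 17:00, that timestamps are compared as recorded without time-zone conversion, and the second and third sample records, which are invented for illustration.

```python
import json
import re
from datetime import datetime

# Pulls the ISO date-time out of the timestamp string shown on slide 19.
ISO = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def parse_timestamp(raw):
    match = ISO.search(raw)
    if match is None:
        raise ValueError(f"no timestamp found in {raw!r}")
    return datetime.strptime(match.group(0), "%Y-%m-%dT%H:%M:%S")


def load_records(lines):
    """Parse one JSON audit record per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def failed_operations(records):
    # Assumption: status "0" is success, anything else is a failure.
    return [r for r in records if r.get("status") != "0"]


def after_hours(records, start_hour=9, end_hour=17):
    # Assumption: business hours are [start_hour, end_hour) in log time.
    hits = []
    for record in records:
        hour = parse_timestamp(record["timestamp"]).hour
        if hour < start_hour or hour >= end_hour:
            hits.append(record)
    return hits


SAMPLE_LINES = [
    # First record follows slide 19; the other two are hypothetical.
    '{"timestamp":"{$date=2015-06-01T05:24:58.231Z}","operation":"GETATTR",'
    '"user":"root","srcPath":"/dbtest.0/","status":"0"}',
    '{"timestamp":"{$date=2015-06-01T14:10:02.000Z}","operation":"READ",'
    '"user":"alice","srcPath":"/finance/final_report.csv","status":"0"}',
    '{"timestamp":"{$date=2015-06-01T23:45:30.000Z}","operation":"DELETE",'
    '"user":"mallory","srcPath":"/finance/final_report.csv","status":"13"}',
]

records = load_records(SAMPLE_LINES)
failed_users = [r["user"] for r in failed_operations(records)]
late_users = [r["user"] for r in after_hours(records)]
```

The same filters map naturally onto SQL `WHERE` clauses, which is the approach the slides describe.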
  • 22. Data Protection with MapR
  • 23. Encryption at Rest (Today). Many options for block-level, disk-level, and field-level encryption. Sensitive data: SSN, credit card #, health records, name + age + address. Diagram labels: Sensitive Data, Volume, Self-Encrypting Disk. Use partners for masking, tokenization, and format-preserving encryption.
  • 24. Sensitive Data Management with Dataguise
  • 25. Cost of a Data Breach: Time and Financial Impact on Organizations. “Hackers and criminal insiders cause the most data breaches… malicious attacks can take an average of 256 days to identify… The most costly breaches continue to occur in the US and Germany at $217 and $211 per compromised record… If a healthcare organization has a breach, the average cost could be as high as $363.” Source: Ponemon Institute, 2015.
  • 26. Secure Environment. Perimeter Security: physical security, firewalls, IDS/IPS… Volume/File-level Encryption: control over data access; meeting regulatory compliance… Aren’t these enough? YOU NEED BOTH… AND MORE.
  • 27. PHI: Guidance for Data De-Identification. Sensitive/privacy data: name; address; dates (birth, death...); telephone numbers; device identifiers and serial numbers; email addresses; SSN; medical record numbers; account numbers; …
  • 28. What Should We Do? At a granular (cell) level: precisely locate sensitive content across ALL repositories; protect those assets appropriately (masking, encryption); provide “controlled” access to data; enable employees and trusted partners to make data-driven decisions. Risks: breach, security, compliance. Value: revenue, data-driven decisions, business intelligence.
  • 29. DgSecure. DETECT: where sensitive content is present in structured, unstructured & semi-structured data. AUDIT: who has access to which sensitive data, and identify misalignments and risk factors. PROTECT: sensitive data at the element level; encrypt/decrypt with RBAC, mask or redact. MONITOR: based on alert policies, track sensitive data access through a 360° dashboard.
  • 30. DgSecure (DETECT, AUDIT, PROTECT, MONITOR as on slide 29) across Hadoop, RDBMS, files, and NoSQL DB.
  • 31. DgSecure (DETECT, AUDIT, PROTECT, MONITOR as on slide 29) on premise, in the cloud, or hybrid, across Hadoop, RDBMS, files, and NoSQL DB.
  • 32. How do we do that in DgSecure?
  • 33. Complex Sensitive Data Detection: sensitive data discovery for complex environments. Patterns in “strings”: digit patterns, e.g. 4451 3340 0023 1200, 8/16, B7127157, Expires 04-19-15. Patterns in “grammar”: August Thomson vs 1240 August Ave vs 12 August 1994. Patterns in context (dependent): other data elements in horizontal or vertical vicinity, e.g. ‘94538’ near address elements. Patterns in combination (composite): CCN & name; CCN, name, expiry; not just CCN. Patterns in knowledge: ontologies, e.g. HL7 encoding, financial market data. Discovery for data at rest: Hadoop (HDFS), DBMS, Teradata, files, SharePoint. Discovery for data in motion: Flume (into HDFS), FTP (into HDFS or between file systems), Sqoop (into HDFS), Kafka (Q3 2016).
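The "patterns in strings" idea on slide 33 can be illustrated with a small Python sketch that finds 16-digit card-like sequences and then applies the Luhn checksum to reduce false positives. This is a generic technique, not a description of how DgSecure detects data; the regular expression, the function names, and the sample text are assumptions. The number 4111 1111 1111 1111 is a commonly used test value that passes the Luhn check.

```python
import re

# Four groups of four digits, optionally separated by a space or hyphen.
CARD_CANDIDATE = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text):
    """Return (matched_text, passes_luhn) for every card-like digit pattern."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group(0))
        hits.append((match.group(0), luhn_valid(digits)))
    return hits


sample = "Card 4111 1111 1111 1111 Expires 04-19-15, ref 4451 3340 0023 1200"
hits = find_card_numbers(sample)
```

A string pattern alone flags both sequences; the checksum separates a plausible card number from digits that merely look like one. The context and composite patterns listed on the slide would add further signals, such as a nearby expiry date or cardholder name.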
  • 34. Sensitive Data Protection: Masking & Encryption in Hadoop. Masking: obfuscation, a one-way operation; multiple options in DgSecure (fictitious but realistic values, X’ing out part of the content…); consistent masking to retain the statistical distribution of the data. Encryption: encrypted cell/row; accessible by authorized users only (Hive, bulk, via app); granular protection. Redaction: X’ing out an entire sensitive data cell; nullifying.
  • 35. Data Masking: Multiple Options, with Examples (masking option applied: original value → masked value). Telephone – Random (realistic, fictitious): (508) 850-0058 → (325) 418-0131. Telephone – Character (hide digits): (508) 850-0058 → XXX-XXX-0058. Telephone – Intellimask (replace first 3 digits): (508) 850-0058 → (451) 850-0058. Telephone – FPM (format preserving; replace characters and digits with the same type): (508) 850-0058, 508 850 0058, 508-850-0058 → (729) 432-9647, 729 432 9647, 729-432-9647. Telephone – Static Masking (replace all with (111) 222-3333): (508) 850-0058 → (111) 222-3333.
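Three of the masking options on slide 35 are simple enough to sketch in Python. The functions below are illustrations only and are not DgSecure's algorithms: `mask_format_preserving` is a keyed, deterministic substitution built on HMAC-SHA256, not a standardized format-preserving encryption scheme, and all function names and the key are assumptions. Keying on the letters and digits alone means the same phone number written in different formats masks to the same digits, mirroring the FPM row of the table.

```python
import hashlib
import hmac


def mask_static(_value, replacement="(111) 222-3333"):
    """Static masking: every input maps to the same fixed value."""
    return replacement


def mask_character(value):
    """Character masking: hide all but the last four digits."""
    digits = [c for c in value if c.isdigit()]
    if len(digits) < 4:
        raise ValueError("need at least four digits to keep a visible suffix")
    return "XXX-XXX-" + "".join(digits[-4:])


def mask_format_preserving(value, key):
    """Replace digits with digits and letters with letters, keep punctuation.

    Deterministic for a given key, so repeated values mask consistently.
    """
    canonical = "".join(c for c in value if c.isalnum())
    stream = hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).digest()
    out = []
    used = 0
    for char in value:
        if char.isdigit():
            out.append(str(stream[used % len(stream)] % 10))
            used += 1
        elif char.isalpha():
            base = ord("A") if char.isupper() else ord("a")
            out.append(chr(base + stream[used % len(stream)] % 26))
            used += 1
        else:
            out.append(char)  # spaces, parentheses, hyphens stay in place
    return "".join(out)


key = b"demo-key-not-for-production"  # hypothetical key for the example
masked_a = mask_format_preserving("(508) 850-0058", key)
masked_b = mask_format_preserving("508-850-0058", key)
```

Because the substitution is deterministic, equal inputs always produce equal outputs, which is one way to get the "consistent masking" property mentioned on slide 34. A production system would use a vetted scheme and managed keys.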
  • 36. Unstructured Data – Any Sensitive Elements? (raw data)
  • 37. Masking Data in Hadoop (Cell Level): raw data
  • 38. Masking Data in Hadoop (Cell Level): masked data
  • 39. Encrypting Data in Hadoop (Cell Level): masked data and encrypted data
  • 40. Masking Data in Hadoop (Cell Level)
  • 41. Decryption through Hive Queries: user WITHOUT access privileges for Names and SSN
  • 42. Decryption through Hive Queries: user WITH access privileges for Names and SSN
  • 43. BI Use Cases and Sensitive Elements. Use cases (analytic and transactional): Brand Sentiment, Log Analysis, Customer Retention, Clinical Trial Analysis, Payments Risk Mgmt., Trading System Perf., Risk Modeling, Supply Chain Optimization, Smart Metering, Insurance Premiums, Process Efficiency, Person of Interest Discovery, Dynamic Pricing, IT Security Intelligence, Real-time Upsell, Monitoring Sensors. Sensitive elements: Name, Address, Email Address, Customer Lifetime Value, IP Address, URL, Medical Record Number, Social Security Number, Telephone Number, Date of Birth (DOB), IP Addresses, Credit Card Number, Credit Limit, Purchase Amount, VIN, Device ID.
  • 44. Protection Policy: Encryption, Masking. Use cases as on slide 43. Sensitive elements, each marked Mask or Encrypt: Name, Address, Email Address, Customer Lifetime Value, IP Address, URL, Medical Record Number, Social Security Number, Telephone Number, Date of Birth (DOB), Medical Test Results, Credit Card Number, Credit Limit, Purchase Amount, VIN, Device ID, Transaction Date.
  • 45. Protection Policy: Encryption, Masking (as on slide 44)
  • 46. DgSecure Solution Workflow
  • 47. DgSecure for Hadoop: Policy (workflow stages: DETECT, AUDIT, PROTECT, REPORT). Policy: per data feed? Protection options. Custom elements: singleton, composite, dependent. Domain definition. Key management.
  • 48. DgSecure for Hadoop: Detection. In-flight; within HDFS; full vs. incremental; structured, semi-structured, unstructured; quick scan; element count.
  • 49. DgSecure for Hadoop: Access Audit. Files/directories: sensitive elements; protected?; who has access? Users: what can they access?
  • 50. DgSecure for Hadoop: Protection. Domain-based masking; redaction; encryption (field or record; AES or FPE).
  • 51. DgSecure for Hadoop: Reports. Job level: sensitive elements, directories & files, remediation applied. Dashboard: by directory or by policy, drill-down. Audit report: user actions. Notifications.
  • 52. DgSecure Monitor
  • 53. DgSecure Monitor. Precisely focused on monitoring sensitive data: where the sensitive content is and how much of it there is (density); how it is protected; what data is accessed; who is accessing it. Across all enterprise repositories: Hadoop and Cassandra; cloud support (AWS S3 and Azure Blob). Continuous, near-real-time anomalous behavior detection: using machine learning to build user profiles; complex event processing to detect breaches. “Out of the box” templates.
  • 54. DgSecure Monitor (architecture diagram). On premise: NoSQL, RDBMS, Hadoop. Cloud datastores: S3, RDBMS, Blob Storage, Hadoop. Other diagram labels: Sensitive Info; DgSecure Repository Monitoring Metadata Monitoring Metadata Manager Detection Data Access Information Monitoring Engine.
  • 55. Secure Business Workflow: Enterprise Data Marketplace Use Case
  • 56. Data Marketplace End-to-End Workflow (Secure Business Execution). Multiple data feeds with their own policies. Data asset marketplace: data assets (indexed); access granted upon request per policy & compliance. Workflow stages 1 to 8: Sources, Landing Zone, Data Process, Collect Metadata, Annotate & Publish, Browse & Request, Approve, Analyze & Report. Roles: Data Admin, Data Scientist. Diagram labels: Set policy per feed; Data Lake; Data Feeds 1 to 4; Set Access Control; Metadata Repository; Q1 to Q4, Regions 1 to 4, Data Sets 1 to 4; Access Given; Access Denied.
  • 57. Data Marketplace End-to-End Workflow (diagram as on slide 56). CISO/CPO: Set policy per data feed type.
  • 58. Data Marketplace End-to-End Workflow (diagram as on slide 56). Data Asset Owner: Provenance metadata.
  • 59. Data Marketplace End-to-End Workflow (diagram as on slide 56). Run Discovery to detect sensitive data; metadata to repository. Mask/Encrypt to protect sensitive data; metadata incl. lineage to repository.
  • 60. Data Marketplace End-to-End Workflow (diagram as on slide 56). IT/Set Process: Use metadata to set access control.
  • 61. Data Marketplace End-to-End Workflow (diagram as on slide 56). Data Asset Owner adds annotations & adds to Data Asset Index.
  • 62. Data Marketplace End-to-End Workflow (diagram as on slide 56). Data Scientist browses available data sets and makes an access request.
  • 63. Data Marketplace End-to-End Workflow (diagram as on slide 56). Data owner approves request; sets access control in Ranger.
  • 64. Data Marketplace End-to-End Workflow (diagram as on slide 56). Data Scientist runs data mining/BI/analytics.
  • 65. Data Marketplace End-to-End Workflow (diagram as on slide 56). Data Scientist runs data mining/BI/analytics. Other data sources.
  • 66. Data Marketplace End-to-End Workflow (diagram as on slide 56). Other data sources.
  • 67. MapR + Dataguise: Comprehensive Data Security. Active Directory, Disk, Auditing, Incident Response, Authentication, Authorization, Data Protection, Compliance, Vulnerability Management.
  • 68. Q&A