© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Leveraging Big Data for Insurance Insights Without
Putting PII/PHI at Risk
February 25, 2016
Today’s Speakers
Syed Mahmood, Sr. Product Marketing Manager – Hortonworks
smahmood@hortonworks.com
Cindy Maike, GM Insurance – Hortonworks
cmaike@hortonworks.com
Venkat Subramanian, CTO and VP of Engineering – Dataguise
venkat@dataguise.com
Is Sensitive Data (PII/PHI) a challenge
for your company’s analytics & big data
programs?
A. Yes
B. No
If Yes, do you have capabilities in place
to manage sensitive data discovery,
protection and audit?
A. Yes
B. No
Big Data Business Insights
Insurance Opportunities
Data Privacy Protection Requirements
•  Regulatory
•  Customer Expectations
The Insurance Data Landscape has Changed Dramatically
Customer-centric / need-based insurance offerings
500 GB of data per vehicle per year in UBI programs
Drones will make the workflow more efficient by 2020
Digital is becoming the preferred interaction channel for consumers / insureds
Growing availability & usage of geospatial data
Change in claim frequency & severity, fraud anomaly analytics
Industry Opportunity
High-performance analytics, or
a combination of structured and
unstructured data, is changing
the ways of the insurance industry
after decades of conservatism.
View of Insurance Industry Data Landscape
Data velocity: Batch to Real-time
Data variety: Structured, Semi-structured, Unstructured
Weather-event
Drone image feeds
Social media
Sensor (IoT)
Geo-location
Deposition recording
Notes and diary
Medical records & bills
Transcriptions
Photos
Investigation
TPA invoices
FNOL intake
Claims triage
Vendor invoices
Forms and letters
Claim system
Policy verification
Applications/Submissions
3rd party risk models
Prior loss runs
New Opportunities – Security Challenges
Use Case & Opportunity: Know Your Customer
Data Sources (examples): Application documents, clickstream and web logs, marketing research, CRM records, and social media
New Security Challenges:
•  Coverage for multiple file types and sources
•  Critical detection to find and measure sensitivity risk
Use Case & Opportunity: Claims Optimization & Fraud Detection
Data Sources (examples): Policy records, claims databases, receipts, accident reports, emails, and transcriptions
New Security Challenges:
•  Reduce or eliminate PCI scope for Hadoop
•  Detect new sensitivity risks in hard-to-reach unstructured data
Use Case & Opportunity: Evaluate Risk / New Products
Data Sources (examples): Mobile telematics, sensor data, social media, and voice-to-text files
New Security Challenges:
•  High scale
•  Large sets of small files
•  Detection and protection of unstructured data
Use Case & Opportunity: Traditional Documents & Attachments
Data Sources (examples): Claims data, insured prior loss data, and claims adjuster notes
New Security Challenges:
•  Masking of sensitive data for data sharing
•  Sensitive data auditing
Use Case & Opportunity: Third-party Data Sharing
Data Sources (examples): Reporting bureaus, third-party claims administrators (TPAs), telematics service providers (TSPs)
New Security Challenges:
•  Tiered access: highly granular roles with differing needs/views for sensitive data
Hortonworks + Dataguise =
SECURE BUSINESS EXECUTION
CTO, DATAGUISE
VENKAT SUBRAMANIAN
Dataguise enables Secure Business Execution for data-driven enterprises by delivering data-centric security solutions that Detect, Audit, Protect and Monitor sensitive data assets where they are, wherever they move across repositories.
©2015 Contains confidential and proprietary information and may not be disclosed by the recipient to any third party.
©2015 Dataguise, Inc. Confidential and Proprietary
Secure Business Execution
The ability of an Enterprise
to safely and responsibly leverage
the value of all of their data assets
for the purpose of
gaining new business insights,
maximizing competitive advantage,
and driving revenue growth
Business Intelligence Trend for 2016
Shift from IT-led, system-of-record reporting to pervasive, business-led, self-service analytics
•  Easy-to-use, fast, agile BI & Analytics
•  Deeper insights into diverse data sources
** Rita Sallam, Gartner
Data is your biggest Asset
It is also your biggest Vulnerability
DgSecure
DETECT
Where sensitive content is present in structured/unstructured/semi-structured data
AUDIT
Who has access to which sensitive data & identify misalignments and risk factors
PROTECT
Sensitive data at the element level: encrypt/decrypt with RBAC, mask
MONITOR
Based on metadata, track how and where sensitive data is being accessed through a 360° dashboard
Across Hadoop, RDBMS, Files, NoSQL DB
On Premise, in the Cloud, or Hybrid
PHI: Guidance for Data De-Identification
Sensitive/Privacy Data
•  Name
•  Address
•  Dates – Birth, Death, ..
•  Telephone Numbers
•  Device Identifiers and serial numbers
•  Email addresses
•  SSN
•  Medical record numbers
•  Account Numbers
…..
Secure Environment
Perimeter Security, Volume/File encryption
•  I have strong perimeter security
Physical Security, Firewall, IDS/IPS…
Isn’t that enough?
•  I have turned on volume/file-level encryption
Control data access
Meeting regulatory compliance
Isn’t this enough?
Need BOTH and *more!
What Should We Do?
1.  Precisely locate sensitive content across ALL repositories
2.  Protect those assets appropriately – masking, encryption
3.  Open up ‘controlled’ access to data now that sensitive elements are
protected
4.  Enable employees, trusted partners and customers to make data-driven
decisions
RISKS: BREACH, SECURITY, COMPLIANCE
VALUE: REVENUE, DATA DRIVEN DECISIONS, BUSINESS INTELLIGENCE
At the cell-level…
How do we do it in DgSecure
Complex Sensitive Data Discovery
Sensitive Data Type Sample Data
Address 50920 April Blvd. Apt. 181, Lalana ME 83271
1000 Coney Island Ave. Brooklyn NY 11230
Name George Smith
Smith, A. George
Credit Card Number 3710 664089 10315
345039502030507
3780-331072-30547
Telephone Number (510) 824-1036
510-824-1036
510.814.1036
5108141036
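The sample values above show the same data type arriving in several layouts, so a single fixed pattern per type is not enough. A minimal sketch of one way to handle this, assuming a regex-plus-checksum approach; this is not DgSecure's detection logic, and the names `normalize_digits`, `looks_like_us_phone` and `luhn_valid` are hypothetical:

```python
import re


def normalize_digits(text: str) -> str:
    """Drop spaces, dots, dashes and parentheses, keeping only the digits."""
    return re.sub(r"\D", "", text)


# Optional parentheses around the area code, optional separator between groups.
PHONE_RE = re.compile(r"^\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$")


def looks_like_us_phone(text: str) -> bool:
    """True when the text matches one of the 10-digit phone layouts."""
    return bool(PHONE_RE.match(text.strip()))


def luhn_valid(number: str) -> bool:
    """Luhn checksum over the digits of a candidate card number."""
    digits = [int(c) for c in normalize_digits(number)]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

Normalizing before validating lets one checksum routine cover the spaced, dashed and unbroken card layouts, while the phone pattern tolerates each separator style directly.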
Sensitive Data Protection
Masking & Encryption in Hadoop
•  MASKING
–  Obfuscation, one-way operation
–  Multiple options in DgSecure – fictitious but realistic values, X’ing out part of the
content….
–  Consistent masking to retain statistical distribution of data
•  ENCRYPTION
–  Encrypted cell/row
–  Accessible by authorized users only – Hive, bulk, via App
–  Granular protection
•  REDACTION
–  X’ing out entire sensitive data cell
–  Nullifying
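The three protection styles above can be sketched in a few lines. This is an illustration under stated assumptions, not how DgSecure implements them: consistent masking is approximated with a keyed hash (HMAC) that always maps the same input to the same fictitious value, and the name pool, secret and function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical pool of fictitious but realistic replacement values.
FICTITIOUS_NAMES = ["Alex Morgan", "Jamie Lee", "Sam Carter", "Robin Diaz"]


def mask_consistent(value: str, secret: bytes) -> str:
    """One-way masking: the same value and secret always give the same name."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:4], "big") % len(FICTITIOUS_NAMES)
    return FICTITIOUS_NAMES[index]


def mask_partial(value: str, keep_last: int = 4) -> str:
    """X out every letter or digit except the last `keep_last` of them."""
    remaining = sum(ch.isalnum() for ch in value) - keep_last
    masked = []
    for ch in value:
        if ch.isalnum() and remaining > 0:
            masked.append("X")
            remaining -= 1
        else:
            masked.append(ch)  # separators and the kept tail pass through
    return "".join(masked)


def redact(value: str) -> str:
    """X out the entire cell, preserving only its length."""
    return "X" * len(value)
```

Because the mapping is deterministic, joins and group-by counts on the masked column still line up across data sets. A pool this small will map many inputs to the same name, so a real deployment would need a far larger value space.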
Masking Data in Hadoop (Cell Level)
Encrypting Data in Hadoop (Cell Level)
Decryption through hive queries
User WITHOUT access privileges on Names & SSN
Decryption through hive queries
User WITH access privileges on Names & SSN
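The contrast between the two users can be sketched as a per-column check at read time. This is a simplified illustration, not the Hive integration itself: `view_row` and the column names are hypothetical, and `toy_decrypt` merely reverses a string as a stand-in for a real cipher.

```python
from typing import Callable, Dict, Set


def view_row(
    stored_row: Dict[str, str],
    protected_columns: Set[str],
    user_privileges: Set[str],
    decrypt: Callable[[str], str],
) -> Dict[str, str]:
    """Return the row as one user would see it.

    A protected cell is decrypted only when the user holds a privilege on
    that column; every other cell is returned exactly as stored.
    """
    visible = {}
    for column, value in stored_row.items():
        if column in protected_columns and column in user_privileges:
            visible[column] = decrypt(value)
        else:
            visible[column] = value
    return visible


def toy_decrypt(value: str) -> str:
    """Stand-in for a real cipher: reverses the string. NOT cryptography."""
    return value[::-1]
```

A user without privileges on the name and SSN columns gets the stored (encrypted) cells back unchanged, while unprotected columns are visible to everyone.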
  
Encryption or Masking in Hadoop
Use cases (labeled Analytic and Transactional): Trading System Perf., Customer retention, Payments Risk Mgmt., IT Security Intelligence, Dynamic pricing, Process efficiency, Log analysis, Insurance Premiums, Clinical trial analysis, Smart metering, Risk Modeling, Supply chain optimization, Brand sentiment, Real-time upsell, Monitoring Sensors, Person of Interest Discovery, Session Optimization
Sensitive data elements: Name, Personal Health Info, Medical test results, Biometric IDs, Credit Card Number, Social Security Number, Date of Birth (DOB), IP Address, URL, Email Address, Telephone Number, Credit limit, Purchase amount, Customer lifetime value, Address, Device ID, Transaction Date, VIN
Protection options: Mask, Encrypt
How does this work in DgSecure
HIGH-LEVEL DgSECURE FOR HADOOP FUNCTIONALITY
Policy Management
•  Domain Definition
•  Custom Elements
–  Composite
–  Dependent
•  Policy
–  Per Data Feed?
•  Protection Options
Detection
•  In-Flight
•  Within HDFS
•  Full vs. Incremental
•  Structured vs. Semi/Unstructured
•  Quick scan
•  Element Count
Auditing
•  Files/Dirs
–  Sensitive elements
–  Protected?
–  Who has access
•  Users
–  What can they see
Protection
•  Domain based
•  Masking
•  Redaction
•  Encryption
–  Field or Record
–  AES or FPE
Reporting
•  Job Level
–  Sensitive elements
–  Directories & Files
–  Remediation applied
•  Dashboard
–  Directory or by policy
–  Drill-down
•  Audit report
–  User actions
•  Notifications
Set Policy
Data Elements
Define/Execute Detection/Protection Task
Discovery Task Result
Masking Task Result
Dashboard
Entitlement Reports
Audit Reports
Sample Secure Business Workflow in an Enterprise
Sample End to End Flow
1.  CISO/CPO: Set policy per data feed type
2.  Data Asset Owner: Provenance metadata
3.  IT/Set Process: Run Discovery to detect sensitive data; metadata to repository (Atlas)
4.  IT/Set Process: Use metadata to set access control in Ranger
5.  Run Masking/Encryption to protect sensitive data; metadata incl. lineage to repository (Atlas)
6.  IT/Set Process: Use metadata to set access control in Ranger
7.  Data Asset Owner adds annotations & adds to Data Asset Index
8.  Data Scientist browses available data sets and makes access request
9.  Data owner approves request; sets access control in Ranger
10.  Data Scientist runs data mining/BI/Analytics
Ranger and Knox: Building on the Vision of
Comprehensive Security
Syed Mahmood
Security Challenges of Data Lake
Central repository of critical and sensitive
data
Data maintained over long duration
External ecosystem is in flux
Users can access and analyze data in new
and different ways
Differentiator 1: Comprehensive Approach to Security
In order to protect any data system you must implement the following:
•  Administration: Central management and consistent security. How do I set policy across the entire cluster?
•  Authentication: Authenticate users and systems. Who am I/prove it?
•  Authorization: Provision access to data. What can I do?
•  Audit: Maintain a record of data access. What did I do?
•  Data Protection: Protect data at rest and in motion. How can I encrypt at rest and over the wire?
HDP Security: Comprehensive, Complete, Extensible
Security in HDP is the most comprehensive, complete and extensible for Hadoop
•  Administration: Central management and consistent security
–  Single administrative console to set policy across the entire cluster: Apache Ranger
•  Authentication: Authenticate users and systems
–  Authentication for perimeter and cluster; integrates with existing Active Directory and LDAP solutions: Kerberos | Apache Knox
•  Authorization: Provision access to data
–  Consistent authorization controls across all Apache components within HDP: Apache Ranger
•  Audit: Maintain a record of data access
–  Record of data access events across all components that is consistent and accessible: Apache Ranger | Apache Atlas
•  Data Protection: Protect data at rest and in motion
–  Encrypts data in motion and data at rest; refer to partner encryption solutions for broader needs: HDFS TDE with Ranger KMS
Hortonworks Data Platform 2.3
GOVERNANCE & INTEGRATION
•  Data Lifecycle & Governance: Falcon, Atlas
•  Data Workflow: Sqoop, Flume, Kafka, NFS, WebHDFS
DATA ACCESS
•  Batch: MapReduce
•  Script: Pig
•  Search: Solr
•  SQL: Hive
•  NoSQL: HBase, Accumulo, Phoenix
•  Stream: Storm
•  In-memory: Spark
•  Others: ISV Engines
•  Tez, Slider
•  YARN: Data Operating System
DATA MANAGEMENT
•  HDFS: Hadoop Distributed File System
SECURITY
•  Administration, Authentication, Authorization, Auditing, Data Protection: Ranger, Knox, Atlas, HDFS Encryption
OPERATIONS
•  Provisioning, Managing, & Monitoring: Ambari, Cloudbreak, Zookeeper
•  Scheduling: Oozie
Deployment Choice: Linux, Windows, On-Premise, Cloud
Differentiator 2: Security Built into the Platform
Security Built into the Platform
Security is consistently
administered across data
access engines
Build or retire applications
without impacting security
	
  	
  
Security in Hadoop with HDP
•  Authentication: Who am I/prove it?
–  Kerberos
–  API security with Apache Knox
•  Authorization: What can I do?
–  Fine-grain access control with Apache Ranger
•  Audit: What did I do?
–  Centralized audit reporting with Apache Ranger
•  Data Protection: Can data be encrypted at rest and over the wire?
–  Wire encryption in Hadoop
–  HDFS Encryption with Ranger KMS
HDP2.3
Centralized Security Administration with Ranger
Apache Ranger
Comprehensive security for Enterprise Hadoop
Centralized Security with Ranger
Centralized platform
•  Centralized platform to define, administer and manage security policies consistently
•  Define security policy once and apply it to all the applicable components across the stack
Fine-grained security definition
•  Administer security for:
–  Database
–  Table
–  Column
–  LDAP Groups
–  Specific Users
Deep visibility
•  Administrators have complete visibility into the security administration process
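A fine-grained policy of this kind names a database, table and columns on one side and groups and users on the other. The sketch below only builds such a policy as a Python dictionary and does not contact any server. The field names (`service`, `resources`, `policyItems`, `accesses`) follow my understanding of the policy JSON accepted by Ranger's public REST API and should be checked against the Ranger version in use; the service, policy, table and group names are hypothetical.

```python
import json
from typing import Dict, List


def build_hive_policy(
    service: str,
    name: str,
    database: str,
    table: str,
    columns: List[str],
    groups: List[str],
    users: List[str],
) -> Dict:
    """Describe a column-level SELECT grant for the given groups and users."""
    return {
        "service": service,          # name of the Hive service defined in Ranger
        "name": name,                # policy name
        "isEnabled": True,
        "isAuditEnabled": True,      # record access events for this policy
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
            "column": {"values": list(columns)},
        },
        "policyItems": [
            {
                "groups": list(groups),   # e.g. LDAP groups
                "users": list(users),     # specific users
                "accesses": [{"type": "select", "isAllowed": True}],
            }
        ],
    }


if __name__ == "__main__":
    policy = build_hive_policy(
        "example_hive", "claims-analysts-select", "claims", "claim_notes",
        ["claim_id", "loss_date"], ["claims_analysts"], [],
    )
    print(json.dumps(policy, indent=2))
```

Keeping the sensitive columns out of the `column` list is what lets analysts query the table without seeing the protected fields.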
Authorization and Auditing with Ranger
•  Ranger services: Ranger Administration Portal, Ranger Policy Server, Ranger Audit Server
•  Ranger Plugin shown for Hadoop components: HDFS, HBase, Hive Server2, Knox, Storm, Solr, Kafka, YARN, TBD
•  Also shown: Enterprise Users, Legacy Tools and Data Governance, RDBMS, Integration API
•  Legend: Currently Supported in HDP 2.2; Future Additions
Apache Atlas is Now Included in HDP
Apache Atlas
•  Knowledge Store: Type-System, Models, Taxonomies, Policy Rules
•  Audit Store
•  Tag Based Policies
•  Data Lifecycle Management
•  Real Time Tag Based Access Control
•  REST API
•  Services: Search, Lineage, Exchange
•  Models: Healthcare (HIPAA, HL7), Financial (SOX, Dodd-Frank), Energy (PPDM), Retail (PCI, PII), Other (CWM)
Rest API
Modern, flexible access to Atlas services, HDP components and external tools
Search
SQL-like DSL (Domain Specific Language)
Support for key word, faceted and full text searches
Lineage
Capture all SQL runtime activity on HiveServer2 providing lineage for both data
and schema
Exchange
Leverage existing metadata by importing it from ETL tools, ERP systems and
data warehouses
Export metadata to downstream systems
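Tag-based access control makes the decision from classification tags attached to metadata rather than from individual table or column names. A toy sketch of that idea in plain Python, assuming a rule where every tag on a column must be granted to at least one of the user's groups; it does not call Atlas or Ranger, and the tags, groups and function name are hypothetical.

```python
from typing import Dict, Set


def can_read(
    column: str,
    user_groups: Set[str],
    column_tags: Dict[str, Set[str]],
    tag_grants: Dict[str, Set[str]],
) -> bool:
    """Allow the read only if each tag on the column is granted to the user.

    column_tags maps a column to its classification tags (e.g. PII, PCI).
    tag_grants maps a tag to the groups allowed to read data carrying it.
    Untagged columns are readable by everyone.
    """
    tags = column_tags.get(column, set())
    return all(user_groups & tag_grants.get(tag, set()) for tag in tags)
```

Under this rule, tagging a newly discovered sensitive column is enough to restrict it; no per-column policy has to be written.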
Page 74 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache Atlas Vision 2015
Metadata Services
Business Taxonomy - classification
Operational Data – Model for Hive: DB, Tables, Col,
Centralized location for all metadata inside HDP
Single Interface point for Metadata Exchange with
platforms outside of HDP.
Search & Prescriptive Lineage – Model and Audit
Apache Atlas
Hive
Ranger
Falcon
Kafka
Storm
Summary
The Insurance Data Landscape has Changed
•  The insurance industry is joining and analyzing data which has never been analyzed before
•  Many of these sources can be “murky” and sensitive
•  Traditional PII/PHI data sources ingested into Hadoop need to be:
–  Discovered
–  Protected
•  Protecting PII/PHI data is not an option for Insurers, TPAs and Brokers; it is a requirement
Questions?
Call to Action
Additional Information:
•  Data Protection Optimized for Insurance Big Data – A Dataguise and Hortonworks Capability Overview
•  Hortonworks: Comprehensive Security in Hadoop – Solving Security in Hadoop Whitepaper
•  Hortonworks: Building Governance into Big Data – Whitepaper

More Related Content

What's hot

Talend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data PlatformTalend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data PlatformHortonworks
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack EuropeHortonworks
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsHortonworks
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarHortonworks
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks
 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark Summit
 
Transform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksTransform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksHortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...DataWorks Summit/Hadoop Summit
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHow to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHortonworks
 
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifyHortonworks
 
IDC Retail Insights - What's Possible with a Modern Data Architecture?
IDC Retail Insights - What's Possible with a Modern Data Architecture?IDC Retail Insights - What's Possible with a Modern Data Architecture?
IDC Retail Insights - What's Possible with a Modern Data Architecture?Hortonworks
 
HPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHortonworks
 
The Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationThe Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationHortonworks
 
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
3 CTOs Discuss the Shift to Next-Gen Analytic EcosystemsHortonworks
 
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...Hortonworks
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks
 
Global Data Management – a practical framework to rethinking enterprise, oper...
Global Data Management – a practical framework to rethinking enterprise, oper...Global Data Management – a practical framework to rethinking enterprise, oper...
Global Data Management – a practical framework to rethinking enterprise, oper...DataWorks Summit
 


Dataguise hortonworks insurance_feb25

  • 1. Page 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Leveraging Big Data for Insurance Insights Without Putting PII/PHI at Risk February 25, 2016
  • 2. Today’s Speakers: Syed Mahmood, Sr. Product Marketing Manager – Hortonworks, smahmood@hortonworks.com; Cindy Maike, GM-Insurance – Hortonworks, cmaike@hortonworks.com; Venkat Subramanian, CTO and VP of Engineering – Dataguise, venkat@dataguise.com
  • 3. Is Sensitive Data (PII/PHI) a challenge for your company’s analytics & big data programs? A. Yes B. No
  • 4. If Yes, do you have capabilities in place to manage sensitive data discovery, protection and audit? A. Yes B. No
  • 5. Big Data Business Insights: Insurance Opportunities. Data Privacy Protection Requirements: • Regulatory • Customer Expectations
  • 6. The Insurance Data Landscape has Changed Dramatically: Customer centric / need based Insurance Offerings; 500GB data per annual vehicle in UBI programs; Drones will make the workflow efficient by 2020; Digital becoming consumer / Insured preferred interaction channel; Growing availability & usage of geospatial data; Change in Claim frequency & severity, fraud anomaly analytics
  • 7. Industry Opportunity: High-performance analytics, or a combination of structured and unstructured data, is changing the ways of the insurance industry after decades of conservatism.
  • 8. View of Insurance Industry Data Landscape. Data velocity: Batch, Real-time. Data variety: Structured, Semi-structured, Unstructured. Sources: Weather-event; Drone image feeds; Social media; Sensor (GoT); Geo-location; Deposition recording; Notes and diary; Medical records & bills; Transcriptions; Photos; Investigation; TPA invoices; FNOL intake; Claims triage; Vendor invoices; Forms and letters; Claim system; Policy verification; Applications/Submissions; 3rd party risk models; Prior loss runs
  • 9. New Opportunities – Security Challenges. Columns: Use Cases & Opportunities | Data Sources (examples) | New Security Challenges. Know Your Customer | Application documents, clickstream and web logs, marketing research, CRM records, and social media | Coverage for multiple file types and sources; Critical detection to find and measure sensitivity risk. Claims Optimization & Fraud Detection | Policy records, claims databases, receipts, accident reports, emails, and transcriptions | Reduce or eliminate PCI scope for Hadoop; Detect new sensitivity risks in hard-to-reach unstructured data. Evaluate Risk / New Products | Mobile telematics, sensor data, social media, and voice-to-text files | High scale; Large sets of small files; Detection and protection of unstructured data. Traditional Documents & Attachments | Claims data, insured prior loss data, and claims adjuster notes | Masking of sensitive data for data sharing; Sensitive data auditing. Third-party Data Sharing | Reporting bureaus, third-party claims administrators (TPAs), telematics service providers (TSPs) | Tiered access: highly granular roles with differing needs/views for sensitive data
  • 10. Hortonworks + Dataguise = SECURE BUSINESS EXECUTION CTO, DATAGUISE VENKAT SUBRAMANIAN
  • 11. Dataguise enables Secure Business Execution for data-driven enterprises by delivering data-centric security solutions that Detect, Audit, Protect and Monitor sensitive data assets where they are, wherever they move across repositories. ©2015 Contains confidential and proprietary information and may not be disclosed by the recipient to any third party.
  • 12. ©2015 Dataguise, Inc. Confidential and Proprietary. Secure Business Execution: The ability of an Enterprise to safely and responsibly leverage the value of all of their data assets for the purpose of gaining new business insights, maximizing competitive advantage, and driving revenue growth
  • 13. Business Intelligence Trend for 2016: Shift from IT-led, System-of-record reporting to Pervasive, Business-led, self-service analytics • Easy-to-use, fast, agile BI & Analytics • Deeper Insights into diverse data sources ** Rita Sallam, Gartner
  • 14. Data is your biggest Asset. It is also your biggest Vulnerability.
  • 15. ©2016 Dataguise, Inc. Confidential and Proprietary. DgSecure. DETECT: Where sensitive content is present in struct/unstruct/semi-struct data. AUDIT: Who has access to which sensitive data & identify misalignments and risk factors. PROTECT: Sensitive data at the element level: encrypt/decrypt with RBAC, mask. MONITOR: Based on metadata, track how and where sensitive data is being accessed through a 360° dashboard. Across Hadoop, RDBMS, Files, NoSQL DB. On Premise, in the Cloud, or Hybrid.
  • 16. PHI: Guidance for Data De-Identification. Sensitive/Privacy Data: • Name • Address • Dates (Birth, Death, ...) • Telephone Numbers • Device Identifiers and serial numbers • Email addresses • SSN • Medical record numbers • Account Numbers ...
  • 17. Secure Environment: Perimeter Security, Volume/File encryption • I have strong perimeter security (Physical Security, Firewall, IDS/IPS…). Isn’t that enough? • I have turned on volume/file-level encryption (Control data access; Meeting regulatory compliance). Isn’t this enough? Need BOTH and *more!
  • 18. What Should We Do? 1. Precisely locate sensitive content across ALL repositories 2. Protect those assets appropriately – masking, encryption 3. Open up ‘controlled’ access to data now that sensitive elements are protected 4. Enable employees, trusted partners and customers to make data-driven decisions. RISKS: BREACH, SECURITY, COMPLIANCE. VALUE: REVENUE, DATA DRIVEN DECISIONS, BUSINESS INTELLIGENCE. At the cell-level…
  • 19. How do we do it in DgSecure
  • 20. Complex Sensitive Data Discovery. Columns: Sensitive Data Type | Sample Data. Address | 50920 April Blvd. Apt. 181, Lalana ME 83271; 1000 Coney Island Ave. Brooklyn NY 11230. Name | George Smith; Smith, A. George. Credit Card Number | 3710 664089 10315; 345039502030507; 3780-331072-30547. Telephone Number | (510) 824-1036; 510-824-1036; 510.814.1036; 5108141036
  • 21. Sensitive Data Protection: Masking & Encryption in Hadoop • MASKING – Obfuscation, one-way operation – Multiple options in DgSecure: fictitious but realistic values, X’ing out part of the content, ... – Consistent masking to retain statistical distribution of data • ENCRYPTION – Encrypted cell/row – Accessible by authorized users only (Hive, bulk, via App) – Granular protection • REDACTION – X’ing out entire sensitive data cell – Nullifying
  • 22. Masking Data in Hadoop (Cell Level)
  • 23. Masking Data in Hadoop (Cell Level)
  • 24. Masking Data in Hadoop (Cell Level)
  • 25. Encrypting Data in Hadoop (Cell Level)
  • 26. Encrypting Data in Hadoop (Cell Level)
  • 27. Decryption through Hive queries: User WITHOUT access privileges on Names & SSN
  • 28. Decryption through Hive queries: User WITH access privileges on Names & SSN
  • 29. Encryption or Masking in Hadoop. Analytic; Transactional; Trading System Perf.; Customer retention; Payments Risk Mgmt.; IT Security Intelligence; IP Addresses; Name; Personal Health Info; Credit Card Number; Dynamic pricing; Process efficiency; Log analysis; Insurance Premiums; Clinical trial analysis; Smart metering; Risk Modeling; Supply chain optimization; Brand sentiment; Real-time upsell; Monitoring Sensors; Social Security Number; Date of Birth (DOB); IP Address; URL; Email Address; Telephone Number; Credit limit; Purchase amount; Customer lifetime value; Address; Device ID; Transaction Date; VIN; Person of Interest Discovery; Session Optimization
  • 30. Encryption or Masking in Hadoop. Analytic; Transactional; Trading System Perf.; Customer retention; Payments Risk Mgmt.; IT Security Intelligence; Medical test results; Name; Personal Health Info; Credit Card Number; Dynamic pricing; Process efficiency; Log analysis; Insurance Premiums; Clinical trial analysis; Smart metering; Risk Modeling; Supply chain optimization; Brand sentiment; Real-time upsell; Monitoring Sensors; Social Security Number; Date of Birth (DOB); IP Address; URL; Email Address; Telephone Number; Credit limit; Purchase amount; Customer lifetime value; Address; Mask; Encrypt; Device ID; Transaction Date; VIN; Person of Interest Discovery; Session Optimization
  • 31. Encryption or Masking in Hadoop. Analytic; Transactional; Trading System Perf.; Customer retention; Payments Risk Mgmt.; IT Security Intelligence; Biometric IDs; Name; Personal Health Info; Credit Card Number; Dynamic pricing; Process efficiency; Log analysis; Insurance Premiums; Clinical trial analysis; Smart metering; Risk Modeling; Supply chain optimization; Brand sentiment; Real-time upsell; Monitoring Sensors; Social Security Number; Date of Birth (DOB); IP Address; URL; Email Address; Telephone Number; Credit limit; Purchase amount; Customer lifetime value; Address; Mask; Encrypt; Device ID; Transaction Date; VIN; Person of Interest Discovery; Session Optimization
  • 32. Encryption or Masking in Hadoop. Analytic; Transactional; Trading System Perf.; Customer retention; Payments Risk Mgmt.; IT Security Intelligence; Dynamic pricing; Process efficiency; Log analysis; Insurance Premiums; Clinical trial analysis; Smart metering; Risk Modeling; Supply chain optimization; Brand sentiment; Real-time upsell; Monitoring Sensors; Person of Interest Discovery; Session Optimization; Medical test results; Name; Personal Health Info; Credit Card Number; Social Security Number; Date of Birth (DOB); IP Address; URL; Email Address; Telephone Number; Credit limit; Purchase amount; Customer lifetime value; Address; Mask; Device ID; Transaction Date; VIN; Number; Encrypt
  • 33. Encryption or Masking in Hadoop. Analytic; Transactional; Trading System Perf.; Customer retention; Payments Risk Mgmt.; IT Security Intelligence; Medical test results; Name; Personal Health Info; Credit Card Number; Dynamic pricing; Process efficiency; Log analysis; Insurance Premiums; Clinical trial analysis; Smart metering; Risk Modeling; Supply chain optimization; Brand sentiment; Real-time upsell; Monitoring Sensors; Social Security Number; Date of Birth (DOB); IP Address; URL; Email Address; Telephone Number; Credit limit; Purchase amount; Customer lifetime value; Address; Mask; Encrypt; Device ID; Transaction Date; VIN; Person of Interest Discovery; Session Optimization
  • 34. How does this work in DgSecure
  • 35. HIGH-LEVEL DgSECURE FOR HADOOP FUNCTIONALITY. Policy Management: Domain Definition; custom Elements (Composite, Dependent); Policy (Per Data Feed?); Protection Options. Detection: In-Flight; Within HDFS; Full vs. Incremental; Structured vs. Semi/Unstructured; Quick scan; Element Count. Auditing: Files/Dirs (Sensitive elements; Protected?; Who has access); Users (What can they see). Protection: Domain based; Masking; Redaction; Encryption (Field or Record; AES or FPE). Reporting: Job Level (Sensitive elements; Directories & Files; Remediation applied); Dashboard (Directory or by policy; Drill-down); Audit report (User actions); Notifications.
  • 36. Set Policy
  • 37. Data Elements
  • 38. Define/Execute Detection/Protection Task
  • 39. Discovery Task Result
  • 40. Masking Task Result
  • 42. Dashboard
  • 43. Entitlement Reports
  • 44. Audit Reports
  • 45. Sample Secure Business Workflow in an Enterprise
  • 46. Sample End to End Flow
  • 47. Sample End to End Flow. CISO/CPO: Set policy per data feed type
  • 48. Sample End to End Flow. Data Asset Owner: Provenance metadata
  • 49. Sample End to End Flow. IT/Set Process: Run Discovery to detect sensitive data; Metadata to repository (Atlas)
  • 50. Sample End to End Flow. IT/Set Process: Use Metadata to set access control in Ranger
  • 51. Sample End to End Flow. Run Masking/Encr to protect sensitive data; Metadata incl. lineage to repository (Atlas)
  • 52. Sample End to End Flow. IT/Set Process: Use Metadata to set access control in Ranger
  • 53. Sample End to End Flow. Data Asset owner adds annotations & adds to Data Asset Index
  • 54. Sample End to End Flow. Data Scientist browses available data sets and makes access request
  • 55. Sample End to End Flow. Data owner approves request; Sets access control in Ranger
  • 56. Sample End to End Flow. Data Scientist runs data mining/BI/Analytics
  • 57. Sample End to End Flow
  • 58. Ranger and Knox: Building on the Vision of Comprehensive Security. Syed Mahmood
  • 59. Security Challenges of Data Lake: Central repository of critical and sensitive data; Data maintained over long duration; External ecosystem is in flux; Users can access and analyze data in new and different ways
  • 60. Differentiator 1: Comprehensive Approach to Security. In order to protect any data system you must implement the following: Administration: Central management and consistent security (How do I set policy across the entire cluster?). Authentication: Authenticate users and systems (Who am I/prove it?). Authorization: Provision access to data (What can I do?). Audit: Maintain a record of data access (What did I do?). Data Protection: Protect data at rest and in motion (How can I encrypt at rest and over the wire?).
  • 61. HDP Security: Comprehensive, Complete, Extensible. Security in HDP is the most comprehensive, complete and extensible for Hadoop. Administration (Central management and consistent security): Single administrative console to set policy across the entire cluster: Apache Ranger. Authentication (Authenticate users and systems): Authentication for perimeter and cluster; integrates with existing Active Directory and LDAP solutions: Kerberos | Apache Knox. Authorization (Provision access to data): Consistent authorization controls across all Apache components within HDP: Apache Ranger. Audit (Maintain a record of data access): Record of data access events across all components that is consistent and accessible: Apache Ranger | Apache Atlas. Data Protection (Protect data at rest and in motion): Encrypts data in motion and data at rest; refer to partner encryption solutions for broader needs: HDFS TDE with Ranger KMS.
  • 62. Differentiator 2: Security Built into the Platform. Hortonworks Data Platform 2.3 (architecture diagram). GOVERNANCE & INTEGRATION: Data Lifecycle & Governance (Falcon, Atlas); Data Workflow (Sqoop, Flume, Kafka, NFS, WebHDFS). SECURITY: Administration, Authentication, Authorization, Auditing, Data Protection (Ranger, Knox, Atlas, HDFS Encryption). OPERATIONS: Provisioning, Managing, & Monitoring (Ambari, Cloudbreak, Zookeeper); Scheduling (Oozie). DATA ACCESS: Batch (MapReduce); Script (Pig); Search (Solr); SQL (Hive); NoSQL (HBase, Accumulo, Phoenix); Stream (Storm); In-memory (Spark); Others (ISV Engines); Tez; Slider. YARN: Data Operating System. DATA MANAGEMENT: HDFS (Hadoop Distributed File System). Deployment Choice: Linux, Windows, On-Premise, Cloud.
  • 63. Security Built into the Platform. Security is consistently administered across data access engines. Build or retire applications without impacting security. Hortonworks Data Platform 2.3 (same architecture diagram as the previous slide).
  • 64. Security in Hadoop with HDP. HDP 2.3: Centralized Security Administration with Ranger. Authentication (Who am I/prove it?): Kerberos; API security with Apache Knox. Authorization (What can I do?): Fine-grain access control with Apache Ranger. Audit (What did I do?): Centralized audit reporting with Apache Ranger. Data Protection (Can data be encrypted at rest and over the wire?): Wire encryption in Hadoop; HDFS Encryption with Ranger KMS.
  • 65. Apache Ranger: Comprehensive security for Enterprise Hadoop
  • 66. Centralized Security with Ranger. Centralized platform: • Centralized platform to define, administer and manage security policies consistently • Define security policy once and apply it to all the applicable components across the stack
  • 68. Centralized Security with Ranger. Centralized platform: • Centralized platform to define, administer and manage security policies consistently • Define security policy once and apply it to all the applicable components across the stack. Fine-grained security definition: • Administer security for: – Database – Table – Column – LDAP Groups – Specific Users
  • 70. Centralized Security with Ranger. Centralized platform: • Centralized platform to define, administer and manage security policies consistently • Define security policy once and apply it to all the applicable components across the stack. Fine-grained security definition: • Administer security for: – Database – Table – Column – LDAP Groups – Specific Users. Deep visibility: • Administrators have complete visibility into the security administration process
  • 72. Authorization and Auditing with Ranger (architecture diagram). Enterprise Users; Ranger Administration Portal; Ranger Policy Server; Ranger Audit Server; Integration API; Legacy Tools and Data Governance; RDBMS; HDFS. Hadoop Components, each with a Ranger Plugin: HDFS, HBase, Hive Server2, Knox, Storm, Solr, Kafka, YARN, TBD. Legend: Currently Supported in HDP 2.2; Future Additions.
  • 73. Apache Atlas is Now Included in HDP. Apache Atlas: Knowledge Store; Audit Store; Models; Type-System; Policy Rules; Taxonomies; Tag Based Policies; Data Lifecycle Management; Real Time Tag Based Access Control; REST API; Services: Search, Lineage, Exchange. Healthcare: HIPAA, HL7. Financial: SOX, Dodd-Frank. Energy: PPDM. Retail: PCI, PII. Other: CWM. REST API: Modern, flexible access to Atlas services, HDP components and external tools. Search: SQL-like DSL (Domain Specific Language); support for keyword, faceted and full text searches. Lineage: Capture all SQL runtime activity on HiveServer2, providing lineage for both data and schema. Exchange: Leverage existing metadata by importing it from ETL tools, ERP systems and data warehouses; export metadata to downstream systems.
  • 74. Apache Atlas Vision 2015. Metadata Services: Business Taxonomy (classification); Operational Data (Model for Hive: DB, Tables, Col); Centralized location for all metadata inside HDP; Single Interface point for Metadata Exchange with platforms outside of HDP; Search & Prescriptive Lineage (Model and Audit). Apache Atlas; Hive; Ranger; Falcon; Kafka; Storm
  • 75. Summary: The Insurance Data Landscape has Changed • The insurance industry is joining and analyzing data which has never been analyzed before • Many of these sources can be “murky” and sensitive • Traditional PII/PHI data sources ingested into Hadoop need to be: Discovered; Protected • Protecting PII/PHI data is not optional for Insurers, TPAs and Brokers; it is a Requirement
  • 76. Questions?
  • 77. Call to Action. Additional Information: • Data Protection Optimized for Insurance Big Data – A Dataguise and Hortonworks Capability Overview • Hortonworks: Comprehensive Security in Hadoop – Solving Security in Hadoop Whitepaper • Hortonworks: Building Governance into Big Data – Whitepaper
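
The discovery samples on slide 20 and the masking and redaction options on slide 21 can be illustrated with a minimal Python sketch. This is not DgSecure's implementation: the patterns, the function names (`detect`, `mask_digits`, `redact`, `protect`) and the HMAC-based digit substitution are assumptions chosen for illustration. The patterns cover only the telephone and 15-digit card formats shown on slide 20 plus a plain SSN layout, and a production detector would need validation and context checks well beyond regular expressions.

```python
import hashlib
import hmac
import re

# Hypothetical patterns for illustration only. They cover the sample
# formats from slide 20 and a dashed SSN layout, nothing more.
PATTERNS = {
    "credit_card": re.compile(r"(?<!\d)3[47]\d{2}[ -]?\d{6}[ -]?\d{5}(?!\d)"),
    "ssn": re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)"),
    "telephone": re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.]?)\d{3}[-.]?\d{4}(?!\d)"
    ),
}


def detect(text):
    """Return (type, matched_text) pairs for every pattern hit in text."""
    hits = []
    for name, pattern in PATTERNS.items():
        hits.extend((name, m.group(0)) for m in pattern.finditer(text))
    return hits


def mask_digits(value, key):
    """Consistent, one-way masking: replace each digit with one derived
    from a keyed hash of the value's digits, keeping separators in place.
    The same digits and key always yield the same masked digits."""
    digits = "".join(ch for ch in value if ch.isdigit())
    digest = hmac.new(key, digits.encode("utf-8"), hashlib.sha256).digest()
    out = []
    i = 0
    for ch in value:
        if ch.isdigit():
            out.append(str(digest[i % len(digest)] % 10))
            i += 1
        else:
            out.append(ch)
    return "".join(out)


def redact(value):
    """Redaction: X out the entire sensitive value."""
    return "X" * len(value)


def protect(text, key):
    """Mask every detected element in place, leaving other text untouched."""
    for pattern in PATTERNS.values():
        text = pattern.sub(lambda m: mask_digits(m.group(0), key), text)
    return text


if __name__ == "__main__":
    demo_key = b"demo-key"  # placeholder; a real key would come from a KMS
    print(detect("Call (510) 824-1036 about card 3780-331072-30547"))
    print(protect("SSN 123-45-6789, phone 510.814.1036", demo_key))
```

Because the keyed hash is computed over the digits alone, the same number written as `510-824-1036` or `5108241036` masks to the same digits, which is one way to keep joins and frequency counts usable after masking. Unlike the encryption option on slide 21, this substitution cannot be reversed, even by authorized users.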