Taking Hadoop to Enterprise Security Standards
Upcoming SlideShare
Loading in...5
×
 

Taking Hadoop to Enterprise Security Standards

on

  • 560 views

 

Statistics

Views

Total Views
560
Views on SlideShare
518
Embed Views
42

Actions

Likes
2
Downloads
38
Comments
0

1 Embed 42

http://www.scoop.it 42

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Taking Hadoop to Enterprise Security Standards Taking Hadoop to Enterprise Security Standards Presentation Transcript

  • ©2014 LinkedIn Corporation. All Rights Reserved. Taking Hadoop to Enterprise Security Standards
  • Access Control
  • How many of you need or have access control in Hadoop?
  • ©2014 LinkedIn Corporation. All Rights Reserved. Users First Internal Threat Keeping Data Secure External Threat
  • More granular the access controls are more people can have access to the data
  • ©2014 LinkedIn Corporation. All Rights Reserved. Hadoop – Status Quo Multiple Query Execution Engines Custom Code Execution Auditing
  • ©2014 LinkedIn Corporation. All Rights Reserved. User ID Email Address IP address Billing address Security Customer Service Data Scientist Adding & Removing group membership can take up to few hours HDFS file permissions are very coarse (at file level) HDFS File Permissions
  • ©2014 LinkedIn Corporation. All Rights Reserved. Other Access Control Solutions
  • ©2014 LinkedIn Corporation. All Rights Reserved. Mixed Data Multiple Data Processing Systems Data for Everyone Challenges
  • ©2014 LinkedIn Corporation. All Rights Reserved. Extensible Authorization Fine Grain Control Fast Changes to Authorization Rules What do we need?
  • ©2014 LinkedIn Corporation. All Rights Reserved. Our Solution: Access Control via Encryption Apache Kafka HDFS Key Server Parquet ETLEncrypted Events
  • ©2014 LinkedIn Corporation. All Rights Reserved. User A’s Job User B’s Job User C’s Job Producer Job ETL User Parquet File User Columns A 5 B 2, 5 Key Server Access Control via Encryption
  • ©2014 LinkedIn Corporation. All Rights Reserved. Columnar Storage Page 0 Page 1 Page 2 Column a Column b Rowgroup Parquet Format Brief Overview of Parquet
  • ©2014 LinkedIn Corporation. All Rights Reserved. *Yet to be integrated into open source Parquet Field mode Page Column | Page Mode | Hybrid Mode Encryption Support in Parquet*
  • ©2014 LinkedIn Corporation. All Rights Reserved. Examples  Emails – Analysts need it to join with other tables but may not require access to individual emails N Values (Page) Encrypt each value at a time karthik@gmail.com harsh@gmail.com harsh@gmail.com arvind@gmail.com xxxxxxx yyyyyyy yyyyyyy zzzzzzz Field Mode
  • ©2014 LinkedIn Corporation. All Rights Reserved. Field Mode Joins Counts Distribution Analysis No/Low compression
  • ©2014 LinkedIn Corporation. All Rights Reserved. Page Mode  No information is leaked except entropy of the data  Better performance than other modes N Values (Page) Encode Compress Encrypt
  • ©2014 LinkedIn Corporation. All Rights Reserved. Hybrid Mode  More fine grain control of information  Increase in overhead due to double encryption/decryption N Values (Page) Encrypt each value Encrypt
  • ©2014 LinkedIn Corporation. All Rights Reserved. Plain Text | Encrypted Value | No Access Field Mode Page Mode Hybrid Mode
  • ©2014 LinkedIn Corporation. All Rights Reserved. Key Versioning  Each key is versioned and specific for a source (File/Event name)  Reduces the exposure incase of key leakage  Time based access control – All users by default can access only last 30 days of data – Give users access to data in specific time period  Authentication of producers can be done separately
  • ©2014 LinkedIn Corporation. All Rights Reserved. Better Auditing Coverage Retention Enforcement Key Server Features Multifactor Authentication
  • ©2014 LinkedIn Corporation. All Rights Reserved. PIG Usage
  • Thank you!