Is Your Hadoop Environment Secure?

How do you protect the data in big data analytics projects?

Big data initiatives tend to focus on the volume, velocity, or variety of data, and the security of that data is often overlooked. Security is especially important in financial services, healthcare, and government, or whenever sensitive data is analyzed.

This webinar highlights:

  • Hadoop security landscape
  • Hadoop encryption, masking, and access control
  • Customer examples of securing Hadoop environments

  • Architectural issues: big data environments typically offer no finer granularity of access than the schema level, and they lack secure inter-node communication (create a separate layer so customers don’t have to worry about this). Hadoop security is developing: improvements to role-based HDFS security are in progress, and open source projects are just beginning. Vendors offer immature solutions: they try to apply traditional methods to Hadoop, create a central chokepoint instead of operating at the node level, and have few solutions in production.
  • The big data users I have spoken with about data security agreed that data masking at that scale is infeasible. Given the rate of data insertion (also called ‘velocity’), masking sensitive data before loading it into a cluster would require “an entire ETL cluster to front the Hadoop cluster”. But apparently it’s doable, and Netflix did just that – fronted its analytics cluster with a data transformation cluster, all within EC2. 500 nodes massaging data for another 500 nodes. While the ETL cluster is not used for masking, note that it is about the same size as the analysis cluster. It’s this one-to-one mapping that I often worry about with security. Ask yourself, “Do we need another whole cluster for masking?” No? Then what about NoSQL activity monitoring? What about IAM, application monitoring, and any other security tasks? Do you start to see the problem with bolting on security? Logging and auditing are embeddable – most everything else is not.
  • Kerberos provides mutual authentication: both the user and the server verify each other’s identity. Gazzang: block-level encryption for big data.
  • A proper PKI infrastructure inside an organization. Certificates: the warning screen – users are used to certificate warnings. CDH4.1 – Kerberos, SSL. Disable Hadoop web access.
  • Object types. Unix-based permissions. Results sharing. Easy to understand and audit.
  • Granular roles, per type of object and not just per object. Example: a Hadoop admin role can access Hadoop settings and create import jobs, but does not have access to data.
  • Different group in an organization – more security: Hadoop admins do not have rights to add/remove p. from groups.
  • Join to a company AD infrastructure. Adopted by Hadoop as an authentication mechanism. Integration with other services across platforms, for example ZooKeeper and MSSQL services.
  • Delegation – Datameer can run jobs as the owner of the job. With impersonation, only the owner can access his own files. When a user is deleted from the system …… Jobs are run as the owner of the job and stored.
  • Show access rights, role screen, LDAP screen, Kerberos setup.
  • Intel – implemented in the Hadoop API. Young project – the future will show if others participate – Cloudera …. Others: Voltage, Protegrity – not open source and not widely used.
  • Detailed information about user access. Detailed information about job runs – dependent on Hadoop logs.
  • Datameer for big data. Use Datameer to analyze Datameer access logs: abnormality detection, security breach detection, behavior analysis. * HDFS - Hadoop
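
The last note mentions analyzing access logs for abnormality detection. As a minimal sketch of that idea, assume the access events have already been parsed into (user, resource) pairs. That record format, the function name `unusual_users`, and the `factor` parameter are all hypothetical and do not come from Datameer or Hadoop. The Python below flags users whose access count is far above the median:

```python
from collections import Counter
from statistics import median


def unusual_users(events, factor=5.0):
    """Return users whose access count exceeds `factor` times the median count.

    `events` is an iterable of (user, resource) pairs. This layout is an
    assumption made for illustration, not a real Hadoop or Datameer log format.
    """
    # Count how many access events each user generated.
    counts = Counter(user for user, _resource in events)
    if not counts:
        return []
    # The median is used as the "typical" activity level because a single
    # very active user does not shift it much.
    typical = median(counts.values())
    return sorted(user for user, n in counts.items() if n > factor * typical)
```

A real deployment would likely look at more than raw counts (time of day, resources touched, failed logins), but the same pattern applies: build a baseline from the logs, then report what deviates from it.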

Is Your Hadoop Environment Secure? Presentation Transcript

  • 1. Building Secure Hadoop Environments © 2012 Datameer, Inc. All rights reserved.
  • 2. View the full recording. You can view the full recording of this on-demand webinar with slides at: http://info.datameer.com/Slideshare-BuildingSecure-Hadoop-Environments.html
  • 3. About our Speaker: Karen Hsu. With over 15 years of experience in enterprise software, Karen Hsu has coauthored 4 patents and worked in a variety of engineering, marketing and sales roles. Most recently she came from Informatica, where she worked with the start-ups Informatica purchased to bring data quality, master data management, B2B and data security solutions to market. Karen has a Bachelor of Science degree in Management Science and Engineering from Stanford University.
  • 4. About our Speaker: Filip Slunecko. Filip is part of the customer support team at Datameer. He is a Linux professional and Python enthusiast. Before joining Datameer, he was on the Hadoop team at AVG, an antivirus/security company. Filip now uses his 8 years of experience with Linux servers and Hadoop security to help Datameer customers.
  • 5. Building Secure Hadoop Environments
  • 6. Agenda: Challenges and use cases; Hadoop security landscape; Components for building successful Hadoop environments; Call to Action
  • 7. Hadoop Data Security Challenges: Architectural issues; Hadoop security is developing; Vendors offer bolt-on solutions. “To add security capabilities into a big data environment, the capabilities need to scale with the data… Most security tools fail to scale and perform with big data environments.” - Adrian Lane, Securosis, Oct 12, 2012
  • 8. Hadoop Security Use Cases
    Use case: Role based access. Requirement: Data access is restricted through the abstraction layer. Example description: Users have a view of data in Hadoop they can manipulate.
    Use case: Transformation of sensitive values during load. Requirement: Data is transformed, masked, or encrypted. Example description: Cluster is copied and then masked/transformed so that analysts work on anonymized data.
  • 9. Role Based Access (diagram: Data Access; Pig / Hive; Map-Reduce; Restrict View; HDFS)
  • 10. Transformation of Sensitive Values (diagram: Data Access; Load; Map-Reduce; Transform Data; HDFS)
  • 11. Hybrid of Role Based Access and Transformation of Sensitive Values (diagram: Data Access; Load; Map-Reduce; Transform; Restrict View; HDFS)
  • 12. Hadoop Security Offerings (columns: Type, Description, Example vendors)
    Role based access control: Use LDAP / Active Directory (AD) authentication to identify and manage users. Leverage Kerberos to provide mutual authentication.
    Encryption: Block level encryption; Linux directory level encryption with external key store; File encryption; Disk encryption; Format preserving encryption.
    Masking: Data masking performed before load.
  • 13. Components for Building Secure Hadoop Environment: Secure access – SSL; Access controls; Secure authentication; Kerberos; Logging – auditing; File encryption; Disk encryption
  • 14. Secure access
  • 15. Access Controls (Datameer example): Object permission; Roles; LDAP; Kerberos; Impersonation
  • 16. Object Permission (Datameer example). Object types: Import jobs; Data links; Workbooks; Export job; Info graphics
  • 17. Roles (Datameer example)
  • 18. Remote Authenticator (Datameer example): Integrating into an existing infrastructure; Active Directory support; Import groups and users to Datameer; Centralized user management
  • 19. Kerberos
  • 20. Impersonation
  • 21. Demonstration
  • 22. Disk Encryption. Why it’s important: 1 year - 2%; 2 year - 6-8%. Criteria for success: Encryption per process; Key management; Safe and in full compliance with HIPAA, PCI DSS, FERPA
  • 23. File Encryption: Emerging Technology. Intel Hadoop Project Rhino: Encryption and key management; A common authorization framework; Token based authentication and single sign on; Improve audit logging.
  • 24. Logging and Auditing. Datameer: UI access; Job execution. Hadoop: File access; Job runs
  • 25. Logging and Auditing: Centralized logging. Columns: Collectors, Storage, Real Time Search, Visualization. Tools listed: Datameer, Datameer*, Katta, Datameer, Splunk, Splunk, Elasticsearch, Splunk, Flume, Elasticsearch, Solr, Greylog, Greylog, Solr, Graphite, Hive
  • 26. Recap: Challenges and use cases; Hadoop security landscape; Components for building successful Hadoop environments (Secure access – SSL; Access controls; Secure authentication; Kerberos; Logging – auditing; File encryption; Disk encryption)
  • 27. Call to Action. Contact: Filip Slunecko fslunecko@datameer.com; Karen Hsu khsu@datameer.com. Implementing Hadoop Security Workshop: contact marketing@datameer.com for more details. Meet us at Discover Big Data 8 City Workshop near you! http://info.datameer.com/Discover-Big-Data-RoadShow.html www.datameer.com
  • 28. Online Resources: Try Datameer: www.datameer.com; Follow us on Twitter @datameer
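
The "transformation of sensitive values during load" use case (slides 8 and 10) can be illustrated with a small sketch. The slides do not say which masking method is used. The keyed hash (HMAC-SHA256) below is one common pseudonymization technique, swapped in purely for illustration, and the field names, the key, and the function names `pseudonymize` and `mask_record` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical names of the fields that must not reach the cluster in clear text.
SENSITIVE_FIELDS = {"ssn", "email"}


def pseudonymize(value, key):
    """Replace a value with a keyed SHA-256 digest (hex string)."""
    return hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()


def mask_record(record, key):
    """Return a copy of `record` with the sensitive fields pseudonymized.

    Non-sensitive fields are passed through unchanged, and the input
    dictionary is not modified.
    """
    return {
        name: pseudonymize(value, key) if name in SENSITIVE_FIELDS else value
        for name, value in record.items()
    }
```

Because the same key and input always produce the same token, analysts can still join and group on the masked column. The approach only protects the data if the key is kept outside the cluster that holds the masked records.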