The Future of Hadoop Security - Hadoop Summit 2014

Hadoop deployments are rapidly moving from pilots to production, enabling unprecedented opportunity to build big data applications that deliver faster access to more information to more users than ever before possible. Yet without the ability to address data security and compliance regulations, Hadoop will be limited to another data silo.

In this talk, Matt Brandwein and David Tishgart discuss the requirements for securing Hadoop and how Cloudera (now with Gazzang) and Intel are collaborating in the open to deliver comprehensive, transparent, compliance-ready security to unlock the potential of the Hadoop ecosystem and enable innovation without compromise.

Transcript

  • 1. The Future of Hadoop Security Matt Brandwein @mattbrandwein David Tishgart @dtish
  • 2. ©2014 Cloudera, Inc. All rights reserved.
  • 5. Hadoop is at risk of becoming another silo
      • Trusted Data Zone: sensitive data, multi-tenant access
      • Hadoop “Data Lake” or Sandbox: non-sensitive data, few users
      • RDBMS
  • 6.
      ✔ Meet compliance requirements
      ✔ Innovate without compromise
      ✔ Comprehensive security for all data
  • 7. Key Requirements for Security in Hadoop
      • Perimeter: guarding access to the cluster itself. Technical concepts: authentication, network isolation.
      • Data: protecting data in the cluster from unauthorized visibility. Technical concepts: encryption, tokenization, data masking.
      • Access: defining what users and applications can do with data. Technical concepts: permissions, authorization.
      • Visibility: reporting on where data came from and how it’s being used. Technical concepts: auditing, lineage.
  • 8. Key Requirements for Security in Hadoop
      • Perimeter (Kerberos | AD/LDAP): guarding access to the cluster itself. Technical concepts: authentication, network isolation.
          • Today: first to market with Kerberos authentication
          • Roadmap: fully automated Kerberos that leverages the existing Active Directory environment
      • Data: protecting data in the cluster from unauthorized visibility. Technical concepts: encryption, tokenization, data masking.
      • Access: defining what users and applications can do with data. Technical concepts: permissions, authorization.
      • Visibility: reporting on where data came from and how it’s being used. Technical concepts: auditing, lineage.
  • 9. Key Requirements for Security in Hadoop
      • Perimeter (Kerberos | AD/LDAP): guarding access to the cluster itself. Technical concepts: authentication, network isolation.
      • Data (Encrypt | Key Trustee): protecting data in the cluster from unauthorized visibility. Technical concepts: encryption, tokenization, data masking.
      • Access (Rhino | Sentry): defining what users and applications can do with data. Technical concepts: permissions, authorization.
          • Today: unified authorization for Hive, Impala, and Search through Apache Sentry
          • Roadmap: unified authorization across all access paths to data and metadata (Apache Sentry expansion)
      • Visibility (Cloudera Navigator): reporting on where data came from and how it’s being used. Technical concepts: auditing, lineage.
  • 10. Key Requirements for Security in Hadoop
      • Perimeter (Kerberos | AD/LDAP): guarding access to the cluster itself. Technical concepts: authentication, network isolation.
      • Data (Encrypt | Key Trustee): protecting data in the cluster from unauthorized visibility. Technical concepts: encryption, tokenization, data masking.
      • Access (Sentry): defining what users and applications can do with data. Technical concepts: permissions, authorization.
      • Visibility (Cloudera Navigator): reporting on where data came from and how it’s being used. Technical concepts: auditing, lineage.
          • Today: first in the market with centralized audit capabilities
          • Roadmap: extend capabilities to cover more workloads, including Spark
  • 11. About Gazzang
      • Founded: 2010
      • Security: Singular product focus and a pillar of company culture. Security is at the front of everything we do.
      • Big Data Expertise: While other security vendors retrofit their solutions for big data, Gazzang’s solutions are designed for the specific demands of Hadoop and NoSQL systems.
      • Customer Success: Nearly 200 paying customers, including several in the Fortune 1000
      • Named a 2014 Cool Vendor in Big Data by Gartner
  • 12. Hadoop Security Challenges: “I need to meet [insert acronym here] compliance”
      • We can ensure sensitive data and encryption keys are never stored in plain text or exposed publicly
      • We can enable compliance initiatives (HIPAA, PCI-DSS, SOX, FERPA, EU data protection) that require at-rest encryption and key management
  • 13. Not all Data Security is Created Equal
      When thinking about compliance, consider the following:
      • Are your encryption processes (algorithm, key length) consistent with NIST Special Publication 800-111?
      • Are the encryption keys stored on a separate device or location from the encrypted data?
      • What kind of authentication and access controls are enforced?
      • Is the data secured in a manner that would enable you to claim “safe harbor” in the event of a breach?
      • Do the crypto modules meet FIPS 140-2 certification?
      • Can you account for all the sensitive data that may fall under compliance scope?
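A few of the checklist questions on slide 13 lend themselves to an automated pre-check. The sketch below is only an illustration of that idea: the `EncryptionSetup` fields, the approved-algorithm table, and the host names are assumptions made for this example and are not part of any Cloudera or Gazzang tool. An empty result means only that this toy check found nothing, not that a deployment is compliant.

```python
from dataclasses import dataclass
from typing import List

# Assumed policy for this sketch: AES with a 128-, 192- or 256-bit key.
APPROVED_KEY_LENGTHS = {"AES": {128, 192, 256}}


@dataclass
class EncryptionSetup:
    algorithm: str          # e.g. "AES"
    key_length_bits: int    # e.g. 256
    data_host: str          # where the encrypted data lives
    key_host: str           # where the encryption keys live
    fips_validated: bool    # does the crypto module claim FIPS 140-2 validation?


def compliance_gaps(setup: EncryptionSetup) -> List[str]:
    """Return human-readable gaps; an empty list means no gap was found."""
    gaps = []
    allowed = APPROVED_KEY_LENGTHS.get(setup.algorithm.upper(), set())
    if setup.key_length_bits not in allowed:
        gaps.append("algorithm or key length is not on the approved list")
    if setup.key_host == setup.data_host:
        gaps.append("keys are stored on the same host as the encrypted data")
    if not setup.fips_validated:
        gaps.append("crypto module is not FIPS 140-2 validated")
    return gaps
```

Calling `compliance_gaps` on a setup that keeps keys and data on the same host reports that as a gap, which mirrors the "separate device or location" question above.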
  • 14. Hadoop Security Challenges: “I want security that won’t impose a harsh penalty”
      • We provide a transparent layer between the application and the file system that dramatically reduces the performance impact of encryption
      • We can make sure only applications that need access to plaintext data will have it
  • 15. Hadoop Security Challenges: “I need a centralized way to manage all my Hadoop security artifacts”
      • Navigator Key Trustee provides cluster-level security, managing the growing volumes of Hadoop encryption keys, certificates, and passwords
      • We can help you bring sensitive digital artifacts under a consistent set of controls and policies
  • 16. Hadoop Security Challenges: “It’s critical that no unauthorized parties can access my data”
      • Navigator Encrypt can prevent admins and super users from accessing encrypted data
      • You can establish a variety of key retrieval policies that dictate who or what can access the secure artifact
  • 17. How does it work?
      Navigator Encrypt provides transparent encryption for Hadoop data as it’s written to disk
      • AES-256 encryption for HDFS data, Hive metadata, log files, ingest paths, etc.
      • Process-based ACLs
      • High-performance optimized on Intel
      • Fast, easy deployment and configuration
      • Enterprise scalability
      • Keys protected by Navigator Key Trustee
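The "process-based ACLs" item on slide 17 can be pictured as a check keyed on which executable is asking for plaintext rather than on which user runs it. The following is a minimal sketch of that idea only; `AclRule`, `may_read_plaintext`, and the example paths are hypothetical names chosen for illustration and do not reflect Navigator Encrypt's actual rule format or enforcement mechanism.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class AclRule:
    path_pattern: str   # glob over encrypted paths, e.g. "/data/hdfs/*"
    executable: str     # absolute path of the binary allowed to read plaintext


def may_read_plaintext(rules, executable, path):
    """Allow access only if some rule names this executable for this path."""
    return any(
        rule.executable == executable and fnmatch(path, rule.path_pattern)
        for rule in rules
    )


# Hypothetical rule set: only the JVM binary may read data under /data/hdfs.
RULES = [AclRule("/data/hdfs/*", "/usr/bin/java")]
```

Because the decision depends on the executable, a request from `/bin/cat` against the same path is denied in this sketch regardless of who launched it, which is the intuition behind keeping admins and super users away from plaintext.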
  • 18. How does it work?
      Navigator Key Trustee is a “virtual safe-deposit box” for managing encryption keys or any other Hadoop security artifact
      • Separates keys from encrypted data
      • Centralized management of SSL certificates, SSH keys, tokens, passwords, Kerberos keytab files and more
      • Unique “trustee” and machine-based policies deliver multifactor authentication
      • Integration with HSMs from Thales, RSA and SafeNet
      • Multiple deployment options, including on-prem or a hosted SaaS offering
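The combination of key separation with "trustee" and machine-based policies on slide 18 can be sketched as a key store that releases a key only when two independent conditions hold. This is a toy model under stated assumptions: `KeyVault`, its methods, and the names `datanode-1` and `alice` are invented for this example and are not the Key Trustee API.

```python
import secrets


class KeyVault:
    """Toy key store that lives apart from the encrypted data."""

    def __init__(self):
        self._keys = {}       # key name -> key bytes
        self._policies = {}   # key name -> (allowed machines, required trustees)

    def deposit(self, name, allowed_machines, required_trustees):
        """Create a random 256-bit key and record its retrieval policy."""
        self._keys[name] = secrets.token_bytes(32)
        self._policies[name] = (set(allowed_machines), set(required_trustees))

    def retrieve(self, name, machine, approvals):
        """Release the key only if the machine is allowed AND every trustee approved."""
        machines, trustees = self._policies[name]
        if machine not in machines:
            raise PermissionError("machine is not authorized for this key")
        if not trustees.issubset(approvals):
            raise PermissionError("missing trustee approval")
        return self._keys[name]
```

For example, after `vault.deposit("hdfs-key", ["datanode-1"], ["alice"])`, a call to `vault.retrieve("hdfs-key", "datanode-1", {"alice"})` returns the key, while a request from any other machine, or without the approval from `alice`, raises `PermissionError`.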
  • 19. Introducing the Cloudera Center for Security Excellence
      • Based in Austin, Texas
      • Comprehensive data and cluster security technologies
      • Hadoop security test and certification lab
      • Security ecosystem partner enablement
      • Intel chipset, cloud and virtualization security alignment
  • 20. Hadoop Security Successes
      • Health exchange for Minnesota
      • Using Cloudera to log, track and run analytics on interactions between case workers and consumers
      • The ability to drive data privacy and HIPAA compliance on Hadoop was a critical requirement and a key factor in the selection of Cloudera and Gazzang
      • Surprised by the performance and ease of use
      • Wanted to get to know its customers better in an effort to improve service and sniff out fraud
      • With a massive amount of personal and PCI data being collected, the company is encrypting everything in its Hadoop cluster
      • Data is segregated with Apache Sentry (incubating) and Kerberos, monitored by Cloudera Navigator and encrypted by Gazzang
      • Key manager and process-based ACLs enable separation of keys and data based on “business need to know”
  • 21. Key Requirements for Security in Hadoop
      • Perimeter (Kerberos | AD/LDAP): guarding access to the cluster itself. Technical concepts: authentication, network isolation.
      • Data (Encrypt | Key Trustee): protecting data in the cluster from unauthorized visibility. Technical concepts: encryption, tokenization, data masking.
          • Previous: Cloudera partners
          • Today: transparent encryption + enterprise key management
          • Roadmap: transparent encryption for HDFS (includes work through Project Rhino) + enterprise key management
      • Access (Sentry): defining what users and applications can do with data. Technical concepts: permissions, authorization.
      • Visibility (Cloudera Navigator): reporting on where data came from and how it’s being used. Technical concepts: auditing, lineage.
  • 22. Result: Cloudera is the most secure Hadoop platform
      • Perimeter (Kerberos | AD/LDAP): guarding access to the cluster itself. Technical concepts: authentication, network isolation.
      • Data (Encrypt | Key Trustee): protecting data in the cluster from unauthorized visibility. Technical concepts: encryption, tokenization, data masking.
      • Access (Rhino | Sentry): defining what users and applications can do with data. Technical concepts: permissions, authorization.
      • Visibility (Cloudera Navigator): reporting on where data came from and how it’s being used. Technical concepts: auditing, lineage.
  • 23. Cloudera Enterprise 5: Comprehensive, Transparent, Compliance-Ready Security
      • Analytic & Processing Engines: Batch Processing, Analytic MPP SQL, Search Engine, Machine Learning, Stream Processing, 3rd Party Apps
      • Workload & Resource Management
      • Unified Data Storage & Integration: Distributed Filesystem, Online NoSQL Database
      • Systems Management: End-to-End, Zero-Downtime System Administration
      • Security & Governance: Access Control (Authorization), Perimeter (Authentication), Data Protection (Encryption, Key Management), Data Lifecycle (BDR, Snapshots), Data Visibility (Audit, Lineage)
  • 25. Cloudera’s Vision for Hadoop Security
      • Comprehensive
          • Standards-based Authentication
          • Centralized, Granular Authorization
          • Native Data Protection
          • End-to-End Data Audit and Lineage
      • Compliance-Ready
          • Meet compliance requirements
          • HIPAA, PCI-DSS, …
          • Encryption and key management
      • Transparent
          • Security at the core
          • Minimal performance impact
          • Compatible with new components
          • Insight with compliance
  • 26. ©2014 Cloudera, Inc. All rights reserved. Thank you! @mattbrandwein @dtish Visit our booths to learn more: 26