April 2014 HUG : Apache Sentry
April 2014 HUG : Apache Sentry Presentation Transcript

  • 1. Apache Sentry: Enterprise-grade Security for Hadoop | Xuefu Zhang, Sravya Tirukkovalur | Cloudera | April 16, 2014
  • 2. Outline • Introduction • Hadoop security primer • Authentication • Authorization • Data Protection • Governance and Auditing • Introducing Apache Sentry • What's Sentry • Sentry Architecture • Sentry Internals • Future Work • Demo • Q&A
  • 3. Introduction ● Hadoop gets bigger ... ● Hadoop has been enjoying an increasing adoption rate ● More and more data on Hadoop clusters ● More and more access to the data ● Data warehouse offload is the most common use case ● Apache Hive, Apache Drill, Cloudera Impala ● SQL on Hadoop is a phenomenon
  • 4. Introduction (cont'd) ● But more encumbrances ... ● Enterprises want to protect sensitive data ● Government regulations and compliance requirements, such as HIPAA, PII, FISMA ● Existing security problems with Hadoop have hindered adoption ● Security has become the top priority
  • 5. Introduction (cont'd) ● Reality is ... ● Different components, different security mechanisms ● Multiple components may access the same data set ● Hadoop was born out of trust, not security ● Think of Windows
  • 6. Outline • Introduction • Hadoop security primer • Authentication • Authorization • Data Protection • Governance and Auditing • Introducing Apache Sentry • What's Sentry • Sentry Architecture • Sentry Internals • Future Work • Demo • Q&A
  • 7. Hadoop Security Primer • Authentication ● Identify who you are ● Untrusted users have no access to the cluster network ● In a trusted network, everyone is a good citizen ● Who you are is determined by the client host
  • 8. Hadoop Security Primer • Strong Authentication ● Kerberos ● LDAP, Active Directory ● LDAP and AD integrated with Kerberos, establishing a single source of truth ● Single sign-on
  • 9. Hadoop Security Primer (cont'd) • Kerberos ● Strong authentication ● Provides mutual authentication ● Protects against eavesdropping and replay attacks ● Every user and service has a Kerberos “principal” ● Credentials: keytabs (service), password (user)
  • 10. Hadoop Security Primer (cont'd) • Authorization ● Determine if you can access ● HDFS: POSIX-style permissions (R/W/X for U/G/O), coarse-grained ● Other components have their own authorization ● MR job queue ACLs ● HBase ACLs on table and column family ● Accumulo provides cell-level access control ● Impersonation
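To make the R/W/X for U/G/O model on this slide concrete, here is a minimal Python sketch of a POSIX-style check. It is an illustration only, not HDFS code: `FileStatus`, `is_allowed`, and the sample users and groups are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Permission bits within one rwx triplet.
READ, WRITE, EXECUTE = 4, 2, 1


@dataclass(frozen=True)
class FileStatus:
    """Hypothetical stand-in for a file's owner, group, and mode."""
    owner: str
    group: str
    mode: int  # octal mode such as 0o750


def is_allowed(status, user, user_groups, wanted):
    """Select the owner, group, or other triplet, then test the wanted bit."""
    if user == status.owner:
        bits = (status.mode >> 6) & 0o7
    elif status.group in user_groups:
        bits = (status.mode >> 3) & 0o7
    else:
        bits = status.mode & 0o7
    return (bits & wanted) == wanted


status = FileStatus(owner="etl", group="analysts", mode=0o750)
print(is_allowed(status, "etl", set(), WRITE))            # owner has rwx
print(is_allowed(status, "alice", {"analysts"}, READ))    # group has r-x
print(is_allowed(status, "alice", {"analysts"}, WRITE))   # group lacks w
print(is_allowed(status, "bob", {"guests"}, READ))        # other has ---
```

The check only distinguishes three classes of caller per file, which is why the slide calls it coarse-grained: there is no way to express "this group may read column A but not column B".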
  • 11. Hadoop Security Primer (cont'd) • Data Protection ● Data at rest and in transit ● Hadoop provides encryption for data in transit: DTP, HTTP, RPC, JDBC/ODBC ● Hadoop has no native encryption for data at rest (HDFS-6134) ● Relies on OS-level encryption
  • 12. Hadoop Security Primer (cont'd) • Governance and auditing ● Again, component by component ● HDFS and MapReduce provide basic audit support ● Apache Hive metastore records audit (who/when) information for Hive interactions ● Apache Oozie provides an audit trail for services
  • 13. Outline • Introduction • Hadoop security primer • Authentication • Authorization • Data Protection • Governance and Auditing • Introducing Apache Sentry • What's Sentry • Sentry Architecture • Sentry Internals • Future Work • Demo • Q&A
  • 14. Introducing Apache Sentry ● Hadoop Authorization ● Existing authorization is fragmented, coarse-grained, and manual ● Data is often left unprotected for the sake of simplicity ● Enterprises need a centralized authorization component that works across components and is easy to use, fine-grained, and role-based
  • 15. Introducing Apache Sentry (cont'd) ● What's Sentry ● Sentry is an authorization module for Hive, Search, Impala, and beyond ● It meets key RBAC requirements: secure, fine-grained, role-based authorization and multi-tenant administration ● Open source, Apache Incubator project ● Ecosystem support: Apache Solr, HiveServer2, and Impala 1.1+
  • 16. Introducing Apache Sentry (cont'd) ● Key Benefits ● Store sensitive data in Hadoop ● Extend Hadoop to more users ● Comply with regulations
  • 17. Introducing Apache Sentry (cont'd) ● Key Capabilities ● Fine-grained: SERVERS, DATABASES, TABLES & VIEWS; INDEXES, COLLECTIONS ● Role-based: roles include privileges such as SELECT, INSERT, ALL; UPDATE, QUERY ● Multi-tenant administration ● Separate policies for each database/schema ● Can be maintained by separate admins
  • 18. Introducing Apache Sentry (cont'd) ● Sentry Architecture [Figure; labels shown: Impala, HiveServer2, Search (Solr), Pig, …; Binding Layer (Impala, Hive); Authorization Provider; Policy Engine; Policy Provider (File, Database); Local FS/HDFS]
  • 19. Introducing Apache Sentry (cont'd) ● Query flow: Query (SQL) → Parse → Build → Check → Plan → MR ● Parse: validate SQL grammar ● Build: construct statement tree ● Check: validate statement objects; first check: authorization (Sentry) ● Plan: forward to execution planner
  • 20. Introducing Apache Sentry (cont'd) • Actors ● User ● User group membership ● Resources ● Privilege ● Role
  • 21. Introducing Apache Sentry (cont'd) • User ● User is authenticated ● User identity is obtained from the session context
  • 22. Introducing Apache Sentry (cont'd) • User group membership ● Defined outside the Sentry policy ● Obtained from a user directory (LDAP, AD, HDFS) ● May be available from the session context
  • 23. Introducing Apache Sentry (cont'd) • Resources ● Data to be protected ● Files or directories on HDFS ● Tables or views in Hive ● URIs ● Resources can be hierarchical
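One way to picture "resources can be hierarchical" is to treat each resource as a path from the top-level scope down to the object. The Python sketch below assumes that a grant on a parent (for example a database) also covers its children (its tables); that inheritance rule, the `covers` helper, and the sample names are assumptions made for illustration, not taken from the slides.

```python
def covers(granted, requested):
    """Return True if `granted` is the same resource as `requested` or one of
    its ancestors. Resources are tuples ordered from outermost to innermost."""
    return len(granted) <= len(requested) and requested[:len(granted)] == granted


# Hypothetical server/database/table names.
sales_db = ("server1", "sales")
orders_table = ("server1", "sales", "orders")
hr_db = ("server1", "hr")

print(covers(sales_db, orders_table))   # database grant reaches its table
print(covers(orders_table, sales_db))   # table grant does not reach the database
print(covers(hr_db, orders_table))      # unrelated database
```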
  • 24. Introducing Apache Sentry (cont'd) • Privilege ● Action or operation associated with a resource ● Exists in a role only ● SELECT on a given TABLE or VIEW ● CREATE a TABLE or VIEW ● QUERY on a search COLLECTION ● DELETE a FILE or DIRECTORY ● Example: collection=customerCol->action=query
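The example privilege on this slide reads as a chain of key=value segments joined by `->`. A small Python sketch of splitting such a string follows; `parse_privilege` is a hypothetical helper written for this illustration and is not part of Sentry.

```python
def parse_privilege(text):
    """Split 'key=value->key=value' into an ordered list of (key, value) pairs."""
    pairs = []
    for part in text.split("->"):
        key, sep, value = part.strip().partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed privilege segment: {part!r}")
        pairs.append((key.lower(), value))
    return pairs


print(parse_privilege("collection=customerCol->action=query"))
# [('collection', 'customerCol'), ('action', 'query')]
```

Keeping the pairs in order preserves the resource-then-action reading of the string, which matters once resources have more than one level.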
  • 25. Introducing Apache Sentry (cont'd) • Roles ● A collection of privileges ● Defined in the Sentry policy ● Example:
    [roles]
    ana_query_role = collection=sentryColl->action=query
    ana_update_role = collection=sentryColl->action=update
    test_role = collection=testColl->action=update
    full_admin_role = collection=*
  • 26. Introducing Apache Sentry (cont'd) • (Group, Role) mapping ● Defined in the policy ● One-to-many: one group can map to several roles ● Example:
    [groups]
    analysts = ana_query_role, ana_update_role
    admins = full_admin_role
    testgroup = test_role
    hbase = full_admin_role
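The `[roles]` and `[groups]` examples look like INI sections, so a quick way to experiment with them is Python's standard `configparser`. This is a sketch for exploration, not how Sentry itself loads policies; `load_policy` is a hypothetical helper, the group and role names follow the slide examples, and splitting a role's value on commas to allow several privileges per role is an assumption.

```python
import configparser

POLICY = """
[groups]
analysts = ana_query_role, ana_update_role
admins = full_admin_role

[roles]
ana_query_role = collection=sentryColl->action=query
ana_update_role = collection=sentryColl->action=update
full_admin_role = collection=*
"""


def load_policy(text):
    """Return (group -> list of roles, role -> list of privilege strings)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep group and role names exactly as written
    parser.read_string(text)
    group_roles = {
        group: [name.strip() for name in value.split(",")]
        for group, value in parser["groups"].items()
    }
    role_privileges = {
        role: [priv.strip() for priv in value.split(",")]
        for role, value in parser["roles"].items()
    }
    return group_roles, role_privileges


group_roles, role_privileges = load_policy(POLICY)
print(group_roles["analysts"])
print(role_privileges["ana_query_role"])
```

`configparser` splits each line at the first `=`, so the later `=` signs inside a privilege string stay part of the value.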
  • 27. Introducing Apache Sentry (cont'd) • Rule evaluation ● Who is the user? ● Which group(s) does the user belong to? ● Which resource is to be accessed? ● How is the resource accessed (READ, SELECT, etc.)? ● Do any of the user's groups have a role with the right privilege? ● Yes: great, go ahead! ● No: sorry, insufficient privilege!
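The questions on this slide can be sketched as a short lookup chain: user to groups, groups to roles, roles to privileges, then a match against the requested resource and action. The Python below is a simplified illustration for search collections only. The dictionaries stand in for the user directory and the policy file, all names are hypothetical, and treating a privilege without an `action` part as "all actions" is an assumption.

```python
# Stand-in for a user directory such as LDAP or AD (hypothetical users).
USER_GROUPS = {"alice": ["analysts"], "root": ["admins"]}

# Stand-ins for the [groups] and [roles] policy sections.
GROUP_ROLES = {"analysts": ["ana_query_role"], "admins": ["full_admin_role"]}
ROLE_PRIVILEGES = {
    "ana_query_role": ["collection=sentryColl->action=query"],
    "full_admin_role": ["collection=*"],
}


def parse(privilege):
    """Turn 'collection=x->action=y' into {'collection': 'x', 'action': 'y'}."""
    return dict(part.split("=", 1) for part in privilege.split("->"))


def implies(granted, collection, action):
    """Check whether one granted privilege allows the requested access."""
    fields = parse(granted)
    collection_ok = fields.get("collection") in ("*", collection)
    # Assumption: a privilege with no action part allows every action.
    action_ok = fields.get("action", "*") in ("*", action)
    return collection_ok and action_ok


def is_authorized(user, collection, action):
    """Walk user -> groups -> roles -> privileges until one privilege matches."""
    for group in USER_GROUPS.get(user, []):
        for role in GROUP_ROLES.get(group, []):
            for granted in ROLE_PRIVILEGES.get(role, []):
                if implies(granted, collection, action):
                    return True
    return False


print(is_authorized("alice", "sentryColl", "query"))   # analyst may query
print(is_authorized("alice", "sentryColl", "update"))  # but not update
print(is_authorized("root", "testColl", "update"))     # admin wildcard
print(is_authorized("mallory", "sentryColl", "query")) # unknown user
```

Access is denied by default: a request succeeds only when some privilege reachable from the user's groups matches both the resource and the action.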
  • 28. Outline • Introduction • Hadoop security primer • Authentication • Authorization • Data Protection • Governance and Auditing • Introducing Apache Sentry • What's Sentry • Sentry Architecture • Sentry Internals • Future Work • Demo • Q&A
  • 29. Future Work ● Introduce Sentry to more Hadoop components for their authorization needs ● Centralized policy store aiming for the whole enterprise ● Grant/Revoke ● Centralized authorization service for all protected resources, including metadata ● We appreciate your contribution and support
  • 30. Outline • Introduction • Hadoop security primer • Authentication • Authorization • Data Protection • Governance and Auditing • Introducing Apache Sentry • What's Sentry • Sentry Architecture • Sentry Internals • Future Work • Demo • Q&A