Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apache Knox

Discover enterprise security features in Hortonworks Data Platform 2.1 (HDP) with Apache Knox

Published in: Technology
Transcript of "Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apache Knox"

  1. 1. Page 1 © Hortonworks Inc. 2014 Discover HDP 2.1 New Features for Security & Apache Knox Hortonworks. We do Hadoop.
  2. 2. Page 2 © Hortonworks Inc. 2014 Speakers Justin Sears Hortonworks Product Marketing Manager Vinay Shukla Hortonworks Director of Product Management & owner of Hortonworks security roadmap Kevin Minder Hortonworks Engineer & Committer for Apache Knox Gateway project
  3. 3. Page 3 © Hortonworks Inc. 2014 Agenda •  Security for Hadoop REST/HTTP API – Knox Gateway •  HDFS Security – ACLs •  SQL Security – Next Generation Hive Authorization
  4. 4. Page 4 © Hortonworks Inc. 2014 A Modern Data Architecture
     •  APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
     •  DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP) and ENTERPRISE HADOOP (Governance & Integration, Security, Operations, Data Access, Data Management)
     •  SOURCES: OLTP, ERP, CRM Systems; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data
     •  OPERATIONS TOOLS: Provision, Manage & Monitor
     •  DEV & DATA TOOLS: Build & Test
  5. 5. Page 5 © Hortonworks Inc. 2014 HDP 2.1: Enterprise Hadoop HDP 2.1 Hortonworks Data Platform
     •  OPERATIONS: Provision, Manage & Monitor (Ambari, Zookeeper); Scheduling (Oozie)
     •  GOVERNANCE & INTEGRATION: Data Workflow, Lifecycle & Governance (Falcon, Sqoop, Flume, NFS, WebHDFS)
     •  DATA ACCESS: Batch (MapReduce), Script (Pig), SQL (Hive/Tez, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm), Search (Solr), Others (In-Memory Analytics, ISV engines)
     •  YARN: Data Operating System
     •  DATA MANAGEMENT: HDFS (Hadoop Distributed File System), nodes 1 … N
     •  SECURITY: Authentication, Authorization, Accounting, Data Protection; Storage: HDFS, Resources: YARN, Access: Hive, …, Pipeline: Falcon, Cluster: Knox
  6. 6. Page 6 © Hortonworks Inc. 2014 HDP 2.1: Enterprise Hadoop HDP 2.1 Hortonworks Data Platform
     •  OPERATIONS: Provision, Manage & Monitor (Ambari, Zookeeper); Scheduling (Oozie)
     •  GOVERNANCE & INTEGRATION: Data Workflow, Lifecycle & Governance (Falcon, Sqoop, Flume, NFS, WebHDFS)
     •  DATA ACCESS: Batch (MapReduce), Script (Pig), SQL (Hive/Tez, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm), Search (Solr), Others (In-Memory Analytics, ISV engines)
     •  YARN: Data Operating System
     •  DATA MANAGEMENT: HDFS (Hadoop Distributed File System), nodes 1 … N
     •  SECURITY: Authentication, Authorization, Accounting, Data Protection; Storage: HDFS, Resources: YARN, Access: Hive, …, Pipeline: Falcon, Cluster: Knox
  7. 7. Page 7 © Hortonworks Inc. 2014 Security: Rings of Defense Perimeter Level Security •  Network Security (e.g. Firewalls) •  Apache Knox (i.e. Gateways) Authentication •  Kerberos OS Security Authorization •  MR ACLs •  HDFS Permissions •  HDFS ACLs •  Hive ATZ-NG •  HBase ACLs •  Accumulo Label Security Data Protection •  Core Hadoop •  Partners
  8. 8. Page 8 © Hortonworks Inc. 2014 Security for Hadoop REST API – Apache Knox Gateway
  9. 9. Page 9 © Hortonworks Inc. 2014 Current Hadoop Client Model •  FileSystem and MapReduce Java APIs •  HDFS, Pig, Hive and Oozie clients (that wrap the Java APIs) •  Typical use of APIs is via an “Edge Node” that is “inside” the cluster •  Users SSH to the Edge Node and execute API commands from a shell [Diagram: Hadoop User connects over SSH to the Edge Node]
  10. 10. Page 10 © Hortonworks Inc. 2014 Why Knox? Simplified Access Single Hadoop access point Rationalized REST API hierarchy Consolidated API calls Multi-cluster support Client DSL Centralized Security Eliminate SSH “edge node” Central API management & audit Service-level authorization Identity Management SSO Integration LDAP & AD integration Knox eliminates the client’s requirements for intimate knowledge of cluster topology
  11. 11. Page 11 © Hortonworks Inc. 2014 Hadoop REST API Security: Drill-Down [Diagram labels: REST Client; Enterprise Identity Provider (LDAP/AD); Knox Gateway instances (GW) with a load balancer (LB) in a DMZ; two Firewalls; Edge Node/Hadoop CLIs; protocols shown: HTTP, LDAP, RPC; two Hadoop clusters, each with Masters (NN, RM, Oozie, WebHCat, HS2, HBase) and Slaves (DN, NM)]
  12. 12. Page 12 © Hortonworks Inc. 2014 Knox Summary •  Simplifies Client Interaction with REST Web Services •  Abstracts away complexities of Kerberos •  Integrates with LDAP, SiteMinder & other protocols in future •  Provides Authorization to each Web Service with IP, User, Group policies •  Able to secure multiple clusters through a single endpoint
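
The single-endpoint idea summarized above can be sketched from the client side. This is a minimal illustration, assuming a Knox URL layout of `https://{host}:{port}/{gateway-path}/{topology}/webhdfs/v1/...` and HTTP Basic credentials that the gateway checks against LDAP. The host `knox.example.com`, port 8443, topology name `sandbox`, and the password are hypothetical placeholders, not values from the slides. The code only builds the request pieces and never contacts a cluster.

```python
import base64
from urllib.parse import quote, urlencode


def knox_webhdfs_url(host, port, topology, hdfs_path, op, gateway_path="gateway"):
    """Build a WebHDFS URL that is routed through a Knox gateway.

    The client only needs the gateway address and a topology (cluster) name,
    not the NameNode host or any other cluster-internal detail.
    """
    path = quote(hdfs_path.lstrip("/"))
    query = urlencode({"op": op})
    return f"https://{host}:{port}/{gateway_path}/{topology}/webhdfs/v1/{path}?{query}"


def basic_auth_header(user, password):
    """Return an HTTP Basic Authorization header for the given credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical values for illustration only.
url = knox_webhdfs_url("knox.example.com", 8443, "sandbox", "/sales-data", "LISTSTATUS")
headers = basic_auth_header("maya", "maya-password")
print(url)
```

Switching to another cluster behind the same gateway would, under these assumptions, only change the topology segment of the URL.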
  13. 13. Page 13 © Hortonworks Inc. 2014 HDFS Access Control List (ACL)
  14. 14. Page 14 © Hortonworks Inc. 2014 HDFS Permissions Model Before HDP 2.1 • HDFS permissions at a File & Directory level • Managed by a set of 3 distinct user classes – “owner”, “group” and “others” • 3 permissions for each user class – Read (“r”), Write (“w”), Execute (“x”) – For Files, “r” for read, “w” for write – For Directories, “r” to list content, “w” to create/delete files + directories, “x” to access a child of the directory [Diagram: HDFS Directory with rwx permission bits for Owner, Group and Others]
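
As a small illustration of the owner/group/others model described above, the following sketch splits a nine-character mode string into per-class permission sets. The function names `parse_mode` and `is_allowed` are made up for this example; this is not how HDFS itself evaluates permissions, only a way to read the notation.

```python
def parse_mode(mode):
    """Split a mode string such as 'rw-r-----' into permission sets per user class."""
    if len(mode) != 9:
        raise ValueError("expected 9 characters, e.g. 'rwxr-x---'")
    result = {}
    for index, user_class in enumerate(("owner", "group", "others")):
        triplet = mode[3 * index:3 * index + 3]
        # Keep a flag only when the expected letter appears in its slot.
        result[user_class] = {flag for flag, ch in zip("rwx", triplet) if ch == flag}
    return result


def is_allowed(mode, user_class, permission):
    """Return True if the given class holds the given permission ('r', 'w' or 'x')."""
    return permission in parse_mode(mode)[user_class]


# Owner may read and write, the group may only read, others get nothing.
print(parse_mode("rw-r-----"))
```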
  15. 15. Page 15 © Hortonworks Inc. 2014 HDFS File Permissions Example •  Authorization requirements: –  In a sales department, they would like a single user Maya (Department Manager) to control all modifications to sales data –  Other members of sales department need to view the data, but can’t modify it. –  Everyone else in the company must not be allowed to view the data. •  Can be implemented via the following: [Diagram: File with sales data; User: Read/Write perm for user maya; Group: Read perm for group sales]
  16. 16. Page 16 © Hortonworks Inc. 2014 HDFS Extended ACLs in HDP 2.1 •  Problem –  No longer feasible for Maya to control all modifications to the file –  New Requirement: Maya, Diane and Clark are allowed to make modifications –  New Requirement: New group called executives should be able to read the sales data –  Current permissions model only allows permissions for 1 group and 1 user •  Solution: HDFS Extended ACLs –  Now assign different permissions to different users and groups [Diagram: HDFS Directory with rwx entries for Owner, Group and Others, plus additional rwx entries for Group D, Group F and User Y]
  17. 17. Page 17 © Hortonworks Inc. 2014 HDFS Extended ACLs in HDP 2.1 New Tools for ACL Management (setfacl, getfacl)
 – hdfs dfs -setfacl -m group:execs:r-- /sales-data
 – hdfs dfs -getfacl /sales-data
 # file: /sales-data
 # owner: maya
 # group: sales
 user::rw-
 group::r--
 group:execs:r--
 mask::r--
 other::---
 How do you know if a directory has ACLs set? Look for the “+” after the permission bits:
 – hdfs dfs -ls /sales-data
 Found 1 items
 -rw-r-----+  3 maya sales          0 2014-03-04 16:31 /sales-data
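
To make the role of the `mask` entry in the listing above concrete, here is a minimal sketch that parses getfacl-style text and computes effective permissions. It assumes the usual POSIX-style rule that the mask limits named users, named groups and the owning group, but not the file owner or `other`. The helper names are invented for this example, and real `hdfs dfs -getfacl` output may carry extra annotations that this parser does not handle.

```python
GETFACL_OUTPUT = """\
# file: /sales-data
# owner: maya
# group: sales
user::rw-
group::r--
group:execs:r--
mask::r--
other::---
"""


def parse_getfacl(text):
    """Return (kind, name, perms) tuples for the access ACL entries."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, '# file:' style headers, and default ACL entries.
        if not line or line.startswith("#") or line.startswith("default:"):
            continue
        kind, name, perms = line.split(":")
        entries.append((kind, name, perms))
    return entries


def apply_mask(perms, mask):
    """Keep a permission flag only if the mask carries the same flag."""
    return "".join(p if p == m else "-" for p, m in zip(perms, mask))


def effective_acl(entries):
    """Map 'kind:name' to effective permissions after applying the mask."""
    mask = next((perms for kind, _, perms in entries if kind == "mask"), None)
    result = {}
    for kind, name, perms in entries:
        if kind == "mask":
            continue
        # The mask applies to group entries and to named users only.
        masked = kind == "group" or (kind == "user" and name != "")
        if mask is not None and masked:
            perms = apply_mask(perms, mask)
        result[f"{kind}:{name}"] = perms
    return result


acl = effective_acl(parse_getfacl(GETFACL_OUTPUT))
print(acl)
```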
  18. 18. Page 18 © Hortonworks Inc. 2014 HDFS Extended ACLs in HDP 2.1 Default ACLs
 – hdfs dfs -setfacl -m default:group:execs:r-x /monthly-sales-data
 – hdfs dfs -mkdir /monthly-sales-data/JAN
 – hdfs dfs -getfacl /monthly-sales-data/JAN
 # file: /monthly-sales-data/JAN
 # owner: maya
 # group: sales
 user::rwx
 group::r-x
 group:execs:r-x
 mask::r-x
 other::---
 default:user::rwx
 default:group::r-x
 default:group:execs:r-x
 default:mask::r-x
 default:other::---
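
The inheritance shown in the listing above can be approximated with a short sketch. It assumes a simplified rule: a new child takes the parent's default entries as its access ACL, and a child directory additionally keeps those default entries for its own children. Permission filtering by the creation mode or umask is deliberately ignored, and the parent's default entries used here are assumed to match the ones the slide shows on the child. `inherit_default_acl` is a hypothetical helper, not an HDFS API.

```python
def inherit_default_acl(parent_entries, child_is_directory=True):
    """Return the ACL entries a new child would receive under the simplified rule."""
    defaults = [entry for entry in parent_entries if entry.startswith("default:")]
    # The child's access ACL is the parent's default ACL without the prefix.
    access = [entry[len("default:"):] for entry in defaults]
    # Only directories carry default entries forward to their own children.
    return access + defaults if child_is_directory else access


# Assumed state of /monthly-sales-data after the setfacl command on the slide.
parent = [
    "user::rwx",
    "group::r-x",
    "other::---",
    "default:user::rwx",
    "default:group::r-x",
    "default:group:execs:r-x",
    "default:mask::r-x",
    "default:other::---",
]

for entry in inherit_default_acl(parent):
    print(entry)
```

Under these assumptions a file created in the same directory would receive only the five access entries, since files have no default ACL.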
  19. 19. Page 19 © Hortonworks Inc. 2014 SQL-Style Security for Hive – ATZ-NG
  20. 20. Page 20 © Hortonworks Inc. 2014 Hive Authorization Before HDP 2.1 HiveAuthorizationProvider(HAP) as the base interface 1.  StorageBasedAuthorizationProvider – Uses HDFS permissions to make authorization decision – HDFS dir permission = Table Permission – Coarse grained, no column level security – Secure://hive.apache.org/docs/hcat_r0.5.0/authorization.pdf 2.  DefaultHiveAuthorizationProvider – BROKEN HORTONWORKS RECOMMENDATION: DO NOT USE – RDBMS style authorization provider – Does not check all operations – Does not check policy grants
  21. 21. Page 21 © Hortonworks Inc. 2014 Hive Authorization in HDP 2.1 • Many paths into Hive – Hive CLI, Beeline, Oozie, Hue, Pig, HCatalog, etc. – Admin type users use CLI, Pig, HCatalog – Business users use O/JDBC, Beeline • Other security concerns – Authentication is enforced. It is a pre-requisite to meaningful authorization – No direct access to HDFS – cluster is Kerberized and restricts access – Hive Metastore is protected and allows only authorized access – Views are used to provide row/column level access with ATZ-NG
  22. 22. Page 22 © Hortonworks Inc. 2014 Hive ATZ-NG – Architecture •  ATZ-NG is called for O/JDBC & Beeline CLI •  Standard SQL GRANT / REVOKE for management •  Privilege to register UDF restricted to Admin user •  Policy integrated with Table/View life cycle [Diagram labels: HDFS; Metastore; HiveServer2; O/JDBC; Beeline CLI; Hive CLI; Oozie; Hue; Pig; HCat; Ambari; steps 0. Enable Hive ATZ-NG, 1. Authentication, 2. Authorization (ATZ-NG); UDFs; Protected – fine grained; Protected – coarse grained (Storage Based Authorization); Restrict direct access to Metastore; Protect HDFS with Kerberos & HDFS ACL]
  23. 23. Page 23 © Hortonworks Inc. 2014 Hive ATZ-NG Details •  Hive ATZ-NG: SQL standard-based authorization •  Manually configure Hive to enable; Hive restart required •  Grants on tables or views to roles or users: GRANT/REVOKE action ON [table | view] TO role | user •  Policy stored in Hive Metastore •  Table/View lifecycle auto-synced with policy stored in Hive Metastore •  Grant/Revoke does integrity check, prevents invalid policies •  Show grants on user | table | view | role shows policy •  Supports delegated administration •  All data needs to be readable/writable by the Hive user; combined with HDFS ACLs, it need not be owned by the Hive user •  Backup of policy is the same as Hive Metastore backup •  Check on the ability to register UDF
  24. 24. Page 24 © Hortonworks Inc. 2014 What about MR/Pig/Hive CLI? • All these are ETL run by privileged users • Protect them at coarse grained level with StorageBasedAuthorization
  25. 25. Page 25 © Hortonworks Inc. 2014 Summary ATZ-NG is a superior approach for Hive Authorization because it delivers: 1.  Familiar & DBA-friendly approach for defining security policies for Hive Tables. No additional education required to understand how to take advantage of this. 2.  Integrated and error-free policy definition approach which works in lock-step with the lifecycle of tables and views. 3.  Minimal additional operational overhead to take advantage of ATZ-NG; from no required MR/YARN restart through leveraging pre-existing Hive Metastore (and associated handling - back-up, recovery, etc.)
  26. 26. Page 26 © Hortonworks Inc. 2014 Hive ATZ-NG Example
  27. 27. Page 27 © Hortonworks Inc. 2014 Scenario • Objective: Share Product Management Roadmap securely • Actors: – Admin Role – Specified in hive-site – Admin role controls role memberships – Product Management Role – Should be able to create, read all road map details. – Members: Vinay Shukla, Tim Hall – Engineering Role – Should be able to read (see) all roadmap details – Members: Kevin Minder, Larry McCay
  28. 28. Page 28 © Hortonworks Inc. 2014 Step 1: Admin role Creates Roles, Adds Users 1.  CREATE ROLE PM; 2.  CREATE ROLE ENG; 3.  GRANT ROLE PM to user timhall with admin option; 4.  GRANT ROLE PM to user vinayshukla; 5.  GRANT ROLE ENG to user kevinminder with admin option; 6.  GRANT ROLE ENG to user larrymccay;
  29. 29. Page 29 © Hortonworks Inc. 2014 Step 2: Super-user Creates Tables/Views create table hdp_hadoop_plans ( id int, hadoop_roadmap string, hdp_roadmap string );
  30. 30. Page 30 © Hortonworks Inc. 2014 Step 3: Users or Roles Assigned To Tables 1.  GRANT ALL ON hdp_hadoop_plans TO ROLE PM; 2.  GRANT SELECT ON hdp_hadoop_plans TO ROLE ENG;
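
The effect of the role and table grants in Steps 1 to 3 can be modeled with a toy in-memory authorizer. This is only a sketch of the role-based idea, assuming that `ALL` covers every privilege and ignoring admin options, ownership and views. The class `ToyAuthorizer` and the user `someone_else` are hypothetical and have nothing to do with Hive's actual implementation.

```python
from collections import defaultdict


class ToyAuthorizer:
    """Tiny model of role membership plus per-table privilege grants."""

    def __init__(self):
        self.role_members = defaultdict(set)  # role -> set of users
        self.grants = defaultdict(set)        # (role, table) -> set of privileges

    def grant_role(self, role, user):
        self.role_members[role].add(user)

    def grant(self, privilege, table, role):
        self.grants[(role, table)].add(privilege)

    def is_allowed(self, user, privilege, table):
        """A user is allowed if any of their roles holds the privilege or ALL."""
        for role, members in self.role_members.items():
            if user not in members:
                continue
            privileges = self.grants.get((role, table), set())
            if "ALL" in privileges or privilege in privileges:
                return True
        return False


auth = ToyAuthorizer()
# Step 1: role memberships from the slides.
for user in ("timhall", "vinayshukla"):
    auth.grant_role("PM", user)
for user in ("kevinminder", "larrymccay"):
    auth.grant_role("ENG", user)
# Step 3: table grants from the slides.
auth.grant("ALL", "hdp_hadoop_plans", "PM")
auth.grant("SELECT", "hdp_hadoop_plans", "ENG")

print(auth.is_allowed("vinayshukla", "INSERT", "hdp_hadoop_plans"))
print(auth.is_allowed("larrymccay", "INSERT", "hdp_hadoop_plans"))
```

In this model Product Management members can write to the roadmap table while Engineering members can only read it, which matches the scenario's stated objective.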
  31. 31. Page 31 © Hortonworks Inc. 2014 Learn More Hortonworks.com/labs/security/ Register for the other six Discover HDP 2.1 Webinars Hortonworks.com/webinars Next on the Security Roadmap