Securing Hadoop 
Hadoop Security Demystified…and then made more confusing. 
Presenter: 
Adam Muise 
Content: 
Balaji Ganesan 
Adam Muise 
Page 1 © Hortonworks Inc. 2014
What do we mean by Security? 
Say you have a house guest… 
- Authentication 
- Who gets in the door 
- Authorization 
- How far are they allowed in the house and what rooms 
are they allowed in 
- Auditing 
- Follow them around 
- Encryption 
- When all else fails, lock it up 
Insecurity – Not just for Teenagers 
- Security is really about risk mitigation 
- No perfect solution exists unless you 
locate your datacenter in the hull of 
the Titanic and cut all communications 
- The risks are: 
- Inappropriate access to data by internal 
resources 
- External data theft 
- Service outages 
- No knowledge of theft or inappropriate 
access 
- Hadoop’s value to a business is that it centralizes 
data, which can make a leak more 
detrimental than a DDoS or a stolen laptop 
Attention to Hadoop security on the rise… 
- As Hadoop adoption grows and more sensitive 
production data goes into clusters, more 
attention is being paid to security 
- Intel/Cloudera working on Project Rhino 
- Hortonworks introduces Apache Knox 
- Cloudera buys Gazzang 
- Hortonworks buys XASecure and turns it 
into Apache Argus 
- HBase gets cell level security 
- … the list goes on
Watch out for those malicious attacks… 
Layers Of Hadoop Security 
Perimeter Level Security 
• Network Security (e.g. Firewalls) 
• Apache Knox (e.g. Gateways) 
Authentication 
• Kerberos 
• Delegation Tokens 
Authorization 
• Argus Security Policies 
OS Security 
• File Permissions 
• Process Isolation 
Data Protection 
• Transport 
• Storage 
• Access
Typical Hadoop Security 
Vanilla Hadoop 
Hadoop out of the box 
- While a lot of security is built into Hadoop, out of the box not much of it 
is turned on 
- Without strong authentication, anyone with sufficient access to the 
underlying OS can impersonate other users 
- Often paired with gateway nodes that provide stronger access 
restrictions 
- HDFS/YARN/Hive 
- Authentication - Derived from OS users local to the box the task/request is submitted from 
- Authorization – Dependent on each project/service 
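As a rough illustration of "not much of it is turned on": in stock Apache Hadoop the two core-site.xml switches below default to the values shown, so a user's identity is simply whatever the client OS reports. This is a sketch of the defaults, not a recommended configuration.

```xml
<!-- core-site.xml: default (insecure) values in stock Apache Hadoop -->
<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <!-- "simple" trusts the user name supplied by the client OS -->
    <value>simple</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <!-- service-level authorization checks are off -->
    <value>false</value>
  </property>
</configuration>
```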
Typical Flow – Hive Access 
[Diagram: Beeline Client, HiveServer 2, HDFS (A B C)] 
Typical Hadoop Security 
Strong Authentication through Kerberos 
Kerberos Primer 
Components: Client (with its Kerberos Ticket Cache), KDC, NameNode (NN), DataNode (DN) 
1. kinit - Login and get Ticket Granting Ticket (TGT) 
2. Client stores TGT in Ticket Cache 
3. Get NameNode Service Ticket (NN-ST) 
4. Client stores NN-ST in Ticket Cache 
5. Read/write file given NN-ST and file name; returns block locations, 
block IDs and Block Access Tokens if access permitted 
6. Read/write block given Block Access Token and block ID 
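The flow above depends on the NameNode and DataNodes having their own Kerberos principals and on block access tokens being enabled. A minimal hdfs-site.xml sketch follows; the keytab paths are assumptions for illustration, the realm is borrowed from the example principals later in this deck, and core-site.xml would also need hadoop.security.authentication set to kerberos.

```xml
<!-- hdfs-site.xml: sketch only; keytab paths are hypothetical -->
<configuration>
  <property>
    <name>dfs.block.access.token.enable</name>
    <!-- NameNode issues Block Access Tokens that DataNodes verify (steps 5 and 6) -->
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.kerberos.principal</name>
    <!-- _HOST is replaced with the local host name at startup -->
    <value>nn/_HOST@HADOOP.EXAMPLE.COM</value>
  </property>
  <property>
    <name>dfs.namenode.keytab.file</name>
    <value>/etc/security/keytabs/nn.service.keytab</value>
  </property>
  <property>
    <name>dfs.datanode.kerberos.principal</name>
    <value>dn/_HOST@HADOOP.EXAMPLE.COM</value>
  </property>
  <property>
    <name>dfs.datanode.keytab.file</name>
    <value>/etc/security/keytabs/dn.service.keytab</value>
  </property>
</configuration>
```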
Kerberos Summary 
• Provides Strong Authentication 
• Establishes identity for users, services and hosts 
• Prevents unauthorized users from impersonating other accounts 
• Supports token delegation model 
• Works with existing directory services 
• Basis for Authorization 
Hadoop Authentication 
• Users authenticate with the services 
– CLI & API: Kerberos kinit or keytab 
– Web UIs: Kerberos SPNego or custom plugin (e.g. SSO) 
• Services authenticate with each other 
– Prepopulated Kerberos keytab 
– e.g. DN->NN, NM->RM 
• Services propagate authenticated user identity 
– Authenticated trusted proxy service 
– e.g. Oozie->RM, Knox->WebHCat 
• Job tasks present delegated user’s identity/access 
– Delegation tokens 
– e.g. Job task -> NN, Job task -> JT/RM 
• Strong authentication is the basis for authorization 
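The "authenticated trusted proxy service" bullet corresponds to Hadoop's proxy-user settings in core-site.xml. The sketch below assumes a service user named oozie; the host and group values are hypothetical placeholders, not values from this deck.

```xml
<!-- core-site.xml: let the oozie service user act on behalf of end users -->
<configuration>
  <property>
    <name>hadoop.proxyuser.oozie.hosts</name>
    <!-- hypothetical host that is allowed to submit impersonated requests -->
    <value>oozie-host.example.com</value>
  </property>
  <property>
    <name>hadoop.proxyuser.oozie.groups</name>
    <!-- hypothetical group whose members may be impersonated -->
    <value>hadoop-users</value>
  </property>
</configuration>
```

Keeping both values narrow limits what a compromised proxy service could do.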
[Diagram: authentication paths among Client, Name Node, Data Node, Oozie, Job Tracker and Task. 
Path labels: (User) Kerberos or Custom; (Service) Kerberos; (Service) Kerberos + (User) doAs; (User) Delegation Token] 
User Management 
• Most implementations use LDAP for user info 
– LDAP guarantees that user information is consistent across the 
cluster 
– An easy way to manage users & groups 
– The standard user to group mapping comes from the OS on the 
NameNode 
• Kerberos provides authentication 
– PAM can automatically log user into Kerberos 
Kerberos + Active Directory 
[Diagram: Client, Hadoop Cluster, AD / LDAP and KDC, labelled User Store and Authentication, linked by Cross Realm Trust] 
Users: smith@EXAMPLE.COM 
Hosts: host1@HADOOP.EXAMPLE.COM 
Services: hdfs/host1@HADOOP.EXAMPLE.COM 
• Use existing directory tools to manage users 
• Use Kerberos tools to manage host + service principals 
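With users in one realm and the cluster in another, Hadoop also has to translate a foreign principal such as smith@EXAMPLE.COM into a short local user name. One way this is commonly done is with hadoop.security.auth_to_local rules in core-site.xml; the rule below is a sketch that assumes the two realm names shown on this slide and covers only this one piece of a cross-realm setup.

```xml
<!-- core-site.xml: map principals from the corporate realm to short names -->
<configuration>
  <property>
    <name>hadoop.security.auth_to_local</name>
    <value>
      RULE:[1:$1@$0](.*@EXAMPLE\.COM)s/@.*//
      DEFAULT
    </value>
  </property>
</configuration>
```

The first rule strips the realm from single-component principals in EXAMPLE.COM, so smith@EXAMPLE.COM becomes smith; DEFAULT handles principals in the cluster's own default realm.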
Groups 
• Define groups for each required role 
• Hadoop has a pluggable interface for group mapping 
– Mapping from user to group is not stored within Hadoop 
• Defaults to the OS information on the master node 
– Typically driven from LDAP on Linux 
• Existing plugins 
– ShellBasedUnixGroupsMapping – /bin/id 
– JniBasedUnixGroupsMapping – system call 
– LdapGroupsMapping – LDAP call 
– CompositeGroupsMapping – combines Unix & LDAP group mapping 
• Strong authentication and role-based groups provide protections 
enabling shared clusters 
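The plugin is selected in core-site.xml. A minimal sketch choosing LdapGroupsMapping is shown below; the LDAP URL, bind user and search base are hypothetical, and the bind password setting is deliberately left out.

```xml
<!-- core-site.xml: resolve groups directly from LDAP (all values are placeholders) -->
<configuration>
  <property>
    <name>hadoop.security.group.mapping</name>
    <value>org.apache.hadoop.security.LdapGroupsMapping</value>
  </property>
  <property>
    <name>hadoop.security.group.mapping.ldap.url</name>
    <value>ldap://ldap.example.com:389</value>
  </property>
  <property>
    <name>hadoop.security.group.mapping.ldap.bind.user</name>
    <value>cn=hadoop-bind,ou=services,dc=example,dc=com</value>
  </property>
  <property>
    <name>hadoop.security.group.mapping.ldap.base</name>
    <value>dc=example,dc=com</value>
  </property>
</configuration>
```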
Groups 
[Diagram: Client, Hadoop Cluster, NameNode with group-mapping Plugin, AD / LDAP User Store] 
Kerberos FAQ 
• Where do I install KDC? 
– On a master type node 
• User Provisioning 
– Hook up to Corporate AD/LDAP to leverage existing User Provisioning 
• Growing a cluster 
– Provision new services and nodes in MIT KDC, copy keytabs to new nodes 
• Is Kerberos a SPOF? 
– Kerberos supports HA; delegation tokens also reduce the load on the KDC 
Typical Flow – Authenticate through Kerberos 
Components: Beeline Client, KDC, HiveServer 2, HDFS (A B C) 
• Client gets service ticket for Hive 
• Use Hive ST, submit query 
• Hive gets NameNode (NN) service ticket 
• Hive creates map reduce using NN ST 
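For a client to obtain a Hive service ticket, HiveServer 2 itself has to run with a Kerberos principal. A hive-site.xml sketch, with an assumed keytab path and the realm from the earlier example principals:

```xml
<!-- hive-site.xml: sketch only; the keytab path is hypothetical -->
<configuration>
  <property>
    <name>hive.server2.authentication</name>
    <value>KERBEROS</value>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.principal</name>
    <value>hive/_HOST@HADOOP.EXAMPLE.COM</value>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.keytab</name>
    <value>/etc/security/keytabs/hive.service.keytab</value>
  </property>
</configuration>
```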
Typical Hadoop Security 
Strong Authentication + Cross-cutting Authorization 
Apache Argus (aka HDP Security) Capabilities 
Hadoop and Argus 
Authentication 
• Cross Platform Security: Kerberos, Integration with AD 
• Gateway for REST APIs: Knox for http, REST APIs 
Role Based Authorizations 
• Fine grained access control: 
– HDFS – Folder, File 
– Hive – Database, Table, Column, UDFs 
– HBase – Table, Column Family, Column 
• Wildcard Resource Names: Yes 
• Permission Support: 
– HDFS – Read, Write, Execute 
– Hive – Select, Update, Create, Drop, Alter, Index, Lock 
– HBase – Read, Write, Create
Authorization and Audit 
Authorization 
Fine grain access control 
• HDFS – Folder, File 
• Hive – Database, Table, Column 
• HBase – Table, Column Family, Column 
Audit 
Extensive user access auditing in 
HDFS, Hive and HBase 
• IP Address 
• Resource type/ resource 
• Timestamp 
• Access granted or denied 
[Screenshot callouts: Flexibility in defining policies; Control access into system]
Central Security Administration 
Apache Argus 
• Delivers a ‘single pane of glass’ for 
the security administrator 
• Centralizes administration of 
security policy 
• Ensures consistent coverage across 
the entire Hadoop stack 
Setup Authorization Policies 
[Screenshot callouts: file level access control, flexible definition; Control permissions]
Monitor through Auditing 
Authorization and Auditing with Argus 
[Diagram: Enterprise Users and Legacy Tools reach the Argus Administration Portal and Integration API; 
behind them sit the Argus Policy Server and Argus Audit Server, backed by an RDBMS. 
Hadoop components shown: Hadoop distributed file system (HDFS), HBase, Hive Server2, Knox, Falcon, Storm, YARN, 
above the Data and Operating System layers. Components carry an Argus Agent; agents marked * are Future Integration.]
Simplified Workflow - HDFS 
Components: Argus Policy Manager, Argus Agent, Name Node, User Application, Audit Database 
1. Admin sets policies for HDFS files/folder 
2. Users access HDFS data through application; data scientist runs a 
map reduce job; IT users access HDFS through CLI 
3. Namenode uses Argus Agent for Authorization 
4. Namenode provides resource access to user/client 
5. Audit logs pushed to DB
Simplified Workflow - Hive 
Components: Argus Policy Manager, Argus Agent, Hive Server2, User Application, Audit Database 
1. Admin sets policies for Hive db/tables/columns 
2. Users access Hive data using JDBC/ODBC; IT users access 
Hive via beeline command tool 
3. Hive authorizes with Argus Agent 
4. HiveServer2 provides data access to users 
5. Audit logs pushed to DB
Simplified Workflow - HBase 
Components: Argus Policy Manager, Argus Agent, HBase Server, User Application, Audit Database 
1. Admin sets policies for HBase table/cf/column 
2. Users access HBase data using Java API; data scientist runs a 
map reduce job; IT users access HBase via HBShell 
3. HBase authorizes with Argus Agent 
4. HBase server provides data access to users 
5. Audit logs pushed to DB
Typical Flow – Add Authorization through Argus 
Components: Beeline Client, KDC, HiveServer 2, Argus, HDFS (A B C) 
• Client gets service ticket for Hive 
• Use Hive ST, submit query 
• Hive gets NameNode (NN) service ticket 
• Hive creates map reduce using NN ST
Typical Hadoop Security 
Strong Authentication + Cross-cutting Authorization + Perimeter 
Security 
What does Perimeter Security really mean? 
[Diagram: User, REST API, Firewall, Gateway, REST API, Hadoop Services] 
• Firewall required at perimeter (today) 
• Knox Gateway controls all Hadoop REST API access through firewall 
• Hadoop cluster mostly unaffected 
• Firewall only allows connections through specific ports from Knox host
Why Knox? 
Simplified Access 
• Kerberos encapsulation 
• Extends API reach 
• Single access point 
• Multi-cluster support 
• Single SSL certificate 
Centralized Control 
• Central REST API auditing 
• Service-level authorization 
• Alternative to SSH “edge node” 
Enterprise Integration 
• LDAP integration 
• Active Directory integration 
• SSO integration 
• Apache Shiro extensibility 
• Custom extensibility 
Enhanced Security 
• Protect network details 
• Partial SSL for non-SSL services 
• WebApp vulnerability filter
Current Hadoop Client Model 
• FileSystem and MapReduce Java APIs 
• HDFS, Pig, Hive and Oozie clients (that wrap the Java APIs) 
• Typical use of APIs is via “Edge Node” that is “inside” cluster 
• Users SSH to Edge Node and execute API commands from shell 
[Diagram: User, SSH, Edge Node, Hadoop]
Hadoop REST APIs 
Service | API 
WebHDFS | Supports HDFS user operations including reading files, writing to files, making directories, changing permissions and renaming. 
WebHCat | Job control for MapReduce, Pig and Hive jobs, and HCatalog DDL commands. 
Hive | Hive REST API operations 
HBase | HBase REST API operations 
Oozie | Job submission and management, and Oozie administration. 
• Useful for connecting to Hadoop from outside the cluster 
• When more client language flexibility is required 
– e.g. Java binding not an option 
• Challenges 
– Client must have knowledge of cluster topology 
– Required to open ports (and in some cases, on every host) outside the cluster
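A gateway such as Knox takes the cluster topology off the client's hands by mapping service roles to internal URLs in a topology file. The fragment below is a minimal sketch of that idea: the role names follow Knox's topology format, the host names are hypothetical, the ports are the usual Hadoop 2 defaults, and the gateway (provider) section is omitted.

```xml
<!-- Knox topology file: services section only; host names are placeholders -->
<topology>
  <service>
    <role>NAMENODE</role>
    <url>hdfs://nn.internal.example.com:8020</url>
  </service>
  <service>
    <role>WEBHDFS</role>
    <url>http://nn.internal.example.com:50070/webhdfs</url>
  </service>
  <service>
    <role>WEBHCAT</role>
    <url>http://webhcat.internal.example.com:50111/templeton</url>
  </service>
</topology>
```

Clients then address only the gateway host, and only the gateway needs network access to the internal ports.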
Knox Deployment with Hadoop Cluster 
[Diagram: DMZ with load balancer (LB); Web Tier with Knox; Application Tier; Hadoop CLIs; 
switches connecting Rack 1 (Master Nodes: NN, SNN) and Racks 2 … N (Slave Nodes: DN)]
Hadoop REST API Security: Drill-Down 
[Diagram: REST Client, Firewall, DMZ with load balancer (LB) and Knox Gateway (GW) instances, second Firewall; 
Enterprise Identity Provider (LDAP/AD) reached over LDAP; Edge Node/Hadoop CLIs using RPC; other links are HTTP. 
Hadoop Cluster 1 and Hadoop Cluster 2 each show Masters (NN, RM, WebHCat, Oozie, HBase, HS2) and Slaves (DN, NM).]
OpenLDAP Configuration 
• In sandbox.xml: 
<param> 
    <name>main.ldapRealm</name> 
    <value>org.apache.shiro.realm.ldap.JndiLdapRealm</value> 
</param> 
<param> 
    <name>main.ldapRealm.userDnTemplate</name> 
    <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value> 
</param> 
<param> 
    <name>main.ldapRealm.contextFactory.url</name> 
    <value>ldap://localhost:33389</value> 
</param>
Service level authorization Configuration 
• In <cluster.xml> 
<provider> 
    <role>authorization</role> 
    <name>AclsAuthz</name> 
    <enabled>true</enabled> 
    <param> 
        <name>webhdfs.acl.mode</name> 
        <value>OR</value> 
    </param> 
    <param> 
        <name>webhdfs.acl</name> 
        <!-- format: user(s);groups;ipaddress --> 
        <value>guest;*;*</value> 
    </param> 
    <param> 
        <name>webhcat.acl</name> 
        <value>hdfs;admin;127.0.0.2,127.0.0.3</value> 
    </param> 
</provider>
Typical Flow – Firewall, Route through Knox Gateway 
Components: Beeline Client, Knox Gateway, KDC, HiveServer 2, Argus, HDFS (A B C) 
• Original request w/user id/password 
• Knox gets service ticket for Hive 
• Knox runs as proxy user using Hive ST 
• Use Hive ST, submit query 
• Hive gets NameNode (NN) service ticket 
• Hive creates map reduce using NN ST 
• Client gets query result
Optionally - Add Wire and File Encryption 
Components: Beeline Client, Knox Gateway, KDC, HiveServer 2, Argus, HDFS (A B C); links labelled SSL and SASL 
• Original request w/user id/password 
• Knox gets service ticket for Hive 
• Knox runs as proxy user using Hive ST 
• Use Hive ST, submit query 
• Hive gets NameNode (NN) service ticket 
• Hive creates map reduce using NN ST 
• Client gets query result
Security Features 
Hadoop with Argus 
Auditing 
• Configurable audit: Yes, auditing can be controlled through policy 
• Resource access auditing: User id, request type, repository, access resource, IP address, 
timestamp, access granted/denied 
• Admin auditing: Changes to policies, login sessions and agent monitoring 
Data Protection 
• Over the wire: SASL for RPC, SSL for MR shuffle, WebHDFS 
• Data at rest: LUKS for Volume Encryption, Partners 
Manage 
• User/Group mapping: Local, Sync with LDAP/AD, Sync with Unix 
• Delegated administration: Delegate policy administration to groups or users
Data Protection 
Data Protection 
HDP allows you to apply data protection policy at 
three different layers across the Hadoop stack 
Layer | What? | How? 
Storage | Encrypt data while it is at rest | 3rd Party, Future Hadoop improvements 
Transmission | Encrypt data as it moves | Already in Hadoop 
Upon Access | Apply restrictions when accessed | 3rd Party
Points of Communication 
[Diagram: points of communication between the Client and the nodes of the Hadoop Cluster: 
WebHDFS, DataTransferProtocol (2), RPC (3), M/R Shuffle, JDBC/ODBC]
Data Transmission Protection in HDP 2.1 
• WebHDFS 
– Provides read/write access to HDFS 
– Optionally enable HTTPS 
– Authenticated using SPNEGO (Kerberos for HTTP) filter 
– SSL based wire encryption 
• RPC 
– Communications between NNs, DNs, etc. and Clients 
– SASL based wire encryption 
– DataTransferProtocol (DTP) encryption with SASL 
• JDBC/ODBC 
– SSL based wire encryption 
– SASL based encryption also available 
• Shuffle 
– Mapper to Reducer over HTTP(S) with SSL 
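The switches behind these bullets live in several different files. They are gathered into one block here purely for illustration, with a comment naming the file each normally belongs to. SSL additionally needs keystore and truststore settings that are not shown, and the exact property set may vary by Hadoop version.

```xml
<!-- Illustration only: in practice each property goes in the file named in its comment -->
<configuration>
  <!-- core-site.xml: SASL "privacy" encrypts RPC traffic -->
  <property>
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>
  <!-- hdfs-site.xml: encrypt DataTransferProtocol block traffic -->
  <property>
    <name>dfs.encrypt.data.transfer</name>
    <value>true</value>
  </property>
  <!-- hdfs-site.xml: serve WebHDFS and the web UIs over HTTPS only -->
  <property>
    <name>dfs.http.policy</name>
    <value>HTTPS_ONLY</value>
  </property>
  <!-- mapred-site.xml: SSL for the mapper-to-reducer shuffle -->
  <property>
    <name>mapreduce.shuffle.ssl.enabled</name>
    <value>true</value>
  </property>
  <!-- hive-site.xml: SSL for JDBC/ODBC connections to HiveServer 2 -->
  <property>
    <name>hive.server2.use.SSL</name>
    <value>true</value>
  </property>
</configuration>
```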
Data Storage Protection 
• Encrypt at the physical file system level (e.g. dm-crypt) 
• Encrypt via custom HDFS “compression” codec 
• Encrypt at Application level (including security service/device) 
[Diagram: an ETL App calls a Security Service (e.g. Voltage) to ENCRYPT and DECRYPT; 
values such as ABC and DEF appear in the application, while HDFS holds an encrypted form such as 1a3d]
Current Open Source Initiatives 
• HDFS Encryption 
– Transparent encryption of data at rest in HDFS via encryption zones; work is in progress in the community 
– Dependency on Key Management Server and Keyshell 
• Hive Column Level Encryption 
• HBase Column Level Encryption 
– Transparent Column Encryption, needs more testing/validation 
• Key Management Server 
• Key Provider API 
• Command line Key Operations 
And remember…. 

More Related Content

What's hot

Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayDataWorks Summit
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Shravan (Sean) Pabba
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateSteve Loughran
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityDataWorks Summit
 
Hadoop security
Hadoop securityHadoop security
Hadoop securityBiju Nair
 
Securing the Hadoop Ecosystem
Securing the Hadoop EcosystemSecuring the Hadoop Ecosystem
Securing the Hadoop EcosystemDataWorks Summit
 
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Hortonworks
 
Hadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using KerberosHadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using KerberosSarvesh Meena
 
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Kevin Minder
 
Nl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenchesNl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenchesBolke de Bruin
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big DataRommel Garcia
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionSteve Loughran
 
Hadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyHadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyAnurag Shrivastava
 
Hadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyHadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyDataWorks Summit
 
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...Abhiraj Butala
 
Hadoop Security: Overview
Hadoop Security: OverviewHadoop Security: Overview
Hadoop Security: OverviewCloudera, Inc.
 
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...Big Data Spain
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessCloudera, Inc.
 

What's hot (20)

Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the Gate
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
Securing the Hadoop Ecosystem
Securing the Hadoop EcosystemSecuring the Hadoop Ecosystem
Securing the Hadoop Ecosystem
 
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
 
Hadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using KerberosHadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using Kerberos
 
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
 
Nl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenchesNl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenches
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big Data
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
 
Hadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyHadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happy
 
Technical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheCon
Technical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheConTechnical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheCon
Technical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheCon
 
Hadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyHadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happy
 
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...
 
Hadoop Security: Overview
Hadoop Security: OverviewHadoop Security: Overview
Hadoop Security: Overview
 
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster Access
 

Viewers also liked

Online assessment and data analytics - Peter Tan - Institute of Technical Edu...
Online assessment and data analytics - Peter Tan - Institute of Technical Edu...Online assessment and data analytics - Peter Tan - Institute of Technical Edu...
Online assessment and data analytics - Peter Tan - Institute of Technical Edu...Blackboard APAC
 
Big Data Security with Hadoop
Big Data Security with HadoopBig Data Security with Hadoop
Big Data Security with HadoopCloudera, Inc.
 
Intel’s Big Data and Hadoop Security Initiatives - StampedeCon 2014
Intel’s Big Data and Hadoop Security Initiatives - StampedeCon 2014Intel’s Big Data and Hadoop Security Initiatives - StampedeCon 2014
Intel’s Big Data and Hadoop Security Initiatives - StampedeCon 2014StampedeCon
 
Big Data Security with HP ArcSight
Big Data Security with HP ArcSightBig Data Security with HP ArcSight
Big Data Security with HP ArcSightSridhar Karnam
 
What are performance assessments?
What are performance assessments?What are performance assessments?
What are performance assessments?Curtis Chandler
 
Hadoop Security Now and Future
Hadoop Security Now and FutureHadoop Security Now and Future
Hadoop Security Now and Futuretcloudcomputing-tw
 
Big Data Security Intelligence and Analytics for Advanced Threat Protection
Big Data Security Intelligence and Analytics for Advanced Threat ProtectionBig Data Security Intelligence and Analytics for Advanced Threat Protection
Big Data Security Intelligence and Analytics for Advanced Threat ProtectionBlue Coat
 
Advanced Security In Hadoop Cluster
Advanced Security In Hadoop ClusterAdvanced Security In Hadoop Cluster
Advanced Security In Hadoop ClusterEdureka!
 
LPWA-Open for Business. It’s time to execute
LPWA-Open for Business. It’s time to executeLPWA-Open for Business. It’s time to execute
LPWA-Open for Business. It’s time to executeTelefónica IoT
 
Big Data, Security Intelligence, (And Why I Hate This Title)
Big Data, Security Intelligence, (And Why I Hate This Title) Big Data, Security Intelligence, (And Why I Hate This Title)
Big Data, Security Intelligence, (And Why I Hate This Title) Coastal Pet Products, Inc.
 
Learn Hadoop Administration
Learn Hadoop AdministrationLearn Hadoop Administration
Learn Hadoop AdministrationEdureka!
 
Hadoop Administration pdf
Hadoop Administration pdfHadoop Administration pdf
Hadoop Administration pdfEdureka!
 
Performance based-assessment
Performance based-assessmentPerformance based-assessment
Performance based-assessmentluisagodoy444
 

Viewers also liked (17)

Hadoop and Big Data Security
Hadoop and Big Data SecurityHadoop and Big Data Security
Hadoop and Big Data Security
 
Online assessment and data analytics - Peter Tan - Institute of Technical Edu...
Online assessment and data analytics - Peter Tan - Institute of Technical Edu...Online assessment and data analytics - Peter Tan - Institute of Technical Edu...
Online assessment and data analytics - Peter Tan - Institute of Technical Edu...
 
Big Data Security with Hadoop
Big Data Security with HadoopBig Data Security with Hadoop
Big Data Security with Hadoop
 
Big Data Security and Governance
Big Data Security and GovernanceBig Data Security and Governance
Big Data Security and Governance
 
Intel’s Big Data and Hadoop Security Initiatives - StampedeCon 2014
Intel’s Big Data and Hadoop Security Initiatives - StampedeCon 2014Intel’s Big Data and Hadoop Security Initiatives - StampedeCon 2014
Intel’s Big Data and Hadoop Security Initiatives - StampedeCon 2014
 
Big security for big data
Big security for big dataBig security for big data
Big security for big data
 
Big Data Security with HP ArcSight
Big Data Security with HP ArcSightBig Data Security with HP ArcSight
Big Data Security with HP ArcSight
 
What are performance assessments?
What are performance assessments?What are performance assessments?
What are performance assessments?
 
Hadoop Security Now and Future
Hadoop Security Now and FutureHadoop Security Now and Future
Hadoop Security Now and Future
 
Big Data Security Intelligence and Analytics for Advanced Threat Protection
Big Data Security Intelligence and Analytics for Advanced Threat ProtectionBig Data Security Intelligence and Analytics for Advanced Threat Protection
Big Data Security Intelligence and Analytics for Advanced Threat Protection
 
Advanced Security In Hadoop Cluster
Advanced Security In Hadoop ClusterAdvanced Security In Hadoop Cluster
Advanced Security In Hadoop Cluster
 
LPWA-Open for Business. It’s time to execute
LPWA-Open for Business. It’s time to executeLPWA-Open for Business. It’s time to execute
LPWA-Open for Business. It’s time to execute
 
Big Data, Security Intelligence, (And Why I Hate This Title)
Big Data, Security Intelligence, (And Why I Hate This Title) Big Data, Security Intelligence, (And Why I Hate This Title)
Big Data, Security Intelligence, (And Why I Hate This Title)
 
IoT - Big Data & Security
IoT - Big Data & SecurityIoT - Big Data & Security
IoT - Big Data & Security
 
Learn Hadoop Administration
Learn Hadoop AdministrationLearn Hadoop Administration
Learn Hadoop Administration
 
Hadoop Administration pdf
Hadoop Administration pdfHadoop Administration pdf
Hadoop Administration pdf
 
Performance based-assessment
Performance based-assessmentPerformance based-assessment
Performance based-assessment
 

2014 sept 4_hadoop_security

  • 1. Securing Hadoop: Hadoop Security Demystified…and then made more confusing.
      Presenter: Adam Muise
      Content: Balaji Ganesan, Adam Muise
      © Hortonworks Inc. 2014
  • 2. What do we mean by Security? Say you have a house guest…
      - Authentication: who gets in the door
      - Authorization: how far they are allowed into the house and which rooms they may enter
      - Auditing: follow them around
      - Encryption: when all else fails, lock it up
  • 3. Insecurity – Not just for Teenagers
      - Security is really about risk mitigation
      - No perfect solution exists unless you locate your datacenter in the hull of the Titanic and cut all communications
      - The risks are:
          - Inappropriate access to data by internal resources
          - External data theft
          - Service outages
          - No knowledge of theft or inappropriate access
      - Hadoop’s value to a business is that it centralizes their data, which can make leaks more detrimental than a DDoS or stolen laptops
  • 4. Attention to Hadoop security on the rise…
      - As Hadoop adoption grows and more sensitive production data goes into clusters, more attention is being paid to security
      - Intel/Cloudera working on Project Rhino
      - Hortonworks introduces Apache Knox
      - Cloudera buys Gazzang
      - Hortonworks buys XASecure and turns it into Apache Argus
      - HBase gets cell level security
      - … the list goes on
  • 5. Watch out for those malicious attacks…
  • 6. Layers Of Hadoop Security
      - Perimeter Level Security
          - Network Security (i.e. Firewalls)
          - Apache Knox (i.e. Gateways)
      - Authentication
          - Kerberos
          - Delegation Tokens
      - Authorization
          - Argus Security Policies
      - OS Security
          - File Permissions
          - Process Isolation
      - Data Protection
          - Transport
          - Storage
          - Access
  • 7. Typical Hadoop Security: Vanilla Hadoop
  • 8. Hadoop out of the box
      - While a lot of security is built into Hadoop, not much of it is turned on out of the box
      - Without strong authentication, anyone with sufficient access to the underlying OS can impersonate users
      - Often paired with gateway nodes that provide stronger access restrictions
      - HDFS/YARN/Hive
          - Authentication: derived from the OS users local to the box the task/request is submitted from
          - Authorization: dependent on each project/service
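
To make the impersonation risk on this slide concrete, here is a minimal Python sketch of how identity is established when strong authentication is off. It is a toy model, not Hadoop code: the function name `resolve_simple_auth_user` is hypothetical, and the only real detail it leans on is that, under simple authentication, a Hadoop client takes the user name from the `HADOOP_USER_NAME` environment variable if it is set and otherwise from the local OS login.

```python
def resolve_simple_auth_user(os_user, env):
    """Toy model of non-Kerberos identity: the cluster trusts what the client reports.

    os_user: the login name of the local OS account (assumed input).
    env:     a dict standing in for the client's environment variables.
    """
    # With strong authentication off, HADOOP_USER_NAME overrides the OS login name,
    # so anyone who controls the client environment chooses their own identity.
    return env.get("HADOOP_USER_NAME") or os_user


if __name__ == "__main__":
    print(resolve_simple_auth_user("alice", {}))
    print(resolve_simple_auth_user("alice", {"HADOOP_USER_NAME": "hdfs"}))
```

Nothing in this path verifies the claimed name, which is why the deck moves on to Kerberos next.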
  • 9. Typical Flow – Hive Access
      [Diagram components: Beeline Client, HiveServer 2, HDFS (A, B, C)]
  • 10. Typical Hadoop Security: Strong Authentication through Kerberos
  • 11. Kerberos Primer
      [Diagram components: KDC, Client (with the client’s Kerberos ticket cache), NN, DN]
      1. kinit: log in and get a Ticket Granting Ticket (TGT)
      2. Client stores the TGT in the ticket cache
      3. Get a NameNode Service Ticket (NN-ST)
      4. Client stores the NN-ST in the ticket cache
      5. Read/write file given the NN-ST and file name; returns block locations, block IDs and Block Access Tokens if access is permitted
      6. Read/write block given the Block Access Token and block ID
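
The six steps of the Kerberos primer can be walked through with a small Python simulation. This is only a sketch under loud assumptions: the classes `ToyKDC`, `ToyNameNode` and `ToyDataNode` are hypothetical, tickets are plain tuples signed with an HMAC instead of real Kerberos encryption, and there are no lifetimes, authenticators or network calls. It shows the shape of the exchange, not the protocol.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: str) -> str:
    """HMAC stand-in for the encryption real Kerberos applies to tickets."""
    return hmac.new(key, message.encode(), hashlib.sha256).hexdigest()


class ToyKDC:
    def __init__(self, passwords, service_keys):
        self.passwords = passwords          # user -> password
        self.service_keys = service_keys    # service -> key shared with that service
        self.tgt_key = secrets.token_bytes(16)

    def kinit(self, user, password):
        # Step 1: log in and receive a Ticket Granting Ticket.
        if self.passwords.get(user) != password:
            raise PermissionError("login failed")
        return (user, sign(self.tgt_key, user))

    def service_ticket(self, tgt, service):
        # Step 3: exchange a valid TGT for a service ticket.
        user, mac = tgt
        if not hmac.compare_digest(mac, sign(self.tgt_key, user)):
            raise PermissionError("bad TGT")
        return (user, service, sign(self.service_keys[service], user))


class ToyNameNode:
    def __init__(self, key, block_key, readers):
        self.key, self.block_key, self.readers = key, block_key, readers

    def open(self, ticket, path):
        # Step 5: verify the NN-ST, check permission, hand out a Block Access Token.
        user, _service, mac = ticket
        if not hmac.compare_digest(mac, sign(self.key, user)):
            raise PermissionError("bad service ticket")
        if user not in self.readers.get(path, set()):
            raise PermissionError("access denied")
        block_id = "blk_" + path.strip("/").replace("/", "_")
        return block_id, sign(self.block_key, f"{user}:{block_id}")


class ToyDataNode:
    def __init__(self, block_key, blocks):
        self.block_key, self.blocks = block_key, blocks

    def read(self, user, block_id, token):
        # Step 6: serve the block only if the Block Access Token verifies.
        expected = sign(self.block_key, f"{user}:{block_id}")
        if not hmac.compare_digest(token, expected):
            raise PermissionError("bad block token")
        return self.blocks[block_id]


def demo():
    nn_key, block_key = secrets.token_bytes(16), secrets.token_bytes(16)
    kdc = ToyKDC({"smith": "pw"}, {"nn": nn_key})
    nn = ToyNameNode(nn_key, block_key, {"/data/a": {"smith"}})
    dn = ToyDataNode(block_key, {"blk_data_a": b"hello"})
    cache = {}                                              # the client's ticket cache
    cache["tgt"] = kdc.kinit("smith", "pw")                 # steps 1 and 2
    cache["nn"] = kdc.service_ticket(cache["tgt"], "nn")    # steps 3 and 4
    block_id, token = nn.open(cache["nn"], "/data/a")       # step 5
    return dn.read("smith", block_id, token)                # step 6
```

Note that the DataNode never talks to the KDC in this sketch: it only checks the token the NameNode issued, which mirrors the division of labor between steps 5 and 6.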
  • 12. Kerberos Summary
      - Provides strong authentication
      - Establishes identity for users, services and hosts
      - Prevents unauthorized impersonation of accounts
      - Supports token delegation model
      - Works with existing directory services
      - Basis for authorization
  • 13. Hadoop Authentication
      - Users authenticate with the services
          - CLI & API: Kerberos kinit or keytab
          - Web UIs: Kerberos SPNEGO or custom plugin (e.g. SSO)
      - Services authenticate with each other
          - Prepopulated Kerberos keytab
          - e.g. DN->NN, NM->RM
      - Services propagate authenticated user identity
          - Authenticated trusted proxy service
          - e.g. Oozie->RM, Knox->WebHCat
      - Job tasks present delegated user’s identity/access
          - Delegation tokens
          - e.g. Job task -> NN, Job task -> JT/RM
      - Strong authentication is the basis for authorization
      [Diagram: Client -> Name Node: (User) Kerberos or Custom; Data Node -> Name Node: (Service) Kerberos; Oozie -> Job Tracker: (Service) Kerberos + (User) doas; Task -> Name Node: (User) Delegation Token]
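
The "trusted proxy service" bullet (for example Oozie acting for an end user via doas) can be illustrated with a short Python sketch. The property names `hadoop.proxyuser.<service>.hosts` and `hadoop.proxyuser.<service>.groups` are Hadoop's proxy-user settings; everything else here is an assumption made for illustration: the function `may_impersonate`, the example host and group names, and the simplified matching rules, which cover only comma-separated lists and the `*` wildcard.

```python
def _allowed(setting, values):
    """True if the comma-separated setting is '*' or names at least one of the values."""
    items = {item.strip() for item in setting.split(",") if item.strip()}
    return "*" in items or any(value in items for value in values)


def may_impersonate(conf, proxy_user, proxy_host, target_groups):
    """Simplified check: may this authenticated service act on behalf of the target user?

    conf:          dict of configuration properties (stand-in for core-site.xml).
    proxy_user:    the service identity, already authenticated (e.g. via Kerberos).
    proxy_host:    the host the proxy request comes from.
    target_groups: groups of the end user being impersonated.
    """
    hosts = conf.get(f"hadoop.proxyuser.{proxy_user}.hosts", "")
    groups = conf.get(f"hadoop.proxyuser.{proxy_user}.groups", "")
    # An unset property yields an empty list, so impersonation is denied by default.
    return _allowed(hosts, [proxy_host]) and _allowed(groups, target_groups)


# Hypothetical example configuration.
EXAMPLE_CONF = {
    "hadoop.proxyuser.oozie.hosts": "oozie1.example.com",
    "hadoop.proxyuser.oozie.groups": "analysts, etl",
}
```

The point of the design is that the proxy authenticates as itself first; only then is its claim about the end user considered, and only within the configured hosts and groups.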
  • 14. User Management
      - Most implementations use LDAP for user info
          - LDAP guarantees that user information is consistent across the cluster
          - An easy way to manage users & groups
          - The standard user to group mapping comes from the OS on the NameNode
      - Kerberos provides authentication
          - PAM can automatically log the user into Kerberos
  • 15. Kerberos + Active Directory
      [Diagram components: Client; AD / LDAP (User Store); KDC (Authentication); Hadoop Cluster; Cross Realm Trust]
      - Users: smith@EXAMPLE.COM
      - Hosts: host1@HADOOP.EXAMPLE.COM
      - Services: hdfs/host1@HADOOP.EXAMPLE.COM
      - Use existing directory tools to manage users
      - Use Kerberos tools to manage host + service principals
  • 16. Groups
      - Define groups for each required role
      - Hadoop has a pluggable interface
          - Mapping from user to group is not stored within Hadoop
      - Defaults to the OS information on the master node
          - Typically driven from LDAP on Linux
          - Existing plugins
              - ShellBasedUnixGroupsMapping: /bin/id
              - JniBasedUnixGroupsMapping: system call
              - LdapGroupsMapping: LDAP call
              - CompositeGroupsMapping: combines Unix & LDAP group mapping
      - Strong authentication and role-based groups provide protections enabling shared clusters
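
The pluggable group lookup described on this slide can be sketched in a few lines of Python. This is a toy model of the idea, not Hadoop's implementation: `dict_provider` and `composite` are hypothetical helpers, and the dictionaries stand in for the OS and LDAP lookups that the real plugins perform.

```python
from typing import Callable, Dict, List

# A group provider is any function that maps a user name to a list of group names.
GroupProvider = Callable[[str], List[str]]


def dict_provider(table: Dict[str, List[str]]) -> GroupProvider:
    """Stand-in for a real lookup (OS or LDAP): reads groups from an in-memory table."""
    return lambda user: list(table.get(user, []))


def composite(providers: List[GroupProvider]) -> GroupProvider:
    """Combine several providers, keeping first-seen order and dropping duplicates."""
    def lookup(user: str) -> List[str]:
        found: List[str] = []
        for provider in providers:
            for group in provider(user):
                if group not in found:
                    found.append(group)
        return found
    return lookup


# Hypothetical data: one table plays the Unix source, the other the LDAP source.
unix_groups = dict_provider({"smith": ["users", "etl"]})
ldap_groups = dict_provider({"smith": ["etl", "analysts"], "jones": ["finance"]})
lookup_groups = composite([unix_groups, ldap_groups])
```

Because the mapping lives outside Hadoop, swapping the provider changes who belongs to which role without touching any data or policy inside the cluster.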
  • 17. Groups
      [Diagram components: Client; Hadoop Cluster with NameNode and group mapping Plugin; AD / LDAP User Store; access label rw]
  • 18. Kerberos FAQ
      - Where do I install the KDC?
          - On a master type node
      - User provisioning
          - Hook up to corporate AD/LDAP to leverage existing user provisioning
      - Growing a cluster
          - Provision new services and nodes in the MIT KDC, then copy keytabs to the new nodes
      - Is Kerberos a SPOF?
          - Kerberos supports HA, and with delegation tokens the KDC load is reduced
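
For the "growing a cluster" answer, a small Python sketch can list the service principals that would need to be provisioned for new nodes. It assumes the `service/host@REALM` naming shown on the Kerberos + Active Directory slide; the function name `service_principals` and the host and service names in the usage are hypothetical, and the actual KDC and keytab commands are deliberately not shown.

```python
def service_principals(hosts, services, realm):
    """Build one principal name per (host, service) pair in service/host@REALM form."""
    return [f"{service}/{host}@{realm}" for host in hosts for service in services]


if __name__ == "__main__":
    # Hypothetical new nodes joining the cluster.
    for principal in service_principals(["host2", "host3"], ["hdfs", "yarn"], "HADOOP.EXAMPLE.COM"):
        print(principal)
```

A list like this makes it easy to check that every new node received a keytab for each service it runs.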
  • 19. Typical Flow – Authenticate through Kerberos
      [Diagram components: Beeline Client, KDC, HiveServer 2, HDFS (A, B, C)]
      - Client gets service ticket for Hive
      - Use Hive ST, submit query
      - Hive gets Namenode (NN) service ticket
      - Hive creates map reduce using NN ST
  • 20. Typical Hadoop Security: Strong Authentication + Cross-cutting Authorization
  • 21. Apache Argus (aka HDP Security) Capabilities (Hadoop and Argus)
      - Authentication, cross platform security: Kerberos, integration with AD
      - Gateway for REST APIs: Knox for HTTP, REST APIs
      - Role based authorizations, fine grained access control
          - HDFS: Folder, File
          - Hive: Database, Table, Column, UDFs
          - HBase: Table, Column Family, Column
      - Wildcard resource names: Yes
      - Permission support
          - HDFS: Read, Write, Execute
          - Hive: Select, Update, Create, Drop, Alter, Index, Lock
          - HBase: Read, Write, Create
  • 22. Authorization and Audit
      - Authorization: fine grain access control
          - HDFS: Folder, File
          - Hive: Database, Table, Column
          - HBase: Table, Column Family, Column
      - Audit: extensive user access auditing in HDFS, Hive and HBase
          - IP address
          - Resource type / resource
          - Timestamp
          - Access granted or denied
      - Callouts: flexibility in defining policies; control access into system
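
The combination of fine grained policies, wildcard resource names and per-access audit records can be sketched as follows. This Python example is an illustration under stated assumptions, not the Argus policy engine: the `Policy` class, the `authorize` function, the default-deny rule and the shell-style wildcard matching from `fnmatch` are all choices made here. The audit record carries the four fields the slide lists.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Policy:
    resource_type: str       # e.g. "hdfs", "hive" or "hbase"
    pattern: str             # resource name, may contain wildcards such as "*"
    users: frozenset         # users the policy applies to
    permissions: frozenset   # e.g. {"read", "write"} or {"select"}


def authorize(policies, audit_log, user, ip, resource_type, resource, permission):
    """Grant access if any policy matches; deny otherwise. Every decision is audited."""
    granted = any(
        policy.resource_type == resource_type
        and fnmatchcase(resource, policy.pattern)
        and user in policy.users
        and permission in policy.permissions
        for policy in policies
    )
    # Audit fields from the slide: IP address, resource type/resource,
    # timestamp, and whether access was granted or denied.
    audit_log.append({
        "ip": ip,
        "user": user,
        "resource_type": resource_type,
        "resource": resource,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "granted": granted,
    })
    return granted


# Hypothetical policies for illustration.
POLICIES = [
    Policy("hdfs", "/finance/*", frozenset({"smith"}), frozenset({"read"})),
    Policy("hive", "sales.orders.*", frozenset({"jones"}), frozenset({"select"})),
]
```

Denied requests are logged as well as granted ones, which is what lets an auditor see attempted access and not only successful access.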
  • 23. Central Security Administration: Apache Argus
      - Delivers a ‘single pane of glass’ for the security administrator
      - Centralizes administration of security policy
      - Ensures consistent coverage across the entire Hadoop stack
  • 24. Set Up Authorization Policies
      [Screenshot callouts: file level access control, flexible definition; control permissions]
  • 25. Monitor through Auditing
  • 26. Authorization and Auditing with Argus
    [Architecture diagram; labels recovered from the slide:]
    – Argus Administration Portal, Argus Policy Server, Argus Audit Server
    – Enterprise Users, Legacy Tools, Integration API, RDBMS
    – Hadoop Components: Hadoop distributed file system (HDFS), HBase, Hive Server2, Knox, Falcon, Storm
    – Argus Agent (three instances) and Argus Agent* (three instances)
    – YARN: Data Operating System
    – * = Future integration
  • 27. Simplified Workflow - HDFS
    [Diagram: Argus Policy Manager, Argus Agent, Name Node, User Application, Audit Database; steps numbered 1 to 5, with three entry points sharing step 2]
    – Admin sets policies for HDFS files/folders
    – Users access HDFS data through an application
    – Data scientist runs a map reduce job
    – IT users access HDFS through the CLI
    – Namenode uses the Argus Agent for authorization
    – Audit logs pushed to DB
    – Namenode provides resource access to user/client
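The workflow on slide 27 (policy set by an admin, authorization check at the Namenode, audit record pushed to a database) can be sketched in a few lines. This is a minimal illustration only: the `Policy` record, the `authorize` function and the field names are hypothetical and are not the Argus agent's actual data model or API. The audit fields follow the list on slide 22 (user, resource, IP address, timestamp, granted or denied).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record: who may do what on which HDFS paths."""
    resource: str           # path pattern, wildcards allowed (e.g. /data/sales/*)
    users: frozenset        # user names the policy applies to
    groups: frozenset       # group names the policy applies to
    permissions: frozenset  # subset of {"read", "write", "execute"}


def authorize(policies, user, groups, path, permission, ip, audit_log):
    """Return True if any policy grants the request; always append an audit record."""
    granted = any(
        fnmatchcase(path, p.resource)
        and permission in p.permissions
        and (user in p.users or bool(p.groups & set(groups)))
        for p in policies
    )
    # Every decision is audited, whether access was granted or denied.
    audit_log.append({
        "user": user,
        "resource": path,
        "access": permission,
        "ip": ip,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "result": "granted" if granted else "denied",
    })
    return granted
```

In this sketch a request is denied unless some policy matches the path, the permission, and either the user or one of the user's groups. A real agent would also need policy refresh, caching and a fallback to native HDFS permissions, none of which is modeled here.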
  • 28. Simplified Workflow - Hive
    [Diagram: Argus Policy Manager, Argus Agent, Hive Server2, User Application, Audit Database; steps numbered 1 to 5, with two entry points sharing step 2]
    – Admin sets policies for Hive db/tables/columns
    – Users access Hive data using JDBC/ODBC
    – IT users access Hive via the beeline command tool
    – Hive authorizes with the Argus Agent
    – Audit logs pushed to DB
    – HiveServer2 provides data access to users
  • 29. Simplified Workflow - HBase
    [Diagram: Argus Policy Manager, Argus Agent, HBase Server, User Application, Audit Database; steps numbered 1 to 5, with three entry points sharing step 2]
    – Admin sets policies for HBase table/cf/column
    – Users access HBase data using the Java API
    – Data scientist runs a map reduce job
    – IT users access HBase via HBShell
    – HBase authorizes with the Argus Agent
    – Audit logs pushed to DB
    – HBase server provides data access to users
  • 30. Typical Flow – Add Authorization through Argus
    [Diagram: Beeline client, KDC, HiveServer2, Argus and HDFS; steps marked A, B, C]
    – Client gets service ticket (ST) for Hive
    – Use Hive ST, submit query
    – Hive gets Namenode (NN) service ticket
    – Hive creates map reduce using NN ST
  • 31. Typical Hadoop Security
    Strong Authentication + Cross-cutting Authorization + Perimeter Security
  • 32. What does Perimeter Security really mean?
    [Diagram labels: User, REST API, Firewall, Gateway, REST API, Hadoop Services]
    • Firewall required at perimeter (today)
    • Knox Gateway controls all Hadoop REST API access through the firewall
    • Hadoop cluster mostly unaffected
    • Firewall only allows connections through specific ports from the Knox host
  • 33. Why Knox?
    • Simplified Access
      – Kerberos encapsulation
      – Extends API reach
      – Single access point
      – Multi-cluster support
      – Single SSL certificate
    • Centralized Control
      – Central REST API auditing
      – Service-level authorization
      – Alternative to SSH “edge node”
    • Enterprise Integration
      – LDAP integration
      – Active Directory integration
      – SSO integration
      – Apache Shiro extensibility
      – Custom extensibility
    • Enhanced Security
      – Protect network details
      – Partial SSL for non-SSL services
      – WebApp vulnerability filter
  • 34. Current Hadoop Client Model
    • FileSystem and MapReduce Java APIs
    • HDFS, Pig, Hive and Oozie clients (that wrap the Java APIs)
    • Typical use of APIs is via an “Edge Node” that is “inside” the cluster
    • Users SSH to the Edge Node and execute API commands from a shell
    [Diagram: User, SSH, Edge Node, Hadoop]
  • 35. Hadoop REST APIs
    Service – API
    • WebHDFS – Supports HDFS user operations including reading files, writing to files, making directories, changing permissions and renaming.
    • WebHCat – Job control for MapReduce, Pig and Hive jobs, and HCatalog DDL commands.
    • Hive – Hive REST API operations
    • HBase – HBase REST API operations
    • Oozie – Job submission and management, and Oozie administration.
    • Useful for connecting to Hadoop from outside the cluster
    • When more client language flexibility is required – i.e. Java binding not an option
    • Challenges
      – Client must have knowledge of cluster topology
      – Required to open ports (and in some cases, on every host) outside the cluster
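To make the "client must have knowledge of cluster topology" challenge concrete, here is a small sketch that builds a direct WebHDFS URL. The `/webhdfs/v1/<path>?op=<OPERATION>` layout is the documented WebHDFS form. The helper name `webhdfs_url`, the host name, and the default port 50070 (the usual NameNode HTTP port in Hadoop 2.x) are assumptions for illustration. No network call is made.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=50070, scheme="http", **params):
    """Build a WebHDFS REST URL: scheme://host:port/webhdfs/v1/<path>?op=<OP>[&k=v...]."""
    query = urlencode({"op": op, **params})
    # quote() keeps "/" by default, so the HDFS path structure is preserved.
    return f"{scheme}://{host}:{port}/webhdfs/v1/{quote(path.lstrip('/'))}?{query}"


# The caller has to know which host runs the NameNode and which port is open:
print(webhdfs_url("namenode.example.com", "/user/guest", "LISTSTATUS"))
```

Note that the NameNode host and port appear directly in the URL, which is exactly the topology knowledge and open-port requirement the slide lists as challenges.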
  • 36. Knox Deployment with Hadoop Cluster
    [Deployment diagram; labels recovered from the slide:]
    – Application Tier, Web Tier, DMZ
    – LB, Knox, Hadoop CLIs
    – Switches
    – Master Nodes, Rack 1: NN, SNN
    – Slave Nodes, Rack 2 … Rack N: DN
  • 37. Hadoop REST API Security: Drill-Down
    [Diagram; labels recovered from the slide:]
    – REST Client; Enterprise Identity Provider (LDAP/AD)
    – Firewall (two), DMZ, LB, Knox Gateway (GW)
    – Edge Node / Hadoop CLIs
    – Protocols: HTTP, LDAP, RPC
    – Hadoop Cluster 1 and Hadoop Cluster 2, each with Masters and Slaves: NN, RM, Web, Oozie, HCat, DN, NM, HBase, HS2
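From the REST client's side, the drill-down above amounts to sending the request to the gateway instead of to a cluster host. The sketch below builds such a request without sending it. The `gateway/<topology>` path segment, port 8443 and HTTP Basic credentials reflect a default Knox setup as I understand it; the host name, the topology name `sandbox`, the credentials and both helper names are illustrative assumptions.

```python
import base64


def knox_webhdfs_url(gateway_host, topology, path, op, port=8443):
    """Build a WebHDFS URL that goes through the Knox gateway rather than the NameNode."""
    return (f"https://{gateway_host}:{port}/gateway/{topology}"
            f"/webhdfs/v1/{path.lstrip('/')}?op={op}")


def basic_auth_header(user, password):
    """Encode user id and password as an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


url = knox_webhdfs_url("knox.example.com", "sandbox", "/user/guest", "LISTSTATUS")
headers = basic_auth_header("guest", "guest-password")
print(url)
```

The client only needs the gateway address and a topology name. Which NameNode serves the request, and the Kerberos exchange behind the gateway, stay hidden from it.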
  • 38. OpenLDAP Configuration
    • In sandbox.xml:
        <param>
          <name>main.ldapRealm</name>
          <value>org.apache.shiro.realm.ldap.JndiLdapRealm</value>
        </param>
        <param>
          <name>main.ldapRealm.userDnTemplate</name>
          <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
        </param>
        <param>
          <name>main.ldapRealm.contextFactory.url</name>
          <value>ldap://localhost:33389</value>
        </param>
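The `userDnTemplate` value on slide 38 contains a `{0}` placeholder, which the Shiro LDAP realm replaces with the login name to form the distinguished name it binds as. A minimal sketch of that substitution follows. The function name `user_dn` is hypothetical, and no escaping of special DN characters is attempted.

```python
USER_DN_TEMPLATE = "uid={0},ou=people,dc=hadoop,dc=apache,dc=org"  # value from the slide


def user_dn(username, template=USER_DN_TEMPLATE):
    """Substitute the login name for the {0} placeholder in a user DN template."""
    if "{0}" not in template:
        raise ValueError("template must contain the {0} placeholder")
    return template.replace("{0}", username)


print(user_dn("guest"))  # uid=guest,ou=people,dc=hadoop,dc=apache,dc=org
```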
  • 39. Service-Level Authorization Configuration
    • In <cluster.xml>:
        <provider>
          <role>authorization</role>
          <name>AclsAuthz</name>
          <enabled>true</enabled>
          <param>
            <name>webhdfs.acl.mode</name>
            <value>OR</value>
          </param>
          <param>
            <name>webhdfs.acl</name>
            <!-- Format: user(s);groups;ipaddress -->
            <value>guest;*;*</value>
          </param>
          <param>
            <name>webhcat.acl</name>
            <value>hdfs;admin;127.0.0.2,127.0.0.3</value>
          </param>
        </provider>
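The `user(s);groups;ipaddress` ACL format from slide 39 is easy to misread, so here is a hedged sketch of how such a value could be parsed and evaluated. It is not Knox's implementation. The names `Acl`, `parse_acl` and `is_allowed` are hypothetical, and the mode semantics are an assumption: in `AND` mode every field must match (with `*` matching anything), while in `OR` mode a `*` field is ignored and any one of the remaining specific fields is enough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    users: frozenset
    groups: frozenset
    ips: frozenset


def parse_acl(value):
    """Parse 'user(s);groups;ipaddress', each field a comma-separated list or '*'."""
    parts = value.split(";")
    if len(parts) != 3:
        raise ValueError("expected three ';'-separated fields: users;groups;ips")
    return Acl(*(frozenset(x.strip() for x in p.split(",") if x.strip()) for p in parts))


def is_allowed(acl, user, groups, ip, mode="AND"):
    """Evaluate a request against an ACL under the assumed AND/OR semantics."""
    fields = [(acl.users, {user}), (acl.groups, set(groups)), (acl.ips, {ip})]
    if mode == "AND":
        # Every field must match; a wildcard matches anything.
        return all("*" in allowed or bool(allowed & actual) for allowed, actual in fields)
    if mode == "OR":
        # Assumption: wildcard fields are ignored; one specific match is enough.
        specific = [(a, b) for a, b in fields if "*" not in a]
        return not specific or any(bool(a & b) for a, b in specific)
    raise ValueError(f"unknown ACL mode: {mode}")


webhcat = parse_acl("hdfs;admin;127.0.0.2,127.0.0.3")
print(is_allowed(webhcat, "hdfs", ["admin"], "127.0.0.2"))        # all three fields match
print(is_allowed(webhcat, "bob", ["admin"], "10.0.0.1", "OR"))    # group alone matches
```

Under these assumptions, `guest;*;*` in `OR` mode restricts access to the user `guest`, since the two wildcard fields do not count as matches. Check the gateway's own documentation before relying on that reading.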
  • 40. Typical Flow – Firewall, Route through Knox Gateway
    [Diagram: Beeline client, Knox, KDC, HiveServer2, Argus and HDFS; steps marked A, B, C]
    – Original request w/user id/password
    – Knox gets service ticket (ST) for Hive
    – Knox runs as proxy user using Hive ST
    – Use Hive ST, submit query
    – Hive gets Namenode (NN) service ticket
    – Hive creates map reduce using NN ST
    – Client gets query result
  • 41. Optionally - Add Wire and File Encryption
    [Same flow diagram as the previous slide, with connections labeled SSL and SASL]
    – Original request w/user id/password
    – Knox gets service ticket (ST) for Hive
    – Knox runs as proxy user using Hive ST
    – Use Hive ST, submit query
    – Hive gets Namenode (NN) service ticket
    – Hive creates map reduce using NN ST
    – Client gets query result
  • 42. Security Features (Hadoop with Argus)
    • Auditing
      – Configurable audit: yes, auditing can be controlled through policy
      – Resource access auditing: user id, request type, repository, access resource, IP address, timestamp, access granted/denied
      – Admin auditing: changes to policies, login sessions and agent monitoring
    • Data Protection
      – Over the wire: SASL for RPC, SSL for MR shuffle, WebHDFS
      – Data at rest: LUKS for volume encryption, partners
    • Manage
      – User/group mapping: local, sync with LDAP/AD, sync with Unix
      – Delegated administration: delegate policy administration to groups or users
  • 43. Data Protection
  • 44. Data Protection
    HDP allows you to apply data protection policy at three different layers across the Hadoop stack
    Layer – What? – How?
    • Storage – Encrypt data while it is at rest – 3rd party, future Hadoop improvements
    • Transmission – Encrypt data as it moves – Already in Hadoop
    • Upon Access – Apply restrictions when accessed – 3rd party
  • 45. Points of Communication
    [Diagram: Client, Hadoop Cluster and Nodes, with communication paths numbered 1 to 4]
    – WebHDFS
    – DataTransferProtocol
    – RPC
    – JDBC/ODBC
    – M/R Shuffle
  • 46. Data Transmission Protection in HDP 2.1
    • WebHDFS
      – Provides read/write access to HDFS
      – Optionally enable HTTPS
      – Authenticated using SPNEGO (Kerberos for HTTP) filter
      – SSL based wire encryption
    • RPC
      – Communications between NNs, DNs, etc. and Clients
      – SASL based wire encryption
      – DTP encryption with SASL
    • JDBC/ODBC
      – SSL based wire encryption
      – SASL based encryption also available
    • Shuffle
      – Mapper to Reducer over HTTP(S) with SSL
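As a rough illustration of where these wire-encryption options are switched on, the sketch below maps each channel from slide 46 to a Hadoop or Hive configuration property and renders it as a `*-site.xml` fragment. The property names are the Hadoop 2.x and HiveServer2 settings as I understand them and should be verified against the documentation for your release. The grouping by file and the `render_properties` helper are illustrative, and a working setup also needs keystores, truststores and Kerberos, which are not shown.

```python
from xml.sax.saxutils import escape

# Channel-to-property mapping; verify names and values against your Hadoop/Hive version.
WIRE_ENCRYPTION_SETTINGS = {
    "core-site.xml": {"hadoop.rpc.protection": "privacy"},    # SASL encryption for RPC
    "hdfs-site.xml": {
        "dfs.encrypt.data.transfer": "true",                   # DataTransferProtocol (DTP)
        "dfs.http.policy": "HTTPS_ONLY",                       # WebHDFS over HTTPS
    },
    "mapred-site.xml": {"mapreduce.shuffle.ssl.enabled": "true"},  # shuffle over HTTPS
    "hive-site.xml": {"hive.server2.use.SSL": "true"},         # JDBC/ODBC over SSL
}


def render_properties(props):
    """Render a name/value mapping as a Hadoop-style <configuration> fragment."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines += [
            "  <property>",
            f"    <name>{escape(name)}</name>",
            f"    <value>{escape(value)}</value>",
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines)


for filename, props in WIRE_ENCRYPTION_SETTINGS.items():
    print(f"# {filename}")
    print(render_properties(props))
```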
  • 47. Data Storage Protection
    • Encrypt at the physical file system level (e.g. dm-crypt)
    • Encrypt via custom HDFS “compression” codec
    • Encrypt at application level (including security service/device)
    [Diagram: ETL App, Security Service (e.g. Voltage) with ENCRYPT and DECRYPT, and HDFS; sample values ABC, DEF, 1a3d]
  • 48. Current Open Source Initiatives
    • HDFS Encryption
      – Transparent encryption of data at rest in HDFS via encryption zones; being worked on in the community
      – Dependency on Key Management Server and Keyshell
    • Hive Column Level Encryption
    • HBase Column Level Encryption
      – Transparent column encryption, needs more testing/validation
    • Key Management Server
    • Key Provider API
    • Command line Key Operations
  • 49. And remember….