6 ways to exploit Hive
– and what to do about it
Brock Noland | Software Engineer, Cloudera
January 23, 2013

Outline
• Introduction
• Hadoop security primer
    • Authentication
    • Authorization
• Security options
    • Default
    • Kerberos with Impersonation
    • Kerberos with Sentry
• Demo
Introduction
• Tonight's focus is SQL-on-Hadoop
• Vast majority of Hadoop users use Hive or Cloudera Impala
• Data warehouse offload is the most common use case
• Data warehouse offload is a two-step process
    1. Automatic transformations moved to Hadoop
    2. Data analysts given query access
Data warehouse use case
[Diagram labels: Online Database, Hadoop, Data Warehouse]
Authentication
• Authentication is who you are
• Hadoop models
    • Default - “trusted network”
    • Strong - Kerberos
Default Authentication – trusted network
• Default security mechanism
• Hadoop client uses local username
• Used in
    • POCs
    • Startups
    • Demos
    • Pre-prod environments
Default Authentication – trusted network
Client Host
$ whoami
brock
$ cat a.txt
some data
$ hadoop fs -put a.txt .
Sent from Client Host to Hadoop
User: brock
File: a.txt
Contents: some data
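The weakness of the trusted-network model can be sketched in a few lines of Python. This is a toy model rather than Hadoop code: `TrustingNameNode` and `put` are hypothetical names, and the claimed user is passed as a plain argument to stand in for the local username that the client reports.

```python
from dataclasses import dataclass, field


@dataclass
class TrustingNameNode:
    """Toy stand-in for a cluster using the default "trusted network" model."""
    owners: dict = field(default_factory=dict)  # path -> owner recorded by the server

    def put(self, claimed_user: str, path: str) -> str:
        # No credential is checked: the server simply believes the client.
        self.owners[path] = claimed_user
        return claimed_user


namenode = TrustingNameNode()

# The honest client reports the local username, as on the slide.
namenode.put("brock", "/user/brock/a.txt")

# Nothing stops another client from reporting a different name.
namenode.put("hdfs", "/tmp/owned-by-superuser.txt")

print(namenode.owners)
```

Because the identity is only as trustworthy as the client host, this model suits the POC and demo settings listed earlier, not shared production data.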
Strong Authentication – Kerberos
• Hadoop is secured with Kerberos
    • Provides mutual authentication
    • Protects against eavesdropping and replay attacks
• Every user and service has a Kerberos “principal”
    • Service: impala/hostname@MYCOMPANY.COM
    • User: brock@MYCOMPANY.COM
• Credentials
    • Service: keytabs
    • User: password
Strong Authentication – Kerberos
Client Host
$ whoami
brock
$ kinit
Password: *******
$ cat a.txt
some data
$ hadoop fs -put a.txt .
Sent from Client Host to Hadoop
User: brock
<kerberos ticket>
<encrypted data> *
* RPC Encryption must be enabled
Strong Authentication – Kerberos
• Keytab
    • Encrypted key for servers (similar to a “password”)
    • Generated by a server such as MIT Kerberos or Active Directory
Hive Server 2 and Oozie
[Diagram labels: Beeline (Hive CLI), Tableau, JDBC, Hive Server 2 (HS2), Oozie CLI, Control-M, Oozie, Hadoop]
Strong Authentication – Kerberos
• Impersonation
    • Services such as Hive Server 2 impersonate users
    • Data loaded by “joe” via HS2 is owned by “joe”
    • Oozie jobs submitted by “brock” are run as “brock”
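A rough Python sketch of the impersonation idea follows. It is a model only: the `ALLOWED_PROXIES` allow-list, the function names, and the table path are hypothetical, chosen to show that the service authenticates as itself while the resulting data is attributed to the end user.

```python
# Hypothetical allow-list: which service accounts may act on behalf of end users.
ALLOWED_PROXIES = {"hive", "oozie"}


def effective_user(service: str, end_user: str) -> str:
    """Return the identity that Hadoop-side actions are attributed to."""
    if service not in ALLOWED_PROXIES:
        raise PermissionError(f"{service} may not impersonate {end_user}")
    # The service authenticated as itself, but the work runs as the end user.
    return end_user


def load_data(service: str, end_user: str, path: str, owners: dict) -> None:
    """Record a file as owned by the impersonated user, not by the service."""
    owners[path] = effective_user(service, end_user)


owners = {}
load_data("hive", "joe", "/user/hive/warehouse/joe_table", owners)
print(owners)  # the table directory is attributed to joe, not to hive
```

Since files end up owned by individual users, access to them is then governed by ordinary HDFS file permissions.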
Authorization
• HDFS permissions
    • Unix style
    • Read/Write/Execute for Owner/Group/Other
    • Coarse grained
• Other Hadoop components have authorization
    • MapReduce: who can use which job queues
    • HBase: table ACLs
HDFS Permissions
$ hadoop fs -ls file
-rw-r-----   1 analyst1 analysts   2244 2014-01-19 12:15 file
• Permissions
    • Unix style permissions
    • Read/Write/Execute
    • Owner/Group/Other
• Owner
    • One and only one owner
• Group
    • One and only one group
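The owner/group/other check can be sketched as follows. This is a simplified model of Unix-style permission evaluation, not HDFS source code: it ignores the superuser, and the users `analyst2` and `intern1` and the group `interns` are made up for the example.

```python
from dataclasses import dataclass

READ, WRITE, EXECUTE = 4, 2, 1  # same bit values as Unix rwx


@dataclass(frozen=True)
class Inode:
    owner: str
    group: str
    mode: int  # e.g. 0o640 for -rw-r-----


def is_allowed(inode: Inode, user: str, groups: set, wanted: int) -> bool:
    """Pick exactly one permission class (owner, else group, else other) and test it."""
    if user == inode.owner:
        bits = (inode.mode >> 6) & 7
    elif inode.group in groups:
        bits = (inode.mode >> 3) & 7
    else:
        bits = inode.mode & 7
    return (bits & wanted) == wanted


# The file from the listing: -rw-r----- analyst1 analysts
f = Inode(owner="analyst1", group="analysts", mode=0o640)

print(is_allowed(f, "analyst1", {"analyst1"}, READ | WRITE))  # owner class
print(is_allowed(f, "analyst2", {"analysts"}, READ))          # group class, read
print(is_allowed(f, "analyst2", {"analysts"}, WRITE))         # group class, write
print(is_allowed(f, "intern1", {"interns"}, READ))            # other class
```

With one owner and one group per file, the only way to admit a second set of readers is to change the group or open the "other" class, which is why the model is described as coarse grained.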
Back to our use case
• Scenario facts
    • ETL offload is a success
    • Data warehouse is expensive and at capacity
    • Same data is in Hadoop
• Next step
    • End users start using Hadoop to augment the DW
    • Security becomes primary concern
End users need to share data
• Unlike automated ETL jobs, end users want to share data with peers
• Must manage HDFS permissions manually
• Each file has a single group
• End result is users set permissions to world readable/writeable
Hive: Security holes
-- Registers arbitrary Java code as a function
CREATE TEMPORARY FUNCTION custom_udf
  AS 'com.mycompany.MaliciousClass';
-- Runs an arbitrary script over query rows
SELECT TRANSFORM(stuff)
  USING 'malicious-script.pl'
  AS thing1, thing;
-- Points a table at any path
CREATE EXTERNAL TABLE external_table(column1 string)
  LOCATION '/path/to/any/table';
Hive: Security holes
-- Loads an arbitrary Java class as the table's SerDe
CREATE TABLE test (c1 string)
  ROW FORMAT SERDE 'com.mycompany.MaliciousClass';
-- Runs arbitrary scripts as the map and reduce steps
FROM (
  FROM t1
  MAP t1.c1
  USING 'malicious-script1.pl'
  CLUSTER BY key) map_output
INSERT OVERWRITE TABLE t2
  REDUCE t2.c1
  USING 'malicious-script2.pl'
  AS c2;
Default: Authorization
• Hive ships with an “advisory” authorization system
    • All users see all databases/tables/columns
    • Does not fix any security holes
    • Users grant themselves permissions
Kerberos with impersonation: Sharing data
The user “manager1” wants to share the table “manager1_table” with senior analysts but not junior analysts.
# hadoop fs -ls -R /user/hive/warehouse
drwxr-x--T   - analyst1   analyst1     0 analyst1_table
drwxr-x--T   - jranalyst1 jranalyst1   0 jranalyst1_table
drwxr-x--T   - manager1   manager1     0 manager1_table
Kerberos with impersonation: Sharing data
IT must create a group
# groupadd senioranalysts

Then add the appropriate members to the group
# usermod -G analyst,senioranalysts analyst1
# usermod -G management,analyst,senioranalysts manager1
Kerberos with impersonation: Sharing data
Then “manager1” can manually change the group on the table directory
$ hadoop fs -chgrp -R senioranalysts …/warehouse/manager1_table
$ hadoop fs -ls /user/hive/warehouse/
Found 3 items
drwxr-x--T   - analyst1   analyst1         0 analyst1_table
drwxr-x--T   - jranalyst1 jranalyst1       0 jranalyst1_table
drwxr-x--T   - manager1   senioranalysts   0 manager1_table
Kerberos with impersonation: Sharing data
Now any senior-level analyst can query the data
$ whoami
analyst1
$ beeline ...
Connected to: Hive (version 0.10.0)
0: jdbc:hive2://localhost:10000/default> select count(*) from manager1_table;
+------------+
| count(*)   |
+------------+
| 47         |
+------------+
Kerberos with impersonation: Sharing data
Junior analysts cannot query the data:
$ whoami
jranalyst1
$ beeline ...
Connected to: Hive (version 0.10.0)
0: jdbc:hive2://localhost:10000/default> select * from manager1_table;
Error: java.io.IOException: org.apache.hadoop.security.AccessControlException: Permission denied: user=jranalyst1, access=READ_EXECUTE, inode="/user/hive/warehouse/manager1_table":manager1:senioranalysts:drwxr-x--T
Kerberos with impersonation: Sharing data

What happens in the real world?

Kerberos with impersonation: Sharing data
Table “manager1_table” is owned by user/group “manager1”
$ hadoop fs -ls /user/hive/warehouse/
Found 3 items
drwxr-x--T   - analyst1   analyst1     0 analyst1_table
drwxr-x--T   - jranalyst1 jranalyst1   0 jranalyst1_table
drwxr-x--T   - manager1   manager1     0 manager1_table
Kerberos with impersonation: Sharing data
User “manager1” makes “manager1_table” world readable/writable
$ hadoop fs -chmod -R 777 /user/hive/warehouse/manager1_table
$ hadoop fs -ls /user/hive/warehouse/
Found 3 items
drwxr-x--T   - analyst1   analyst1     0 analyst1_table
drwxr-x--T   - jranalyst1 jranalyst1   0 jranalyst1_table
drwxrwxrwt   - manager1   manager1     0 manager1_table
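The mode strings in these listings can be decoded with Python's standard `stat` module. The octal values below are inferred from the listing strings rather than stated on the slides; the trailing `T` or `t` is the sticky bit, upper case when "other" lacks execute and lower case when it has it. `listing_mode` is a helper name made up for this sketch.

```python
import stat


def listing_mode(octal_mode: int) -> str:
    """Render a directory's permission bits the way an ls-style listing prints them."""
    return stat.filemode(stat.S_IFDIR | octal_mode)


before = listing_mode(0o1750)  # owner rwx, group r-x, other nothing, sticky bit set
after = listing_mode(0o1777)   # rwx for every class, sticky bit set

print(before)
print(after)
```

Moving from the first string to the second grants read, write, and execute to the "other" class, which covers every user on the cluster, junior analysts included.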
Kerberos with impersonation: Summary
• Securing Hive with Kerberos makes Hive unusable for DW offload
    • Manual file permission management
    • End state is world writable/readable
    • No ability to restrict access to columns or rows
    • All users see all databases/tables/columns
Fine Grained Security: Apache Sentry
Authorization module for Hive, Search, & Impala
• Unlocks Key RBAC Requirements
    • Secure, fine-grained, role-based authorization
    • Multi-tenant administration
• Open Source
    • Apache Incubator project
• Ecosystem Support
    • Apache Solr, HiveServer2, & Impala 1.1+
Key Benefits of Sentry
• Store Sensitive Data in Hadoop
• Extend Hadoop to More Users
• Comply with Regulations
Key Capabilities of Sentry
• Fine-Grained Authorization
    • Specify security for SERVERS, DATABASES, TABLES & VIEWS
• Role-Based Authorization
    • SELECT privilege on views & tables
    • INSERT privilege on tables
    • ALL privilege on the server, databases, tables & views
    • ALL privilege is needed to create/modify schema
• Multi-Tenant Administration
    • Separate policies for each database/schema
    • Can be maintained by separate admins
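The role-based model can be illustrated with a small Python sketch. It is not Sentry's API or policy format: the group names, role names, server name `server1`, and the tuple layout are all assumptions chosen to mirror the server/database/table scopes and the SELECT/INSERT/ALL privileges listed here.

```python
# Hypothetical policy: groups map to roles, roles map to privileges.
# A privilege is (server, database, table, action); "*" matches anything.
GROUP_ROLES = {
    "senioranalysts": {"senior_analyst_role"},
    "jranalysts": {"junior_analyst_role"},
    "admins": {"admin_role"},
}
ROLE_PRIVILEGES = {
    "senior_analyst_role": {("server1", "default", "manager1_table", "select")},
    "junior_analyst_role": {("server1", "default", "jranalyst1_table", "select"),
                            ("server1", "default", "jranalyst1_table", "insert")},
    "admin_role": {("server1", "*", "*", "all")},
}


def implies(granted: tuple, requested: tuple) -> bool:
    """A grant covers a request if each scope matches and the action is the same or ALL."""
    *g_scope, g_action = granted
    *r_scope, r_action = requested
    scope_ok = all(g in ("*", r) for g, r in zip(g_scope, r_scope))
    return scope_ok and g_action in ("all", r_action)


def is_authorized(groups: set, requested: tuple) -> bool:
    """Resolve groups to roles, roles to grants, then look for a covering grant."""
    roles = set().union(*(GROUP_ROLES.get(g, set()) for g in groups))
    grants = set().union(*(ROLE_PRIVILEGES.get(r, set()) for r in roles))
    return any(implies(g, requested) for g in grants)


query = ("server1", "default", "manager1_table", "select")
print(is_authorized({"senioranalysts"}, query))
print(is_authorized({"jranalysts"}, query))
print(is_authorized({"admins"}, ("server1", "default", "manager1_table", "insert")))
```

In this model, sharing a table with senior analysts means adding one grant to a role; no file permissions change and nothing becomes world readable.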
Sentry Architecture
[Diagram labels]
• Components: Impala, HiveServer2, SOLR, Pig
• Binding Layer: Impala, Hive, Search
• Authorization Provider: Policy Engine, Policy Provider
• Policy Provider backends: File (Local FS/HDFS), Database, …
Query Execution Flow
[Diagram: SQL passes through four stages; labels at the output: MR, Query]
1. Parse: Validate SQL grammar
2. Build: Construct statement tree
3. Check: Validate statement objects
    • First check: Authorization (Sentry)
4. Plan: Forward to execution planner
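The ordering of the stages can be sketched in Python. The parser here is a deliberately crude stand-in that only understands `select ... from <table>` and `insert into <table>`, and every function name and the `GRANTS` table are hypothetical; the point is only that the authorization check sits between building the statement and planning it.

```python
def parse(sql: str) -> list:
    """Stand-in for grammar validation: split into tokens, reject empty input."""
    tokens = sql.strip().rstrip(";").split()
    if not tokens:
        raise ValueError("empty statement")
    return tokens


def build(tokens: list) -> dict:
    """Stand-in for the statement tree: record the action and the table touched."""
    verb = tokens[0].lower()
    keyword = "from" if verb == "select" else "into"
    lowered = [t.lower() for t in tokens]
    table = tokens[lowered.index(keyword) + 1]
    return {"action": verb, "table": table}


def check(statement: dict, user: str, grants: dict) -> None:
    """Authorization comes first: fail before any plan is produced."""
    if (statement["table"], statement["action"]) not in grants.get(user, set()):
        raise PermissionError(f"{user} may not {statement['action']} {statement['table']}")


def plan(statement: dict) -> str:
    """Stand-in for the execution planner."""
    return f"PLAN({statement['action']} {statement['table']})"


def execute(sql: str, user: str, grants: dict) -> str:
    statement = build(parse(sql))
    check(statement, user, grants)  # denied queries never reach the planner
    return plan(statement)


GRANTS = {"analyst1": {("manager1_table", "select")}}
print(execute("select count(*) from manager1_table;", "analyst1", GRANTS))
```

A user without a matching grant gets a `PermissionError` from `check`, so no plan is ever produced for a denied statement.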
Click to edit Master title style

39

More Related Content

What's hot

Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
Shravan (Sean) Pabba
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache Knox
Vinay Shukla
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster Access
Cloudera, Inc.
 
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Cloudera, Inc.
 
Hadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyHadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happy
Anurag Shrivastava
 
Hadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowHadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowDataWorks Summit
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
Shivaji Dutta
 
Hadoop security overview_hit2012_1117rev
Hadoop security overview_hit2012_1117revHadoop security overview_hit2012_1117rev
Hadoop security overview_hit2012_1117rev
Jason Shih
 
Big Data Security with Hadoop
Big Data Security with HadoopBig Data Security with Hadoop
Big Data Security with Hadoop
Cloudera, Inc.
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big Data
Rommel Garcia
 
Comprehensive Hadoop Security for the Enterprise | Part I | Compliance Ready ...
Comprehensive Hadoop Security for the Enterprise | Part I | Compliance Ready ...Comprehensive Hadoop Security for the Enterprise | Part I | Compliance Ready ...
Comprehensive Hadoop Security for the Enterprise | Part I | Compliance Ready ...
Cloudera, Inc.
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security Architecture
Owen O'Malley
 
Hadoop and Data Access Security
Hadoop and Data Access SecurityHadoop and Data Access Security
Hadoop and Data Access Security
Cloudera, Inc.
 
Sentry - An Introduction
Sentry - An Introduction Sentry - An Introduction
Sentry - An Introduction
Alexander Alten
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, Future
Uwe Printz
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview
Hortonworks
 
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
Big Data Spain
 
Overview of HDFS Transparent Encryption
Overview of HDFS Transparent Encryption Overview of HDFS Transparent Encryption
Overview of HDFS Transparent Encryption
Cloudera, Inc.
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityDataWorks Summit
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayDataWorks Summit
 

What's hot (20)

Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache Knox
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster Access
 
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
 
Hadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyHadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happy
 
Hadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowHadoop Security Today and Tomorrow
Hadoop Security Today and Tomorrow
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
Hadoop security overview_hit2012_1117rev
Hadoop security overview_hit2012_1117revHadoop security overview_hit2012_1117rev
Hadoop security overview_hit2012_1117rev
 
Big Data Security with Hadoop
Big Data Security with HadoopBig Data Security with Hadoop
Big Data Security with Hadoop
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big Data
 
Comprehensive Hadoop Security for the Enterprise | Part I | Compliance Ready ...
Comprehensive Hadoop Security for the Enterprise | Part I | Compliance Ready ...Comprehensive Hadoop Security for the Enterprise | Part I | Compliance Ready ...
Comprehensive Hadoop Security for the Enterprise | Part I | Compliance Ready ...
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security Architecture
 
Hadoop and Data Access Security
Hadoop and Data Access SecurityHadoop and Data Access Security
Hadoop and Data Access Security
 
Sentry - An Introduction
Sentry - An Introduction Sentry - An Introduction
Sentry - An Introduction
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, Future
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview
 
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o...
 
Overview of HDFS Transparent Encryption
Overview of HDFS Transparent Encryption Overview of HDFS Transparent Encryption
Overview of HDFS Transparent Encryption
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
 

Viewers also liked

Introduction to Apache HBase Training
Introduction to Apache HBase TrainingIntroduction to Apache HBase Training
Introduction to Apache HBase Training
Cloudera, Inc.
 
Introduction to Cloudera's Administrator Training for Apache Hadoop
Introduction to Cloudera's Administrator Training for Apache HadoopIntroduction to Cloudera's Administrator Training for Apache Hadoop
Introduction to Cloudera's Administrator Training for Apache Hadoop
Cloudera, Inc.
 
Securing Your Apache Spark Applications
Securing Your Apache Spark ApplicationsSecuring Your Apache Spark Applications
Securing Your Apache Spark Applications
Cloudera, Inc.
 
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudData Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Cloudera, Inc.
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Cloudera, Inc.
 
Project Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopProject Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for Hadoop
Cloudera, Inc.
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the Gate
Steve Loughran
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Steve Loughran
 
Administer Hadoop Cluster
Administer Hadoop ClusterAdminister Hadoop Cluster
Administer Hadoop Cluster
Edureka!
 
Introduction to Hadoop Developer Training Webinar
Introduction to Hadoop Developer Training WebinarIntroduction to Hadoop Developer Training Webinar
Introduction to Hadoop Developer Training Webinar
Cloudera, Inc.
 
Introduction to sentry
Introduction to sentryIntroduction to sentry
Introduction to sentry
mozillazg
 
One Hadoop, Multiple Clouds
One Hadoop, Multiple CloudsOne Hadoop, Multiple Clouds
One Hadoop, Multiple Clouds
Cloudera, Inc.
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
Saumitra Srivastav
 
Cloudera Showcase: SQL-on-Hadoop
Cloudera Showcase: SQL-on-HadoopCloudera Showcase: SQL-on-Hadoop
Cloudera Showcase: SQL-on-Hadoop
Cloudera, Inc.
 
Secure Hadoop Cluster With Kerberos
Secure Hadoop Cluster With KerberosSecure Hadoop Cluster With Kerberos
Secure Hadoop Cluster With Kerberos
Edureka!
 
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
Lucidworks
 
Confluent building a real-time streaming platform using kafka streams and k...
Confluent   building a real-time streaming platform using kafka streams and k...Confluent   building a real-time streaming platform using kafka streams and k...
Confluent building a real-time streaming platform using kafka streams and k...
Thomas Alex
 

Viewers also liked (18)

Introduction to Apache HBase Training
Introduction to Apache HBase TrainingIntroduction to Apache HBase Training
Introduction to Apache HBase Training
 
Introduction to Cloudera's Administrator Training for Apache Hadoop
Introduction to Cloudera's Administrator Training for Apache HadoopIntroduction to Cloudera's Administrator Training for Apache Hadoop
Introduction to Cloudera's Administrator Training for Apache Hadoop
 
Securing Your Apache Spark Applications
Securing Your Apache Spark ApplicationsSecuring Your Apache Spark Applications
Securing Your Apache Spark Applications
 
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudData Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 
Project Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopProject Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for Hadoop
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the Gate
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
 
Administer Hadoop Cluster
Administer Hadoop ClusterAdminister Hadoop Cluster
Administer Hadoop Cluster
 
Introduction to Hadoop Developer Training Webinar
Introduction to Hadoop Developer Training WebinarIntroduction to Hadoop Developer Training Webinar
Introduction to Hadoop Developer Training Webinar
 
Introduction to sentry
Introduction to sentryIntroduction to sentry
Introduction to sentry
 
One Hadoop, Multiple Clouds
One Hadoop, Multiple CloudsOne Hadoop, Multiple Clouds
One Hadoop, Multiple Clouds
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
Cloudera Showcase: SQL-on-Hadoop
Cloudera Showcase: SQL-on-HadoopCloudera Showcase: SQL-on-Hadoop
Cloudera Showcase: SQL-on-Hadoop
 
Secure Hadoop Cluster With Kerberos
Secure Hadoop Cluster With KerberosSecure Hadoop Cluster With Kerberos
Secure Hadoop Cluster With Kerberos
 
Hadoop admin
Hadoop adminHadoop admin
Hadoop admin
 
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
 
Confluent building a real-time streaming platform using kafka streams and k...
Confluent   building a real-time streaming platform using kafka streams and k...Confluent   building a real-time streaming platform using kafka streams and k...
Confluent building a real-time streaming platform using kafka streams and k...
 

Similar to Deploying Enterprise-grade Security for Hadoop

TriHUG 2/14: Apache Sentry
TriHUG 2/14: Apache SentryTriHUG 2/14: Apache Sentry
TriHUG 2/14: Apache Sentry
trihug
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
Hellmar Becker
 
[CONFidence 2016] Jakub Kałużny, Mateusz Olejarka - Big problems with big dat...
[CONFidence 2016] Jakub Kałużny, Mateusz Olejarka - Big problems with big dat...[CONFidence 2016] Jakub Kałużny, Mateusz Olejarka - Big problems with big dat...
[CONFidence 2016] Jakub Kałużny, Mateusz Olejarka - Big problems with big dat...
PROIDEA
 
Hops - Distributed metadata for Hadoop
Hops - Distributed metadata for HadoopHops - Distributed metadata for Hadoop
Hops - Distributed metadata for Hadoop
Jim Dowling
 
Secure Hadoop clusters on Windows platform
Secure Hadoop clusters on Windows platformSecure Hadoop clusters on Windows platform
Secure Hadoop clusters on Windows platform
Remus Rusanu
 
Containers and security
Containers and securityContainers and security
Containers and security
sriram_rajan
 
Hadoop ppt on the basics and architecture
Hadoop ppt on the basics and architectureHadoop ppt on the basics and architecture
Hadoop ppt on the basics and architecture
saipriyacoool
 
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....
Jeffrey Breen
 
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaBig Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Caserta
 
Security Threats to Hadoop: Data Leakage Attacks and Investigation
Security Threats to Hadoop: Data Leakage Attacks  and InvestigationSecurity Threats to Hadoop: Data Leakage Attacks  and Investigation
Security Threats to Hadoop: Data Leakage Attacks and Investigation
Kiran Gajbhiye
 
Introduction to firebidSQL 3.x
Introduction to firebidSQL 3.xIntroduction to firebidSQL 3.x
Introduction to firebidSQL 3.x
Fabio Codebue
 
Securing Hadoop in an Enterprise Context (v2)
Securing Hadoop in an Enterprise Context (v2)Securing Hadoop in an Enterprise Context (v2)
Securing Hadoop in an Enterprise Context (v2)
Hellmar Becker
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
DataWorks Summit/Hadoop Summit
 
PC = Personal Cloud (or how to use your development machine with Vagrant and ...
PC = Personal Cloud (or how to use your development machine with Vagrant and ...PC = Personal Cloud (or how to use your development machine with Vagrant and ...
PC = Personal Cloud (or how to use your development machine with Vagrant and ...
Codemotion
 
Tokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker SecurityTokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker Security
Phil Estes
 
Unraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production CloudUnraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production Cloud
Salman Baset
 
Achieving Infrastructure Portability with Chef
Achieving Infrastructure Portability with ChefAchieving Infrastructure Portability with Chef
Achieving Infrastructure Portability with Chef
Matt Ray
 
Cosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWARECosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWARE
Fernando Lopez Aguilar
 
Cosmos, Big Data GE Implementation
Cosmos, Big Data GE ImplementationCosmos, Big Data GE Implementation
Cosmos, Big Data GE ImplementationFIWARE
 

Similar to Deploying Enterprise-grade Security for Hadoop (20)

TriHUG 2/14: Apache Sentry
TriHUG 2/14: Apache SentryTriHUG 2/14: Apache Sentry
TriHUG 2/14: Apache Sentry
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
 
[CONFidence 2016] Jakub Kałużny, Mateusz Olejarka - Big problems with big dat...
[CONFidence 2016] Jakub Kałużny, Mateusz Olejarka - Big problems with big dat...[CONFidence 2016] Jakub Kałużny, Mateusz Olejarka - Big problems with big dat...
[CONFidence 2016] Jakub Kałużny, Mateusz Olejarka - Big problems with big dat...
 
Hops - Distributed metadata for Hadoop
Hops - Distributed metadata for HadoopHops - Distributed metadata for Hadoop
Hops - Distributed metadata for Hadoop
 
Secure Hadoop clusters on Windows platform
Secure Hadoop clusters on Windows platformSecure Hadoop clusters on Windows platform
Secure Hadoop clusters on Windows platform
 
Containers and security
Containers and securityContainers and security
Containers and security
 
Big data security
Big data securityBig data security
Big data security
 
Hadoop ppt on the basics and architecture
Hadoop ppt on the basics and architectureHadoop ppt on the basics and architecture
Hadoop ppt on the basics and architecture
 
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....
 
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaBig Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
 
Security Threats to Hadoop: Data Leakage Attacks and Investigation
Security Threats to Hadoop: Data Leakage Attacks  and InvestigationSecurity Threats to Hadoop: Data Leakage Attacks  and Investigation
Security Threats to Hadoop: Data Leakage Attacks and Investigation
 
Introduction to firebidSQL 3.x
Introduction to firebidSQL 3.xIntroduction to firebidSQL 3.x
Introduction to firebidSQL 3.x
 
Securing Hadoop in an Enterprise Context (v2)
Securing Hadoop in an Enterprise Context (v2)Securing Hadoop in an Enterprise Context (v2)
Securing Hadoop in an Enterprise Context (v2)
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
 
PC = Personal Cloud (or how to use your development machine with Vagrant and ...
PC = Personal Cloud (or how to use your development machine with Vagrant and ...PC = Personal Cloud (or how to use your development machine with Vagrant and ...
PC = Personal Cloud (or how to use your development machine with Vagrant and ...
 
Tokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker SecurityTokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker Security
 
Unraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production CloudUnraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production Cloud
 
Achieving Infrastructure Portability with Chef
Achieving Infrastructure Portability with ChefAchieving Infrastructure Portability with Chef
Achieving Infrastructure Portability with Chef
 
Cosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWARECosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWARE
 
Cosmos, Big Data GE Implementation
Cosmos, Big Data GE ImplementationCosmos, Big Data GE Implementation
Cosmos, Big Data GE Implementation
 

Deploying Enterprise-grade Security for Hadoop

  • 1. 6 ways to exploit Hive – and what to do about it. Brock Noland | Software Engineer, Cloudera. January 23, 2013
  • 2. Outline: Introduction; Hadoop security primer (Authentication, Authorization); Security options (Default, Kerberos with Impersonation, Kerberos with Sentry); Demo
  • 3. Introduction
      • Tonight's focus is SQL-on-Hadoop
      • Vast majority of Hadoop users use Hive or Cloudera Impala
      • Data warehouse offload is the most common use case
      • Data warehouse offload is a two-step process
          1. Automatic transformations moved to Hadoop
          2. Data analysts given query access
  • 4. Data warehouse use case (diagram labels: Online Database, Hadoop, Data Warehouse)
  • 5. Outline: Introduction; Hadoop Security Primer (Authentication, Authorization); Security options (Default, Kerberos with Impersonation, Kerberos with Sentry); Demo
  • 6. Authentication
      • Authentication is who you are
      • Hadoop models
          • Default - “trusted network”
          • Strong - Kerberos
  • 7. Default Authentication – trusted network
      • Default security mechanism
      • Hadoop client uses local username
      • Used in
          • POCs
          • Startups
          • Demos
          • Pre-prod environments
  • 8. Default Authentication – trusted network (diagram)
      • Client Host:
          $ whoami
          brock
          $ cat a.txt
          some data
          $ hadoop fs -put file .
      • Hadoop receives: User: brock; File: a.txt; Contents: some data
  • 9. Strong Authentication – Kerberos
      • Hadoop is secured with Kerberos
          • Provides mutual authentication
          • Protects against eavesdropping and replay attacks
      • Every user and service has a Kerberos “principal”
          • Service: impala/hostname@MYCOMPANY.COM
          • User: brock@MYCOMPANY.COM
      • Credentials
          • Service: keytabs
          • User: password
  • 10. Strong Authentication – Kerberos (diagram)
      • Client Host:
          $ whoami
          brock
          $ kinit
          Password: *******
          $ cat a.txt
          some data
          $ hadoop fs -put file .
      • Hadoop receives: User: brock; <kerberos ticket>; <encrypted data> *
      • * RPC Encryption must be enabled
  • 11. Strong Authentication – Kerberos
      • Keytab
          • Encrypted key for servers (similar to a “password”)
          • Generated by server such as MIT Kerberos or Active Directory
  • 12. Hive Server 2 and Oozie (diagram labels: Beeline (Hive CLI), Tableau, JDBC, Hive Server 2 (HS2), Oozie CLI, Control-M, Oozie, Hadoop)
  • 13. Strong Authentication – Kerberos
      • Impersonation
          • Services such as Hive Server 2 impersonate users
          • Data loaded by “joe” via HS2 is owned by “joe”
          • Oozie jobs submitted by “brock” are run as “brock”
  • 14. Authorization
      • HDFS permissions
          • Unix style
          • Read/Write/Execute for Owner/Group/Other
          • Coarse grained
      • Other Hadoop components have authorization
          • MapReduce: who can use which job queues
          • HBase: table ACLs
  • 15. HDFS Permissions
          $ hadoop fs -ls file
          -rw-r-----   1 analyst1 analysts   2244 2014-01-19 12:15 file
      • Permissions
          • Unix style permissions
          • Read/Write/Execute
          • Owner/Group/Other
      • Owner
          • One and only one owner
      • Group
          • One and only one group
  • 16. Back to our use case
      • Scenario facts
          • ETL offload is a success
          • Data warehouse is expensive and at capacity
          • Same data is in Hadoop
      • Next step
          • End users start using Hadoop to augment the DW
          • Security becomes primary concern
  • 17. End users need to share data
      • Unlike automated ETL jobs, end users want to share data with peers
      • Must manage HDFS permissions manually
      • Each file has a single group
      • End result is users set permissions to world readable/writeable
  • 18. Outline: Introduction; Hadoop Security Primer (Authentication, Authorization); Security options (Default, Kerberos with Impersonation, Kerberos with Sentry); Demo
  • 19. Hive: Security holes
          CREATE TEMPORARY FUNCTION custom_udf AS 'com.mycompany.MaliciousClass';
          SELECT TRANSFORM(stuff) USING 'malicious-script.pl' AS thing1, thing;
          CREATE EXTERNAL TABLE external_table(column1 string) LOCATION '/path/to/any/table';
  • 20. Hive: Security holes
          CREATE TABLE test (c1 string) ROW FORMAT SERDE 'com.mycompany.MaliciousClass';
          FROM (
            FROM t1
            MAP t1.c1 USING 'malicious-script1.pl'
            CLUSTER BY key) map_output
          INSERT OVERWRITE TABLE t2
            REDUCE t2.c1 USING 'malicious-script2.pl' AS c2;
  • 21. Default: Authorization
      • Hive ships with an “advisory” authorization system
          • All users see all databases/tables/columns
          • Does not fix any security holes
          • Users grant themselves permissions
  • 22. Outline: Introduction; Hadoop Security Primer (Authentication, Authorization); Security options (Default, Kerberos with Impersonation, Kerberos with Sentry); Demo
  • 23. Kerberos with impersonation: Sharing data
      The user “manager1” wants to share the table “manager1_table” with senior analysts but not junior analysts.
          # hadoop fs -ls -R /user/hive/warehouse
          drwxr-x--T   - analyst1   analyst1     0 analyst1_table
          drwxr-x--T   - jranalyst1 jranalyst1   0 jranalyst1_table
          drwxr-x--T   - manager1   manager1     0 manager1_table
  • 24. Kerberos with impersonation: Sharing data
      IT must create a group
          # groupadd senioranalysts
      Then add the appropriate members to the group
          # usermod -G analyst,senioranalysts analyst1
          # usermod -G management,analyst,senioranalysts manager1
  • 25. Kerberos with impersonation: Sharing data
      Then “manager1” can manually change the file permissions
          $ hadoop fs -chgrp -R senioranalysts …/warehouse/manager1_table
          $ hadoop fs -ls /user/hive/warehouse/
          Found 3 items
          drwxr-x--T   - analyst1   analyst1         0 analyst1_table
          drwxr-x--T   - jranalyst1 jranalyst1       0 jranalyst1_table
          drwxr-x--T   - manager1   senioranalysts   0 manager1_table
  • 26. Kerberos with impersonation: Sharing data
      Now any senior-level analyst can query the data
          $ whoami
          analyst1
          $ beeline
          ...
          Connected to: Hive (version 0.10.0)
          0: jdbc:hive2://localhost:10000/default> select count(*) from manager1_table;
          +------------+
          | count(*)   |
          +------------+
          | 47         |
          +------------+
  • 27. Kerberos with impersonation: Sharing data
      Junior analysts cannot query the data:
          $ whoami
          jranalyst1
          $ beeline
          ....
          Connected to: Hive (version 0.10.0)
          0: jdbc:hive2://localhost:10000/default> select * from manager1_table;
          Error: java.io.IOException: org.apache.hadoop.security.AccessControlException: Permission denied: user=jranalyst1, access=READ_EXECUTE, inode="/user/hive/warehouse/manager1_table":manager1:senioranalysts:drwxr-x--T
  • 28. Kerberos with impersonation: Sharing data
      What happens in the real world?
  • 29. Kerberos with impersonation: Sharing data
      Table “manager1_table” is owned by user/group “manager1”
          $ hadoop fs -ls /user/hive/warehouse/
          Found 3 items
          drwxr-x--T   - analyst1   analyst1     0 analyst1_table
          drwxr-x--T   - jranalyst1 jranalyst1   0 jranalyst1_table
          drwxr-x--T   - manager1   manager1     0 manager1_table
  • 30. Kerberos with impersonation: Sharing data
      User “manager1” makes “manager1_table” world readable/writable
          $ hadoop fs -chmod -R 777 /user/hive/warehouse/manager1_table
          $ hadoop fs -ls /user/hive/warehouse/
          Found 3 items
          drwxr-x--T   - analyst1   analyst1     0 analyst1_table
          drwxr-x--T   - jranalyst1 jranalyst1   0 jranalyst1_table
          drwxrwxrwt   - manager1   manager1     0 manager1_table
  • 31. Kerberos with impersonation: Summary
      • Securing Hive with Kerberos makes Hive unusable for DW offload
          • Manual file permission management
          • End state is world writable/readable
          • No ability to restrict access to columns or rows
          • All users see all databases/tables/columns
  • 32. Outline: Introduction; Hadoop Security Primer (Authentication, Authorization); Security options (Default, Kerberos with Impersonation, Kerberos with Sentry); Demo
  • 33. Fine Grained Security: Apache Sentry
      Authorization module for Hive, Search, & Impala
      • Unlocks Key RBAC Requirements
          • Secure, fine-grained, role-based authorization
          • Multi-tenant administration
      • Open Source
          • Apache Incubator project
      • Ecosystem Support
          • Apache SOLR, HiveServer2, & Impala 1.1+
  • 34. Key Benefits of Sentry
      • Store Sensitive Data in Hadoop
      • Extend Hadoop to More Users
      • Comply with Regulations
  • 35. Key Capabilities of Sentry
      • Fine-Grained Authorization
          • Specify security for SERVERS, DATABASES, TABLES & VIEWS
      • Role-Based Authorization
          • SELECT privilege on views & tables
          • INSERT privilege on tables
          • ALL privilege on the server, databases, tables & views
          • ALL privilege is needed to create/modify schema
      • Multi-Tenant Administration
          • Separate policies for each database/schema
          • Can be maintained by separate admins
  • 37. Query Execution Flow (diagram)
      • Stages, applied to the SQL query:
          • Parse: Validate SQL grammar
          • Build: Construct statement tree
          • Check: Validate statement objects (first check: authorization, via Sentry)
          • Plan: Forward to execution planner
      • Remaining diagram labels: MR, Query
  • 38. Outline: Introduction; Hadoop Security Primer (Authentication, Authorization); Security options (Default, Kerberos with Impersonation, Kerberos with Sentry); Demo
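
The HDFS permission behaviour walked through in slides 14 to 31 can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Hadoop's implementation: `Inode` and `can_access` are hypothetical names, the sticky bit and the superuser are ignored, and the group membership given to jranalyst1 is assumed for illustration. It shows why a single group per file pushes users toward 777.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inode:
    """Toy stand-in for an HDFS file or directory entry (hypothetical)."""
    path: str
    owner: str
    group: str   # exactly one group per inode, as on slide 15
    mode: str    # nine characters, e.g. "rwxr-x---" (sticky bit omitted)


def can_access(inode, user, groups, want):
    """Return True if `user`, a member of `groups`, holds permission `want`.

    `want` is "r", "w" or "x". Exactly one permission class is consulted:
    owner if the user owns the inode, else group if the inode's group is one
    of the user's groups, else other.
    """
    if user == inode.owner:
        bits = inode.mode[0:3]
    elif inode.group in groups:
        bits = inode.mode[3:6]
    else:
        bits = inode.mode[6:9]
    return want in bits


# State after the chgrp on slide 25.
table = Inode("/user/hive/warehouse/manager1_table",
              "manager1", "senioranalysts", "rwxr-x---")

# analyst1's groups come from slide 24; jranalyst1's are an assumption.
senior_can_read = can_access(table, "analyst1", {"analyst", "senioranalysts"}, "r")
junior_can_read = can_access(table, "jranalyst1", {"jranalyst1"}, "r")
print(senior_can_read, junior_can_read)  # True False

# The shortcut from slide 30: chmod 777 instead of asking IT for a group.
wide_open = Inode(table.path, "manager1", "manager1", "rwxrwxrwx")
junior_can_write = can_access(wide_open, "jranalyst1", {"jranalyst1"}, "w")
print(junior_can_write)  # True
```

Because each inode carries only one group, sharing with a second audience means either a new group created by IT or opening the "other" bits, which is the path of least resistance the slides describe.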

Editor's Notes

  1. Other aspects are confidentiality and audit.
  2. Many, many ways to execute arbitrary code: Hive was originally created by web companies that simply don’t care about security. In fact, we often run into pushback from the community when integrating security. In my presentation at the TC HUG I will explain in detail all the ways in which Hive is insecure. The point is that, by default, any user can execute any code they wish. Users grant themselves permissions: Users can query any data they please by granting themselves permissions. Zero metadata security: It is not possible to stop users from modifying or viewing any metadata.
  3. Manual file permission management: When users want to share tables and data with other users, it requires modifying file permissions. Can anyone guess what happens next? End state is world writable/readable: Users end up making data world writable and readable. No ability to restrict access to columns or rows: Users cannot be restricted to a subset of the data, so tables are copied simply to restrict access to data, which results in thousands of out-of-date tables with full read and write permissions.
  4. Role-Based Access Control (RBAC): For finer-grained access to data accessible via schema (that is, data structures described by the Apache Hive Metastore and utilized by computing engines like Hive and Impala, as well as collections and indices within Cloudera Search), Cloudera developed Apache Sentry, which offers a highly modular, role-based privilege model for this data and its given schema. (Cloudera donated Apache Sentry to the Apache Foundation in 2013.) Sentry governs access to each schema object in the Metastore via a set of privileges like SELECT and INSERT. The schema objects are common entities in data management, such as SERVER, DATABASE, TABLE, COLUMN, and URI, i.e. file location within HDFS. Cloudera Search has its own set of privileges, e.g. QUERY, and objects, e.g. COLLECTION.
     As with other RBAC systems that IT teams are already familiar with, Sentry provides for:
       • Hierarchies of objects, with permissions automatically inherited by objects that exist within a larger umbrella object
       • Rules containing a set of multiple object/permission pairs
       • Groups that can be granted one or more roles
       • Users that can be assigned to one or more groups
     Sentry is normally configured to deny access to services and data by default, so that users have limited rights until they are assigned to a group that has explicit access roles.
     Column-level Security, Row-level Security and Masked Access: Using the combination of Sentry-based permissions, SQL views, and User Defined Functions (UDFs), developers can gain a high degree of access control granularity for SQL computing engines through HiveServer2 and Impala, including:
       • Column-level security: To limit access to only particular columns of entire tables, users can access the data through a view, which either contains a subset of the columns in the table or has certain columns masked. For example, a view can filter a column to only the last four digits of a US Social Security number.
       • Row-level security: To limit access by particular values, views can employ CASE statements to control the rows to which a group of users has access. For example, a broker at a financial services firm may only be able to see data within her managed accounts.
  5. Impala metadata queries, e.g. “SHOW TABLES,” query the Hive Metastore directly and then query Sentry to filter the results before returning them.
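
As a rough illustration of notes 4 and 5, the user-to-group-to-role-to-privilege chain, the deny-by-default behaviour, and the filtering of “SHOW TABLES” results can be sketched as below. This is a minimal sketch under stated assumptions, not Sentry's API or policy format: every name here (`Privilege`, `GROUP_ROLES`, `ROLE_PRIVILEGES`, `is_allowed`, `show_tables`, the roles, the `sales` database and its tables) is hypothetical, and only table-level privileges are modelled.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Privilege(NamedTuple):
    """One object/permission pair (hypothetical representation)."""
    database: str
    table: str    # "*" means every table in the database (inherited downward)
    action: str   # "SELECT", "INSERT" or "ALL"


# Hypothetical policy: roles hold privileges, groups are granted roles.
ROLE_PRIVILEGES = {
    "analyst_role": {Privilege("sales", "orders", "SELECT")},
    "etl_role": {Privilege("sales", "*", "ALL")},
}
GROUP_ROLES = {
    "senioranalysts": {"analyst_role"},
    "etl": {"etl_role"},
}


def privileges_for(groups: Iterable[str]) -> Iterator[Privilege]:
    """Yield every privilege reachable from the user's groups via their roles."""
    for group in groups:
        for role in GROUP_ROLES.get(group, ()):
            yield from ROLE_PRIVILEGES.get(role, ())


def is_allowed(groups: Iterable[str], database: str, table: str, action: str) -> bool:
    """Deny by default; allow only if some privilege covers the request."""
    for p in privileges_for(groups):
        if p.database != database:
            continue
        if p.table not in ("*", table):
            continue
        if p.action in ("ALL", action):
            return True
    return False


def show_tables(groups: Iterable[str], database: str, metastore_tables: List[str]) -> List[str]:
    """Filter a metastore listing down to tables the user holds any privilege on."""
    groups = list(groups)
    return [
        t for t in metastore_tables
        if any(p.database == database and p.table in ("*", t)
               for p in privileges_for(groups))
    ]


tables = ["orders", "customers", "salaries"]
print(is_allowed({"senioranalysts"}, "sales", "orders", "SELECT"))  # True
print(is_allowed({"senioranalysts"}, "sales", "orders", "INSERT"))  # False
print(is_allowed({"jranalysts"}, "sales", "orders", "SELECT"))      # False
print(show_tables({"senioranalysts"}, "sales", tables))             # ['orders']
print(show_tables({"etl"}, "sales", tables))                        # ['orders', 'customers', 'salaries']
```

A group with no role, such as the assumed `jranalysts`, sees nothing and can query nothing, which mirrors the deny-by-default configuration described in note 4.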