Curb your insecurity with HDP - Tips for a Secure Cluster

Curb Your Insecurity with HDP: Tips for a Secure Cluster (with Spark too)

Ancil McBarnett
Senior Solutions Engineer – Security & Governance

Future of Data Meetup – New York
June 2nd, 2016

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Hadoop Security in 4 Steps


Agenda
•  Introduction to Hadoop Security
–  The 4 Steps to Hadoop Security
•  Authentication with Kerberos
–  Integration with LDAP
•  Authorization with Apache Ranger
–  Hive, HDFS, YARN
•  REST API Security with Apache Knox
–  WebHDFS
–  Hive
•  Encrypt the Data / Data Protection
–  Transparent Data Encryption and KMS


Comprehensive Approach to Security

In order to protect any data system you must implement the following:

•  Administration: Central management and consistent security
–  How do I set policy across the entire cluster?
•  Authentication: Authenticate users and systems
–  Who am I/prove it?
•  Authorization: Provision access to data
–  What can I do?
•  Audit: Maintain a record of data access
–  What did I do?
•  Data Protection: Protect data at rest and in motion
–  How can I encrypt at rest and over the wire?

HDP Security: Comprehensive, Complete, Extensible

Security in HDP is the most comprehensive, complete and extensible for Hadoop

•  Administration: Central management and consistent security
–  Single administrative console to set policy across the entire cluster: Apache Ranger
•  Authentication: Authenticate users and systems
–  Authentication for perimeter and cluster; integrates with existing Active Directory and LDAP solutions: Kerberos | Apache Knox
•  Authorization: Provision access to data
–  Consistent authorization controls across all Apache components within HDP: Apache Ranger
•  Audit: Maintain a record of data access
–  Record of data access events across all components that is consistent and accessible: Apache Ranger
•  Data Protection: Protect data at rest and in motion
–  Encrypts data in motion and data at rest; refer to partner encryption solutions for broader needs: HDFS TDE with Ranger KMS


Security: Rings of Defense
Perimeter Level Security
•  Network Security (e.g. Firewalls)
•  Apache Knox (e.g. Gateways)

Authentication
•  Kerberos

OS Security

Authorization
•  MR ACLs
•  HDFS Permissions
•  HDFS ACLs
•  HiveATZ-NG
•  HBase ACLs
•  Accumulo Label Security


Authentication with Kerberos


Security Without Kerberos


Configure Kerberos – Ambari Wizard


Security With Kerberos


Apache Ranger


Apache Ranger


Centralized Security with Ranger

Centralized Platform
•  Centralized platform to define, administer and manage security policies consistently
•  Define security policy once and apply it to all the applicable components across the stack

Fine-Grained Security Definition
•  Administer security for:
–  Database
–  Table
–  Column
–  LDAP Groups
–  Specific Users

Deep Visibility
•  Administrators have complete visibility into the security administration process
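
To make "database, table, column, LDAP groups, specific users" concrete, here is a minimal Python sketch of what a fine-grained Hive policy might look like as data. The field names reflect my understanding of Ranger's JSON policy model and should be checked against the Ranger REST API documentation for your version; the service name, policy name, database, table, columns and group are all hypothetical.

```python
import json


def hive_policy(service, name, database, table, columns,
                groups=(), users=(), access_types=("select",)):
    """Build a Ranger-style Hive policy as a plain dict (assumed field names)."""
    return {
        "service": service,                      # Ranger service (repository) name
        "name": name,                            # human-readable policy name
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
            "column": {"values": list(columns)},
        },
        "policyItems": [{
            "groups": list(groups),              # e.g. LDAP groups
            "users": list(users),                # e.g. specific users
            "accesses": [{"type": t, "isAllowed": True} for t in access_types],
        }],
    }


if __name__ == "__main__":
    # Hypothetical example: analysts may SELECT two columns of sales.orders.
    policy = hive_policy("cl1_hive", "analysts-read-orders", "sales", "orders",
                         ["order_id", "amount"], groups=["analysts"])
    print(json.dumps(policy, indent=2))
```

The sketch only builds the policy document; submitting it to a Ranger Admin server, and the authentication that requires, is left out on purpose.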


HDFS File Security


Hive Database and Table Security


Authorization and Audit
Authorization: fine-grain access control
•  HDFS – Folder, File
•  Hive – Database, Table, Column
•  HBase – Table, Column Family, Column
•  Storm, Knox and more

Audit: extensive user access auditing in HDFS, Hive and HBase
•  IP Address
•  Resource type/resource
•  Timestamp
•  Access granted or denied

Control access into system
Flexibility in defining policies


REST API Security with Apache Knox


Authentication: API Security with Knox

Incubated and led by Hortonworks, Apache Knox extends the reach of Hadoop REST API without Kerberos complexities

Single, simple point of access for a cluster
•  Kerberos encapsulation
•  Single Hadoop access point
•  REST API hierarchy
•  Consolidated API calls
•  Multi-cluster support

Central controls ensure consistency across one or more clusters
•  Eliminates SSH "edge node"
•  Central API management
•  Central audit control
•  Service level authorization

Integrated with existing systems to simplify identity maintenance
•  SSO integration: Siteminder and OAM
•  LDAP and AD integration


Extend Hadoop API reach with Knox

[Diagram. Labels: Application Tier (App A, App B, App C, App N); Data Ingest; ETL; Admin/Operators; Data Operator; Business User; Hadoop Admin; Load Balancer; Knox; Bastion Node; Falcon; Oozie; Sqoop; Flume; Hadoop Cluster. Connection types shown: SSH, RPC Call, JDBC/ODBC, REST/HTTP.]


Hadoop REST APIs
Ã Useful for connecting to Hadoop from the outside the cluster
Ã When more client language flexibility is required
–  i.e.
Java
binding
not
an
op*on

Ã Challenges
–  Client
must
have
knowledge
of
cluster
topology

–  Required
to
open
ports
(and
in
some
cases,
on
every
host)
outside
the
cluster

Service | API
WebHDFS | Supports HDFS user operations including reading files, writing to files, making directories, changing permissions and renaming.
WebHCat | Job control for MapReduce, Pig and Hive jobs, and HCatalog DDL commands.
Hive | Hive REST API operations
HBase | HBase REST API operations
Oozie | Job submission and management, and Oozie administration.
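
As an illustration of the WebHDFS entry, here is a minimal Python sketch of how a client outside the cluster might build a directory-listing request. The `/webhdfs/v1` prefix and the `op` query parameter are part of the WebHDFS REST API; the host name, the default port 50070, the `user.name` value and the helper name `webhdfs_url` are assumptions for illustration. On a Kerberized cluster the request would need SPNEGO authentication rather than `user.name`.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=50070, params=None):
    """Build a WebHDFS REST URL, e.g. .../webhdfs/v1/tmp?op=LISTSTATUS."""
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    query = {"op": op}
    query.update(params or {})
    # quote() keeps "/" intact and escapes characters such as spaces.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"


if __name__ == "__main__":
    # Hypothetical host and user on an unsecured cluster.
    print(webhdfs_url("namenode-host", "/tmp", "LISTSTATUS",
                      params={"user.name": "guest"}))
```

Note that the client has to know which host runs the NameNode and must be able to reach its port, which is exactly the topology and open-port challenge listed on this slide.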


Hadoop REST API with Knox – Representative Examples
Service | Direct URL | Knox URL
WebHDFS | http://namenode-host:50070/webhdfs | https://knox-host:8443/webhdfs
WebHCat | http://webhcat-host:50111/templeton | https://knox-host:8443/templeton
Oozie | http://ooziehost:11000/oozie | https://knox-host:8443/oozie
HBase/Stargate | http://hbasehost:60080 | https://knox-host:8443/hbase
Hive | http://hivehost:10001/cliservice | https://knox-host:8443/hive
YARN | http://yarn-host:yarn-port/ws | https://knox-host:8443/resourcemanager
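
The mapping on this slide can be sketched in a few lines of Python: every service is reached through one Knox host and port, and only the path differs. The paths below are copied from the slide, which abbreviates them; a real Knox deployment typically also includes a gateway path and topology name in the URL, which this sketch omits to stay consistent with the slide. The function name `knox_url` is hypothetical.

```python
# Knox path for each service, as listed on the slide.
KNOX_PATHS = {
    "WebHDFS": "webhdfs",
    "WebHCat": "templeton",
    "Oozie": "oozie",
    "HBase": "hbase",
    "Hive": "hive",
    "YARN": "resourcemanager",
}


def knox_url(service, knox_host="knox-host", port=8443):
    """Return the single HTTPS entry point Knox exposes for a service."""
    try:
        path = KNOX_PATHS[service]
    except KeyError:
        raise ValueError(f"no Knox mapping for service {service!r}") from None
    return f"https://{knox_host}:{port}/{path}"


if __name__ == "__main__":
    for name in KNOX_PATHS:
        print(f"{name}: {knox_url(name)}")
```

The point of the design is that the client only needs one host name and one open port, regardless of how many services or nodes sit behind the gateway.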


Security in Hadoop with HDP

HDP 2.4: Centralized Security Administration with Ranger

Authentication: Who am I/prove it?
•  Kerberos
•  API security with Apache Knox

Authorization: What can I do?
•  Fine-grain access control with Apache Ranger

Audit: What did I do?
•  Centralized audit reporting with Apache Ranger

Data Protection: Can data be encrypted at rest and over the wire?
•  Wire encryption in Hadoop
•  HDFS Encryption with Ranger KMS


Data Protection
HDP allows you to apply data protection policy at
different layers across the Hadoop stack
Layer | What? | How?
Storage and Access | Encrypt data while it is at rest | HDFS Transparent Data Encryption, partners, HBase encryption, OS-level encryption
Transmission | Encrypt data as it moves | SSL, SASL; supported from HDP 2.1


Points of Communication

[Diagram. Numbered channels between the Client, Nodes and the Hadoop Cluster: WebHDFS, DataTransferProtocol, RPC, JDBC/ODBC, M/R Shuffle.]


Data Protection - HDFS Encryption

[Diagram. Labels: HDFS Client (two); Name Node; HDFS Encryption Zone containing Encrypted Files; Key Management System (KMS); KeyProvider API (partner integration point); Stateless Key Management; Partners; HDP stack layers: Data Access, Data Management, Security, YARN, HDFS.]

•  Hortonworks is collaborating with partners to deliver enterprise-scale key management and more choices to customers
•  Open source KMS with Ranger
•  Or partner with Voltage KMS
-  Partner joint engineering resources
-  Voltage Stateless Key Management integrated with KeyProvider API

Only HDP offers open source and commercial choices for key management

Open Source Key Management
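
For readers who want to see what setting up an encryption zone involves, here is a minimal Python sketch that assembles the usual command sequence: create a key in the configured key provider, create the directory, then turn it into an encryption zone. `hadoop key create` and `hdfs crypto -createZone` are standard Hadoop CLI commands; the key name, the path and the helper name are hypothetical, and the sketch only builds the commands rather than running them, since running them needs a cluster with a KMS (for example Ranger KMS) and the right privileges.

```python
import shlex


def encryption_zone_commands(key_name, zone_path):
    """Return the CLI commands that create a key and an HDFS encryption zone."""
    if not zone_path.startswith("/"):
        raise ValueError("zone path must be an absolute HDFS path")
    return [
        # 1. Create the zone key in the cluster's key provider (KMS).
        ["hadoop", "key", "create", key_name],
        # 2. Create the directory that will become the zone (it must be empty).
        ["hdfs", "dfs", "-mkdir", "-p", zone_path],
        # 3. Mark the directory as an encryption zone protected by that key.
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", zone_path],
    ]


if __name__ == "__main__":
    # Hypothetical key name and path.
    for cmd in encryption_zone_commands("finance_key", "/data/finance"):
        print(shlex.join(cmd))
```

Files written under the zone are then encrypted and decrypted transparently by the HDFS client, which is why the key provider is the natural integration point for partner key managers.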


Security in Spark?
Spark supports running in a Kerberized Cluster
Only Spark on YARN supports security (Kerberos support)
From command line run kinit before submitting spark jobs
Spark reads data from HDFS & ORC
•  HDFS file permissions (& Ranger integration) applicable to Spark jobs
Spark submits job to YARN queue
•  YARN queue ACL (& Ranger integration) applicable to Spark jobs
Wire Encryption
•  Spark has some coverage, not all channels are covered
LDAP Authentication
•  No authentication in the Spark UI out of the box; it supports servlet filters for hooking in LDAP
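
The "kinit, then submit to a YARN queue" workflow above can be sketched as follows. This is a minimal Python illustration that only assembles the two command lines; `kinit -k -t` and `spark-submit --master yarn --queue` are real options, while the principal, keytab path, queue name and application file are hypothetical.

```python
import shlex


def kinit_command(principal, keytab):
    """Command that obtains a Kerberos ticket for a principal from a keytab."""
    return ["kinit", "-k", "-t", keytab, principal]


def spark_submit_command(app, queue="default", app_args=()):
    """Command that submits a Spark application to a YARN queue."""
    return ["spark-submit", "--master", "yarn", "--queue", queue, app, *app_args]


if __name__ == "__main__":
    # Hypothetical principal, keytab, queue and application.
    steps = [
        kinit_command("alice@EXAMPLE.COM", "/etc/security/keytabs/alice.keytab"),
        spark_submit_command("job.py", queue="analytics"),
    ]
    for cmd in steps:
        print(shlex.join(cmd))
```

Because the job runs as the authenticated user, the HDFS permissions and YARN queue ACLs mentioned on this slide (and any Ranger policies behind them) are evaluated against that identity.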


What makes Hadoop Summit Different?
–  Deep technical sessions chosen by the community
–  Business Track based on real-world implementations
–  Keynotes from Progressive Insurance, Ford, Macy's, MD Anderson, GE, Capital One, …
–  Free hands-on labs
–  Networking events and 10 Year Celebration!
–  20% Off Code: 16SJext20x

Apache Hadoop, Spark, IoT, Streaming, Data Science

EVERYTHING DATA!
