Securing Hadoop using Ranger
Raj Nadipalli
Director Professional Services, Zaloni
rnadipalli@zaloni.com
09.22.2016
Agenda
•  Security Landscape in Hadoop
•  Role of Ranger
•  Ranger Key Features
•  Demo
•  Q&A
Overview
Security Landscape in Hadoop (open source)
Authentication
Who am I?
AD/LDAP
Kerberos
Apache Knox
Authorization
What can I do?
Apache Ranger
Apache Sentry
Audit
What happened?
Apache Ranger
Data Protection
SSL
KMS
Ranger in a slide
•  Centralized security framework covering authentication, auditing, and data encryption
•  Fine-grained access control over Hadoop
•  Components supported: HDFS, Hive, HBase, Storm, YARN, Knox, Kafka, Solr
•  Create and manage policies using a browser
•  Manage audit tracking and policy analytics in HDFS, an RDBMS, or Solr
•  Supports governance with tag-based policies
•  REST APIs for policy management to automate, integrate, and extend
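As a rough illustration of the REST bullet above, the sketch below builds (but does not send) a request that would create an HDFS policy. It assumes the Ranger public v2 policy endpoint and HTTP basic authentication; the host, port, credentials, service name, policy name, and path are all hypothetical placeholders, not values from this deck.

```python
import base64
import json
import urllib.request


def build_create_policy_request(ranger_url, user, password, policy):
    """Build (but do not send) a POST request that would create a policy."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        # Assumed endpoint: Ranger public v2 API for policies.
        url=ranger_url.rstrip("/") + "/service/public/v2/api/policy",
        data=json.dumps(policy).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


# Hypothetical policy: service name, policy name, path, and user are placeholders.
hdfs_policy = {
    "service": "cluster_hadoop",
    "name": "testRanger-read",
    "isEnabled": True,
    "isAuditEnabled": True,
    "resources": {
        "path": {"values": ["/testRanger"], "isRecursive": True},
    },
    "policyItems": [
        {
            "users": ["mukesh"],
            "groups": [],
            "accesses": [
                {"type": "read", "isAllowed": True},
                {"type": "execute", "isAllowed": True},
            ],
            "delegateAdmin": False,
        }
    ],
}

request = build_create_policy_request(
    "http://ranger-host:6080", "admin", "admin-password", hdfs_policy
)
# To submit it against a real Ranger Admin: urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes the payload easy to review or store in version control before it is applied.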
Key Components of Ranger
Source: http://www.slideshare.net/RommelGarcia2/apache-ranger?qid=1150145e-a144-4603-9165-a09b2ae5ece0&v=&b=&from_search=4
Securing HDFS
Ranger in Action - HDFS
Source: http://www.slideshare.net/RommelGarcia2/apache-ranger?qid=1150145e-a144-4603-9165-a09b2ae5ece0&v=&b=&from_search=4
Ranger administration portal
List HDFS policies
Under HDFS policies, we can view all HDFS policies that have been created and which users and groups are assigned to each policy.
Actions: delete / edit
Policy Name
Groups/users assigned to policies
Create HDFS policy
Under HDFS policy, we can create and edit HDFS policies. This page shows how to create a policy at the user level and grant the appropriate permissions.
Access error in Audit
Under the Audit tab, an admin can view which user tried to access which directory. Here, user mukesh was denied access because the user did not have permission to access the /testRanger directory.
Access denied to user mukesh
List HDFS policies for group
Under HDFS policies, we can view all HDFS policies that have been created and which users and groups are assigned to each policy.
Create HDFS policy for group
Under HDFS policy, we can create and edit HDFS policies. This page shows how to create a policy at the group level and grant the appropriate permissions.
Access given to a group
Securing Hive
List policies of Hive
Under Hive policies, we can view all Hive policies that have been created and which users and groups are assigned to each policy.
Hive policy for database
User assigned to a policy
Create policy for Hive
Under Hive policy, we can create and edit Hive policies. This page shows how to create a policy at the user level and grant the appropriate permissions.
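A Hive policy differs from an HDFS policy mainly in its resource hierarchy (database, table, column) and its access types. The helper below is a minimal sketch of the JSON body such a policy might carry; the function name, service name, and policy name are hypothetical, and the access-type list is an assumption based on the Hive service definition shipped with Ranger.

```python
# Access types assumed from Ranger's Hive service definition.
HIVE_ACCESS_TYPES = {
    "select", "update", "create", "drop", "alter", "index", "lock", "all",
}


def hive_policy(service, name, database, table, column, users, access_types):
    """Return a policy body granting the given users access to Hive resources."""
    unknown = set(access_types) - HIVE_ACCESS_TYPES
    if unknown:
        raise ValueError(f"unknown Hive access types: {sorted(unknown)}")
    return {
        "service": service,
        "name": name,
        "isEnabled": True,
        "isAuditEnabled": True,
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
            "column": {"values": [column]},
        },
        "policyItems": [
            {
                "users": list(users),
                "groups": [],
                "accesses": [
                    {"type": t, "isAllowed": True} for t in sorted(access_types)
                ],
                "delegateAdmin": False,
            }
        ],
    }


# Hypothetical example: let one user create and query tables in testranger.
policy = hive_policy(
    service="cluster_hive",
    name="testranger-create",
    database="testranger",
    table="*",
    column="*",
    users=["mukesh"],
    access_types={"create", "select"},
)
```

Rejecting unknown access types before anything is submitted catches typos that would otherwise surface only as a server-side error.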
Access error in Audit
Under the Audit tab, an admin can view which user tried to access which table or database. Here, user mukesh was denied access because the user did not have permission to create a table in the testranger database.
Securing HBase
Create HBase policy
Under HBase policy, we can create and edit HBase policies. This page shows how to create a policy at the user level and grant the appropriate permissions.
Access error in Audit
Under the Audit tab, an admin can view which user tried to access which table. Here, user nabadeep was denied access because the user did not have permission to put data into the testranger table.
Audit Logs
Audit logs in JSON format
For each service, such as HDFS or Hive, audit logs are generated if auditing is enabled in Ambari.
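Because the audit logs are JSON, with one event per line, they are easy to filter with a short script. The sketch below pulls out denied requests. The sample records are illustrative rather than taken from a real cluster, and the field names (`reqUser`, `resource`, `access`, `result`) and the convention that `result` 0 means denied are assumptions about the audit schema that should be checked against your own logs.

```python
import json

# Illustrative records only; real audit events carry many more fields.
SAMPLE_AUDIT_LINES = [
    '{"repo": "cluster_hadoop", "reqUser": "mukesh", "resource": "/testRanger",'
    ' "access": "READ_EXECUTE", "result": 0}',
    '{"repo": "cluster_hadoop", "reqUser": "hdfs", "resource": "/tmp",'
    ' "access": "READ", "result": 1}',
]


def denied_events(lines):
    """Return (user, resource, access) for every event marked as denied."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # Assumption: result == 0 means denied, 1 means allowed.
        if record.get("result") == 0:
            events.append(
                (record.get("reqUser"), record.get("resource"), record.get("access"))
            )
    return events


for user, resource, access in denied_events(SAMPLE_AUDIT_LINES):
    print(f"DENIED user={user} resource={resource} access={access}")
```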
HDFS Audit File structure
Audit Log Storage Options
HDFS
Long-term storage that can be used to understand user event trends and detect anomalies
RDBMS
MySQL, Oracle, Postgres, SQL Server
Solr
Good for quick reporting metrics to understand user event trends
Log4j Appenders
Best practices to use HDFS in Ranger
•  Change HDFS umask to 077
	fs.permissions.umask-mode=077
•  Identify directories that can be managed by Ranger policies
	/apps/hive, /apps/Hbase
•  Identify directories that need to be managed by HDFS native permissions
	/tmp and /user to 700
•  Enable Ranger policy to audit all records
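To see why a umask of 077 matters, recall that the umask clears bits from the mode requested at creation time. The small sketch below applies it to the usual defaults of 777 for directories and 666 for files; the function name is illustrative.

```python
def apply_umask(requested_mode, umask):
    """Return the permission bits left after the umask clears its bits."""
    return requested_mode & ~umask


UMASK = 0o077

new_dir_mode = apply_umask(0o777, UMASK)   # default request for directories
new_file_mode = apply_umask(0o666, UMASK)  # default request for files

print(oct(new_dir_mode))   # 0o700
print(oct(new_file_mode))  # 0o600
```

With group and other bits cleared on everything newly created, only the owner gets access through native HDFS permissions, so any wider access has to be granted explicitly through a Ranger policy.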
Best practices to use Hive in Ranger
•  HiveServer2 access with limited HDFS access
	- Column-level access control over Hive data
•  Access through HiveServer2, and to HDFS files through Pig/MR jobs
	- hive.server2.enable.doAs is set to "true"
•  Hive CLI access
Atlas & Ranger
Tag Based Policies in Atlas
•  The Atlas and Ranger combination supports automation for governance and policies
•  Atlas is where tags are set on metadata; for example, a Customer table in Hive can be tagged with the value “PII”
•  Ranger policies can be created on these tags to enforce access
•  Ranger shows audit logs on access
Source: https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies
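The idea behind tag-based policies can be modeled in a few lines: the policy is written against a tag, and any resource that carries the tag inherits it. The sketch below is a deliberately simplified toy model, not Ranger's policy engine (which also supports deny conditions, exceptions, and validity periods). The resource identifier format, group names, and function name are hypothetical.

```python
# Tags as they might be assigned in Atlas (hypothetical resource identifier).
RESOURCE_TAGS = {
    "hive:default.customer": {"PII"},
}

# Tag policies: tag -> groups allowed to access resources carrying that tag.
TAG_POLICIES = {
    "PII": {"compliance"},
}


def tag_decision(user_groups, resource, resource_tags, tag_policies):
    """Return True or False when a tag policy applies, None when none does.

    None means "no tag policy matched", so a caller would fall back to the
    ordinary resource-based policies.
    """
    decisions = []
    for tag in resource_tags.get(resource, set()):
        allowed_groups = tag_policies.get(tag)
        if allowed_groups is not None:
            decisions.append(bool(set(user_groups) & allowed_groups))
    if not decisions:
        return None
    # In this toy model, every applicable tag policy must allow the user.
    return all(decisions)


print(tag_decision({"compliance"}, "hive:default.customer", RESOURCE_TAGS, TAG_POLICIES))
print(tag_decision({"analysts"}, "hive:default.customer", RESOURCE_TAGS, TAG_POLICIES))
print(tag_decision({"analysts"}, "hive:default.orders", RESOURCE_TAGS, TAG_POLICIES))
```

The benefit of this indirection is that tagging a new table as PII is enough to protect it; no new policy has to be written.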
Ranger Tag based policy flow
Tag Service Setup – Ranger Admin
Source: https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies
Tag Policy Setup
Source: https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies
Tag Policy Expiry
Backup
References
http://www.slideshare.net/trihug/trihug-october-apache-ranger
http://www.slideshare.net/RommelGarcia2/apache-ranger
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=53741207
http://hortonworks.com/blog/best-practices-for-hive-authorization-using-apache-ranger-in-hdp-2-2
https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies
Q&A
rajesh.nadipalli@gmail.com
@ranadipa

Apache Ranger meetup