Hadoop Security
Hadoop Summit 2010
Owen O’Malley, [email_address]
Yahoo’s Hadoop Team
Problem
- Yahoo! has more yahoos than clusters.
  - Hundreds of yahoos use Hadoop each month.
  - 38,000 computers in ~20 Hadoop clusters.
- Sharing requires isolation or trust.
- Different users need different data.
  - Not all yahoos should have access to sensitive data, such as financial data and PII.
- In Hadoop 0.20, it is easy to impersonate another user.
  - Segregate different data on separate clusters.
Solution
- Prevent unauthorized HDFS access.
- All HDFS clients must be authenticated,
  - including tasks running as part of MapReduce jobs
  - and jobs submitted through Oozie.
- Users must also authenticate servers.
  - Otherwise fraudulent servers could steal credentials.
- Integrate Hadoop with Kerberos.
  - Provides a well-tested, open source, distributed authentication system.
Requirements
- Security must be optional.
  - Not all clusters are shared between users.
- Hadoop must not prompt for passwords.
  - Makes it easy to make trojan horse versions.
  - Must have single sign-on.
- Must support backwards compatibility.
  - HFTP must be secure, but allow reading from insecure clusters.
Kerberos and Single Sign-on
- Kerberos allows a user to sign in once.
  - Obtains a Ticket Granting Ticket (TGT).
- kinit: get a new Kerberos ticket
- klist: list your Kerberos tickets
- kdestroy: destroy your Kerberos ticket
- TGTs last for 10 hours and are renewable for 7 days by default.
- Once you have a TGT, Hadoop commands just work:
  - hadoop fs -ls /
  - hadoop jar wordcount.jar in-dir out-dir
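The ticket lifetimes on this slide can be sketched as simple date arithmetic. This is a minimal illustration, not Kerberos code: the 10-hour and 7-day values are the defaults quoted above, the helper names `initial_expiry` and `renew` are hypothetical, and the model assumes the usual Kerberos rule that a ticket can only be renewed while it is still valid (for example with `kinit -R`).

```python
from datetime import datetime, timedelta
from typing import Optional

# Defaults quoted on the slide; real values come from the site's Kerberos setup.
TICKET_LIFETIME = timedelta(hours=10)
RENEWABLE_LIFETIME = timedelta(days=7)


def initial_expiry(issued_at: datetime) -> datetime:
    """Expiry of a freshly issued TGT (hypothetical helper)."""
    return issued_at + TICKET_LIFETIME


def renew(issued_at: datetime, expires_at: datetime, now: datetime) -> Optional[datetime]:
    """Return the new expiry after a renewal, or None if a fresh kinit is needed.

    Assumption: renewal only succeeds while the ticket is still valid and
    never extends past the renewable window that started at issued_at.
    """
    renew_until = issued_at + RENEWABLE_LIFETIME
    if now >= expires_at or now >= renew_until:
        return None  # ticket already expired or window closed: sign in again
    return min(now + TICKET_LIFETIME, renew_until)


if __name__ == "__main__":
    issued = datetime(2010, 6, 29, 9, 0)
    expires = initial_expiry(issued)
    print("expires:", expires)
    print("after renewing at hour 9:", renew(issued, expires, issued + timedelta(hours=9)))
    print("after renewing at hour 11:", renew(issued, expires, issued + timedelta(hours=11)))
```

In this model a long-running client has to renew before each 10-hour window closes, and after 7 days the user must sign in again.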
Kerberos Dataflow
Definitions
- Authentication: determining the user
  - Hadoop 0.20 completely trusted the user.
    - User states their username and groups over the wire.
  - We need it on both RPC and Web UI.
- Authorization: what can that user do?
  - HDFS has had owners, groups, and permissions since 0.16.
  - Map/Reduce had nothing in 0.20.
Authentication
- Changes low-level transport
- RPC authentication using SASL
  - Kerberos
  - Token
  - Simple
- Browser HTTP secured via plugin
- Tool HTTP (e.g., fsck) via SSL/Kerberos
Primary Communication Paths
Authorization
- HDFS
  - Command line unchanged
  - Web UI enforces authentication
- MapReduce added Access Control Lists
  - Lists of users and groups that have access.
  - mapreduce.job.acl-view-job: view job
  - mapreduce.job.acl-modify-job: kill or modify job
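A job ACL of this kind can be sketched as a list of users plus a list of groups. The sketch below is an illustration rather than Hadoop's implementation: it assumes the ACL value is written as comma-separated users, a space, then comma-separated groups, with `*` meaning everyone, and the names `JobAcl`, `parse_acl`, and `is_allowed` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class JobAcl:
    users: FrozenSet[str]
    groups: FrozenSet[str]
    allow_all: bool = False


def parse_acl(spec: str) -> JobAcl:
    """Parse an ACL string such as 'alice,bob analysts' (assumed format)."""
    if spec.strip() == "*":
        return JobAcl(frozenset(), frozenset(), allow_all=True)
    # Everything before the first space is the user list, the rest is groups.
    user_part, _, group_part = spec.partition(" ")
    users = frozenset(u for u in user_part.split(",") if u)
    groups = frozenset(g for g in group_part.strip().split(",") if g)
    return JobAcl(users, groups)


def is_allowed(acl: JobAcl, user: str, user_groups: Iterable[str]) -> bool:
    """True if the user is listed directly or belongs to a listed group."""
    return acl.allow_all or user in acl.users or bool(acl.groups & set(user_groups))


if __name__ == "__main__":
    # Hypothetical value for mapreduce.job.acl-view-job.
    view_acl = parse_acl("alice,bob analysts")
    print(is_allowed(view_acl, "alice", []))
    print(is_allowed(view_acl, "carol", ["analysts"]))
    print(is_allowed(view_acl, "carol", ["interns"]))
```

An empty ACL string grants access to nobody in this sketch. A real deployment would typically also let the job owner and cluster administrators through, which is not modeled here.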
API Changes
- Very minimal API changes
  - UserGroupInformation *completely* changed.
- MapReduce added secret credentials
  - Available from JobConf and JobContext
  - Never displayed via Web UI
- Automatically get tokens for HDFS
  - Primary HDFS, File{In,Out}putFormat, and DistCp
  - Can set mapreduce.job.hdfs-servers
MapReduce Security Changes
- MapReduce system directory is now 700.
- Tasks run as the user instead of the TaskTracker.
  - Setuid program that runs tasks.
- Task directories are now 700.
- Distributed Cache is now secure.
  - Shared (original is world readable) is shared by everyone’s jobs.
  - Private (original is not world readable) is shared by the user’s jobs.
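The permission rules on this slide come down to a few bit tests on Unix mode bits. The sketch below is a simplified illustration with hypothetical helper names. It looks only at the file's own mode, whereas a real implementation would presumably also check that the parent directories are accessible.

```python
import stat


def is_owner_only(mode: int) -> bool:
    """True when group and other have no permission bits, as with mode 700."""
    return (stat.S_IMODE(mode) & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def cache_visibility(mode: int) -> str:
    """Classify a distributed-cache source file by its world-readable bit.

    'shared'  -> original is world readable, usable by everyone's jobs
    'private' -> original is not world readable, usable by the owner's jobs
    """
    return "shared" if mode & stat.S_IROTH else "private"


if __name__ == "__main__":
    for mode in (0o700, 0o750, 0o644, 0o600):
        print(oct(mode), is_owner_only(mode), cache_visibility(mode))
```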
Web UIs
- Hadoop relies on the Web UIs.
  - These need to be authenticated also…
- Web UI authentication is pluggable.
  - Yahoo uses an internal package.
  - We have written a very simple static auth plug-in.
    - Dr. Who returns again (the third doctor?)
  - We really need a SPNEGO plug-in…
- All servlets enforce permissions.
Proxy-Users
- Some services access HDFS and MapReduce as other users.
- Configure services with the proxy user:
  - Who the proxy service can impersonate
    - hadoop.proxyuser.superguy.groups=goodguys
  - Which hosts they can impersonate from
    - hadoop.proxyuser.superguy.hosts=secretbase
- New admin commands to refresh
  - Don’t need to bounce cluster
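The two proxy-user settings combine into a single check: the impersonated user must be in an allowed group, and the request must come from an allowed host. The sketch below illustrates that logic only. It is not Hadoop's code, the function name `may_impersonate` is hypothetical, and it assumes both properties hold comma-separated lists.

```python
from typing import Dict, Iterable, Set


def _csv(value: str) -> Set[str]:
    """Split a comma-separated property value into a set of trimmed names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def may_impersonate(conf: Dict[str, str], proxy_user: str,
                    target_groups: Iterable[str], remote_host: str) -> bool:
    """Decide whether proxy_user may act as a user in target_groups.

    conf holds properties named as on the slide:
      hadoop.proxyuser.<proxy_user>.groups and hadoop.proxyuser.<proxy_user>.hosts
    A missing property means nothing is allowed.
    """
    prefix = f"hadoop.proxyuser.{proxy_user}."
    allowed_groups = _csv(conf.get(prefix + "groups", ""))
    allowed_hosts = _csv(conf.get(prefix + "hosts", ""))
    group_ok = bool(allowed_groups & set(target_groups))
    host_ok = remote_host in allowed_hosts
    return group_ok and host_ok


if __name__ == "__main__":
    conf = {
        "hadoop.proxyuser.superguy.groups": "goodguys",
        "hadoop.proxyuser.superguy.hosts": "secretbase",
    }
    print(may_impersonate(conf, "superguy", ["goodguys"], "secretbase"))
    print(may_impersonate(conf, "superguy", ["goodguys"], "elsewhere"))
    print(may_impersonate(conf, "superguy", ["badguys"], "secretbase"))
```

Requiring both conditions means a stolen service credential is only useful from the configured hosts, and only for users in the configured groups.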
Out of Scope
- Encryption
  - RPC transport: easy
  - Block transport protocol: difficult
  - On disk: difficult
- File Access Control Lists
  - Still use Unix-style owner, group, other permissions
- Non-Kerberos Authentication
  - Much easier now that the framework is available
Schedule
- The security team worked hard to get security added to Hadoop on schedule.
- Security development team:
  - Devaraj Das, Ravi Gummadi, Jakob Homan, Owen O’Malley, Jitendra Pandey, Boris Shkolnik, Vinod Vavilapalli, Kan Zhang
- Currently on science (beta) clusters
- Deploy to production clusters in August
Questions?
- Questions should be sent to: common/hdfs/mapreduce-user@hadoop.apache.org
- Security holes should be sent to: [email_address]
- Available from http://developer.yahoo.com/hadoop/distribution/
  - Also a VM with a Hadoop cluster with security
- Thanks!

Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
 
Implementations of Fused Deposition Modeling in real world
Implementations of Fused Deposition Modeling  in real worldImplementations of Fused Deposition Modeling  in real world
Implementations of Fused Deposition Modeling in real world
 
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes..."Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
 
Amul milk launches in US: Key details of its new products ...
Amul milk launches in US: Key details of its new products ...Amul milk launches in US: Key details of its new products ...
Amul milk launches in US: Key details of its new products ...
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
 
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
 
The Rise of AI in Cybersecurity How Machine Learning Will Shape Threat Detect...
The Rise of AI in Cybersecurity How Machine Learning Will Shape Threat Detect...The Rise of AI in Cybersecurity How Machine Learning Will Shape Threat Detect...
The Rise of AI in Cybersecurity How Machine Learning Will Shape Threat Detect...
 
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
 
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptxIntroduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
 
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
 
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
 
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - MydbopsScaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
 

1 hadoop security_in_details_hadoop_summit2010

  • 1. Hadoop Security. Hadoop Summit 2010. Owen O’Malley [email_address], Yahoo’s Hadoop Team
  • 2. Problem: Yahoo! has more yahoos than clusters: hundreds of yahoos use Hadoop each month, on 38,000 computers in ~20 Hadoop clusters. Sharing requires isolation or trust. Different users need different data, and not all yahoos should have access to sensitive data (financial data and PII). In Hadoop 0.20, it is easy to impersonate another user. Different data is segregated on separate clusters.
  • 3. Solution: Prevent unauthorized HDFS access. All HDFS clients must be authenticated, including tasks running as part of MapReduce jobs and jobs submitted through Oozie. Users must also authenticate servers; otherwise fraudulent servers could steal credentials. Integrate Hadoop with Kerberos, which provides a well-tested, open source distributed authentication system.
  • 4. Requirements: Security must be optional; not all clusters are shared between users. Hadoop must not prompt for passwords, since that makes it easy to make trojan horse versions; it must have single sign-on. Must support backwards compatibility: HFTP must be secure, but allow reading from insecure clusters.
  • 5. Kerberos and Single Sign-on: Kerberos allows a user to sign in once and obtain a Ticket Granting Ticket (TGT). kinit: get a new Kerberos ticket. klist: list your Kerberos tickets. kdestroy: destroy your Kerberos ticket. TGTs last for 10 hours and are renewable for 7 days by default. Once you have a TGT, Hadoop commands just work: hadoop fs -ls / and hadoop jar wordcount.jar in-dir out-dir
  • 7. Definitions: Authentication – determining the user. Hadoop 0.20 completely trusted the user: the user states their username and groups over the wire. We need it on both RPC and Web UI. Authorization – what can that user do? HDFS has had owners, groups and permissions since 0.16. Map/Reduce had nothing in 0.20.
  • 8. Authentication: Changes low-level transport. RPC authentication using SASL: Kerberos, Token, Simple. Browser HTTP secured via plugin. Tool HTTP (e.g. Fsck) via SSL/Kerberos.
  • 10. Authorization: HDFS: command line unchanged; Web UI enforces authentication. MapReduce added Access Control Lists: lists of users and groups that have access. mapreduce.job.acl-view-job: view job. mapreduce.job.acl-modify-job: kill or modify job.
  • 11. API Changes: Very minimal API changes. UserGroupInformation *completely* changed. MapReduce added secret credentials: available from JobConf and JobContext, never displayed via Web UI. Automatically get tokens for HDFS: primary HDFS, File{In,Out}putFormat, and DistCp; can set mapreduce.job.hdfs-servers.
  • 12. MapReduce Security Changes: MapReduce system directory is now 700. Tasks run as the user instead of the TaskTracker, via a setuid program that runs tasks. Task directories are now 700. Distributed Cache is now secure: Shared (original is world readable) is shared by everyone’s jobs; Private (original is not world readable) is shared by the user’s jobs.
  • 13. Web UIs: Hadoop relies on the Web UIs. These need to be authenticated also… Web UI authentication is pluggable. Yahoo uses an internal package. We have written a very simple static auth plug-in. Dr. Who returns again (the third doctor?). We really need a SPNEGO plug-in… All servlets enforce permissions.
  • 14. Proxy-Users: Some services access HDFS and MapReduce as other users. Configure services with the proxy user. Who the proxy service can impersonate: hadoop.proxyuser.superguy.groups=goodguys. Which hosts they can impersonate from: hadoop.proxyuser.superguy.hosts=secretbase. New admin commands to refresh; don’t need to bounce cluster.
  • 15. Out of Scope: Encryption: RPC transport (easy), block transport protocol (difficult), on disk (difficult). File Access Control Lists: still use Unix-style owner, group, other permissions. Non-Kerberos authentication: much easier now that framework is available.
  • 16. Schedule: The security team worked hard to get security added to Hadoop on schedule. Security development team: Devaraj Das, Ravi Gummadi, Jakob Homan, Owen O’Malley, Jitendra Pandey, Boris Shkolnik, Vinod Vavilapalli, Kan Zhang. Currently on science (beta) clusters. Deploy to production clusters in August.
  • 17. Questions? Questions should be sent to: common/hdfs/mapreduce-user@hadoop.apache.org. Security holes should be sent to: [email_address]. Available from http://developer.yahoo.com/hadoop/distribution/ (also a VM with a Hadoop cluster with security). Thanks!
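
The job ACLs from slide 10 and the proxy-user settings from slide 14 both reduce to simple membership checks. The sketch below models them in plain Python. It is an illustration, not Hadoop's code: the function names are hypothetical, and it assumes that an ACL value is written as comma-separated users, a space, then comma-separated groups, that "*" means everyone, and that the job owner always passes. Wildcards in the proxy-user settings are deliberately not modeled.

```python
# Illustrative model only; none of this is Hadoop source code.
# All function names below are hypothetical.

def split_list(value):
    """Turn "a, b,c" into {"a", "b", "c"}, ignoring empty entries."""
    return {item.strip() for item in value.split(",") if item.strip()}


def parse_acl(value):
    """Parse "user1,user2 group1,group2"; return None for the "*" wildcard.

    Assumption: users come first, then a single space, then groups.
    """
    if value.strip() == "*":
        return None
    users_part, _, groups_part = value.partition(" ")
    return split_list(users_part), split_list(groups_part)


def acl_allows(acl_value, user, user_groups, job_owner):
    """Check a job ACL such as the value of mapreduce.job.acl-view-job."""
    if user == job_owner:  # assumption: the owner is always allowed
        return True
    parsed = parse_acl(acl_value)
    if parsed is None:  # "*" admits everyone
        return True
    users, groups = parsed
    return user in users or bool(groups & set(user_groups))


def may_impersonate(conf, proxy_user, target_groups, remote_host):
    """Check the two proxy-user settings from slide 14.

    The target user must belong to one of the configured groups, and the
    request must come from one of the configured hosts. A proxy user with
    no configuration can impersonate nobody.
    """
    prefix = f"hadoop.proxyuser.{proxy_user}"
    allowed_groups = split_list(conf.get(f"{prefix}.groups", ""))
    allowed_hosts = split_list(conf.get(f"{prefix}.hosts", ""))
    return bool(allowed_groups & set(target_groups)) and remote_host in allowed_hosts


# The example configuration from the slide.
EXAMPLE_CONF = {
    "hadoop.proxyuser.superguy.groups": "goodguys",
    "hadoop.proxyuser.superguy.hosts": "secretbase",
}
```

With EXAMPLE_CONF, superguy connecting from secretbase may act for a member of goodguys, but the same request from any other host, or for a user outside goodguys, is refused.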