Shell Script Rewrite Overview
Allen Wittenauer
Twitter: @_a__w_ (1 a 2 w 1)
Email: aw @ apache.org
3
What is the shell code?

    bin/*
    etc/hadoop/*sh
    libexec/*
    sbin/*
CUTTING, DOUG
1710 554 6239
2005
APACHE SOFTWARE FOUNDATION
6 https://www.flickr.com/photos/new_and_used_tires/6549497793/
7
8 https://www.flickr.com/photos/hkuchera/5084213883
9
10
11 https://www.flickr.com/photos/83633410@N07/7658225516/
“[The scripts] finally got to
you, didn’t they?”
13
Primary Goals
    Consistency
    Code and Config Simplification
    De-clash Parameters
    Documentation

Secondary Goals
    Backward Compatibility
    "Lost" Ideas and Fixes
14 https://www.flickr.com/photos/k6mmc/2176537668/
15
Tuesday, August 19, 2014: majority committed into trunk

... followed by many fixes & enhancements from the community
16
https://www.flickr.com/photos/ifindkarma/9304374538/
https://www.flickr.com/photos/liveandrock/2650732780/
17
Old:
    hadoop -> hadoop-config.sh -> hadoop-env.sh
    yarn   -> yarn-config.sh   -> yarn-env.sh
    hdfs   -> hdfs-config.sh   -> hadoop-env.sh

New:
    hadoop -> hadoop-config.sh -> hadoop-functions.sh
                               -> hadoop-env.sh
    yarn   -> yarn-config.sh   -> hadoop-config.sh -> (above)
                               -> yarn-env.sh
    hdfs   -> hdfs-config.sh   -> hadoop-config.sh -> (above)
18
Old:
    yarn-env.sh:
        JAVA_HOME=xyz
    hadoop-env.sh:
        JAVA_HOME=xyz
    mapred-env.sh:
        JAVA_HOME=xyz

New:
    hadoop-env.sh:
        JAVA_HOME=xyz

    OS X:
        JAVA_HOME=$(/usr/libexec/java_home)
19
Old:
    xyz_OPT="-Xmx4g" hdfs namenode
        java … -Xmx1000 … -Xmx4g …

    Command line size: ~2500 bytes

New:
    xyz_OPT="-Xmx4g" hdfs namenode
        java … -Xmx4g …

    Command line size: ~1750 bytes
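
A minimal sketch of how a duplicate option can be kept off the command line, assuming a helper that appends an option only when its key is not already present. The name sketch_add_param and its arguments are hypothetical, not taken from the slides.

```bash
# Hypothetical helper: append an option to a variable only when no option
# with the same key is already there.
function sketch_add_param
{
  local varname=$1   # name of the variable holding the option string
  local key=$2       # what identifies the option, e.g. "Xmx"
  local value=$3     # full option text, e.g. "-Xmx4g"
  local current=${!varname}

  if [[ "${current}" != *"${key}"* ]]; then
    printf -v "${varname}" '%s' "${current:+${current} }${value}"
  fi
}

OPTS="-Xmx4g"                                  # what the user asked for
sketch_add_param OPTS Xmx "-Xmx1000"           # skipped: a heap size is already set
sketch_add_param OPTS proc "-Dproc_namenode"   # added: nothing matches "proc" yet
echo "${OPTS}"                                 # prints: -Xmx4g -Dproc_namenode
```

The substring match is deliberately crude; a stricter sketch would compare whole option names.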
20
    $ TOOL_PATH=blah:blah:blah hadoop distcp /old /new
    Error: could not find or load main class org.apache.hadoop.tools.DistCp

Old:
    $ bash -x hadoop distcp /old /new
    + this=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/hadoop
    +++ dirname -- /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/hadoop
    ++ cd -P -- /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin
    ++ pwd -P
    + bin=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin
    + DEFAULT_LIBEXEC_DIR=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/../libexec
    + HADOOP_LIBEXEC_DIR=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/../libexec
    + [[ -f /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/../libexec/hadoop-config.sh ]]
    …
21
New:
    $ TOOL_PATH=blah:blah:blah hadoop --debug distcp /tmp/ /1
    DEBUG: HADOOP_CONF_DIR=/home/aw/HADOOP/conf
    DEBUG: Initial CLASSPATH=/home/aw/HADOOP/conf
    …
    DEBUG: Append CLASSPATH: /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/share/hadoop/mapreduce/*
    DEBUG: Injecting TOOL_PATH into CLASSPATH
    DEBUG: Rejected CLASSPATH: blah:blah:blah (does not exist)
    …
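
A rough sketch of the kind of check that could produce the "Rejected CLASSPATH" line above: each entry is tested before it is appended, and the reason is reported only in debug mode. The names sketch_debug, sketch_add_classpath and SKETCH_DEBUG are stand-ins invented for this example; the real functions in hadoop-functions.sh may differ.

```bash
# Stand-in for a debug logger: silent unless SKETCH_DEBUG is set.
function sketch_debug
{
  if [[ -n "${SKETCH_DEBUG}" ]]; then
    echo "DEBUG: $*" 1>&2
  fi
}

# Append one entry to CLASSPATH if it exists and is not already listed.
function sketch_add_classpath
{
  local entry=$1
  # a wildcard entry such as /dir/* is checked against /dir
  local probe=${entry%/\*}

  if [[ ! -e "${probe}" ]]; then
    sketch_debug "Rejected CLASSPATH: ${entry} (does not exist)"
    return 1
  fi
  if [[ ":${CLASSPATH}:" == *":${entry}:"* ]]; then
    sketch_debug "Rejected CLASSPATH: ${entry} (duplicate)"
    return 1
  fi
  CLASSPATH="${CLASSPATH:+${CLASSPATH}:}${entry}"
  sketch_debug "Append CLASSPATH: ${entry}"
  return 0
}
```

Usage: SKETCH_DEBUG=1 sketch_add_classpath blah:blah:blah reports the rejection on stderr instead of leaving a bad entry for Java to trip over later.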
22
Old:
    hdfs help
23 https://www.flickr.com/photos/joshuamckenty/2297179486/
24
New:
    hdfs help
25
Old:
    hadoop thisisnotacommand
        == stack trace
New:
    hadoop thisisnotacommand
        == hadoop help
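
One way to get "help instead of a stack trace" is a catch-all branch in the subcommand dispatch. The sketch below is illustrative only: sketch_usage, sketch_dispatch and the two known commands are made up, and the slides do not show how the real script decides.

```bash
function sketch_usage
{
  echo "Usage: sketch [--debug] COMMAND [ARGS]"
  echo "  Commands: version, help"
}

function sketch_dispatch
{
  local subcmd=$1

  case "${subcmd}" in
    version)
      echo "sketch 0.0"
      ;;
    help|"")
      sketch_usage
      ;;
    *)
      # unknown command: say so on stderr, print the help, fail
      echo "ERROR: ${subcmd} is not a known command." 1>&2
      sketch_usage
      return 1
      ;;
  esac
}
```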
26
Old:
    sbin/hadoop-daemon.sh start namenode
    sbin/yarn-daemon.sh start resourcemanager

New:
    bin/hdfs --daemon start namenode
    bin/yarn --daemon start resourcemanager

    + common daemon start/stop/status routines
27
hdfs namenode vs hadoop-daemon.sh namenode

Old:
    - effectively different code paths
    - no pid vs pid
        - wait for socket for failure
New:
    - same code path
    - hadoop-daemon.sh cmd => hdfs --daemon cmd
        - both generate pid
    - hdfs --daemon status namenode
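
Once every start path writes a pid file, a status check becomes small. A sketch under assumptions: the function name is hypothetical, and the return codes (0 running, 1 stale pid file, 3 not running) follow the usual init-script convention rather than anything stated on the slide.

```bash
function sketch_daemon_status
{
  local pidfile=$1
  local pid

  if [[ ! -f "${pidfile}" ]]; then
    return 3          # no pid file: not running
  fi
  pid=$(cat "${pidfile}")
  # kill -0 sends no signal; it only tests whether the process can be signalled
  if [[ -n "${pid}" ]] && kill -0 "${pid}" 2>/dev/null; then
    return 0          # process exists
  fi
  return 1            # pid file left behind by a dead process
}
```

Note that kill -0 also fails for a live process owned by another user, so this sketch is only reliable when run as the daemon's owner or as root.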
28
Old:
    "mkdir: cannot create <dir>"
    "chown: cannot change permission of <dir>"

New:
    "WARNING: <dir> does not exist. Creating."
    "ERROR: Unable to create <dir>. Aborting."
    "ERROR: Cannot write to <dir>."
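
The new messages map naturally onto a single directory check. A minimal sketch, with a hypothetical function name; it returns non-zero where the slide says "Aborting" so that the caller decides whether to exit.

```bash
function sketch_verify_dir
{
  local dir=$1

  if [[ ! -d "${dir}" ]]; then
    echo "WARNING: ${dir} does not exist. Creating." 1>&2
    if ! mkdir -p "${dir}" 2>/dev/null; then
      echo "ERROR: Unable to create ${dir}. Aborting." 1>&2
      return 1
    fi
  fi
  if [[ ! -w "${dir}" ]]; then
    echo "ERROR: Cannot write to ${dir}." 1>&2
    return 1
  fi
  return 0
}
```

Usage: sketch_verify_dir "${HADOOP_LOG_DIR}" || exit 1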
29
Old:
    (foo) > (foo).out
    rm (foo).out
        = Open file handle

New:
    (foo) >> (foo).out
    rm (foo).out
        = Closed file handle
        = rotatable .out files
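
The slide does not spell out the mechanism. One general POSIX behaviour that makes ">>" friendlier to rotation is that an append-mode writer always writes at the current end of the file, while a ">" writer keeps its old offset after someone else truncates the file. The demo below is a sketch of that behaviour only; it is not taken from the Hadoop code.

```bash
scratch=$(mktemp -d)

# ">" style: the writer's offset survives truncation by someone else
exec 3> "${scratch}/trunc.out"
echo "aaaa" >&3
: > "${scratch}/trunc.out"      # a rotation tool empties the file
echo "b" >&3                    # lands at the old offset, leaving a hole
exec 3>&-

# ">>" style: every write lands at the current end of the file
exec 4>> "${scratch}/append.out"
echo "aaaa" >&4
: > "${scratch}/append.out"     # same truncation
echo "b" >&4                    # lands at offset 0
exec 4>&-

# trunc.out is 7 bytes (5 NUL bytes plus "b\n"); append.out is 2 bytes
wc -c "${scratch}/trunc.out" "${scratch}/append.out"
```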
30
Old:
    sbin/*-daemons.sh -> slaves.sh blah
    (several hundred ssh processes later)
    *crash*

New:
    sbin/*-daemons.sh -> hadoop-functions.sh
    slaves.sh -> hadoop-functions.sh
    pdsh or (if enabled) xargs -P
    *real work gets done*
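
A sketch of the xargs -P idea: cap the number of concurrent connections instead of forking one ssh per host all at once. The function name, the host file format and the SKETCH_RSH variable are assumptions made for this example; pdsh is not shown.

```bash
# Run a command on every host listed in a file, at most N at a time.
#   $1 = host file, one host per line ("#" comments and blank lines skipped)
#   $2 = maximum number of parallel connections
#   remaining arguments = command to run on each host
function sketch_run_on_hosts
{
  local hostfile=$1
  local parallel=$2
  shift 2

  # -I {} runs one invocation per input line; -P bounds the concurrency.
  # SKETCH_RSH is a hypothetical knob so the transport can be swapped out.
  grep -v -e '^[[:space:]]*$' -e '^#' "${hostfile}" \
    | xargs -P "${parallel}" -I {} ${SKETCH_RSH:-ssh} {} "$@"
}
```

Usage: sketch_run_on_hosts workers.txt 10 uptime. Setting SKETCH_RSH=echo gives a dry run that only prints what would be executed.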
31
Old:
    egrep -c '^#' hadoop-branch-2/…/*-env.sh
        hadoop-env.sh: 59
        mapred-env.sh: 21
        yarn-env.sh: 60
New:
    egrep -c '^#' hadoop-trunk/…/*-env.sh
        hadoop-env.sh: 333
        mapred-env.sh: 40
        yarn-env.sh: 112
        + hadoop-layout.sh.example: 77
        + hadoop-user-functions.sh.example: 109
But wait! There’s more!
33
    HADOOP_namenode_USER=hdfs
        hdfs namenode only works as hdfs
        Fun: HADOOP_fs_USER=aw
            hadoop fs only works as aw

    hadoop --loglevel WARN
        => WARN,whatever
    hadoop --loglevel DEBUG --daemon start
        => start daemon in DEBUG mode
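
A per-subcommand user restriction can be sketched with bash indirect expansion: build the variable name from the subcommand, then compare its value with the current user. The function name and the optional second argument are hypothetical and exist only to make the example testable.

```bash
function sketch_verify_user
{
  local subcmd=$1
  local varname="HADOOP_${subcmd}_USER"
  local wanted=${!varname}       # value of HADOOP_<subcmd>_USER, if set
  local me=${2:-$(id -un)}       # second argument only eases testing

  if [[ -n "${wanted}" && "${wanted}" != "${me}" ]]; then
    echo "ERROR: ${subcmd} can only be executed by ${wanted}." 1>&2
    return 1
  fi
  return 0
}
```

This sketch assumes the subcommand contains only characters that are valid in a shell variable name.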
34
Old:
    HADOOP_HEAPSIZE=15234      <--- M only
    JAVA_HEAP_MAX="hahahah you set something in HADOOP_HEAPSIZE"

New:
    HADOOP_HEAPSIZE_MAX=15g
    HADOOP_HEAPSIZE_MIN=10g    <--- units!
    JAVA_HEAP_MAX removed =>
        no Xmx settings == Java default
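
A sketch of how unit-aware heap settings might be turned into JVM flags. Assumptions, none of them from the slide: a bare number is still read as megabytes, the minimum maps to -Xms, and the function name is invented.

```bash
function sketch_heap_opts
{
  local max=$1
  local min=$2
  local opts=""

  # a bare number keeps the old meaning of megabytes
  if [[ "${max}" =~ ^[0-9]+$ ]]; then max="${max}m"; fi
  if [[ "${min}" =~ ^[0-9]+$ ]]; then min="${min}m"; fi

  # nothing set means no flag at all, so the JVM default applies
  if [[ -n "${max}" ]]; then opts="-Xmx${max}"; fi
  if [[ -n "${min}" ]]; then opts="${opts:+${opts} }-Xms${min}"; fi
  echo "${opts}"
}

sketch_heap_opts 15g 10g     # prints: -Xmx15g -Xms10g
sketch_heap_opts 15234 ""    # prints: -Xmx15234m
```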
  
35
Old:
    Lots of different yet same variables for settings

New:
    Deprecated ~60 variables
    ${HDFS|YARN|KMS|HTTPFS|*}_{foo} =>
        HADOOP_{foo}
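
Backward compatibility for a renamed variable can be sketched as: if the old name is set, warn and copy its value to the new name. The function name is hypothetical, and YARN_LOG_DIR / HADOOP_LOG_DIR is used only as an example of the pattern above.

```bash
function sketch_deprecate_envvar
{
  local oldvar=$1
  local newvar=$2
  local oldval=${!oldvar}

  if [[ -n "${oldval}" ]]; then
    echo "WARNING: ${oldvar} has been replaced by ${newvar}. Using value of ${oldvar}." 1>&2
    printf -v "${newvar}" '%s' "${oldval}"
  fi
}

YARN_LOG_DIR=/var/log/yarn          # as an older config file might set it
sketch_deprecate_envvar YARN_LOG_DIR HADOOP_LOG_DIR
echo "${HADOOP_LOG_DIR}"            # prints: /var/log/yarn
```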
36
Old:
    "I wonder what's in HADOOP_CLIENT_OPTS?"
    "I want to override just this one thing in *-env.sh."

New:
    ${HOME}/.hadooprc
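
A sketch of the idea, assuming the per-user file is simply sourced when present. The loader name and the file contents shown in the comment are hypothetical.

```bash
# Example ${HOME}/.hadooprc (hypothetical contents):
#   HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS} -Xmx2g"

function sketch_load_user_rc
{
  local rc="${HOME}/.hadooprc"

  # personal overrides are optional: a missing file is not an error
  if [[ -f "${rc}" ]]; then
    . "${rc}"
  fi
}
```

Because the file is plain bash, a user can override one setting without touching the site-wide *-env.sh files.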
37
shellprofile.d

    bash snippets to easily inject:
        classpath
        JNI
        Java command line options
        ... and more
38 https://www.flickr.com/photos/83633410@N07/7658230838/
Power Users Rejoice:
Function Overrides
40
Default *.out log rotation:

    function hadoop_rotate_log
    {
      local log=$1;
      local num=${2:-5};

      if [[ -f "${log}" ]]; then # rotate logs
        while [[ ${num} -gt 1 ]]; do
          let prev=${num}-1
          if [[ -f "${log}.${prev}" ]]; then
            mv "${log}.${prev}" "${log}.${num}"
          fi
          num=${prev}
        done
        mv "${log}" "${log}.${num}"
      fi
    }

    namenode.out.1 -> namenode.out.2
    namenode.out -> namenode.out.1
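
To see the rotation order in practice, the default function can be exercised in a scratch directory. The block below repeats the function from the slide so that it runs on its own; the scratch directory, the file contents and the choice of keeping 2 generations are made up for the demo.

```bash
# Copy of the default function shown above
function hadoop_rotate_log
{
  local log=$1;
  local num=${2:-5};

  if [[ -f "${log}" ]]; then # rotate logs
    while [[ ${num} -gt 1 ]]; do
      let prev=${num}-1
      if [[ -f "${log}.${prev}" ]]; then
        mv "${log}.${prev}" "${log}.${num}"
      fi
      num=${prev}
    done
    mv "${log}" "${log}.${num}"
  fi
}

scratch=$(mktemp -d)
log="${scratch}/namenode.out"

# simulate three daemon starts, rotating before each one
for run in first second third; do
  hadoop_rotate_log "${log}" 2
  echo "${run} run" > "${log}"
done

cat "${log}"      # third run
cat "${log}.1"    # second run
cat "${log}.2"    # first run
```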
41
Put a replacement rotate function w/gzip support in hadoop-user-functions.sh!

    function hadoop_rotate_log
    {
      local log=$1;
      local num=${2:-5};

      if [[ -f "${log}" ]]; then
        while [[ ${num} -gt 1 ]]; do
          let prev=${num}-1
          if [[ -f "${log}.${prev}.gz" ]]; then
            mv "${log}.${prev}.gz" "${log}.${num}.gz"
          fi
          num=${prev}
        done
        mv "${log}" "${log}.${num}"
        gzip -9 "${log}.${num}"
      fi
    }

    namenode.out.1.gz -> namenode.out.2.gz
    namenode.out -> namenode.out.1
    gzip -9 namenode.out.1 -> namenode.out.1.gz
What if we wanted to log
every daemon start in
syslog?
43
Default daemon starter:

    function hadoop_start_daemon
    {
      local command=$1
      local class=$2
      shift 2

      hadoop_debug "Final CLASSPATH: ${CLASSPATH}"
      hadoop_debug "Final HADOOP_OPTS: ${HADOOP_OPTS}"

      export CLASSPATH
      exec "${JAVA}" "-Dproc_${command}" ${HADOOP_OPTS} "${class}" "$@"
    }
44
Put a replacement start function in hadoop-user-functions.sh!

    function hadoop_start_daemon
    {
      local command=$1
      local class=$2
      shift 2

      hadoop_debug "Final CLASSPATH: ${CLASSPATH}"
      hadoop_debug "Final HADOOP_OPTS: ${HADOOP_OPTS}"

      export CLASSPATH
      # record the start in syslog before exec replaces this shell
      logger -i -p local0.notice -t hadoop "Started ${command}"
      exec "${JAVA}" "-Dproc_${command}" ${HADOOP_OPTS} "${class}" "$@"
    }
Secure Daemons
What if we could start them
as non-root?
47
Setup:

sudoers (either /etc/sudoers or in LDAP):

    hdfs ALL=(root:root) NOPASSWD: /usr/bin/jsvc

hadoop-env.sh:

    HADOOP_SECURE_COMMAND=/usr/sbin/sudo
48
    # hadoop-user-functions.sh: (partial code below)
    function hadoop_start_secure_daemon
    {
      …
      jsvc="${JSVC_HOME}/jsvc"

      # string comparison: only the designated secure user may continue
      if [[ "${USER}" != "${HADOOP_SECURE_USER}" ]]; then
        hadoop_error "You must be ${HADOOP_SECURE_USER} in order to start a secure ${daemonname}"
        exit 1
      fi
      …
      # sudo runs jsvc as root; jsvc then switches to the secure user
      exec /usr/sbin/sudo "${jsvc}" "-Dproc_${daemonname}" \
        -outfile "${daemonoutfile}" -errfile "${daemonerrfile}" \
        -pidfile "${daemonpidfile}" -nodetach -home "${JAVA_HOME}" \
        -user "${HADOOP_SECURE_USER}" \
        -cp "${CLASSPATH}" ${HADOOP_OPTS} "${class}" "$@"
    }
49
    $ hdfs datanode
    sudo launches jsvc as root
    jsvc launches secure datanode

In order to get --daemon start to work, one other function needs to get replaced*, but that's a SMOP, now that you know how!

* - hadoop_start_secure_daemon_wrapper assumes it is running as root
50
Lots more, but out of time... e.g.:

    Internals for contributors
    Unit tests
    API documentation
    Other projects in the works
    ...

    Reminder: This is in trunk. Ask vendors their plans!
51 https://www.flickr.com/photos/nateone/3768979925
Altiscale copyright 2015. All rights reserved.

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 

Apache Hadoop Shell Rewrite

  • 1. Shell Script Rewrite Overview Allen Wittenauer
  • 2. Twitter: @_a__w_ (1 a 2 w 1) Email: aw @ apache.org
  • 3. What is the shell code?
          bin/*
          etc/hadoop/*sh
          libexec/*
          sbin/*
  • 5. CUTTING, DOUG 1710 554 6239 2005 APACHE SOFTWARE FOUNDATION
  • 12. “[The scripts] finally got to you, didn’t they?”
  • 13. Primary Goals
          Consistency
          Code and Config Simplification
          De-clash Parameters
          Documentation
      Secondary Goals
          Backward Compatibility
          “Lost” Ideas and Fixes
  • 15. Tuesday, August 19, 2014: majority committed into trunk ... followed by many fixes & enhancements from the community
  • 17. Old:
          hadoop -> hadoop-config.sh -> hadoop-env.sh
          yarn -> yarn-config.sh -> yarn-env.sh
          hdfs -> hdfs-config.sh -> hadoop-env.sh
      New:
          hadoop -> hadoop-config.sh -> hadoop-functions.sh
                                     -> hadoop-env.sh
          yarn -> yarn-config.sh -> hadoop-config.sh -> (above)
                                 -> yarn-env.sh
          hdfs -> hdfs-config.sh -> hadoop-config.sh -> (above)
  • 18. Old:
          yarn-env.sh:
              JAVA_HOME=xyz
          hadoop-env.sh:
              JAVA_HOME=xyz
          mapred-env.sh:
              JAVA_HOME=xyz
      New:
          hadoop-env.sh:
              JAVA_HOME=xyz
          OS X:
              JAVA_HOME=$(/usr/libexec/java_home)
  • 19. Old:
          xyz_OPT="-Xmx4g" hdfs namenode
              java … -Xmx1000 … -Xmx4g …
          Command line size: ~2500 bytes
      New:
          xyz_OPT="-Xmx4g" hdfs namenode
              java … -Xmx4g …
          Command line size: ~1750 bytes
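
One way to end up with a single -Xmx on the command line is to append a default option only when nothing similar has been set already. The sketch below illustrates that idea; `add_param_once` and the `OPTS` variable are hypothetical names chosen for this example and should not be read as the rewrite's actual code.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not the Hadoop implementation): append an option to a
# named variable only when a marker string is not already present in it.
add_param_once() {
  local varname=$1   # name of the variable holding the options
  local marker=$2    # substring that identifies the option, e.g. "Xmx"
  local option=$3    # full option to append, e.g. "-Xmx1000m"
  local current=${!varname}

  if [[ "${current}" != *"${marker}"* ]]; then
    # Append, adding a separating space only if there was a previous value.
    printf -v "${varname}" '%s' "${current:+${current} }${option}"
  fi
}

OPTS="-Xmx4g"
add_param_once OPTS Xmx "-Xmx1000m"   # skipped: an Xmx is already set
add_param_once OPTS preferIPv4Stack "-Djava.net.preferIPv4Stack=true"
echo "${OPTS}"
```

Because the user-supplied value is in place before the default is offered, the default is simply dropped instead of being overridden later on the same command line.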
  • 20. $ TOOL_PATH=blah:blah:blah hadoop distcp /old /new
      Error: could not find or load main class org.apache.hadoop.tools.DistCp
      Old:
          $ bash -x hadoop distcp /old /new
          + this=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/hadoop
          +++ dirname -- /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/hadoop
          ++ cd -P -- /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin
          ++ pwd -P
          + bin=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin
          + DEFAULT_LIBEXEC_DIR=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/../libexec
          + HADOOP_LIBEXEC_DIR=/home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/../libexec
          + [[ -f /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/../libexec/hadoop-config.sh ]]
          …
  • 21. New:
          $ TOOL_PATH=blah:blah:blah hadoop --debug distcp /tmp/ /1
          DEBUG: HADOOP_CONF_DIR=/home/aw/HADOOP/conf
          DEBUG: Initial CLASSPATH=/home/aw/HADOOP/conf
          …
          DEBUG: Append CLASSPATH: /home/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/share/hadoop/mapreduce/*
          DEBUG: Injecting TOOL_PATH into CLASSPATH
          DEBUG: Rejected CLASSPATH: blah:blah:blah (does not exist)
          …
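
The debug output above suggests that classpath entries are validated as they are added. A minimal sketch of that kind of check follows, assuming entries are plain directories; `add_classpath_entry` and the "(duplicate)" message are hypothetical and only imitate the style of the messages on the slide.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not the Hadoop implementation): append a directory to
# CLASSPATH only if it exists and is not already listed, reporting each
# decision on stderr in the style of the slide's DEBUG lines.
add_classpath_entry() {
  local entry=$1

  if [[ ! -d "${entry}" ]]; then
    echo "DEBUG: Rejected CLASSPATH: ${entry} (does not exist)" >&2
    return 1
  fi
  if [[ ":${CLASSPATH}:" == *":${entry}:"* ]]; then
    echo "DEBUG: Rejected CLASSPATH: ${entry} (duplicate)" >&2
    return 1
  fi
  CLASSPATH="${CLASSPATH:+${CLASSPATH}:}${entry}"
  echo "DEBUG: Append CLASSPATH: ${entry}" >&2
}

demo_dir=$(mktemp -d)
CLASSPATH=""
add_classpath_entry "${demo_dir}"
add_classpath_entry "${demo_dir}" || true           # rejected as a duplicate
add_classpath_entry "${demo_dir}/missing" || true   # rejected: not a directory
echo "${CLASSPATH}"
```

Reporting the rejection at the point where the entry is added is what makes a bad TOOL_PATH visible without reading a `bash -x` trace.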
  • 25. Old:
          hadoop thisisnotacommand
          == stack trace
      New:
          hadoop thisisnotacommand
          == hadoop help
  • 26. Old:
          sbin/hadoop-daemon.sh start namenode
          sbin/yarn-daemon.sh start resourcemanager
      New:
          bin/hdfs --daemon start namenode
          bin/yarn --daemon start resourcemanager
          + common daemon start/stop/status routines
  • 27. hdfs namenode vs hadoop-daemon.sh namenode
      Old:
          - effectively different code paths
          - no pid vs pid
              - wait for socket for failure
      New:
          - same code path
              - hadoop-daemon.sh cmd => hdfs --daemon cmd
          - both generate pid
          - hdfs --daemon status namenode
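
Once every start path writes a pid file, a status check becomes straightforward. The following is a rough sketch of such a check, not the routine shipped in trunk; `daemon_status` is a hypothetical name, the return codes follow the common init-script convention, and `sleep` stands in for a real daemon.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a pid-file status check; not the Hadoop routine.
# Return codes: 0 running, 1 stale pid file, 3 not running (no pid file).
daemon_status() {
  local pidfile=$1
  local pid

  if [[ ! -f "${pidfile}" ]]; then
    return 3
  fi
  pid=$(cat "${pidfile}")
  # kill -0 sends no signal; it only tests whether the process can be signalled.
  if kill -0 "${pid}" 2>/dev/null; then
    return 0
  fi
  return 1
}

pidfile=$(mktemp)
sleep 30 &                  # stand-in for a daemon
echo $! > "${pidfile}"
daemon_status "${pidfile}" && echo "running"
kill "$(cat "${pidfile}")"
wait 2>/dev/null
daemon_status "${pidfile}" || echo "stale pid file"
```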
  • 28. Old:
          “mkdir: cannot create <dir>”
          “chown: cannot change permission of <dir>”
      New:
          “WARNING: <dir> does not exist. Creating.”
          “ERROR: Unable to create <dir>. Aborting.”
          “ERROR: Cannot write to <dir>.”
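
Messages like the new ones come from checking a directory explicitly before using it, instead of letting `mkdir` or `chown` fail on their own. A small sketch under that assumption; `ensure_writable_dir` is a hypothetical name, not a function from the rewrite.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not the Hadoop implementation) that produces messages
# in the style shown on the slide.
ensure_writable_dir() {
  local dir=$1

  if [[ ! -d "${dir}" ]]; then
    echo "WARNING: ${dir} does not exist. Creating." >&2
    if ! mkdir -p "${dir}" 2>/dev/null; then
      echo "ERROR: Unable to create ${dir}. Aborting." >&2
      return 1
    fi
  fi
  if [[ ! -w "${dir}" ]]; then
    echo "ERROR: Cannot write to ${dir}." >&2
    return 1
  fi
}

base=$(mktemp -d)
ensure_writable_dir "${base}/logs" && echo "ok: ${base}/logs"
```

The caller decides what "Aborting" means; here the function only returns non-zero so that it can be reused from scripts that need to clean up first.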
  • 29. Old:
          (foo) > (foo).out
          rm (foo).out
          = Open file handle
      New:
          (foo) >> (foo).out
          rm (foo).out
          = Closed file handle
          = rotatable .out files
  • 30. Old:
          sbin/*-daemons.sh -> slaves.sh blah
          (several hundred ssh processes later)
          *crash*
      New:
          sbin/*-daemons.sh -> hadoop-functions.sh
          slaves.sh -> hadoop-functions.sh
          pdsh or (if enabled) xargs -P
          *real work gets done*
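
The point of `xargs -P` is to cap how many remote connections run at once. The sketch below shows the mechanism only; `run_on_hosts` is a hypothetical wrapper, and `echo` stands in for the real remote command so the example runs without any hosts to connect to.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: run one command per host with bounded parallelism.
# `echo` stands in for the real remote command (e.g. ssh) so this runs anywhere.
run_on_hosts() {
  local parallelism=$1
  shift
  # Host names arrive on stdin, one per line. At most ${parallelism} jobs run
  # at a time, and {} is replaced by the host name.
  xargs -P "${parallelism}" -I {} "$@" {}
}

printf '%s\n' host1 host2 host3 host4 | run_on_hosts 2 echo "would connect to"
```

Output order is not guaranteed, since the jobs run concurrently.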
  • 31. Old:
          egrep -c '^#' hadoop-branch-2/…/*-env.sh
              hadoop-env.sh: 59
              mapred-env.sh: 21
              yarn-env.sh: 60
      New:
          egrep -c '^#' hadoop-trunk/…/*-env.sh
              hadoop-env.sh: 333
              mapred-env.sh: 40
              yarn-env.sh: 112
              + hadoop-layout.sh.example: 77
              + hadoop-user-functions.sh.example: 109
  • 33. HADOOP_namenode_USER=hdfs
          hdfs namenode only works as hdfs
      Fun: HADOOP_fs_USER=aw
          hadoop fs only works as aw
      hadoop --loglevel WARN
          => WARN,whatever
      hadoop --loglevel DEBUG --daemon start
          => start daemon in DEBUG mode
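
A per-subcommand user restriction can be looked up by building the variable name from the subcommand. The sketch below assumes that approach; `verify_command_user` and its error text are hypothetical, and the current user is passed in as an argument so the example is easy to try.

```bash
#!/usr/bin/env bash
# Hypothetical check (not the Hadoop implementation): refuse to run a
# subcommand unless the given user matches HADOOP_<subcommand>_USER.
verify_command_user() {
  local subcommand=$1
  local current_user=$2
  local varname="HADOOP_${subcommand}_USER"
  local required=${!varname}   # indirect lookup; empty if the variable is unset

  if [[ -n "${required}" && "${required}" != "${current_user}" ]]; then
    echo "ERROR: ${subcommand} can only be executed by ${required}." >&2
    return 1
  fi
}

HADOOP_namenode_USER=hdfs
verify_command_user namenode hdfs && echo "namenode: allowed for hdfs"
verify_command_user namenode aw   || echo "namenode: refused for aw"
verify_command_user fs aw         && echo "fs: no restriction set"
```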
  • 34. Old:
          HADOOP_HEAPSIZE=15234   <--- M only
          JAVA_HEAP_MAX="hahahah you set something in HADOOP_HEAPSIZE"
      New:
          HADOOP_HEAPSIZE_MAX=15g
          HADOOP_HEAPSIZE_MIN=10g   <--- units!
          JAVA_HEAP_MAX removed => no Xmx settings == Java default
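
To make the unit handling concrete, here is a sketch of how a heap setting could be turned into a JVM flag. It assumes that a bare number still means megabytes, that a value with a unit is passed through unchanged, and that the MIN setting maps to -Xms; `heap_flag` is a hypothetical name and none of this is taken from the rewrite's code.

```bash
#!/usr/bin/env bash
# Hypothetical sketch (not the Hadoop implementation): turn a heap size
# setting into a JVM flag. An empty value yields no flag at all, so the JVM
# default applies.
heap_flag() {
  local flag=$1    # "-Xmx" or "-Xms"
  local size=$2

  if [[ -z "${size}" ]]; then
    return 0
  fi
  # Assumption: a bare number means megabytes, as with the old HADOOP_HEAPSIZE.
  if [[ "${size}" =~ ^[0-9]+$ ]]; then
    size="${size}m"
  fi
  echo "${flag}${size}"
}

heap_flag -Xmx 15g      # prints -Xmx15g
heap_flag -Xms 10g      # prints -Xms10g
heap_flag -Xmx 15234    # prints -Xmx15234m
heap_flag -Xmx ""       # prints nothing
```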
  • 35. Old:
          Lots of different yet same variables for settings
      New:
          Deprecated ~60 variables
          ${HDFS|YARN|KMS|HTTPFS|*}_{foo} => HADOOP_{foo}
  • 36. Old:
          "I wonder what's in HADOOP_CLIENT_OPTS?"
          "I want to override just this one thing in *-env.sh."
      New:
          ${HOME}/.hadooprc
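
As an illustration, a per-user override file might contain a single assignment. This assumes the file is read as ordinary bash after the site-wide settings, which the slide implies but does not spell out; the value shown is made up for the example.

```bash
# ${HOME}/.hadooprc (hypothetical contents)
# Per-user override, without editing the shared *-env.sh files.
HADOOP_CLIENT_OPTS="-Xmx1g"
```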
  • 37. shellprofile.d
          bash snippets to easily inject:
              classpath
              JNI
              Java command line options
              ... and more
  • 40. Default *.out log rotation:
      function hadoop_rotate_log
      {
        local log=$1;
        local num=${2:-5};
        if [[ -f "${log}" ]]; then # rotate logs
          while [[ ${num} -gt 1 ]]; do
            let prev=${num}-1
            if [[ -f "${log}.${prev}" ]]; then
              mv "${log}.${prev}" "${log}.${num}"
            fi
            num=${prev}
          done
          mv "${log}" "${log}.${num}"
        fi
      }
      namenode.out.1 -> namenode.out.2
      namenode.out -> namenode.out.1
  • 41. Put a replacement rotate function w/gzip support in hadoop-user-functions.sh!
      function hadoop_rotate_log
      {
        local log=$1;
        local num=${2:-5};
        if [[ -f "${log}" ]]; then
          while [[ ${num} -gt 1 ]]; do
            let prev=${num}-1
            if [[ -f "${log}.${prev}.gz" ]]; then
              mv "${log}.${prev}.gz" "${log}.${num}.gz"
            fi
            num=${prev}
          done
          mv "${log}" "${log}.${num}"
          gzip -9 "${log}.${num}"
        fi
      }
      namenode.out.1.gz -> namenode.out.2.gz
      namenode.out -> namenode.out.1
      gzip -9 namenode.out.1 -> namenode.out.1.gz
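
To see the gzip variant in action outside of Hadoop, the function can be exercised against a scratch directory. The copy below is lightly adapted from the slide (arithmetic expansion instead of `let`, and `prev` declared local); the demo file name and contents are made up, and `gzip` is assumed to be installed.

```bash
#!/usr/bin/env bash
# The gzip-rotating function from the slide, lightly adapted, followed by a
# throwaway demo in a temporary directory.
function hadoop_rotate_log
{
  local log=$1
  local num=${2:-5}
  local prev

  if [[ -f "${log}" ]]; then
    # Shift older compressed logs up by one: .1.gz -> .2.gz, and so on.
    while [[ ${num} -gt 1 ]]; do
      prev=$((num - 1))
      if [[ -f "${log}.${prev}.gz" ]]; then
        mv "${log}.${prev}.gz" "${log}.${num}.gz"
      fi
      num=${prev}
    done
    # num is now 1: move the live log aside and compress it.
    mv "${log}" "${log}.${num}"
    gzip -9 "${log}.${num}"
  fi
}

demo_dir=$(mktemp -d)
log="${demo_dir}/namenode.out"   # hypothetical file name for the demo

echo "first run"  > "${log}"; hadoop_rotate_log "${log}"
echo "second run" > "${log}"; hadoop_rotate_log "${log}"
ls "${demo_dir}"
```

After the two rotations the directory holds namenode.out.1.gz (the most recent log) and namenode.out.2.gz (the older one).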
  • 42. What if we wanted to log every daemon start in syslog?
  • 43. Default daemon starter:
      function hadoop_start_daemon
      {
        local command=$1
        local class=$2
        shift 2
        hadoop_debug "Final CLASSPATH: ${CLASSPATH}"
        hadoop_debug "Final HADOOP_OPTS: ${HADOOP_OPTS}"
        export CLASSPATH
        exec "${JAVA}" "-Dproc_${command}" ${HADOOP_OPTS} "${class}" "$@"
      }
  • 44. Put a replacement start function in hadoop-user-functions.sh!
      function hadoop_start_daemon
      {
        local command=$1
        local class=$2
        shift 2
        hadoop_debug "Final CLASSPATH: ${CLASSPATH}"
        hadoop_debug "Final HADOOP_OPTS: ${HADOOP_OPTS}"
        export CLASSPATH
        logger -i -p local0.notice -t hadoop "Started ${command}"
        exec "${JAVA}" "-Dproc_${command}" ${HADOOP_OPTS} "${class}" "$@"
      }
  • 46. What if we could start them as non-root?
  • 47. Setup:
      sudoers (either /etc/sudoers or in LDAP):
          hdfs ALL=(root:root) NOPASSWD: /usr/bin/jsvc
      hadoop-env.sh:
          HADOOP_SECURE_COMMAND=/usr/sbin/sudo
  • 48. # hadoop-user-functions.sh: (partial code below)
      function hadoop_start_secure_daemon
      {
        …
        jsvc="${JSVC_HOME}/jsvc"
        # String comparison, so != rather than the integer operator -ne
        if [[ "${USER}" != "${HADOOP_SECURE_USER}" ]]; then
          hadoop_error "You must be ${HADOOP_SECURE_USER} in order to start a secure ${daemonname}"
          exit 1
        fi
        …
        exec /usr/sbin/sudo "${jsvc}" "-Dproc_${daemonname}" \
          -outfile "${daemonoutfile}" -errfile "${daemonerrfile}" \
          -pidfile "${daemonpidfile}" -nodetach -home "${JAVA_HOME}" \
          -user "${HADOOP_SECURE_USER}" \
          -cp "${CLASSPATH}" ${HADOOP_OPTS} "${class}" "$@"
      }
  • 49. $ hdfs datanode
          sudo launches jsvc as root
          jsvc launches secure datanode
      In order to get --daemon start to work, one other function needs to get replaced*, but that’s a SMOP, now that you know how!
      * - hadoop_start_secure_daemon_wrapper assumes it is running as root
  • 50. Lots more, but out of time... e.g.:
          Internals for contributors
          Unit tests
          API documentation
          Other projects in the works
          ...
      Reminder: This is in trunk. Ask vendors their plans!
  • 52. Altiscale copyright 2015. All rights reserved.