
Status of Hadoop 0.23 Operations at Yahoo

Published in: Technology, Business

  1. Status of Hadoop 0.23 Operations at Yahoo!
     From Inception to Customer Validation
     Charles Wimmer, Staff Site Reliability Engineer at LinkedIn
  2. Summary of This Talk
     ● Includes
       ○ Operational changes required to support 0.23
     ● Does not include
       ○ Specifics about customer testing
       ○ Deployment into Research or Production clusters
  3. Scope of This Change at Yahoo!
     ● 42,000+ Hadoop servers
     ● 20+ clusters
     ● Three tiers: Sandbox, Research, Production
     ● 0.20.205.x
  4. Overview of the Process
     ● Provide customers a 0.23 Sandbox cluster
     ● Provide customers enough data to test their applications
     ● Provide developer support to address application issues quickly
     ● Upgrade Research and Production clusters as applications are certified to work with 0.23
  5. Test Cluster
     ● 420 Nodes
     ● 2 x Westmere 4 core processors
     ● 24G RAM
     ● 12 x 2T Disks
     ● No Federation
  6. Configuration
     ● Hierarchical Queues
     ● Memory Configuration
     ● Kerberos
  7. Hierarchical Queues
     <property>
       <name>yarn.scheduler.capacity.root.queues</name>
       <value>BIZUNIT-A,BIZUNIT-U,BIZUNIT-C,unfunded</value>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.capacity</name>
       <value>100</value>
     </property>
  8. Hierarchical Queues
     <property>
       <name>yarn.scheduler.capacity.root.BIZUNIT-A.capacity</name>
       <value>50</value>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.BIZUNIT-U.capacity</name>
       <value>30</value>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.BIZUNIT-C.capacity</name>
       <value>15</value>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.unfunded.capacity</name>
       <value>5</value>
     </property>
  9. Hierarchical Queues
     <property>
       <name>yarn.scheduler.capacity.root.BIZUNIT-U.proj-a.capacity</name>
       <value>50</value>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.BIZUNIT-U.proj-b.capacity</name>
       <value>50</value>
     </property>
  10. Hierarchical Queues
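
The capacities on the Hierarchical Queues slides are percentages of the parent queue, and the Capacity Scheduler expects the children of each parent to add up to 100. The following is a minimal sketch of that check, not a tool from the talk: `sum_by_parent` is a hypothetical helper, and the input is just the queue paths and values copied from the slides.

```shell
#!/usr/bin/env bash
# Sketch: sum the capacities of sibling queues, grouped by parent queue.
# Input lines are "<queue path> <capacity>" pairs taken from the slides.
# sum_by_parent is a hypothetical helper, not part of Hadoop.
sum_by_parent() {
  awk '{
    parent = $1
    sub(/\.[^.]*$/, "", parent)   # drop the last path component
    total[parent] += $2
  }
  END { for (p in total) print p, total[p] }' | LC_ALL=C sort
}

capacities='root.BIZUNIT-A 50
root.BIZUNIT-U 30
root.BIZUNIT-C 15
root.unfunded 5
root.BIZUNIT-U.proj-a 50
root.BIZUNIT-U.proj-b 50'

# Prints one line per parent queue with the sum of its children.
printf '%s\n' "$capacities" | sum_by_parent
```

For the values on the slides this prints `root 100` and `root.BIZUNIT-U 100`, so both levels of the hierarchy are fully allocated.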
  11. Memory Configuration
      <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>21504</value>
      </property>
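
To put the memory setting in context, here is a small arithmetic sketch. It assumes the "24G RAM" on the Test Cluster slide means 24 x 1024 MB, and the 1536 MB container size is purely a hypothetical figure for illustration; neither assumption is stated in the talk.

```shell
#!/usr/bin/env bash
# Sketch: relate yarn.nodemanager.resource.memory-mb to node RAM.
node_ram_mb=$((24 * 1024))   # assumption: "24G RAM" = 24576 MB
yarn_mb=21504                # value from the Memory Configuration slide
container_mb=1536            # hypothetical container size, not from the talk

headroom_mb=$((node_ram_mb - yarn_mb))
containers=$((yarn_mb / container_mb))

echo "Memory offered to YARN containers: $((yarn_mb / 1024)) GB"
echo "Left for the OS and Hadoop daemons: ${headroom_mb} MB"
echo "Containers of ${container_mb} MB per node: ${containers}"
```

Under those assumptions the node offers 21 GB to containers and keeps 3072 MB back for the operating system, the DataNode, and the NodeManager itself.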
  12. Kerberos Configuration
      <property>
        <name>yarn.resourcemanager.principal</name>
        <value>mapred/clustername-jt1.domain.name.com@REALM.NAME.COM</value>
      </property>
      <property>
        <name>yarn.nodemanager.principal</name>
        <value>tt/_HOST@REALM.NAME.COM</value>
      </property>
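
The `_HOST` token in the NodeManager principal is a placeholder that Hadoop replaces with the host name of the machine the daemon runs on, which lets one configuration file serve every node. The sketch below only imitates that substitution in the shell; `expand_principal` and the host name `node001.domain.name.com` are hypothetical and are not taken from the talk.

```shell
#!/usr/bin/env bash
# Sketch: imitate how a _HOST principal pattern resolves on a given node.
# expand_principal is a hypothetical helper; Hadoop does this internally.
expand_principal() {
  local pattern=$1 host=$2
  printf '%s\n' "${pattern/_HOST/$host}"
}

# node001.domain.name.com is a made-up host name for illustration.
expand_principal 'tt/_HOST@REALM.NAME.COM' 'node001.domain.name.com'
```

The ResourceManager principal on the slide spells out its host name instead, which works because only one host runs that daemon.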
  13. Init Scripts
      ● DataNode/NameNode
      ● SecondaryNameNode
      ● HistoryServer
      ● NodeManager
      ● ResourceManager
  14. DataNode/NameNode
      start_20()
      {
        . . .
      }
      start_next()
      {
        . . .
      }
      if [ -x /home/gs/hadoop/current/bin/hdfs ] ; then
        start_next $@
      else
        start_20 $@
      fi
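
The DataNode/NameNode slide elides both function bodies, so the following is only a sketch of the dispatch pattern it shows: the init script appears to pick the start routine by checking whether the installed release ships an executable `bin/hdfs`. The function bodies here are stand-ins that print a message, and `dispatch_start` is a hypothetical wrapper added so the check can be exercised against any path.

```shell
#!/usr/bin/env bash
# Sketch: one init script that serves both the old and the new release.
# The real start_20/start_next bodies are elided on the slide; these
# stand-ins only report which branch was taken.
start_20()   { echo "starting with the 0.20 routine: $*"; }
start_next() { echo "starting with the 0.23 routine: $*"; }

# dispatch_start is hypothetical: the slide hard-codes the path instead.
dispatch_start() {
  local hdfs_bin=$1
  shift
  if [ -x "$hdfs_bin" ]; then
    start_next "$@"
  else
    start_20 "$@"
  fi
}

# Which line this prints depends on what is installed on the machine.
dispatch_start /home/gs/hadoop/current/bin/hdfs datanode
```

Keying on a file that exists only in the newer layout means the same script can be deployed ahead of the upgrade and keeps working after a rollback.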
  15. SecondaryNameNode
      function clean_checkpoint_dir {
        CHECKPOINT_DIR=/grid/0/tmp/hadoop-hdfs/dfs/namesecondary/current
        if [ -d "$CHECKPOINT_DIR" ] ; then
          DELETE_DIR=`mktemp -p /grid/0/tmp -d delete-XXXXXX`
          if [ $? -eq 0 ] ; then
            echo "moving $CHECKPOINT_DIR to ${DELETE_DIR}/ "
            mv $CHECKPOINT_DIR ${DELETE_DIR}/
            cat<<EOF | at now+1min 2>/dev/null
      if [ -d $DELETE_DIR ] ; then
        rm -rf --preserve-root $DELETE_DIR
      fi
      EOF
          fi
        fi
      }
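
The SecondaryNameNode function uses a move-then-delete pattern: the old checkpoint directory is renamed out of the way at once, so the daemon can start immediately, and the slow recursive delete happens later. The sketch below reproduces the pattern in a throwaway directory. It swaps the `at now+1min` job from the slide for a background subshell so it runs without the `at` daemon, and it assumes GNU `mktemp` and `rm`, as the original does; `clean_dir_deferred` and all paths are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: rename a directory aside now, delete it in the background.
# clean_dir_deferred is a hypothetical stand-in for clean_checkpoint_dir;
# a background subshell replaces the "at now+1min" job from the slide.
clean_dir_deferred() {
  local target=$1 scratch=$2 delete_dir
  [ -d "$target" ] || return 0
  delete_dir=$(mktemp -d -p "$scratch" delete-XXXXXX) || return 1
  echo "moving $target to $delete_dir/"
  mv "$target" "$delete_dir/"
  ( sleep 1; rm -rf --preserve-root "$delete_dir" ) >/dev/null 2>&1 &
}

# Demo in a temporary tree rather than under /grid/0/tmp.
demo_root=$(mktemp -d)
checkpoint_dir="$demo_root/namesecondary/current"
mkdir -p "$checkpoint_dir"
touch "$checkpoint_dir/fsimage"

clean_dir_deferred "$checkpoint_dir" "$demo_root"
[ -d "$checkpoint_dir" ] || echo "checkpoint directory is gone immediately"
wait   # only for the demo: let the background delete finish
```

Because the rename stays on the same filesystem it is effectively instant regardless of how large the checkpoint is, which is presumably why the original creates its scratch directory under `/grid/0/tmp`.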
  16. HistoryServer
      case "$1" in
        start)
          su $HADOOP_USER -s /bin/sh -c "$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver"
          RET=$?
          ;;
  17. ResourceManager/NodeManager
      case "$1" in
        start)
          su $HADOOP_USER -s /bin/sh -c "$HADOOP_PREFIX/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start $PROC"
          RET=$?
          ;;
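
The ResourceManager and NodeManager slides show only the `start` branch. As a dry-run sketch of how one script can serve both daemons, the helper below builds the `yarn-daemon.sh` command line for a given action and process name and prints it instead of running it. `yarn_daemon_cmd` is hypothetical, the `HADOOP_PREFIX` value is borrowed from the path on the DataNode/NameNode slide as an assumption, and the `HADOOP_CONF_DIR` value is made up.

```shell
#!/usr/bin/env bash
# Sketch: build (but do not run) the yarn-daemon.sh command line.
HADOOP_PREFIX=/home/gs/hadoop/current   # assumption, from the hdfs path
HADOOP_CONF_DIR=/hypothetical/conf      # made-up value for illustration

# yarn_daemon_cmd is a hypothetical helper, not part of the init scripts.
yarn_daemon_cmd() {
  local action=$1 proc=$2
  case "$action" in
    start|stop)
      printf '%s\n' "$HADOOP_PREFIX/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR $action $proc"
      ;;
    *)
      echo "usage: yarn_daemon_cmd {start|stop} <process>" >&2
      return 1
      ;;
  esac
}

yarn_daemon_cmd start resourcemanager
yarn_daemon_cmd stop nodemanager
```

In the real scripts the resulting string would be passed to `su $HADOOP_USER -s /bin/sh -c`, with `$PROC` set per daemon.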
  18. Questions?
      Charles Wimmer
      @cwimmer
      charles@wimmer.net
      cwimmer@linkedin.com
