An example Apache Hadoop Yarn upgrade

This is a simple example of how Hadoop on Ubuntu Linux can be upgraded
from V1 to Yarn. It shows the steps, the configuration, a MapReduce
check and the errors encountered.


Transcript

  • 1. Apache Yarn Upgrade
    ● Example upgrade
    ● From V1 -> Yarn
    ● Environment
    ● Approach
    ● Install steps
    ● Install check
  • 2. Yarn Upgrade Environment
    ● Java OpenJDK 1.6.0_27
    ● Ubuntu 12.04
    ● Maven 3.0.4
    ● Hadoop 1.2.0
    ● Mahout 0.9
    ● Hadoop to install – 2.0.6-alpha
    Full details are available from our website under the guides folder.
  • 3. Yarn Upgrade Approach
    ● Install alongside the existing Hadoop on all nodes
    ● Use the existing HDFS
    ● Change the config files on all nodes
    ● Set up as single nodes and test via MapReduce
    ● Create the cluster and test via MapReduce
    ● Check web GUI access
    Full details are available from our website under the guides folder.
  • 4. Yarn Upgrade Install
    ● Build with Maven into a distribution directory
      mvn clean package -Pdist -Dtar -DskipTests -Pnative
      The release is created under ./hadoop-dist/target/hadoop-2.0.6-alpha
    ● Only skip tests after the first build, to speed things up
    ● Configure $HOME/.bashrc (see the sketch after this slide)
      – HADOOP_COMMON_HOME
      – HADOOP_HDFS_HOME
      – HADOOP_MAPRED_HOME
      – HADOOP_YARN_HOME
      – HADOOP_CONF_DIR
      – YARN_CONF_DIR
      – MAPRED_CONF_DIR
      – HADOOP_PREFIX
      – PATH
      – YARN_CLASSPATH
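
    A minimal sketch of the $HOME/.bashrc additions, assuming the build output
    was copied to /opt/hadoop-2.0.6-alpha; the install path and the
    YARN_CLASSPATH value are assumptions, so adjust them to your own layout:

      # Hypothetical install location – point this at your own distribution directory
      export HADOOP_PREFIX=/opt/hadoop-2.0.6-alpha
      export HADOOP_COMMON_HOME=$HADOOP_PREFIX
      export HADOOP_HDFS_HOME=$HADOOP_PREFIX
      export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
      export HADOOP_YARN_HOME=$HADOOP_PREFIX
      export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
      export YARN_CONF_DIR=$HADOOP_CONF_DIR
      export MAPRED_CONF_DIR=$HADOOP_CONF_DIR
      # Add the new bin and sbin directories to the PATH
      export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
      # Assumed value – set to whatever extra entries you need on the YARN classpath
      export YARN_CLASSPATH=$HADOOP_CONF_DIR
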
  • 5. Yarn Upgrade Install
    ● Set up core-site.xml
      cd $HADOOP_COMMON_HOME/etc/hadoop
    ● Alter values for (see the sketch after this slide)
      – fs.default.name
      – hadoop.tmp.dir
      – fs.checkpoint.dir
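
    A sketch of the three core-site.xml properties for a single-node setup;
    the HDFS URI, port, and directories below are placeholder assumptions:

      <configuration>
        <!-- Name node URI – replace localhost with the name node host in a cluster -->
        <property>
          <name>fs.default.name</name>
          <value>hdfs://localhost:9000</value>
        </property>
        <!-- Base temporary directory for Hadoop – pick a path with enough space -->
        <property>
          <name>hadoop.tmp.dir</name>
          <value>/var/hadoop/tmp</value>
        </property>
        <!-- Where the secondary name node stores its checkpoints -->
        <property>
          <name>fs.checkpoint.dir</name>
          <value>/var/hadoop/checkpoint</value>
        </property>
      </configuration>
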
  • 6. Yarn Upgrade Install
    ● Set up hdfs-site.xml
      cd $HADOOP_HDFS_HOME/etc/hadoop
    ● Alter values for (see the sketch after this slide)
      – dfs.name.dir
      – dfs.data.dir
      – dfs.http.address
      – dfs.secondary.http.address
      – dfs.https.address
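
    A sketch of the hdfs-site.xml values; the directories are placeholder
    assumptions and the addresses use the stock default ports:

      <configuration>
        <!-- Name node metadata directory – keep this on reliable storage -->
        <property>
          <name>dfs.name.dir</name>
          <value>/var/hadoop/dfs/name</value>
        </property>
        <!-- Data node block storage directory -->
        <property>
          <name>dfs.data.dir</name>
          <value>/var/hadoop/dfs/data</value>
        </property>
        <!-- Web UI and HTTPS addresses, using the default ports -->
        <property>
          <name>dfs.http.address</name>
          <value>0.0.0.0:50070</value>
        </property>
        <property>
          <name>dfs.secondary.http.address</name>
          <value>0.0.0.0:50090</value>
        </property>
        <property>
          <name>dfs.https.address</name>
          <value>0.0.0.0:50470</value>
        </property>
      </configuration>
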
  • 7. Yarn Upgrade Install
    ● Set up yarn-site.xml
      cd $YARN_CONF_DIR
    ● Alter values for (see the sketch after this slide)
      – yarn.resourcemanager.resource-tracker.address
      – yarn.resourcemanager.scheduler.address
      – yarn.resourcemanager.scheduler.class
      – yarn.resourcemanager.address
      – yarn.nodemanager.local-dirs
      – yarn.nodemanager.address
      – yarn.nodemanager.resource.memory-mb
      – yarn.nodemanager.remote-app-log-dir
      – yarn.nodemanager.log-dirs
      – yarn.nodemanager.aux-services
      – yarn.web-proxy.address
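
    A partial sketch of yarn-site.xml covering a few of the properties above;
    the hosts, ports, memory figure, and paths are assumptions to adapt per node:

      <configuration>
        <!-- Resource Manager endpoints – replace localhost with the RM host in a cluster -->
        <property>
          <name>yarn.resourcemanager.address</name>
          <value>localhost:8032</value>
        </property>
        <property>
          <name>yarn.resourcemanager.scheduler.address</name>
          <value>localhost:8030</value>
        </property>
        <property>
          <name>yarn.resourcemanager.resource-tracker.address</name>
          <value>localhost:8031</value>
        </property>
        <!-- Node Manager working storage and container memory (MB) -->
        <property>
          <name>yarn.nodemanager.local-dirs</name>
          <value>/var/hadoop/yarn/local</value>
        </property>
        <property>
          <name>yarn.nodemanager.resource.memory-mb</name>
          <value>2048</value>
        </property>
        <!-- Shuffle aux service needed by MapReduce (named mapreduce_shuffle on later releases) -->
        <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce.shuffle</value>
        </property>
      </configuration>
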
  • 8. Yarn Upgrade Install
    ● Set up mapred-site.xml
      cd $MAPRED_CONF_DIR
    ● Alter values for (see the sketch after this slide)
      – mapreduce.cluster.temp.dir
      – mapreduce.cluster.local.dir
      – mapreduce.jobhistory.address
      – mapreduce.jobhistory.webapp.address
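
    A sketch of mapred-site.xml; the directories are placeholder assumptions
    and the job history addresses use the stock default ports:

      <configuration>
        <!-- Temporary and local working directories for MapReduce -->
        <property>
          <name>mapreduce.cluster.temp.dir</name>
          <value>/var/hadoop/mapred/temp</value>
        </property>
        <property>
          <name>mapreduce.cluster.local.dir</name>
          <value>/var/hadoop/mapred/local</value>
        </property>
        <!-- Job history server RPC and web UI addresses, default ports -->
        <property>
          <name>mapreduce.jobhistory.address</name>
          <value>localhost:10020</value>
        </property>
        <property>
          <name>mapreduce.jobhistory.webapp.address</name>
          <value>localhost:19888</value>
        </property>
      </configuration>
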
  • 9. Yarn Upgrade Install
    ● Set up capacity-scheduler.xml
      cd $HADOOP_YARN_HOME/etc/hadoop
    ● Alter values for (see the sketch after this slide)
      – yarn.scheduler.capacity.maximum-applications
      – yarn.scheduler.capacity.maximum-am-resource-percent
      – yarn.scheduler.capacity.resource-calculator
      – yarn.scheduler.capacity.root.queues
      – yarn.scheduler.capacity.child.queues
      – yarn.scheduler.capacity.child.unfunded.capacity
      – yarn.scheduler.capacity.child.default.capacity
      – yarn.scheduler.capacity.root.capacity
      – yarn.scheduler.capacity.root.unfunded.capacity
      – yarn.scheduler.capacity.root.default.capacity
      – yarn.scheduler.capacity.root.default.user-limit-factor
      – yarn.scheduler.capacity.root.default.maximum-capacity
      – yarn.scheduler.capacity.root.default.state
      – yarn.scheduler.capacity.root.default.acl_submit_applications
      – yarn.scheduler.capacity.root.default.acl_administer_queue
      – yarn.scheduler.capacity.node-locality-delay
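
    A partial sketch of capacity-scheduler.xml using only the stock single
    default queue; the values below follow the shipped defaults, so treat them
    as a starting point and tune them if you define your own queues:

      <configuration>
        <!-- Global limits on running/pending applications and AM share -->
        <property>
          <name>yarn.scheduler.capacity.maximum-applications</name>
          <value>10000</value>
        </property>
        <property>
          <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
          <value>0.1</value>
        </property>
        <!-- A single queue named "default" taking 100% of the root capacity -->
        <property>
          <name>yarn.scheduler.capacity.root.queues</name>
          <value>default</value>
        </property>
        <property>
          <name>yarn.scheduler.capacity.root.default.capacity</name>
          <value>100</value>
        </property>
        <property>
          <name>yarn.scheduler.capacity.root.default.state</name>
          <value>RUNNING</value>
        </property>
      </configuration>
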
  • 10. Yarn Upgrade Install
    ● Start the Resource Manager
      cd $HADOOP_YARN_HOME
      sbin/yarn-daemon.sh start resourcemanager
    ● Start the Node Manager
      cd $HADOOP_YARN_HOME
      sbin/yarn-daemon.sh start nodemanager
    ● Test via a MapReduce job
      cd $HADOOP_MAPRED_HOME/share/hadoop/mapreduce
      $HADOOP_COMMON_HOME/bin/hadoop jar hadoop-mapreduce-examples-2.0.6-alpha.jar randomwriter out
  • 11. Yarn Upgrade Install
    ● The MapReduce job should end with
      BYTES_WRITTEN=1073750341
      RECORDS_WRITTEN=102099
      File Input Format Counters
        Bytes Read=0
      File Output Format Counters
        Bytes Written=1085699265
      Job ended: Sun Aug 25 12:45:35 NZST 2013
      The job took 89 seconds.
    ● Run this test on each node being upgraded
  • 12. Yarn Upgrade Install
    ● Stop the servers
      cd $HADOOP_YARN_HOME
      sbin/yarn-daemon.sh stop resourcemanager
        stopping resourcemanager
      sbin/yarn-daemon.sh stop nodemanager
        stopping nodemanager
    ● Alter the Hadoop environment
      cd $HADOOP_CONF_DIR
      vi hadoop-env.sh
      Add a JAVA_HOME definition at the end, i.e.
      export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386
  • 13. Yarn Upgrade Install
    ● Alter the $HADOOP_CONF_DIR/slaves file
      – Add details (one per line) for the slave nodes (see the sketch after this slide)
    ● Format the cluster
      – Don't have the cluster running, or you will lose data
      – hdfs namenode -format
    ● Now proceed to start the cluster
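
    A minimal sketch of the slaves file; the hostnames below are hypothetical,
    so list your own slave nodes, one per line:

      # $HADOOP_CONF_DIR/slaves – one slave node hostname per line
      slave-node-1
      slave-node-2
      slave-node-3
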
  • 14. Yarn Upgrade Install
    cd $HADOOP_COMMON_HOME
    sbin/hadoop-daemon.sh --config $HADOOP_COMMON_HOME/etc/hadoop --script hdfs start namenode
    cd $HADOOP_COMMON_HOME
    sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
    cd $HADOOP_YARN_HOME
    sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
    cd $HADOOP_YARN_HOME
    sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
    cd $HADOOP_YARN_HOME
    bin/yarn start proxyserver --config $HADOOP_CONF_DIR
    cd $HADOOP_MAPRED_HOME
    sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
  • 15. Yarn Upgrade Install
    ● Use jps to check that the servers are running
      jps
      5856 DataNode
      6434 Jps
      5776 NameNode
      6181 NodeManager
      6255 WebAppProxyServer
      5927 ResourceManager
      6352 JobHistoryServer
    ● Then run the same MapReduce job on the cluster
  • 16. Web Access
  • 17. Web Access
  • 18. Web Access
  • 19. Contact Us
    ● Feel free to contact us at
      – www.semtech-solutions.co.nz
      – info@semtech-solutions.co.nz
    ● We offer IT project consultancy
    ● We are happy to hear about your problems
    ● You pay only for the hours that you need to solve your problems