An example Apache Hadoop Yarn upgrade
This is a simple example of how Hadoop on Ubuntu Linux can be upgraded from V1 to Yarn. It shows the steps, the configuration, a MapReduce check and the errors encountered.

Presentation Transcript

  • Apache Yarn Upgrade
      ● Example upgrade
      ● From V1 -> Yarn
      ● Environment
      ● Approach
      ● Install steps
      ● Install check
  • Yarn Upgrade Environment
      ● Java OpenJDK 1.6.0_27
      ● Ubuntu 12.04
      ● Maven 3.0.4
      ● Hadoop 1.2.0
      ● Mahout 0.9
      ● Hadoop to install – 2.0.6-alpha
    Full details are available from our website under the guides folder.
  • Yarn Upgrade Approach
      ● Install alongside the existing Hadoop on all nodes
      ● Use the existing HDFS
      ● Change cfg files on all nodes
      ● Set up as single nodes and test via MapReduce
      ● Create the cluster and test via MapReduce
      ● Check web GUI access
    Full details are available from our website under the guides folder.
  • Yarn Upgrade Install
      ● Build with Maven into a distribution directory
          mvn clean package -Pdist -Dtar -DskipTests -Pnative
        The release is created under ./hadoop-dist/target/hadoop-2.0.6-alpha
      ● Only skip tests after the first build, to speed things up
      ● Configure $HOME/.bashrc (a sketch follows below)
          – HADOOP_COMMON_HOME
          – HADOOP_HDFS_HOME
          – HADOOP_MAPRED_HOME
          – HADOOP_YARN_HOME
          – HADOOP_CONF_DIR
          – YARN_CONF_DIR
          – MAPRED_CONF_DIR
          – HADOOP_PREFIX
          – PATH
          – YARN_CLASSPATH
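As a rough sketch, the $HOME/.bashrc additions might look like the following; the install path and the YARN_CLASSPATH value are assumptions rather than values from the original guide, and should be adjusted to wherever the built release was unpacked.

    # $HOME/.bashrc additions (sketch) - the install path below is a placeholder
    export HADOOP_PREFIX=/usr/local/hadoop-2.0.6-alpha
    export HADOOP_COMMON_HOME=$HADOOP_PREFIX
    export HADOOP_HDFS_HOME=$HADOOP_PREFIX
    export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
    export HADOOP_YARN_HOME=$HADOOP_PREFIX
    export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
    export YARN_CONF_DIR=$HADOOP_CONF_DIR
    export MAPRED_CONF_DIR=$HADOOP_CONF_DIR
    export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
    # YARN_CLASSPATH is listed on the slide; the jar location used here is an assumption
    export YARN_CLASSPATH=$HADOOP_YARN_HOME/share/hadoop/yarn/*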
  • Yarn Upgrade Install
      ● Set up core-site.xml
          cd $HADOOP_COMMON_HOME/etc/hadoop
      ● Alter values for (sketched below)
          – fs.default.name
          – hadoop.tmp.dir
          – fs.checkpoint.dir
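A minimal core-site.xml along those lines might look like this; the localhost URI and the /app/hadoop paths are placeholder assumptions for the single-node test, not values from the original guide.

    <!-- core-site.xml (sketch) - host, port and paths are placeholders -->
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/app/hadoop/tmp</value>
      </property>
      <property>
        <name>fs.checkpoint.dir</name>
        <value>/app/hadoop/checkpoint</value>
      </property>
    </configuration>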
  • Yarn Upgrade Install
      ● Set up hdfs-site.xml
          cd $HADOOP_HDFS_HOME/etc/hadoop
      ● Alter values for (sketched below)
          – dfs.name.dir
          – dfs.data.dir
          – dfs.http.address
          – dfs.secondary.http.address
          – dfs.https.address
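A hypothetical hdfs-site.xml follows; the directories are placeholders (they should point at the existing V1 name and data directories so the existing HDFS is reused, per the approach slide), and the addresses shown are the usual Hadoop defaults.

    <!-- hdfs-site.xml (sketch) - paths are placeholders; reuse the existing V1 dirs -->
    <configuration>
      <property>
        <name>dfs.name.dir</name>
        <value>/app/hadoop/dfs/name</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/app/hadoop/dfs/data</value>
      </property>
      <property>
        <name>dfs.http.address</name>
        <value>0.0.0.0:50070</value>
      </property>
      <property>
        <name>dfs.secondary.http.address</name>
        <value>0.0.0.0:50090</value>
      </property>
      <property>
        <name>dfs.https.address</name>
        <value>0.0.0.0:50470</value>
      </property>
    </configuration>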
  • Yarn Upgrade Install
      ● Set up yarn-site.xml
          cd $YARN_CONF_DIR
      ● Alter values for (a partial sketch follows below)
          – yarn.resourcemanager.resource-tracker.address
          – yarn.resourcemanager.scheduler.address
          – yarn.resourcemanager.scheduler.class
          – yarn.resourcemanager.address
          – yarn.nodemanager.local-dirs
          – yarn.nodemanager.address
          – yarn.nodemanager.resource.memory-mb
          – yarn.nodemanager.remote-app-log-dir
          – yarn.nodemanager.log-dirs
          – yarn.nodemanager.aux-services
          – yarn.web-proxy.address
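A partial yarn-site.xml sketch covering a subset of those properties might look like the following; the hostnames, directories and memory figure are assumptions, and the ports are the standard defaults. Note that on Hadoop 2.0.x the shuffle aux-service is named mapreduce.shuffle (it was renamed mapreduce_shuffle in later releases).

    <!-- yarn-site.xml (partial sketch) - host, dirs and memory are placeholders -->
    <configuration>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>localhost:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>localhost:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>localhost:8031</value>
      </property>
      <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/app/hadoop/yarn/local</value>
      </property>
      <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/app/hadoop/yarn/logs</value>
      </property>
      <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>2048</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce.shuffle</value>
      </property>
    </configuration>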
  • Yarn Upgrade Install
      ● Set up mapred-site.xml
          cd $MAPRED_CONF_DIR
      ● Alter values for (sketched below)
          – mapreduce.cluster.temp.dir
          – mapreduce.cluster.local.dir
          – mapreduce.jobhistory.address
          – mapreduce.jobhistory.webapp.address
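A hypothetical mapred-site.xml could look like this; the directories are placeholders and the job history ports are the usual defaults. The mapreduce.framework.name property is not in the slide's list, but is commonly set so that MapReduce jobs actually run on YARN.

    <!-- mapred-site.xml (sketch) - directories are placeholders -->
    <configuration>
      <property>
        <name>mapreduce.cluster.temp.dir</name>
        <value>/app/hadoop/mapred/temp</value>
      </property>
      <property>
        <name>mapreduce.cluster.local.dir</name>
        <value>/app/hadoop/mapred/local</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>localhost:10020</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>localhost:19888</value>
      </property>
      <!-- not listed on the slide, but commonly needed so MapReduce runs on YARN -->
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>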
  • Yarn Upgrade Install
      ● Set up capacity-scheduler.xml
          cd $HADOOP_YARN_HOME/etc/hadoop
      ● Alter values for (a partial sketch follows below)
          – yarn.scheduler.capacity.maximum-applications
          – yarn.scheduler.capacity.maximum-am-resource-percent
          – yarn.scheduler.capacity.resource-calculator
          – yarn.scheduler.capacity.root.queues
          – yarn.scheduler.capacity.child.queues
          – yarn.scheduler.capacity.child.unfunded.capacity
          – yarn.scheduler.capacity.child.default.capacity
          – yarn.scheduler.capacity.root.capacity
          – yarn.scheduler.capacity.root.unfunded.capacity
          – yarn.scheduler.capacity.root.default.capacity
          – yarn.scheduler.capacity.root.default.user-limit-factor
          – yarn.scheduler.capacity.root.default.maximum-capacity
          – yarn.scheduler.capacity.root.default.state
          – yarn.scheduler.capacity.root.default.acl_submit_applications
          – yarn.scheduler.capacity.root.default.acl_administer_queue
          – yarn.scheduler.capacity.node-locality-delay
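A partial capacity-scheduler.xml sketch covering a few of those properties might look like this; the values shown are the stock defaults for a single "default" queue, not values taken from the original guide.

    <!-- capacity-scheduler.xml (partial sketch) - stock default values -->
    <configuration>
      <property>
        <name>yarn.scheduler.capacity.maximum-applications</name>
        <value>10000</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
        <value>0.1</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.queues</name>
        <value>default</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.capacity</name>
        <value>100</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.state</name>
        <value>RUNNING</value>
      </property>
    </configuration>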
  • Yarn Upgrade Install
      ● Start the Resource Manager
          cd $HADOOP_YARN_HOME
          sbin/yarn-daemon.sh start resourcemanager
      ● Start the Node Manager
          cd $HADOOP_YARN_HOME
          sbin/yarn-daemon.sh start nodemanager
      ● Test via a MapReduce job
          cd $HADOOP_MAPRED_HOME/share/hadoop/mapreduce
          $HADOOP_COMMON_HOME/bin/hadoop jar hadoop-mapreduce-examples-2.0.6-alpha.jar randomwriter out
  • Yarn Upgrade Install
      ● The MapReduce job should end with
          BYTES_WRITTEN=1073750341
          RECORDS_WRITTEN=102099
          File Input Format Counters
            Bytes Read=0
          File Output Format Counters
            Bytes Written=1085699265
          Job ended: Sun Aug 25 12:45:35 NZST 2013
          The job took 89 seconds.
      ● Run this test on each node being upgraded (a quick HDFS check follows below)
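As an extra, hedged check that is not part of the original slides, the randomwriter output can be listed in HDFS; "out" is the output directory passed to the job above.

    # List the randomwriter output directory in HDFS
    $HADOOP_COMMON_HOME/bin/hadoop fs -ls out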
  • Yarn Upgrade Install
      ● Stop the servers
          cd $HADOOP_YARN_HOME
          sbin/yarn-daemon.sh stop resourcemanager
          stopping resourcemanager
          sbin/yarn-daemon.sh stop nodemanager
          stopping nodemanager
      ● Alter the Hadoop env
          cd $HADOOP_CONF_DIR
          vi hadoop-env.sh
        Add a JAVA_HOME definition at the end, e.g.
          export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386
  • Yarn Upgrade Install
      ● Alter the $HADOOP_CONF_DIR/slaves file
          – Add details (one per line) for the slave nodes (see the sketch below)
      ● Format the cluster
          – DON'T have the cluster running, or you will lose data
          – hdfs namenode -format
      ● Now proceed to start the cluster
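A sketch of the slaves file; the hostnames below are placeholders for the slave nodes in the local cluster.

    # $HADOOP_CONF_DIR/slaves - one slave hostname per line (placeholder names)
    slave1
    slave2
    slave3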
  • Yarn Upgrade Install
          cd $HADOOP_COMMON_HOME
          sbin/hadoop-daemon.sh --config $HADOOP_COMMON_HOME/etc/hadoop --script hdfs start namenode
          cd $HADOOP_COMMON_HOME
          sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
          cd $HADOOP_YARN_HOME
          sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
          cd $HADOOP_YARN_HOME
          sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
          cd $HADOOP_YARN_HOME
          bin/yarn start proxyserver --config $HADOOP_CONF_DIR
          cd $HADOOP_MAPRED_HOME
          sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
  • Yarn Upgrade Install
      ● Use jps to check that the servers are running
          jps
          5856 DataNode
          6434 Jps
          5776 NameNode
          6181 NodeManager
          6255 WebAppProxyServer
          5927 ResourceManager
          6352 JobHistoryServer
      ● Then run the same MapReduce job on the cluster
  • Web Access (screenshots of the Hadoop and Yarn web GUIs)
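The screenshots are not reproduced here; as a hedged pointer, the web GUIs on Hadoop 2.0.x normally listen on the default ports below (the hostname is a placeholder to be adjusted to the cluster).

    # Default web UI addresses for Hadoop 2.0.x (hostname is a placeholder)
    http://localhost:8088/     # Yarn ResourceManager
    http://localhost:50070/    # HDFS NameNode
    http://localhost:19888/    # MapReduce Job History server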
  • Contact Us
      ● Feel free to contact us at
          – www.semtech-solutions.co.nz
          – info@semtech-solutions.co.nz
      ● We offer IT project consultancy
      ● We are happy to hear about your problems
      ● You can pay for just the hours that you need to solve your problems