Hadoop 2.2.0 Multi-node cluster Installation on Ubuntu

Published in: Technology

  1. Tunghai University, Department of Computer Science (東海大學資工系). Hadoop 2.2.0 Multi-node Installation on Ubuntu. 康志強, G02357004, 2014/1/3
  2. Hadoop 2.2.0 (multi-node) Installation on Ubuntu
     I. Preface
     II. Installation environment
     III. Installation steps
         Step 1: Environment overview
         Step 2: Settings
         Step 3: Map the IP addresses of the three machines to their hostnames
         Step 4: Set up passwordless SSH login from cloud001 to cloud002 and cloud003
         Step 5: Install the JDK
         Step 6: Disable the firewall
         Step 7: Install Hadoop 2.2
         Step 8: Start Hadoop 2.2
     V. References
  3. I. Preface: omitted.
  4. II. Installation environment
     CPU             Intel Core i7-4470 3.40GHz
     RAM             8 GB * 2
     HD              128 SSD + 1TB HD
     Network         100M/1000M bps Ethernet
     OS              Windows7_64-bit
     VM Platform     VMware® Workstation10.0.0 build-1295980
     VM Guest OS     ubuntu-12.04.3-desktop-amd64
     VM RAM          2.0GB
     VM HD           40GB
  5. III. Installation steps
     Step 1: Environment overview
     We build a cluster made up of three machines:
     Hostname   User/Password   Cluster role                                       OS
     cloud001   hduser/adm123   Name node, Secondary Name node, Resource manager   ubuntu-12.04.3 64 bits
     cloud002   hduser/adm123   Data node, Node manager                            ubuntu-12.04.3 64 bits
     cloud003   hduser/adm123   Data node, Node manager                            ubuntu-12.04.3 64 bits
  6. Step 2: Settings
     (1) Change the hostname to cloud001:
         vim /etc/hostname
     (2) Adjust the privileges of hduser:
         vim /etc/sudoers
     (3) Upgrade the system to the latest packages:
         sudo apt-get update
  7. Step 2 (3), continued:
         sudo apt-get upgrade
     In practice, finish installing cloud001 first, clone it to create cloud002 and cloud003, and then only the hostname of each clone needs to be changed.
  8. Step 3: Map the IP addresses of the three machines to their hostnames
         hduser@cloud001:~$ vim /etc/hosts
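The entries added to /etc/hosts in step 3 can be sketched as follows. This is a minimal sketch: the function names `render_hosts` and `append_hosts` are hypothetical, and the IP addresses in the usage note are placeholders, since the slides do not state the real ones.

```shell
# Sketch: build the /etc/hosts lines for the three nodes.
# ASSUMPTION: the caller supplies the real IPs; the slides do not list them.
render_hosts() {
    # Arguments: the IPs of cloud001, cloud002 and cloud003, in that order.
    printf '%s\tcloud001\n' "$1"
    printf '%s\tcloud002\n' "$2"
    printf '%s\tcloud003\n' "$3"
}

# Append each entry to a hosts-style file unless that exact line is already there,
# so running it twice does not create duplicates.
append_hosts() {
    hosts_file=$1
    shift
    render_hosts "$@" | while IFS= read -r entry; do
        grep -qxF -e "$entry" "$hosts_file" 2>/dev/null ||
            printf '%s\n' "$entry" >> "$hosts_file"
    done
}
```

For example, `render_hosts 192.168.1.101 192.168.1.102 192.168.1.103 | sudo tee -a /etc/hosts` would append the three lines (placeholder addresses). The same mapping has to be present on all three machines.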
  9. Step 4: Set up passwordless SSH login from cloud001 to cloud002 and cloud003
     (1) Install SSH:
         sudo apt-get install ssh
     (2) Set up passwordless login to the local machine. Run the following commands from the login (home) directory.
     Create the .ssh directory and enter it:
         hduser@ubuntu:~$ mkdir .ssh
         hduser@ubuntu:~$ cd .ssh
     Generate the key pair (just press Enter at every prompt):
         hduser@ubuntu:~/.ssh$ ssh-keygen -t rsa
     Append id_rsa.pub to the authorized keys:
         hduser@ubuntu:~/.ssh$ cat id_rsa.pub >> authorized_keys
     Restart the SSH service:
         hduser@ubuntu:~/.ssh$ service ssh restart
  10. Test:
          ssh localhost
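The `cat id_rsa.pub >> authorized_keys` step from step 4 appends the key again every time it is run. A minimal sketch of an idempotent variant follows; the function name `authorize_key` is hypothetical, and the 700/600 permissions are the conventional restrictive choice rather than something the slides specify.

```shell
# Sketch: append a public key to authorized_keys only once,
# and set the conventional restrictive permissions.
authorize_key() {
    pubkey_file=$1               # e.g. "$HOME/.ssh/id_rsa.pub"
    ssh_dir=${2:-$HOME/.ssh}     # target .ssh directory
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$ssh_dir/authorized_keys" || return 1
    # Skip the append when the identical key line is already present.
    grep -qxF -e "$(cat "$pubkey_file")" "$ssh_dir/authorized_keys" ||
        cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

Usage would be `authorize_key ~/.ssh/id_rsa.pub`. Because cloud002 and cloud003 are later cloned from cloud001, they presumably inherit the same key pair and authorized_keys file, which is what lets cloud001 log in to them without a password.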
  11. Step 5: Install the JDK
      Download jdk-7u45-linux-x64.tar.gz, copy it to /usr/lib/jvm, and run chmod:
          hduser@ubuntu:/usr/lib/jvm$ chmod 755 jdk-7u45-linux-x64.tar.gz
      Install:
          hduser@ubuntu:/usr/lib/jvm$ sudo tar zxvf ./jdk-7u45-linux-x64.tar.gz -C /usr/lib/jvm
      Set the environment variables:
          hduser@ubuntu:/usr/lib/jvm$ vim ~/.bashrc
      Append at the end of the file:
          export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
          export JRE_HOME=${JAVA_HOME}/jre
          export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
          export PATH=${JAVA_HOME}/bin:$PATH
      Run the following command to make the change take effect:
          hduser@ubuntu:/usr/lib/jvm$ source ~/.bashrc
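Instead of pasting the export lines into ~/.bashrc by hand, they can be appended by a small helper that skips lines already present. This is a sketch only; `ensure_export` is a hypothetical name, not part of any tool used in the slides.

```shell
# Sketch: add an "export NAME=value" line to a profile file
# only if that exact line is missing.
ensure_export() {
    profile=$1    # e.g. "$HOME/.bashrc"
    name=$2       # variable name, e.g. JAVA_HOME
    value=$3      # value, written to the file literally (not expanded here)
    line="export $name=$value"
    grep -qxF -e "$line" "$profile" 2>/dev/null ||
        printf '%s\n' "$line" >> "$profile"
}
```

For example, `ensure_export ~/.bashrc JAVA_HOME /usr/lib/jvm/jdk1.7.0_45` followed by `ensure_export ~/.bashrc PATH '${JAVA_HOME}/bin:$PATH'`. The single quotes keep `${JAVA_HOME}` unexpanded so it is resolved when the profile is sourced.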
  12. Test:
          hduser@ubuntu:/usr/lib/jvm$ java -version
          java version "1.7.0_45"
          Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
          Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
      Step 6: Disable the firewall
          hduser@ubuntu:/usr/lib/jvm$ sudo ufw disable
          Firewall stopped and disabled on system startup
      Reboot for the change to take effect.
  13. Step 7: Install Hadoop 2.2
      (1) Download hadoop-2.2.0.tar.gz and extract it under /home/hduser:
          hduser@ubuntu:~$ chmod 755 hadoop-2.2.0.tar.gz
          hduser@ubuntu:~$ tar zxvf hadoop-2.2.0.tar.gz
      (2) Configure Hadoop. Before configuring, create the following directories on cloud001:
          /home/hduser/dfs/name
          /home/hduser/dfs/data
          /home/hduser/temp
      Edit the following configuration files:
          ~/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
          ~/hadoop-2.2.0/etc/hadoop/yarn-env.sh
          ~/hadoop-2.2.0/etc/hadoop/slaves
          ~/hadoop-2.2.0/etc/hadoop/core-site.xml
          ~/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
          ~/hadoop-2.2.0/etc/hadoop/mapred-site.xml (does not exist by default; rename mapred-site.xml.template)
          ~/hadoop-2.2.0/etc/hadoop/yarn-site.xml
      In hadoop-env.sh, set the value of JAVA_HOME:
          export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
      In yarn-env.sh, set the value of JAVA_HOME:
          export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
      In slaves (this file lists every slave node), write the following:
          cloud002
  14. slaves, continued:
          cloud003
      Edit core-site.xml:
          <configuration>
            <property>
              <name>fs.defaultFS</name>
              <value>hdfs://cloud001:9000</value>
            </property>
            <property>
              <name>io.file.buffer.size</name>
              <value>131072</value>
            </property>
            <property>
              <name>hadoop.tmp.dir</name>
              <value>file:/home/hduser/temp</value>
              <description>Abase for other temporary directories.</description>
            </property>
            <property>
              <name>hadoop.proxyuser.hduser.hosts</name>
              <value>*</value>
            </property>
            <property>
              <name>hadoop.proxyuser.hduser.groups</name>
              <value>*</value>
            </property>
          </configuration>
  15. Edit hdfs-site.xml:
          <configuration>
            <property>
              <name>dfs.namenode.secondary.http-address</name>
              <value>cloud001:9001</value>
            </property>
            <property>
              <name>dfs.namenode.name.dir</name>
              <value>file:/home/hduser/dfs/name</value>
            </property>
            <property>
              <name>dfs.datanode.data.dir</name>
              <value>file:/home/hduser/dfs/data</value>
            </property>
            <property>
              <name>dfs.replication</name>
              <value>3</value>
            </property>
            <property>
              <name>dfs.webhdfs.enabled</name>
              <value>true</value>
            </property>
          </configuration>
  16. Edit mapred-site.xml:
          <configuration>
            <property>
              <name>mapreduce.framework.name</name>
              <value>yarn</value>
            </property>
            <property>
              <name>mapreduce.jobhistory.address</name>
              <value>cloud001:10020</value>
            </property>
            <property>
              <name>mapreduce.jobhistory.webapp.address</name>
              <value>cloud001:19888</value>
            </property>
          </configuration>
      Edit yarn-site.xml:
          <configuration>
            <!-- Site specific YARN configuration properties -->
            <property>
              <name>yarn.nodemanager.aux-services</name>
              <value>mapreduce_shuffle</value>
            </property>
            <property>
              <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  17. yarn-site.xml, continued:
              <value>org.apache.hadoop.mapred.ShuffleHandler</value>
            </property>
            <property>
              <name>yarn.resourcemanager.address</name>
              <value>cloud001:8040</value>
            </property>
            <property>
              <name>yarn.resourcemanager.scheduler.address</name>
              <value>cloud001:8030</value>
            </property>
            <property>
              <name>yarn.resourcemanager.resource-tracker.address</name>
              <value>cloud001:8025</value>
            </property>
            <property>
              <name>yarn.resourcemanager.admin.address</name>
              <value>cloud001:8033</value>
            </property>
            <property>
              <name>yarn.resourcemanager.webapp.address</name>
              <value>cloud001:8088</value>
            </property>
          </configuration>
      Set the environment variables:
          hduser@cloud001:~$ vim ~/.bashrc
  18. Append at the end of the file:
          export HADOOP_HOME=/home/hduser/hadoop-2.2.0
          export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
          export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
          export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
      (3) Clone the cloud001 image to cloud002 and cloud003, then change the hostname of each clone.
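The file-system preparation in step 7 (2) (the three directories, the slaves file, and mapred-site.xml) can be scripted before the image is cloned. This is a minimal sketch under stated assumptions: `prepare_hadoop_dirs` is a hypothetical name, the base directory defaults to /home/hduser as in the slides, and the template is copied rather than renamed so the original stays available.

```shell
# Sketch of the step 7 (2) preparation.
prepare_hadoop_dirs() {
    base=${1:-/home/hduser}                 # home of the Hadoop user
    conf="$base/hadoop-2.2.0/etc/hadoop"    # unpacked configuration directory

    # Storage directories referenced by hdfs-site.xml and core-site.xml.
    mkdir -p "$base/dfs/name" "$base/dfs/data" "$base/temp" || return 1

    # The configuration directory must already exist (tarball unpacked).
    [ -d "$conf" ] || { echo "missing $conf" >&2; return 1; }

    # slaves: one worker hostname per line.
    printf '%s\n' cloud002 cloud003 > "$conf/slaves"

    # Create mapred-site.xml from the shipped template if it is absent.
    # ASSUMPTION: copying instead of renaming is this sketch's choice.
    if [ ! -f "$conf/mapred-site.xml" ] && [ -f "$conf/mapred-site.xml.template" ]; then
        cp "$conf/mapred-site.xml.template" "$conf/mapred-site.xml"
    fi
}
```

Running `prepare_hadoop_dirs` with no argument targets /home/hduser. Passing another directory is handy for a dry run in a scratch location.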
  19. Step 8: Start Hadoop 2.2
      (1) Enter the installation directory and format the namenode:
          cd ~/hadoop-2.2.0/
          ./bin/hdfs namenode -format
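Formatting is meant to be done once. Running it again on a namenode that already holds data discards the existing HDFS metadata. A minimal guard can check for the `current/VERSION` file that a formatted name directory contains. The function names are hypothetical, and the sketch only prints what it would do rather than running the format itself.

```shell
# Sketch: detect whether the NameNode storage directory was already formatted.
namenode_is_formatted() {
    # A formatted name directory contains current/VERSION.
    [ -f "${1:-/home/hduser/dfs/name}/current/VERSION" ]
}

# Print the action to take instead of executing it.
format_namenode_once() {
    name_dir=${1:-/home/hduser/dfs/name}   # matches dfs.namenode.name.dir
    if namenode_is_formatted "$name_dir"; then
        echo "skip: $name_dir is already formatted"
    else
        echo "run: ./bin/hdfs namenode -format"
    fi
}
```

The default path matches the `dfs.namenode.name.dir` value configured in hdfs-site.xml.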
  20. (2) Start HDFS:
          ./sbin/start-dfs.sh
      Processes now running on cloud001: namenode, secondarynamenode
      Processes now running on cloud002 and cloud003: datanode
      (3) Start YARN:
          ./sbin/start-yarn.sh
      Processes now running on cloud001: namenode, secondarynamenode, resourcemanager
      Processes now running on cloud002 and cloud003: datanode, nodemanager
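One way to confirm the process lists above is the JDK's `jps` tool, which prints one "<pid> <name>" pair per running Java process. The slides do not mention `jps`, so treat this as an assumption. `missing_daemons` is a hypothetical helper that reads `jps` output on standard input and prints every expected daemon it cannot find.

```shell
# Sketch: list expected daemons that are absent from jps output (read on stdin).
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        # Compare against the second column, which holds the process name.
        printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
            printf '%s\n' "$daemon"
    done
}
```

On cloud001 this would be `jps | missing_daemons NameNode SecondaryNameNode ResourceManager`, and on cloud002 and cloud003 `jps | missing_daemons DataNode NodeManager`. Empty output means every expected daemon is running.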
  22. (4) Check the cluster status:
          ./bin/hdfs dfsadmin -report
      (5) Check the files and their blocks:
          ./bin/hdfs fsck / -files -blocks
  23. (6) View HDFS
      (7) View the RM (Resource Manager)
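Steps (6) and (7) are done in a browser. A minimal sketch that prints the addresses to open, assuming the NameNode web UI is on the Hadoop 2.x default port 50070 (the hdfs-site.xml above does not override it) and the Resource Manager UI is on port 8088 as set in yarn-site.xml. `web_ui_urls` is a hypothetical name.

```shell
# Sketch: print the web UI addresses for the master node.
web_ui_urls() {
    master=${1:-cloud001}
    # ASSUMPTION: 50070 is the Hadoop 2.x default for dfs.namenode.http-address.
    printf 'HDFS NameNode: http://%s:50070/\n' "$master"
    # 8088 comes from yarn.resourcemanager.webapp.address in yarn-site.xml.
    printf 'YARN ResourceManager: http://%s:8088/\n' "$master"
}
```

Opening these from the Windows host requires that cloud001 resolves there too. Otherwise, use the IP address of the VM in place of the hostname.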
  25. V. References
      1. http://blog.csdn.net/licongcong_0224/article/details/12972889
      2. http://blog.csdn.net/focusheart/article/details/14005893 (single-node version)
      3. http://dawndiy.com/archives/155/ (installing and configuring JDK 7 on Linux)
      4. http://www.ithome.com.tw/itadm/article.php?c=73978&s=1 (introduction to Hadoop)
      5. http://www.runpc.com.tw/content/cloud_content.aspx?id=105318 (introduction to Hadoop)
