1. Hadoop Cluster Installation (Intern Report)

2. Main reference page:
   http://bigdatahandler.com/hadoop-hdfs/installing-single-node-hadoop-2-2-0-on-ubuntu/

3. Software Versions
   - Ubuntu Linux 12.04.4 LTS
   - Hadoop 2.2.0

4. If you are using PuTTY to access your Linux box remotely, install openssh-server with the command below; it also makes the SSH configuration later in the installation easier:
   sudo apt-get install openssh-server

5. Prerequisites:
   - Installing Java v1.7
   - Adding a dedicated Hadoop system user
   - Configuring SSH access
6. 1. Installing Java v1.7:
   sudo add-apt-repository ppa:webupd8team/java
   sudo apt-get update
   sudo apt-get install oracle-java7-installer
   export JAVA_HOME=/usr/lib/jvm/java-7-oracle
7. 2. Adding a dedicated Hadoop system user.
   a. Adding the group:
      sudo addgroup hadoop
   b. Creating a user and adding the user to the group:
      sudo adduser --ingroup hadoop hduser
8. 3. Configuring SSH access:
   su - hduser
   ssh-keygen -t rsa -P ""
   cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
   ssh hduser@localhost
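The three key-setup commands above can be run as one script. A hedged sketch: it points at a scratch directory so it can be rehearsed without touching the real ~/.ssh; on the cluster, set SSH_DIR=$HOME/.ssh and run it as hduser.

```shell
set -e
SSH_DIR="${SSH_DIR:-$(mktemp -d)/.ssh}"   # scratch dir for a dry run
mkdir -p "$SSH_DIR" && chmod 700 "$SSH_DIR"
ssh-keygen -q -t rsa -P "" -f "$SSH_DIR/id_rsa"    # empty passphrase, as on the slide
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"               # sshd rejects group/world-writable key files
```

The chmod steps matter: sshd silently ignores authorized_keys with loose permissions, which is a common reason the later `ssh hduser@localhost` still prompts for a password.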
9. Hadoop Installation
10. i. Run the following command to download Hadoop version 2.2.0:
       wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
    ii. Unpack the compressed Hadoop file with this command:
       tar -xvzf hadoop-2.2.0.tar.gz
    iii. Rename the hadoop-2.2.0 directory to hadoop with this command:
       mv hadoop-2.2.0 hadoop
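Steps i to iii can be combined into one small script. Below it is rehearsed against a stub tarball so the tar/mv logic can be dry-run anywhere; on the cluster, replace the stub section with the real wget from step i.

```shell
set -e
cd "$(mktemp -d)"
# Stub standing in for the real download (hadoop-2.2.0.tar.gz from the mirror):
mkdir hadoop-2.2.0 && echo stub > hadoop-2.2.0/README.txt
tar -czf hadoop-2.2.0.tar.gz hadoop-2.2.0 && rm -r hadoop-2.2.0
# Steps ii and iii from the slide:
tar -xzf hadoop-2.2.0.tar.gz    # unpack into ./hadoop-2.2.0
mv hadoop-2.2.0 hadoop          # version-independent directory name
ls hadoop                       # prints: README.txt
```

The rename in step iii is what lets every later path in this deck say /usr/local/hadoop instead of hard-coding the 2.2.0 version.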
11. iv. Move the hadoop package to its install location:
       sudo mv hadoop /usr/local/
    v. Make sure to change the owner of all the files to the hduser user and hadoop group with this command:
       cd /usr/local/
       sudo chown -R hduser:hadoop hadoop
12. Configuring Hadoop
13. The following files are required to configure the single-node Hadoop cluster:
    a. yarn-site.xml
    b. core-site.xml
    c. mapred-site.xml
    d. hdfs-site.xml
    e. Update $HOME/.bashrc
    These files live in the Hadoop configuration directory:
       cd /usr/local/hadoop/etc/hadoop
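The edits on the next slides can also be scripted instead of made in an editor. A hedged sketch: in a stock Hadoop 2.2.0 tarball only mapred-site.xml.template ships in the conf directory, so mapred-site.xml can be written fresh via a heredoc. The scratch CONF_DIR is for a dry run; point it at /usr/local/hadoop/etc/hadoop to apply it for real.

```shell
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"   # scratch dir; use the real conf dir on the cluster
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
grep '<value>' "$CONF_DIR/mapred-site.xml"   # prints the <value>yarn</value> line
```

The same heredoc pattern works for the other three XML files, which keeps the whole installation reproducible as a single script.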
14. a. yarn-site.xml:
    <configuration>
      <!-- Site specific YARN configuration properties -->
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
    </configuration>
15. b. core-site.xml:
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>
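A side note, hedged against the Hadoop 2.x documentation: fs.default.name still works here but is the deprecated spelling; the current property name is fs.defaultFS with the same value.

```xml
<configuration>
  <property>
    <!-- fs.defaultFS supersedes the deprecated fs.default.name in Hadoop 2.x -->
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```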
16. c. mapred-site.xml:
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>
17. sudo mkdir -p $HADOOP_HOME/yarn_data/hdfs/namenode
    sudo mkdir -p $HADOOP_HOME/yarn_data/hdfs/datanode
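The two mkdir commands above can also be issued as one, sketched here against a scratch HADOOP_HOME so it can be dry-run anywhere. On the cluster HADOOP_HOME is /usr/local/hadoop, sudo is needed, and the new tree should be chowned to hduser:hadoop as on the earlier slide.

```shell
HADOOP_HOME="${HADOOP_HOME:-$(mktemp -d)}"   # scratch dir for a dry run
mkdir -p "$HADOOP_HOME/yarn_data/hdfs/namenode" \
         "$HADOOP_HOME/yarn_data/hdfs/datanode"
find "$HADOOP_HOME/yarn_data" -type d        # lists yarn_data, hdfs, namenode, datanode
```

These are the directories hdfs-site.xml points at on the next slide; if they are missing or owned by root, the NameNode and DataNode fail at startup.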
18. d. hdfs-site.xml:
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/yarn_data/hdfs/namenode</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/yarn_data/hdfs/datanode</value>
      </property>
    </configuration>
19. e. Update $HOME/.bashrc
    i. Go back to the root and edit the .bashrc file:
       vi .bashrc
20. e. Update $HOME/.bashrc
    # Set Hadoop-related environment variables
    export HADOOP_PREFIX=/usr/local/hadoop
    export HADOOP_HOME=/usr/local/hadoop
    export HADOOP_MAPRED_HOME=${HADOOP_HOME}
    export HADOOP_COMMON_HOME=${HADOOP_HOME}
    export HADOOP_HDFS_HOME=${HADOOP_HOME}
    export YARN_HOME=${HADOOP_HOME}
    export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
    # Native path
    export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
    # Java path
    export JAVA_HOME='/usr/lib/jvm/java-7-oracle'
    # Add Hadoop bin/ and sbin/ directories to PATH
    export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/sbin
21. Formatting and Starting/Stopping the HDFS filesystem via the NameNode
22. i. The first step to starting up your Hadoop installation is formatting the Hadoop filesystem, which is implemented on top of the local filesystem of your cluster. You need to do this the first time you set up a Hadoop cluster. Do not format a running Hadoop filesystem, as you will lose all the data currently in the cluster (in HDFS).
       hadoop namenode -format
23. ii. Start the Hadoop daemons by running the following commands:
    Name node:
       hadoop-daemon.sh start namenode
    Data node:
       hadoop-daemon.sh start datanode
24. Resource Manager:
       yarn-daemon.sh start resourcemanager
    Node Manager:
       yarn-daemon.sh start nodemanager
    Job History Server:
       mr-jobhistory-daemon.sh start historyserver
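A quick, hedged way to confirm the five daemons came up is jps, which ships with the JDK installed earlier (run it as the user that started the daemons):

```shell
# jps lists running JVMs; after the start commands above it should show
# NameNode, DataNode, ResourceManager, NodeManager and JobHistoryServer.
command -v jps >/dev/null 2>&1 && jps || echo "jps not on PATH (check JAVA_HOME/bin)"
```

If a daemon is missing from the list, its startup log under $HADOOP_HOME/logs usually names the misconfigured property.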
25. Stop Hadoop by running the following commands:
    stop-dfs.sh
    stop-yarn.sh
26. Start and stop all Hadoop daemons at once (these wrapper scripts are deprecated in Hadoop 2.x but still work):
    start-all.sh
    stop-all.sh
27. Thanks for listening