Single-node Cluster
Hadoop Installation
Profile
 Ankit Desai
 Ph.D. Scholar, IET, Ahmedabad University
 Education: M. Tech. (C.E.), B. E. (I. T.)
 Experience: 8 years (Academic and Research)
 Research Interest: IoT, Big Data Analytics,
Machine Learning, Data Mining, Algorithms.
Install Ubuntu
 Ubuntu 14.04.2 LTS
 Download Source
 http://www.ubuntu.com/download/desktop
 64-bit OS vs. 32-bit OS
 ubuntu-14.04.2-desktop-amd64.iso file (64-bit)
 or
 ubuntu-14.10-desktop-i386.iso file (32-bit)
Download Java
 Java 7
 http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
 Java 6
 http://www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-javase6-419409.html
 x86 or x64, as per your computer's configuration
 Download:
 For Java 7: jdk-7u75-linux-i586.tar.gz (32-bit) or jdk-7u75-linux-x64.tar.gz (64-bit)
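To check which package your machine needs, one quick probe (standard coreutils, not specific to this tutorial) is:
$ uname -m (x86_64 means 64-bit; i686/i586 means 32-bit)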
Cont…
 Extract the jdk file
 Open the terminal
 $ cd /usr/lib (we will make a new dir named java here)
 $ sudo mkdir java
 Move the extracted jdk folder from Downloads to /usr/lib/java
 $ sudo mv jdk1.7.0_67/ /usr/lib/java
Cont…
 Go to /usr/lib/java/jdk1.7.0_67/bin
 $ sudo update-alternatives --install
"/usr/bin/java" "java"
"/usr/lib/java/jdk1.7.0_67/bin/java" 1
 $ sudo update-alternatives --install
"/usr/bin/javac" "javac"
"/usr/lib/java/jdk1.7.0_67/bin/javac" 1
 $ sudo update-alternatives --install
"/usr/bin/javaws" "javaws"
"/usr/lib/java/jdk1.7.0_67/bin/javaws" 1
Cont…
 Check the java version
 $ java -version
 Set the env. variable JAVA_HOME in the .bashrc file
 $ gedit ~/.bashrc
 In .bashrc add:
 export JAVA_HOME="/usr/lib/java/jdk1.7.0_67"
 PATH="$PATH:$JAVA_HOME/bin"
 export PATH
 Save & exit
Create hduser
 Create the user group
 $ sudo addgroup hadoop
 Create the user hduser
 $ sudo adduser --ingroup hadoop hduser
 Log in as hduser
 user@ubuntu:~$ su - hduser
Working with SSH
 hduser@ubuntu:~$ which ssh (should give you the path of ssh); if not, type $ sudo apt-get install ssh
 hduser@ubuntu:~$ which sshd (should give you the path of sshd); if not, type user@ubuntu:~$ sudo apt-get install openssh-server
 Generate a public and private key pair:
 hduser@ubuntu:~$ ssh-keygen -t rsa -P ""
 hduser@ubuntu:~$ cat /home/hduser/.ssh/id_rsa.pub >> /home/hduser/.ssh/authorized_keys
Continue…
 Test the key by connecting to localhost; this also adds localhost to the list of known hosts.
 hduser@ubuntu:~$ ssh localhost
The authenticity of host 'localhost (::1)' can't be
established. RSA key fingerprint is
d7:87:25:47:ae:02:00:eb:1d:75:4f:bb:44:f9:36:26.
Are you sure you want to continue connecting
(yes/no)? yes Warning: Permanently added
'localhost' (RSA) to the list of known hosts. Linux
ubuntu 2.6.32-22-generic #33-Ubuntu SMP Wed
Apr 28 13:27:30 UTC 2010 i686 GNU/Linux Ubuntu
10.04 LTS [...snipp...]
Disable IPv6
 open /etc/sysctl.conf file with
 hduser@ubuntu:~$ sudo gedit /etc/sysctl.conf
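 Add the following lines to the end of the file (these are the standard Linux sysctl keys for disabling IPv6, as used in the Michael Noll tutorial cited in the References):
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1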
 You have to reboot your machine in order to make the
changes take effect.
 You can check whether IPv6 is enabled on your
machine with the following command:
 $ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
Install Hadoop
 Download Hadoop 1.0.3 from
 https://archive.apache.org/dist/hadoop/core/hadoop-1.0.3/
 hadoop-1.0.3.tar.gz 2012-05-08 20:35 60M
 Install Hadoop under /usr/local/hadoop (assuming the downloaded tarball has been moved to /usr/local):
$ cd /usr/local
$ sudo tar xzf hadoop-1.0.3.tar.gz
$ sudo mv hadoop-1.0.3 hadoop
$ sudo chown -R hduser:hadoop hadoop
Cont…
 Edit conf/hadoop-env.sh
 Add the lines:
 export JAVA_HOME="/usr/lib/java/jdk1.7.0_67"
 export HADOOP_HOME_WARN_SUPPRESS="TRUE"
 Edit ~/.bashrc
 Add following
export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME="/usr/lib/java/jdk1.7.0_67"
PATH="$PATH:$JAVA_HOME/bin"
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"
lzohead () {
hadoop fs -cat $1 | lzop -dc | head -1000 | less
}
export PATH=$PATH:$HADOOP_HOME/bin
export PATH
Cont…
 conf/*-site.xml
 Create dir /app/hadoop/tmp
 $ sudo mkdir -p /app/hadoop/tmp
 $ sudo chown hduser:hadoop
/app/hadoop/tmp
 # ...and if you want to tighten up
security, chmod from 755 to 750...
 $ sudo chmod 750 /app/hadoop/tmp
Conf files
Add the following snippets between the <configuration>
... </configuration> tags in the respective configuration
XML file.
conf/core-site.xml
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.
</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A
URI whose scheme and authority determine the FileSystem
implementation. The uri's scheme determines the config
property (fs.SCHEME.impl) naming the FileSystem
implementation class. The uri's authority is used to determine
the host, port, etc. for a filesystem.
</description>
</property>
conf/mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the
MapReduce job tracker runs at. If "local", then jobs
are run in-process as a single map and reduce
task.
</description>
</property>
conf/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication. The
actual number of replications can be specified
when the file is created. The default is used if
replication is not specified in create time.
</description>
</property>
Done.
How to start? How to work?
Formatting the HDFS filesystem
via the NameNode
hduser@ubuntu:~$
/usr/local/hadoop/bin/hadoop
namenode -format
10/05/08 16:59:56 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************ STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = ubuntu/127.0.1.1 STARTUP_MSG: args = [-format] STARTUP_MSG:
version = 0.20.2 STARTUP_MSG: build =
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by
'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/ 10/05/08 16:59:56 INFO
namenode.FSNamesystem: fsOwner=hduser,hadoop 10/05/08 16:59:56 INFO
namenode.FSNamesystem: supergroup=supergroup 10/05/08 16:59:56 INFO
namenode.FSNamesystem: isPermissionEnabled=true 10/05/08 16:59:56 INFO
common.Storage: Image file of size 96 saved in 0 seconds. 10/05/08 16:59:57 INFO
common.Storage: Storage directory .../hadoop-hduser/dfs/name has been successfully
formatted. 10/05/08 16:59:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************ SHUTDOWN_MSG: Shutting down
NameNode at ubuntu/127.0.1.1 ************************************************************/
hduser@ubuntu:/usr/local/hadoop$
Starting your single-node
cluster
hduser@ubuntu:~$
/usr/local/hadoop/bin/start-all.sh
starting namenode, logging to
/usr/local/hadoop/bin/../logs/hadoop-hduser-namenode-
ubuntu.out localhost: starting datanode, logging to
/usr/local/hadoop/bin/../logs/hadoop-hduser-datanode-
ubuntu.out localhost: starting secondarynamenode,
logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-
secondarynamenode-ubuntu.out starting jobtracker,
logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-
jobtracker-ubuntu.out localhost: starting tasktracker,
logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-
tasktracker-ubuntu.out
hduser@ubuntu:/usr/local/hadoop$
Verify
hduser@ubuntu:/usr/local/hadoop$ jps
2287 TaskTracker
2149 JobTracker
1938 DataNode
2085 SecondaryNameNode
2349 Jps
1788 NameNode
Stopping your single-node
cluster
hduser@ubuntu:~$
/usr/local/hadoop/bin/stop-all.sh
hduser@ubuntu:/usr/local/hadoop$
bin/stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
hduser@ubuntu:/usr/local/hadoop$
Hadoop Web Interfaces
 http://localhost:50070/ – web UI of the NameNode daemon
 http://localhost:50030/ – web UI of the JobTracker daemon
 http://localhost:50060/ – web UI of the TaskTracker daemon
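If a page does not load, a quick check from the terminal (assuming curl is installed; any HTTP client will do) is:
$ curl -s http://localhost:50070/ | head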
Name node (screenshot of the NameNode web UI)
Job Tracker (screenshot of the JobTracker web UI)
Task Tracker (screenshot of the TaskTracker web UI)
References
1. http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
2. http://www.tutorialspoint.com/hadoop/hadoop_enviornment_setup.htm
Multi-node Cluster
Hadoop Installation
Making a multi-node cluster
 Merge two single-node clusters into one multi-node cluster.
 One machine will become the designated master
 It will also work as a slave (it will store and process data as well)
 Pseudo-distributed cluster
 The other machine will become a slave only
Prerequisites
 Configure single-node clusters first
 Copy the Ubuntu VM folder and paste it (i.e., replicate the same VM)
 Make sure your Ubuntu systems use DHCP or other suitable settings for network setup.
Change Host-names
 Change the hostname of each system
 Log in to each system as hduser@ubuntu$ and open the file /etc/hosts
 Find the system's IPv4 address using the ifconfig command
 Make an entry with the IP address and hostname of both master and slave, on both systems, as in the sketch below.
 Command:
 sudo gedit /etc/hosts
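A minimal /etc/hosts sketch; the 192.168.0.x addresses match the sample SSH output later in this deck, so substitute whatever ifconfig reports on your machines:
192.168.0.1 master
192.168.0.2 slave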
Also change
 /etc/hostname
 On master
 master
 On slave
 slave
 For the change to take effect, run:
 sudo /etc/init.d/hostname restart or sudo service hostname restart
Verification
 Exit and reopen the terminal at least once to see the effect; the prompt changes:
 From:
 hduser@ubuntu$
 To:
 hduser@master$ on master side
 hduser@slave$ on slave side
SSH access
 Distribute the SSH public key of hduser@master
 Command:
 hduser@master:~$ ssh-copy-id -i
$HOME/.ssh/id_rsa.pub hduser@slave
 The above command copies master's id_rsa.pub into the authorized_keys of hduser@slave
SSH Login
 So, connecting from master to master…
 Command:
 hduser@master:~$ ssh master
 Sample output:
 hduser@master:~$ ssh master The authenticity of
host 'master (192.168.0.1)' can't be established.
RSA key fingerprint is
3b:21:b3:c0:21:5c:7c:54:2f:1e:2d:96:79:eb:7f:95.
Are you sure you want to continue connecting
(yes/no)? yes Warning: Permanently added 'master'
(RSA) to the list of known hosts. Linux master
2.6.20-16-386 #2 Thu Jun 7 20:16:13 UTC 2007
i686 ...
hduser@master:~$
SSH Login
 …and from master to slave.
 Command:
 hduser@master:~$ ssh slave
 Sample output:
 The authenticity of host 'slave (192.168.0.2)' can't
be established. RSA key fingerprint is
74:d7:61:86:db:86:8f:31:90:9c:68:b0:13:88:52:72.
Are you sure you want to continue connecting
(yes/no)? yes Warning: Permanently added 'slave'
(RSA) to the list of known hosts. Ubuntu 10.04 ...
hduser@slave:~$
Only on master side
 Update the /usr/local/hadoop/conf/masters file as shown below
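In Hadoop 1.x, conf/masters lists the host on which start-all.sh launches the SecondaryNameNode; in this setup it is a sketch with a single line:
master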
Only on master side
 Update the /usr/local/hadoop/conf/slaves file as shown below
 If you have more than one slave, list each one on its own line
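conf/slaves lists every host that runs a DataNode and TaskTracker; since the master also works as a slave in this setup, a sketch looks like:
master
slave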
conf/core-site.xml (ALL
machines)
<property>
<name>fs.default.name</name>
<value>hdfs://master:54310</value>
<description>The name of the default file system. A
URI whose
scheme and authority determine the FileSystem
implementation. The
uri's scheme determines the config property
(fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's
authority is used to
determine the host, port, etc. for a
filesystem.</description>
</property>
conf/mapred-site.xml (ALL machines)
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
<description>The host and port that the
MapReduce job tracker runs
at. If "local", then jobs are run in-process as a
single map
and reduce task.
</description>
</property>
conf/hdfs-site.xml (ALL
machines)
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified
when the file is created.
The default is used if replication is not specified in
create time.
</description>
</property>
Run name-node format (critical)
hduser@master:/usr/local/hadoop$ bin/hadoop
namenode -format
... INFO dfs.Storage: Storage directory
/app/hadoop/tmp/dfs/name has been successfully
formatted.
hduser@master:/usr/local/hadoop$
Start multi-node Cluster
hduser@master:/usr/local/hadoop$
bin/start-all.sh
starting namenode, logging to
/usr/local/hadoop/bin/../logs/hadoop-hduser-namenode-
master.out
slave: Ubuntu 10.xx
slave: starting datanode, logging to
/usr/local/hadoop/bin/../logs/hadoop-hduser-datanode-
slave.out
master: starting datanode, logging to
/usr/local/hadoop/bin/../logs/hadoop-hduser-datanode-
master.out
master: starting secondarynamenode, logging to
/usr/local/hadoop/bin/../logs/hadoop-hduser-
secondarynamenode-master.out
hduser@master:/usr/local/hadoop$
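As a sanity check (assuming the same Hadoop 1.x daemons as in the single-node setup), once the MapReduce daemons have also started:
hduser@master:/usr/local/hadoop$ jps (expect NameNode, SecondaryNameNode, JobTracker, DataNode, TaskTracker)
hduser@slave:~$ jps (expect DataNode, TaskTracker)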
Common Errors
 After some time the DataNode shuts down automatically (a namespaceID mismatch)
 Fix
1. Restart Hadoop
2. Go to /app/hadoop/tmp/dfs/name/current
3. Open VERSION (e.g. with vim VERSION)
4. Record the namespaceID
5. Go to /app/hadoop/tmp/dfs/data/current
6. Open VERSION (e.g. with vim VERSION)
7. Replace the namespaceID with the namespaceID you recorded in step 4.
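A scripted sketch of steps 2-7, assuming the key=value layout that Hadoop 1.x VERSION files use:
$ NSID=$(grep '^namespaceID=' /app/hadoop/tmp/dfs/name/current/VERSION | cut -d= -f2) # steps 2-4: record the ID
$ sudo sed -i "s/^namespaceID=.*/namespaceID=${NSID}/" /app/hadoop/tmp/dfs/data/current/VERSION # steps 5-7: replace it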
Common Errors
 $HADOOP_HOME is deprecated.
 Fix
 Set
 export HADOOP_HOME_WARN_SUPPRESS="TRUE" in your conf/hadoop-env.sh file
Enjoy Hadooping!!!
Editor's Notes
  1. A return value of 0 means IPv6 is enabled; a value of 1 means it is disabled (that's what we want).
  2. Just to give you the idea, YMMV; personally, I create a symlink from hadoop-1.0.3 to hadoop.
  3. export HADOOP_HOME_WARN_SUPPRESS="TRUE" (to suppress the "$HADOOP_HOME is deprecated" warning given by $ hadoop version).
  4. If you forget to set the required ownerships and permissions, you will see a java.io.IOException when you try to format the name node in the next section.
  5. Do not format a running Hadoop filesystem, as you will lose all the data currently in the cluster (in HDFS)!
  6. This will start up a NameNode, DataNode, JobTracker and a TaskTracker on your machine.
  7. This will not work on VMware due to the copy-paste operation of the same system; it may say that the same file already exists in the system folder.
  8. The default value of dfs.replication is 3. However, we have only two nodes available, so we set dfs.replication to 2.
  9. http://stackoverflow.com/questions/18300940/why-does-data-node-shut-down-when-i-run-hadoop
  10. http://stackoverflow.com/questions/16936745/hadoop-home-is-deprecated-hadoop