Set up Hadoop Cluster on Amazon EC2
By Nattinan Yontchai (Earng): KMIT
Dr. Thanachart Numnonda: IMC Institute
January 2015
VPC Creation
To install Hadoop on EC2 instances, we use Amazon VPC (Virtual Private Cloud) and Elastic IP addresses so that we can stop and start the instances whenever needed. With these two AWS services, the EC2 instances being created keep static private and public IP addresses.
In this step, we will create a VPC and assign the security group as follows:
1. Select the VPC service; it opens the VPC dashboard as shown below:
2. Click on Start VPC Wizard and select VPC with a Single Public Subnet as shown below:
3. Enter Hadoop VPC as the VPC name and leave the rest as default, then click on Create VPC
4. On the VPC Dashboard, select Security Groups and then the default group as shown below:
5. Select Inbound Rules, then click on Edit and enter the following rules:
6. Save the inbound rules and rename the group to Hadoop Security Group
Launch EC2 for a Hadoop Master
In this step, we will launch an EC2 instance for a Hadoop master node as follows:
1. Select the EC2 service and click on Launch Instance
2. Select an Amazon Machine Image (AMI): Ubuntu Server 14.04 LTS (PV)
3. Select an Instance Type: m3.large, then click Next: Configure Instance Details
4. Select 1 instance for the NameNode, Hadoop-VPC as Network (the VPC created above), and leave the remaining properties as default > Click Next: Add Storage
5. Add Storage: at least 40 GB > Click Next: Tag Instance
6. Tag Instance > Enter Value: Hadoop Master 01 > Click Next.
7. Configure Security Group > Select an existing security group > Select Security Group Name: default >
Click Review and Launch
8. Review Instance Launch > Click Launch
9. Choose an existing key pair > LabCloudera (or create a new key pair) > select the acknowledgment check box > Launch Instances
10. Select the EC2 service and choose Elastic IPs, then click on Allocate New Address
11. After getting the IP address, click on Associate Address
12. In the Associate Address dialog box, select the instance just created
Install Hadoop Master
In this step, we will install a Hadoop master node as follows:
1. To view the command for connecting to the EC2 instance, open the EC2 dashboard, choose Hadoop Master 01, and click on Connect. You will see the ssh command as follows (note that in this case the public IP is 54.69.195.87):
2. Open the client terminal console and type the following command
ssh -i clouderalab.pem ubuntu@54.69.195.xx
3. The EC2 instance terminal will now be open
4. Type command > sudo apt-get update
5. Type command > ssh-keygen (press Enter at each prompt)
6. Type command > cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
7. Type command > ssh 54.69.195.xx (enter yes when prompted)
8. Type command > exit
9. Type command > sudo apt-get install openjdk-7-jdk (enter Y when prompted)
10. Type command > java -version and press the Enter key. (It should display as shown below)
11. Type command > wget http://mirror.issp.co.th/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
12. Type command > tar -xvzf hadoop-1.2.1.tar.gz
13. Type command > sudo mv hadoop-1.2.1 /usr/local/hadoop
14. Type command > sudo vi $HOME/.bashrc
15. Add the following lines
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin
16. Type command > exec bash
17. Type command > sudo vi /usr/local/hadoop/conf/hadoop-env.sh
18. Edit the file to contain the following lines
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=TRUE
19. Type command > sudo vi /usr/local/hadoop/conf/core-site.xml
20. Add the private IP of the master server as in the figure below (in this case the private IP is 10.0.0.212)
21. Type command > sudo vi /usr/local/hadoop/conf/mapred-site.xml
22. Add the private IP of the JobTracker server as in the figure below
23. Type command > sudo vi /usr/local/hadoop/conf/hdfs-site.xml
24. Add the configuration as in the figure below
25. Type command > sudo mkdir /usr/local/hadoop/tmp
26. Type command > sudo chown ubuntu /usr/local/hadoop
27. Type command > sudo chown ubuntu /usr/local/hadoop/tmp
28. Type command > hadoop namenode -format
29. Finish
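As a rough sketch only, the three files edited in steps 19 to 24 might look like the ones generated below. The property names fs.default.name, mapred.job.tracker, dfs.replication, and hadoop.tmp.dir are standard Hadoop 1.x keys, and the temp directory matches the one created in step 25. The ports 9000 and 9001 and the replication factor of 3 are assumptions, not values taken from the slides. MASTER_IP and CONF_DIR are hypothetical variable names; the script writes to a scratch directory so the result can be reviewed before anything is copied into /usr/local/hadoop/conf.

```shell
#!/bin/sh
# Sketch: generate Hadoop 1.x config files in a scratch directory.
# MASTER_IP defaults to the example private IP from step 20.
# Ports 9000/9001 and replication factor 3 are ASSUMPTIONS.
MASTER_IP="${MASTER_IP:-10.0.0.212}"
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# core-site.xml: NameNode address and Hadoop temp directory
cat > "$CONF_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$MASTER_IP:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
EOF

# mapred-site.xml: JobTracker address (host:port, no URI scheme)
cat > "$CONF_DIR/mapred-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>$MASTER_IP:9001</value>
  </property>
</configuration>
EOF

# hdfs-site.xml: number of copies kept of each HDFS block
cat > "$CONF_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
EOF

echo "Wrote config files to $CONF_DIR"
```

After review, the generated files could be copied into place with sudo cp "$CONF_DIR"/*.xml /usr/local/hadoop/conf/ before running step 28.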
Cloning Instance on EC2 for Hadoop Slaves
In this step, we will clone the Hadoop instance created above into three other instances that act as Hadoop slaves.
1. Select the EC2 service and choose Hadoop Master 01
2. Click on Actions > Create Image
3. Name the image Hadoop-Image as shown below:
4. Select the AMIs tab (in the left pane) and choose Hadoop-Image, then click on Launch
5. Select an Instance Type: m3.medium, then click Next: Configure Instance Details
6. Select 3 instances for the slave nodes, Hadoop-VPC as Network (the VPC created above), and leave the remaining properties as default > Click Next: Add Storage
7. Add Storage: at least 80 GB > Click Next: Tag Instance
8. Tag Instance > Enter Value: Hadoop Slave > Click Next.
9. Configure Security Group > Select an existing security group > Select Security Group Name: default >
Click Review and Launch
10. Review Instance Launch > Click Launch
11. Choose an existing key pair > LabCloudera > select the acknowledgment check box > Launch Instances
12. View the EC2 dashboard; it will show three new instances named Hadoop Slave
13. Allocate three new Elastic IP addresses and associate them with the Hadoop Slave instances as shown in the example below
Setup Hadoop Cluster
1. SSH to the master node (ssh -i clouderalab.pem ubuntu@54.69.195.xx)
2. Type command > sudo vi /usr/local/hadoop/conf/masters
3. Enter the private IP of the master server. Save and exit.
4. Type command > sudo vi /usr/local/hadoop/conf/slaves
5. Enter the private IPs of the DataNode servers, one per line. Save and exit.
6. Type command > ssh-copy-id -i $HOME/.ssh/id_rsa.pub ubuntu@10.0.0.193 (enter yes when prompted)
7. Type command > ssh 10.0.0.193 and press the Enter key (to test password-less login)
8. Type command > exit
9. Repeat steps 6 to 8 for all slaves
10. Start the Hadoop services by typing the command > start-all.sh
11. Type command > jps on all four systems to ensure that the Hadoop services are running
At this point, the following Java processes should run on the master…
…and the following on the slaves.
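The jps check in step 11 can be sketched as a small shell helper. It assumes the usual Hadoop 1.x layout started by start-all.sh: NameNode, SecondaryNameNode, and JobTracker on the master, and DataNode and TaskTracker on each slave. If the master's IP is also listed in the slaves file, the master runs those two daemons as well, which this sketch does not check. The function name missing_daemons is hypothetical.

```shell
#!/bin/sh
# Sketch: report which expected Hadoop 1.x daemons are absent from jps output.
# Usage: missing_daemons <master|slave> "<jps output>"
# Returns 0 if all expected daemons are present, 1 otherwise.
missing_daemons() {
  role="$1"
  jps_output="$2"
  case "$role" in
    master) expected="NameNode SecondaryNameNode JobTracker" ;;
    slave)  expected="DataNode TaskTracker" ;;
    *) echo "unknown role: $role" >&2; return 2 ;;
  esac
  missing=""
  for daemon in $expected; do
    # jps prints "<pid> <ClassName>"; compare the second field exactly so
    # that SecondaryNameNode is not mistaken for NameNode.
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      missing="$missing $daemon"
    fi
  done
  if [ -z "$missing" ]; then
    return 0
  fi
  echo "missing:$missing"
  return 1
}

# Example with canned jps output (the process IDs are made up).
sample="2101 NameNode
2290 SecondaryNameNode
2377 JobTracker
2540 Jps"
missing_daemons master "$sample" && echo "master OK"
```

On a live cluster the second argument would come from the node itself, for example missing_daemons master "$(jps)" on the master, or missing_daemons slave "$(ssh 10.0.0.193 jps)" for a slave, assuming jps is on the PATH of the remote non-interactive shell.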
Testing the Hadoop Cluster
1. View the Hadoop HDFS web UI by typing the following URL in the web browser
http://54.69.195.xx:50070/

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 

Set up Hadoop Cluster on Amazon EC2

  • 1. Set up Hadoop cluster on Amazon EC2
      By Nattinan Yontchai (Earng): KMIT and Dr. Thanachart Numnonda: IMC Institute
      January 2015
  • 2. VPC Creation
      To install Hadoop on EC2 instances, we use Amazon VPC (Virtual Private Cloud) and Elastic IP addresses so that we can stop and start the instances whenever we need to. With these two AWS services, the EC2 instances being created keep static private and public IP addresses.
      In this step, we will create a VPC and assign the security group as follows:
      1. Select the VPC service; it will open the VPC dashboard as shown below:
      2. Click on Start VPC Wizard and select VPC with a single public subnet as shown below:
  • 3.
      3. Set the VPC name to Hadoop VPC and leave the rest as default, then click on Create VPC
      4. On the VPC Dashboard, select Security Groups and then the default group as shown below:
      5. Select Inbound Rules, then click on Edit and enter the following rules:
  • 4.
      6. Save the inbound rules and rename the group to Hadoop Security Group
      Launch EC2 for a Hadoop Master
      In this step, we will launch an EC2 instance for a Hadoop master node as follows:
      1. Select the EC2 service and click on Launch Instance
      2. Select an Amazon Machine Image (AMI): Ubuntu Server 14.04 LTS (PV)
      3. Select an Instance Type: m3.large and click Next: Configure Instance Details
  • 5.
      4. Select 1 instance for the Namenode, Hadoop-VPC as Network (the VPC created above), and leave the remaining properties as default > Click Next: Add Storage
      5. Add Storage: at least 40 GB > Next: Tag Instance
      6. Tag Instance > Enter Value: Hadoop Master 01 > Click Next.
  • 6.
      7. Configure Security Group > Select an existing security group > Select Security Group Name: default > Click Review and Launch
      8. Review Instance Launch > Click Launch
      9. Choose an existing key pair > LabCloudera (or Create new key pair) > click on acknowledge > Launch Instances
  • 7.
      10. Select the EC2 service and choose Elastic IPs, then click on Allocate New Address
      11. After getting the IP address, click on Associate Address
      12. In the Associate Address dialog box, select the instance just created
  • 8. Install Hadoop Master
      In this step, we will install a Hadoop master node as follows:
      1. View the command for connecting to the EC2 instance: on the EC2 dashboard, choose Hadoop Master 01 and click on Connect. You will see the ssh command as follows: (Note: in this case the public IP is 54.69.195.87)
      2. Open the client terminal console and type the following command: ssh -i clouderalab.pem ubuntu@54.69.195.xx
      3. The EC2 instance terminal will now be open
  • 9.
      4. Type command > sudo apt-get update
      5. Type command > ssh-keygen (press Enter at each prompt)
      6. Type command > cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
      7. Type command > ssh 54.69.195.xx (enter yes when prompted)
      8. Type command > exit
      9. Type command > sudo apt-get install openjdk-7-jdk (enter Y when prompted)
      10. Type command > java -version and press the Enter key. (It should display as shown below)
      11. Type command > wget http://mirror.issp.co.th/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
      12. Type command > tar -xvzf hadoop-1.2.1.tar.gz
  • 10.
      13. Type command > sudo mv hadoop-1.2.1 /usr/local/hadoop
      14. Type command > sudo vi $HOME/.bashrc
      15. Add the configuration as in the figure below:
          export HADOOP_PREFIX=/usr/local/hadoop
          export PATH=$PATH:$HADOOP_PREFIX/bin
      16. Type command > exec bash
      17. Type command > sudo vi /usr/local/hadoop/conf/hadoop-env.sh and press the Enter key.
      18. Edit the file as in the figure below:
          export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64
          export HADOOP_OPTS=-Djava.net.preferIPv4Stack=TRUE
      19. Type command > sudo vi /usr/local/hadoop/conf/core-site.xml
  • 11.
      20. Add the private IP of the master server as in the figure below (in this case the private IP is 10.0.0.212)
      21. Type command > sudo vi /usr/local/hadoop/conf/mapred-site.xml
      22. Add the private IP of the Jobtracker server as in the figure below
      23. Type command > sudo vi /usr/local/hadoop/conf/hdfs-site.xml
      24. Add the configuration as in the figure below
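The figures for steps 20 to 24 are not reproduced in this transcript, so the exact settings the authors used are unknown. The following is a minimal sketch of what the three files might contain for Hadoop 1.2.1, written to a scratch directory rather than to /usr/local/hadoop/conf. The property names (fs.default.name, hadoop.tmp.dir, mapred.job.tracker, dfs.replication) are standard Hadoop 1.x properties; the ports 9000 and 9001, the replication factor of 3, and the use of /usr/local/hadoop/tmp as hadoop.tmp.dir (inferred from the directory created in step 25) are assumptions, and CONF_DIR is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch only: writes example Hadoop 1.x config files into a scratch directory.
# MASTER_IP is the private IP quoted in step 20. The ports (9000, 9001), the
# replication factor and the tmp dir are ASSUMPTIONS, not the authors' values.

MASTER_IP="${MASTER_IP:-10.0.0.212}"
CONF_DIR="${CONF_DIR:-./hadoop-conf-sketch}"   # hypothetical scratch directory
mkdir -p "$CONF_DIR"

# core-site.xml: where clients and daemons find the Namenode
cat > "$CONF_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://${MASTER_IP}:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
EOF

# mapred-site.xml: where Tasktrackers find the Jobtracker
cat > "$CONF_DIR/mapred-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>${MASTER_IP}:9001</value>
  </property>
</configuration>
EOF

# hdfs-site.xml: number of block replicas (assumed: one per slave)
cat > "$CONF_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
EOF

echo "Wrote core-site.xml, mapred-site.xml and hdfs-site.xml to $CONF_DIR"
```

After reviewing the generated files, they could be copied into /usr/local/hadoop/conf in place of editing each file by hand with vi.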
  • 12.
      25. Type command > sudo mkdir /usr/local/hadoop/tmp
      26. Type command > sudo chown ubuntu /usr/local/hadoop
      27. Type command > sudo chown ubuntu /usr/local/hadoop/tmp
      28. Type command > hadoop namenode -format
      29. Finish
      Cloning Instance on EC2 for Hadoop Slaves
      In this step, we will clone the created Hadoop instance into three other Hadoop instances to act as Hadoop slaves.
      1. Select the EC2 service and choose Hadoop Master 01
  • 13.
      2. Click on Actions > Create Image
      3. Name the image Hadoop-Image as shown below:
      4. Select the AMI tab (in the left pane) and choose Hadoop-Image, then click on Launch
      5. Select an Instance Type: m3.medium and click Next: Configure Instance Details
  • 14.
      6. Select 3 instances for the slave nodes, Hadoop-VPC as Network (the VPC created above), and leave the remaining properties as default > Click Next: Add Storage
      7. Add Storage: at least 80 GB > Next: Tag Instance
      8. Tag Instance > Enter Value: Hadoop Slave > Click Next.
      9. Configure Security Group > Select an existing security group > Select Security Group Name: default > Click Review and Launch
  • 15.
      10. Review Instance Launch > Click Launch
      11. Choose an existing key pair > LabCloudera > select acknowledge > Launch Instances
      12. View the EC2 dashboard; it will show three new instances named Hadoop Slave
      13. Allocate three new Elastic IP addresses and associate them with the Hadoop Slave instances as shown in the example below
      Setup Hadoop Cluster
      1. Ssh to the master node (ssh -i clouderalab.pem ubuntu@54.69.195.xx)
      2. Type command > sudo vi /usr/local/hadoop/conf/masters
      3. Enter the private IP of the master server. Save and exit.
      4. Type command > sudo vi /usr/local/hadoop/conf/slaves
      5. Enter the private IPs of the Datanode servers. Save and exit.
      6. Type command > ssh-copy-id -i $HOME/.ssh/id_rsa.pub ubuntu@10.0.0.193 (enter yes when prompted)
      7. Type command > ssh 10.0.0.193 and press the Enter key. (Test password-less login)
  • 16.
      8. Type command > exit
      9. Repeat steps 6 to 8 for all slaves
      10. Start the Hadoop services by typing command > start-all.sh
      11. Type command > jps on all four systems to ensure that the Hadoop services are running
      At this point, the following Java processes should run on the master… …and the following on the slaves.
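The process listings from the original slides are not reproduced in this transcript. In a typical Hadoop 1.x cluster started with start-all.sh, where the slaves file lists only the Datanode servers, the master usually runs NameNode, SecondaryNameNode and JobTracker, and each slave runs DataNode and TaskTracker. The sketch below checks jps output against those lists; the function name check_daemons is hypothetical, the expected daemon lists are an assumption about this particular setup, and the process IDs in the sample input are made up for illustration.

```shell
#!/usr/bin/env bash
# check_daemons ROLE
# Reads `jps` output on stdin and reports which expected Hadoop 1.x daemons
# are missing. The expected lists are ASSUMPTIONS for a master that is not
# also listed in conf/slaves.
check_daemons() {
  local role="$1" expected missing="" listing d
  case "$role" in
    master) expected="NameNode SecondaryNameNode JobTracker" ;;
    slave)  expected="DataNode TaskTracker" ;;
    *) echo "unknown role: $role" >&2; return 2 ;;
  esac
  listing="$(cat)"
  for d in $expected; do
    # jps prints "<pid> <ClassName>"; compare column 2 exactly so that
    # SecondaryNameNode is not mistaken for NameNode.
    if ! printf '%s\n' "$listing" | awk -v d="$d" '$2 == d {found=1} END {exit !found}'; then
      missing="$missing $d"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}

# Demonstration with canned input (illustrative PIDs, not real output).
printf '2101 NameNode\n2350 SecondaryNameNode\n2433 JobTracker\n2790 Jps\n' | check_daemons master
```

On a live node the function would be fed real output instead, for example `jps | check_daemons master` on the master and `jps | check_daemons slave` on each slave.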
  • 17. Testing the Hadoop Cluster
      1. View the Hadoop HDFS web UI by typing the following URL in the web browser: http://54.69.195.xx:50070/