SlideShare a Scribd company logo
1 of 18
Building a Hadoop Cluster on
Amazon EC2 using Cloudera
April 2013
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Create Name Node: m1.large
Ubuntu Server 12.04.1 LTS 64-bit
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Create Name Node: m1.large
Can use defaults for most Wizard screens except Firewall
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Launch Instance, Connect Via SSH
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Download & Run Cloudera Manager
Depending on settings, might need to run as “sudo”
Might take 5 or so
minutes to go through
licensing and
installation menus
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Open browser And Go To EC2 Instance Public
DNS At Port 7180
Ex: http://ec2-50-17-162-58.compute-1.amazonaws.com:7180
On this first login, you set your username and password credentials
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Use Defaults,„18‟ for instances
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Type In AWS Access Key ID & Secret Access Key
Credentials can be found under “Security Credentials” in EC2 dashboard
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Review Settings Then Install!
Provisioning Instances will take
a few minutes
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
If Any Installations Fail, Retry Until
Success
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Make Sure Consistency Check Passes
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Cluster Services Will Start, Then Success!
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Finding Hue Public DNS
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Hue (Hadoop User Experience) is the more user-friendly way
to interact with Hadoop
Finding Hue Public DNS
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Clicking on the “Hue Web UI” button doesn‟t work, because it references the
Internal Address for Amazon EC2
Clicking this link button won‟t work!
Need to find the Public DNS for this Internal Address in Amazon Dashboard
Finding Hue Public DNS
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Type in Internal Address in Search Box to find the Instance having Hue
This is the public DNS
Address to access Hue
Finding Hue Public DNS
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
Hue is accessed via Port 8888
Ex: http://ec2-54-224-118-78.compute-1.amazonaws.com:8888
Pick your username/password carefully, this is the superuser
If You See Hue, You‟re Ready For Analysis!
http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
This is the Hive editor, which allows for SQL-Like Syntax to
create MapReduce jobs
Reference
http://blog.cloudera.com/blog/2013/03/how-to-create-a-cdh-cluster-on-amazon-ec2-via-cloudera-manager/

More Related Content

What's hot

Aeon mike guide transparent ssl filtering (1)
Aeon mike guide transparent ssl filtering (1)Aeon mike guide transparent ssl filtering (1)
Aeon mike guide transparent ssl filtering (1)Conrad Cruz
 
Aeon mike guide transparent ssl filtering
Aeon mike guide transparent ssl filteringAeon mike guide transparent ssl filtering
Aeon mike guide transparent ssl filteringConrad Cruz
 
Deploying your rails application to a clean ubuntu 10
Deploying your rails application to a clean ubuntu 10Deploying your rails application to a clean ubuntu 10
Deploying your rails application to a clean ubuntu 10Maurício Linhares
 
Terraform bootstrap code_execute
Terraform bootstrap code_executeTerraform bootstrap code_execute
Terraform bootstrap code_executerknaik76
 
Making the secure communication between Server and Client with https protocol
Making the secure communication between Server and Client with https protocolMaking the secure communication between Server and Client with https protocol
Making the secure communication between Server and Client with https protocolArmenuhi Abramyan
 
WordPress Security - A Top Down Approach
WordPress Security - A Top Down ApproachWordPress Security - A Top Down Approach
WordPress Security - A Top Down ApproachBrecht Ryckaert
 
How to launch an aws ec2 instance
How to launch an aws ec2 instanceHow to launch an aws ec2 instance
How to launch an aws ec2 instanceAndrea Cirillo
 
Controlling multiple VMs with the power of Python
Controlling multiple VMs with the power of PythonControlling multiple VMs with the power of Python
Controlling multiple VMs with the power of PythonYurii Vasylenko
 
Build your own secure mail server on the cloud using Amazon Web Services
Build your own secure mail server on the cloud using Amazon Web ServicesBuild your own secure mail server on the cloud using Amazon Web Services
Build your own secure mail server on the cloud using Amazon Web Servicesponukumatla joel nishanth
 
Setting up a kubernetes cluster on ubuntu 18.04- loves cloud
Setting up a kubernetes cluster on ubuntu 18.04- loves cloudSetting up a kubernetes cluster on ubuntu 18.04- loves cloud
Setting up a kubernetes cluster on ubuntu 18.04- loves cloudLoves Cloud
 
Rstudio in aws 16 9
Rstudio in aws 16 9Rstudio in aws 16 9
Rstudio in aws 16 9Tal Galili
 

What's hot (20)

Aeon mike guide transparent ssl filtering (1)
Aeon mike guide transparent ssl filtering (1)Aeon mike guide transparent ssl filtering (1)
Aeon mike guide transparent ssl filtering (1)
 
Aeon mike guide transparent ssl filtering
Aeon mike guide transparent ssl filteringAeon mike guide transparent ssl filtering
Aeon mike guide transparent ssl filtering
 
Knowledge article
Knowledge articleKnowledge article
Knowledge article
 
Deploying your rails application to a clean ubuntu 10
Deploying your rails application to a clean ubuntu 10Deploying your rails application to a clean ubuntu 10
Deploying your rails application to a clean ubuntu 10
 
Terraform bootstrap code_execute
Terraform bootstrap code_executeTerraform bootstrap code_execute
Terraform bootstrap code_execute
 
Making the secure communication between Server and Client with https protocol
Making the secure communication between Server and Client with https protocolMaking the secure communication between Server and Client with https protocol
Making the secure communication between Server and Client with https protocol
 
WordPress Security - A Top Down Approach
WordPress Security - A Top Down ApproachWordPress Security - A Top Down Approach
WordPress Security - A Top Down Approach
 
Pentesting Cloud Environment
Pentesting Cloud EnvironmentPentesting Cloud Environment
Pentesting Cloud Environment
 
How to launch an aws ec2 instance
How to launch an aws ec2 instanceHow to launch an aws ec2 instance
How to launch an aws ec2 instance
 
Controlling multiple VMs with the power of Python
Controlling multiple VMs with the power of PythonControlling multiple VMs with the power of Python
Controlling multiple VMs with the power of Python
 
Build your own secure mail server on the cloud using Amazon Web Services
Build your own secure mail server on the cloud using Amazon Web ServicesBuild your own secure mail server on the cloud using Amazon Web Services
Build your own secure mail server on the cloud using Amazon Web Services
 
Setting up a kubernetes cluster on ubuntu 18.04- loves cloud
Setting up a kubernetes cluster on ubuntu 18.04- loves cloudSetting up a kubernetes cluster on ubuntu 18.04- loves cloud
Setting up a kubernetes cluster on ubuntu 18.04- loves cloud
 
SoftLayer-demoLabV3
SoftLayer-demoLabV3SoftLayer-demoLabV3
SoftLayer-demoLabV3
 
Task 3.2
Task 3.2Task 3.2
Task 3.2
 
Rdo mitaka
Rdo mitakaRdo mitaka
Rdo mitaka
 
Rstudio in aws 16 9
Rstudio in aws 16 9Rstudio in aws 16 9
Rstudio in aws 16 9
 
grate techniques
grate techniquesgrate techniques
grate techniques
 
Your first meteor package
Your first meteor packageYour first meteor package
Your first meteor package
 
Cloudinit
CloudinitCloudinit
Cloudinit
 
Aws amazon ec2
Aws amazon ec2Aws amazon ec2
Aws amazon ec2
 

Similar to Build a Hadoop Cluster on AWS EC2 using Cloudera

How to create a secured cloudera cluster
How to create a secured cloudera clusterHow to create a secured cloudera cluster
How to create a secured cloudera clusterTiago Simões
 
Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2benjaminwootton
 
CloudStack and cloud-init
CloudStack and cloud-initCloudStack and cloud-init
CloudStack and cloud-initMarcusS13
 
How to become cloud backup provider with Cloudian HyperStore and CloudBerry L...
How to become cloud backup provider with Cloudian HyperStore and CloudBerry L...How to become cloud backup provider with Cloudian HyperStore and CloudBerry L...
How to become cloud backup provider with Cloudian HyperStore and CloudBerry L...Cloudian
 
CCCEU15 run cloudstack in docker
CCCEU15 run cloudstack in dockerCCCEU15 run cloudstack in docker
CCCEU15 run cloudstack in dockerPierre-Luc Dion
 
CloudStack Collab Conference 2015 Run CloudStack in Docker
CloudStack Collab Conference 2015 Run CloudStack in DockerCloudStack Collab Conference 2015 Run CloudStack in Docker
CloudStack Collab Conference 2015 Run CloudStack in DockerCloudOps2005
 
Docker Networking - Common Issues and Troubleshooting Techniques
Docker Networking - Common Issues and Troubleshooting TechniquesDocker Networking - Common Issues and Troubleshooting Techniques
Docker Networking - Common Issues and Troubleshooting TechniquesSreenivas Makam
 
Docker summit 2015: 以 Docker Swarm 打造多主機叢集環境
Docker summit 2015: 以 Docker Swarm 打造多主機叢集環境Docker summit 2015: 以 Docker Swarm 打造多主機叢集環境
Docker summit 2015: 以 Docker Swarm 打造多主機叢集環境謝 宗穎
 
Java App On Digital Ocean: Deploying With Gitlab CI/CD
Java App On Digital Ocean: Deploying With Gitlab CI/CDJava App On Digital Ocean: Deploying With Gitlab CI/CD
Java App On Digital Ocean: Deploying With Gitlab CI/CDSeun Matt
 
How to scheduled jobs in a cloudera cluster without oozie
How to scheduled jobs in a cloudera cluster without oozieHow to scheduled jobs in a cloudera cluster without oozie
How to scheduled jobs in a cloudera cluster without oozieTiago Simões
 
Install and configure linux
Install and configure linuxInstall and configure linux
Install and configure linuxVicent Selfa
 
Node.js Cloud deployment
Node.js Cloud deploymentNode.js Cloud deployment
Node.js Cloud deploymentNicholas McClay
 
Null bhopal Sep 2016: What it Takes to Secure a Web Application
Null bhopal Sep 2016: What it Takes to Secure a Web ApplicationNull bhopal Sep 2016: What it Takes to Secure a Web Application
Null bhopal Sep 2016: What it Takes to Secure a Web ApplicationAnant Shrivastava
 
Cloud init and cloud provisioning [openstack summit vancouver]
Cloud init and cloud provisioning [openstack summit vancouver]Cloud init and cloud provisioning [openstack summit vancouver]
Cloud init and cloud provisioning [openstack summit vancouver]Joshua Harlow
 
How to Become Cloud Backup Provider
How to Become Cloud Backup ProviderHow to Become Cloud Backup Provider
How to Become Cloud Backup ProviderCloudian
 
Using aphace-as-proxy-server
Using aphace-as-proxy-serverUsing aphace-as-proxy-server
Using aphace-as-proxy-serverHARRY CHAN PUTRA
 

Similar to Build a Hadoop Cluster on AWS EC2 using Cloudera (20)

How to create a secured cloudera cluster
How to create a secured cloudera clusterHow to create a secured cloudera cluster
How to create a secured cloudera cluster
 
Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2
 
Apache
ApacheApache
Apache
 
Apache
ApacheApache
Apache
 
Network Manual
Network ManualNetwork Manual
Network Manual
 
CloudStack and cloud-init
CloudStack and cloud-initCloudStack and cloud-init
CloudStack and cloud-init
 
How to become cloud backup provider with Cloudian HyperStore and CloudBerry L...
How to become cloud backup provider with Cloudian HyperStore and CloudBerry L...How to become cloud backup provider with Cloudian HyperStore and CloudBerry L...
How to become cloud backup provider with Cloudian HyperStore and CloudBerry L...
 
CCCEU15 run cloudstack in docker
CCCEU15 run cloudstack in dockerCCCEU15 run cloudstack in docker
CCCEU15 run cloudstack in docker
 
CloudStack Collab Conference 2015 Run CloudStack in Docker
CloudStack Collab Conference 2015 Run CloudStack in DockerCloudStack Collab Conference 2015 Run CloudStack in Docker
CloudStack Collab Conference 2015 Run CloudStack in Docker
 
Docker Networking - Common Issues and Troubleshooting Techniques
Docker Networking - Common Issues and Troubleshooting TechniquesDocker Networking - Common Issues and Troubleshooting Techniques
Docker Networking - Common Issues and Troubleshooting Techniques
 
Docker summit 2015: 以 Docker Swarm 打造多主機叢集環境
Docker summit 2015: 以 Docker Swarm 打造多主機叢集環境Docker summit 2015: 以 Docker Swarm 打造多主機叢集環境
Docker summit 2015: 以 Docker Swarm 打造多主機叢集環境
 
Java App On Digital Ocean: Deploying With Gitlab CI/CD
Java App On Digital Ocean: Deploying With Gitlab CI/CDJava App On Digital Ocean: Deploying With Gitlab CI/CD
Java App On Digital Ocean: Deploying With Gitlab CI/CD
 
One-Man Ops
One-Man OpsOne-Man Ops
One-Man Ops
 
How to scheduled jobs in a cloudera cluster without oozie
How to scheduled jobs in a cloudera cluster without oozieHow to scheduled jobs in a cloudera cluster without oozie
How to scheduled jobs in a cloudera cluster without oozie
 
Install and configure linux
Install and configure linuxInstall and configure linux
Install and configure linux
 
Node.js Cloud deployment
Node.js Cloud deploymentNode.js Cloud deployment
Node.js Cloud deployment
 
Null bhopal Sep 2016: What it Takes to Secure a Web Application
Null bhopal Sep 2016: What it Takes to Secure a Web ApplicationNull bhopal Sep 2016: What it Takes to Secure a Web Application
Null bhopal Sep 2016: What it Takes to Secure a Web Application
 
Cloud init and cloud provisioning [openstack summit vancouver]
Cloud init and cloud provisioning [openstack summit vancouver]Cloud init and cloud provisioning [openstack summit vancouver]
Cloud init and cloud provisioning [openstack summit vancouver]
 
How to Become Cloud Backup Provider
How to Become Cloud Backup ProviderHow to Become Cloud Backup Provider
How to Become Cloud Backup Provider
 
Using aphace-as-proxy-server
Using aphace-as-proxy-serverUsing aphace-as-proxy-server
Using aphace-as-proxy-server
 

Build a Hadoop Cluster on AWS EC2 using Cloudera

  • 1. Building a Hadoop Cluster on Amazon EC2 using Cloudera April 2013 http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 2. Create Name Node: m1.large Ubuntu Server 12.04.1 LTS 64-bit http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 3. Create Name Node: m1.large Can use defaults for most Wizard screens except Firewall http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 4. Launch Instance, Connect Via SSH http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 5. Download & Run Cloudera Manager Depending on settings, might need to run as “sudo” Might take 5 or so minutes to go through licensing and installation menus http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 6. Open browser And Go To EC2 Instance Public DNS At Port 7180 Ex: http://ec2-50-17-162-58.compute-1.amazonaws.com:7180 On this first login, you set your username and password credentials http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 7. Use Defaults,„18‟ for instances http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 8. Type In AWS Access Key ID & Secret Access Key Credentials can be found under “Security Credentials” in EC2 dashboard http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 9. Review Settings Then Install! Provisioning Instances will take a few minutes http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 10. If Any Installations Fail, Retry Until Success http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 11. Make Sure Consistency Check Passes http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 12. Cluster Services Will Start, Then Success! http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2
  • 13. Finding Hue Public DNS http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2 Hue (Hadoop User Experience) is the more user-friendly way to interact with Hadoop
  • 14. Finding Hue Public DNS http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2 Clicking on the “Hue Web UI” button doesn‟t work, because it references the Internal Address for Amazon EC2 Clicking this link button won‟t work! Need to find the Public DNS for this Internal Address in Amazon Dashboard
  • 15. Finding Hue Public DNS http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2 Type in Internal Address in Search Box to find the Instance having Hue This is the public DNS Address to access Hue
  • 16. Finding Hue Public DNS http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2 Hue is accessed via Port 8888 Ex: http://ec2-54-224-118-78.compute-1.amazonaws.com:8888 Pick your username/password carefully, this is the superuser
  • 17. If You See Hue, You‟re Ready For Analysis! http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-2 This is the Hive editor, which allows for SQL-Like Syntax to create MapReduce jobs