SlideShare a Scribd company logo
Introduction to
Hadoop High Availability
Omid Vahdaty, DevOps Ninja
Prerequisites
● Be sure you know:
○ This is not an installation manual.
○ how to install an hadoop multi node cluster (single namenode + secondary)
○ 2 different ways to install Hadoop HA
■ HA with NFS
■ HA with QJM (Qurum Journal Manager)
○ High availability Concepts:
■ Having Secondary Namenode is NOT HA.
■ Active/standby = Active/Passive != Active/Active
■ Main Concern: Split Brain Scenario. Common solution: STONITH fencing method.
HA with NFS
Hdfs-site.xml additions to HA (both methods)
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>machine1.example.com:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>machine2.example.com:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>machine1.example.com:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>machine2.example.com:50070</value>
</property>
hdfs-site.xml additions to HA with NFS method
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>file:///mnt/namenode</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoopuser/.ssh/id_rsa</value>
</property>
Core-site.xml additions
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
HA with Qurum Journal Manager
Hdfs-site.xml additions to HA with QJM
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/m
ycluster</value>
</property>
Core-site.xml addition with HA QJM method
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/path/to/journal/node/local/data</value>
</property>
Before you start experimenting...
1. Be sure to understand the format NN flow.
2. Learn how to “upgrade” from one NN to 2 NN
3. Test your cluster is healthy (cluster up, does not mean no ERROR in the logs
of each component)
4. Make sure you understand format section in the APACHE documentation of
HA installation carefully.
5. Play with some Scenarios:
a. Manual Failover: hdfs haadmin -failover nn2 nn1
Be Careful of twilight zone
1. Cluster appears to be up(no errors), but both NN are in Standby (default when
starting cluster)
2. Datanode appears to be up(no errors), but Datanode is out of sync with NN.
Format new cluster
● “hadoop-daemon.sh start journalnode” (on all QJM machines)
● “hdfs namenode -format” (for fresh cluster)
● “hdfs namenode -bootstrapStandby” on the unformatted NameNode (on existing
cluster only)
● “hdfs namenode -initializeSharedEdits”, which will initialize the JournalNodes with the
edits data from the local NameNode edits directories.
● “hadoop-daemon.sh start namenode”
● Read logs - JPS is not enough
● Check NN state (make sure you have at least one in active)
○ hdfs haadmin -failover nn2 nn1
Starting a cluster
● start-dfs.sh
● Activate a NN: hdfs haadmin -failover nn2 nn1
● If Quorum not achieved( “Got too many exceptions to achieve quorum size 2/3"):
○ Stop-dfs.sh
○ on all Journal nodes: “hadoop-daemon.sh start journalnode “
○ Start-dfs.sh
○ Activate a NN: “hdfs haadmin -failover nn2 nn1”
○ Confirm NN is active via: “hdfs haadmin -getServiceState nn1”
A quick word about Failover flow
If the first NameNode is in the Active state, an attempt will be made to gracefully transition it to the
Standby state. If this fails, the fencing methods (as configured by dfs.ha.fencing.methods) will be
attempted in order until one succeeds. Only after this process will the second NameNode be
transitioned to the Active state. If the fencing methods fail, the second NameNode is not transitioned
to Active state and an error is returned.
If the first NameNode is in the Standby state, the command will execute and say the failover is
successful, but no failover occurs and the two NameNodes will remain in their original states.
Corrupted your Testing cluster?
● Delete your namenode folder
● Delete your datanode folder
● Delete your journalnode folder
● Start again…
● Yes, you need to format the cluster again.

More Related Content

What's hot

Hadoop ha system admin
Hadoop ha system adminHadoop ha system admin
Hadoop ha system admin
Trieu Dao Minh
 
Introduction to hadoop administration jk
Introduction to hadoop administration   jkIntroduction to hadoop administration   jk
Introduction to hadoop administration jkEdureka!
 
Strata + Hadoop World 2012: HDFS: Now and Future
Strata + Hadoop World 2012: HDFS: Now and FutureStrata + Hadoop World 2012: HDFS: Now and Future
Strata + Hadoop World 2012: HDFS: Now and Future
Cloudera, Inc.
 
Hug Hbase Presentation.
Hug Hbase Presentation.Hug Hbase Presentation.
Hug Hbase Presentation.Jack Levin
 
What's New and Upcoming in HDFS - the Hadoop Distributed File System
What's New and Upcoming in HDFS - the Hadoop Distributed File SystemWhat's New and Upcoming in HDFS - the Hadoop Distributed File System
What's New and Upcoming in HDFS - the Hadoop Distributed File System
Cloudera, Inc.
 
HBase Sizing Notes
HBase Sizing NotesHBase Sizing Notes
HBase Sizing Noteslarsgeorge
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guide
larsgeorge
 
Improving Hadoop Performance via Linux
Improving Hadoop Performance via LinuxImproving Hadoop Performance via Linux
Improving Hadoop Performance via Linux
Alex Moundalexis
 
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance Tuning
Lars Hofhansl
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
Alex Moundalexis
 
Hw09 Monitoring Best Practices
Hw09   Monitoring Best PracticesHw09   Monitoring Best Practices
Hw09 Monitoring Best PracticesCloudera, Inc.
 
Postgres in Amazon RDS
Postgres in Amazon RDSPostgres in Amazon RDS
Postgres in Amazon RDS
Denish Patel
 
Red Hat Storage Server Administration Deep Dive
Red Hat Storage Server Administration Deep DiveRed Hat Storage Server Administration Deep Dive
Red Hat Storage Server Administration Deep Dive
Red_Hat_Storage
 
Hadoop single node installation on ubuntu 14
Hadoop single node installation on ubuntu 14Hadoop single node installation on ubuntu 14
Hadoop single node installation on ubuntu 14
jijukjoseph
 
HBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ SalesforceHBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon
 
HBase Storage Internals
HBase Storage InternalsHBase Storage Internals
HBase Storage Internals
DataWorks Summit
 
Meet HBase 1.0
Meet HBase 1.0Meet HBase 1.0
Meet HBase 1.0
enissoz
 
Hadoop HDFS Architeture and Design
Hadoop HDFS Architeture and DesignHadoop HDFS Architeture and Design
Hadoop HDFS Architeture and Design
sudhakara st
 
Hdfs ha using journal nodes
Hdfs ha using journal nodesHdfs ha using journal nodes
Hdfs ha using journal nodesEvans Ye
 
MapR Tutorial Series
MapR Tutorial SeriesMapR Tutorial Series
MapR Tutorial Series
selvaraaju
 

What's hot (20)

Hadoop ha system admin
Hadoop ha system adminHadoop ha system admin
Hadoop ha system admin
 
Introduction to hadoop administration jk
Introduction to hadoop administration   jkIntroduction to hadoop administration   jk
Introduction to hadoop administration jk
 
Strata + Hadoop World 2012: HDFS: Now and Future
Strata + Hadoop World 2012: HDFS: Now and FutureStrata + Hadoop World 2012: HDFS: Now and Future
Strata + Hadoop World 2012: HDFS: Now and Future
 
Hug Hbase Presentation.
Hug Hbase Presentation.Hug Hbase Presentation.
Hug Hbase Presentation.
 
What's New and Upcoming in HDFS - the Hadoop Distributed File System
What's New and Upcoming in HDFS - the Hadoop Distributed File SystemWhat's New and Upcoming in HDFS - the Hadoop Distributed File System
What's New and Upcoming in HDFS - the Hadoop Distributed File System
 
HBase Sizing Notes
HBase Sizing NotesHBase Sizing Notes
HBase Sizing Notes
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guide
 
Improving Hadoop Performance via Linux
Improving Hadoop Performance via LinuxImproving Hadoop Performance via Linux
Improving Hadoop Performance via Linux
 
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance Tuning
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
Hw09 Monitoring Best Practices
Hw09   Monitoring Best PracticesHw09   Monitoring Best Practices
Hw09 Monitoring Best Practices
 
Postgres in Amazon RDS
Postgres in Amazon RDSPostgres in Amazon RDS
Postgres in Amazon RDS
 
Red Hat Storage Server Administration Deep Dive
Red Hat Storage Server Administration Deep DiveRed Hat Storage Server Administration Deep Dive
Red Hat Storage Server Administration Deep Dive
 
Hadoop single node installation on ubuntu 14
Hadoop single node installation on ubuntu 14Hadoop single node installation on ubuntu 14
Hadoop single node installation on ubuntu 14
 
HBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ SalesforceHBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ Salesforce
 
HBase Storage Internals
HBase Storage InternalsHBase Storage Internals
HBase Storage Internals
 
Meet HBase 1.0
Meet HBase 1.0Meet HBase 1.0
Meet HBase 1.0
 
Hadoop HDFS Architeture and Design
Hadoop HDFS Architeture and DesignHadoop HDFS Architeture and Design
Hadoop HDFS Architeture and Design
 
Hdfs ha using journal nodes
Hdfs ha using journal nodesHdfs ha using journal nodes
Hdfs ha using journal nodes
 
MapR Tutorial Series
MapR Tutorial SeriesMapR Tutorial Series
MapR Tutorial Series
 

Viewers also liked

Érzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzésÉrzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzés
Zoltan Varju
 
From Old School to Cutting Edge: How Booker Leveraged Content for Killer Results
From Old School to Cutting Edge: How Booker Leveraged Content for Killer ResultsFrom Old School to Cutting Edge: How Booker Leveraged Content for Killer Results
From Old School to Cutting Edge: How Booker Leveraged Content for Killer Results
Uberflip
 
7 Ab Brain Cytochrome Oxidase Subunit Complementary DNAs
7 Ab Brain Cytochrome Oxidase Subunit Complementary DNAs7 Ab Brain Cytochrome Oxidase Subunit Complementary DNAs
7 Ab Brain Cytochrome Oxidase Subunit Complementary DNAsMary Mullen
 
How to Engage, Generate, and Qualify More Leads Using Interactive Content
How to Engage, Generate, and Qualify More Leads Using Interactive ContentHow to Engage, Generate, and Qualify More Leads Using Interactive Content
How to Engage, Generate, and Qualify More Leads Using Interactive Content
Uberflip
 
Whispering Woods_booklet
Whispering Woods_bookletWhispering Woods_booklet
Whispering Woods_bookletNina Bambrey
 
Webservices intro
Webservices introWebservices intro
Webservices intro
Srikrishna k
 
The Science of Content
The Science of ContentThe Science of Content
The Science of Content
Uberflip
 
Avanade Stageopdrachten
Avanade StageopdrachtenAvanade Stageopdrachten
Avanade Stageopdrachten
Avanade Nederland
 
Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...
Omid Vahdaty
 
Graduate from Email Marketing to Marketing Automation
Graduate from Email Marketing to Marketing AutomationGraduate from Email Marketing to Marketing Automation
Graduate from Email Marketing to Marketing Automation
Marketo
 
The Uberflip Experience 2016: Yoav Schwartz
The Uberflip Experience 2016: Yoav SchwartzThe Uberflip Experience 2016: Yoav Schwartz
The Uberflip Experience 2016: Yoav Schwartz
Uberflip
 
Real-Time Personalization: Top 5 Use Cases to Boost Conversions
Real-Time Personalization: Top 5 Use Cases to Boost ConversionsReal-Time Personalization: Top 5 Use Cases to Boost Conversions
Real-Time Personalization: Top 5 Use Cases to Boost ConversionsMarketo
 
The Rise of Attention Based Marketing: How to Turn Attention into Meaningful ...
The Rise of Attention Based Marketing: How to Turn Attention into Meaningful ...The Rise of Attention Based Marketing: How to Turn Attention into Meaningful ...
The Rise of Attention Based Marketing: How to Turn Attention into Meaningful ...
Uberflip
 
10 event trends 2017
10 event trends 201710 event trends 2017
10 event trends 2017
VIVA Marketing
 
How to Get the Most Out of Marketo Summit 2016
How to Get the Most Out of Marketo Summit 2016How to Get the Most Out of Marketo Summit 2016
How to Get the Most Out of Marketo Summit 2016
LeadMD
 
Account-Based Marketing 101: A Marketo Case Study
Account-Based Marketing 101: A Marketo Case StudyAccount-Based Marketing 101: A Marketo Case Study
Account-Based Marketing 101: A Marketo Case Study
Marketo
 
Ansible, best practices
Ansible, best practicesAnsible, best practices
Ansible, best practices
Bas Meijer
 
Personas and Content Marketing
Personas and Content MarketingPersonas and Content Marketing
Personas and Content Marketing
Marketo
 

Viewers also liked (19)

Érzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzésÉrzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzés
 
From Old School to Cutting Edge: How Booker Leveraged Content for Killer Results
From Old School to Cutting Edge: How Booker Leveraged Content for Killer ResultsFrom Old School to Cutting Edge: How Booker Leveraged Content for Killer Results
From Old School to Cutting Edge: How Booker Leveraged Content for Killer Results
 
7 Ab Brain Cytochrome Oxidase Subunit Complementary DNAs
7 Ab Brain Cytochrome Oxidase Subunit Complementary DNAs7 Ab Brain Cytochrome Oxidase Subunit Complementary DNAs
7 Ab Brain Cytochrome Oxidase Subunit Complementary DNAs
 
How to Engage, Generate, and Qualify More Leads Using Interactive Content
How to Engage, Generate, and Qualify More Leads Using Interactive ContentHow to Engage, Generate, and Qualify More Leads Using Interactive Content
How to Engage, Generate, and Qualify More Leads Using Interactive Content
 
Whispering Woods_booklet
Whispering Woods_bookletWhispering Woods_booklet
Whispering Woods_booklet
 
Webservices intro
Webservices introWebservices intro
Webservices intro
 
RMIT15
RMIT15RMIT15
RMIT15
 
The Science of Content
The Science of ContentThe Science of Content
The Science of Content
 
Avanade Stageopdrachten
Avanade StageopdrachtenAvanade Stageopdrachten
Avanade Stageopdrachten
 
Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...
 
Graduate from Email Marketing to Marketing Automation
Graduate from Email Marketing to Marketing AutomationGraduate from Email Marketing to Marketing Automation
Graduate from Email Marketing to Marketing Automation
 
The Uberflip Experience 2016: Yoav Schwartz
The Uberflip Experience 2016: Yoav SchwartzThe Uberflip Experience 2016: Yoav Schwartz
The Uberflip Experience 2016: Yoav Schwartz
 
Real-Time Personalization: Top 5 Use Cases to Boost Conversions
Real-Time Personalization: Top 5 Use Cases to Boost ConversionsReal-Time Personalization: Top 5 Use Cases to Boost Conversions
Real-Time Personalization: Top 5 Use Cases to Boost Conversions
 
The Rise of Attention Based Marketing: How to Turn Attention into Meaningful ...
The Rise of Attention Based Marketing: How to Turn Attention into Meaningful ...The Rise of Attention Based Marketing: How to Turn Attention into Meaningful ...
The Rise of Attention Based Marketing: How to Turn Attention into Meaningful ...
 
10 event trends 2017
10 event trends 201710 event trends 2017
10 event trends 2017
 
How to Get the Most Out of Marketo Summit 2016
How to Get the Most Out of Marketo Summit 2016How to Get the Most Out of Marketo Summit 2016
How to Get the Most Out of Marketo Summit 2016
 
Account-Based Marketing 101: A Marketo Case Study
Account-Based Marketing 101: A Marketo Case StudyAccount-Based Marketing 101: A Marketo Case Study
Account-Based Marketing 101: A Marketo Case Study
 
Ansible, best practices
Ansible, best practicesAnsible, best practices
Ansible, best practices
 
Personas and Content Marketing
Personas and Content MarketingPersonas and Content Marketing
Personas and Content Marketing
 

Similar to Introduction to hadoop high availability

#OktoCampus - Workshop : An introduction to Ansible
#OktoCampus - Workshop : An introduction to Ansible#OktoCampus - Workshop : An introduction to Ansible
#OktoCampus - Workshop : An introduction to Ansible
Cédric Delgehier
 
Configuring and manipulating HDFS files
Configuring and manipulating HDFS filesConfiguring and manipulating HDFS files
Configuring and manipulating HDFS files
Rupak Roy
 
Webinar: Top 5 Hadoop Admin Tasks
Webinar: Top 5 Hadoop Admin TasksWebinar: Top 5 Hadoop Admin Tasks
Webinar: Top 5 Hadoop Admin Tasks
Edureka!
 
Top 5 Hadoop Admin Tasks
Top 5 Hadoop Admin TasksTop 5 Hadoop Admin Tasks
Top 5 Hadoop Admin Tasks
Edureka!
 
Upgrading from HDP 2.1 to HDP 2.2
Upgrading from HDP 2.1 to HDP 2.2Upgrading from HDP 2.1 to HDP 2.2
Upgrading from HDP 2.1 to HDP 2.2
SATOSHI TAGOMORI
 
13160296.ppt
13160296.ppt13160296.ppt
13160296.ppt
snowflakebatch
 
Facing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoopFacing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoop
fann wu
 
Hadoop Interacting with HDFS
Hadoop Interacting with HDFSHadoop Interacting with HDFS
Hadoop Interacting with HDFS
Apache Apex
 
Hadoop operations basic
Hadoop operations basicHadoop operations basic
Hadoop operations basicHafizur Rahman
 
Configure h base hadoop and hbase client
Configure h base hadoop and hbase clientConfigure h base hadoop and hbase client
Configure h base hadoop and hbase client
Shashwat Shriparv
 
Hadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers by rohit kapaHadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers by rohit kapa
kapa rohit
 
Hadoop 20111215
Hadoop 20111215Hadoop 20111215
Hadoop 20111215
exsuns
 
Hadoop 2.4 installing on ubuntu 14.04
Hadoop 2.4 installing on ubuntu 14.04Hadoop 2.4 installing on ubuntu 14.04
Hadoop 2.4 installing on ubuntu 14.04
baabtra.com - No. 1 supplier of quality freshers
 
Deployment and Management of Hadoop Clusters
Deployment and Management of Hadoop ClustersDeployment and Management of Hadoop Clusters
Deployment and Management of Hadoop Clusters
Amal G Jose
 
Hadoop administration
Hadoop administrationHadoop administration
Hadoop administration
Aneesh Pulickal Karunakaran
 
Strata - 03/31/2012
Strata - 03/31/2012Strata - 03/31/2012
Strata - 03/31/2012
Ceph Community
 
Interacting with hdfs
Interacting with hdfsInteracting with hdfs
Interacting with hdfs
Pradeep Kumbhar
 
Infrastructure Around Hadoop
Infrastructure Around HadoopInfrastructure Around Hadoop
Infrastructure Around HadoopDataWorks Summit
 
Hadoop architecture by ajay
Hadoop architecture by ajayHadoop architecture by ajay
Hadoop architecture by ajay
Hadoop online training
 

Similar to Introduction to hadoop high availability (20)

#OktoCampus - Workshop : An introduction to Ansible
#OktoCampus - Workshop : An introduction to Ansible#OktoCampus - Workshop : An introduction to Ansible
#OktoCampus - Workshop : An introduction to Ansible
 
Hadoop2
Hadoop2Hadoop2
Hadoop2
 
Configuring and manipulating HDFS files
Configuring and manipulating HDFS filesConfiguring and manipulating HDFS files
Configuring and manipulating HDFS files
 
Webinar: Top 5 Hadoop Admin Tasks
Webinar: Top 5 Hadoop Admin TasksWebinar: Top 5 Hadoop Admin Tasks
Webinar: Top 5 Hadoop Admin Tasks
 
Top 5 Hadoop Admin Tasks
Top 5 Hadoop Admin TasksTop 5 Hadoop Admin Tasks
Top 5 Hadoop Admin Tasks
 
Upgrading from HDP 2.1 to HDP 2.2
Upgrading from HDP 2.1 to HDP 2.2Upgrading from HDP 2.1 to HDP 2.2
Upgrading from HDP 2.1 to HDP 2.2
 
13160296.ppt
13160296.ppt13160296.ppt
13160296.ppt
 
Facing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoopFacing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoop
 
Hadoop Interacting with HDFS
Hadoop Interacting with HDFSHadoop Interacting with HDFS
Hadoop Interacting with HDFS
 
Hadoop operations basic
Hadoop operations basicHadoop operations basic
Hadoop operations basic
 
Configure h base hadoop and hbase client
Configure h base hadoop and hbase clientConfigure h base hadoop and hbase client
Configure h base hadoop and hbase client
 
Hadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers by rohit kapaHadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers by rohit kapa
 
Hadoop 20111215
Hadoop 20111215Hadoop 20111215
Hadoop 20111215
 
Hadoop 2.4 installing on ubuntu 14.04
Hadoop 2.4 installing on ubuntu 14.04Hadoop 2.4 installing on ubuntu 14.04
Hadoop 2.4 installing on ubuntu 14.04
 
Deployment and Management of Hadoop Clusters
Deployment and Management of Hadoop ClustersDeployment and Management of Hadoop Clusters
Deployment and Management of Hadoop Clusters
 
Hadoop administration
Hadoop administrationHadoop administration
Hadoop administration
 
Strata - 03/31/2012
Strata - 03/31/2012Strata - 03/31/2012
Strata - 03/31/2012
 
Interacting with hdfs
Interacting with hdfsInteracting with hdfs
Interacting with hdfs
 
Infrastructure Around Hadoop
Infrastructure Around HadoopInfrastructure Around Hadoop
Infrastructure Around Hadoop
 
Hadoop architecture by ajay
Hadoop architecture by ajayHadoop architecture by ajay
Hadoop architecture by ajay
 

More from Omid Vahdaty

Data Pipline Observability meetup
Data Pipline Observability meetup Data Pipline Observability meetup
Data Pipline Observability meetup
Omid Vahdaty
 
Couchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data DemystifiedCouchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data Demystified
Omid Vahdaty
 
Machine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedMachine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data Demystified
Omid Vahdaty
 
Machine Learning Essentials Demystified part1 | Big Data Demystified
Machine Learning Essentials Demystified part1 | Big Data DemystifiedMachine Learning Essentials Demystified part1 | Big Data Demystified
Machine Learning Essentials Demystified part1 | Big Data Demystified
Omid Vahdaty
 
The technology of fake news between a new front and a new frontier | Big Dat...
The technology of fake news  between a new front and a new frontier | Big Dat...The technology of fake news  between a new front and a new frontier | Big Dat...
The technology of fake news between a new front and a new frontier | Big Dat...
Omid Vahdaty
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Omid Vahdaty
 
Making your analytics talk business | Big Data Demystified
Making your analytics talk business | Big Data DemystifiedMaking your analytics talk business | Big Data Demystified
Making your analytics talk business | Big Data Demystified
Omid Vahdaty
 
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
Omid Vahdaty
 
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
Omid Vahdaty
 
Aerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data DemystifiedAerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data Demystified
Omid Vahdaty
 
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
Omid Vahdaty
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
Omid Vahdaty
 
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | EnglishAWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
Omid Vahdaty
 
AWS Big Data Demystified #4 data governance demystified [security, networ...
AWS Big Data Demystified #4   data governance demystified   [security, networ...AWS Big Data Demystified #4   data governance demystified   [security, networ...
AWS Big Data Demystified #4 data governance demystified [security, networ...
Omid Vahdaty
 
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
Omid Vahdaty
 
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
Omid Vahdaty
 
Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...
Omid Vahdaty
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned
Omid Vahdaty
 
Emr spark tuning demystified
Emr spark tuning demystifiedEmr spark tuning demystified
Emr spark tuning demystified
Omid Vahdaty
 
Emr zeppelin & Livy demystified
Emr zeppelin & Livy demystifiedEmr zeppelin & Livy demystified
Emr zeppelin & Livy demystified
Omid Vahdaty
 

More from Omid Vahdaty (20)

Data Pipline Observability meetup
Data Pipline Observability meetup Data Pipline Observability meetup
Data Pipline Observability meetup
 
Couchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data DemystifiedCouchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data Demystified
 
Machine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedMachine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data Demystified
 
Machine Learning Essentials Demystified part1 | Big Data Demystified
Machine Learning Essentials Demystified part1 | Big Data DemystifiedMachine Learning Essentials Demystified part1 | Big Data Demystified
Machine Learning Essentials Demystified part1 | Big Data Demystified
 
The technology of fake news between a new front and a new frontier | Big Dat...
The technology of fake news  between a new front and a new frontier | Big Dat...The technology of fake news  between a new front and a new frontier | Big Dat...
The technology of fake news between a new front and a new frontier | Big Dat...
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
 
Making your analytics talk business | Big Data Demystified
Making your analytics talk business | Big Data DemystifiedMaking your analytics talk business | Big Data Demystified
Making your analytics talk business | Big Data Demystified
 
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
 
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
 
Aerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data DemystifiedAerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data Demystified
 
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
 
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | EnglishAWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
 
AWS Big Data Demystified #4 data governance demystified [security, networ...
AWS Big Data Demystified #4   data governance demystified   [security, networ...AWS Big Data Demystified #4   data governance demystified   [security, networ...
AWS Big Data Demystified #4 data governance demystified [security, networ...
 
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
 
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
 
Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned
 
Emr spark tuning demystified
Emr spark tuning demystifiedEmr spark tuning demystified
Emr spark tuning demystified
 
Emr zeppelin & Livy demystified
Emr zeppelin & Livy demystifiedEmr zeppelin & Livy demystified
Emr zeppelin & Livy demystified
 

Recently uploaded

Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
Kerry Sado
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
Robbie Edward Sayers
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
VENKATESHvenky89705
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
karthi keyan
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
zwunae
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Dr.Costas Sachpazis
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
Aditya Rajan Patra
 
space technology lecture notes on satellite
space technology lecture notes on satellitespace technology lecture notes on satellite
space technology lecture notes on satellite
ongomchris
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
Pratik Pawar
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
BrazilAccount1
 

Recently uploaded (20)

Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
 
space technology lecture notes on satellite
space technology lecture notes on satellitespace technology lecture notes on satellite
space technology lecture notes on satellite
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
 

Introduction to hadoop high availability

  • 1. Introduction to Hadoop High Availability Omid Vahdaty, DevOps Ninja
  • 2. Prerequisites ● Be sure you know: ○ This is not an installation manual. ○ how to install an hadoop multi node cluster (single namenode + secondary) ○ 2 different ways to install Hadoop HA ■ HA with NFS ■ HA with QJM (Qurum Journal Manager) ○ High availability Concepts: ■ Having Secondary Namenode is NOT HA. ■ Active/standby = Active/Passive != Active/Active ■ Main Concern: Split Brain Scenario. Common solution: STONITH fencing method.
  • 4. Hdfs-site.xml additions to HA (both methods) <property> <name>dfs.nameservices</name> <value>mycluster</value> </property> <property> <name>dfs.ha.namenodes.mycluster</name> <value>nn1,nn2</value> </property> <property> <name>dfs.namenode.rpc-address.mycluster.nn1</name> <value>machine1.example.com:8020</value> </property> <property> <name>dfs.namenode.rpc-address.mycluster.nn2</name> <value>machine2.example.com:8020</value> </property> <property> <name>dfs.namenode.http-address.mycluster.nn1</name> <value>machine1.example.com:50070</value> </property> <property> <name>dfs.namenode.http-address.mycluster.nn2</name> <value>machine2.example.com:50070</value> </property>
  • 5. hdfs-site.xml additions to HA with NFS method <property> <name>dfs.namenode.shared.edits.dir</name> <value>file:///mnt/namenode</value> </property> <property> <name>dfs.ha.fencing.methods</name> <value>sshfence</value> </property> <property> <name>dfs.ha.fencing.ssh.private-key-files</name> <value>/home/hadoopuser/.ssh/id_rsa</value> </property>
  • 7. HA with Qurum Journal Manager
  • 8. Hdfs-site.xml additions to HA with QJM <property> <name>dfs.namenode.shared.edits.dir</name> <value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/m ycluster</value> </property>
  • 9. Core-site.xml addition with HA QJM method <property> <name>fs.defaultFS</name> <value>hdfs://mycluster</value> </property> <property> <name>dfs.journalnode.edits.dir</name> <value>/path/to/journal/node/local/data</value> </property>
  • 10. Before you start experimenting... 1. Be sure to understand the format NN flow. 2. Learn how to “upgrade” from one NN to 2 NN 3. Test your cluster is healthy (cluster up, does not mean no ERROR in the logs of each component) 4. Make sure you understand format section in the APACHE documentation of HA installation carefully. 5. Play with some Scenarios: a. Manual Failover: hdfs haadmin -failover nn2 nn1
  • 11. Be Careful of twilight zone 1. Cluster appears to be up(no errors), but both NN are in Standby (default when starting cluster) 2. Datanode appears to be up(no errors), but Datanode is out of sync with NN.
  • 12. Format new cluster ● “hadoop-daemon.sh start journalnode” (on all QJM machines) ● “hdfs namenode -format” (for fresh cluster) ● “hdfs namenode -bootstrapStandby” on the unformatted NameNode (on existing cluster only) ● “hdfs namenode -initializeSharedEdits”, which will initialize the JournalNodes with the edits data from the local NameNode edits directories. ● “hadoop-daemon.sh start namenode” ● Read logs - JPS is not enough ● Check NN state (make sure you have at least one in active) ○ hdfs haadmin -failover nn2 nn1
  • 13. Starting a cluster ● start-dfs.sh ● Activate a NN: hdfs haadmin -failover nn2 nn1 ● If Quorum not achieved( “Got too many exceptions to achieve quorum size 2/3"): ○ Stop-dfs.sh ○ on all Journal nodes: “hadoop-daemon.sh start journalnode “ ○ Start-dfs.sh ○ Activate a NN: “hdfs haadmin -failover nn2 nn1” ○ Confirm NN is active via: “hdfs haadmin -getServiceState nn1”
  • 14. A quick word about Failover flow If the first NameNode is in the Active state, an attempt will be made to gracefully transition it to the Standby state. If this fails, the fencing methods (as configured by dfs.ha.fencing.methods) will be attempted in order until one succeeds. Only after this process will the second NameNode be transitioned to the Active state. If the fencing methods fail, the second NameNode is not transitioned to Active state and an error is returned. If the first NameNode is in the Standby state, the command will execute and say the failover is successful, but no failover occurs and the two NameNodes will remain in their original states.
  • 15. Corrupted your Testing cluster? ● Delete your namenode folder ● Delete your datanode folder ● Delete your journalnode folder ● Start again… ● Yes, you need to format the cluster again.