[OSDC 2013] Sharing Experience with Hadoop Cluster HA

Sharing Experience with Hadoop Cluster HA
Etu 韓祖棻
jerryhan@etusolution.com

Who am I
韓祖棻 (Jerry), Technical Manager, Etu
• Database Management
• Windows/Linux Application Developer
• Web Developer
• Developer of Etu
jerryhan@etusolution.com


Agenda
• Background
• Facebook Namenode High Availability
• Hadoop 1.0 Namenode High Availability
• Hortonworks High Availability
• Cloudera High Availability
• Etu Appliance High Availability
• Conclusion


The Hadoop Ecosystem
[Figure: the Hadoop ecosystem. Components shown: ETL Tools, BI Reporting, RDBMS, Zookeeper, Pig, HiveQL, Mahout, MapReduce, Hive Meta Store, HBase, Avro (Serialization), and HDFS (Hadoop Distributed File System), grouped under the labels "Data Processing Layer" and "Data Store".]

HDFS Architecture (Master/Slave)
• An HDFS cluster consists of a single Namenode.
• The Namenode was a single point of failure (SPOF) in an HDFS cluster.
[Figure: HDFS architecture. Labels: Client (Metadata ops, Read, Write); Namenode with Metadata(Name, replicas..) (/var/disk/data, 1..; Datanodes on Rack1 and Rack2 holding Blocks; Block ops; replication.]

Facebook Namenode High Availability

Hadoop 1.0 Namenode High Availability

Backup Namenode Approach

• Use case 3f:
– Active is running; Standby is down for maintenance. Active dies and cannot start. Standby is started and takes over as active.

Hortonworks High Availability

HDP's Full-Stack HA Architecture

HA for HDFS NameNode Using VMware

Do not use the NameNode VM for running any other master daemon.

HA for Hadoop Using RHEL (v5.x, v6.x)


Cloudera High Availability

Shared Storage Using NFS (After CDH 4.0)
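[Figure: HA with shared storage on NFS. Three ZK nodes exchange heartbeats with two FailoverControllers (one Active, one Standby). Each FailoverController monitors the health of its NN, OS, and HW. The Active NN and the Standby NN share NN state through storage with a single writer (fencing); the shared storage is labeled SPOF. Three DNs are shown below the NNs.]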
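The role of the FailoverController in this design can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the actual controller code: `LockService` is a hypothetical in-memory stand-in for a ZooKeeper lock, and the health check is a plain callable. Each controller checks the health of its own node, gives up the lock when the node is unhealthy, and becomes active only while it holds the lock.

```python
class LockService:
    """Hypothetical in-memory stand-in for a ZooKeeper lock."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, node):
        # Grant the lock if it is free; report whether `node` holds it.
        if self.holder is None:
            self.holder = node
        return self.holder == node

    def release(self, node):
        if self.holder == node:
            self.holder = None


class FailoverController:
    """Simplified controller: one health check and one role decision per tick."""

    def __init__(self, name, lock, is_healthy):
        self.name = name
        self.lock = lock
        self.is_healthy = is_healthy  # callable returning True or False
        self.state = "standby"

    def tick(self):
        if not self.is_healthy():
            # An unhealthy node gives up the lock so the peer can take over.
            self.lock.release(self.name)
            self.state = "standby"
        elif self.lock.try_acquire(self.name):
            self.state = "active"
        else:
            self.state = "standby"
        return self.state


health = {"nn1": True, "nn2": True}
lock = LockService()
fc1 = FailoverController("nn1", lock, lambda: health["nn1"])
fc2 = FailoverController("nn2", lock, lambda: health["nn2"])

before = (fc1.tick(), fc2.tick())
health["nn1"] = False  # simulate a failure on the first node
after = (fc1.tick(), fc2.tick())
print(before, after)
```

In a real deployment the lock would disappear on its own when the holder's session to ZooKeeper is lost, which also covers the case where the whole machine goes down and cannot release anything itself.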

Quorum-based Storage (After CDH 4.1)
[Figure: HA with quorum-based storage. Three ZK nodes exchange heartbeats with two FailoverControllers (one Active, one Standby), each monitoring the health of its NN, OS, and HW. The Active NN and the Standby NN each use a QJM to reach a group of Journal Nodes (JN). Three DNs are shown below the NNs.]
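The idea behind quorum-based storage is that an edit counts as written once a majority of Journal Nodes has stored it, so a minority of Journal Nodes can fail without blocking the Active NN. The following is a minimal sketch of that majority rule only; the `JournalNode` class and `quorum_write` function are hypothetical names for illustration and leave out recovery and epoch handling.

```python
class JournalNode:
    """Hypothetical journal node that stores edits while it is up."""

    def __init__(self, up=True):
        self.up = up
        self.edits = []

    def append(self, txid, edit):
        if not self.up:
            return False
        self.edits.append((txid, edit))
        return True


def quorum_write(journal_nodes, txid, edit):
    """Return True if a strict majority of journal nodes stored the edit."""
    acks = sum(1 for jn in journal_nodes if jn.append(txid, edit))
    return acks > len(journal_nodes) // 2


jns = [JournalNode(), JournalNode(), JournalNode()]
ok_all_up = quorum_write(jns, 1, "mkdir /a")

jns[0].up = False  # one of three nodes down: a majority is still reachable
ok_one_down = quorum_write(jns, 2, "mkdir /b")

jns[1].up = False  # two of three nodes down: no majority
ok_two_down = quorum_write(jns, 3, "mkdir /c")

print(ok_all_up, ok_one_down, ok_two_down)
```

With three Journal Nodes the cluster tolerates one failure; with five it tolerates two, which is why an odd number of nodes is the usual choice.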

Etu Appliance High Availability

Summary of previous solutions

Solution                                            Auto Failover  HA Type        External Storage
Facebook: Avatar Node                               X              Namenode       ○
Apache Hadoop 1.0: Backup Namenode                  X              Namenode       ○
Hortonworks: VMware (*1)                            ○              Namenode       ○
Hortonworks: RHEL (*2)                              ○              System-wide    ○
Cloudera (Apache Hadoop 2.X): Shared Storage        ○              Namenode (*3)  Optional
Cloudera (Apache Hadoop 2.X): Quorum-based Storage  ○              Namenode (*3)  Optional

*1: 2 ESX Servers + SAN Arch. (vSphere HA Cluster)
*2: RHEL Cluster HA and Power Fencing Device
*3: Implementing the Fencing Method for System-wide HA.

Two Roles

[Figure: a cluster with two Master nodes and three Workers.]

Services on Master and Workers
                            Master               Worker
Hadoop Ecosystem Services   Name Node            Data Node
                            Job Tracker          Task Tracker
                            HBase Master         Region Server
                            Zookeeper (Leader)   Zookeeper
                            Hive
System Services             MySQL/PostgreSQL     Syslog
                            Kerberos
                            NTP Server
                            Syslog

HA Architecture (Active/Standby)

[Figure: Active/Standby architecture. Labels: Active Etu Master (Service Disablement); Standby Etu Master (Service Enablement); Cluster-ware; Big-Data Services Failover; Synchronised File System; Heartbeat (two links); Orchestration Services (fully redundant network); three Etu Workers.]

HA based on CDH4.0.1
[Figure: HA based on CDH4.0.1. Three ZK nodes exchange heartbeats with two FailoverControllers (one Active, one Standby), each monitoring the health of its NN, OS, and HW. The Active NN and the Standby NN are connected through a Synchronized File System. Three DNs are shown below the NNs.]

Data Synchronization
• Hadoop ecosystem
– Configurations are stored in Zookeeper
– Hive metadata is stored in PostgreSQL

• PostgreSQL
– Using PostgreSQL Replication

• User data
• System configurations or data
– PostgreSQL, Kerberos, NTP server, Syslog


Requirements
- HDFS service is running on the Active Master
- Zookeeper cluster is ready
- Standby Master is ready to activate the High Availability service

[Figure: Active Master, Standby Master, and two Workers, with a ZK Leader and two other ZK nodes placed on the machines.]

Failover Scenario
- Active Namenode service failure
- Active Namenode JVM failure
- Active ZKFC service failure
- Etu Active Master OS failure
- Etu Active Master machine power failure
- Failure of NIC cards on the Etu Active Master machine
- Network failure for the Etu Active Master machine

[Figure: Active Master, Standby Master, and three Workers.]

Design Details – Enabling HA
[Figure: Active Master and Standby Master. Labels: Namenode (on each Master); FC (on each Master); JT, HMaster, …; Kerberos, NTP, …; Kerberos, NTP, Syslog, …; DB Replication; edit logs.]

1. Stopping services dependent on HDFS (JobTracker, HMaster, …).
2. Stopping Namenode and Datanode services.
3. Configuring HDFS and FC service.
4. Creating Synchronized File System.
5. Initializing Synchronized File System for sharing edit logs.
6. Starting Active FC service.
7. Initializing Standby Master.
8. Starting Standby FC service.
9. Synchronizing system configurations and data.
10. Starting Active Namenode and Datanode services.
11. Starting Standby Namenode and Datanode services.
12. Checking services status.
13. Starting services dependent on HDFS (JobTracker, HMaster, …).
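The slides do not show how the one-click activation is implemented. As a rough illustration of why the order of the steps matters, here is a hypothetical step runner that executes actions strictly in sequence and stops at the first failure, so that later steps never run against a half-prepared system. The function `run_steps` and the placeholder actions `ok` and `fail` are assumptions made for this sketch.

```python
def run_steps(steps):
    """Run (name, action) pairs in order; stop at the first failing action.

    Returns the names of the completed steps and the name of the failed
    step (or None if every step succeeded).
    """
    done = []
    for name, action in steps:
        if not action():
            return done, name
        done.append(name)
    return done, None


def ok():
    return True  # placeholder for a step that succeeds


def fail():
    return False  # placeholder for a step that fails


# First few activation steps, with a simulated failure in step 4.
enable_ha = [
    ("stop services dependent on HDFS", ok),
    ("stop Namenode and Datanode services", ok),
    ("configure HDFS and FC service", ok),
    ("create synchronized file system", fail),
    ("initialize synchronized file system", ok),
]

done, failed_at = run_steps(enable_ha)
print(done, failed_at)
```

Returning the list of completed steps makes it possible to report progress or to undo them in reverse order if the activation has to be rolled back.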


Design Details - Failover
[Figure: failover from the Active Master to the Standby Master, which becomes the new Active Master. Labels: Fencing; Namenode; FC; JT, HMaster, …; Kerberos, NTP, …; Kerberos, NTP, Syslog, …; DB Replication; edit logs.]

1. Fencing Active Master from Standby Master.
a. Stopping network service.
b. Stopping Hadoop related services.
c. Stopping system services.
d. Configuring network environment.
e. Removing default services.
2. Stopping Standby FC service.
3. Stopping Standby Namenode service.
4. Removing Synchronized File System.
5. Removing DB Replication.

7. Transitioning Standby Master to Active Master.
a. Stopping network service.
b. Stopping system services.
c. Configuring network environment.
d. Configuring host information.
e. Configuring system services.
f. Starting network service.
g. Starting system services.
8. Configuring Hadoop related services.
9. Starting Namenode and Datanode services.
10. Starting Hadoop related services.
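The key property of the sequence above is that fencing comes first: the Standby Master takes over only after the old Active Master has been isolated, so two masters never serve at the same time. The sketch below models just that ordering. The `Master` class, the `can_be_fenced` flag, and the `fence` and `failover` functions are hypothetical and do not reflect the appliance's actual code.

```python
class Master:
    """Hypothetical master node with a role and a simulated fencing outcome."""

    def __init__(self, name, role):
        self.name = name
        self.role = role            # "active", "standby", or "fenced"
        self.can_be_fenced = True   # set to False to simulate a fencing failure


def fence(old_active):
    """Isolate the old active master; return False if fencing fails."""
    if not old_active.can_be_fenced:
        return False
    old_active.role = "fenced"
    return True


def failover(old_active, standby):
    """Promote the standby only after the old active has been fenced."""
    if not fence(old_active):
        return False  # abort: promoting now could leave two active masters
    standby.role = "active"
    return True


# Normal case: fencing succeeds, so the standby is promoted.
a1, s1 = Master("master1", "active"), Master("master2", "standby")
result_ok = failover(a1, s1)
roles_ok = (a1.role, s1.role)

# Failure case: fencing fails, so the roles stay unchanged.
a2, s2 = Master("master1", "active"), Master("master2", "standby")
a2.can_be_fenced = False
result_blocked = failover(a2, s2)
roles_blocked = (a2.role, s2.role)

print(result_ok, roles_ok, result_blocked, roles_blocked)
```

Aborting when fencing fails trades availability for safety: the cluster stays without a new active master until an operator intervenes, but the edit logs cannot be written by two masters at once.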


Use case: Active Namenode maintenance
- Stop NN
- Restart NN

[Figure: Active Master, Standby Master, and three Workers.]

Use case: Standby Master failure
- OS failure
- Power failure
- Failure of NICs
- Network failure

[Figure: Active Master, Standby Master, and three Workers.]

Use case: Cluster power failure

[Figure: Active Master, Standby Master, and three Workers.]

Use case: Cluster network failure

[Figure: Active Master, Standby Master, and three Workers.]

Demo: Non-HA (VM002)
Activating HA with One-Click

Demo: Activating (VM002 --- VM007)

Demo: Activating Done (VM002 – VM007)

Demo: Failover (VM002 -> VM007)

Demo: Failover Done (VM007)

Conclusion
• Leverages a synchronized file system to share Namenode edit logs and system data between Masters.
• Implements an improved fencing method to handle failover.
• Provides system-wide high availability, not only for the Hadoop Namenode service.

