[OSDC 2013] Sharing Experience with Hadoop Cluster HA



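Hadoop HA is a popular and important topic. Many current designs center on Namenode HA and then extend it to the Job Tracker and HMaster. However, when implementing Hadoop Cluster HA, considering only the Namenode, Job Tracker, and HMaster is still not rigorous enough. In a production environment, a Hadoop cluster usually also depends on services outside the Hadoop ecosystem, such as PostgreSQL, Kerberos, Puppet, and NTP. These services must be planned and designed together so that the Hadoop cluster still operates correctly after HA is triggered. This session introduces the Hadoop HA solutions from Apache Hadoop, Cloudera, and Hortonworks along with their latest developments, and shares Etu's first-hand approach to HA in the Etu Appliance.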

韓祖棻 (Jerry), currently Technical Manager at Etu

Published in: Technology


  1. Sharing Experience with Hadoop Cluster HA. Etu, 韓祖棻, jerryhan@etusolution.com
  2. Who am I
     韓祖棻 (Jerry), Technical Manager at Etu, jerryhan@etusolution.com
     • Database Management
     • Windows/Linux Application Developer
     • Web Developer
     • Developer of Etu
  3. Agenda
     • Background
     • Facebook Namenode High Availability
     • Hadoop 1.0 Namenode High Availability
     • Hortonworks High Availability
     • Cloudera High Availability
     • Etu Appliance High Availability
     • Conclusion
  4. Background
  5. The Hadoop Ecosystem
     Diagram labels: ETL Tools, BI Reporting, RDBMS, Zookeeper, Pig, HiveQL, Mahout, MapReduce, Hive Meta Store, HBase, Avro (Serialization), HDFS (Hadoop Distributed File System); layers: Data Processing Layer, Data Store.
  6. HDFS Architecture (Master/Slave)
     • An HDFS cluster consists of a single Namenode.
     • The Namenode was a single point of failure (SPOF) in an HDFS cluster.
     Diagram labels: Client, Metadata ops, Namenode, Metadata (Name, replicas..) (/var/disk/data, 1.., Block ops, Datanodes, Read, Write, Replication, Blocks, Rack1, Rack2.
  7. Facebook Namenode High Availability
  8. AvatarNode
  9. Hadoop 1.0 Namenode High Availability
  10. Backup Namenode Approach
      • Use case 3f:
        – Active running, Standby down for maintenance. Active dies and cannot start. Standby is started and takes over as active.
  11. Hortonworks High Availability
  12. HDP's Full-Stack HA Architecture
  13. HA for HDFS NameNode Using VMware
      Do not use the NameNode VM for running any other master daemon.
  14. HA for Hadoop Using RHEL (v5.x, v6.x)
  15. Cloudera High Availability
  16. Shared Storage Using NFS (After CDH 4.0)
      Diagram labels: ZK ×3; Heartbeat from each FailoverController (Active, Standby); each FailoverController monitors the health of NN, OS, HW; NN Active and NN Standby share NN state with a single writer (fencing); SPOF; DN ×3.
  17. Quorum-based Storage (After CDH 4.1)
      Diagram labels: ZK ×3; Heartbeat from each FailoverController (Active, Standby); each FailoverController monitors the health of NN, OS, HW; NN Active and NN Standby, each with a QJM; Journal Nodes (JN); DN ×3.
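The quorum idea on slide 17 can be illustrated with a small sketch. It assumes the usual majority rule for quorum journals: an edit counts as durable once more than half of the Journal Nodes acknowledge it. The `JournalNode` class and `quorum_write` function are hypothetical names for illustration only, not the Hadoop API.

```python
from dataclasses import dataclass, field


@dataclass
class JournalNode:
    """Hypothetical in-memory stand-in for a Journal Node (not the Hadoop API)."""
    name: str
    alive: bool = True
    edits: list = field(default_factory=list)

    def append(self, txid: int, edit: str) -> bool:
        # A node that is down cannot acknowledge the write.
        if not self.alive:
            return False
        self.edits.append((txid, edit))
        return True


def quorum_write(nodes, txid, edit):
    """Return True if a strict majority of journal nodes acknowledged the edit."""
    acks = sum(1 for node in nodes if node.append(txid, edit))
    return acks > len(nodes) // 2


if __name__ == "__main__":
    jns = [JournalNode("jn1"), JournalNode("jn2"), JournalNode("jn3")]
    print(quorum_write(jns, 1, "mkdir /data"))  # True: three of three acks
    jns[0].alive = False
    print(quorum_write(jns, 2, "mkdir /logs"))  # True: two of three acks
    jns[1].alive = False
    print(quorum_write(jns, 3, "mkdir /tmp"))   # False: one of three acks
```

With three Journal Nodes the write path tolerates the loss of one node, which is why this layout avoids the single shared-storage SPOF marked on the NFS slide.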
  18. Etu Appliance High Availability
  19. Summary of previous solutions
      Provider | Solution | Auto Failover | HA Type | External Storage
      Facebook | Avatar Node | X | Namenode | ○
      Apache Hadoop 1.0 | Backup Namenode | X | Namenode | ○
      Hortonworks | VMware (*1) | ○ | Namenode | ○
      Hortonworks | RHEL (*2) | ○ | System-wide | ○
      Cloudera (Apache Hadoop 2.X) | Shared Storage | ○ | Namenode (*3) | Optional
      Cloudera (Apache Hadoop 2.X) | Quorum-based Storage | ○ | Namenode (*3) | Optional
      *1. 2 ESX Servers + SAN Arch. (vSphere HA Cluster)
      *2. RHEL Cluster HA and Power Fencing Device
      *3. Implementing the Fencing Method for System-wide HA.
  20. Two Roles
      Diagram labels: Master node ×2, Worker ×3.
  21. Services on Master and Workers
      • Master
        – Hadoop Ecosystem Services: Name Node, Job Tracker, HBase Master, Zookeeper (Leader), Hive
        – System Services: MySQL/PostgreSQL, Kerberos, NTP Server, Syslog
      • Worker
        – Hadoop Ecosystem Services: Data Node, Task Tracker, Region Server, Zookeeper
        – System Services: Syslog
  22. HA Architecture (Active/Standby)
      Diagram labels: Active Etu Master (Service Disablement); Standby Etu Master (Service Enablement); Cluster-ware; Big-Data Services Failover; Synchronised File System; Heartbeat; Orchestration Services (fully redundant network); Etu Worker ×3.
  23. HA based on CDH4.0.1
      Diagram labels: ZK ×3; Heartbeat from each FailoverController (Active, Standby); each FailoverController monitors the health of NN, OS, HW; Synchronized File System; NN Active, NN Standby; DN ×3.
  24. Data Synchronization
      • Hadoop ecosystem
        – Configurations are stored in Zookeeper
        – Hive meta data is stored in PostgreSQL
      • PostgreSQL
        – Using PostgreSQL Replication
      • User data
      • System configurations or data
        – PostgreSQL, Kerberos, NTP server, Syslog
  25. Requirements
      - HDFS service is running on the Active Master
      - Zookeeper cluster is ready
      - Standby Master is ready to activate the High Availability service
      Diagram labels: Active Master, Standby Master, Worker ×2, ZK Leader, ZK ×2.
  26. Failover Scenario
      - Active Namenode service failure
      - Active Namenode JVM failure
      - Active ZKFC service failure
      - Etu Active Master OS failure
      - Etu Active Master machine power failure
      - Failure of NICs on the Etu Active Master machine
      - Network failure for the Etu Active Master machine
      Diagram labels: Active Master, Standby Master, Worker ×3.
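Several scenarios on slide 26 (OS, power, NIC, and network failure) look the same from the standby's side: heartbeats from the active master stop arriving. The following is a minimal sketch of that detection, assuming a fixed silence timeout. The slides do not state how the Etu Appliance detects these failures, so the `HeartbeatMonitor` class and its timeout value are hypothetical.

```python
class HeartbeatMonitor:
    """Hypothetical detector used by a standby master (illustration only)."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen = None  # time of the most recent heartbeat, if any

    def beat(self, now: float) -> None:
        """Record a heartbeat received from the active master at time `now`."""
        self.last_seen = now

    def is_failed(self, now: float) -> bool:
        """True once the active master has been silent longer than the timeout."""
        if self.last_seen is None:
            return False  # no heartbeat seen yet, so nothing to time out
        return (now - self.last_seen) > self.timeout_s


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=5.0)
    monitor.beat(now=100.0)
    print(monitor.is_failed(now=103.0))  # False: 3 s of silence
    print(monitor.is_failed(now=106.0))  # True: 6 s of silence
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the sketch deterministic and easy to test.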
  27. Design Details – Enabling HA
      1. Stopping services dependent on HDFS. (JobTracker, HMaster, …)
      2. Stopping Namenode and Datanode services.
      3. Configuring HDFS and FC service.
      4. Creating Synchronized File System.
      5. Initializing Synchronized File System for shared edit logs.
      6. Starting Active FC service.
      7. Initializing Standby Master.
      8. Starting Standby FC service.
      9. Synchronizing system configurations and data.
      10. Starting Active Namenode and Datanode services.
      11. Starting Standby Namenode and Datanode services.
      12. Checking Services Status.
      13. Starting services dependent on HDFS. (JobTracker, HMaster, …)
      Diagram labels: Active Master, Standby Master, Namenode, FC, JT, HMaster, …, Kerberos, NTP, Syslog, …, edit logs, DB Replication.
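The enabling sequence on slide 27 is strictly ordered, so it can be sketched as a list of steps run one after another. This is only an illustration: the step names are taken from the slide, while `run_steps`, the `execute` callback, and the stop-at-first-failure policy are assumptions, not the Etu implementation.

```python
ENABLE_HA_STEPS = [
    "Stopping services dependent on HDFS",
    "Stopping Namenode and Datanode services",
    "Configuring HDFS and FC service",
    "Creating Synchronized File System",
    "Initializing Synchronized File System for shared edit logs",
    "Starting Active FC service",
    "Initializing Standby Master",
    "Starting Standby FC service",
    "Synchronizing system configurations and data",
    "Starting Active Namenode and Datanode services",
    "Starting Standby Namenode and Datanode services",
    "Checking Services Status",
    "Starting services dependent on HDFS",
]


def run_steps(steps, execute):
    """Run steps in order; stop at the first failure (assumed policy).

    `execute` is a hypothetical callback that performs one step and
    returns True on success. Returns (completed_steps, failed), where
    `failed` is None or a (step_number, step_name) tuple.
    """
    completed = []
    for number, step in enumerate(steps, start=1):
        if not execute(step):
            return completed, (number, step)
        completed.append(step)
    return completed, None


if __name__ == "__main__":
    # Dry run: every step "succeeds" and is simply printed.
    def dry_run(step):
        print("running:", step)
        return True

    done, failed = run_steps(ENABLE_HA_STEPS, dry_run)
    print(len(done), "steps completed, failed step:", failed)
```

Returning the list of completed steps makes it possible to see how far the procedure got before a failure, which matters because the HDFS-dependent services stay stopped until the last step.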
  28. Design Details - Failover Fencing
      1. Fencing Active Master from Standby Master
         a. Stopping network service.
         b. Stopping Hadoop related services.
         c. Stopping system services.
         d. Configuring network environment.
         e. Removing default services.
      2. Stopping Standby FC service.
      3. Stopping Standby Namenode service.
      4. Removing Synchronized File System.
      5. Removing DB Replication.
      7. Transition Standby Master to Active Master.
         a. Stopping network service.
         b. Stopping system services.
         c. Configuring network environment.
         d. Configuring host information.
         e. Configuring system services.
         f. Starting network service.
         g. Starting system services.
      8. Configuring Hadoop related services.
      9. Starting Namenode and Datanode services.
      10. Starting Hadoop related services.
      Diagram labels: Active Master, Standby Master (becoming Active Master), Namenode, FC, JT, HMaster, …, Kerberos, NTP, Syslog, …, edit logs, DB Replication.
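The key ordering on slide 28 is that the old Active Master is fenced before the Standby Master is promoted, so that two masters never serve at the same time. A minimal sketch of that rule follows. The `Master` class, the role strings, and the `fence_fn` callback are hypothetical, and refusing to promote when fencing fails is an assumption rather than something the slides state.

```python
class Master:
    """Hypothetical master node with a role: 'active', 'standby', or 'fenced'."""

    def __init__(self, name: str, role: str) -> None:
        self.name = name
        self.role = role


def failover(active: Master, standby: Master, fence_fn) -> Master:
    """Promote `standby` only after `active` has been fenced.

    `fence_fn(master)` is a hypothetical callback that isolates the node
    (for example by stopping its network and Hadoop services) and returns
    True on success.
    """
    if not fence_fn(active):
        # Assumed policy: without a confirmed fence, promoting the standby
        # could leave two active masters, so the failover is aborted.
        raise RuntimeError(f"could not fence {active.name}; failover aborted")
    active.role = "fenced"
    standby.role = "active"
    return standby


if __name__ == "__main__":
    old = Master("VM002", "active")
    new = Master("VM007", "standby")
    promoted = failover(old, new, fence_fn=lambda master: True)
    print(old.name, old.role)            # VM002 fenced
    print(promoted.name, promoted.role)  # VM007 active
```

The node names VM002 and VM007 are borrowed from the demo slides purely as labels.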
  29. Use case: Active Namenode maintenance
      - Stop NN
      - Restart NN
      Diagram labels: Active Master, Standby Master, Worker ×3.
  30. Use case: Standby Master failure
      - OS failure
      - Power failure
      - Failure of NICs
      - Network failure
      Diagram labels: Active Master, Standby Master, Worker ×3.
  31. Use case: Cluster power failure
      Diagram labels: Active Master, Standby Master, Worker ×3.
  32. Use case: Cluster network failure
      Diagram labels: Active Master, Standby Master, Worker ×3.
  33. Demo – Non-HA (VM002): Activating HA with One-Click
  34. Demo – Activating (VM002 --- VM007)
  35. Demo – Activating Done (VM002 – VM007)
  36. Demo – Failover (VM002 –> VM007)
  37. Demo – Failover Done (VM007)
  38. Conclusion
      • Leveraging a Synchronized File System to share Namenode edit logs and system data between Masters.
      • Implementing an improved fencing method to handle failover.
      • Providing system-wide high availability, not only for the Hadoop Name Node service.
  39. Reference
      • Hadoop 1.0.4 Documentation
        – http://hadoop.apache.org/docs/stable/index.html
        – https://issues.apache.org/jira/secure/attachment/12480489/NameNode%20HA_v2_1.pdf
      • Hadoop 2.0.3-alpha Documentation
        – http://hadoop.apache.org/docs/r2.0.3-alpha/index.html
      • Hadoop AvatarNode High Availability
        – http://hadoopblog.blogspot.tw/2010/02/hadoop-namenodehigh-availability.html
      • Hortonworks Data Platform
        – http://hortonworks.com/products/hortonworksdataplatform/
        – http://www.vmware.com/files/pdf/Apache-Hadoop-VMwareHA-solution.pdf
  40. Reference
      • CDH4.2.0 Documentation
        – http://www.cloudera.com/content/support/en/documentation/cdh4-documentation/cdh4-documentation-v4-latest.html