SlideShare a Scribd company logo
NameNode HA
Suresh Srinivas- Hortonworks
Aaron T. Myers - Cloudera
Overview
• Part 1 – Suresh Srinivas(Hortonworks)
  − HDFS Availability and Data Integrity – what is the record?
  − NN HA Design
• Part 2 – Aaron T. Myers (Cloudera)
  − NN HA Design continued
         Client-NN Connection failover
  − Operations and Admin of HA
  − Future Work




                                          2
Current HDFS Availability & Data Integrity

• Simple design, storage fault tolerance
  − Storage: Rely in OS’s file system rather than use raw disk
  − Storage Fault Tolerance: multiple replicas, active monitoring
  − Single NameNode Master
          Persistent state: multiple copies + checkpoints
          Restart on failure
• How well did it work?
  − Lost 19 out of 329 Million blocks on 10 clusters with 20K nodes in 2009
          7-9’s of reliability
          Fixed in 20 and 21.
  − 18 months Study: 22 failures on 25 clusters - 0.58 failures per year per cluster
          Only 8 would have benefitted from HA failover!! (0.23 failures per cluster year)
  − NN is very robust and can take a lot of abuse
          NN is resilient against overload caused by misbehaving apps

                                              3
HA NameNode
Active work has started on HA NameNode (Failover)
• HA NameNode
  − Detailed design and sub tasks in HDFS-1623


• HA: Related work
  − Backup NN (0.21)
  − Avatar NN (Facebook)
  − HA NN prototype using Linux HA (Yahoo!)
  − HA NN prototype with Backup NN and block report replicator (eBay)


                      HA is the highest priority


                                       4
Approach and Terminology
• Initial goal is Active-Standby
  − With Federation each namespace volume has a NameNode
         Single active NN for any namespace volume
• Terminology
  − Active NN – actively serves the read/write operations from the clients
  − Standby NN - waits, becomes active when Active dies or is unhealthy
         Could serve read operations
  − Standby’s State may be cold, warm or hot
         Cold : Standby has zero state (e.g. started after the Active is declared dead.
         Warm: Standby has partial state:
            • has loaded fsImage & editLogs but has not received any block reports
            • has loaded fsImage and rolled logs and all block reports
         Hot Standby: Standby has all most of the Active’s state and start
          immediately


                                            5
High Level Use Cases
• Planned downtime                Supported failures
 − Upgrades                       • Single hardware failure
 − Config changes
                                    − Double hardware failure not
 − Main reason for downtime           supported
                                  • Some software failures
• Unplanned downtime
                                    − Same software failure affects
 − Hardware failure                   both active and standby
 − Server unresponsive
 − Software failures
 − Occurs infrequently




                              6
Use Cases
• Deployment models
 − Single NN configuration; no failover
 − Active and Standby with manual failover
        Standby could be cold/warm/hot
        Addresses downtime during upgrades – main cause of unavailability
 − Active and Standby with automatic failover
        Hot standby
        Addresses downtime during upgrades and other failures




               See HDFS-1623 for detailed use cases



                                      7
Design
• Failover control outside NN
• Parallel Block reports to Active and Standby (Hot failover)
• Shared or non-shared NN state
• Fencing of shared resources/data
  − Datanodes
  − Shared NN state (if any)
• Client failover
  − IP Failover
  − Smart clients (e.g configuration, or ZooKeeper for coordination)




                                      8
Failover Control Outside NN

                                                        • HA Daemon outside NameNode
                                Quorum
                                Service
                                                        • Daemon manages resources
                                                          − All resources modeled uniformly
                                                          − Resources – OS, HW, Network etc.
                                          Resources
  HA
Daemon       Actions
         start, stop,
                                           Resources
                                            Resources     − NameNode is just another resource
                                                        • Heartbeat with other nodes
         failover, monitor, …




                                 Shared
                                                        • Quorum based leader election
                                Resources

                                                          − Zookeeper for coordination and Quorum
                                                        • Fencing during split brain
                                                          − Prevents data corruption
NN HA with Shared Storage and ZooKeeper
                                   ZK             ZK       ZK
                     Heartbeat                                      Heartbeat


       FailoverController                                               FailoverController
             Active                                                          Standby

                   Cmds
Monitor Health                                                             Monitor Health
of NN. OS, HW                                                              of NN. OS, HW
                      NN                Shared NN state     NN
                     Active               with single     Standby
                                             writer
                                           (fencing)


 Block Reports to Active & Standby
 DN fencing: Update cmds from one

                              DN          DN                DN
HA Design Details


                    11
Client Failover Design
• Smart clients
  − Users use one logical URI, client selects correct NN to connect to
• Implementing two options out of the box
  − Client Knows of multiple NNs
  − Use a coordination service (ZooKeeper)
• Common things between these
  − Which operations are idempotent, therefore safe to retry on a failover
  − Failover/retry strategies
• Some differences
  − Expected time for client failover
  − Ease of administration

                                        12
Ops/Admin: Shared Storage
• To share NN state, need shared storage
  − Needs to be HA itself to avoid just shifting SPOF
         BookKeeper, etc will likely take care of this in the future
  − Many come with IP fencing options
  − Recommended mount options:
         tcp,soft,intr,timeo=60,retrans=10
• Not all edits directories are created equal
  − Used to be all edits dirs were just a pool of redundant dirs
  − Can now configure some edits directories to be required
  − Can now configure number of tolerated failures
  − You want at least 2 for durability, 1 remote for HA



                                          13
Ops/Admin: NN fencing
• Client failover does not solve this problem
• Out of the box
  − RPC to active NN to tell it to go to standby (graceful failover)
  − SSH to active NN and `kill -9’ NN
• Pluggable options
  − Many filers have protocols for IP-based fencing options
  − Many PDUs have protocols for IP-based plug-pulling (STONITH)
         Nuke the node from orbit. It’s the only way to be sure.
• Configure extra options if available to you
  − Will be tried in order during a failover event
  − Escalate the aggressiveness of the method
  − Fencing is critical for correctness of NN metadata


                                         14
Ops/Admin: Monitoring
• New NN metrics
  − Size of pending DN message queues
  − Seconds since the standby NN last read from shared edit log
  − DN block report lag
  − All measurements of standby NN lag – monitor/alert on all of these
• Monitor shared storage solution
  − Volumes fill up, disks go bad, etc
  − Should configure paranoid edit log retention policy (default is 2)
• Canary-based monitoring of HDFS a good idea
  − Pinging both NNs not sufficient



                                       15
Ops/Admin: Hardware
• Active/Standby NNs should be on separate racks
• Shared storage system should be on separate rack
• Active/Standby NNs should have close to the same hardware
  − Same amount of RAM – need to store the same things
  − Same # of processors - need to serve same number of clients
• All the same recommendations still apply for NN
  − ECC memory, 48GB
  − Several separate disks for NN metadata directories
  − Redundant disks for OS drives, probably RAID 5 or mirroring
  − Redundant power



                                     16
Future Work
• Other options to share NN metadata
  − BookKeeper
  − Multiple, potentially non-HA filers
  − Entirely different metadata system
• More advanced client failover/load shedding
  − Serve stale reads from the standby NN
  − Speculative RPC
  − Non-RPC clients (IP failover, DNS failover, proxy, etc.)
• Even Higher HA
  − Multiple standby NNs



                                        17
QA

• Detailed design (HDFS-1623)
 −Community effort
 −HDFS-
  1971, 1972, 1973, 1974, 1975, 2005, 2064, 10
  73




                      18

More Related Content

What's hot

Introduction to hadoop high availability
Introduction to hadoop high availability Introduction to hadoop high availability
Introduction to hadoop high availability
Omid Vahdaty
 
Hadoop, Taming Elephants
Hadoop, Taming ElephantsHadoop, Taming Elephants
Hadoop, Taming Elephants
Ovidiu Dimulescu
 
Policy-driven, Platform-aware Nova Scheduler
Policy-driven, Platform-aware Nova SchedulerPolicy-driven, Platform-aware Nova Scheduler
Policy-driven, Platform-aware Nova Scheduler
Ram (Ramki) Krishnan
 
Dumb Simple PostgreSQL Performance (NYCPUG)
Dumb Simple PostgreSQL Performance (NYCPUG)Dumb Simple PostgreSQL Performance (NYCPUG)
Dumb Simple PostgreSQL Performance (NYCPUG)
Joshua Drake
 
ttec infortrend ds
ttec infortrend dsttec infortrend ds
ttec infortrend ds
TTEC
 
HBase operations
HBase operationsHBase operations
HBase operations
Yaniv Yancovich
 
Vancouver bug enterprise storage and zfs
Vancouver bug   enterprise storage and zfsVancouver bug   enterprise storage and zfs
Vancouver bug enterprise storage and zfs
Rami Jebara
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS Scheduler
Yongseok Oh
 
Using Recently Published Ceph Reference Architectures to Select Your Ceph Con...
Using Recently Published Ceph Reference Architectures to Select Your Ceph Con...Using Recently Published Ceph Reference Architectures to Select Your Ceph Con...
Using Recently Published Ceph Reference Architectures to Select Your Ceph Con...
Patrick McGarry
 
Ibm spectrum scale fundamentals workshop for americas part 2 IBM Spectrum Sca...
Ibm spectrum scale fundamentals workshop for americas part 2 IBM Spectrum Sca...Ibm spectrum scale fundamentals workshop for americas part 2 IBM Spectrum Sca...
Ibm spectrum scale fundamentals workshop for americas part 2 IBM Spectrum Sca...
xKinAnx
 
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Community
 
Apache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS FederationApache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS Federation
Adam Kawa
 
Private cloud virtual reality to reality a partner story daniel mar_technicom
Private cloud virtual reality to reality a partner story daniel mar_technicomPrivate cloud virtual reality to reality a partner story daniel mar_technicom
Private cloud virtual reality to reality a partner story daniel mar_technicom
Microsoft Singapore
 
HBase Sizing Notes
HBase Sizing NotesHBase Sizing Notes
HBase Sizing Notes
larsgeorge
 
How swift is your Swift - SD.pptx
How swift is your Swift - SD.pptxHow swift is your Swift - SD.pptx
How swift is your Swift - SD.pptx
OpenStack Foundation
 
Ibm spectrum scale fundamentals workshop for americas part 4 spectrum scale_r...
Ibm spectrum scale fundamentals workshop for americas part 4 spectrum scale_r...Ibm spectrum scale fundamentals workshop for americas part 4 spectrum scale_r...
Ibm spectrum scale fundamentals workshop for americas part 4 spectrum scale_r...
xKinAnx
 
Ibm spectrum scale_backup_n_archive_v03_ash
Ibm spectrum scale_backup_n_archive_v03_ashIbm spectrum scale_backup_n_archive_v03_ash
Ibm spectrum scale_backup_n_archive_v03_ash
Ashutosh Mate
 
Session 7362 Handout 427 0
Session 7362 Handout 427 0Session 7362 Handout 427 0
Session 7362 Handout 427 0
jln1028
 
VMworld 2013: Operating and Architecting a vSphere Metro Storage Cluster base...
VMworld 2013: Operating and Architecting a vSphere Metro Storage Cluster base...VMworld 2013: Operating and Architecting a vSphere Metro Storage Cluster base...
VMworld 2013: Operating and Architecting a vSphere Metro Storage Cluster base...
VMworld
 

What's hot (19)

Introduction to hadoop high availability
Introduction to hadoop high availability Introduction to hadoop high availability
Introduction to hadoop high availability
 
Hadoop, Taming Elephants
Hadoop, Taming ElephantsHadoop, Taming Elephants
Hadoop, Taming Elephants
 
Policy-driven, Platform-aware Nova Scheduler
Policy-driven, Platform-aware Nova SchedulerPolicy-driven, Platform-aware Nova Scheduler
Policy-driven, Platform-aware Nova Scheduler
 
Dumb Simple PostgreSQL Performance (NYCPUG)
Dumb Simple PostgreSQL Performance (NYCPUG)Dumb Simple PostgreSQL Performance (NYCPUG)
Dumb Simple PostgreSQL Performance (NYCPUG)
 
ttec infortrend ds
ttec infortrend dsttec infortrend ds
ttec infortrend ds
 
HBase operations
HBase operationsHBase operations
HBase operations
 
Vancouver bug enterprise storage and zfs
Vancouver bug   enterprise storage and zfsVancouver bug   enterprise storage and zfs
Vancouver bug enterprise storage and zfs
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS Scheduler
 
Using Recently Published Ceph Reference Architectures to Select Your Ceph Con...
Using Recently Published Ceph Reference Architectures to Select Your Ceph Con...Using Recently Published Ceph Reference Architectures to Select Your Ceph Con...
Using Recently Published Ceph Reference Architectures to Select Your Ceph Con...
 
Ibm spectrum scale fundamentals workshop for americas part 2 IBM Spectrum Sca...
Ibm spectrum scale fundamentals workshop for americas part 2 IBM Spectrum Sca...Ibm spectrum scale fundamentals workshop for americas part 2 IBM Spectrum Sca...
Ibm spectrum scale fundamentals workshop for americas part 2 IBM Spectrum Sca...
 
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
 
Apache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS FederationApache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS Federation
 
Private cloud virtual reality to reality a partner story daniel mar_technicom
Private cloud virtual reality to reality a partner story daniel mar_technicomPrivate cloud virtual reality to reality a partner story daniel mar_technicom
Private cloud virtual reality to reality a partner story daniel mar_technicom
 
HBase Sizing Notes
HBase Sizing NotesHBase Sizing Notes
HBase Sizing Notes
 
How swift is your Swift - SD.pptx
How swift is your Swift - SD.pptxHow swift is your Swift - SD.pptx
How swift is your Swift - SD.pptx
 
Ibm spectrum scale fundamentals workshop for americas part 4 spectrum scale_r...
Ibm spectrum scale fundamentals workshop for americas part 4 spectrum scale_r...Ibm spectrum scale fundamentals workshop for americas part 4 spectrum scale_r...
Ibm spectrum scale fundamentals workshop for americas part 4 spectrum scale_r...
 
Ibm spectrum scale_backup_n_archive_v03_ash
Ibm spectrum scale_backup_n_archive_v03_ashIbm spectrum scale_backup_n_archive_v03_ash
Ibm spectrum scale_backup_n_archive_v03_ash
 
Session 7362 Handout 427 0
Session 7362 Handout 427 0Session 7362 Handout 427 0
Session 7362 Handout 427 0
 
VMworld 2013: Operating and Architecting a vSphere Metro Storage Cluster base...
VMworld 2013: Operating and Architecting a vSphere Metro Storage Cluster base...VMworld 2013: Operating and Architecting a vSphere Metro Storage Cluster base...
VMworld 2013: Operating and Architecting a vSphere Metro Storage Cluster base...
 

Similar to Nn ha hadoop world.final

Hadoop World 2011: HDFS Name Node High Availablity - Aaron Myers, Cloudera & ...
Hadoop World 2011: HDFS Name Node High Availablity - Aaron Myers, Cloudera & ...Hadoop World 2011: HDFS Name Node High Availablity - Aaron Myers, Cloudera & ...
Hadoop World 2011: HDFS Name Node High Availablity - Aaron Myers, Cloudera & ...
Cloudera, Inc.
 
HDFS - What's New and Future
HDFS - What's New and FutureHDFS - What's New and Future
HDFS - What's New and Future
DataWorks Summit
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
Cloudera, Inc.
 
Linux on System z – disk I/O performance
Linux on System z – disk I/O performanceLinux on System z – disk I/O performance
Linux on System z – disk I/O performance
IBM India Smarter Computing
 
Lect17
Lect17Lect17
Lect17
Vin Voro
 
Infrastructure Around Hadoop
Infrastructure Around HadoopInfrastructure Around Hadoop
Infrastructure Around Hadoop
DataWorks Summit
 
Petabyte scale on commodity infrastructure
Petabyte scale on commodity infrastructurePetabyte scale on commodity infrastructure
Petabyte scale on commodity infrastructure
elliando dias
 
Kudu austin oct 2015.pptx
Kudu austin oct 2015.pptxKudu austin oct 2015.pptx
Kudu austin oct 2015.pptx
Felicia Haggarty
 
Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...
Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...
Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...
xKinAnx
 
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraBackup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Ceph Community
 
Performance Whack-a-Mole Tutorial (pgCon 2009)
Performance Whack-a-Mole Tutorial (pgCon 2009) Performance Whack-a-Mole Tutorial (pgCon 2009)
Performance Whack-a-Mole Tutorial (pgCon 2009)
PostgreSQL Experts, Inc.
 
HDFS- What is New and Future
HDFS- What is New and FutureHDFS- What is New and Future
HDFS- What is New and Future
DataWorks Summit
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
mundlapudi
 
Considerations when implementing_ha_in_dmf
Considerations when implementing_ha_in_dmfConsiderations when implementing_ha_in_dmf
Considerations when implementing_ha_in_dmf
hik_lhz
 
Setting up a big data platform at kelkoo
Setting up a big data platform at kelkooSetting up a big data platform at kelkoo
Setting up a big data platform at kelkoo
Fabrice dos Santos
 
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Etu Solution
 
End of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationEnd of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph Replication
Ceph Community
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling Out
Sander Temme
 
SAP Virtualization Week 2012 - The Lego Cloud
SAP Virtualization Week 2012 - The Lego CloudSAP Virtualization Week 2012 - The Lego Cloud
SAP Virtualization Week 2012 - The Lego Cloud
aidanshribman
 
HBase Operations and Best Practices
HBase Operations and Best PracticesHBase Operations and Best Practices
HBase Operations and Best Practices
Venu Anuganti
 

Similar to Nn ha hadoop world.final (20)

Hadoop World 2011: HDFS Name Node High Availablity - Aaron Myers, Cloudera & ...
Hadoop World 2011: HDFS Name Node High Availablity - Aaron Myers, Cloudera & ...Hadoop World 2011: HDFS Name Node High Availablity - Aaron Myers, Cloudera & ...
Hadoop World 2011: HDFS Name Node High Availablity - Aaron Myers, Cloudera & ...
 
HDFS - What's New and Future
HDFS - What's New and FutureHDFS - What's New and Future
HDFS - What's New and Future
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Linux on System z – disk I/O performance
Linux on System z – disk I/O performanceLinux on System z – disk I/O performance
Linux on System z – disk I/O performance
 
Lect17
Lect17Lect17
Lect17
 
Infrastructure Around Hadoop
Infrastructure Around HadoopInfrastructure Around Hadoop
Infrastructure Around Hadoop
 
Petabyte scale on commodity infrastructure
Petabyte scale on commodity infrastructurePetabyte scale on commodity infrastructure
Petabyte scale on commodity infrastructure
 
Kudu austin oct 2015.pptx
Kudu austin oct 2015.pptxKudu austin oct 2015.pptx
Kudu austin oct 2015.pptx
 
Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...
Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...
Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...
 
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraBackup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
 
Performance Whack-a-Mole Tutorial (pgCon 2009)
Performance Whack-a-Mole Tutorial (pgCon 2009) Performance Whack-a-Mole Tutorial (pgCon 2009)
Performance Whack-a-Mole Tutorial (pgCon 2009)
 
HDFS- What is New and Future
HDFS- What is New and FutureHDFS- What is New and Future
HDFS- What is New and Future
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
 
Considerations when implementing_ha_in_dmf
Considerations when implementing_ha_in_dmfConsiderations when implementing_ha_in_dmf
Considerations when implementing_ha_in_dmf
 
Setting up a big data platform at kelkoo
Setting up a big data platform at kelkooSetting up a big data platform at kelkoo
Setting up a big data platform at kelkoo
 
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
 
End of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationEnd of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph Replication
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling Out
 
SAP Virtualization Week 2012 - The Lego Cloud
SAP Virtualization Week 2012 - The Lego CloudSAP Virtualization Week 2012 - The Lego Cloud
SAP Virtualization Week 2012 - The Lego Cloud
 
HBase Operations and Best Practices
HBase Operations and Best PracticesHBase Operations and Best Practices
HBase Operations and Best Practices
 

More from Hortonworks

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
Hortonworks
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Hortonworks
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
Hortonworks
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Hortonworks
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
Hortonworks
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Hortonworks
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Hortonworks
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
Hortonworks
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
Hortonworks
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
Hortonworks
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
Hortonworks
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Hortonworks
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Hortonworks
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
Hortonworks
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Hortonworks
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
Hortonworks
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
Hortonworks
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Hortonworks
 

More from Hortonworks (20)

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 

Recently uploaded

EuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python CodebaseEuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
Jimmy Lai
 
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Torry Harris
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
Steven Carlson
 
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and DisadvantagesBLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
SAI KAILASH R
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Muhammad Ali
 
Mastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for SuccessMastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for Success
David Wilson
 
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
bellared2
 
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
Priyanka Aash
 
Google I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged SlidesGoogle I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged Slides
Google Developer Group - Harare
 
Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10
ankush9927
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
SynapseIndia
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
Tatiana Al-Chueyr
 
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptxDublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Kunal Gupta
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 
July Patch Tuesday
July Patch TuesdayJuly Patch Tuesday
July Patch Tuesday
Ivanti
 
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes..."Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
Anant Gupta
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
SubhamMandal40
 
Acumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
BrainSell Technologies
 
Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024
aakash malhotra
 
Tailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer InsightsTailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer Insights
SynapseIndia
 

Recently uploaded (20)

EuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python CodebaseEuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
 
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
 
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and DisadvantagesBLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
 
Mastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for SuccessMastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for Success
 
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
 
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
 
Google I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged SlidesGoogle I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged Slides
 
Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
 
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptxDublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
 
July Patch Tuesday
July Patch TuesdayJuly Patch Tuesday
July Patch Tuesday
 
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes..."Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
 
Acumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
 
Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024
 
Tailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer InsightsTailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer Insights
 

Nn ha hadoop world.final

  • 1. NameNode HA Suresh Srinivas- Hortonworks Aaron T. Myers - Cloudera
  • 2. Overview • Part 1 – Suresh Srinivas(Hortonworks) − HDFS Availability and Data Integrity – what is the record? − NN HA Design • Part 2 – Aaron T. Myers (Cloudera) − NN HA Design continued  Client-NN Connection failover − Operations and Admin of HA − Future Work 2
  • 3. Current HDFS Availability & Data Integrity • Simple design, storage fault tolerance − Storage: Rely in OS’s file system rather than use raw disk − Storage Fault Tolerance: multiple replicas, active monitoring − Single NameNode Master  Persistent state: multiple copies + checkpoints  Restart on failure • How well did it work? − Lost 19 out of 329 Million blocks on 10 clusters with 20K nodes in 2009  7-9’s of reliability  Fixed in 20 and 21. − 18 months Study: 22 failures on 25 clusters - 0.58 failures per year per cluster  Only 8 would have benefitted from HA failover!! (0.23 failures per cluster year) − NN is very robust and can take a lot of abuse  NN is resilient against overload caused by misbehaving apps 3
  • 4. HA NameNode Active work has started on HA NameNode (Failover) • HA NameNode − Detailed design and sub tasks in HDFS-1623 • HA: Related work − Backup NN (0.21) − Avatar NN (Facebook) − HA NN prototype using Linux HA (Yahoo!) − HA NN prototype with Backup NN and block report replicator (eBay) HA is the highest priority 4
  • 5. Approach and Terminology • Initial goal is Active-Standby − With Federation each namespace volume has a NameNode  Single active NN for any namespace volume • Terminology − Active NN – actively serves the read/write operations from the clients − Standby NN - waits, becomes active when Active dies or is unhealthy  Could serve read operations − Standby’s State may be cold, warm or hot  Cold : Standby has zero state (e.g. started after the Active is declared dead.  Warm: Standby has partial state: • has loaded fsImage & editLogs but has not received any block reports • has loaded fsImage and rolled logs and all block reports  Hot Standby: Standby has all most of the Active’s state and start immediately 5
  • 6. High Level Use Cases • Planned downtime Supported failures − Upgrades • Single hardware failure − Config changes − Double hardware failure not − Main reason for downtime supported • Some software failures • Unplanned downtime − Same software failure affects − Hardware failure both active and standby − Server unresponsive − Software failures − Occurs infrequently 6
  • 7. Use Cases • Deployment models − Single NN configuration; no failover − Active and Standby with manual failover  Standby could be cold/warm/hot  Addresses downtime during upgrades – main cause of unavailability − Active and Standby with automatic failover  Hot standby  Addresses downtime during upgrades and other failures See HDFS-1623 for detailed use cases 7
  • 8. Design • Failover control outside NN • Parallel Block reports to Active and Standby (Hot failover) • Shared or non-shared NN state • Fencing of shared resources/data − Datanodes − Shared NN state (if any) • Client failover − IP Failover − Smart clients (e.g configuration, or ZooKeeper for coordination) 8
  • 9. Failover Control Outside NN • HA Daemon outside NameNode Quorum Service • Daemon manages resources − All resources modeled uniformly − Resources – OS, HW, Network etc. Resources HA Daemon Actions start, stop, Resources Resources − NameNode is just another resource • Heartbeat with other nodes failover, monitor, … Shared • Quorum based leader election Resources − Zookeeper for coordination and Quorum • Fencing during split brain − Prevents data corruption
  • 10. NN HA with Shared Storage and ZooKeeper ZK ZK ZK Heartbeat Heartbeat FailoverController FailoverController Active Standby Cmds Monitor Health Monitor Health of NN. OS, HW of NN. OS, HW NN Shared NN state NN Active with single Standby writer (fencing) Block Reports to Active & Standby DN fencing: Update cmds from one DN DN DN
  • 12. Client Failover Design • Smart clients − Users use one logical URI, client selects correct NN to connect to • Implementing two options out of the box − Client Knows of multiple NNs − Use a coordination service (ZooKeeper) • Common things between these − Which operations are idempotent, therefore safe to retry on a failover − Failover/retry strategies • Some differences − Expected time for client failover − Ease of administration 12
  • 13. Ops/Admin: Shared Storage • To share NN state, need shared storage − Needs to be HA itself to avoid just shifting SPOF  BookKeeper, etc will likely take care of this in the future − Many come with IP fencing options − Recommended mount options:  tcp,soft,intr,timeo=60,retrans=10 • Not all edits directories are created equal − Used to be all edits dirs were just a pool of redundant dirs − Can now configure some edits directories to be required − Can now configure number of tolerated failures − You want at least 2 for durability, 1 remote for HA 13
  • 14. Ops/Admin: NN fencing • Client failover does not solve this problem • Out of the box − RPC to active NN to tell it to go to standby (graceful failover) − SSH to active NN and `kill -9’ NN • Pluggable options − Many filers have protocols for IP-based fencing options − Many PDUs have protocols for IP-based plug-pulling (STONITH)  Nuke the node from orbit. It’s the only way to be sure. • Configure extra options if available to you − Will be tried in order during a failover event − Escalate the aggressiveness of the method − Fencing is critical for correctness of NN metadata 14
  • 15. Ops/Admin: Monitoring • New NN metrics − Size of pending DN message queues − Seconds since the standby NN last read from shared edit log − DN block report lag − All measurements of standby NN lag – monitor/alert on all of these • Monitor shared storage solution − Volumes fill up, disks go bad, etc − Should configure paranoid edit log retention policy (default is 2) • Canary-based monitoring of HDFS a good idea − Pinging both NNs not sufficient 15
  • 16. Ops/Admin: Hardware • Active/Standby NNs should be on separate racks • Shared storage system should be on separate rack • Active/Standby NNs should have close to the same hardware − Same amount of RAM – need to store the same things − Same # of processors - need to serve same number of clients • All the same recommendations still apply for NN − ECC memory, 48GB − Several separate disks for NN metadata directories − Redundant disks for OS drives, probably RAID 5 or mirroring − Redundant power 16
  • 17. Future Work • Other options to share NN metadata − BookKeeper − Multiple, potentially non-HA filers − Entirely different metadata system • More advanced client failover/load shedding − Serve stale reads from the standby NN − Speculative RPC − Non-RPC clients (IP failover, DNS failover, proxy, etc.) • Even Higher HA − Multiple standby NNs 17
  • 18. QA • Detailed design (HDFS-1623) −Community effort −HDFS- 1971, 1972, 1973, 1974, 1975, 2005, 2064, 10 73 18

Editor's Notes

  1. Data – can I read what I wrote, is the service availableWhen I asked one of the original authors of of GFS if there were any decisions they would revist – random writersSimplicity is keyRaw disk – fs take time to stabilize – we can take advantage of ext4, xfs or zfs