Hadoop Distributed File System Reliability and Durability at Facebook


  1. I accidentally the Namenode: HDFS reliability at Facebook. Andrew Ryan, Facebook, April 2012
  2. The HDFS Namenode: SPOF by design
     ▪ Single Point of Failure by design
     ▪ All metadata operations go through the Namenode
     ▪ Early designers made tradeoffs: features and performance first
     [Diagram: Simplified HDFS Architecture with the Namenode as SPOF, showing Clients, the Namenode, a Secondary Namenode, and Datanodes holding data]
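     Every one of those metadata operations (mkdir, listing, delete, open) is an RPC answered by that single Namenode, so they all fail when it is down even though the Datanodes are healthy. A minimal sketch against the standard Hadoop 0.20-era FileSystem API; the Namenode address and paths below are illustrative, not Facebook's:

       import org.apache.hadoop.conf.Configuration;
       import org.apache.hadoop.fs.FileStatus;
       import org.apache.hadoop.fs.FileSystem;
       import org.apache.hadoop.fs.Path;

       public class NamenodeMetadataOps {
         public static void main(String[] args) throws Exception {
           Configuration conf = new Configuration();
           // fs.default.name points at the single Namenode (illustrative address)
           conf.set("fs.default.name", "hdfs://namenode.example.com:8020");
           FileSystem fs = FileSystem.get(conf);

           // Each call below is a metadata RPC served only by the Namenode;
           // if the Namenode is unavailable, every one of them fails.
           fs.mkdirs(new Path("/tmp/demo"));
           for (FileStatus status : fs.listStatus(new Path("/tmp"))) {
             System.out.println(status.getPath());
           }
           fs.delete(new Path("/tmp/demo"), true);
         }
       }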
  3. HDFS major use cases at Facebook: Data Warehouse and Facebook Messages
     ▪ # of clusters: Data Warehouse <10; Facebook Messages 10's
     ▪ Size of clusters: Data Warehouse large (100's – 1000's of nodes); Facebook Messages small (~100 nodes)
     ▪ Processing workload: Data Warehouse MapReduce batch jobs; Facebook Messages HBase transactions
     ▪ Namenode load: Data Warehouse very heavy; Facebook Messages very light
     ▪ End-user impact of downtime: Data Warehouse none; Facebook Messages users without Messages
  4. HDFS at Facebook, 2009-2012: some things have changed…
     ▪ # of HDFS clusters: 1 (2009) → >100 (2012)
     ▪ Largest HDFS cluster size (TB): 600TB → >100PB
     ▪ Largest HDFS cluster size (# files): 10 million → 200 million
     ▪ HDFS cluster types: MapReduce → MapReduce, HBase, MySQL backups, and more
  5. HDFS at Facebook, 2009-2012: …and some things have not
     ▪ Single point of failure in HDFS: Namenode (2009) → Namenode (2012)
     ▪ HDFS cluster restart time: 60 minutes → 60 minutes
     ▪ Namenode failover method: manual and complicated → manual and complicated
     ▪ SPOF Namenode as a cause of downtime: unknown → unknown
  6. Data Warehouse
     ▪ Storage and querying of structured log data using Hive and Hadoop MapReduce
     ▪ Composed of dozens of tools/components
     ▪ A “vigorous and creative” user population
     [Diagram: Hadoop warehouse stack: UI Tools; Workflow (Nocron); Query (Hive); Compute (MapReduce); Storage (HDFS)]
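     To illustrate the Hive side of the warehouse workload, a structured-log query can be submitted from Java through the Hive JDBC driver of that era (HiveServer1). The connection URL, table, and column names below are hypothetical examples, not Facebook's schema:

       import java.sql.Connection;
       import java.sql.DriverManager;
       import java.sql.ResultSet;
       import java.sql.Statement;

       public class HiveLogQuery {
         public static void main(String[] args) throws Exception {
           // HiveServer1 JDBC driver from the Hadoop 0.20 / early Hive era
           Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
           Connection conn = DriverManager.getConnection(
               "jdbc:hive://hiveserver.example.com:10000/default", "", "");

           // A typical warehouse-style aggregation over a date-partitioned log table
           // (web_logs, action, and ds are hypothetical names)
           Statement stmt = conn.createStatement();
           ResultSet rs = stmt.executeQuery(
               "SELECT action, COUNT(1) AS hits " +
               "FROM web_logs WHERE ds = '2012-04-01' " +
               "GROUP BY action");
           while (rs.next()) {
             System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
           }
           conn.close();
         }
       }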
  7. Data Warehouse: all incidents. 41% are HDFS-related.
  8. Data Warehouse: SPOF Namenode incidents. 10% are SPOF Namenode.
  9. Facebook Messages
     [Diagram: Messages architecture: Clients (www, chat, MTA, etc.) and a User Directory Service in front of a Messages Cell containing Application Servers and HBase/HDFS/ZK; adjacent services include Anti-spam, Outbound Mail, Mail Servers, and Haystack]
  10. Messages: all incidents. 16% are HDFS-related.
  11. Messages: SPOF Namenode incidents. 10% are SPOF Namenode.
  12. What would happen if… instead of this…
     [Diagram: Simplified HDFS Architecture with the Namenode as SPOF: Clients, Namenode, Secondary Namenode, Datanodes]
  13. What would happen if… we had this!
     [Diagram: Simplified HDFS Architecture with a Highly Available Namenode: Clients, Primary Namenode, Standby Namenode, Datanodes]
  14. AvatarNode is our solution
     [Diagrams: AvatarNode client view; AvatarNode datanode view]
  15. AvatarNode is…
     ▪ A two-node, highly available Namenode with manual failover
     ▪ In production today at Facebook
     ▪ Open-sourced, based on Hadoop 0.20: https://github.com/facebook/hadoop-20
  16. AvatarNode does not…
     ▪ Eliminate the dependency on shared storage for the image/edits
     ▪ Provide instant failover (~1 second per million blocks+files)
     ▪ Provide automated failover (a conceptual client-side sketch follows this list)
     ▪ Guarantee I/O fencing for Primary/Standby (although precautions are taken)
     ▪ Require Zookeeper at all times for normal operation (it is required for failover)
     ▪ Allow more than 2 Namenodes to participate in an HA cluster
     ▪ Have any special network requirements
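     Because failover is manual and the standby only takes over when an operator promotes it, a client effectively needs to reach whichever AvatarNode is currently primary. The sketch below is only a conceptual illustration of that retry idea; it is not the actual AvatarNode code from facebook/hadoop-20, and all class and host names are hypothetical:

       import java.net.URI;
       import org.apache.hadoop.conf.Configuration;
       import org.apache.hadoop.fs.FileStatus;
       import org.apache.hadoop.fs.FileSystem;
       import org.apache.hadoop.fs.Path;

       // Conceptual illustration only: try the node believed to be primary,
       // then fall back to the other node after an operator-driven failover.
       public class FailoverAwareClient {
         private final String[] namenodes = {
             "hdfs://avatarnode0.example.com:8020",  // hypothetical primary
             "hdfs://avatarnode1.example.com:8020"   // hypothetical standby
         };

         public FileStatus[] listWithFailover(String dir) throws Exception {
           Exception last = null;
           for (String addr : namenodes) {
             try {
               FileSystem fs = FileSystem.get(URI.create(addr), new Configuration());
               return fs.listStatus(new Path(dir));  // metadata RPC to that node
             } catch (Exception e) {
               last = e;  // node is down, unreachable, or not serving as primary
             }
           }
           throw last;  // neither node answered
         }

         public static void main(String[] args) throws Exception {
           for (FileStatus s : new FailoverAwareClient().listWithFailover("/")) {
             System.out.println(s.getPath());
           }
         }
       }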
  17. Wrapping up…
     ▪ The SPOF Namenode is a weak link in HDFS's design
     ▪ In our services that use HDFS, we estimate we could eliminate:
       ▪ 10% of service downtime from unscheduled outages
       ▪ 20-50% of downtime from scheduled maintenance
     ▪ AvatarNode is Facebook's solution for 0.20, available today
     ▪ Other Namenode HA solutions are being worked on in HDFS trunk (HDFS-1623)
  18. Questions?
  19. Sessions will resume at 11:25am
