
Facebook Scaling Overview


Tiny presentation I did for a course.

Published in: Technology


  1. Facebook Scaling Walkthrough
     Moritz Haarmann - Ultrasuperlargescale Systems
  2. Numbers
  3. • 800,000,000 active users
     • > 50% log on every given day
     • 250,000,000 photos every single day (Flickr: 6 Bn total)
     • new pieces of content monthly
  4. (image slide)
  5. That's as if everyone now living on Earth posted 4.28 updates every month.
  6. Building Blocks
  7. Facebook is built using:
     • Web servers (running HipHop PHP)
     • Services (search, ads)
     • Memcached & MySQL
     • Immense amounts of glue
  8. Write Strategy
     • Writes take place centrally in California
     • 3.5 million changed rows per second (peak)
     • 2010: 1,800 DB servers
     • Horizontal scaling approach not disclosed
     • Consistency is important (avoiding "unhappy users")
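The deck notes that Facebook's horizontal scaling approach is not disclosed, but one common way to reconcile replicated reads with the "no unhappy users" consistency goal is read-after-write routing: send a user's reads to the primary for a short window after that user writes. The sketch below is a hypothetical illustration of that technique, not Facebook's actual mechanism.

```python
import time

class ReadRouter:
    """Hypothetical read-after-write routing: a user's reads go to the
    primary for a short window after that user writes, otherwise to a
    replica. Illustrative only; the slides say the real approach is
    undisclosed."""

    def __init__(self, staleness_window=5.0):
        self.staleness_window = staleness_window
        self.last_write = {}  # user_id -> timestamp of that user's last write

    def record_write(self, user_id, now=None):
        self.last_write[user_id] = time.time() if now is None else now

    def pick_backend(self, user_id, now=None):
        now = time.time() if now is None else now
        wrote_recently = now - self.last_write.get(user_id, 0.0) < self.staleness_window
        return "primary" if wrote_recently else "replica"
```

A user who just posted reads their own fresh data from the primary; everyone else's reads spread across replicas.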
  9. Glue
     • Massively distributed architecture
     • Glue keeping it together
     • Many systems built in-house to meet enormous requirements
  10. Haystack
      • Photos
      • Handles everything from HTTP to storage
      • Aimed at minimizing I/O operations
      • Append-only!
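The append-only, minimal-I/O design can be sketched as a single growing blob file plus an in-memory index mapping photo id to (offset, length), so a read costs one seek and one read. This is a toy in the spirit of Haystack; names and on-disk layout are illustrative, not Facebook's actual format.

```python
import io

class HaystackSketch:
    """Toy append-only photo store: blobs are appended to one large file
    and an in-memory index maps photo_id -> (offset, length), so fetching
    a photo needs a single seek + read instead of filesystem metadata
    lookups. Illustrative only."""

    def __init__(self):
        self.log = io.BytesIO()  # stands in for one large on-disk volume file
        self.index = {}          # photo_id -> (offset, length)

    def put(self, photo_id, data: bytes):
        offset = self.log.seek(0, io.SEEK_END)
        self.log.write(data)     # append only; existing bytes are never rewritten
        self.index[photo_id] = (offset, len(data))

    def get(self, photo_id) -> bytes:
        offset, length = self.index[photo_id]
        self.log.seek(offset)    # one seek,
        return self.log.read(length)  # one read
```

Because the file is append-only, writes are sequential and deletes are handled by simply dropping the index entry (compaction reclaims space later).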
  11. Memcached
      • Placed between MySQL and the web tier
      • Stores only "plain data": no joins or other complicated operations
      • Faster if the web server works on the data
  12. BigPipe
      • Assembles the output pages
      • Everything that is needed is retrieved in parallel
      • Fault tolerant: works even if parts of a page are not available
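The parallel, fault-tolerant assembly can be sketched with a thread pool: fetch every page section ("pagelet") concurrently and substitute a placeholder for any that fail, so one broken section never takes down the whole page. This is a simplified illustration of the idea, not BigPipe's real pipelined-flush implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def assemble_page(pagelets):
    """BigPipe-style sketch: `pagelets` maps a section name to a
    zero-argument callable returning that section's HTML. All sections
    are fetched in parallel; a failing section yields a placeholder
    instead of failing the page."""
    def render(name, fetch):
        try:
            return name, fetch()
        except Exception:
            return name, f"<!-- {name} unavailable -->"

    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(render, name, fetch) for name, fetch in pagelets.items()]
        return {name: html for name, html in (f.result() for f in futures)}
```

The real BigPipe additionally streams each pagelet to the browser as soon as it is ready, rather than waiting for the full page.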
  13. What else?
  14. Live Profiling
      • Facebook monitors their live systems continuously at a PHP-method level (using XHProf).
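XHProf instruments PHP, but the core idea of always-on, method-level profiling can be sketched in a few lines: wrap each function so every call adds its count and wall time to a table that can be sampled in production. This is an illustration of the concept, not XHProf itself.

```python
import time
from collections import defaultdict

# Global table sampled in production: function name -> call count and total time.
PROFILE = defaultdict(lambda: {"calls": 0, "seconds": 0.0})

def profiled(fn):
    """Sketch of method-level live profiling in the spirit of XHProf:
    the wrapper records call count and wall-clock time for every call,
    even when the wrapped function raises."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            stats = PROFILE[fn.__name__]
            stats["calls"] += 1
            stats["seconds"] += time.perf_counter() - start
    return wrapper
```

Sampling this table per request (as XHProf does) is what lets operators spot a slow method before users notice.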
  15. Graceful Degradation
      • High awareness (monitoring) of performance problems
      • Features can be disabled (very fine-grained) to keep the core features running smoothly
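Fine-grained disabling is usually implemented with runtime feature flags: each non-core feature checks a kill switch that an operator can flip during an incident. A minimal sketch, with made-up flag names for illustration:

```python
class FeatureFlags:
    """Runtime kill switches for graceful degradation: each feature can
    be disabled independently so the core product keeps serving.
    Flag names below are hypothetical."""

    def __init__(self, defaults=None):
        self.flags = dict(defaults or {})

    def enabled(self, name):
        return self.flags.get(name, False)

    def disable(self, name):
        self.flags[name] = False   # flipped by an operator under load

def render_sections(flags):
    sections = ["core-feed"]       # the core feature is always served
    if flags.enabled("chat"):
        sections.append("chat")    # shed first when performance degrades
    if flags.enabled("birthday-reminders"):
        sections.append("birthday-reminders")
    return sections
```

Disabling "chat" trims the page but leaves the feed untouched, which is exactly the trade the slide describes.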
  16. Keeping It Running
      • New features are launched "dark", without visible elements, to stress-test the backend with real load
      • Incremental roll-outs decrease the impact of a bug or malfunction
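An incremental roll-out is commonly done by hashing (feature, user) into a bucket and enabling the feature for buckets below the current percentage: the same user stays enrolled as the percentage grows, so a bug only ever hits the enrolled slice. The hashing scheme here is illustrative, not Facebook's.

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Incremental-rollout sketch: deterministically map (feature, user)
    to a bucket in 0..99 and enable the feature when the bucket falls
    below the roll-out percentage. Raising `percent` only ever adds
    users; nobody flaps in and out."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent
```

A dark launch is the same gate with the UI suppressed: the backend work runs for enrolled users, but nothing is rendered.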
  17. Open Source
      • Most parts are open source
      • Either existing open-source software is used, or tools are built in-house and then open-sourced
  18. Big Bang
      • On September 23, 2010, Facebook was down for most users for about 3 hours
      • A cache value wrongly identified as "invalid" led to requests hammering the DB tier
      • A system designed to prevent failures created one!
      • The only way to recover was to completely shut down access to the DB: downtime (great comments, too)
  19. Thanks.
  20. Sources
      • facebook/
      • pipelining-web-pages-for-high-performance/389414033919