1. Facebook Scaling Walkthrough
Moritz Haarmann - Ultrasuperlargescale Systems
3. • 800.000.000 active users
• > 50% log on on any given day
• 250.000.000 photos every single day (Flickr: 6Bn total)
• 30.000.000.000 new pieces of content monthly
5. 30.000.000.000
As if everyone now living on earth posted 4.28 updates every month.
6. Building Blocks
7. Facebook is built using
• Web servers (running HipHop PHP)
• Services (Search, Ads)
• Memcached & MySQL
• Immense amounts of glue
8. Write Strategy
• Writes take place centrally in California
• 3.5 million changed rows per second (peak)
• 2010: 1800 DB servers
• Horizontal scaling approach not disclosed
• Consistency is important (avoiding "unhappy users")
9. Glue
• Massively distributed architecture
• Glue keeps it together
• Many systems built in-house to meet ginormous requirements
10. Haystack
• Photos
• Handles everything from HTTP to storage
• Aimed at minimizing I/O operations
• Append-only!
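The core Haystack idea can be sketched as follows. This is an illustrative model, not the real on-disk format: photos are appended to one large store file, and an in-memory index maps photo id to (offset, size), so a read costs a single seek instead of several filesystem metadata lookups.

```python
import io


class HaystackSketch:
    """Illustrative append-only photo store with an in-memory index."""

    def __init__(self):
        self.store = io.BytesIO()  # stands in for the large append-only file
        self.index = {}            # photo_id -> (offset, size)

    def write(self, photo_id, data):
        # Append-only: new data always goes at the end of the store.
        offset = self.store.seek(0, io.SEEK_END)
        self.store.write(data)
        self.index[photo_id] = (offset, len(data))

    def read(self, photo_id):
        # One seek + one read per photo: the I/O minimization the slide mentions.
        offset, size = self.index[photo_id]
        self.store.seek(offset)
        return self.store.read(size)

    def delete(self, photo_id):
        # Deletion just drops the index entry; the bytes stay in the file
        # until a compaction pass (not shown) reclaims the space.
        self.index.pop(photo_id, None)
```

Because writes never modify existing bytes, the store needs no in-place update logic and sequential appends keep disk writes cheap.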
11. Memcached
• Placed between MySQL and the web tier
• Stores only "plain data", no joins or other complicated stuff
• Faster if the web server works on the data
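The pattern behind this tier is cache-aside. A minimal sketch, where a dict stands in for memcached and a function stands in for the MySQL query (all names are illustrative): only the plain row data is cached, and the web tier does any joining or post-processing itself.

```python
cache = {}  # stands in for the memcached tier


def query_mysql(user_id):
    # Placeholder for the real SELECT against the user's shard.
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(user_id):
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is None:            # cache miss: fall through to MySQL
        value = query_mysql(user_id)
        cache[key] = value       # populate for subsequent readers
    return value


def invalidate_user(user_id):
    # On a write, drop the cached copy so the next read refetches
    # fresh data from MySQL.
    cache.pop(f"user:{user_id}", None)
```

Caching plain values rather than query results with joins keeps entries reusable across many different pages, at the cost of doing assembly work in the web server.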
12. BigPipe
• Assembles the output pages
• Everything that is needed is retrieved in parallel
• Fault-tolerant: will work even if parts of a page are not available
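The assembly idea can be sketched like this (the pagelet names and renderers are made up, and real BigPipe streams pagelets to the browser rather than joining them server-side): each section of the page is fetched in parallel, and a section that fails is replaced by a placeholder instead of failing the whole page.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_newsfeed():
    return "<div>newsfeed</div>"


def fetch_chat():
    raise RuntimeError("chat backend unavailable")


def fetch_ads():
    return "<div>ads</div>"


# Hypothetical pagelets making up one page.
PAGELETS = {"newsfeed": fetch_newsfeed, "chat": fetch_chat, "ads": fetch_ads}


def assemble_page():
    parts = []
    with ThreadPoolExecutor() as pool:
        # Kick off every pagelet fetch in parallel.
        futures = {name: pool.submit(fn) for name, fn in PAGELETS.items()}
        for name, future in futures.items():
            try:
                parts.append(future.result(timeout=1.0))
            except Exception:
                # Fault tolerance: a broken pagelet degrades to a
                # placeholder instead of taking the page down.
                parts.append(f"<div><!-- {name} unavailable --></div>")
    return "".join(parts)
```

Here the failing chat pagelet produces a placeholder while the newsfeed and ads sections render normally.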
13. What else?
14. Live Profiling
• Facebook monitors its live systems continuously at PHP-method level (using XHProf).
15. Graceful Degradation
• High awareness (monitoring) of performance problems
• Features can be disabled (very fine-grained) to keep the core features running smoothly
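A fine-grained kill switch can be sketched as below (feature names and the fallback are illustrative, not Facebook's actual mechanism): when monitoring flags a performance problem, operators disable a feature and it degrades to a cheap fallback while the core keeps running.

```python
# Set of features currently switched off by operators.
DISABLED = set()


def feature_enabled(name):
    return name not in DISABLED


def render_sidebar():
    # Per-feature check: the expensive extra is skipped when disabled,
    # but the core sidebar still renders.
    if feature_enabled("friend_suggestions"):
        return "sidebar with friend suggestions"
    return "sidebar (suggestions temporarily off)"
```

The per-feature granularity is the point: one misbehaving feature can be shed without touching anything else.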
16. Keeping it running
• New features are launched 'dark', without visible elements, to stress-test the backend with real load
• Incremental roll-outs decrease the impact of a bug or malfunction
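Incremental roll-out is commonly done by stable user bucketing; a sketch under that assumption (the hashing scheme is illustrative, not Facebook's actual mechanism): each user hashes to a fixed bucket 0-99, and the feature is visible only below the configured percentage. In a dark launch the backend work runs for everyone while the visible-UI percentage stays at 0.

```python
import hashlib


def bucket(user_id):
    # Stable bucket 0-99 per user: the same user always lands in the
    # same bucket, so ramping the percentage only ever adds users.
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % 100


def rollout_enabled(user_id, percent):
    # percent=0 -> nobody sees it (dark launch UI), percent=100 -> everyone.
    return bucket(user_id) < percent
```

Because buckets are stable, raising the percentage from 5 to 50 to 100 exposes a strictly growing population, so a bug found at 5% affects only that slice.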
17. Open Source
• Most parts are open source
• Either used, or created in-house and then open-sourced
18. Big Bang
• On September 23, 2010, Facebook was down for most users for about 3 hours
• A cache value wrongly identified as 'invalid' led to requests hammering the DB tier
• A system designed to prevent failures created one!
• The only way to recover was to completely shut down access to the DB - downtime
https://www.facebook.com/note.php?note_id=431441338919&id=9445547199&ref=mf (great comments, too)