memcache@facebook (Marc Kwiatkowski, QCon, April 12, 2010)


  1. memcache@facebook Marc Kwiatkowski, memcache tech lead QCon, Monday, April 12, 2010
  2. How big is facebook?
  3. 400 million active users
  4. 400 million active users (growth chart)
  5. Objects ▪ More than million status updates posted each day ▪ /s ▪ More than billion photos uploaded to the site each month ▪ /s ▪ More than billion pieces of content (web links, news stories, blog posts, notes, photo albums, etc.) shared each week ▪ K/s ▪ Average user has 130 friends on the site ▪ Billion friend graph edges ▪ Average user clicks the Like button on pieces of content each month
  6. Infrastructure ▪ Thousands of servers in several data centers in two regions ▪ Web servers ▪ DB servers ▪ Memcache servers ▪ Other services
  7. The scale of memcache @ facebook ▪ Memcache Ops/s ▪ over M gets/sec ▪ over M sets/sec ▪ over T cached items ▪ over Tbytes ▪ Network IO ▪ peak rx Mpkts/s GB/s ▪ peak tx Mpkts/s GB/s
  8. A typical memcache server’s P.O.V. ▪ Network I/O ▪ rx Kpkts/s MB/s ▪ tx Kpkts/s MB/s ▪ Memcache OPS ▪ K gets/s ▪ K sets/s ▪ M items ▪ All rates are 1 day moving averages
  9. Evolution of facebook’s architecture
  10. • When Mark Zuckerberg and his roommates started Facebook in a Harvard dorm in 2004, they put everyone on one server • Then as Facebook grew, they could scale like a traditional site by just adding servers • Even as the site grew beyond Harvard to Stanford, Columbia and thousands of other campuses, each was a separate network that could be served on an isolated set of servers • But as people connected more between schools, the model changed, and the big change came when Facebook opened to everyone in Sept. 2006 • [For globe]: That led to people being connected everywhere around the world, not just on a single college campus. • [For globe]: This visualization shows accepted friend requests animating from requesting friend to accepting friend
  11.-14. (Animation builds; speaker notes identical to slide 10.)
  15. Scaling Facebook: Interconnected data Bob • On Facebook, the data required to serve your home page or any other page is incredibly interconnected • Your data can’t sit on one server or cluster of servers because almost every piece of content on Facebook requires information about your network of friends • And the average user has 130 friends • As we scale, we have to be able to quickly pull data across all of our servers, wherever it’s stored.
  16.-17. Scaling Facebook: Interconnected data (builds adding Brian and Felicia; speaker notes identical to slide 15.)
  18. Memcache Rules of the Game ▪ GET object from memcache ▪ on miss, query database and SET object to memcache ▪ Update database row and DELETE object in memcache ▪ No derived objects in memcache ▪ Every memcache object maps to persisted data in database (see the sketch below)
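These rules are the classic cache-aside pattern. A minimal sketch with the stock PHP Memcached extension, assuming hypothetical fetch_row_from_db / update_row_in_db helpers and an invented key scheme; this illustrates the rules, not Facebook's actual client:

```php
<?php
// Sketch of the "rules of the game" (cache-aside), not Facebook's client code.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

// Read path: GET from memcache; on miss, query the database and SET the object.
function cache_get(Memcached $mc, PDO $db, string $key): ?array {
    $obj = $mc->get($key);
    if ($obj !== false) {
        return $obj;                      // cache hit
    }
    $obj = fetch_row_from_db($db, $key);  // hypothetical DB helper
    if ($obj !== null) {
        $mc->set($key, $obj);             // populate the cache for the next reader
    }
    return $obj;
}

// Write path: update the database row, then DELETE (never SET) the cached copy.
function cache_update(Memcached $mc, PDO $db, string $key, array $fields): void {
    update_row_in_db($db, $key, $fields); // hypothetical DB helper
    $mc->delete($key);                    // next reader repopulates from the DB
}
```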
  19. Scaling memcache
  20.-21. Phatty Phatty Multiget (animation builds)
  22. Phatty Phatty Multiget (notes) ▪ PHP runtime is single threaded and synchronous ▪ To get good performance for data-parallel operations like retrieving info for all friends, it’s necessary to dispatch memcache get requests in parallel ▪ Initially we just used polling I/O in PHP ▪ Later we switched to true asynchronous I/O in a PHP C extension ▪ In both cases the result was reduced latency through parallelism (see the sketch below)
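With the stock PHP Memcached extension, this batched dispatch corresponds to getMulti(), which issues one request per server instead of one round trip per key. A small sketch; the friend ids and the profile: key scheme are invented:

```php
<?php
// Fetch short profiles for all of a user's friends in one parallel multiget,
// instead of looping over $mc->get() one friend at a time.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$friendIds = [101, 102, 103];                    // example friend ids
$keys = array_map(fn ($id) => "profile:$id", $friendIds);

$profiles = $mc->getMulti($keys) ?: [];          // one batched request per server

// Only the misses go back to the database.
$missing = array_diff($keys, array_keys($profiles));
```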
  23. Pools and Threads PHP Client
  24.-26. (Pool diagrams.) Different objects have different sizes and access patterns. We began creating memcache pools to segregate different kinds of objects for better cache efficiency and memory utilization.
  27. Pools and Threads (notes) ▪ Privacy objects are small but have poor hit rates ▪ User-profiles are large but have good hit rates ▪ We achieve better overall caching by segregating different classes of objects into different pools of memcache servers (see the sketch below) ▪ Memcache was originally a classic single-threaded unix daemon ▪ This meant we needed to run instances with / the RAM on each memcache server ▪ X the number of connections to each box ▪ X the meta-data overhead ▪ We needed a multi-threaded service
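From the PHP side, one way the pool split can look is a separate Memcached handle per pool, each with its own server list, with callers routing object classes to the right pool. A sketch under invented pool names and addresses (the slides don't show Facebook's actual pool configuration):

```php
<?php
// Segregate object classes into separate memcache pools (illustrative only).
$pools = [
    'privacy' => ['10.0.0.10', '10.0.0.11'],   // small objects, poor hit rate
    'profile' => ['10.0.1.10', '10.0.1.11'],   // large objects, good hit rate
];

$clients = [];
foreach ($pools as $name => $hosts) {
    $clients[$name] = new Memcached($name);    // one persistent handle per pool
    if (!count($clients[$name]->getServerList())) {
        foreach ($hosts as $host) {
            $clients[$name]->addServer($host, 11211);
        }
    }
}

// Route by object class rather than hashing everything into one big pool.
$privacy = $clients['privacy']->get('priv:42');
$profile = $clients['profile']->get('profile:42');
```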
  28. Connections and Congestion ▪ [animation]
  29. Connections and Congestion (notes) ▪ As we added web-servers the connections to each memcache box grew ▪ Each webserver ran - PHP processes ▪ Each memcache box has K+ TCP connections ▪ UDP could reduce the number of connections ▪ As we added users and features, the number of keys per-multiget increased ▪ Popular people and groups ▪ Platform and FBML ▪ We began to see incast congestion on our ToR switches.
  30. Serialization and Compression ▪ We noticed our short profiles weren’t so short ▪ K PHP serialized object ▪ fb-serialization ▪ based on thrift wire format ▪ X faster ▪ smaller ▪ gzcompress serialized strings (see the sketch below)
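fb-serialization itself is internal to Facebook, but the compression half of the idea can be sketched with stock PHP: serialize (the standard serializer stands in here for the thrift-based format) and gzcompress before the set, then reverse both on the get:

```php
<?php
// Compress large serialized objects before caching them (sketch only; the
// thrift-based fb-serialization format is not shown here).
function cache_set_compressed(Memcached $mc, string $key, $obj): bool {
    $blob = gzcompress(serialize($obj), 6);   // smaller payload on the wire and in RAM
    return $mc->set($key, $blob);
}

function cache_get_compressed(Memcached $mc, string $key) {
    $blob = $mc->get($key);
    return $blob === false ? null : unserialize(gzuncompress($blob));
}
```

The stock extension can also compress transparently via Memcached::OPT_COMPRESSION; per the slide, fb-serialization additionally replaces PHP's serializer with a smaller, faster thrift-based wire format.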
  31.-33. Multiple Datacenters (diagrams: SC and SF web tiers behind memcache proxies, SC and SF memcache tiers, a single SC MySQL tier)
  34. Multiple Datacenters (notes) ▪ In the early days we had two data-centers ▪ The one we were about to turn off ▪ The one we were about to turn on ▪ Eventually we outgrew a single data-center ▪ Still only one master database tier ▪ Rules of the game require that after an update we need to broadcast deletes to all tiers ▪ The mcproxy era begins
  35.-37. Multiple Regions (diagrams: West Coast SC/SF web and memcache tiers, East Coast VA web and memcache tier, memcache proxies in both regions, SC MySQL replicating to VA MySQL)
  38. Multiple Regions (notes) ▪ Latency to east coast and European users was/is terrible ▪ So we deployed a slave DB tier in Ashburn, VA ▪ The slave DB syncs with the master via the MySQL binlog ▪ This introduces a race condition ▪ mcproxy to the rescue again ▪ Add a memcache delete pragma to MySQL update and insert ops ▪ Added a thread to the slave mysqld to dispatch deletes on the east coast via mcproxy (see the sketch below)
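The slides don't show the pragma format, so the comment syntax below is invented; the sketch only illustrates the idea of tagging each write with the memcache keys it dirties so that the replica side can replay the deletes after the binlog event has applied:

```php
<?php
// Sketch of the "delete pragma" idea: annotate each UPDATE/INSERT with the
// memcache keys it invalidates. The comment format here is made up.
function with_delete_pragma(string $sql, array $keys): string {
    $pragma = 'MEMCACHE_DIRTY ' . implode(',', $keys);
    return $sql . ' /* ' . $pragma . ' */';
}

$sql = with_delete_pragma(
    "UPDATE profile SET name = 'Bob' WHERE user_id = 42",
    ['profile:42']
);
// The statement (comment included) travels through the MySQL binlog; a thread
// on the slave parses the comment and issues the deletes locally via mcproxy.
```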
  39.-44. Replicated Keys (animation builds: PHP clients and memcache servers, with a hot key fanned out as key# aliases across the servers)
  45. Replicated Keys (notes) ▪ Viral groups and applications cause hot keys ▪ More gets than a single memcache server can process ▪ (Remember the rules of the game!) ▪ That means more queries than a single DB server can process ▪ That means that group or application is effectively down ▪ Creating key aliases allows us to add server capacity ▪ Hot keys are published to all web-servers ▪ Each web-server picks an alias for gets ▪ get key:xxx => get key:xxx#N ▪ Each web-server deletes all aliases
  46. Memcache Rules of the Game ▪ New Rule ▪ If a key is hot, pick an alias and fetch that for reads ▪ Delete all aliases on updates (see the sketch below)
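A sketch of how a web server might apply the new rule, assuming an invented hot-key table and per-key alias count (the slides say the real hot-key list is published to all web servers):

```php
<?php
// Sketch of hot-key aliasing: reads are spread over key#1..key#N and writes
// delete every alias. The hot-key table and replica count are invented here.
$hotKeys = ['group:1234' => 8];   // key => number of aliases

function aliased_get(Memcached $mc, array $hotKeys, string $key) {
    if (isset($hotKeys[$key])) {
        $key .= '#' . random_int(1, $hotKeys[$key]);   // each request picks one alias
    }
    return $mc->get($key);                             // a miss falls back to the DB as usual
}

function aliased_delete(Memcached $mc, array $hotKeys, string $key): void {
    $mc->delete($key);
    for ($i = 1, $n = $hotKeys[$key] ?? 0; $i <= $n; $i++) {
        $mc->delete($key . '#' . $i);                  // invalidate every replica on update
    }
}
```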
  47. Mirrored Pools (diagram: specialized replica pools alongside a general pool with wide fanout across shards 1..n)
  48. Mirrored Pools (notes) ▪ As our memcache tier grows the ratio of keys/packet decreases ▪ keys/ server = packet ▪ keys/ server = packets ▪ More network traffic ▪ More memcache server kernel interrupts per request ▪ Confirmed Info - critical account meta-data ▪ Have you confirmed your account? ▪ Are you a minor? ▪ Pulled from large user-profile objects ▪ Since we just need a few bytes of data for many users (see the sketch below)
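A possible shape for the confirmed-info replica, sketched with invented pool wiring, key names, and a packed two-byte record; the point is that multigets for many users move a few bytes each instead of full profile objects:

```php
<?php
// Sketch: keep a few bytes of "confirmed info" per user in a small mirrored
// pool instead of dragging the full profile object around.
$confirmedPool = new Memcached('confirmed-info');
if (!count($confirmedPool->getServerList())) {
    $confirmedPool->addServer('10.0.2.10', 11211);
}

// Store only the bits we need, packed tightly.
$confirmedPool->set('cinfo:42', pack('CC', 1 /* confirmed */, 0 /* minor */));

// Checking many users touches far fewer packets than multigetting full profiles.
$flags = $confirmedPool->getMulti(['cinfo:42', 'cinfo:43', 'cinfo:44']) ?: [];
foreach ($flags as $key => $packed) {
    [$confirmed, $minor] = array_values(unpack('Cconfirmed/Cminor', $packed));
}
```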
  49. Hot Misses ▪ [animation]
  50. Hot Misses (notes) ▪ Remember the rules of the game ▪ update and delete ▪ miss, query, and set ▪ When the object is very, very popular, that query rate can kill a database server ▪ We need flow control!
  51. Memcache Rules of the Game ▪ For hot keys, on miss grab a mutex before issuing the db query ▪ memcache-add a per-object mutex ▪ key:xxx => key:xxx#mutex ▪ If add succeeds, do the query ▪ If add fails (because the mutex already exists), back off and try again ▪ After the set, delete the mutex (see the sketch below)
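A sketch of this rule using Memcached::add as the mutex; the timeout, retry count, and back-off below are invented, and fetch_row_from_db is the same hypothetical DB helper used earlier:

```php
<?php
// Hot-miss rule: on a miss, take a per-object mutex with memcache add() before
// hitting the database; losers back off and re-check the cache.
function get_with_mutex(Memcached $mc, PDO $db, string $key) {
    for ($attempt = 0; $attempt < 10; $attempt++) {
        $obj = $mc->get($key);
        if ($obj !== false) {
            return $obj;                               // hit (possibly set by the winner)
        }
        if ($mc->add($key . '#mutex', 1, 3)) {         // only one process wins the add
            $obj = fetch_row_from_db($db, $key);       // hypothetical DB helper
            if ($obj !== null) {
                $mc->set($key, $obj);
            }
            $mc->delete($key . '#mutex');              // release the mutex after the set
            return $obj;
        }
        usleep(50000);                                 // lost the race: back off and retry
    }
    return fetch_row_from_db($db, $key);               // give up on the lock, go to the DB
}
```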
  52. Hot Deletes ▪ [hot groups graphics]
  53. Hot Deletes (notes) ▪ We’re not out of the woods yet ▪ Cache mutex doesn’t work for frequently updated objects ▪ like membership lists and walls for viral groups and applications ▪ Each process that acquires a mutex finds that the object has been deleted again ▪ ...and again ▪ ...and again
  54. Rules of the Game: Caching Intent ▪ Each memcache server is in the perfect position to detect and mitigate contention ▪ Record misses ▪ Record deletes ▪ Serve stale data ▪ Serve lease-ids ▪ Don’t allow updates without a valid lease-id (see the sketch below)
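The real lease mechanism lives inside Facebook's modified memcache server, so there is no public client API to show. The sketch below only emulates the intent client-side with standard get/add/set/delete and invented #lease/#stale keys: the miss that wins the lease token is the only one whose set is honored, and everyone else serves slightly stale data instead of stampeding the database.

```php
<?php
// Rough client-side emulation of the lease idea, not Facebook's server-side
// implementation.
function lease_get(Memcached $mc, string $key): array {
    $obj = $mc->get($key);
    if ($obj !== false) {
        return ['value' => $obj, 'lease' => null];
    }
    $token = bin2hex(random_bytes(8));
    if ($mc->add("$key#lease", $token, 10)) {
        return ['value' => null, 'lease' => $token];   // caller should query the DB
    }
    $stale = $mc->get("$key#stale");                   // someone else holds the lease:
    return ['value' => $stale === false ? null : $stale, 'lease' => null];  // serve stale
}

function lease_set(Memcached $mc, string $key, $value, string $token): bool {
    if ($mc->get("$key#lease") !== $token) {
        return false;                                  // lease expired or stolen: drop the set
    }
    $mc->set($key, $value);
    $mc->set("$key#stale", $value);                    // keep a stale copy for future misses
    $mc->delete("$key#lease");
    return true;
}
```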
  55. Next Steps
  56. Shaping Memcache Traffic ▪ mcproxy as router ▪ admission control ▪ tunneling inter-datacenter traffic
  57. Cache Hierarchies ▪ Warming up Cold Clusters ▪ Proxies for Cacheless Clusters
  58. Big Low Latency Clusters ▪ Bigger Clusters are Better ▪ Low Latency is Better ▪ L . ▪ UDP ▪ Proxy
  59. Worse IS better ▪ Richard Gabriel’s famous essay contrasted ▪ ITS and Unix ▪ LISP and C ▪ MIT and New Jersey ▪ http://www.jwz.org/doc/worse-is-better.html
  60. Why Memcache Works ▪ Uniform, low latency with partial results is a better user experience ▪ memcache provides a few robust primitives ▪ key-to-server mapping ▪ parallel I/O ▪ flow-control ▪ traffic shaping ▪ that allow ad hoc solutions to a wide range of scaling issues ▪ We started with simple, obvious improvements. As we grew we deployed less obvious improvements... But they’ve remained pretty simple
  61. (c) Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved.
