EEDC 2010. Scaling Web Applications

Seminar given as part of the CANS master's program at the Facultat d'Informàtica de Barcelona.
Anatomy of a Web application
Too many writes to the database: what can I do?
How can I take advantage of the "Cloud"?
Optimizing Facebook applications

  1. 6.1. Web Scale
     34330 EEDC: Execution Environments for Distributed Computing
     6.1.1. Anatomy of a service
     6.1.2. Too many Writes to Database
     6.1.3. Cheaper peaks
     6.1.4. Facebook Platform
     Master in Computer Architecture, Networks and Systems - CANS
  2. (Repeat of slide 1: title and agenda.)
  3. Anatomy of a Web Service
  4. Problems may arise in… various browsers, plugins, operating systems, performance, screen size, PEBKAC, etc.
  5. Problems may arise in… Internet partitioning, performance bottlenecks, packet loss, jitter.
  6. Problems may arise in… DDoS targeting another customer, routing problems, capacity, power/cooling problems, «lazy» remote hands.
  7. Problems may arise in… performance limits, bugs, configuration errors, faulty HW.
  8. Problems may arise in… network limits, interrupt limits, OS limits, bugs, configuration errors, faulty HW, error recovery.
  9. Problems may arise in… speed of clients, #threads, content not in sync, unresponsive apps, too many sources of content, user persistence, configuration errors, bugs.
  10. Problems may arise in… requests/sec. (Asset sizes in the slide's diagram: 100 KB, 5 MB, 50 KB, 5 KB, 50 KB, 50 KB.) The default configuration of Tomcat allows 200 threads/instance.
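A back-of-the-envelope calculation shows why a 200-thread cap matters; the 500 ms average response time below is an assumption for illustration, not a number from the slides:

```python
# Throughput ceiling of a thread-per-request server: threads / time-per-request.
# 200 threads matches Tomcat's default; the response time is assumed.
threads = 200
avg_response_s = 0.5  # hypothetical average time a thread is busy per request

max_requests_per_sec = threads / avg_response_s
print(max_requests_per_sec)  # 400.0 req/s before requests start queuing
```

Past that ceiling, incoming requests queue and latency climbs, regardless of spare CPU.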
  11. (Repeat of slide 9.)
  12. Problems may arise in… database concurrency, access to 3rd-party data (APIs), CPU- or memory-bound problems, datacenter replication, logging user actions.
  13. Problems may arise in… database concurrency, modifying schemas, massive tables -> indexes, disk performance, CPU/memory bound, datacenter replication.
  14. Problems may arise in… availability and performance, more than 24h to analyze daily logs, mail not reaching the Inbox (spam folders), surpassing monitoring capacity.
  15. (Agenda divider, repeating slide 1.)
  16. Too many writes to database
      There's no machine that can do 44k writes/sec over 1 TB of data.
      Scaling reads is easier:
        - Big cache
        - Replication
      On a write you have to:
        - Update the data
        - Update the transaction log
        - Update indexes
        - Invalidate caches
        - Replicate
        - Write to 2 or more disks (RAID x)
      http://www.scribd.com/doc/2592098/DVPmysqlucFederation-at-Flickr-Doing-Billions-of-Queries-Per-Day
  17. Case: Database Federation
      - Sharding per User-ID
      - Global Ring: knows where the data is
      - PHP logic to connect shards and keep the data consistent
      What's a shard? A horizontal partition of a table, usually by primary key.
      Benefits:
        - You can scale as long as you have budget
      Disadvantages:
        - You lose the ability to do any JOIN, COUNT, or RANGE across shards
        - Your application logic has to be shard-aware
        - If you want to rebalance shards, you need some kind of globally unique ID; beware of auto-increments
        - More services needing HA, BCP, change control, and so on
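A minimal sketch of sharding by user ID. Flickr resolves User_ID -> Shard_ID through a lookup ring rather than arithmetic; the modulo here is only the simplest illustration, and it also shows why rebalancing is painful (changing the shard count moves almost every user). Shard names are invented:

```python
# Hypothetical shard router: horizontal partitioning by primary key.
SHARDS = ["db-shard-1", "db-shard-2", "db-shard-3", "db-shard-4"]

def shard_for_user(user_id):
    """Return the shard holding all of this user's rows."""
    return SHARDS[user_id % len(SHARDS)]

# All of one user's data lives on one shard, so single-user queries are
# cheap; a JOIN across users on different shards is impossible.
print(shard_for_user(4242))  # db-shard-3
```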
  18. Case: Global Ring?
      Stores key-value pairs:
        - User_ID -> Shard_ID
        - Photo_ID -> User_ID
        - Group_ID -> Shard_ID
      Every data access has to know where the data lives -> memcached with a TTL of 30 minutes.
      Global IDs? You don't want two objects with the same ID!
      Strategies:
        - GUIDs: 128-bit IDs, so bigger indexes, and poorly supported by MySQL
        - Central auto-increment: a table where, for every ID needed, you do an insert and let MySQL take care of everything; at 60 photos/sec this becomes a BIG table
        - REPLACE INTO: a MySQL-only solution; small tables, and it allows for redundancy (one server provides odd IDs and another even ones)
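The cached Global Ring lookup can be sketched as follows; a plain dict stands in for memcached, and `ring_lookup_from_db` is a hypothetical placeholder for the authoritative mapping query:

```python
import time

# Sketch of the Global Ring lookup cached with a 30-minute TTL, so a
# moved user is re-resolved after at most 30 minutes.
TTL_SECONDS = 30 * 60
_cache = {}  # user_id -> (shard_id, expiry timestamp)

def ring_lookup_from_db(user_id):
    return user_id % 4  # placeholder for the real Global Ring query

def cached_shard_for_user(user_id):
    now = time.time()
    hit = _cache.get(user_id)
    if hit is not None and hit[1] > now:
        return hit[0]  # fresh cached mapping
    shard_id = ring_lookup_from_db(user_id)
    _cache[user_id] = (shard_id, now + TTL_SECONDS)
    return shard_id
```

The TTL trades freshness for load: a stale mapping can persist for up to 30 minutes, but the ring database only sees a small fraction of lookups.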
  19. Case: REPLACE INTO
      The Tickets64 schema looks like:
        CREATE TABLE `Tickets64` (
          `id` bigint(20) unsigned NOT NULL auto_increment,
          `stub` char(1) NOT NULL default '',
          PRIMARY KEY (`id`),
          UNIQUE KEY `stub` (`stub`)
        ) ENGINE=MyISAM
      SELECT * FROM Tickets64 returns a single row that looks something like:
        +-------------------+------+
        | id                | stub |
        +-------------------+------+
        | 72157623227190423 | a    |
        +-------------------+------+
      When they need a new globally unique 64-bit ID they issue the following SQL:
        REPLACE INTO Tickets64 (stub) VALUES ('a');
        SELECT LAST_INSERT_ID();
  20. Case: PHP logic
      - You lose any kind of inter-shard relational query (no JOINs)
      - You lose any kind of referential integrity (no foreign keys)
      - You have to handle distributed transactions yourself
      Example: you favorite a photo, so both your shard and the other user's shard must be updated:
        - Open 2 connections, one to each shard
        - Begin a transaction on both shards
        - Add the data
        - If everything is OK -> commit; else roll back and report an error
      So we improve scalability, but at the cost of code complexity and the performance of a single page view (hint: asynchronous database access).
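The two-shard Favorite update above can be sketched as a commit-both-or-roll-back-both pattern. Here `sqlite3` stands in for the two MySQL shards, and the table and column names are invented for the example:

```python
import sqlite3

# Illustrative cross-shard write: a Favorite touches two users' shards,
# so we write to both and only commit if both inserts succeed.
def add_favorite(shard_a_path, shard_b_path, fave_user, owner, photo_id):
    conns = [sqlite3.connect(shard_a_path), sqlite3.connect(shard_b_path)]
    try:
        for conn in conns:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS faves (user INTEGER, photo INTEGER)")
        conns[0].execute("INSERT INTO faves VALUES (?, ?)", (fave_user, photo_id))
        conns[1].execute("INSERT INTO faves VALUES (?, ?)", (owner, photo_id))
        for conn in conns:
            conn.commit()    # both shards succeed together...
    except sqlite3.Error:
        for conn in conns:
            conn.rollback()  # ...or neither does (best effort)
        raise
    finally:
        for conn in conns:
            conn.close()
```

Note this is not a true two-phase commit: a crash between the two commits can still leave the shards inconsistent, which is part of the complexity cost the slide warns about.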
  21. Case: They get an arbitrarily scalable infrastructure. They have marginally more complex code.
  22. Hai! I'm working!
  23. Case
      They get an arbitrarily scalable infrastructure. They have marginally more complex code. They "only" have 20 engineers, so scalability also means:
        - Roughly 2.5 million Flickr members per engineer
        - Roughly 200 million photos per engineer
        - 28 user-facing pages
        - 23 administrative pages
        - 20 API methods, though only 7.5 public API methods
        - 80 API calls per second
        - 250 CPUs
        - 850 annual deploys
        - 16 feature flags
  24. (Agenda divider, repeating slide 1.)
  25. Cheaper peaks
      If your capacity planning comes from the aggregate of all your customers and you plan to have thousands of them, what can you do? Your performance impacts your customers' brand (so you'll have problems), and you are a start-up without loads of money.
  26. Case: What does a recommendation engine look like?
  27. Case
      - Store data for every page view their customers get
      - Do MAGIC over millions of rows to calculate related items for YOU
      - Show recommendations to the user
      - Only 2 snippets of JavaScript/HTML
      - Less than 0.5 seconds per view
  28. Case: Option A
      - Every hit to the tracker becomes an insert into a MySQL instance sharded by customer
      - Every hit to the recommender recalculates the list of items to show, based on collective intelligence
      Benefits:
        - Straightforward to code and manage
        - Quick and easy for a proof of concept
      Disadvantages:
        - One customer at their peak could exceed the capacity of their MySQL instance
        - The same customer in their valley could be wasting money on an idle instance
        - Our webserver could be overloaded by the sum of all our customers
        - The recommender is a CPU and memory hog, and we would need too many servers to cope with our estimated demand
  29. Case: Option B
      - Every hit to the tracker becomes an insert into a MySQL instance sharded by customer
      - A cron job recalculates, in advance, the different sets of related items
      - Every hit to the recommender fetches the corresponding set of items from the DB
      Benefits:
        - Straightforward to code
        - The compute-intensive task is out of the critical path; it's asynchronous
      Disadvantages:
        - One customer at their peak could exceed the capacity of their MySQL instance
        - The same customer in their valley could be wasting money on an idle instance
        - Our webserver could be overloaded by the sum of all our customers
        - We have to watch what our cron jobs are doing, check for errors, and tune them so they don't bring down the database
  30. Case: Option C
      - Every hit to the tracker is just a static image file with various parameters: /a.gif?b=1&c=2&…
      - A cron job gathers the log files from the webservers, plus the items stored in the database, and recalculates, in advance, the different sets of related items
      - Every hit to the recommender fetches the corresponding set of items from the DB (sharded by customer)
      Benefits:
        - Straightforward to code; we only had to move and parse files
        - A surge in page views doesn't bring down the database with writes
        - The compute-intensive task is out of the critical path; it's asynchronous
      Disadvantages:
        - One customer at their peak could exceed the capacity of their MySQL instance
        - The same customer in their valley could be wasting money on an idle instance
        - We have to watch what our cron jobs are doing, check for errors, and tune them so they don't bring down the database
        - We could hit bandwidth limits
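The cron job in Option C boils down to extracting the query string of each /a.gif request from the access logs. A sketch, with an invented combined-log line:

```python
from urllib.parse import parse_qs, urlparse

# The tracker is just a static pixel, so each page view shows up in the
# access log as a GET for /a.gif with the event encoded in the query string.
def parse_tracker_hit(log_line):
    request = log_line.split('"')[1]       # e.g. 'GET /a.gif?b=1&c=2 HTTP/1.1'
    path = request.split()[1]
    url = urlparse(path)
    if url.path != "/a.gif":
        return None                        # not a tracker hit
    return {k: v[0] for k, v in parse_qs(url.query).items()}

line = '1.2.3.4 - - [01/Jan/2010] "GET /a.gif?b=1&c=2 HTTP/1.1" 200 43'
print(parse_tracker_hit(line))  # {'b': '1', 'c': '2'}
```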
  31. Case: Option D
      - Every hit to the tracker is just a static image file with various parameters: /a.gif?b=1&c=2&…
      - A cron job gathers the log files from the webservers, plus the items stored in the database, and recalculates, in advance, the different sets of related items
      - Every hit to the recommender fetches the corresponding set of items from the DB
      - Went the Hadoop/HBase way; no more sharding
      Benefits:
        - Easy to add and remove data servers on demand, so no waste or capacity limits here
        - A surge in page views only costs money, and as we get paid per page view, that's OK
        - The compute-intensive task is out of the critical path; it's asynchronous
      Disadvantages:
        - Beta software, with poor documentation/examples
        - More complexity in our infrastructure
        - We could hit bandwidth limits
  32. Case: Map/Reduce
      Hadoop: it's "only" a framework for running Map/Reduce applications on large clusters. It provides replication and fault tolerance (HW failure will be the norm) via a distributed file system, HDFS.
      Map/Reduce: in a map/reduce application there are two kinds of jobs, map and reduce:
        - Mappers read the HDFS blocks, do local processing, and run in parallel; from a webserver log file a mapper can emit <url, #hits>
        - Reducers get the output of many mappers and consolidate it; if there was one mapper per day, a reducer could calculate how many monthly hits a URL gets
      HBase: the Hadoop/MR design favors throughput over latency, so Hadoop is used as an analytical platform, but HBase allows low-latency random access to very big tables (billions of rows by millions of columns).
      Column-oriented DB: Table -> Row -> ColumnFamily -> Timestamp => Value
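The slide's <url, #hits> example can be mimicked in plain Python: one "mapper" per day's log emits per-URL counts, and the "reducer" consolidates them into monthly totals. The log contents are invented:

```python
from collections import Counter
from functools import reduce

# Map phase: each mapper processes one day's log independently (in Hadoop,
# one HDFS block) and emits <url, #hits> pairs for that day.
def map_daily_log(lines):
    return Counter(line.split()[0] for line in lines)

# Reduce phase: consolidate the mappers' outputs into monthly totals.
def reduce_counts(daily_counts):
    return reduce(lambda a, b: a + b, daily_counts, Counter())

day1 = ["/home ...", "/photo/1 ...", "/home ..."]
day2 = ["/home ...", "/photo/2 ..."]
monthly = reduce_counts([map_daily_log(day1), map_daily_log(day2)])
print(monthly["/home"])  # 3
```

The point of the split is that the map phase parallelizes trivially across machines; only the much smaller consolidated counts flow into the reducer.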
  33. (Repeat of slide 31: Option D.)
  34. Case: Option E
      - Every hit to the tracker is just a static image file with various parameters: /a.gif?b=1&c=2&…
      - A cron job gathers the log files from the webservers, plus the items stored in the database, and recalculates, in advance, the different sets of related items
      - Every hit to the recommender fetches the corresponding set of items from the DB
      - Went the Hadoop/HBase way; no more sharding
      - All static files served by a CDN
      Benefits:
        - Easy to add and remove data servers on demand, so no waste or capacity limits here
        - A surge in page views only costs money, and as we get paid per page view, that's OK
        - The compute-intensive task is out of the critical path; it's asynchronous
        - Unlimited bandwidth
      Disadvantages:
        - Beta software, with poor documentation/examples
        - More complexity in our infrastructure
  35. Case: CDN
      What's a Content Delivery Network?
        - Your server or HTTP repository (Amazon S3, …) is the origin of the content
        - The CDN gives you a DNS name (bb.cdn.net) and you create a CNAME to it (www.example.com -> bb.cdn.net.)
        - When a user asks for www.example.com, the CDN chooses which of its nodes is nearest to the user and hands out that node's IP address(es)
        - The user requests the content (/a.gif) from that CDN node, which sends a fresh copy if it has one; on a MISS it checks with its upstream caches, all the way back to your origin
      So we get unlimited bandwidth and better latency (we can't surpass the speed of light).
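The node's fresh-copy-or-MISS behavior can be sketched as a cache in front of an origin fetch; all names here are hypothetical, and the upstream cache hierarchy is collapsed into a single origin call:

```python
import time

# Toy CDN edge node: serve a fresh cached copy, otherwise go upstream
# (ultimately to the origin) and cache the response.
class EdgeNode:
    def __init__(self, origin_fetch, max_age_s=3600):
        self.origin_fetch = origin_fetch  # callable: path -> bytes
        self.max_age_s = max_age_s
        self.cache = {}                   # path -> (body, fetched_at)

    def get(self, path):
        hit = self.cache.get(path)
        if hit and time.time() - hit[1] < self.max_age_s:
            return hit[0], "HIT"
        body = self.origin_fetch(path)    # MISS: ask upstream/origin
        self.cache[path] = (body, time.time())
        return body, "MISS"

edge = EdgeNode(lambda path: b"pixel")
print(edge.get("/a.gif")[1])  # MISS
print(edge.get("/a.gif")[1])  # HIT
```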
  36. (Repeat of slide 34: Option E.)
  37. Case
      - They get a completely scalable infrastructure at AWS
      - They can provision a new cruncher, datastore, or recommender in a matter of minutes and remove it as soon as it's no longer needed
      - There is no upper limit on how many requests they can serve
      - All the requests that can impact the user experience of their customers are served by a CDN
      - As there are only 3 kinds of servers, all managed as images, they don't need as many engineers to take care of the infrastructure
  38. (Agenda divider, repeating slide 1.)
  39. Facebook Platform: If your primary data source is not under your control and it's too far away, what happens? An API case.
  40. Case: Duplicated gifts
  41. Case: Loving it. More «Pongos». Hitting the bullseye?
  42. Case
      - It's a social wish-list application
      - When you open it, it checks whether your friends have enabled the application and shows their wish lists
      - You can share your wish lists on Facebook
      - You can capture wishes (gifts) and be shown a feed of possible merchants
      - Initial loading time is critical
      - We expect virality, so we can't afford long response times
  43. Case: Flow
  44. Case: Nice, but slow: 3 to 7 seconds to load.
  45. Case: Define goals. Define metrics. Analyze metrics. Improve one at a time.
  46. Case: Goals: time to load < 1 second, and everything works.
  47. Case: Metrics
      Time to session setup:
        - Validating against Facebook
        - Getting friends' information
        - Lookups to the local database (lists, items, captured items)
      Time to load the «home» page:
        - Get HTML
        - Get widgets
        - Get JavaScripts
        - Get various graphic assets
  48. Case: Analyzing metrics
      Time to session setup:
        - Validating against Facebook (300 ms)
        - Getting friends' information (3 sec)
        - Lookups to the local database (lists, items, captured items) (30 ms)
      Time to load the «home» page:
        - Get HTML (400 ms)
        - Get widgets (300 ms)
        - Get JavaScripts (300 ms)
        - Get various graphic assets (500 ms)
  49. Case: Facebook access. (The slide shows the call flow before and after.) From 3 seconds to 500 ms!
  50. Case: Facebook access
      In ASP.NET we "only" have 12 threads/CPU, so only 12 concurrent requests: from 4 users/sec to 24/sec.
      We could use asynchronous calls, but there is little parallelism to exploit: if we don't know the result of GetAppUsers, we can't ask for GetUserInfo, so no speedup.
      We could raise the default #threads (.NET 4.0 defaults to 5000/CPU).
      We can get failure resiliency by adjusting timeouts and increasing threads, connections, and so on.
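The 4-to-24 users/sec jump follows directly from thread-pool arithmetic: 12 blocking threads divided by the per-request Facebook call time (3 s before the fix, 0.5 s after):

```python
# Throughput ceiling of a blocking thread pool: threads / time-per-request.
# 12 threads/CPU and the 3 s / 0.5 s call times are the slide's numbers.
threads = 12

before = threads / 3.0   # each request blocked ~3 s on the Facebook call
after = threads / 0.5    # ~500 ms after optimizing the access

print(before, after)  # 4.0 24.0
```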
  51. Case: Leveraging "free" tools
      - Set far-future Expires headers on static files: users leverage their browser's cache, and the load on the server side is lighter
      - Use a "free" CDN to get jQuery et al.: Microsoft and Google provide public, free repositories of JavaScript tools
      - Use CSS sprites: although graphic files are small, each needs a TCP connection to retrieve; combine most graphic assets into one big file and use CSS to select which one to show:
          #nav li a {background-image:url('../img/image_nav.gif')}
          #nav li a.item1 {background-position:0px 0px}
          #nav li a:hover.item1 {background-position:0px -72px}
  52. Case: more on sprites
      - Average size: 2 KB/file
      - HTTP/1.1 (RFC 2616) suggests that browsers download no more than 2 components in parallel per hostname
      - Small files don't use all the available bandwidth (TCP slow start…)
      - Latency also plays an important role
  53. About this session
      Sergi Morales, Founder & CTO of Expertos en TI
      Phone: +34 6688-XPNTI
      Email: sergi.morales+eedc@expertosenti.com
      Blog: http://blog.expertosenti.com
      Web: http://www.expertosenti.com
      Expertos en TI: We help Internet-oriented projects leverage all the research done by the big sites (Flickr, Facebook, Twitter, Salesforce, Google, and so on) so they can improve their bottom line and be prepared for growth.
  54. About the EEDC course
      34330 Execution Environments for Distributed Computing (EEDC), Master in Computer Architecture, Networks and Systems (CANS)
      Computer Architecture Department (AC), Universitat Politècnica de Catalunya - Barcelona Tech (UPC)
      ECTS credits: 6
      Instructor: Professor Jordi Torres
      Phone: +34 93 401 7223
      Email: torres@ac.upc.edu
      Office: Campus Nord, Modul C6, Room 217
      Web: http://www.JordiTorres.org
  55. 34330 EEDC: Execution Environments for Distributed Computing
      Sergi Morales, Founder & CTO
      T: 668897684
      E: sergi.morales@expertosenti.com
      L: www.linkedin.com/in/sergimorales
      Master in Computer Architecture, Networks and Systems - CANS
  56. Case
      - Asynchronous access to the Facebook API server
      - Expect to fail
      - For tables with that many rows, a key/value approach
      - Consistent hashing to load-balance data
      - Sticky servers?
