Sudden Impact - Designing LAMP Applications for High Loads

Learn to design and optimize web applications that won't buckle under the stress of high concurrency. Includes a great tutorial on Memcache and utilizing different caching policies for different scenarios. Also includes some basic optimization techniques for data structure design.

Published in: Technology

  1. 1. Sudden Impact Designing LAMP Applications for High Loads
  2. 2. In this Presentation <ul><li>What are we trying to do? </li></ul><ul><li>Preparing to design a high-load app </li></ul><ul><li>Understanding the layers of your application </li></ul><ul><li>Designing an optimized data structure </li></ul><ul><li>Choosing a caching policy </li></ul><ul><li>Dealing with hotspots </li></ul><ul><li>Bringing it all together </li></ul>
  3. 3. Who am I? <ul><li>C. Filipe Medeiros </li></ul><ul><li>Senior PHP Engineer at Gaia Interactive Inc. since July 2006 </li></ul><ul><li>Focus on Events and Sponsorships </li></ul>
  4. 4. What is a Gaia Event? <ul><li>Introduce temporary gameplay or behavior to the site for a short period of time </li></ul><ul><li>High profile, high load applications </li></ul><ul><li>A true test of scalability </li></ul>
  5. 5. Example: Gaia Prom <ul><li>7 day long event </li></ul><ul><li>DDR-style prom game w/ leaderboards </li></ul><ul><li>530,000 players </li></ul><ul><li>5 million plays </li></ul><ul><li>For the Dance Game alone: </li></ul><ul><ul><li>3-5 million INSERTs </li></ul></ul><ul><ul><li>20-25 million UPDATEs </li></ul></ul><ul><li>300k event forum posts </li></ul><ul><li>Relatively low-load event </li></ul>
  6. 6. Example: Gaia Prom – Flash Hangouts
  7. 7. Example: Gaia Prom - Dance Game
  8. 8. What are we trying to do?
  9. 9. The Problem <ul><li>High load applications tax important systems causing bottlenecks that not only affect the application itself, but the rest of your site: </li></ul><ul><ul><li>Databases (systems, particular tables, and even particular rows) </li></ul></ul><ul><ul><li>Firewalls & load balancers </li></ul></ul><ul><ul><li>Web, Database, and Graphic servers (CPUs and storage space) </li></ul></ul><ul><ul><li>Bandwidth limitations </li></ul></ul>
  10. 10. Possible Characteristics of a High Load Web Application <ul><li>High-profile applications </li></ul><ul><li>Lots of data processed / created / updated, for example: </li></ul><ul><ul><li>frequent database queries </li></ul></ul><ul><ul><li>image compilation </li></ul></ul><ul><ul><li>generating/sending mass e-mails </li></ul></ul><ul><li>Lots of requests: database, http, etc. </li></ul><ul><li>Large number of end-users </li></ul><ul><li>Recruits many subsystems </li></ul>
  11. 11. Our Example: Gaia’s Halloween Hysteria Event <ul><li>Running from Oct 23rd to Oct 31st </li></ul><ul><li>Web-based multiplayer RPG-style battle-royale forum game </li></ul><ul><li>Tens of thousands of users causing 2-4 data updates every 1–10 seconds for event gameplay alone </li></ul><ul><li>Potentially large amounts of processing by PHP and MySQL </li></ul>
  12. 12. Choosing a Team
  13. 13. The Event Page
  14. 14. Forums / Battlegrounds
  15. 15. Attacking / Aiding Another Player
  16. 16. Attacking / Aiding Another Player (cont…)
  17. 17. Preparing to Design an Optimized App or “Predicting the Future for Fun and Profit”
  18. 18. Preparing to Design an Optimized App: Gathering Requirements & Use Cases <ul><li>Collect as much data as you can on how the app might be used. </li></ul><ul><li>Your app will evolve, but early research and planning will help you avoid headaches later </li></ul><ul><li>Rookie Mistake: Designing on assumption </li></ul>
  19. 19. Preparing to Design an Optimized App: Gathering Use Cases A Simple Example From the Event <ul><li>When a player attempts a game action: </li></ul><ul><ul><li>If they have not waited long enough since their last action, throw an error </li></ul></ul><ul><ul><li>Else, allow them to do the action, and store the current timestamp for the next check </li></ul></ul><ul><ul><li>Show the user how much time until they can act again </li></ul></ul>
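The simple use case above can be sketched in a few lines. This is only an illustration in Python, not the event's actual PHP code: the `attempt_action` helper, the dictionary-based player record, and the 10-second cooldown are assumptions (the slides only say actions happen every 1 to 10 seconds), and the sketch returns a flag where the slide says to throw an error. The `next_action_timestamp` field name comes from the rough data structure in the deck.

```python
import time

COOLDOWN_SECONDS = 10  # assumed value, not taken from the slides


def attempt_action(player, now=None):
    """Return (allowed, seconds_to_wait) and update the player's timer.

    `player` is a plain dict standing in for a Halloween_Player row.
    """
    now = time.time() if now is None else now
    remaining = player["next_action_timestamp"] - now
    if remaining > 0:
        # Too soon: reject, and tell the UI how long the user must wait.
        return False, remaining
    # Allowed: store when the next action becomes possible.
    player["next_action_timestamp"] = now + COOLDOWN_SECONDS
    return True, COOLDOWN_SECONDS


player = {"next_action_timestamp": 0}
print(attempt_action(player, now=100))  # first action is allowed
print(attempt_action(player, now=104))  # rejected, with the time left to wait
```

Returning the remaining time from the same call lets the page render the countdown without a second lookup.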
  20. 20. Preparing to Design an Optimized App: Gathering Use Cases A More Complex Example From the Event <ul><li>When a player chooses to use an ability against an enemy </li></ul><ul><ul><li>Player clicks “Attack” button next to an enemy target </li></ul></ul><ul><ul><li>Check for action timeout </li></ul></ul><ul><ul><li>Player chooses an ability and hits the “Do it” button to execute it </li></ul></ul><ul><ul><li>If the ability costs Energy to use, deduct this from the Player </li></ul></ul><ul><ul><li>Calculate the damage the ability caused (taking into account the attacker’s attack strength vs. the defender’s defense strength) </li></ul></ul><ul><ul><li>If damage is positive, apply this damage to the defender </li></ul></ul><ul><ul><li>If the defender dropped below 0 health, flag them as dead </li></ul></ul><ul><ul><li>Add Experience points to the attacker based on the attack used. Add 5 points of experience if they also managed to kill the defender. </li></ul></ul><ul><ul><li>If the attacker reaches a new level as a result of gaining experience, grant them the item and new abilities for the next level </li></ul></ul>
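Writing the steps of this use case as code shows how many fields a single click touches. The Python sketch below is a hedged illustration: the damage formula, the `power`, `energy_cost`, and `xp_reward` ability fields, the `is_dead` flag, and the upfront energy check are all assumptions, since the slides only name the inputs. The 5-point kill bonus and the "below 0 health" rule are taken from the slide; the level-up step is left out.

```python
KILL_BONUS_XP = 5  # from the slide: 5 extra experience points for a kill


def resolve_attack(attacker, defender, ability):
    """Apply one ability use and return the damage dealt (mutates both dicts)."""
    # Assumed validation: reject the action before anything is changed.
    if attacker["current_energy"] < ability["energy_cost"]:
        raise ValueError("not enough energy")
    attacker["current_energy"] -= ability["energy_cost"]

    # Assumed formula: ability power plus attack strength, minus defense strength.
    damage = (ability["power"] + attacker["attack_strength"]
              - defender["defense_strength"])
    if damage > 0:
        defender["current_health"] -= damage
    else:
        damage = 0

    attacker["experience"] += ability["xp_reward"]
    if defender["current_health"] < 0:
        defender["is_dead"] = True
        attacker["experience"] += KILL_BONUS_XP
    return damage


attacker = {"current_energy": 10, "attack_strength": 8, "experience": 0}
defender = {"current_health": 5, "defense_strength": 3}
fireball = {"energy_cost": 4, "power": 2, "xp_reward": 3}
print(resolve_attack(attacker, defender, fireball), attacker, defender)
```

Even in this reduced form, one action updates two player records and four or five fields, which is the kind of count the usage estimates later in the deck depend on.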
  21. 21. Preparing to Design an Optimized App: Roughing the Data Structure <ul><li>Decide what data you need </li></ul><ul><li>Sketch out a rough data structure </li></ul><ul><li>Use cases, access patterns, estimated usage, and other tools will help us refine this </li></ul>
  22. 22. Preparing to Design an Optimized App: Roughing the Data Structure Examples From the Event <ul><li>Halloween_Player </li></ul><ul><ul><li>user_id (int, primary) </li></ul></ul><ul><ul><li>level (int) </li></ul></ul><ul><ul><li>current_faction_id (int) </li></ul></ul><ul><ul><li>current_location_id (int) </li></ul></ul><ul><ul><li>experience (int) </li></ul></ul><ul><ul><li>current_health (int) </li></ul></ul><ul><ul><li>max_health (int) </li></ul></ul><ul><ul><li>current_energy (int) </li></ul></ul><ul><ul><li>max_energy (int) </li></ul></ul><ul><ul><li>attack_strength (int) </li></ul></ul><ul><ul><li>defense_strength (int) </li></ul></ul><ul><ul><li>next_action_timestamp (timestamp) </li></ul></ul><ul><li>Halloween_Log </li></ul><ul><ul><li>id (int, primary) </li></ul></ul><ul><ul><li>user_id (int) </li></ul></ul><ul><ul><li>origin_user_id (int) </li></ul></ul><ul><ul><li>action_type (int) </li></ul></ul><ul><ul><li>action_value (int) </li></ul></ul><ul><ul><li>action_timestamp (timestamp) </li></ul></ul>
  23. 23. Preparing to Design an Optimized App: Roughing Your Interfaces <ul><li>Put together a rough list of pages that need to be mocked, and the data that is going to be read/shown/updated/inserted when using that interface. </li></ul><ul><li>Useful for: </li></ul><ul><ul><li>UI designers </li></ul></ul><ul><ul><li>Getting a jumpstart on the business logic </li></ul></ul><ul><ul><li>Understanding how often/where data is getting accessed </li></ul></ul>
  24. 24. Preparing to Design an Optimized App: Roughing Your Interfaces Examples From the Event <ul><li>Main Event Page </li></ul><ul><ul><li>Halloween_Player Data: Level, Faction, Current/Max Health, Current/Max Energy, Experience Points, Attack/Defense Strength </li></ul></ul><ul><ul><li>Halloween_Log Data: A list of the last 15 log entries </li></ul></ul><ul><ul><li>User Data: A list of usernames for each user who created a log entry for this user </li></ul></ul><ul><ul><li>Game Forum Map </li></ul></ul><ul><ul><li>Link to Comic </li></ul></ul><ul><ul><li>Link to Trick or Treating </li></ul></ul><ul><ul><li>Link to How to Play </li></ul></ul><ul><li>Action Select Page </li></ul><ul><ul><li>Halloween_Player Data: Energy of the attacker, Time until next action, Health of defender </li></ul></ul><ul><ul><li>Halloween_Ability Data: A list of abilities available to the attacker </li></ul></ul><ul><ul><li>User Data: The username of the defender </li></ul></ul><ul><ul><li>A button to initiate the action </li></ul></ul><ul><li>Forum HUD </li></ul><ul><ul><li>Halloween_Player Data: Level, Faction, Current/Max Health, Current/Max Energy, a list of 15 targets in that location </li></ul></ul><ul><ul><li>User Data: The usernames of each user in the target list </li></ul></ul><ul><ul><li>Buttons to attack/aid each enemy/ally </li></ul></ul>
  25. 25. Preparing to Design an Optimized App: Understand the Layers of Your Application <ul><li>All layers of your app are places to spread load as well as potential hotspots </li></ul><ul><li>Identify them and how they will relate/interact with your application </li></ul><ul><ul><li>Client (Web Browser / Flash) </li></ul></ul><ul><ul><li>Database (e.g., MySQL, Postgres) </li></ul></ul><ul><ul><li>Memcache </li></ul></ul><ul><ul><li>PHP / Web Servers </li></ul></ul>
  26. 26. Preparing to Design an Optimized App: Identify Involved Systems <ul><li>If you’re designing a web app, your application might utilize existing domain objects and systems from around your site, e.g., systems to manage: </li></ul><ul><ul><li>Authentication / Authorization </li></ul></ul><ul><ul><li>User management </li></ul></ul><ul><ul><li>Forum management </li></ul></ul><ul><ul><li>Friend management </li></ul></ul><ul><ul><li>Private messaging </li></ul></ul><ul><ul><li>User Generated Content </li></ul></ul><ul><ul><li>Instant Messaging </li></ul></ul><ul><ul><li>Etc. </li></ul></ul><ul><li>Understand the extent to which these systems will be used </li></ul>
  27. 27. Preparing to Design an Optimized App: Identify Involved Systems Examples From the Event <ul><li>User Manager: Shared class for caching, retrieving, and updating user data. </li></ul><ul><li>Forums: The game creates load on the forums and encourages forum posting, which can be a database-intensive task. </li></ul><ul><li>Avatar Compiler: When you choose a faction, we recompile your avatar to have that faction’s “skin”. This can be a very hefty process to do on a large scale. </li></ul>
  28. 28. Preparing to Design an Optimized App: Estimating Potential Usage <ul><li>Estimate number of clients for your app (whether end-users, cronjobs, etc.) </li></ul><ul><li>Estimate the number of INSERTs and UPDATEs for each of your data structures </li></ul><ul><li>Estimate the total size (in rows) of each of your tables and indexes </li></ul><ul><li>Consider the number of round-trips you’re going to be making (to cache, web servers, graphics servers, database servers, etc.) </li></ul><ul><li>Good rule of thumb: The fewer round trips, the better! </li></ul>
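A back-of-the-envelope calculation is often enough for this kind of estimate. The Python sketch below uses the figures quoted earlier for the Halloween event (2-4 data updates per action, one action every 1-10 seconds); the count of 10,000 simultaneously active users is an assumption chosen only to make the arithmetic concrete, and the `writes_per_second` helper is hypothetical.

```python
def writes_per_second(active_users, updates_per_action, seconds_between_actions):
    """Rough write rate if every active user acts at a steady pace."""
    return active_users * updates_per_action / seconds_between_actions


ACTIVE_USERS = 10_000  # assumed; the slides only say "tens of thousands of users"

# Best case: 2 updates per action, one action every 10 seconds.
low = writes_per_second(ACTIVE_USERS, 2, 10)
# Worst case: 4 updates per action, one action every second.
high = writes_per_second(ACTIVE_USERS, 4, 1)

print(f"{low:,.0f} to {high:,.0f} row updates per second")
```

The spread between the two bounds is wide, but even the low end is enough to flag the player table as a likely hotspot before any code is written.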
  29. 29. Preparing to Design an Optimized App: Ultimate Goal: Identify Potential Hotspots <ul><li>Now that you have a very rough snapshot of how your application is going to be used, you can begin to identify areas where bottlenecks may occur. </li></ul>
  30. 30. Preparing to Design an Optimized App: Identify Potential Hotspots Examples From the Event <ul><li>We knew that Avatar Recompiling would happen fairly frequently, and since we’d be putting unusual load on this system, it would have to be dealt with. </li></ul><ul><li>Getting relevant targets at the top of the forum list would require a very hefty query that needed to be done fairly frequently. We knew we needed to carefully plan indexes and a good caching strategy to prevent those queries from killing our databases. </li></ul>
  31. 31. Understanding The Layers of Your Application or “Caramelizing the Electric Onion”
  32. 32. Understanding The Layers of Your Application: Overview (top to bottom: Client [Web Browser / Flash], PHP / Web Servers, Memcache, Database)
  33. 33. Understanding The Layers of Your Application: Overview (top to bottom: Client [Web Browser / Flash], PHP / Web Servers, Memcache, Database)
  34. 34. Understanding The Layers of Your Application: The Client <ul><li>At the highest level of an application is the Client and Frontend (e.g., Web Browser, Flash) </li></ul><ul><ul><li>PROS: </li></ul></ul><ul><ul><ul><li>Any work done here happens on the client’s machine </li></ul></ul></ul><ul><ul><ul><li>No chance for collision with processes happening on other clients </li></ul></ul></ul><ul><ul><ul><li>First line of defense against load by potentially preventing requests to the backend </li></ul></ul></ul><ul><ul><li>CONS: </li></ul></ul><ul><ul><ul><li>Data coming from the client can’t be trusted </li></ul></ul></ul><ul><ul><ul><li>Limited processing potential for many high-level apps </li></ul></ul></ul>
  35. 35. Understanding The Layers of Your Application: The Client Examples From the Event <ul><li>Showed action timers before users could take their next action to reduce number of page reloads. </li></ul><ul><li>Preloaded data in hidden divs on pages (even if it will go unviewed) to prevent multiple page requests. </li></ul><ul><li>Designed interfaces to require as few page-loads as possible (i.e., reduce multi-interface processes) </li></ul>
  36. 36. Understanding The Layers of Your Application: Overview (top to bottom: Client [Web Browser / Flash], PHP / Web Servers, Memcache, Database)
  37. 37. Understanding The Layers of Your Application: PHP / Web Servers <ul><li>The next in line is your PHP code </li></ul><ul><ul><li>PROS: </li></ul></ul><ul><ul><ul><li>Work done here is relatively fast </li></ul></ul></ul><ul><ul><ul><li>Little chance of collisions with other PHP processes (not including DB / Memcache / Filesystem access) </li></ul></ul></ul><ul><ul><li>CONS: </li></ul></ul><ul><ul><ul><li>Relies heavily on best practices </li></ul></ul></ul><ul><ul><ul><li>Common place to overload memory </li></ul></ul></ul><ul><ul><ul><li>Common place for generating fatal errors </li></ul></ul></ul>
  38. 38. Understanding The Layers of Your Application: PHP / Web Servers Examples From the Event <ul><li>Define small, infrequently changing data sets in your PHP rather than at the database level (e.g., action types, level progression data) </li></ul><ul><li>Join smaller data sets at the PHP layer rather than doing cross-server joins at the Database layer (e.g. joining User data to Halloween_Player rows) </li></ul><ul><li>Validate as much as possible upfront to avoid querying the database only to have an exception occur down the line. </li></ul>
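The second point, joining small data sets in application code instead of across database servers, can be sketched as follows. This is an illustrative Python version rather than the PHP the event used; `join_players_with_users` and the sample rows are hypothetical, and the two lists stand in for the results of two separate single-table queries.

```python
def join_players_with_users(player_rows, user_rows):
    """Attach usernames to player rows in application code (inner-join semantics)."""
    users_by_id = {user["user_id"]: user for user in user_rows}
    joined = []
    for player in player_rows:
        user = users_by_id.get(player["user_id"])
        if user is None:
            continue  # no matching user row: drop it, as an inner join would
        joined.append({**player, "username": user["username"]})
    return joined


players = [{"user_id": 1, "level": 4}, {"user_id": 2, "level": 7}]
users = [{"user_id": 1, "username": "alice"}, {"user_id": 2, "username": "bob"}]
print(join_players_with_users(players, users))
```

Building the lookup dictionary once keeps the join linear in the number of rows, which is cheap for result sets of the size shown on a single page.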
  39. 39. Understanding The Layers of Your Application: Overview (top to bottom: Client [Web Browser / Flash], PHP / Web Servers, Memcache, Database)
  40. 40. Understanding The Layers of Your Application: Memcache <ul><li>The final line of defense against load on the database is Memcache </li></ul><ul><ul><li>PROS: </li></ul></ul><ul><ul><ul><li>Fairly fast </li></ul></ul></ul><ul><ul><ul><li>Extremely versatile, and a great solution to a number of optimization problems </li></ul></ul></ul><ul><ul><li>CONS: </li></ul></ul><ul><ul><ul><li>Most efficient uses of it require careful management </li></ul></ul></ul><ul><ul><ul><li>Each write/read from Memcache requires a trip between servers </li></ul></ul></ul>
  41. 41. Understanding The Layers of Your Application: Overview (top to bottom: Client [Web Browser / Flash], PHP / Web Servers, Memcache, Database)
  42. 42. Understanding The Layers of Your Application: Database <ul><li>Often the most problematic and difficult to optimize aspect of your application, the database: </li></ul><ul><ul><li>PROS: </li></ul></ul><ul><ul><ul><li>DB systems tend to be extensively developed to do complex data retrieval operations as fast as possible </li></ul></ul></ul><ul><ul><li>CONS: </li></ul></ul><ul><ul><ul><li>Requires knowledge of best practices and good foresight of access patterns </li></ul></ul></ul><ul><ul><ul><li>Most difficult layer across which to distribute load </li></ul></ul></ul><ul><ul><ul><li>Most potential for user actions to collide and interrupt each other’s actions at this layer </li></ul></ul></ul>
  43. 43. Maximizing Memcache’s Potential or “Why Haven’t I Been Using This For Years?”
  44. 44. A Quick Review of Memcache <ul><li>“memcached” is a distributed memory object caching system. It consists of: </li></ul><ul><ul><li>The memcached daemon, which can run across multiple servers </li></ul></ul><ul><ul><li>PHP’s memcache interface for writing to and reading from memcache </li></ul></ul>
  45. 45. Caching Policies and You <ul><li>Memcache is intentionally very primitive, which allows PHP developers to leverage it for some tremendous tasks. </li></ul><ul><li>There are numerous ways to use Memcache effectively depending on the access patterns of the data you’re caching. </li></ul>
  46. 46. Choosing a Caching Policy <ul><li>Things to consider for a particular dataset or process: </li></ul><ul><ul><li>Data Freshness – how current does read data need to be? </li></ul></ul><ul><ul><li>Risk / Risk Window – how bad would losing data be? </li></ul></ul><ul><ul><li>Load Displacement – how much of a benefit does it actually have in reducing load? </li></ul></ul>
  47. 47. Caching Policy #1: No Write Allocation
  48. 48. Caching Policy #1: No Write Allocation <ul><li>Advantages: </li></ul><ul><ul><li>Requires very little management, easy to maintain </li></ul></ul><ul><ul><li>Often sufficient for read-only data </li></ul></ul><ul><li>Disadvantages: </li></ul><ul><ul><li>Potential for very stale data </li></ul></ul><ul><ul><li>Inappropriate for frequently changing data sets </li></ul></ul><ul><ul><li>Suffers from “Thundering Herd” problems </li></ul></ul>
  49. 49. Caching Policy #1: No Write Allocation Examples from the Event <ul><li>Ability Data </li></ul><ul><ul><li>Almost never changes </li></ul></ul><ul><ul><li>Low risk window: 30 minutes is acceptable for staleness, no major consequences if stale data is used </li></ul></ul><ul><ul><li>Low risk if data is stale: ability descriptions and energy costs aren’t adjusted </li></ul></ul><ul><ul><li>Read-only: Never written/updated via the app </li></ul></ul>
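The diagram for this policy did not survive in the transcript, so the sketch below is a reading of it from the advantages and disadvantages listed: reads populate the cache with a fixed lifetime, and writes go to the database without touching the cache. It is written in Python with a tiny in-process `TTLCache` class standing in for memcached; that class, the `get_ability` and `update_ability` helpers, and the dictionary used as a database are all hypothetical. The 30-minute lifetime is the staleness window quoted for ability data.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for memcached (illustration only)."""

    def __init__(self):
        self._items = {}

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._items.get(key)
        if entry is None or entry[1] <= now:
            return None  # missing or expired
        return entry[0]

    def set(self, key, value, ttl, now=None):
        now = time.time() if now is None else now
        self._items[key] = (value, now + ttl)


ABILITY_TTL = 30 * 60  # 30 minutes of staleness is acceptable for ability data


def get_ability(cache, db, ability_id, now=None):
    key = f"ability:{ability_id}"
    value = cache.get(key, now)
    if value is None:
        value = db[ability_id]  # cache miss: read from the database
        cache.set(key, value, ABILITY_TTL, now)
    return value


def update_ability(db, ability_id, value):
    db[ability_id] = value  # the write skips the cache entirely


db = {1: "Fireball v1"}
cache = TTLCache()
print(get_ability(cache, db, 1, now=0))
update_ability(db, 1, "Fireball v2")
print(get_ability(cache, db, 1, now=60))    # still served from cache
print(get_ability(cache, db, 1, now=1801))  # cache expired, database is read again
```

The stale read after the update is the "potential for very stale data" listed above, and the moment of expiry is where a thundering herd would hit the database.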
  50. 50. Caching Policy #2: Delete Cache on Write
  51. 51. Caching Policy #2: Delete Cache on Write <ul><li>Advantages: </li></ul><ul><ul><li>Data is always fresh </li></ul></ul><ul><ul><li>Relatively easy to manage with a good design </li></ul></ul><ul><li>Disadvantages: </li></ul><ul><ul><li>Still suffers from “Thundering Herd” problems </li></ul></ul>
  52. 52. Caching Policy #2: Delete Cache on Write Examples from the Event <ul><li>Faction Archive Data </li></ul><ul><ul><li>No real “Thundering Herd” issue because of access pattern: </li></ul></ul><ul><ul><ul><li>Infrequently read </li></ul></ul></ul><ul><ul><ul><li>Each row only accessed by a single end user </li></ul></ul></ul><ul><ul><li>Infrequent writes </li></ul></ul>
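As a minimal sketch of delete-on-write, assuming plain Python dictionaries in place of both memcached and the database: every write updates the database and then removes the cached copy, so the next read repopulates it. The `read_archive` and `write_archive` helpers and the key format are hypothetical.

```python
def read_archive(cache, db, user_id):
    key = f"faction_archive:{user_id}"
    if key not in cache:
        cache[key] = db[user_id]  # miss: load from the database and cache it
    return cache[key]


def write_archive(cache, db, user_id, value):
    db[user_id] = value
    # Drop the cached copy instead of updating it; the next read refills it.
    cache.pop(f"faction_archive:{user_id}", None)


db, cache = {42: "joined Vampires"}, {}
print(read_archive(cache, db, 42))
write_archive(cache, db, 42, "joined Zombies")
print(read_archive(cache, db, 42))  # fresh value, re-read from the database
```

Because the delete forces a database read on the next access, this policy suits rows that are read by few clients, which is the access pattern described for the faction archive.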
  53. 53. Caching Policy #3: Write Through Cache
  54. 54. Caching Policy #3: Write Through Cache <ul><li>Advantages: </li></ul><ul><ul><li>Data is always fresh </li></ul></ul><ul><ul><li>Very infrequent database reads </li></ul></ul><ul><ul><li>Avoids “Thundering Herd” problem </li></ul></ul><ul><li>Disadvantages: </li></ul><ul><ul><li>Few </li></ul></ul><ul><ul><li>Sloppy development can potentially lead to a difference between cache and the database </li></ul></ul>
  55. 55. Caching Policy #3: Write Through Cache Examples from the Event <ul><li>Player Data </li></ul><ul><ul><li>Access Pattern </li></ul></ul><ul><ul><ul><li>Frequent reads: Potential thundering herd problem if cache expires, so keeping a long-lived cache is ideal </li></ul></ul></ul><ul><ul><ul><li>Frequent writes: Deleting cache rather than updating would almost be pointless since the new cache would only live a few moments </li></ul></ul></ul><ul><ul><li>Risk is high if this data is stale </li></ul></ul>
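A write-through cache can be sketched the same way, again with Python dictionaries standing in for memcached and the database and with hypothetical `save_player` and `load_player` helpers: each write goes to the database and then overwrites the cached copy, so reads rarely need the database at all.

```python
def save_player(cache, db, player):
    user_id = player["user_id"]
    db[user_id] = dict(player)                  # 1. persist the change
    cache[f"player:{user_id}"] = dict(player)   # 2. overwrite the cached copy


def load_player(cache, db, user_id):
    key = f"player:{user_id}"
    if key not in cache:
        # Only reached on a cold cache (for example after a restart).
        cache[key] = dict(db[user_id])
    return cache[key]


db, cache = {}, {}
save_player(cache, db, {"user_id": 7, "current_health": 40})
print(load_player(cache, db, 7))  # served from cache, no database read
```

Keeping both steps inside one function is what guards against the drift between cache and database that the slide warns about: no caller can update one without the other.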
  56. 56. Caching Policy #4: Write Behind Cache
  57. 57. Caching Policy #4: Write Behind Cache <ul><li>Advantages: </li></ul><ul><ul><li>All the benefits of “Write Through Cache” </li></ul></ul><ul><ul><li>Greatly reduces database writes as well as reads </li></ul></ul><ul><li>Disadvantages: </li></ul><ul><ul><li>Potential for data loss </li></ul></ul><ul><ul><li>Should only be used when risk window is relatively short and risk of data loss is low. </li></ul></ul>
  58. 58. Caching Policy #4: Write Behind Cache Examples from the Event <ul><li>Player Data </li></ul><ul><ul><li>Frequent database writes: we were seeing hundreds to thousands of database writes a second, which was a major hotspot </li></ul></ul><ul><li>Solution: Write to DB only after 60 or more seconds have elapsed since the last write, and never write to DB for non-essential data (e.g., health/energy updates) </li></ul><ul><ul><li>Risk / Risk window low: losing 60 seconds of data in a worst-case scenario doesn’t disrupt the app too much </li></ul></ul><ul><ul><li>Stored “last_write_time” in memcache only: If that cache entry was lost, just write again and update it </li></ul></ul>
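The write-behind approach described for player data might look roughly like this. The Python sketch uses dictionaries in place of memcached and the database; the `save_player` helper and key names are hypothetical, while the 60-second interval and the cache-only "last write time" come from the slide.

```python
import time

FLUSH_INTERVAL = 60  # seconds between database writes, from the slide


def save_player(cache, db, player, now=None):
    """Always update the cache; write to the database at most once a minute.

    Returns True when the database was written.
    """
    now = time.time() if now is None else now
    user_id = player["user_id"]
    cache[f"player:{user_id}"] = dict(player)  # the cache is always current

    last_write = cache.get(f"player_last_write:{user_id}")
    # A missing timestamp (first write, or the cache entry was lost)
    # is treated as "overdue", so the database is written again.
    if last_write is None or now - last_write >= FLUSH_INTERVAL:
        db[user_id] = dict(player)
        cache[f"player_last_write:{user_id}"] = now
        return True
    return False


db, cache = {}, {}
print(save_player(cache, db, {"user_id": 7, "experience": 10}, now=0))
print(save_player(cache, db, {"user_id": 7, "experience": 15}, now=30))
print(save_player(cache, db, {"user_id": 7, "experience": 20}, now=61))
```

Between flushes the database lags the cache by up to a minute, which is the risk window the slide accepts; anything that must never be lost should bypass this path.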
  59. 59. Good Practices for Caching <ul><li>Strong Domain/Business Objects: </li></ul><ul><ul><li>Reduce number of areas where data is written </li></ul></ul><ul><ul><li>Provide a very easy interface for managing their own data caches </li></ul></ul><ul><ul><li>Provide interfaces which handle as many Use Cases as possible the most efficient way possible </li></ul></ul><ul><ul><li>Favor immutability </li></ul></ul>
  60. 60. Good Practices for Caching <ul><li>The longer your cache can live and still be useful, the better. </li></ul><ul><li>If risk of staleness is low, store joined data in a single cache – even if both data sources are updated independently. </li></ul><ul><li>Do multi-gets from cache whenever possible to reduce round trips from the server </li></ul><ul><li>Very short lived, low risk data can be stored exclusively in cache (e.g., flood control timers) </li></ul>
  61. 61. Optimizing Your Data Structure
  62. 62. Getting Specific: Data Structure Documentation
  63. 63. Getting Specific: Good Practices with Data Structure <ul><li>Use the right data types! </li></ul><ul><ul><li>Know how large a type you need. Don’t use a float when you need an int, and don’t use an int when all you need is a tinyint. </li></ul></ul><ul><ul><li>If you don’t need to store negative numbers, use UNSIGNED. It doubles the largest positive value an integer field can hold at the same storage size. </li></ul></ul><ul><ul><li>Use TEXT / BLOB fields sparingly: depending on the storage engine and row format, their values can be stored apart from the rest of the row and need a separate lookup </li></ul></ul><ul><ul><li>If you’re only going to be storing ‘0’ and ‘1’, use the smallest integer type (e.g., tinyint(1) ). A tinyint always takes 1 byte; the (1) is only a display width. </li></ul></ul><ul><ul><li>varchar(255) normally stores only the characters you put in it plus a length prefix, but storage engines with fixed-length rows (e.g., MEMORY) reserve the full declared length even if you store only 1 character </li></ul></ul><ul><ul><li>If a field is a foreign key to another table, make sure the matching field is of the same type to avoid type conversion processing! </li></ul></ul>
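The effect of UNSIGNED is easy to check with a little arithmetic. The Python sketch below computes the value range of a two's-complement integer of a given width; the `int_range` helper is hypothetical, and the widths used are those of MySQL's TINYINT (8 bits) and INT (32 bits).

```python
def int_range(bits, unsigned):
    """Smallest and largest values an integer column of this width can hold."""
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, bits in (("TINYINT", 8), ("INT", 32)):
    print(name, "signed:", int_range(bits, False), "unsigned:", int_range(bits, True))
```

The storage size is the same in both cases; UNSIGNED only moves the range so that none of it is spent on negative values.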
  64. 64. Getting Specific: Good Practices with Data Structure <ul><li>Choose the right storage engine </li></ul><ul><ul><li>InnoDB is transactional and supports foreign keys; MyISAM does neither </li></ul></ul><ul><ul><li>InnoDB supports row-level locking, MyISAM uses table locks </li></ul></ul><ul><ul><li>Different storage engines are faster with different access patterns </li></ul></ul>
  65. 65. Getting Specific: Data Structure Documentation
  66. 66. Getting Specific: Data Structure Documentation <ul><li>It may sound like a headache to know all the queries you will be using, but it will save you much larger headaches later. </li></ul><ul><li>Knowing your queries ahead of time is integral to tight data structure design. </li></ul><ul><li>You’ll be surprised: You might not realize just how much work you’re doing until you map out the work that’s being done. </li></ul><ul><li>Data structure design documents are a great tool for communicating to other developers, DB admins, and operations folk. </li></ul>
  67. 67. Getting Specific: Good Practices with Data Structure <ul><li>Keep your indexes few and useful </li></ul><ul><ul><li>More indexes are not always better – they can slow down INSERTs and UPDATEs </li></ul></ul><ul><ul><li>Base indexes off of actual queries you will be running. Adding queries without considering indexes is easy, but can be very dangerous. </li></ul></ul><ul><ul><li>Index order matters! If you’re querying WHERE field1 = ‘foo’ AND field2 = ‘bar’ then make your index on (field1, field2) and make sure your queries adhere. (Dependent on MySQL storage engine) </li></ul></ul>
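The index-order point can be observed directly by asking a database for its query plan. The sketch below uses Python's built-in `sqlite3` module purely as a convenient stand-in; SQLite is not MySQL and its planner differs in detail, but the leftmost-prefix behavior of a composite B-tree index shown here applies to MySQL as well, where `EXPLAIN` would be the tool to use. The table `t`, its `payload` column, and the index name are hypothetical.

```python
import sqlite3


def plan_for(conn, sql):
    """Return SQLite's query-plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT, payload TEXT)"
)
conn.execute("CREATE INDEX idx_field1_field2 ON t (field1, field2)")

# Filters on the leading column(s) of the index can use it.
both = plan_for(conn, "SELECT * FROM t WHERE field1 = 'foo' AND field2 = 'bar'")
# A filter on the second column alone cannot seek into the index.
second_only = plan_for(conn, "SELECT * FROM t WHERE field2 = 'bar'")

print(both)
print(second_only)
```

Running every planned query through the plan output like this, before launch, is a cheap way to confirm that each one has an index it can actually use.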
  68. 68. More Examples of Addressing Hotspots
  69. 69. Addressing Hotspots An Example from the Event Target Problem <ul><li>Querying for a list of targets in a particular area is a relatively hefty operation every 15 seconds: SELECT * FROM halloween2008_player WHERE current_forum_id = $forum_id ORDER BY RAND() LIMIT 15; </li></ul><ul><li>An ORDER BY RAND() is expensive on any sizable dataset: every matching row gets a random value and the whole set is sorted just to return 15 rows </li></ul><ul><li>We also have to pull the username and gender for each user in this table afterwards. </li></ul>
  70. 70. Addressing Hotspots An Example from the Event Addressing the Target Problem <ul><li>Reduce number of round trips: Instead of pulling the 15 targets we plan to display every 15 seconds, do a single query for 300 targets every 90 seconds and cache this </li></ul><ul><li>Offload work onto PHP: Use PHP to pull 15 random targets from the cached list of 300. This data will still appear fresh. </li></ul><ul><li>Store joins in cache: Instead of just storing the Halloween2008_Player data in cache, we pull the data, query-and-join it with the user data (username/gender). It’s unlikely their username or gender will become stale in those 90 seconds. </li></ul>
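The three fixes combine into a small amount of code. The Python sketch below is an illustration, not the event's PHP: a dictionary stands in for memcached, and `fetch_pool` is a hypothetical callback that would run the 300-row query and join in usernames and genders. The pool size, lifetime, and page size are the figures from the slide.

```python
import random

POOL_SIZE = 300  # targets fetched per database query
POOL_TTL = 90    # seconds a cached pool stays valid
PAGE_SIZE = 15   # targets shown to the player


def get_targets(cache, fetch_pool, forum_id, now, rng=random):
    """Return 15 random targets, hitting the database at most every 90 seconds."""
    key = f"targets:{forum_id}"
    entry = cache.get(key)
    if entry is None or entry["expires"] <= now:
        # One heavier query replaces several small ones; the joined rows
        # (player data plus username/gender) are cached together.
        entry = {"pool": fetch_pool(forum_id, POOL_SIZE), "expires": now + POOL_TTL}
        cache[key] = entry
    pool = entry["pool"]
    # The random pick happens in application code, not in ORDER BY RAND().
    return rng.sample(pool, min(PAGE_SIZE, len(pool)))


def fake_fetch_pool(forum_id, limit):
    # Stand-in for the real query: returns `limit` made-up joined rows.
    return [{"user_id": i, "username": f"user{i}"} for i in range(limit)]


cache = {}
targets = get_targets(cache, fake_fetch_pool, forum_id=3, now=0)
print(len(targets), targets[0])
```

With one query per forum every 90 seconds instead of one per player every 15 seconds, the database load for this feature stops growing with the number of players in a forum.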
  71. 71. Addressing Hotspots An Example from the Event Avatar Resave Problem <ul><li>Avatar resaving requires complex image compilation, and is a very hefty process. </li></ul><ul><li>On joining the event, 75% of users will instantly resave their avatar (unless they chose the “Human” faction) </li></ul>
  72. 72. Addressing Hotspots An Example from the Event Addressing the Avatar Resave Problem <ul><li>Soft Launching: Release the app early to a smaller userset, and don’t announce it until hotspots are addressed. </li></ul><ul><li>Flood Control: Used memcache keys (and the database) to restrict faction changes within an hour of each other. </li></ul><ul><li>Offload Recompiling Work: Originally avatar resaving was happening on the web servers. After we identified that the first two solutions alone didn’t do the trick, we made the actual avatar recompilation happen on the servers that serve up avatars (which were more idle than our main web servers). </li></ul>
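The flood-control idea can be sketched with a single cache key per user. In this Python illustration a dictionary stands in for memcached, `try_change_faction` and the key format are hypothetical, and only the cache half is shown (the slide notes that the database was used as well). The one-hour window is from the slide.

```python
FACTION_CHANGE_COOLDOWN = 3600  # one hour, from the slide


def try_change_faction(cache, user_id, now):
    """Allow at most one faction change (and avatar recompile) per hour."""
    key = f"faction_change:{user_id}"
    expires = cache.get(key)
    if expires is not None and expires > now:
        return False  # still inside the window: skip the expensive recompile
    cache[key] = now + FACTION_CHANGE_COOLDOWN
    return True


cache = {}
print(try_change_faction(cache, 7, now=0))     # first change goes through
print(try_change_faction(cache, 7, now=1800))  # blocked, only 30 minutes later
```

With a real memcached server the same check is usually done with its `add` command and an expiry time, since `add` only succeeds when the key does not already exist and so avoids a race between the read and the write.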
  73. 73. Addressing Hotspots Don’t Keep Hotspots a Secret! <ul><li>If you anticipate a hotspot, even if you think you have it under control, let your sysadmins/operations people know. </li></ul><ul><li>Even after the best design, sometimes “more servers” is a necessary solution. </li></ul>
  74. 74. Bringing it All Together
  75. 75. Early Design <ul><li>Predict but don’t restrict: Early designs will change, but good planning prevents problems. </li></ul><ul><li>Rough it out: Wireframes, rough data structures, and lists help individuals and teams understand a design. </li></ul><ul><li>Know your access patterns: You won’t know them exactly until your app is released, but estimating them will reveal hotspots at the design phase. </li></ul>
  76. 76. Data Structure <ul><li>Design to your data structure: A good database design often reveals how your app works. </li></ul><ul><li>The right tool for the job: Use appropriate database engines and field types. </li></ul><ul><li>Design indexes to queries: Know your queries and then implement indexes to them, not vice versa. </li></ul><ul><li>Know your queries: It only takes one bad query to take down a website. </li></ul>
  77. 77. Understand Your Environment <ul><li>Understand where processing will have the most impact: Clientside is often lightest, Database is often heaviest, and there are many places in between. </li></ul><ul><li>You have many tools to prevent load, use them: If you can use methods in the client or PHP to prevent load to Memcache or the Database, do so! </li></ul><ul><li>Avoid round trips: Every trip away from the server adds wait times, load on the network layer, and load on multiple resources. </li></ul>
  78. 78. Caching <ul><li>USE CACHING: This cannot be stressed enough. It is one of the most powerful tools against load in many high-volume web apps. </li></ul><ul><li>Choose an appropriate caching policy: Err in favor of longer caches and fewer database hits as much as possible/reasonable. </li></ul>
