Real Life Scaling:
      A Tale of Two Websites

By Justin Carmony - Utah Open Source Conference 2009
“It was the best of times, it was
the worst of times...”
           - A Tale of Two Cities, Charles Dickens
About Me
• Website Hobbyist since 1997
• Professional Since 2005
• Worked in Large Open Source & Proprietary Solutions
• F...
This Presentation
• Point You In The Right Direction
 ‣ Not a Substitute for Research & Homework
• Developers Perspective,...
The Scaling Conundrum
  Because No One Likes Premature Escalation
The Scale of Scaling
         Traffic
The Scale of Scaling
    # of Websites   Traffic
The Problem:
Many Developers Skip “Medium”
   and go Straight to “XXL”
The Solution:
Implement Reasonable Solutions
    For Your Website’s Size
Scalability vs Performance
    The Difference Is Important... Yet Useless
Performance Before Scaling
        Except With Limitations
“Zero Theory”
“Zero Theory”
200 Current Visitors
500 New Sign-Ups
300 Messages Sent
1,000 Users
10,000 Users
100,000 Users
“Zero Theory”
200 Current Visitors   2,000 Current Visitors
500 New Sign-Ups       5,000 New Sign-Ups
300 Messages Sent   ...
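One way to read the "Zero Theory" slides is as a thought experiment: add a zero to each of today's numbers and ask whether the site would still cope. The sketch below is only an illustration of that arithmetic. The function name `add_a_zero` and the dictionary layout are assumptions; the figures are the ones shown on the slide.

```python
def add_a_zero(metrics, zeros=1):
    """Project each metric forward by appending `zeros` zeros (x10 per zero)."""
    factor = 10 ** zeros
    return {name: value * factor for name, value in metrics.items()}


# Figures taken from the slide; the key names are illustrative.
today = {
    "current_visitors": 200,
    "new_sign_ups": 500,
    "messages_sent": 300,
    "users": 1_000,
}

projected = add_a_zero(today)
for name, value in projected.items():
    print(f"{name}: {today[name]:,} -> {value:,}")
```

Running the same projection per component, rather than for the site as a whole, is what the later "Zero Theory For Each Component" bullet suggests.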
Tale #1: Cyber Evolution
Website: www.cevo.com

Online Gaming League

Co-Developed w/ Eric Ping

Communication Between Man...
Tale #2: Dating DNA
Website: http://www.datingdna.com

Online Free Dating Service

#1 iPhone Dating App

Overnight 1,000% ...
#1 Challenge: The Database
#1 Challenge: The Database
•   80% of Performance Issues were
    Database Related
#1 Challenge: The Database
•   80% of Performance Issues were
    Database Related

•   Issues Aren’t Noticeable Until
   ...
Database Abuse!
Just Because You Can, Doesn’t Mean You Should
Database Abuse
• Excessive Logging (instead of using rotated files, etc)
• Excessive Writes (INSERT, UPDATE, REPLACE, DELET...
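To make the point about missing indexes concrete, here is a minimal sketch that asks the database how it plans to run the same lookup before and after an index exists. It uses Python's standard-library `sqlite3` as a stand-in for MySQL, so the plan text differs from MySQL's `EXPLAIN` output. The `users` table, the `idx_users_email` index, and the `query_plan` helper are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT id FROM users WHERE email = ?"
args = ("user500@example.com",)

plan_before = query_plan(conn, lookup, args)   # full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, args)    # index lookup

print("before:", plan_before)
print("after: ", plan_after)
```

The query text is unchanged between the two runs; only the index is new, which is why this kind of problem tends to stay hidden until the table grows.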
Example: Custom Forums
•   In Our Defense, We Made This In
    About 24 Hours

•   Nested Queries = 1000+ Queries
    Per ...
Example: Standings Page
• Before:
 ‣ Nested Queries To Determine
   Stats Each Row
 ‣ Regenerated Every Time Match
   Upda...
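The "Before" and "After" on the standings slide can be sketched roughly as follows. This is a minimal illustration using Python's standard-library `sqlite3` as a stand-in for MySQL. The `teams` / `matches` schema and both function names are assumptions invented for the example, not CEVO's actual tables or code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE matches (
        id INTEGER PRIMARY KEY, winner_id INTEGER, loser_id INTEGER
    );
    INSERT INTO teams VALUES (1, 'Alpha'), (2, 'Bravo'), (3, 'Charlie');
    INSERT INTO matches (winner_id, loser_id) VALUES (1, 2), (1, 3), (2, 3);
""")


def standings_nested(conn):
    """'Before' style: one query for the rows, then two more per row."""
    rows = []
    for team_id, name in conn.execute("SELECT id, name FROM teams").fetchall():
        wins = conn.execute(
            "SELECT COUNT(*) FROM matches WHERE winner_id = ?", (team_id,)
        ).fetchone()[0]
        losses = conn.execute(
            "SELECT COUNT(*) FROM matches WHERE loser_id = ?", (team_id,)
        ).fetchone()[0]
        rows.append((name, wins, losses))
    return sorted(rows, key=lambda row: (-row[1], row[0]))


def standings_joined(conn):
    """'After' style: a single query using JOINs and aggregation."""
    return conn.execute("""
        SELECT t.name,
               COUNT(DISTINCT w.id) AS wins,
               COUNT(DISTINCT l.id) AS losses
        FROM teams t
        LEFT JOIN matches w ON w.winner_id = t.id
        LEFT JOIN matches l ON l.loser_id = t.id
        GROUP BY t.id, t.name
        ORDER BY wins DESC, t.name
    """).fetchall()


print(standings_joined(conn))
```

With three teams the nested version already issues seven queries; the joined version always issues one. Regenerating the result on a schedule (every 30 minutes on the slide) rather than on every match update is a separate change on top of this.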
MySQL Locks
• Problem: Popular Tables Using MyISAM
• Solutions:
 ‣ Switched to InnoDB
 ‣ Optimized Logic to Reduce Writes
...
Database Replication
Complicated, Major Trade-Offs, Requires Preparation
Open APIs
•   Can Get Heavily Abused

•   Can Grow Unexpectedly

•   Must Be Extremely Efficient

•   CEVO’s APIs
    ‣ 100...
Identifying What’s Unique
Identifying What’s Unique
•   “What Unique Part of Your Site
    Will be Complicated to Scale?”
Identifying What’s Unique
•   “What Unique Part of Your Site
    Will be Complicated to Scale?”

•   Important to Identify...
X^2 - X
Dating DNA’s Compatibility Score Growth

            X = # of Users
# of Score Records
# of Score Records
10 Users
200 Users
5,000 Users
25,000 Users
300,000 Users
1,000,000 Users
# of Score Records
10 Users               90 Records
200 Users
5,000 Users
25,000 Users
300,000 Users
1,000,000 Users
# of Score Records
10 Users                90 Records
200 Users            39,800 Records
5,000 Users
25,000 Users
300,000...
# of Score Records
10 Users                  90 Records
200 Users             39,800 Records
5,000 Users        24,995,000...
# of Score Records
10 Users                   90 Records
200 Users              39,800 Records
5,000 Users         24,995,...
# of Score Records
10 Users                     90 Records
200 Users                39,800 Records
5,000 Users          24...
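The record counts on these slides follow from the formula on the earlier slide: every user gets a score record against every other user, so X users produce X * (X - 1) = X^2 - X records. A short sketch to reproduce the table (the function name `score_records` is just an illustrative choice):

```python
def score_records(users):
    """Records needed when every user is scored against every other user."""
    return users * users - users  # X^2 - X


for users in (10, 200, 5_000, 25_000, 300_000, 1_000_000):
    print(f"{users:>9,} users -> {score_records(users):>15,} records")
```

At one million users the formula gives 999,999,000,000 records, which is the "One Trillion Records!" figure on the final build of the slide.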
How We Solved It
• Introduce Limits on Records per User
• Only Save Decent Scores
• Smart Logic to Predetermine Good Match...
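A rough sketch of how three of these ideas could fit together: drop scores below a cutoff, cap the number of records kept per user, and pick a table by user ID. Every constant and name here (`MIN_SCORE`, `MAX_PER_USER`, `NUM_SHARDS`, the `scores_N` table naming) is an assumption for illustration; the slides do not say what limits or sharding scheme Dating DNA actually used.

```python
import heapq

MIN_SCORE = 60      # assumed cutoff for a "decent" score
MAX_PER_USER = 3    # assumed cap on stored score records per user
NUM_SHARDS = 4      # assumed number of score tables


def shard_table(user_id):
    """Pick the (hypothetical) score table that holds this user's records."""
    return f"scores_{user_id % NUM_SHARDS}"


def scores_to_keep(candidates):
    """Keep only the best few decent scores.

    `candidates` is an iterable of (other_user_id, score) pairs.
    """
    decent = [(other, score) for other, score in candidates if score >= MIN_SCORE]
    return heapq.nlargest(MAX_PER_USER, decent, key=lambda pair: pair[1])


candidates = [(2, 91), (3, 40), (4, 75), (5, 88), (6, 62)]
print(shard_table(10), scores_to_keep(candidates))
```

With a fixed cap per user, storage grows linearly with the number of users instead of with X^2 - X.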
User Uploaded Content
• Multiple Aspects to Consider
 ‣ Storage (File Size)
 ‣ Serving (File Sizes, Bandwidth, Redundancy)...
Memcached
• Scales Much Easier than Traditional Databases
• Reduced Load Off Databases by 70%
• Implement Quickly -- It’s A...
Hardware
• Be careful of “Throwing Hardware” at Problems
• Run on Realistic Hardware
• Developers, You Need To Be Cost Con...
Other Challenges
•   Apache Configurations
    ‣ MaxClient Limits Reached

•   MySQL Configurations
    ‣ Max Connections Re...
Tools I’ve Used
• MySQL - JetProfiler (No FOSS, Hopefully One Day)
• Server Performance - top, htop, atop, iotop
• PHP - Ze...
So... Where’s The Scaling?
Getting Ready To Scale
Getting Ready To Scale
• Compartmentalize “Parts” Into “Components”
Getting Ready To Scale
• Compartmentalize “Parts” Into “Components”
• Create “Scaling Strategy” For Each “Component”
Getting Ready To Scale
• Compartmentalize “Parts” Into “Components”
• Create “Scaling Strategy” For Each “Component”
• Kee...
Makes Components of
    Dating DNA
Makes Components of
    Dating DNA


       Dating DNA
       Application
Makes Components of
             Dating DNA
   Jobs / Cron                   User Uploaded
     System                    ...
Scaling Components
Scaling Components


Web Application




  Server #1
Scaling Components
Comp E1

Comp D1

Comp C1

Comp B1

Comp A1


Server #1
Scaling Components
Comp E1

Comp D1

Comp C1

Comp B1

Comp A1


Server #1      Server #2
Scaling Components


Comp C1

Comp B1        Comp E1

Comp A1        Comp D1


Server #1      Server #2
Scaling Components


Comp C1        Comp A2

Comp B1        Comp E1

Comp A1        Comp D1


Server #1      Server #2
Scaling Components


Comp C1        Comp A2

Comp B1        Comp E1

Comp A1        Comp D1


Server #1      Server #2   S...
Scaling Components


Comp C1        Comp A2     Comp C2

Comp B1        Comp E1     Comp B2

Comp A1        Comp D1     Co...
Scaling Components


Comp C1        Comp A2     Comp C2

Comp B1        Comp E1     Comp B2     Comp B3

Comp A1        Co...
Scaling Components
• Should Be Able to “Live” With Each Other
• Deployment & Management all Automated
 ‣ Not Necessarily “...
Scaling Web Servers
• Challenges                      Load Balancer
 ‣ Routing
 ‣ Sessions
• Ideas                 Web Ser...
Asynchronous Score
          Generation
• Pass Message To “Score Server”
 ‣ Pass User ID Only
• Quickly Generate a Small S...
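One possible reading of this flow, sketched with Python's standard-library `queue` and `threading` in place of a separate score server: the web request passes only the user ID, a small set of scores is produced right away, and the user is queued for the full run while a status record tracks progress. The function names, the status values, and the use of a dictionary in place of the status table are all assumptions for illustration.

```python
import queue
import threading

score_queue = queue.Queue()
status = {}                 # stand-in for the status table: user_id -> state
status_lock = threading.Lock()


def quick_scores(user_id):
    """Generate a small set of scores immediately (details omitted)."""
    with status_lock:
        status[user_id] = "partial"


def full_scores(user_id):
    """Run the full, slow score generation (details omitted)."""
    with status_lock:
        status[user_id] = "complete"


def request_scores(user_id):
    """Called from the web request: only the user ID is passed along."""
    quick_scores(user_id)
    score_queue.put(user_id)  # queue the user for the full process


def score_worker():
    """Background worker that drains the queue until it sees None."""
    while True:
        user_id = score_queue.get()
        try:
            if user_id is None:
                return
            full_scores(user_id)
        finally:
            score_queue.task_done()


worker = threading.Thread(target=score_worker, daemon=True)
worker.start()

request_scores(101)
request_scores(102)

score_queue.join()     # wait until queued users are fully processed
score_queue.put(None)  # tell the worker to stop
worker.join()
print(status)
```

The web request returns as soon as the quick pass is done; the page can poll the status record to find out when the full set is ready.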
The Open Source Advantage
• Choice & Options w/ Cost
• Solid Applications, Tested & Proven
• Great Communities
• Give Back...
Last Thoughts
• Once again, K.I.S.S
• Proactive > Reactive
• Monitoring, Monitoring, Monitoring!
• The “Cloud”
• Get Advic...
Questions?
Thank You
Website   www.justincarmony.com

  Email   justin@justincarmony.com

Twitter   JustinCarmony

 Skype    JustinCa...
Real Life Scaling: A Tale of Two Websites

Scaling is a real issue for many websites. However, many developers try to implement solutions used by the giants of the internet: Google, Facebook, Twitter, WordPress, etc. Many of these techniques are designed for unusual, high-demand situations. Applying them prematurely to smaller websites can lead to overly complex solutions that are difficult to manage and can even hinder progress.

See real-life examples from two popular websites: cevo.com and datingdna.com. Learn what challenges and bottlenecks they faced, and the solutions they created to overcome them. Find out which optimizations and implementations returned the greatest ROI as the projects scaled.

CEVO is an online gaming league that hosts thousands of matches for hundreds of video game events. They have contracted with DirecTV, Dell, and other companies to offer branded gaming events. Because the company is run by a staff spread across the USA and Canada, the website runs 100% of their business. Hundreds of gaming servers and player clients communicate with CEVO’s APIs to get real-time information. Learn how they’ve handled the ever-increasing load on their servers.

Dating DNA is an online dating community that integrates with today’s popular social networking sites. The site has exploded since it released its iPhone App, which is currently the #1 iPhone Dating App. Learn how Dating DNA scaled after daily user sign-ups increased 1,000% in a matter of weeks.

This presentation will be given by Justin Carmony, a lead developer for both projects.

Published in: Technology

Real Life Scaling: A Tale of Two Websites

  1. 1. Real Life Scaling: A Tale of Two Websites By Justin Carmony - Utah Open Source Conference 2009
  2. 2. “It was the best of times, it was the worst of times...” - A Tale of Two Cities, Charles Dickens
  3. 3. About Me • Website Hobbyist since 1997 • Professional Since 2005 • Worked in Large Open Source & Proprietary Solutions • Full Time Contractor & Consultant • Sponsorship Manager for UTOSC • Blog: http://www.justincarmony.com/blog/
  4. 4. This Presentation • Point You In The Right Direction ‣ Not a Substitute for Research & Homework • Developers Perspective, Not So Much Sys Admin • Q & A Session Afterwards • Available for Questions Remainder of Conference • Ask via Email: justin@justincarmony.com
  5. 5. The Scaling Conundrum Because No One Likes Premature Escalation
  6. 6. The Scale of Scaling Traffic
  7. 7. The Scale of Scaling # of Websites Traffic
  8. 8. The Scale of Scaling # of Websites Traffic
  9. 9. The Problem: Many Developers Skip “Medium” and go Straight to “XXL”
  10. 10. The Solution: Implement Reasonable Solutions For Your Website’s Size
  11. 11. Scalability vs Performance The Difference Is Important... Yet Useless
  12. 12. Performance Before Scaling Except With Limitations
  13. 13. “Zero Theory”
  14. 14. “Zero Theory” 200 Current Visitors 500 New Sign-Ups 300 Messages Sent 1,000 Users 10,000 Users 100,000 Users
  15. 15. “Zero Theory” 200 Current Visitors 2,000 Current Visitors 500 New Sign-Ups 5,000 New Sign-Ups 300 Messages Sent 3,000 Messages Sent 1,000 Users 10,000 Users 10,000 Users 100,000 Users 100,000 Users 1,000,000 Users
  16. 16. Tale #1: Cyber Evolution Website: www.cevo.com Online Gaming League Co-Developed w/ Eric Ping Communication Between Many “Clients” & “Servers” Great Growth Over 4 Years Learning by Fire for Myself
  17. 17. Tale #2: Dating DNA Website: http://www.datingdna.com Online Free Dating Service #1 iPhone Dating App Overnight 1,000% Growth Continuing to Grow Rapidly Unique Scaling Challenges
  18. 18. #1 Challenge: The Database
  19. 19. #1 Challenge: The Database • 80% of Performance Issues were Database Related
  20. 20. #1 Challenge: The Database • 80% of Performance Issues were Database Related • Issues Aren’t Noticeable Until After Growth & High Load
  21. 21. #1 Challenge: The Database • 80% of Performance Issues were Database Related • Issues Aren’t Noticeable Until After Growth & High Load • Most Issues Were Due to Poor Queries and/or Poor Indexes
  22. 22. Database Abuse! Just Because You Can, Doesn’t Mean You Should
  23. 23. Database Abuse • Excessive Logging (instead of using rotated files, etc) • Excessive Writes (INSERT, UPDATE, REPLACE, DELETE) • Sheer Volume of Repetitive Queries • Non-Indexed Searches • Sub-Queries Instead of JOINs
  24. 24. Database Abuse • Excessive Logging (instead of using rotated files, etc) • Excessive Writes (INSERT, UPDATE, REPLACE, DELETE) • Sheer Volume of Repetitive Queries • Non-Indexed Searches • Sub-Queries Instead of JOINs Prevention: Learn About Database Design
  25. 25. Database Abuse • Excessive Logging (instead of using rotated files, etc) • Excessive Writes (INSERT, UPDATE, REPLACE, DELETE) • Sheer Volume of Repetitive Queries • Non-Indexed Searches • Sub-Queries Instead of JOINs Prevention: Learn About Database Design This is NOT for DBAs Only
  26. 26. Example: Custom Forums • In Our Defense, We Made This In About 24 Hours • Nested Queries = 1000+ Queries Per Page - Not Joking • 10 second Load Times • End Result: Users Hated It
  27. 27. Example: Standings Page • Before: ‣ Nested Queries To Determine Stats Each Row ‣ Regenerated Every Time Match Updated • After: ‣ Generate Every 30 Minutes ‣ Proper JOINs, Only One Query
  28. 28. MySQL Locks • Problem: Popular Tables Using MyISAM • Solutions: ‣ Switched to InnoDB ‣ Optimized Logic to Reduce Writes ‣ Reduced JOINs in Queries
  29. 29. Database Replication Complicated, Major Trade-Offs, Requires Preparation
  30. 30. Open APIs • Can Get Heavily Abused • Can Grow Unexpectedly • Must Be Extremely Efficient • CEVO’s APIs ‣ 100,000s of Player Lookups ‣ Thousands of CMN Players ‣ Match History Calls
  31. 31. Identifying What’s Unique
  32. 32. Identifying What’s Unique • “What Unique Part of Your Site Will be Complicated to Scale?”
  33. 33. Identifying What’s Unique • “What Unique Part of Your Site Will be Complicated to Scale?”
  34. 34. Identifying What’s Unique • “What Unique Part of Your Site Will be Complicated to Scale?” • Important to Identify Early • Allows you to Plan for the Future • Many Solutions for the Common Stuff to Scale
  35. 35. X^2 - X Dating DNA’s Compatibility Score Growth X = # of Users
  36. 36. # of Score Records
  37. 37. # of Score Records 10 Users 200 Users 5,000 Users 25,000 Users 300,000 Users 1,000,000 Users
  38. 38. # of Score Records 10 Users 90 Records 200 Users 5,000 Users 25,000 Users 300,000 Users 1,000,000 Users
  39. 39. # of Score Records 10 Users 90 Records 200 Users 39,800 Records 5,000 Users 25,000 Users 300,000 Users 1,000,000 Users
  40. 40. # of Score Records 10 Users 90 Records 200 Users 39,800 Records 5,000 Users 24,995,000 Records 25,000 Users 300,000 Users 1,000,000 Users
  41. 41. # of Score Records 10 Users 90 Records 200 Users 39,800 Records 5,000 Users 24,995,000 Records 25,000 Users 624,975,000 Records 300,000 Users 1,000,000 Users
  42. 42. # of Score Records 10 Users 90 Records 200 Users 39,800 Records 5,000 Users 24,995,000 Records 25,000 Users 624,975,000 Records 300,000 Users 89,999,700,000 Records 1,000,000 Users
  43. 43. # of Score Records 10 Users 90 Records 200 Users 39,800 Records 5,000 Users 24,995,000 Records 25,000 Users 624,975,000 Records 300,000 Users 89,999,700,000 Records 1,000,000 Users One Trillion Records!
  44. 44. How We Solved It • Introduce Limits on Records per User • Only Save Decent Scores • Smart Logic to Predetermine Good Matches • Shard Table Into Multiple Tables • We’re Still Managing & Finding More Optimizations
  45. 45. User Uploaded Content • Multiple Aspects to Consider ‣ Storage (File Size) ‣ Serving (File Sizes, Bandwidth, Redundancy) ‣ Backups (Time, Speed) • Challenges ‣ Content Outgrowing Server ‣ Serving Multiple Versions
  46. 46. Memcached • Scales Much Easier than Traditional Databases • Reduced Load Off Databases by 70% • Implement Quickly -- It’s Awesome! • Gave Presentation On @ UPHPU ‣ Check My Blog for Recording
  47. 47. Hardware • Be careful of “Throwing Hardware” at Problems • Run on Realistic Hardware • Developers, You Need To Be Cost Conscious • Can You Afford to Scale Up? How About Scale Down?
  48. 48. Other Challenges • Apache Configurations ‣ MaxClient Limits Reached • MySQL Configurations ‣ Max Connections Reached • Network Communication ‣ Saturated Bandwidth Between Servers
  49. 49. Tools I’ve Used • MySQL - JetProfiler (No FOSS, Hopefully One Day) • Server Performance - top, htop, atop, iotop • PHP - Zend Debugger & Profiler, xdebug • FirePHP & FireBug
  50. 50. So... Where’s The Scaling?
  51. 51. Getting Ready To Scale
  52. 52. Getting Ready To Scale • Compartmentalize “Parts” Into “Components”
  53. 53. Getting Ready To Scale • Compartmentalize “Parts” Into “Components” • Create “Scaling Strategy” For Each “Component”
  54. 54. Getting Ready To Scale • Compartmentalize “Parts” Into “Components” • Create “Scaling Strategy” For Each “Component” • Keep It Simple, Complexity == Complications
  55. 55. Getting Ready To Scale • Compartmentalize “Parts” Into “Components” • Create “Scaling Strategy” For Each “Component” • Keep It Simple, Complexity == Complications • Create Metrics to Monitor the Health of the Components
  56. 56. Getting Ready To Scale • Compartmentalize “Parts” Into “Components” • Create “Scaling Strategy” For Each “Component” • Keep It Simple, Complexity == Complications • Create Metrics to Monitor the Health of the Components • “Zero Theory” For Each Component
  57. 57. Makes Components of Dating DNA
  58. 58. Makes Components of Dating DNA Dating DNA Application
  59. 59. Makes Components of Dating DNA Jobs / Cron User Uploaded System Content Dating DNA Database Web Servers Application Cluster Score Generation Memcache System Cluster
  60. 60. Makes Components of Dating DNA Jobs / Cron User Uploaded System Content Dating DNA Database Web Servers Application Cluster Score Generation Memcache System Cluster
  61. 61. Makes Components of Dating DNA Jobs / Cron User Uploaded System Content Database Web Servers Communication Cluster Score Generation Memcache System Cluster
  62. 62. Scaling Components
  63. 63. Scaling Components Web Application Server #1
  64. 64. Scaling Components Comp E1 Comp D1 Comp C1 Comp B1 Comp A1 Server #1
  65. 65. Scaling Components Comp E1 Comp D1 Comp C1 Comp B1 Comp A1 Server #1 Server #2
  66. 66. Scaling Components Comp C1 Comp B1 Comp E1 Comp A1 Comp D1 Server #1 Server #2
  67. 67. Scaling Components Comp C1 Comp A2 Comp B1 Comp E1 Comp A1 Comp D1 Server #1 Server #2
  68. 68. Scaling Components Comp C1 Comp A2 Comp B1 Comp E1 Comp A1 Comp D1 Server #1 Server #2 Server #3 Server #4
  69. 69. Scaling Components Comp C1 Comp A2 Comp C2 Comp B1 Comp E1 Comp B2 Comp A1 Comp D1 Comp A3 Server #1 Server #2 Server #3 Server #4
  70. 70. Scaling Components Comp C1 Comp A2 Comp C2 Comp B1 Comp E1 Comp B2 Comp B3 Comp A1 Comp D1 Comp A3 Comp A4 Server #1 Server #2 Server #3 Server #4
  71. 71. Scaling Components • Should Be Able to “Live” With Each Other • Deployment & Management all Automated ‣ Not Necessarily “Automatic” • Fault Tolerance • Testing & Staging Environments
  72. 72. Scaling Web Servers • Challenges Load Balancer ‣ Routing ‣ Sessions • Ideas Web Server Web Server ‣ Separate Dynamic & Static Cache DB ‣ Use Sub Domains
  73. 73. Asynchronous Score Generation • Pass Message To “Score Server” ‣ Pass User ID Only • Quickly Generate a Small Set of Scores • Queue User for Full Process ‣ Spawn Small Generation • Update “MEMORY” MySQL Table on Status
  74. 74. The Open Source Advantage • Choice & Options w/ Cost • Solid Applications, Tested & Proven • Great Communities • Give Back, Karma++
  75. 75. Last Thoughts • Once again, K.I.S.S • Proactive > Reactive • Monitoring, Monitoring, Monitoring! • The “Cloud” • Get Advice from the Community
  76. 76. Questions?
  77. 77. Thank You
  78. 78. Thank You Website www.justincarmony.com Email justin@justincarmony.com Twitter JustinCarmony Skype JustinCarmony IRC irc.freenode.net #uphpu
