Scaling 101
Chris Finne, CTO of Venrock's Quarry
Venrock and the Quarry are looking for:
- Summer Interns
  - Ruby on Rails Engineers
  - Community Manager
  - Digital Media Analyst
  - Digital Media Associate
- Full Time Professional Web Engineers
  - Mobile/Media/Social
  - MVC / Full stack: OS -> DB -> App -> Code -> Web Server -> HTML/JS/CSS/Ajax -> Flash
- www.venrock.com
Scaling 101 - Assumptions / Misc
- Target audience: engineers (but not professional web infrastructure engineers)
- Give a "lay of the land" rather than heavy specifics
- < 20M database rows
- Will distribute the preso
- Links to various topics are provided as an appendix
- Please interrupt with questions, but I might table them for later
- About 30 slides
Scaling can be rocket science...
Doesn't have to be rocket science...
- Scaling 101 isn't hard
- Understand the principles
- If Scaling 101 isn't enough to handle your traffic... you probably have enough traffic to get Series A funding
- Sometimes there is a quick-n-easy solution
- If not, follow the basic problem-solving cycle...
  - analyze
  - research
  - trial-n-error (hopefully with testing ;-)
Scaling 101 Principles - Slow to Fast
- Infrastructure speed (slowest first)
  - External network accessed
  - Internal network accessed
    - DB on another server:
      - DB Delete
      - DB Update
      - DB Select (goes to disk)
      - DB Select (table in memory)
      - DB Select cached
    - Memcached
  - local filesystem
  - local DB
  - local memcached
  - local app server memory
- Specific examples
  - Roundtrip query to Facebook
  - Database table scans on large tables (>100 rows)
  - Dynamically rendering high-volume generic pages
  - Too many database calls per page load (>5-10)
Typical Infrastructure Evolution
- Single Server - get the app out!
- Optimize to try to stay on a single server
- Dedicated Database Server
- Multiple Web/App Servers - Load Balancing
- Add More Database Servers
- More Separation - Web and App servers
Specific Guidelines for when to level up...
Specific Guidelines - Nada
There aren't any. Why?
- heavy static files vs. very dynamic
- application code is different
- database activity is different
  - lots of selects vs. inserts vs. updates vs. deletes
  - lots of complex JOINs vs. simple selects
- traffic patterns
  - 8, 12 or 24 hour day?
  - Occasional spikes (Digg, Slashdot)
- hardware is different from hosting vendor to hosting vendor
Optimize - Profiling your app
Questions to ask:
- Which pages are getting hit the most?
  - Google Analytics
  - Web server logs
- Which pages take the longest to render?
  - Firebug - Net tab
  - Yahoo's YSlow Add-on
  - Apache Benchmark tool
  - JMeter
  - external monitoring site, e.g. Site24x7
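A minimal sketch of answering "which pages are getting hit the most" from web server logs. It assumes Common Log Format lines (the Apache default); the function name `top_pages` and the sample lines are hypothetical, made up for illustration.

```python
import re
from collections import Counter

# Matches the request part of a Common Log Format line, e.g. "GET /home HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD|PUT|DELETE) (\S+) HTTP/[\d.]+"')


def top_pages(log_lines, n=5):
    """Return the n most requested paths (query strings stripped) with hit counts."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            path = match.group(1).split("?", 1)[0]
            hits[path] += 1
    return hits.most_common(n)


# Hypothetical sample lines; in practice, iterate over the lines of your access log.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Jan/2009:10:00:00 -0800] "GET /home HTTP/1.1" 200 5120',
    '10.0.0.2 - - [01/Jan/2009:10:00:01 -0800] "GET /profile?id=7 HTTP/1.1" 200 2048',
    '10.0.0.3 - - [01/Jan/2009:10:00:02 -0800] "GET /home HTTP/1.1" 200 5120',
    '10.0.0.1 - - [01/Jan/2009:10:00:03 -0800] "GET /static/logo.png HTTP/1.1" 304 0',
]

if __name__ == "__main__":
    for path, count in top_pages(SAMPLE_LOG):
        print(f"{count:6d}  {path}")
```

Stripping the query string groups `/profile?id=7` and `/profile?id=8` under one page, which is usually what you want when looking for hot pages.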
Optimize - Profiling your app
Next question: What pieces of those long, popular pages are taking the longest?
- Facebook queries
- application code blocks
- database queries
- static file downloads (images, CSS, JS)
Optimize - Profiling your app
How to actually find the offenders...
- Application code: put in debug statements to see where the most time is being spent
- Page loading / static file downloads
  - Firefox extensions:
    - Yahoo YSlow for Firebug: see which pieces take the longest to download / finish rendering
    - Live HTTP Headers: are your static files being cached locally?
- Database
  - MySQL slow query log
  - MySQL query log
  - MySQL "EXPLAIN" queries
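One way the "debug statements" idea could look in application code: wrap suspect blocks in a timer and report the totals, slowest first. This is a sketch only; `timed`, `report`, and `render_page` are hypothetical names, and the `sleep` calls stand in for real work.

```python
import time
from contextlib import contextmanager

TIMINGS = {}  # label -> list of elapsed seconds, one entry per timed block


@contextmanager
def timed(label):
    """Record how long the wrapped block takes under the given label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        TIMINGS.setdefault(label, []).append(time.perf_counter() - start)


def report():
    """Return (label, total_seconds) pairs, slowest first."""
    totals = {label: sum(samples) for label, samples in TIMINGS.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def render_page():
    """Hypothetical page handler with two timed sections."""
    with timed("db_query"):
        time.sleep(0.05)   # stand-in for a slow database query
    with timed("template"):
        time.sleep(0.001)  # stand-in for template rendering


if __name__ == "__main__":
    render_page()
    for label, seconds in report():
        print(f"{label:10s} {seconds * 1000:8.1f} ms")
```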
Optimize - Profiling your app
Now what? Focus on the bottlenecks
Optimize - Remedies
- Facebook queries
  - Reduce roundtrips to FB with more complex FQL (e.g. subqueries)
  - Cache results
  - Use FBML where possible - make FB do the work
    - fb:user
    - fb:name
    - fb:profile-pic
- Fix inefficient application code
  - any examples?
Optimize - Remedies
- Optimize SQL queries
  - Database indexes
  - Only select what you need to select
  - Views, stored procedures
- Confirm static files are being cached locally in the browser (images, JS and CSS)
- Apache config...
- Caching...
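A small sketch of what an index buys you, and of asking the database how it plans to run a query. The deck is about MySQL and its EXPLAIN; this example swaps in SQLite's `EXPLAIN QUERY PLAN` because SQLite ships with Python and needs no server. The `users` table, its columns, and the index name are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one string (detail is the last column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT, bio TEXT)"
)
conn.executemany(
    "INSERT INTO users (email, name, bio) VALUES (?, ?, ?)",
    [(f"user{i}@example.com", f"User {i}", "x" * 200) for i in range(1000)],
)

# Select only the column we need rather than SELECT *.
LOOKUP = "SELECT name FROM users WHERE email = ?"
ARGS = ("user42@example.com",)

plan_before = query_plan(conn, LOOKUP, ARGS)   # no index yet: full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, LOOKUP, ARGS)    # now an index search

if __name__ == "__main__":
    print("before:", plan_before)
    print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions and looks nothing like MySQL's EXPLAIN output, but the habit is the same: check for a table scan, add the index, check again.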
Optimize - Caching
What? - Expensive pieces
- Facebook user information (24hrs)
- complex calculations
- generic pages (or fragments)
- complex, big or long DB queries
Optimize - Caching
Where to?
- Filesystem
  - HTML pages served directly from Apache w/ no PHP
  - expensive HTML fragments loaded via PHP
- User's session
  - Facebook user info
  - Application state
- Memcached (or BerkeleyDB)
  - HTML
  - Session
  - Facebook data (24hrs max)
  - expensive DB query results
- Database query cache
  - repeated queries
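The cache-with-expiry pattern behind most of the items above can be sketched as follows. This is an in-process stand-in, not a memcached client: `TTLCache` and `cached` are hypothetical names, and a real deployment would put memcached (or the filesystem, or the session) behind the same get/set shape.

```python
import time


class TTLCache:
    """Tiny in-memory cache where every entry expires after its own TTL."""

    def __init__(self, clock=time.monotonic):
        self._store = {}     # key -> (value, expires_at)
        self._clock = clock  # injectable so expiry can be tested without waiting

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)


def cached(cache, key, ttl_seconds, compute):
    """Return the cached value, calling compute() only on a miss.

    Note: a result of None is treated as a miss and is never cached.
    """
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl_seconds)
    return value


ONE_DAY = 24 * 60 * 60  # the 24-hour limit the slides mention for Facebook data
```

Usage would look like `cached(cache, "fb_user:42", ONE_DAY, fetch_user)`, where `fetch_user` is whatever expensive call you are trying to avoid repeating.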
Optimize - Apache
Are you using all your resources?
- If you have 10 Apache processes (MaxClients) and 15 users hit your app at the same time, 10 will get served and 5 will wait in the listen queue (and may time out or get an error if the wait is too long)
Do you need more Apache processes?
- No, if your box's CPU and/or RAM are maxing out (use top)
  - need to optimize the app or add more servers
- Yes, if the requests involve waiting a long time for Facebook to answer a query (i.e. Apache is just waiting)
  - Add more processes (MaxClients)
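A back-of-the-envelope way to reason about the two cases above. The first function applies Little's law (average requests in flight = arrival rate x time per request); the second caps the process count by memory. Both function names and all the numbers in the usage note are hypothetical; measure your own request rate, latency, and per-process memory before changing MaxClients.

```python
import math


def workers_needed(requests_per_second, avg_seconds_per_request):
    """Little's law: average concurrent requests = arrival rate x time per request."""
    return math.ceil(requests_per_second * avg_seconds_per_request)


def max_workers_for_ram(available_ram_mb, ram_per_worker_mb):
    """Upper bound on processes before the box starts swapping."""
    return available_ram_mb // ram_per_worker_mb


if __name__ == "__main__":
    # Hypothetical numbers: 20 req/s, pages that take 0.5 s vs. 2 s when
    # waiting on an external service, 2 GB free for Apache, 40 MB per process.
    print(workers_needed(20, 0.5))        # processes busy on average, fast pages
    print(workers_needed(20, 2.0))        # processes busy on average, slow upstream
    print(max_workers_for_ram(2048, 40))  # memory ceiling on MaxClients
```

If the first number is above your current MaxClients and below the memory ceiling, raising MaxClients is the cheap fix; if it is above the ceiling, you are in "optimize the app or add more servers" territory.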
Optimize - Apache
# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>
Multi-Server - Dedicated DB
- Needs to be sitting next to the web/app server with a fast link
- Port 3306 open from web/app servers to DB servers
- MySQL user account has to allow connections from the web/app servers
Multi-Server - Dedicated DB
If standard hardware/slices:
- keep your DB server where it is
- set up a new slice as a web/app server
- get your new slice working
- test
- load-test
- cut over your DNS address
Multi-Server - Dedicated DB
If DB performance is a bottleneck and you are going to a larger server for your DB...
Configure / Test
- Configure the new DB
- mysqldump of your existing DB (do you have enough disk? If not, dump over the wire)
- new DB: mysql < dump
- configure your new DB server as an Apache and PHP server as well for testing
- functional test / load test
Multi-Server - Dedicated DB
If DB performance is a bottleneck and you are going to a larger server for your DB...
Cut-over
- shut down the production web server
- mysqldump the old DB
- import the dump into MySQL on the new server
- configure your existing web/app server to point to the new DB
Multiple Web/App Servers
- Bring up new app server
- Test it by itself
- Add to load-balancing...
Load Balancing
- Goals: Load and/or Fault Tolerance
- Technologies
  - DNS Round Robin
    - Potential Windows / IE issues due to caching
  - Open Source Software
    - Apache reverse proxy (Apache 2.2)
    - HAProxy
    - Pound
  - Hardware
    - 10x or more performance than software
    - some hosting vendors provide it as an add-on or part of a package
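Both goals, spreading load and tolerating a failed server, show up in even the simplest balancing policy. The sketch below is a toy round-robin picker that skips backends marked down; it is not how Apache, HAProxy, or Pound are implemented, and the class name and backend names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in rotation, skipping any that are marked down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._ring = cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call so we cannot loop forever.
        for _ in range(len(self._backends)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backends")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["app1", "app2", "app3"])
    print([lb.next_backend() for _ in range(4)])
    lb.mark_down("app2")  # e.g. a health check failed
    print([lb.next_backend() for _ in range(2)])
```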
Multiple DB Servers
- Getting more involved...
  - Master/Slave
    - reads go to slaves
    - transactional reads, or reads that need guaranteed up-to-date data, go to master
    - writes go to master
- Getting close to Rocket Science...
  - Master/Master (possibly with slaves)
  - clusters
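The master/slave routing rules above fit in a few lines of application code. A minimal sketch, assuming the application knows when a read must see fresh data; `ReplicatedRouter`, the `needs_fresh_data` flag, and the server names are hypothetical, and classifying queries by a leading SELECT is a deliberate simplification.

```python
import random


class ReplicatedRouter:
    """Pick a database server for a query in a master/slave setup."""

    def __init__(self, master, slaves, rng=None):
        self.master = master
        self.slaves = list(slaves)
        self._rng = rng or random.Random()

    def route(self, sql, needs_fresh_data=False):
        # Simplification: anything that starts with SELECT counts as a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and not needs_fresh_data and self.slaves:
            return self._rng.choice(self.slaves)  # ordinary reads go to a slave
        return self.master  # writes and must-be-fresh reads go to the master


if __name__ == "__main__":
    router = ReplicatedRouter("db-master", ["db-slave1", "db-slave2"])
    print(router.route("SELECT name FROM users WHERE id = 1"))
    print(router.route("UPDATE users SET name = 'Ann' WHERE id = 1"))
    print(router.route("SELECT balance FROM accounts WHERE id = 1",
                       needs_fresh_data=True))
```

The `needs_fresh_data` escape hatch matters because slaves lag behind the master: a read issued right after a write may not see it yet on a slave.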
Other Easy Tricks
- Put static files on Amazon's S3
  - Images, JS, CSS, Videos
- Optimize page loads (not really scalability, but...)
  - Put external "stuff" at the bottom of the page, outside TABLES
    - Ad tags
    - Google Analytics
    - Digg buttons
  - DIVs instead of tables if possible
    - tables wait for all content to be loaded before rendering
    - DIVs typically render piece by piece
Finally - The End
- Feedback to Yee or me: cfinne at venrock . com
- Follow-on questions or consults: cfinne at venrock . com
- Next Talk?
  - Interaction Design - David Cortright
  - Venture Capital - Some VC (Brian, Ilya, Dev...)
  - Code / Web App Design - me (again???)
Links
- MySQL Slow Query Log: http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html
- MySQL General Query Log: http://dev.mysql.com/doc/refman/5.0/en/query-log.html
- MySQL Query Cache:
  - http://jayant7k.blogspot.com/2007/07/mysql-query-cache.html
  - http://dev.mysql.com/doc/refman/5.0/en/query-cache.html
- MySQL Optimize Queries and DB Indexes: http://www.databasejournal.com/features/mysql/article.php/1382791
- Memcached:
  - http://us3.php.net/memcache
  - http://www.danga.com/memcached/
Links
- Firebug: https://addons.mozilla.org/en-US/firefox/addon/1843
- YSlow: http://developer.yahoo.com/yslow/
- PHP Facebook Sessions: http://wiki.developers.facebook.com/index.php/PHP_Sessions
- Live HTTP Headers: https://addons.mozilla.org/en-US/firefox/addon/3829
- Apache Prefork Config: http://httpd.apache.org/docs/2.0/mod/prefork.html
- Load balancing:
  - Apache Reverse Proxy: http://httpd.apache.org/docs/2.2/mod/mod_proxy.html
  - HAProxy: http://haproxy.1wt.eu/
  - Pound: http://www.apsis.ch/pound
