5. BUT, um, wait....
• How many databases ?
• How many webservers ?
• How much shared storage ?
• How many network switches ?
• What about caching ?
• How many CPUs in all of these ?
• How much RAM ?
• How many drives in each ?
• WHEN should we order all of these ?
6. some stats
• ~35M photos in squid cache (total)
• ~2M photos in squid’s RAM
• ~470M photos, 4 or 5 sizes of each
• 38k req/sec to memcached (12M objects)
• 2 PB raw storage (consumed about 1.5 TB on Sunday)
23. what do you expect ?
• define what is acceptable
• examples:
• squid hits should take less than X milliseconds
• SQL queries should take less than Y milliseconds, and also keep up with replication
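One way to turn "define what is acceptable" into something checkable is sketched below. It is only an illustration: the slide leaves X and Y open, so the thresholds are passed in by the caller, and the names Expectations and violations are hypothetical, as are the numbers in the usage lines.

```python
from dataclasses import dataclass


@dataclass
class Expectations:
    """Acceptable limits. The slide leaves X and Y unspecified, so the caller supplies them."""
    squid_hit_max_ms: float             # "X" on the slide
    sql_query_max_ms: float             # "Y" on the slide
    max_replication_lag_s: float = 0.0  # assumption: "keep up" means no lag beyond this


def violations(exp, squid_hit_ms, sql_query_ms, replication_lag_s):
    """Return the list of expectations that the measured values break."""
    problems = []
    if squid_hit_ms >= exp.squid_hit_max_ms:
        problems.append("squid hit time")
    if sql_query_ms >= exp.sql_query_max_ms:
        problems.append("SQL query time")
    if replication_lag_s > exp.max_replication_lag_s:
        problems.append("replication lag")
    return problems


# Illustrative numbers only, not taken from the slides.
exp = Expectations(squid_hit_max_ms=50.0, sql_query_max_ms=100.0)
print(violations(exp, squid_hit_ms=20.0, sql_query_ms=150.0, replication_lag_s=0.0))
```

An empty list means every measurement is inside the limits that were defined.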
41. What we know now
• we can do at least 1500 qps (peak) without:
- slave lag
- unacceptable avg response time
- waiting on disk IO
42. MySQL capacity
1. find ceilings of existing h/w
2. tie app usage to server stats
3. find ceiling:usage ratio
4. do this again:
- regularly (monthly)
- when new features are released
- when new h/w is deployed
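The ceiling:usage ratio in step 3 is simple arithmetic, sketched below. The function name headroom is hypothetical. The 1500 qps ceiling is the figure from the "What we know now" slide; the 1000 qps peak usage is a made-up number for illustration.

```python
def headroom(ceiling_qps, peak_usage_qps):
    """Return (ceiling:usage ratio, spare qps before the ceiling is reached)."""
    if peak_usage_qps <= 0:
        raise ValueError("peak usage must be positive")
    ratio = ceiling_qps / peak_usage_qps
    spare = ceiling_qps - peak_usage_qps
    return ratio, spare


# Ceiling from the slides (1500 qps); peak usage here is a hypothetical value.
ratio, spare = headroom(ceiling_qps=1500, peak_usage_qps=1000)
print(f"ceiling:usage = {ratio:.2f}, spare = {spare} qps")
```

A ratio close to 1 means peak usage is near the ceiling. Recomputing it on the schedule in step 4 shows whether the headroom is shrinking.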
44. caching ceilings
squid, memcache
• working-set specific:
- small enough to fit entirely in memory ?
- some/more/all on disk ?
- watch LRU churn
45. churning full caches
• Ceilings at:
- LRU reference age low enough to hurt the hit ratio too much
- Request rate high enough to push disk IO to 100%
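The two ceilings for a full, churning cache can be written as a small check, sketched below. This is an assumption-laden illustration: the function name full_cache_ceiling_reasons and the min_ref_age_s threshold are hypothetical, and the slides do not say what reference age counts as too low.

```python
def full_cache_ceiling_reasons(lru_ref_age_s, min_ref_age_s,
                               disk_util_pct, max_disk_util_pct=100.0):
    """Return the reasons a full cache is considered to be at its ceiling.

    min_ref_age_s is a hypothetical threshold: the LRU reference age below
    which the hit ratio is assumed to suffer too much.
    """
    reasons = []
    if lru_ref_age_s < min_ref_age_s:
        reasons.append("LRU reference age too low")
    if disk_util_pct >= max_disk_util_pct:
        reasons.append("disk IO saturated")
    return reasons


# Illustrative values only, not measurements from the slides.
print(full_cache_ceiling_reasons(lru_ref_age_s=1800, min_ref_age_s=3600,
                                 disk_util_pct=72.0))
```

An empty list means neither ceiling has been reached at the current request rate.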
50. What we know now
• we can do at least 620 req/sec (peak) without:
- LRU affecting hit ratio
- unacceptable avg response time
- waiting too much on disk IO
51. not full caches
• (working set smaller than max size)
• request rate large enough to bring network or CPU to 100%